赞
踩
在写带有子查询的句子时,为了阅读起来更容易,应该要按照正确的缩进方式,将内查询整体进行缩进。
问题:
对现有的这个的数据库进行分析,那些渠道平均每天发送给Patch&Posey的流量最大
解法:首先建立该问题的内查询:统计每个渠道每天的访问次数(count(*)),然后基于内查询的结果集进行外查询:计算访问次数的平均值avg().
- SELECT channel,AVG(event_count) AS avg_event_count
- FROM
- (SELECT DATE_TRUNC('day',occurred_at) as day,
- channel,
- COUNT(*) AS event_count
- FROM web_events
- GROUP BY 1,2
- ) sub
- GROUP BY 1
- ORDER BY 2 DESC;
-
- # 结果展示
- channel avg_event_count
- direct 4.8964879852125693
- organic 1.6672504378283713
- facebook 1.5983471074380165
- adwords 1.5701906412478336
- twitter 1.3166666666666667
- banner 1.2899728997289973

- 首先,我们需要按照日期和渠道分组。然后按事件数(第三列)排序,这样可以快速得出第一个问题的答案。
- SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2
- ORDER BY 3 DESC;
-
- 可以看出,要获得这一结果,提供了整个原始表格。查询的附加部分包括 *,并且我们需要为表格设置别名。此外,是在 SELECT 语句中(而不是 FROM)中提供表格。
- SELECT *
- FROM (SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2
- ORDER BY 3 DESC) sub;
-
- 最后,我们在以下语句中能够获得显示每个渠道一天的平均事件数的表格。
- SELECT channel, AVG(events) AS average_events
- FROM (SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2) sub
- GROUP BY channel
- ORDER BY 2 DESC;

在你写的第一个子查询中,你编写了一个子查询来创建表格,然后可以在 FROM 语句中查询该表格。但是,如果只返回一个值,则可以在逻辑语句中使用该值,例如 WHERE、HAVING,甚至 SELECT,该值可以嵌套在 CASE 语句中。
在下一页面中,我们将讲解这个示例,然后你将尝试自己回答一些问题。
注意,在条件语句中编写子查询时,不能包含别名。这是因为该子查询会被当做单个值(或者对于 IN 情况是一组值),而不是一个表格。
同时注意,这里的查询对应的是单个值。如果我们返回了整个列,则需要使用 IN 来执行逻辑参数。如果我们要返回整个表格,则必须为该表格使用别名,并对整个表格执行其他逻辑。
- 1.
- SELECT DATE_TRUNC('month',MIN(occurred_at)) as min_month
- FROM orders;
- # result
- min_month
- 2013-12-01T00:00:00.000Z
-
- 2.
- SELECT *
- FROM orders
- WHERE DATE_TRUNC('month',occurred_at) =
- (SELECT DATE_TRUNC('month',MIN(occurred_at)) as min_month
- FROM orders)
- ORDER BY occurred_at DESC;
- # result
- 仅仅返回了2013-12的所有订单的信息。

- 1.
- SELECT r.name region_name, s.name rep_name,MAX(total_amt_usd)
- FROM region r
- JOIN Sales_reps s
- ON r.id = s.region_id
- JOIN accounts a
- ON a.Sales_rep_id = s.id
- JOIN orders o
- ON a.id = o.account_id
- GROUP BY 1,2
- # result 50 results
- region_name rep_name max
- Midwest Carletta Kosinski 8836.64
- Midwest Charles Bidwell 18784.55
- Midwest Chau Rowles 11049.08
- Midwest Cliff Meints 48675.90
- Midwest Cordell Rieder 45465.23
- Midwest Delilah Krum 11093.97
- Midwest Julie Starr 13947.39
- Midwest Kathleen Lalonde 8776.53
- Midwest Sherlene Wetherington 13211.32
- Northeast Akilah Drinkard 4642.17
- Northeast Ayesha Monica 27902.70
- Northeast Cara Clarke 9021.14
- Northeast Debroah Wardle 13226.23
- Northeast Elba Felder 9078.63
- Northeast Elna Condello 40928.70
- Northeast Ernestine Pickron 92991.05
- Northeast Eugena Esser 81243.96
- Northeast Gianna Dossey 12079.83
- Northeast Julia Behrman 84099.62
- Northeast Lavera Oles 9711.46
- Northeast Michel Averette 23989.98
- Northeast Nakesha Renn 7382.16
- Northeast Necole Victory 93106.81
- Northeast Renetta Carew 43421.28
- Northeast Retha Sears 31329.24
- Northeast Samuel Racine 14054.42
- Northeast Shawanda Selke 29699.28
- Northeast Sibyl Lauria 16998.48
- Northeast Silvana Virden 24885.98
- Northeast Tia Amato 77285.75
- Southeast Babette Soukup 11264.85
- Southeast Calvin Ollison 10116.04
- Southeast Derrick Boggess 24000.14
- Southeast Dorotha Seawell 95005.82
- Southeast Earlie Schleusner 19195.48
- Southeast Maren Musto 17859.13
- Southeast Moon Torian 20855.55
- Southeast Nelle Meaux 22460.24
- Southeast Saran Ram 28073.65
- Southeast Vernita Plump 24111.28
- West Arica Stoltzfus 107533.55
- West Brandie Riva 93505.69
- West Dawna Agnew 232207.07
- West Elwood Shutt 82163.71
- West Georgianna Chisholm 15849.34
- West Hilma Busick 21032.39
- West Marquetta Laycock 22694.65
- West Maryanna Fiorentino 112875.18
- West Micha Woodford 16631.12
- West Soraya Fulton 93547.84
- ………………

上面对于第一个问题的解答是错误的,这样的结果是每个区域的每个销售代表的所有记录中的最大值;首先没有进行销售额求和,也没有做到统同一区域下面的销售代表的总销售额的比较,从而取最大销售额的销售代表。、
- SELECT region_name, MAX(total_amt) total_amt
- FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY 1, 2) inner1
- GROUP BY 1;
以上才是第一题的正确答案和相应的结果返回。
- 2.
- SELECT region_name,total_amt
- FROM
- (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY 1
- ORDER BY 2) sub
- LIMIT 1;
- #result
- region_name total_amt
- Midwest 3013486.51
-
- # 以下是官网给出的答案,稍显复杂,但逻辑要更严谨一点
- 对于具有最高销售额 (total_amt_usd) 的区域,总共下了多少个订单?
-
- 我写的第一个查询是获取每个区域的 total_amt_usd。
- SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name;
-
- 然后,我们仅从该表格中获取销售额最高的区域。可以通过两种方法来获取,一种是使用子查询后的最大值,另一种是按降序排序,然后获取最高值。
- SELECT MAX(total_amt)
- FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name) sub;
-
- 最终,我们要获取具有该区域销售额的总订单量:
- SELECT r.name, SUM(o.total) total_orders
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name
- HAVING SUM(o.total_amt_usd) = (
- SELECT MAX(total_amt)
- FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name) sub);
-
- 结果就是 Northeast,总订单为 1230378 个。

对于第三个问题,分析:题意是首先根据得到标准纸销售量最大的客户的名字、标准纸的购买数、所有类型的纸的股买数;而最后的结果是返回所有客户中总购买数大于该客户的总体购买数的客户名称 。
- 首先,我们要得出购买标准纸张数量 (standard_qty) 最多的客户。以下查询获取了该客户,以及总消费:
- SELECT a.name account_name, SUM(o.standard_qty) total_std, SUM(o.total) total
- FROM accounts a
- JOIN orders o
- ON o.account_id = a.id
- GROUP BY 1
- ORDER BY 2 DESC
- LIMIT 1;
-
- 现在,我将使用上述信息获取总消费更高的所有客户:
- SELECT a.name
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY 1
- HAVING SUM(o.total) > (SELECT total
- FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total
- FROM accounts a
- JOIN orders o
- ON o.account_id = a.id
- GROUP BY 1
- ORDER BY 2 DESC
- LIMIT 1) sub);
-
- 上述查询列出了具有更多订单的客户列表。我们还可以使用另一个简单的子查询获取数量。
- SELECT COUNT(*)
- FROM (SELECT a.name
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY 1
- HAVING SUM(o.total) > (SELECT total
- FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total
- FROM accounts a
- JOIN orders o
- ON o.account_id = a.id
- GROUP BY 1
- ORDER BY 2 DESC
- LIMIT 1) inner_tab)
- ) counter_tab;

对于问题四
- SELECT a.name, w.channel,COUNT(*)
- FROM accounts a
- JOIN web_events w
- ON a.id = w.account_id AND a.id =
- (SELECT a.id FROM (
- SELECT a.id,a.name,SUM(total_amt_usd)
- FROM accounts a
- JOIN orders o
- ON a.id = o.account_id
- GROUP BY 1,2
- ORDER BY 3 DESC
- LIMIT 1) sub)
- GROUP BY 1,2
- ORDER BY 3 DESC;
- # 上面的结果是错误的,其返回了所有相应的客户的在一些渠道上的events。与下面的主要的区别在于:子查询部分的结果中的第一个字段没有进行重命名,下面的部分进行了a_id的重命名。
-
- SELECT a.name, w.channel,COUNT(*)
- FROM accounts a
- JOIN web_events w
- ON a.id = w.account_id AND a.id =
- (SELECT a_id FROM (
- SELECT a.id as a_id,a.name,SUM(total_amt_usd)
- FROM accounts a
- JOIN orders o
- ON a.id = o.account_id
- GROUP BY 1,2
- ORDER BY 3 DESC
- LIMIT 1) sub)
- GROUP BY 1,2
- ORDER BY 3 DESC;

对于问题五
- SELECT AVG(total_amt)
- FROM(SELECT a.id,a.name,SUM(total_amt_usd) AS total_amt
- FROM accounts a
- JOIN orders o
- ON a.id = o.account_id
- GROUP BY 1,2
- ORDER BY 3 DESC
- LIMIT 10) sub;
-
- avg
- 304846.969000000000
-
- # 官方答案
- 首先,我们需要找出总消费 (total_amt_usd) 在前十名的客户。
- SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY a.id, a.name
- ORDER BY 3 DESC
- LIMIT 10;
-
- 现在计算这十个客户的平均消费。
- SELECT AVG(tot_spent)
- FROM (SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY a.id, a.name
- ORDER BY 3 DESC
- LIMIT 10) temp;

对于问题六:
- SELECT AVG(avg_amt)
- FROM
- (SELECT o.account_id,AVG(o.total_amt_usd) avg_amt
- FROM orders o
- GROUP BY 1
- HAVING AVG(o.total_amt_usd) >
- (SELECT AVG(o.total_amt_usd) avg_total_amt
- FROM orders o
- JOIN accounts a
- ON o.account_id = a.id)
- ) subA
-
-
- # 官网给出的答案
- 首先,算出所有客户的总消费 (total_amt_usd) 平均值:
- SELECT AVG(o.total_amt_usd) avg_all
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id;
-
-
- 然后,只获取高于这一平均值的客户。
- SELECT o.account_id, AVG(o.total_amt_usd)
- FROM orders o
- GROUP BY 1
- HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id);
-
-
- 最后,算出这些值的平均值。
- SELECT AVG(avg_amt)
- FROM (SELECT o.account_id, AVG(o.total_amt_usd) avg_amt
- FROM orders o
- GROUP BY 1
- HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id)) temp_table;

- 问题:你需要算出每个渠道每天的平均事件数。
-
- 解决方案:
-
- SELECT channel, AVG(events) AS average_events
- FROM (SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2) sub
- GROUP BY channel
- ORDER BY 2 DESC;
- 我们使用 WITH 语句重新编写查询。
-
- 注意:你可以获取内部查询:
-
- SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2
- 我们在此部分放入 WITH 语句。注意,在下面我们将表格的别名设为 events:
-
- WITH events AS (
- SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2)
- 现在,我们可以像对待数据库中的任何其他表格一样使用这个新创建的 events 表格:
-
- WITH events AS (
- SELECT DATE_TRUNC('day',occurred_at) AS day,
- channel, COUNT(*) as events
- FROM web_events
- GROUP BY 1,2)
-
- SELECT channel, AVG(events) AS average_events
- FROM events
- GROUP BY channel
- ORDER BY 2 DESC;
- 对于上述示例,我们只需一个额外的表格,但是想象下我们要创建第二个表格来从中获取数据。我们可以按照以下方式来创建额外的表格并从中获取数据:
-
- WITH table1 AS (
- SELECT *
- FROM web_events),
-
- table2 AS (
- SELECT *
- FROM accounts)
-
-
- SELECT *
- FROM table1
- JOIN table2
- ON table1.account_id = table2.id;
- 然后,你可以按照相同的方式使用 WITH 语句添加越来越多的表格。底部的练习将确保你掌握了这些新查询的所有必要组成部分。
-
- 例如
- WITH table1 AS (
- SELECT *
- FROM web_events),
-
- table2 AS (
- SELECT *
- FROM accounts)
-
-
- SELECT *
- FROM table1
- JOIN table2
- ON table1.account_id = table2.id;

# 针对上面利用子查询做的结果,利用公共表达式重新做了优化。
以下是使用 WITH 条件重新编写的之前问题的答案。这些查询通常更容易读懂。
- WITH t1 AS (
- SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY 1,2
- ORDER BY 3 DESC),
- t2 AS (
- SELECT region_name, MAX(total_amt) total_amt
- FROM t1
- GROUP BY 1)
- SELECT t1.rep_name, t1.region_name, t1.total_amt
- FROM t1
- JOIN t2
- ON t1.region_name = t2.region_name AND t1.total_amt = t2.total_amt;

- WITH t1 AS (
- SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name),
- t2 AS (
- SELECT MAX(total_amt)
- FROM t1)
- SELECT r.name, SUM(o.total) total_orders
- FROM sales_reps s
- JOIN accounts a
- ON a.sales_rep_id = s.id
- JOIN orders o
- ON o.account_id = a.id
- JOIN region r
- ON r.id = s.region_id
- GROUP BY r.name
- HAVING SUM(o.total_amt_usd) = (SELECT * FROM t2);

- WITH t1 AS (
- SELECT a.name account_name, SUM(o.standard_qty) total_std, SUM(o.total) total
- FROM accounts a
- JOIN orders o
- ON o.account_id = a.id
- GROUP BY 1
- ORDER BY 2 DESC
- LIMIT 1),
- t2 AS (
- SELECT a.name
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY 1
- HAVING SUM(o.total) > (SELECT total FROM t1))
- SELECT COUNT(*)
- FROM t2;

- WITH t1 AS (
- SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY a.id, a.name
- ORDER BY 3 DESC
- LIMIT 1)
- SELECT a.name, w.channel, COUNT(*)
- FROM accounts a
- JOIN web_events w
- ON a.id = w.account_id AND a.id = (SELECT id FROM t1)
- GROUP BY 1, 2
- ORDER BY 3 DESC;
- WITH t1 AS (
- SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id
- GROUP BY a.id, a.name
- ORDER BY 3 DESC
- LIMIT 10)
- SELECT AVG(tot_spent)
- FROM t1;
- WITH t1 AS (
- SELECT AVG(o.total_amt_usd) avg_all
- FROM orders o
- JOIN accounts a
- ON a.id = o.account_id),
- t2 AS (
- SELECT o.account_id, AVG(o.total_amt_usd) avg_amt
- FROM orders o
- GROUP BY 1
- HAVING AVG(o.total_amt_usd) > (SELECT * FROM t1))
- SELECT AVG(avg_amt)
- FROM t2;
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。