Execution sequence of clauses in a SELECT statement

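The clauses of a SELECT statement that is written based on the MaxCompute SELECT syntax are executed in an order that differs from the order in which they are written. This topic describes the sequence in which the clauses of a MaxCompute SELECT statement are executed and provides examples for reference.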

The SELECT syntax includes the following clauses:

select
from
where
group by
having
window
qualify
order by
distribute by
sort by
limit

ORDER BY and GROUP BY cannot be used together with DISTRIBUTE BY or SORT BY. The clauses of a SELECT statement are executed in one of the following sequences:

Sequence 1: FROM > WHERE > GROUP BY > HAVING > SELECT > ORDER BY > LIMIT
Sequence 2: FROM > WHERE > SELECT > DISTRIBUTE BY > SORT BY

To avoid confusion, MaxCompute also lets you write a SELECT statement in the order in which its clauses are executed. The statement can be written in the following form:

from <table_reference>
[where <where_condition>]
[group by <col_list>]
[having <having_condition>]
[window <window_name> AS (<window_definition>)]
[qualify <expression>]
select [all | distinct] <select_expr>, <select_expr>, ...
[order by <order_condition>]
[distribute by <distribute_condition> [sort by <sort_condition>] ]
[limit <number>]

Sample data

The following sample source data is provided to make the examples in this topic easier to follow. Sample statements:

-- Create a partitioned table named sale_detail. 
create table if not exists sale_detail
(
shop_name     string,
customer_id   string,
total_price   double
)
partitioned by (sale_date string, region string);

-- Add partitions to the sale_detail table. 
alter table sale_detail add partition (sale_date='2013', region='china') partition (sale_date='2014', region='shanghai');

-- Insert data into the sale_detail table. 
insert into sale_detail partition (sale_date='2013', region='china') values ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
insert into sale_detail partition (sale_date='2014', region='shanghai') values ('null','c5',null),('s6','c6',100.4),('s7','c7',100.5);

Query data in the sale_detail table. Example:

set odps.sql.allow.fullscan=true;
select * from sale_detail; 
-- The following result is returned: 
+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s1         | c1          | 100.1       | 2013       | china      |
| s2         | c2          | 100.2       | 2013       | china      |
| s3         | c3          | 100.3       | 2013       | china      |
| null       | c5          | NULL        | 2014       | shanghai   |
| s6         | c6          | 100.4       | 2014       | shanghai   |
| s7         | c7          | 100.5       | 2014       | shanghai   |
+------------+-------------+-------------+------------+------------+
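
The query above relies on a full table scan. As a minimal sketch of the alternative, the statements below name a single partition in WHERE instead, first in the conventional form and then in the FROM-first form described earlier. The column list is chosen only for illustration. Both statements should return only the three rows of the sale_date='2013', region='china' partition.

```sql
-- Conventional form: the partition is named in WHERE,
-- so the full table scan flag is not set here.
select shop_name, customer_id, total_price
from sale_detail
where sale_date = '2013' and region = 'china';

-- FROM-first form of the same query: the clauses are written
-- in the order FROM > WHERE > SELECT.
from sale_detail
where sale_date = '2013' and region = 'china'
select shop_name, customer_id, total_price;
```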

Examples

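Example 1: The clauses in the SELECT statement are executed in Sequence 1.
Note
When you query a partitioned table with the following statements, you must either add set odps.sql.allow.fullscan=true; before the statement to enable a full table scan or specify partitions in the statement.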
```
-- Write a SELECT statement based on the SELECT syntax. 
set odps.sql.allow.fullscan=true;
select region,max(total_price) 
from sale_detail 
where total_price > 100
group by region 
having sum(total_price)>300.5 
order by region 
limit 5;
-- Write a SELECT statement based on Sequence 1. The following statement is equivalent to the preceding statement. 
from sale_detail 
where total_price > 100 
group by region 
having sum(total_price)>300.5 
select region,max(total_price) 
order by region 
limit 5;
```
The following result is returned:
```
+------------+------------+
| region     | _c1        |
+------------+------------+
| china      | 100.3      |
+------------+------------+
```
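Logic of how the clauses in the SELECT statement are executed:
1. Obtain the data that meets the condition (WHERE total_price > 100) from the sale_detail table (FROM sale_detail).
2. Group the data obtained in Step 1 by the values in the region column (GROUP BY region).
3. From the groups obtained in Step 2, keep those whose sum of total_price is greater than 300.5 (HAVING sum(total_price)>300.5).
4. From the data obtained in Step 3, obtain the maximum value of the total_price column for each region (SELECT region,max(total_price)).
5. Sort the data obtained in Step 4 by the values in the region column (ORDER BY region).
6. Return the first five records of the data obtained in Step 5 (LIMIT 5).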
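
The note for Example 1 mentions that a partition can be specified instead of enabling a full table scan. The following is a sketch of that variant in Sequence 1 form. The alias max_price is introduced here for illustration and is not part of the original example; it replaces the generated column name _c1. Because only the china partition passes the HAVING condition in the sample data, this statement should return the same china row as Example 1.

```sql
-- Sequence 1 with a partition filter instead of the full table scan flag.
from sale_detail
where sale_date = '2013' and region = 'china'   -- partition filter
      and total_price > 100                     -- row filter from Example 1
group by region
having sum(total_price) > 300.5
select region, max(total_price) as max_price    -- max_price: illustrative alias
order by region
limit 5;
```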

Example 2: The clauses in the SELECT statement are executed in Sequence 2.

-- Write a SELECT statement based on the SELECT syntax. 
set odps.sql.allow.fullscan=true;
select shop_name
       ,total_price
       ,region
from   sale_detail
where  total_price > 100.2
distribute by region
sort by total_price;
-- Write a SELECT statement based on Sequence 2. The following statement is equivalent to the preceding statement. 
from   sale_detail 
where  total_price > 100.2 
select shop_name
       ,total_price
       ,region 
distribute by region 
sort by total_price;

The following result is returned:

+------------+-------------+------------+
| shop_name  | total_price | region     |
+------------+-------------+------------+
| s3         | 100.3       | china      |
| s6         | 100.4       | shanghai   |
| s7         | 100.5       | shanghai   |
+------------+-------------+------------+

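Logic of how the clauses in the SELECT statement are executed:

1. Obtain the data that meets the condition (WHERE total_price > 100.2) from the sale_detail table (FROM sale_detail).
2. From the data obtained in Step 1, read the shop_name, total_price, and region columns (SELECT shop_name, total_price, region).
3. Hash-partition the data obtained in Step 2 by the values in the region column (DISTRIBUTE BY region).
4. Sort the data obtained in Step 3 in ascending order by the values in the total_price column (SORT BY total_price).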
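
SORT BY orders rows only within each group of rows produced by DISTRIBUTE BY, so a Sequence 2 statement does not by itself guarantee one global order. Because ORDER BY cannot be combined with DISTRIBUTE BY or SORT BY, a globally ordered result calls for a Sequence 1 statement instead. The following is a sketch of such a counterpart to Example 2. The value in limit 10 is arbitrary and chosen for illustration; it follows Example 1, which also pairs ORDER BY with LIMIT.

```sql
set odps.sql.allow.fullscan=true;
-- Sequence 1 counterpart of Example 2: ORDER BY takes the place of
-- DISTRIBUTE BY / SORT BY and sorts the whole result set.
from   sale_detail
where  total_price > 100.2
select shop_name
       ,total_price
       ,region
order by total_price
limit 10;   -- illustrative value
```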