In MaxCompute, you can use a LATERAL VIEW clause with a user-defined table-valued function (UDTF) to split a single row of data into multiple rows. This topic describes how to use LATERAL VIEW.
Introduction
LATERAL VIEW is primarily used in MaxCompute to process complex data types, such as Array and Map. A standard SELECT statement returns one output row for each input row. However, a user-defined table-valued function (UDTF) has a one-to-many relationship between its input and output. Because of this, you cannot use a UDTF directly in the SELECT list. To overcome this limitation, you can use LATERAL VIEW with a UDTF to split a single row of data into multiple rows and then aggregate the resulting data.
If the LATERAL VIEW statement includes the OUTER keyword, as in LATERAL VIEW OUTER ..., and the UDTF does not return any rows, the original input row is retained in the result set. All columns that would be generated by the UDTF are set to NULL.
Command format
LATERALVIEW: LATERAL VIEW [OUTER] <udtf_name>(<expression>) <table_alias> AS <columnAlias> (',' <columnAlias>)
fromClause: FROM <baseTable> (LATERALVIEW) [(LATERALVIEW) ...]Parameters
udtf_name: Required. The UDTF used to split a single row of data into multiple rows. For more information, see Other functions.
expression: Required. The name of the column that contains the data to be split.
table_alias: Required. The alias for the UDTF result set.
columnAlias: Required. The alias for the columns generated by the row splitting.
baseTable: Required. The source table.
You can use multiple LATERAL VIEW statements after the FROM clause. A subsequent LATERAL VIEW statement can reference all tables and column names that precede it. This lets you split data from different columns.
Sample data
The pageAds table contains three columns: pageid STRING, col1 ARRAY<INT>, and col2 ARRAY<STRING>. The data is as follows.
pageid | col1 | col2 |
front_page | [1, 2, 3] | ["a", "b", "c"] |
contact_page | [3, 4, 5] | ["d", "e", "f"] |
The following statements create and populate the table:
CREATE TABLE pageAds (pageid STRING,col1 ARRAY<INT>,col2 ARRAY<STRING>);
INSERT INTO pageAds VALUES ('front_page',ARRAY(1,2,3),ARRAY("a","b","c"));
INSERT INTO pageAds VALUES ('contact_page',ARRAY(3,4,5),ARRAY("d","e","f"));Usage examples
Single LATERAL VIEW statement
Example 1: Split the col1 column. The command is as follows:
SELECT pageid, col1_new, col2 FROM pageAds LATERAL VIEW EXPLODE(col1) adTable AS col1_new;The returned result is as follows:
+--------------+------------+---------------+ | pageid | col1_new | col2 | +--------------+------------+---------------+ | front_page | 1 | ["a","b","c"] | | front_page | 2 | ["a","b","c"] | | front_page | 3 | ["a","b","c"] | | contact_page | 3 | ["d","e","f"] | | contact_page | 4 | ["d","e","f"] | | contact_page | 5 | ["d","e","f"] | +--------------+------------+---------------+Example 2: Split the col1 column and perform an aggregate operation. The command is as follows:
SELECT col1_new, COUNT(1) AS COUNT FROM pageAds LATERAL VIEW EXPLODE(col1) adTable AS col1_new GROUP BY col1_new;The returned result is as follows:
+------------+------------+ | col1_new | count | +------------+------------+ | 1 | 1 | | 2 | 1 | | 3 | 2 | | 4 | 1 | | 5 | 1 | +------------+------------+
Multiple LATERAL VIEW statements
Split the col1 and col2 columns. The command is as follows:
SELECT pageid,mycol1, mycol2 FROM pageAds LATERAL VIEW EXPLODE(col1) myTable1 AS mycol1 LATERAL VIEW EXPLODE(col2) myTable2 AS mycol2;The returned result is as follows:
+--------------+------------+------------+ | pageid | mycol1 | mycol2 | +--------------+------------+------------+ | front_page | 1 | a | | front_page | 1 | b | | front_page | 1 | c | | front_page | 2 | a | | front_page | 2 | b | | front_page | 2 | c | | front_page | 3 | a | | front_page | 3 | b | | front_page | 3 | c | | contact_page | 3 | d | | contact_page | 3 | e | | contact_page | 3 | f | | contact_page | 4 | d | | contact_page | 4 | e | | contact_page | 4 | f | | contact_page | 5 | d | | contact_page | 5 | e | | contact_page | 5 | f | +--------------+------------+------------+
References
To transpose data between rows and columns, you can use the LATERAL VIEW statement. For more information, see Best practices for transposing rows and columns.