背景信息
SQL操作中,经常有行转列(Pivot)和列转行(Unpivot)的数据处理需求。本文将通过SQL示例,介绍如何在Data Lake Analytics(DLA)中使用SQL,达到行转列和列转行的目的。另外,DLA支持JSON函数和UNNEST语法,您可以更加方便、灵活地通过SQL处理数据。
前提条件
在OSS中上传测试数据文件pivot、unpiovt表,表存储路径分别为
oss://bucket-name/test/pivot/pivot.txt
和oss:/bucket-name/test/unpivot/unpivot.txt
。创建OSS Schema
CREATE SCHEMA oss_test_schema with DBPROPERTIES(
catalog='oss',
location = 'oss://bucket-name/test/'
);
创建OSS表
CREATE EXTERNAL TABLE pivot (
id int,
username string,
subject string,
score int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'oss://bucket-name/test/pivot/'
CREATE EXTERNAL TABLE unpivot (
username string,
yuwen int,
shuxue int,
yingyu int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'oss://dlaossfile1/test/unpivot/'
查询表数据
SELECT * FROM oss_test_schema.pivot
SELECT * FROM oss_test_schema.unpivot