A UNION ALL clause combines two data streams. The two data streams must have the same field types in the same field order.
select_statement UNION ALL select_statement;
Note: Realtime Compute for Apache Flink also supports UNION. UNION ALL allows duplicate values, whereas UNION does not. In Realtime Compute for Apache Flink, UNION is equivalent to the combination of UNION ALL and DISTINCT. We recommend that you do not use UNION because it is inefficient.
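The relationship between UNION, UNION ALL, and DISTINCT described in the note can be checked on a small data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in SQL engine. It is not Realtime Compute for Apache Flink, and it runs as a one-off batch query rather than on a stream. The helper name `run` is hypothetical, and the table names and rows are borrowed from the test data on this page.

```python
import sqlite3

# Assumption: SQLite is used only to illustrate the set semantics.
# It is NOT Realtime Compute for Apache Flink and does not process streams.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_source_union1 (a TEXT, b INTEGER, c INTEGER);
    CREATE TABLE test_source_union2 (a TEXT, b INTEGER, c INTEGER);
    INSERT INTO test_source_union1 VALUES ('test1', 1, 10);
    INSERT INTO test_source_union2 VALUES ('test1', 1, 10), ('test2', 2, 20);
""")

def run(sql):
    """Hypothetical helper: run a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())

# UNION ALL keeps the duplicate ('test1', 1, 10) row.
union_all = run(
    "SELECT * FROM test_source_union1 UNION ALL SELECT * FROM test_source_union2"
)

# UNION removes the duplicate row.
union = run(
    "SELECT * FROM test_source_union1 UNION SELECT * FROM test_source_union2"
)

# DISTINCT applied on top of UNION ALL yields the same rows as UNION.
distinct_union_all = run(
    "SELECT DISTINCT * FROM ("
    "SELECT * FROM test_source_union1 UNION ALL SELECT * FROM test_source_union2"
    ")"
)

print("UNION ALL:", union_all)
print("UNION:", union)
print("DISTINCT over UNION ALL:", distinct_union_all)
```

The deduplication step is the extra work that UNION performs on top of UNION ALL, which is the cost that the recommendation above refers to.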
- Test data
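Table 1. test_source_union1

| a (VARCHAR) | b (BIGINT) | c (BIGINT) |
| --- | --- | --- |
| test1 | 1 | 10 |

Table 2. test_source_union2

| a (VARCHAR) | b (BIGINT) | c (BIGINT) |
| --- | --- | --- |
| test1 | 1 | 10 |
| test2 | 2 | 20 |

Table 3. test_source_union3

| a (VARCHAR) | b (BIGINT) | c (BIGINT) |
| --- | --- | --- |
| test1 | 1 | 10 |
| test2 | 2 | 20 |
| test1 | 1 | 10 |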
- Sample code
SELECT
    a,
    SUM(b) AS d,
    SUM(c) AS e
FROM (
    SELECT * FROM test_source_union1
    UNION ALL
    SELECT * FROM test_source_union2
    UNION ALL
    SELECT * FROM test_source_union3
) t
GROUP BY a;
- Test results
| a (VARCHAR) | d (BIGINT) | e (BIGINT) |
| --- | --- | --- |
| test1 | 1 | 10 |
| test2 | 2 | 20 |
| test1 | 2 | 20 |
| test1 | 3 | 30 |
| test2 | 4 | 40 |
| test1 | 4 | 40 |

Note: The preceding test results are debugging results, which show the computing process. If your job is published and the result table is stored in DataHub, Alibaba Cloud Message Queue for Apache Kafka, or Alibaba Cloud Message Queue, the result data also contains the records generated during the computing process. If your job is published and the result table is stored in a relational database such as ApsaraDB RDS, the records that have the same primary key value are combined into one record.
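The difference between the debugging results and the records that remain in a relational result table can be illustrated with a small simulation. The following is a simplified sketch in plain Python, not the actual runtime behavior of Realtime Compute for Apache Flink. The arrival order in `arrivals` is an assumption, chosen so that the emitted updates line up with the debugging results above. The function name `aggregate_stream` is hypothetical, and the sketch assumes that column a acts as the primary key of the result table.

```python
from collections import defaultdict

# Assumption: a hypothetical arrival order for the six rows produced by the
# UNION ALL of the three test tables. In a real job, the interleaving of rows
# from the three sources is not guaranteed.
arrivals = [
    ("test1", 1, 10),
    ("test2", 2, 20),
    ("test1", 1, 10),
    ("test1", 1, 10),
    ("test2", 2, 20),
    ("test1", 1, 10),
]

def aggregate_stream(rows):
    """Simplified model of SUM(b), SUM(c) ... GROUP BY a over a stream.

    Returns (emitted, final): `emitted` holds one updated result per input
    row, and `final` holds the last result for each key, which models a
    result table that combines records by primary key.
    """
    sums = defaultdict(lambda: [0, 0])
    emitted = []
    for a, b, c in rows:
        sums[a][0] += b
        sums[a][1] += c
        # Each incoming row produces an updated aggregate for its key.
        emitted.append((a, sums[a][0], sums[a][1]))
    final = {key: tuple(value) for key, value in sums.items()}
    return emitted, final

emitted, final = aggregate_stream(arrivals)

print("Records emitted during computation:")
for row in emitted:
    print(*row)

print("Records kept after combining by key:")
for key, (d, e) in sorted(final.items()):
    print(key, d, e)
```

Under these assumptions, a message queue style result table would receive all six emitted records, whereas a result table keyed on a would keep only the last record for test1 and the last record for test2.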