This topic describes the CREATE STREAM statement of Spark SQL. This statement is supported in EMR V3.23.0 and later.
Background information
You can use the SET or CREATE STREAM statement to configure WriteStream parameters. We recommend that you use CREATE STREAM to configure the required WriteStream parameters, which are checkpointLocation, outputMode, triggerType, and triggerIntervalMs.
Syntax
CREATE STREAM queryName
OPTIONS (propertyName=propertyValue[,propertyName=propertyValue]*)
INSERT INTO tbName
queryStatement;
The following table describes the parameters that you can specify in OPTIONS.
Parameter | Description | Default value |
---|---|---|
checkpointLocation | The directory where the checkpoint file for the streaming query job is stored. | No default value |
outputMode | The output mode of the query result. | Append |
triggerType | The execution mode of the streaming query. | ProcessingTime |
triggerIntervalMs | The interval at which the streaming query is triggered. Unit: milliseconds. | 0 |
Example
CREATE STREAM job1
OPTIONS(
checkpointLocation='/tmp/spark',
outputMode='Append',
triggerType='ProcessingTime',
triggerIntervalMs='3000')
INSERT INTO LargeOrders
SELECT * FROM Orders WHERE units > 1000;
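If you generate the statement from a script, joining the options programmatically helps avoid a dropped comma between properties. The following is a minimal sketch in Python, not an EMR or Spark API: the helper name render_create_stream and the REQUIRED_OPTIONS tuple are hypothetical and exist only for illustration. The sketch only builds the statement text and checks that the four parameters recommended above are present. It does not escape quotation marks in values and does not submit the statement.

```python
# Hypothetical helper: not part of Spark or EMR. It only builds the SQL text.
REQUIRED_OPTIONS = (
    "checkpointLocation",
    "outputMode",
    "triggerType",
    "triggerIntervalMs",
)


def render_create_stream(query_name, options, target_table, query_statement):
    """Return a CREATE STREAM statement as a string."""
    # Reject the call if any of the four WriteStream parameters is missing.
    missing = [name for name in REQUIRED_OPTIONS if name not in options]
    if missing:
        raise ValueError("missing WriteStream options: " + ", ".join(missing))
    # Join with commas so that no separator is dropped between properties.
    # Assumption: values contain no single quotation marks (no escaping here).
    rendered = ",\n".join(
        f"{name}='{value}'" for name, value in options.items()
    )
    return (
        f"CREATE STREAM {query_name}\n"
        f"OPTIONS(\n{rendered})\n"
        f"INSERT INTO {target_table}\n"
        f"{query_statement};"
    )


statement = render_create_stream(
    "job1",
    {
        "checkpointLocation": "/tmp/spark",
        "outputMode": "Append",
        "triggerType": "ProcessingTime",
        "triggerIntervalMs": 3000,
    },
    "LargeOrders",
    "SELECT * FROM Orders WHERE units > 1000",
)
print(statement)
```

The printed text follows the layout of the example above and can be submitted through whichever Spark SQL client you already use.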