TUMBLE assigns each row in a data stream to a fixed-size, non-overlapping window. Use it in a GROUP BY clause to compute per-window aggregations over event time or processing time.
For example, a 5-minute tumbling window over an infinite data stream produces discrete, back-to-back intervals: [0:00, 0:05), [0:05, 0:10), [0:10, 0:15), and so on.
Syntax
TUMBLE(<time-attr>, <size-interval>)
<size-interval>: INTERVAL 'string' timeUnit<time-attr> must be a valid time attribute — either event time or processing time. For how to define time attributes, see Overview. For the full specification, see the Apache Flink Time Attributes documentation.
Window identifier functions
Use these functions in the SELECT list to retrieve window boundaries or carry the time attribute forward for cascading windows.
| Function | Return type | Description |
|---|---|---|
TUMBLE_START(time-attr, size-interval) | TIMESTAMP | Start of the window, inclusive. For [00:10, 00:15), returns 00:10. |
TUMBLE_END(time-attr, size-interval) | TIMESTAMP | End of the window, inclusive. For [00:00, 00:15], returns 00:15. |
TUMBLE_ROWTIME(time-attr, size-interval) | TIMESTAMP (rowtime attribute) | End of the window, exclusive. For (00:00, 00:15), returns 00:14:59.999. Returns a rowtime attribute for use as the time attribute in a downstream window. Only valid for event time windows. For cascading windows, see the Overview. |
TUMBLE_PROCTIME(time-attr, size-interval) | TIMESTAMP (rowtime attribute) | End of the window, exclusive. For (00:00, 00:15), returns 00:14:59.999. Returns a processing time attribute for use in downstream time operations. Only valid for processing time windows. For cascading windows, see the Overview. |
Example 1: Count clicks per user per minute (event time)
This example counts how many times each user clicks a URL within each 1-minute tumbling window, using event time with a 2-second watermark.
Test data
| username (VARCHAR) | click_url (VARCHAR) | eventtime (VARCHAR) |
|---|---|---|
| Jark | http://taobao.com/xxx | 2024-08-10 10:00:00.0 |
| Jark | http://taobao.com/xxx | 2024-08-10 10:00:10.0 |
| Jark | http://taobao.com/xxx | 2024-08-10 10:00:49.0 |
| Jark | http://taobao.com/xxx | 2024-08-10 10:01:05.0 |
| Jark | http://taobao.com/xxx | 2024-08-10 10:01:58.0 |
| Timo | http://taobao.com/xxx | 2024-08-10 10:02:10.0 |
| Timo | http://taobao.com/xxx | 2024-08-10 10:03:10.0 |
Test statements
CREATE TEMPORARY TABLE user_clicks (
username VARCHAR,
click_url VARCHAR,
eventtime VARCHAR,
ts AS TO_TIMESTAMP(eventtime),
WATERMARK FOR ts AS ts - INTERVAL '2' SECOND -- Define a watermark for the rowtime.
) WITH (
'connector' = 'sls',
...
);
CREATE TEMPORARY TABLE tumble_output (
window_start TIMESTAMP,
window_end TIMESTAMP,
username VARCHAR,
clicks BIGINT
) WITH (
'connector' = 'datahub',
...
);
INSERT INTO tumble_output
SELECT
TUMBLE_START(ts, INTERVAL '1' MINUTE) AS window_start,
TUMBLE_END(ts, INTERVAL '1' MINUTE) AS window_end,
username,
COUNT(click_url)
FROM user_clicks
GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE), username;Test results
| window_start (TIMESTAMP) | window_end (TIMESTAMP) | username (VARCHAR) | clicks (BIGINT) |
|---|---|---|---|
2024-08-10 10:00:00.0 | 2024-08-10 10:01:00.0 | Jark | 3 |
2024-08-10 10:01:00.0 | 2024-08-10 10:02:00.0 | Jark | 2 |
2024-08-10 10:02:00.0 | 2024-08-10 10:03:00.0 | Timo | 1 |
Example 2: Count clicks per user per minute (processing time)
This example uses processing time instead of event time. The PROCTIME() function generates the processing time column automatically — no watermark is needed.
Test data
| username (VARCHAR) | click_url (VARCHAR) |
|---|---|
| Jark | http://taobao.com/xxx |
| Jark | http://taobao.com/xxx |
| Jark | http://taobao.com/xxx |
| Jark | http://taobao.com/xxx |
| Jark | http://taobao.com/xxx |
| Timo | http://taobao.com/xxx |
Test statements
CREATE TEMPORARY TABLE window_test (
username VARCHAR,
click_url VARCHAR,
ts AS PROCTIME()
) WITH (
'connector' = 'sls',
...
);
CREATE TEMPORARY TABLE tumble_output (
window_start TIMESTAMP,
window_end TIMESTAMP,
username VARCHAR,
clicks BIGINT
) WITH (
'connector' = 'datahub',
...
);
INSERT INTO tumble_output
SELECT
TUMBLE_START(ts, INTERVAL '1' MINUTE),
TUMBLE_END(ts, INTERVAL '1' MINUTE),
username,
COUNT(click_url)
FROM window_test
GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE), username;Test results
| window_start (TIMESTAMP) | window_end (TIMESTAMP) | username (VARCHAR) | clicks (BIGINT) |
|---|---|---|---|
2024-08-10 14:43:00.000 | 2024-08-10 14:44:00.000 | Jark | 5 |
2024-08-10 14:43:00.000 | 2024-08-10 14:44:00.000 | Timo | 1 |
What's next
Overview — learn about time attributes and cascading windows