DataHub is a streaming data processing service provided by MaxCompute. It lets you publish and subscribe to data streams to easily build stream-based analytics and applications.
Use cases
DataHub is a good fit for the following scenarios:
- Ingest application or system logs in real time: Push log data from your servers directly into a DataHub stream. Data is available for processing within seconds and is protected even if the source server fails.
- Run real-time metrics and reporting: Collect event data into DataHub and analyze it as it arrives, without waiting for batch processing cycles.
- Archive streaming data to MaxCompute: Route continuous data streams into MaxCompute tables automatically, enabling SQL-based analytics on fresh data.
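To make the publish and subscribe model behind these scenarios concrete, here is a minimal in-memory sketch. It does not use the DataHub SDK: `InMemoryStream`, `publish`, and `poll` are hypothetical names invented for illustration, and the log lines are made-up sample data. It only shows the idea of one append-only stream read independently by several subscribers.

```python
from collections import defaultdict


class InMemoryStream:
    """Hypothetical stand-in for a data stream; NOT the DataHub SDK."""

    def __init__(self):
        self._records = []                # append-only record log
        self._cursors = defaultdict(int)  # next unread index per subscriber

    def publish(self, record):
        """Append one record to the end of the stream."""
        self._records.append(record)

    def poll(self, subscriber, limit=100):
        """Return up to `limit` unread records and advance the subscriber's cursor."""
        start = self._cursors[subscriber]
        batch = self._records[start:start + limit]
        self._cursors[subscriber] = start + len(batch)
        return batch


stream = InMemoryStream()
for line in ["GET /index 200", "GET /missing 404", "POST /login 200"]:
    stream.publish({"log": line})

# Two independent subscribers read the same records: a metrics job and an archiver.
errors = sum(1 for r in stream.poll("metrics") if r["log"].endswith("404"))
archived = [r["log"] for r in stream.poll("archiver")]
```

Because each subscriber keeps its own cursor, the metrics job and the archiver each see every record once, and neither affects what the other reads.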
Data archiving
DataHub can archive streaming data to MaxCompute. For details on setting up the real-time data tunnel, see the DataHub documentation.
SDKs
DataHub provides SDKs for multiple languages.
Next steps
- DataHub documentation: Full reference for the real-time data tunnel.