1. Real-time data transmission
1.1 Import heterogeneous data from multiple data sources and deliver the data to downstream systems
You can import heterogeneous data generated by applications, web systems, IoT devices, and databases to DataHub in real time for unified data management. The data is then delivered to downstream systems for data analysis and archiving. DataHub helps you build a clear data flow, which makes the data easier to put to use.
DataHub decouples the big data system from business systems and removes direct dependencies among the components within the big data system.
Real-time data transmission
DataHub imports your business data into the big data system in real time and reduces the cycle time of data analysis.
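The decoupling described above can be illustrated with a minimal sketch. It does not use the DataHub SDK: `InMemoryTopic`, `publish`, and `poll` are hypothetical names for an in-memory stand-in, and the record shapes are invented for illustration. The point is only that producers write to one place and each downstream system reads at its own pace.

```python
from collections import defaultdict


class InMemoryTopic:
    """Hypothetical stand-in for a hub topic: an append-only log of records."""

    def __init__(self):
        self._log = []
        # Next unread offset for each subscriber; unknown subscribers start at 0.
        self._cursors = defaultdict(int)

    def publish(self, record):
        """Append one record; producers never talk to consumers directly."""
        self._log.append(record)

    def poll(self, subscriber, max_records=100):
        """Return the next unread records for this subscriber and advance its cursor."""
        start = self._cursors[subscriber]
        batch = self._log[start:start + max_records]
        self._cursors[subscriber] = start + len(batch)
        return batch


topic = InMemoryTopic()

# Heterogeneous sources write into the same topic (record shapes are assumptions).
topic.publish({"source": "web", "payload": {"path": "/cart", "status": 200}})
topic.publish({"source": "iot", "payload": {"device": "t-01", "temp_c": 21.5}})
topic.publish({"source": "db", "payload": {"table": "orders", "op": "INSERT"}})

# Two downstream systems consume independently of each other.
analysis_batch = topic.poll("analysis")
archive_batch = topic.poll("archive", max_records=2)
```

Because each subscriber keeps its own cursor, a slow archiving job does not hold back the analysis job, and neither one requires changes to the systems that produce the data.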
2. Real-time data cleansing and analysis
2.1 Import heterogeneous data from multiple data sources and cleanse it into uniform data in real time
You can cleanse the heterogeneous data from multiple data sources into uniform structured data in real time by using DataHub and Realtime Compute. This facilitates further data analysis.
Real-time extract-transform-load (ETL)
DataHub supports importing heterogeneous data from multiple data sources. You can use ETL to cleanse, filter, join, and transform the data in real time to produce structured data.
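As a rough illustration of the cleanse, filter, join, and transform steps, the following sketch processes a small stream of records in plain Python. It is not Realtime Compute code: the field names (`uid`, `owner`, `epoch_s`, and so on), the `normalize` and `etl` functions, and the region lookup table are all assumptions made for the example.

```python
from datetime import datetime, timezone


def normalize(raw):
    """Map one heterogeneous record to a uniform dict, or None if it is unusable."""
    source = raw.get("source")
    if source == "web":
        user, ts, amount = raw.get("uid"), raw.get("ts"), raw.get("amount")
    elif source == "iot":
        user, ts, amount = raw.get("owner"), raw.get("epoch_s"), raw.get("value")
    else:
        return None  # unknown source: filtered out
    if user is None or ts is None or amount is None:
        return None  # incomplete record: filtered out
    # Transform: uniform types and a single timestamp format (UTC, ISO 8601).
    event_time = datetime.fromtimestamp(int(ts), tz=timezone.utc).isoformat()
    return {"user_id": str(user), "event_time": event_time, "amount": float(amount)}


def etl(stream, user_regions):
    """Cleanse and filter each record, then join it with a reference table."""
    for raw in stream:
        row = normalize(raw)
        if row is None:
            continue
        row["region"] = user_regions.get(row["user_id"], "unknown")
        yield row


# Hypothetical input records from two differently shaped sources.
events = [
    {"source": "web", "uid": 7, "ts": 0, "amount": "19.90"},
    {"source": "iot", "owner": "8", "epoch_s": 60, "value": 3},
    {"source": "web", "uid": 9, "ts": None, "amount": "1.00"},  # dropped: no timestamp
]
regions = {"7": "eu", "8": "us"}
rows = list(etl(events, regions))
```

Writing `etl` as a generator keeps the processing record-by-record, which mirrors how a streaming job handles data as it arrives instead of waiting for a complete batch.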
3. Real-time data warehouses
3.1 Build real-time data warehouses that replace traditional databases
With DataHub, the data warehouse architecture shifts from the Lambda architecture to the Kappa architecture. The operational data store (ODS), data warehouse detail (DWD), and data warehouse service (DWS) layers are built in DataHub to form a real-time data warehouse.
Uniform Kappa architecture
The traditional batch pipeline and the real-time stream pipeline of the Lambda architecture are merged into a single stream pipeline, which significantly reduces maintenance costs.
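The layering can be sketched as a single streaming path in which each layer is derived from the one before it. This is a simplified, in-memory illustration, not DataHub code: the order records, the `ods_to_dwd` and `dwd_to_dws` functions, and the per-shop revenue metric are assumptions chosen for the example.

```python
from collections import defaultdict


def ods_to_dwd(ods_stream):
    """Detail layer: keep well-formed orders and normalize field types."""
    for rec in ods_stream:
        if "order_id" in rec and "amount" in rec:
            yield {
                "order_id": str(rec["order_id"]),
                "shop": rec.get("shop", "unknown"),
                "amount": float(rec["amount"]),
            }


def dwd_to_dws(dwd_stream):
    """Service layer: running revenue per shop, emitted after every record."""
    totals = defaultdict(float)
    for row in dwd_stream:
        totals[row["shop"]] += row["amount"]
        yield dict(totals)  # copy, so each snapshot is independent


# Hypothetical raw (ODS) records as they arrive from business systems.
ods = [
    {"order_id": 1, "shop": "a", "amount": "10"},
    {"shop": "a"},  # malformed: dropped at the DWD layer
    {"order_id": 2, "shop": "b", "amount": 5},
    {"order_id": 3, "shop": "a", "amount": 2.5},
]
snapshots = list(dwd_to_dws(ods_to_dwd(ods)))
```

There is no separate batch job in this sketch. If the logic changes, the same code is rerun over the retained raw records, which is the property that distinguishes a Kappa-style pipeline from a Lambda-style pair of batch and stream pipelines.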
Real-time big data