Before you use Dataphin, you must select a database or data warehouse that fits your business scenario to use as a data source. During development, Dataphin reads raw data from this data source and writes data to it. Dataphin integrates a wide range of compute engines. It supports data warehouses, such as MaxCompute and Hive, and also connects to traditional enterprise databases, such as MySQL and Oracle.
Background information
Dataphin supports connections to big data storage, file, message queue, relational, and NoSQL data sources. The data source types supported by each module are listed in the tables later in this topic.
To connect to a data source in Dataphin, you must first create the data source in Data Source Management.
Dataphin supports adding both production and development data sources. Basic projects and the production (Prod) environment of Dev-Prod projects use production data sources. The development (Dev) environment of Dev-Prod projects uses development data sources. In DataService Studio, the Basic mode and the Prod environment of the Dev-Prod mode use production data sources, while the Dev environment uses development data sources. Synchronization tasks do not support dual Dev/Prod environments. They can only use production data sources.
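The mapping between environments and data source kinds described above can be sketched as a small lookup function. This is only an illustration of the rule as stated in this topic, not a Dataphin API: the names `SourceKind` and `resolve_source_kind`, and the string values `"basic"`, `"dev-prod"`, `"dev"`, and `"prod"`, are hypothetical.

```python
from enum import Enum


class SourceKind(Enum):
    """Hypothetical labels for the two kinds of data sources."""
    PRODUCTION = "production"
    DEVELOPMENT = "development"


def resolve_source_kind(project_mode: str,
                        environment: str = "",
                        is_sync_task: bool = False) -> SourceKind:
    """Return the kind of data source used, following the rule in this topic.

    project_mode: "basic" or "dev-prod" (assumed identifiers).
    environment:  "dev" or "prod"; only meaningful for "dev-prod".
    is_sync_task: synchronization tasks can only use production data sources.
    """
    # Synchronization tasks do not support dual Dev/Prod environments.
    if is_sync_task:
        return SourceKind.PRODUCTION
    # Basic projects always use production data sources.
    if project_mode == "basic":
        return SourceKind.PRODUCTION
    if project_mode == "dev-prod":
        if environment == "prod":
            return SourceKind.PRODUCTION
        if environment == "dev":
            return SourceKind.DEVELOPMENT
    raise ValueError(
        f"unknown combination: mode={project_mode!r}, env={environment!r}"
    )


if __name__ == "__main__":
    print(resolve_source_kind("basic").value)
    print(resolve_source_kind("dev-prod", "dev").value)
    print(resolve_source_kind("dev-prod", "dev", is_sync_task=True).value)
```

The same rule applies to DataService Studio, where the Basic mode and the Prod environment of the Dev-Prod mode play the roles of `"basic"` and `"prod"` in this sketch.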
Note: If the data source type that you need is not available, you can create a custom offline or real-time data source type and then connect it to Dataphin. This helps you meet various integration requirements. For more information, see:
Data source descriptions
Scenarios | Description | References |
Offline integration | Offline integration supports various components, such as input, output, and transform. Generate a single offline integration pipeline by dragging, configuring, and assembling components on the canvas. Offline integration also provides a code editor for more customization. When you create a custom RDBMS data source, its input and output components are automatically added to the component library. This supports a variety of data synchronization needs. | |
Real-time integration | Dataphin supports real-time integration. This feature synchronizes data changes from an entire database, or from all of its tables, in the source data source to the destination data source, so that the two stay synchronized in real time. | |
Offline development - Database SQL | After connecting a data source to Dataphin, create Database SQL nodes for development. | |
Metadata acquisition | The Metadata Center extracts, processes, and centrally stores and manages metadata from various business systems. This supports data governance and improves your organization's ability to organize, retrieve, and analyze data. | |
Real-time development | Use connected data sources to create real-time metatables and develop real-time nodes. | |
Data Quality | Data Quality, a feature of Dataphin, provides a complete set of solutions for data development and use. Create global table quality rules or data source quality rules based on your data sources. For data source quality rules, you can select any data source in Dataphin to create monitoring rules. All supported data sources can be tested for connectivity. However, only some data sources support rules that monitor table schema changes. For more information, see the Data source quality - Table schema change column in the tables below. | |
DataService Studio | DataService Studio (OneService) is the final step in building a data mid-end with Dataphin. It serves as the unified outlet for data services and provides centralized, market-oriented data management. This lowers the barrier to accessing data and helps ensure data security. | |
Tag Factory | Tag Factory provides an end-to-end process from tag creation to service delivery. It is a one-stop platform for enterprise data teams and developers. It is suitable for scenarios such as risk control and marketing. Tag Factory provides tools to develop, manage, explore, and serve offline, real-time, and service tags. This empowers business applications and helps enterprises build tag assets. It makes tag development efficient and management simple. | |
This topic lists the data sources that Dataphin supports and their application scenarios. For more information about the features supported by each data source in different scenarios, see:
Big data storage data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
MaxCompute | Supported | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Supported | |
Hive | Supported | Supported | Not supported | Supported | Supported | Supported | Supported | Not supported | Not supported | |
Hologres | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Impala | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | |
TDH Inceptor | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Not supported | |
Kudu | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
StarRocks | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Hudi | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Doris | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
GreenPlum | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Supported | |
TDengine | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | |
ArgoDB | Supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | |
Paimon | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
SelectDB | Supported | Not supported | Supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Lindorm (compute engine) | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
Databricks | Supported | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon Redshift | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
DolphinDB | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | Not supported | |
Snowflake | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Data Lake Formation | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
File data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
HDFS | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
FTP | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
OSS | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Amazon S3 | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
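The support tables in this topic can be read as a lookup from data source type to the set of scenarios it supports. The following minimal sketch encodes the four rows of the file data source table above and answers the question "which data sources support a given scenario?". The names `FILE_SOURCES` and `sources_supporting` are hypothetical and exist only for this illustration; they are not part of Dataphin.

```python
# Each entry maps a data source type to the scenarios marked "Supported"
# in the file data source table. Scenario names match the column headers.
FILE_SOURCES = {
    "HDFS": {"Offline integration"},
    "FTP": {"Offline integration"},
    "OSS": {"Offline integration", "Real-time development"},
    "Amazon S3": {"Offline integration"},
}


def sources_supporting(scenario, matrix=FILE_SOURCES):
    """Return the sorted names of data sources that support the scenario."""
    return sorted(name for name, scenarios in matrix.items()
                  if scenario in scenarios)


if __name__ == "__main__":
    print(sources_supporting("Offline integration"))
    print(sources_supporting("Real-time development"))
    print(sources_supporting("Tag Factory"))
```

The same structure extends to the other tables by adding one entry per row. A scenario that no data source supports yields an empty list.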
Message queue data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
Log Service | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Kafka | Supported | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
DataHub | Supported | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
RabbitMQ | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Relational data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
PolarDB | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | Not supported | |
PolarDB-X (formerly DRDS) | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Not supported | Not supported | |
PolarDB-X 2.0 | Supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | |
MySQL | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
SAP HANA | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Microsoft SQL Server | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
PostgreSQL | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
AnalyticDB for MySQL 2.0 | Supported | Not supported | Supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | |
AnalyticDB for MySQL 3.0 | Supported | Not supported | Supported | Supported | Supported | Not supported | Supported | Supported | Not supported | |
AnalyticDB for PostgreSQL | Supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | |
OceanBase | Supported | Not supported | Supported | Supported | Supported | Not supported | Supported | Supported | Not supported | |
Oracle | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Vertica | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
IBM DB2 | Supported | Supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
Teradata | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
ClickHouse | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
DM | Supported | Not supported | Supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
GBase 8a | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
KingbaseES | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
TiDB | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
GoldenDB | Supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | |
OpenGauss | Supported | Not supported | Supported | Supported | Not supported | Not supported | Not supported | Not supported | Supported | |
GaussDB (DWS) | Supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for MySQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for PostgreSQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for SQL Server | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for Oracle | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for DB2 | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
TDSQL for MySQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
GBase 8c | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | |
NoSQL data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
HBase 0.9.4 | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | |
HBase 1.1.x | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | |
HBase 2.0 | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | |
Elasticsearch | Supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | Supported | Supported | |
MongoDB | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | |
Tablestore | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
Aliyun HBase | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Supported | Not supported | Not supported | |
Redis | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Lindorm (wide table) | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | |
Presto | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Easysearch | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Trino | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
OpenSearch | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Semi-structured storage data sources
Data source type | Offline integration | Real-time integration | Offline development - Database SQL | Metadata acquisition | Real-time development | Global table quality | Data source quality - Table schema change | DataService Studio | Tag Factory | Creation guide |
API | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | |
SAP Table | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Salesforce | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Lark Bitable data source | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |