Dataphin: Data sources supported by Dataphin

Last Updated: Jan 21, 2025

Before you use Dataphin, select a database or data warehouse that fits your business scenario as a data source. Dataphin reads raw data from this data source and writes data to it throughout the data construction process. Dataphin integrates multiple data engines, so you can connect data warehouses such as MaxCompute and Hive as well as traditional enterprise databases such as MySQL and Oracle.

Background information

Dataphin supports a range of data sources, including big data storage, file, message queue, relational, NoSQL, and semi-structured storage data sources. Note the following before you connect a data source:

  • To use a data source in Dataphin, you must first create it in data source management.

  • When you add a data source to Dataphin, you can configure both a production data source and a development data source. The Prod environment of Basic projects and Dev-Prod projects interacts with production data sources, while the Dev environment of Dev-Prod projects interacts with development data sources. In DataService Studio, the Prod environment of Basic and Dev-Prod modes accesses production data sources, and the Dev environment of Dev-Prod mode accesses development data sources. Sync tasks do not support dual production and development environments; the data sources they use interact with production data sources.

    Note

    If the required data source type is not present in the built-in types, you can customize offline or real-time data source types and connect them to Dataphin to fulfill various data source access requirements. For specific procedures, see:
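As a recap of the production and development rules described earlier in this section, the mapping can be written as a small lookup. This is a minimal sketch and not Dataphin code: the function name `resolve_data_source` and the string labels are hypothetical and only restate the rules given for projects.

```python
# Hypothetical sketch of the environment-to-data-source rules described above.
# None of these names are part of a Dataphin API.

def resolve_data_source(project_mode: str, environment: str) -> str:
    """Return which data source ("production" or "development") an environment uses.

    project_mode: "Basic" or "Dev-Prod"
    environment:  "Prod" or "Dev"
    """
    if environment == "Prod" and project_mode in ("Basic", "Dev-Prod"):
        return "production"
    if environment == "Dev" and project_mode == "Dev-Prod":
        return "development"
    # This section does not describe a Dev environment for Basic projects.
    raise ValueError(f"unsupported combination: {project_mode}/{environment}")


print(resolve_data_source("Dev-Prod", "Dev"))  # development
print(resolve_data_source("Basic", "Prod"))    # production
```

Raising an error for combinations the text does not cover, rather than guessing a default, keeps the sketch faithful to the stated rules.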

Data source description

| Common scenarios | Description | References |
| --- | --- | --- |
| Offline integration | Offline integration provides input, output, and transform components that you can drag onto a canvas, configure, and assemble into a single offline integration pipeline. It also supports a code editor mode for more customized configurations. In addition, input and output components for user-created custom RDBMS data sources are automatically added to the component library to meet diverse data synchronization needs. | Offline pipeline |
| Real-time integration | Real-time integration synchronizes data changes from an entire database or from all tables in the source data source to the target data source in real time. | Real-time integration |
| Offline development | After connecting data sources to Dataphin, you can create database SQL tasks in Dataphin for development. | Create a new database SQL task |
| Metadata acquisition | The Metadata Center extracts, processes, centrally stores, and manages metadata from various business systems to support data governance and to improve how the organization organizes, retrieves, and analyzes its internal data. | Metadata Center |
| Real-time development | Connected data sources support the creation of real-time meta tables and the development of real-time tasks. | Flink_SQL task development method |
| Data Quality | Data Quality, also known as asset quality, is a comprehensive data quality solution provided by the Dataphin platform for data development and usage. Its features include quality rule configuration, quality monitoring, schedule configuration, intelligent alerting, and verification administration. | Overview of asset quality |
| DataService Studio | DataService Studio (OneService) is the final step in building a data mid-end based on Dataphin. As the unified outlet for data services, it provides unified market management of data, lowering the barrier to opening up data while keeping that data secure. | Data service |
| Tag Factory | Tag Factory is a one-stop tag development and service platform for enterprise data development teams and developers, covering the full chain of services starting from tag creation. It suits scenarios such as risk control and marketing, and provides offline, real-time, and service tag development, management, exploration, and service capabilities. This supports upper-layer business applications and helps enterprises accumulate tag assets, making tags efficient to develop and easy to find, use, and manage. | Overview of Tag Factory |

This topic only lists the data sources supported for integration with Dataphin and the application scenarios supported in Dataphin. For detailed information on the specific features supported by data sources in each scenario, see:

Big data storage data sources

| Data Source Type | Offline Integration | Real-Time Integration | Offline Development | Metadata Acquisition | Real-Time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MaxCompute | Available | Available | No | No | Available | Available | No | Available | Create a MaxCompute data source |
| Hive | Available | Available | No | No | Available | Available | No | No | Create a Hive data source |
| Hologres | Available | No | No | No | Available | Available | Available | Available | Create a Hologres data source |
| Impala | Available | No | No | No | No | No | Available | No | Create an Impala Data Source |
| TDH Inceptor | Available | No | No | No | No | No | No | No | Create a TDH Inceptor data source |
| Kudu | Available | No | No | No | No | No | No | No | Create a Kudu data source |
| StarRocks | Available | No | No | No | Available | No | Available | No | Create a StarRocks data source |
| Hudi | Available | No | No | No | Available | No | No | No | Create a Hudi Data Source |
| Doris | Available | No | No | No | Available | No | No | No | Create a new Doris data source |
| GreenPlum | Available | No | No | No | No | No | No | Available | Create a GreenPlum data source or . |
| TDengine | No | No | No | No | No | No | Available | No | Create a new TDengine data source |
| ArgoDB | Available | No | No | No | No | No | No | No | Creating a New ArgoDB Data Source |
| Paimon | No | No | No | No | Available | No | No | No | Create a new Paimon data source |
| SelectDB | Available | No | No | No | No | No | Available | No | Create SelectDB data source |
| Lindorm (Compute Engine) | Available | No | No | No | No | No | No | No | Creating a Lindorm Compute Engine Data Source |
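
A support matrix like the one above is convenient to query programmatically, for example to find every data source that can be used in a given scenario. The following Python sketch is illustrative only: it copies three rows (MaxCompute, Hive, Hologres) from the big data storage table into a dictionary, and the names `SCENARIOS`, `SUPPORT`, and `sources_supporting` are hypothetical rather than part of any Dataphin API.

```python
# Illustrative only: three rows copied from the big data storage table above.
SCENARIOS = [
    "Offline Integration", "Real-Time Integration", "Offline Development",
    "Metadata Acquisition", "Real-Time Development", "Data Quality",
    "DataService Studio", "Tag Factory",
]

# True means "Available" in the table, False means "No".
SUPPORT = {
    "MaxCompute": [True, True, False, False, True, True, False, True],
    "Hive":       [True, True, False, False, True, True, False, False],
    "Hologres":   [True, False, False, False, True, True, True, True],
}


def sources_supporting(scenario: str) -> list[str]:
    """Return the data sources whose row marks the scenario as available."""
    column = SCENARIOS.index(scenario)  # raises ValueError for unknown scenarios
    return [name for name, row in SUPPORT.items() if row[column]]


print(sources_supporting("Real-Time Integration"))  # ['MaxCompute', 'Hive']
print(sources_supporting("DataService Studio"))     # ['Hologres']
```

The same structure extends to the other tables in this topic by adding one row per data source.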

File data sources

| Data Source Type | Offline Integration | Real-time Integration | Offline Development | Metadata Acquisition | Real-time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HDFS | Supported | No | No | No | No | No | No | No | Create an HDFS Data Source |
| FTP | Supported | No | No | No | No | No | No | No | Create an FTP data source |
| OSS | Supported | No | No | No | No | No | No | No | Create an OSS data source |
| Amazon S3 | Supported | No | No | No | No | No | No | No | Create a new Amazon S3 data source |

Message queue data sources

| Data Source Type | Offline Integration | Real-time Integration | Offline Development | Metadata Acquisition | Real-time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Log Service | Supported | No | No | No | Supported | No | No | No | Create a Log Service data source |
| Kafka | Supported | Supported | No | No | Supported | No | No | Supported | Create a Kafka data source |
| DataHub | Supported | Supported | No | No | Supported | No | No | Supported | Create a DataHub data source |

Relational data sources

| Data Source Type | Offline Integration | Real-Time Integration | Offline Development | Metadata Acquisition | Real-Time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PolarDB | Supported | No | No | No | Supported | No | No | No | Create a PolarDB data source |
| PolarDB-X (formerly DRDS) | Acceptable values: | Valid values: None. | Valid values: None. | Valid values: None. | Valid values: None. | Valid values: None. | Valid values: None. | Valid values: None. | Creating a PolarDB-X data source |
| MySQL | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Create a MySQL data source |
| SAP HANA | Supported | No | No | No | Supported | Supported | Supported | No | Create a SAP HANA data source |
| Microsoft SQL Server | Supported | Supported | No | Supported | Supported | Supported | Supported | No | Create a Microsoft SQL Server data source |
| PostgreSQL | Supported | Supported | No | Supported | Supported | Supported | Supported | Supported | Create a PostgreSQL data source |
| AnalyticDB for MySQL 2.0 | Valid values: None. | None. | None. | None. | Valid values: None. | None. | Valid values: None. | None. | Create an AnalyticDB for MySQL 2.0 data source |
| AnalyticDB for MySQL 3.0 | Acceptable values: | Valid values: No. | Valid values: No. | Valid values: No. | Valid values: No. | Valid values: No. | Valid values: No. | Valid values: No. | Create a new AnalyticDB for MySQL 3.0 data source |
| AnalyticDB for PostgreSQL | Supported | No | Supported | No | Supported | Supported | Supported | Supported | Create an AnalyticDB for PostgreSQL data source |
| OceanBase | Supported | No | No | No | Supported | No | No | No | Create an OceanBase data source |
| Oracle | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Create an Oracle Data Source |
| Vertica | Supported | No | No | No | No | No | No | No | Create a Vertica data source |
| IBM DB2 | Supported | Supported | No | No | No | Supported | No | No | Create an IBM DB2 data source |
| Teradata | Supported | No | No | No | No | No | No | No | Create a Teradata data source |
| ClickHouse | Supported | Supported | No | No | Supported | Supported | Supported | No | Create a ClickHouse data source |
| DM (Dameng) | Supported | No | No | No | No | Supported | Supported | No | Create a DM (Dameng) data source |
| GBase 8a | Supported | No | No | No | No | No | No | No | Create a GBase 8a data source |
| KingbaseES | Supported | No | No | No | No | No | No | No | Create a KingbaseES data source |
| TiDB | Supported | No | No | No | Supported | No | No | No | Create a TiDB Data Source |
| GoldenDB | Supported | No | No | No | No | No | No | No | Create a GoldenDB data source |
| openGauss | Supported | No | No | No | Supported | No | No | No | Create an OpenGauss Data Source |

NoSQL data sources

| Data Source Type | Offline Integration | Real-time Integration | Offline Development | Metadata Acquisition | Real-time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HBase 0.9.4 | No | No | No | No | No | No | Supported | Supported | Create an HBase data source |
| HBase 1.1x | Supported | No | No | No | Supported | No | Supported | Supported | Create an HBase data source |
| HBase 2.0 | Supported | No | No | No | Supported | No | Supported | Supported | Create an HBase data source |
| Elasticsearch | Supported | No | No | No | Supported | No | Supported | Supported | Create an Elasticsearch data source |
| MongoDB | Supported | No | No | No | Supported | No | Supported | No | Create a MongoDB data source |
| Tablestore | Supported | No | No | No | No | No | No | Supported | Create a Tablestore data source |
| Aliyun HBase | No | No | No | No | No | No | No | No | Create an Aliyun HBase data source |
| Redis | Supported | No | No | No | Supported | No | No | No | Create a Redis data source |
| Lindorm (LindormTable) | No | No | No | No | No | No | Supported | Supported | Create a Lindorm data source |

Semi-structured storage data sources

| Data Source Type | Offline Integration | Real-time Integration | Offline Development | Metadata Acquisition | Real-time Development | Data Quality | DataService Studio | Tag Factory | Creation Guide |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| API | Available | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Available | Create an API data source |
| SAP Table | Available | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Create a SAP Table data source |
| Salesforce | Available | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Create a Salesforce data source |