Dataphin: Connect to Dataphin by using JDBC

Last Updated: Dec 22, 2025

This topic describes how to connect to Dataphin using the Java Database Connectivity (JDBC) driver.

Prerequisites

Dataphin JDBC uses the Dataphin OpenAPI to run SQL queries. Before you can use Dataphin JDBC, you must enable the Dataphin OpenAPI feature.

Function overview

  • Two authentication modes: Dataphin JDBC supports basic mode and proxy mode. For more information, see Authentication modes.

  • Consistent execution results: Running an SQL statement using Dataphin JDBC is equivalent to running an SQL statement in Dataphin. Therefore, the SQL execution results are consistent with the results from the Dataphin UI. Configurations in Dataphin, such as permissions, data masking rules, security settings, and code specifications, also affect the execution results of Dataphin JDBC.

Limits

  • Dataphin does not support JDBC connections if the compute engine is Databricks, SelectDB, or Doris.

  • When Dataphin JDBC runs an SQL statement, Dataphin performs pre-processing operations, such as SQL translation and data masking. The result data is also transmitted and forwarded through Dataphin. These processes cause performance degradation. Therefore, the query takes longer than a direct query to the compute engine.

  • Dataphin JDBC runs SQL statements by submitting them to Dataphin through the Dataphin OpenAPI. Dataphin then processes the statements and submits them to the underlying compute engine for execution. Before you use Dataphin JDBC, you must assess the number of calls and determine whether you need to scale out the Dataphin cluster and compute engine.

    Note

    If you cannot assess the capacity, you can submit a capacity assessment request for the Dataphin cluster to the Dataphin Operations and Maintenance (O&M) team. This request does not include a capacity assessment for the compute engine.

  • Dataphin does not support throttling or concurrency control. You must carefully assess the impact.

JDBC driver version and download link

Contact a Dataphin O&M engineer to obtain the driver JAR package.

Connection parameters

Note

When Dataphin JDBC uses the platform AccessKey ID and AccessKey secret (proxy mode), you can use the set syntax to specify the access user.

  • set dp_delegation_uid = 'target_source_user_uid';

  • set dp_delegation_name = 'target_user_name';
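
For example, a minimal sketch of switching the proxied user on an open connection (switchProxiedUser and target_user_name are placeholders, and the connection is assumed to have been created with the platform AccessKey):

// Sketch: switch the proxied user on an existing Dataphin JDBC connection.
static void switchProxiedUser(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement stmt = conn.createStatement()) {
        // Subsequent statements on this connection run with this user's permissions.
        stmt.execute("set dp_delegation_name = 'target_user_name'");
    }
}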

JDBC URL format: jdbc:dataphin://host:port/catalog?[tenant_id=TenantID][&ssl=true][&log_level=Log_Level][&user=UserName][&password=PassWord][&delegation_uid=DelegationUid][&account_type=AccountType][&connect_timeout=ConnectTimeout][&engine=engine_type][&compute_project=ProjectName]

Important
  • Parameter names are case-sensitive.

  • The square brackets ([]) are for illustrative purposes only. You must remove them from the actual URL.
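
For example, a fully assembled URL with the brackets removed might look like the following (the host, tenant ID, and project name are placeholders):

jdbc:dataphin://dataphin-openapi.example.aliyun.com:443/projectname?tenant_id=111111&ssl=true&connect_timeout=10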

Parameter: host
Required: Yes
Description: The domain name of the Dataphin OpenAPI. You can obtain it from OpenAPI Call Address in Personal Center > AccessKey Management.
Example: dataphin-openapi.****.aliyun.com

Parameter: port
Required: No
Description: The port number depends on whether HTTPS is enabled for the Dataphin OpenAPI: 443 if HTTPS is enabled, 80 if it is not. The default value is 80.
Example: 80

Parameter: catalog
Required: Yes
Description: The default queryable data scope.

  • To use Dataphin physical tables, enter the English name of the Dataphin project (project_name).

  • To use Dataphin logical tables, enter the English name of the Dataphin data domain. The name must start with LD_.

  • To use data source tables managed by Dataphin, enter the code of the data source that is configured in Dataphin. The code must start with ds_.

Note
  • If the default queryable scope is a data domain, compute_project is required.

  • If the default queryable scope is a data source table managed by Dataphin, you do not need to specify compute_project.

Example: projectname

Parameter: tenant_id
Required: Yes
Description: The ID of the tenant for the query.
Example: 111***111

Parameter: ssl
Required: No
Description: Specifies whether to use HTTPS.

  • true: Use an HTTPS domain name.

  • false: Use an HTTP domain name. This is the default value.

Example: false

Parameter: currentschema
Required: No
Description: The schema of the data source.

  • If you use Dataphin physical or logical tables in the catalog, you do not need to specify this parameter.

  • If the data source type used in the catalog does not support schemas, such as MySQL, you do not need to specify this parameter.

  • If the data source type used in the catalog supports schemas, such as Oracle, this parameter is optional.

    • If you specify a schema, the data in the specified schema is queried.

    • If you do not specify a schema, the default schema of the database is queried.

Example: information_schema

Parameter: compute_project
Required: No
Description: The English name of the project that is used to run SQL queries. The project and its attached compute engine must have read permission on the queried tables.

  • If the specified query scope is a Dataphin project, this parameter is optional. If you do not specify it, the project in the default query scope is used to run SQL statements.

  • If the specified query scope is a Dataphin data domain, this parameter is required.

  • If the specified query scope is a Dataphin data source, you do not need to specify this parameter.

Example: projectname

Parameter: user
Required: Yes
Description: The AccessKey ID of the user or the platform. In proxy mode, use the platform's AccessKey ID.

  • Contact the Dataphin O&M team to obtain the platform AccessKey ID.

  • You can obtain the user's AccessKey ID in Personal Center > AccessKey Management.

Example: kIB**********PT0

Parameter: log_level
Required: No
Description: The logging level. Valid values: DEBUG, INFO, WARNING, and ERROR.
Example: DEBUG

Parameter: password
Required: Yes
Description: The AccessKey secret of the access user.

  • Contact the Dataphin O&M team to obtain the platform AccessKey secret.

  • You can obtain the user's AccessKey secret in Personal Center > AccessKey Management.

Example: Cy**************r2T

Parameter: delegation_uid
Required: No
Description: When you use a platform AccessKey, you can act as a proxy for other users to establish connections and access Dataphin. This parameter specifies the Dataphin user to be proxied. Pass the account ID that corresponds to the selected account_type. After you set this parameter, proxy mode is used for authentication.
Example: 999***999

Parameter: account_type
Required: No
Description: The account type of the proxied user when you use proxy mode for authentication.

  • ACCOUNT_NAME: The username in Dataphin. This type is recommended if the application username is the same as the Dataphin username.

  • USER_ID: The unique ID in Dataphin. This type is not recommended.

  • SOURCE_USER_ID: The source system account ID, that is, the user account in the identity provider (IdP). This type can be used when Dataphin is configured with single sign-on (SSO) authentication, such as Resource Access Management (RAM), Security Assertion Markup Language (SAML), or OAuth authentication.

Note
  • You must configure this parameter only when delegation_uid is set. If you do not specify this parameter, the default type is USER_ID.

  • If duplicate users exist, authentication fails.

Example: USER_ID

Parameter: connect_timeout
Required: No
Description: The maximum timeout period for obtaining a connection, in seconds.

  • Greater than 0: The timeout period. The minimum value is 10 seconds.

  • Less than or equal to 0: Wait indefinitely.

Example: 10

Parameter: engine
Required: No
Description: The offline engine of the project specified in the session or connection string. For a Hadoop compute source, the default engine is Hive; you can set this parameter to Impala or Spark, provided the engine is configured in the compute source in advance. If the project does not support the specified engine, the setting is ignored and an alert is reported. Valid values:

  • MaxCompute

  • Hologres

  • Hive

  • Impala

  • Inceptor

  • ArgoDB

  • Spark

Note
This parameter is ignored when you access a data source.

Example: MaxCompute

Parameter: acceleration_source
Required: No
Description: The acceleration source. You can select the code of any acceleration source within the tenant.
Example: starrocks_code

Parameter: acceleration_resource_group
Required: No
Description: The acceleration resource group. You can select a configured resource group for the selected acceleration source.
Example: starrocks_resource_group

Authentication modes

Basic mode

You can set the username to the user's AccessKey ID and the password to the user's AccessKey secret. The current access is then identified as that user. For more information about how to view the AccessKey, see Dataphin OpenAPI AccessKey management.

When you access Dataphin using the JDBC driver, Dataphin authenticates the AccessKey and authorizes the user of the AccessKey to access resources or run SQL statements. Dataphin then identifies the user who runs the SQL statements.

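The following is a minimal connection sketch for basic mode. The host, tenant ID, catalog, and AccessKey pair are placeholders; the driver JAR package must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;

public class DataphinBasicModeDemo {
    public static void main(String[] args) throws Exception {
        // Register the Dataphin JDBC driver.
        Class.forName("com.aliyun.dataphin.jdbc.DataphinDriver");
        String url = "jdbc:dataphin://dataphin-openapi.example.aliyun.com:80/projectname?tenant_id=111111";
        // user is the AccessKey ID and password is the AccessKey secret of the access user.
        try (Connection conn = DriverManager.getConnection(url, "your-accesskey-id", "your-accesskey-secret")) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}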

Proxy mode

Important

Before you use proxy mode, you must contact the Dataphin O&M team to enable and configure this feature.

Proxy mode is typically used for integration between systems and Dataphin. This method avoids the need to distribute or configure AccessKeys for different users. By specifying the proxy user in the JDBC connection string (JDBC URL), you ensure that operations are authorized based on the specified user. Platform-level AccessKeys have high permissions and can act as a proxy for all users. For example, if a client is configured with a platform AccessKey and userA accesses the client, you must specify delegation_uid=userA to indicate that this access is a proxy for userA. The function permissions and data permissions of userA are then used for permission verification.

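For example, a proxy-mode URL that proxies userA by username might look like the following (the host, tenant ID, and project name are placeholders; user and password are set to the platform AccessKey ID and secret):

jdbc:dataphin://dataphin-openapi.example.aliyun.com:80/projectname?tenant_id=111111&delegation_uid=userA&account_type=ACCOUNT_NAME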

Dataphin driver details

com.aliyun.dataphin.jdbc.DataphinDriver

Interface: connect
Description: Obtains a database connection.
Definition: Connection connect(String url, Properties info) throws SQLException;

Interface: acceptsURL
Description: Checks whether the URL is supported.
Definition: boolean acceptsURL(String url) throws SQLException;

com.aliyun.dataphin.jdbc.DataphinConnection

Interface: createStatement
Description: Creates a Statement object.
Definition: Statement createStatement(int resultSetType, int resultSetConcurrency) throws SQLException;

Interface: prepareStatement
Description: Creates a PreparedStatement object.
Definition:
PreparedStatement prepareStatement(String sql, int resultSetType, int resultSetConcurrency) throws SQLException;
PreparedStatement prepareStatement(String sql, int resultSetType, int resultSetConcurrency, int resultSetHoldability) throws SQLException;

com.aliyun.dataphin.jdbc.DataphinStatement

Interface: executeQuery
Description: Runs an SQL statement and returns a ResultSet object.
Definition: ResultSet executeQuery(String sql) throws SQLException;

Interface: setFetchSize
Description: Retrieves result data in batches of the specified number of rows. If this value is not set or is set to 0, the default value of 1,000 is used.
Definition: void setFetchSize(int rows) throws SQLException;

Interface: cancel
Description: Cancels the execution of the Statement object.
Definition: void cancel() throws SQLException;
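
A short sketch of the Statement interfaces above, with batched fetching (printUsers and the queried table are placeholders, and conn is assumed to be an open Dataphin connection):

// Sketch: run a query and fetch the results in batches.
static void printUsers(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement stmt = conn.createStatement()) {
        stmt.setFetchSize(500); // fetch 500 rows per batch instead of the default 1,000
        try (java.sql.ResultSet rs = stmt.executeQuery("SELECT id, name FROM ods_user")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getString(2));
            }
        }
    }
}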

com.aliyun.dataphin.jdbc.DataphinPrepareStatement

Interface: executeQuery
Description: Runs an SQL statement and returns a ResultSet object.
Definition: ResultSet executeQuery(String sql) throws SQLException;

com.aliyun.dataphin.jdbc.DataphinResultSetMetaData

Interface: getColumnCount
Description: Obtains the number of columns in the result schema.
Definition: int getColumnCount() throws SQLException;

Interface: getColumnName
Description: Obtains the name of a column in the result schema.
Definition: String getColumnName(int column) throws SQLException;

com.aliyun.dataphin.jdbc.ResultSet

Interface: next
Description: Retrieves the data of SQL execution results row by row.
Definition: boolean next() throws SQLException;

com.aliyun.dataphin.jdbc.DatabaseMetaData

Interface: getTables
Description: Obtains table information.

  • Parameter description:

    • catalog: The default value is default.

    • schemaPattern: The project name or data domain name.

    • tableNamePattern: The table name. Fuzzy match is supported. Regular expressions are not supported.

    • types: This parameter is not supported.

  • Return result:

    • A ResultSet object is returned.

    • You can call the next method to retrieve the result rows one by one. Each row contains the metadata of a single table. Only table names can be obtained.

  • Obtain the table name: resultSet.getString("table_name").

Definition: ResultSet getTables(String catalog, String schemaPattern, String tableNamePattern, String[] types) throws SQLException;

Interface: getColumns
Description: Obtains the field information of a table.

  • Parameter description:

    • catalog: The project name or data domain name.

    • schemaPattern: This parameter is not supported.

    • tableNamePattern: The full name of the table. Fuzzy match and regular expressions are not supported.

    • columnNamePattern: This parameter is not supported.

  • Return result:

    • A ResultSet object is returned.

    • You can call the next method to retrieve the result rows one by one. Each row contains the metadata of a single field in the table. Only column names and data types can be obtained.

  • Obtain the column name: resultSet.getString("column_name").

  • Obtain the data type of the column: resultSet.getString("data_type").

Definition: ResultSet getColumns(String catalog, String schemaPattern, String tableNamePattern, String columnNamePattern) throws SQLException;
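
A sketch of the two metadata calls above (the project and table names are placeholders; passing null for the unsupported parameters is an assumption):

// Sketch: list table names in a project, then the columns of one table.
static void printMetadata(java.sql.Connection conn) throws java.sql.SQLException {
    java.sql.DatabaseMetaData meta = conn.getMetaData();
    // getTables: catalog is "default", schemaPattern is the project name,
    // and tableNamePattern supports fuzzy match.
    try (java.sql.ResultSet tables = meta.getTables("default", "projectname", "ods%", null)) {
        while (tables.next()) {
            System.out.println(tables.getString("table_name")); // only table names are returned
        }
    }
    // getColumns: catalog is the project name; the table name must be a full name.
    try (java.sql.ResultSet cols = meta.getColumns("projectname", null, "ods_user", null)) {
        while (cols.next()) {
            System.out.println(cols.getString("column_name") + " : " + cols.getString("data_type"));
        }
    }
}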

Examples

You can retrieve catalog information using the Dataphin JDBC driver.

1. Obtain a list of tables

This operation retrieves the list of physical tables and physical views in a project.

Syntax

SHOW TABLES
    [FROM db_name]
    [LIKE 'pattern']

Parameter description

db_name:

  • The name of the Dataphin project. The physical tables in the specified project are displayed.

  • For development projects and data domains, you must explicitly append the _Dev suffix.

  • If you do not specify db_name, the tables in the project that is specified by project_name in the URL are displayed by default.

Return result

Name                      Type            Comment
dim_user                  LOGICAL TABLE   User table.
ods_user                  PHYSICAL TABLE  User source table.
ods_user_logical_view     LOGICAL_VIEW    Logical view.
ods_user_physical_view2   PHYSICAL_VIEW   Physical view.
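
A sketch of running SHOW TABLES through the driver (the project name and pattern are placeholders; the columns are read by position because the exact result labels may vary):

// Sketch: list tables through Dataphin JDBC.
static void showTables(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement stmt = conn.createStatement();
         java.sql.ResultSet rs = stmt.executeQuery("SHOW TABLES FROM projectname LIKE 'ods%'")) {
        while (rs.next()) {
            // Columns follow the layout above: name, type, comment.
            System.out.println(rs.getString(1) + " | " + rs.getString(2));
        }
    }
}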

2. Obtain the schema of a table

This operation retrieves the field details of a physical table or a physical view.

Syntax

{DESCRIBE | DESC} table_name;
Note

Only physical tables and physical views are supported.

Parameter description

table_name: The name of the physical table or physical view.

Return result

Name    Type      Comment
ID      BigInt    User ID.
Name    String    Username.
DS      String    Partition time.
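
A matching sketch for DESC (the table name is a placeholder; the columns are read by position because the exact result labels may vary):

// Sketch: print the schema of a physical table through Dataphin JDBC.
static void describeTable(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement stmt = conn.createStatement();
         java.sql.ResultSet rs = stmt.executeQuery("DESC ods_user")) {
        while (rs.next()) {
            // Columns follow the layout above: name, type, comment.
            System.out.println(rs.getString(1) + " : " + rs.getString(2));
        }
    }
}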

Server-side connection control

  • Maximum connections: The default value is 100.

  • Connection timeout: The default value is 28,800 seconds (8 hours). If a service connects to the database but the connection remains idle for this period, the connection is automatically closed. If you then perform an operation over this connection, an error is reported or the connection fails unexpectedly. This corresponds to the Connection_Idle_Timeout parameter in the connection URL.

Information about Dataphin JDBC jobs submitted to MaxCompute instances for execution

When you submit a Dataphin JDBC job to a MaxCompute instance for execution, the system also submits Dataphin-related information. The following table describes the information.

  • logical_project: The name of the Dataphin project where the JDBC job is run.

  • EXT_JDBC_TASKRUN_ID: The ID of the JDBC job.

  • EXT_DPN_TENANT_ID: The ID of the Dataphin tenant where the JDBC job is run.

  • EXT_PLATFORM_ID: The ID of the upper-layer platform that submits the job to MaxCompute. The default value is Dataphin.

  • biz_id: The ID of the Dataphin member who runs the JDBC job.

  • odps.idata.userenv: The user environment information, including the Java SDK version, Java version, IP address, and device MAC address. Example: JavaSDK Revision:fcedc4d,Version:0.37.6,JavaVersion:1.8.0_152,IP:11.**.***.**,MAC:00-**-**-**-**-25.

You can use the submitted information to analyze MaxCompute bills and view the time consumed by jobs. For more information, see Collect statistics on accounts of top N costs and time-consuming jobs of MaxCompute projects.