
Dataphin: Connect to Dataphin by using JDBC

Last Updated: Mar 31, 2026

This topic describes how to connect to Dataphin by using the JDBC driver.

Prerequisites

You must enable the Dataphin OpenAPI feature before using the JDBC driver.

Overview

  • Two authentication modes: The Dataphin JDBC driver supports two authentication modes: Simple Mode and Proxy Mode. For more information, see Authentication modes.

  • Consistent execution results: Executing an SQL statement with the Dataphin JDBC driver is equivalent to executing it in the Dataphin console. Therefore, the query results are identical. Configurations in Dataphin, such as permissions, Desensitization Rules, security settings, and code specifications, also apply to executions through the JDBC driver.

Limitations

  • You cannot connect to Dataphin by using JDBC when the Compute Engine is Databricks, SelectDB, Doris, or Aliyun EMR Serverless Spark.

  • The Dataphin JDBC driver processes SQL statements through Dataphin for preprocessing, such as SQL translation and data masking. Result sets are also transmitted and forwarded through Dataphin. This process introduces performance overhead and increases query times compared to querying the Compute Engine directly.

  • The Dataphin JDBC driver submits SQL statements to Dataphin for processing via the OpenAPI before sending them to the underlying Compute Engine. Before using the Dataphin JDBC driver, evaluate your expected call volume and determine if Scaling is necessary for the Dataphin cluster and Compute Engine.

    Note

    If you cannot estimate the required capacity, contact the Dataphin operations and maintenance (O&M) team to request a capacity assessment for your Dataphin cluster. This assessment does not include the Compute Engine.

  • Dataphin does not currently support traffic or concurrency control. Carefully assess the potential impact.

JDBC driver version and download link

To obtain the driver JAR file, contact the Dataphin O&M team.

Connection parameters

Note

When the Dataphin JDBC driver uses a platform-level AccessKey ID and AccessKey Secret (Proxy Mode), you can use SET statements to specify the user on whose behalf resources are accessed.

  • set dp_delegation_uid = 'target_source_user_uid';

  • set dp_delegation_name = 'target_user_name';
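Because these SET statements are ordinary SQL, they can be issued through a Statement on an open connection before running queries. The sketch below assumes a live Dataphin JDBC connection; the class name, helper methods, and sample values are illustrative, not part of the driver API:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class DelegationSwitch {
    // Build the SET statements that switch the effective user (Proxy Mode).
    static List<String> setStatements(String sourceUserUid, String userName) {
        return Arrays.asList(
            "set dp_delegation_uid = '" + sourceUserUid + "'",
            "set dp_delegation_name = '" + userName + "'");
    }

    // Apply the statements on an open Dataphin JDBC connection.
    static void applyTo(Connection conn, String uid, String name) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String sql : setStatements(uid, name)) {
                stmt.execute(sql);
            }
        }
    }

    public static void main(String[] args) {
        // Without a live connection, just show the statements that would be sent.
        setStatements("999***999", "target_user_name").forEach(System.out::println);
    }
}
```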

JDBC URL format: jdbc:dataphin://host:port/catalog?[tenant_id=TenantID][&ssl=true][&log_level=Log_Level][&user=UserName][&password=PassWord][&delegation_uid=DelegationUid][&account_type=AccountType][&connect_timeout=ConnectTimeout][&engine=engine_type][&compute_project=ProjectName]

Important
  • Parameter names are case-sensitive.

  • Square brackets [] indicate optional parameters and should not be included in the final URL.

host (Required)

The domain name of the Dataphin OpenAPI. The domain name is listed under OpenAPI Invocation Address on the Personal Center > AccessKey Management page.

Example: dataphin-openapi.****.aliyun.com

port (Optional)

The port number depends on whether HTTPS is enabled for the Dataphin OpenAPI: use 443 for HTTPS and 80 for HTTP. The default is 80.

Example: 80

catalog (Required)

The default query scope.

  • To query a Dataphin Physical Table, enter the English name of the Dataphin project (project_name).

  • To query a Dataphin Logical Table, enter the English name of the Dataphin data domain (starts with LD_).

  • To query a table from a Data Source managed by Dataphin, enter the code of the Data Source configured in Dataphin (starts with ds_).

Note
  • If the default query scope is a data domain, compute_project is required.

  • If the default query scope is a table from a Data Source managed by Dataphin, you do not need to specify compute_project.

Example: projectname

tenant_id (Required)

The ID of the tenant to be queried.

Example: 111***111

ssl (Optional)

Specifies whether to use HTTPS.

  • true: Use an HTTPS domain name.

  • false: Use an HTTP domain name. This is the default.

Example: false

currentschema (Optional)

The schema of the Data Source.

  • This parameter is not required if the catalog points to a Dataphin Physical Table or Logical Table.

  • This parameter is not required if the Data Source type specified in the catalog does not support schemas, such as MySQL.

  • This parameter is optional if the Data Source type specified in the catalog supports schemas, such as Oracle.

    • If you specify a schema, queries are performed on that schema.

    • If you do not specify a schema, queries are performed on the default schema of the database.

Example: information_schema

compute_project (Optional)

The English name of the project for executing SQL queries. This project and its associated Compute Source must have read permissions on the queried tables.

  • If the specified query scope is a Dataphin project, this parameter is optional. If not specified, the project defined in the default query scope is used for SQL execution.

  • If the specified query scope is a Dataphin data domain, this parameter is required.

  • If the specified query scope is a Dataphin Data Source, this parameter is not required.

Example: projectname

user (Required)

The AccessKey ID of the user or platform. For Proxy Mode, use the platform's AccessKey ID.

  • To obtain the platform AccessKey ID, contact the Dataphin O&M team.

  • To obtain your user AccessKey ID, go to Personal Center > AccessKey Management.

Example: kIB**********PT0

log_level (Optional)

The logging level. Valid values: DEBUG, INFO, WARNING, and ERROR.

Example: DEBUG

password (Required)

The AccessKey Secret for the specified user. For Proxy Mode, use the platform's AccessKey Secret.

  • To obtain the platform AccessKey Secret, contact the Dataphin O&M team.

  • To obtain your user AccessKey Secret, go to Personal Center > AccessKey Management.

Example: Cy**************r2T

delegation_uid (Optional)

Specifies the Dataphin user to impersonate when connecting with a platform AccessKey. The ID provided for the impersonated user must match the specified account_type. Setting this parameter enables the Proxy Mode authentication method.

Example: 999***999

account_type (Optional)

The account type of the user being impersonated in Proxy Mode.

  • ACCOUNT_NAME: The Dataphin username. This mode is recommended when the application and Dataphin use the same username.

  • USER_ID: The unique user ID within Dataphin (not generally recommended).

  • SOURCE_USER_ID: The user ID from the source system. This option is available when Dataphin is configured with SSO authentication (such as RAM, SAML, or OAuth) and represents the user's account in the Identity Provider (IdP).

Note
  • This parameter takes effect only when delegation_uid is set. If it is not specified, the default type is USER_ID.

  • If duplicate users exist, authentication fails.

Example: USER_ID

connect_timeout (Optional)

The connection timeout in seconds.

  • Greater than 0: The timeout period. The minimum value is 10 seconds.

  • Less than or equal to 0: Wait indefinitely.

Example: 10

engine (Optional)

Specifies the offline engine for the project (project_name) defined in the session or connection string. If the Compute Source is Hadoop, the default engine is Hive, but you can set this parameter to Impala or Spark. The selected engine type must be pre-configured in the Compute Source. If an unsupported engine type is specified for the project, the setting is ignored and a warning is issued. Valid values: MaxCompute, Hologres, Hive, Impala, Inceptor, ArgoDB, and Spark.

Note: This parameter is ignored when accessing a Data Source.

Example: MaxCompute

acceleration_source (Optional)

Specifies the Acceleration Source. You can enter the code of any Acceleration Source within the tenant.

Example: starrocks_code

acceleration_resource_group (Optional)

The acceleration Resource Group. You can select a configured Resource Group from the specified Acceleration Source.

Example: starrocks_resource_group
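The parameters above can be assembled into a connection URL and handed to DriverManager. The builder below is an illustrative sketch: the class name, example host, and parameter values are placeholders, and only the URL format itself comes from this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class DataphinUrl {
    // Assemble a jdbc:dataphin:// URL from host, port, catalog, and optional parameters.
    static String build(String host, int port, String catalog, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:dataphin://")
            .append(host).append(':').append(port).append('/').append(catalog);
        if (!params.isEmpty()) {
            StringJoiner query = new StringJoiner("&", "?", "");
            params.forEach((k, v) -> query.add(k + "=" + v));
            url.append(query);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("tenant_id", "111***111");
        params.put("ssl", "false");
        params.put("connect_timeout", "10");
        String url = build("dataphin-openapi.example.aliyun.com", 80, "projectname", params);
        System.out.println(url);
        // With the driver JAR on the classpath, credentials go in as user/password:
        // try (Connection conn = DriverManager.getConnection(url, accessKeyId, accessKeySecret)) { ... }
    }
}
```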

Authentication modes

Simple mode

Set the username to the user's AccessKey ID and the password to the user's AccessKey Secret. The connection is then authenticated as this user. To view your AccessKey, see Manage Dataphin OpenAPI AccessKeys.

When you connect to Dataphin through the JDBC driver, Dataphin authenticates the AccessKey and authorizes the user for the requested resources or SQL statements. Dataphin then executes the SQL queries as the authenticated user.


Proxy mode

Important

To use Proxy Mode, you must contact the Dataphin O&M team to enable and configure this feature.

Proxy Mode is typically used for system-level integrations with Dataphin. This mode avoids the need to distribute or manage individual user AccessKeys. You can specify the user to impersonate in the JDBC connection URL, ensuring that all operations are authorized based on that user's permissions.

A platform-level AccessKey has high privileges and can impersonate any user. As shown in the following figure, the client is configured with a platform AccessKey. When userA connects through the client, the delegation_uid=userA parameter is used to impersonate userA. All permission checks are then performed based on userA's functional and data permissions.
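A Proxy Mode connection therefore authenticates with the platform AccessKey pair while naming the impersonated user in the URL. The sketch below assembles such a URL from the parameters documented above; all values and the class name are placeholders:

```java
public class ProxyModeUrl {
    // Build a Proxy Mode URL: the platform AccessKey authenticates,
    // delegation_uid + account_type select the impersonated user.
    static String build(String host, String catalog, String tenantId,
                        String delegationUid, String accountType) {
        return "jdbc:dataphin://" + host + ":80/" + catalog
            + "?tenant_id=" + tenantId
            + "&delegation_uid=" + delegationUid
            + "&account_type=" + accountType;
    }

    public static void main(String[] args) {
        String url = build("dataphin-openapi.example.aliyun.com", "projectname",
                           "111***111", "userA", "ACCOUNT_NAME");
        System.out.println(url);
        // Connect with the platform AccessKey pair; all permission checks then run as userA:
        // DriverManager.getConnection(url, platformAccessKeyId, platformAccessKeySecret)
    }
}
```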


Dataphin driver API reference

com.aliyun.dataphin.jdbc.DataphinDriver

connect: Establishes a database connection.

  Connection connect(String url, Properties info) throws SQLException;

acceptsURL: Checks whether the driver can handle a given URL.

  boolean acceptsURL(String url) throws SQLException;
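acceptsURL follows the standard java.sql.Driver contract: return true only for URLs this driver understands. The check below is a plausible sketch based on the URL scheme shown earlier, not the driver's actual implementation:

```java
public class UrlCheck {
    static final String PREFIX = "jdbc:dataphin://";

    // A Dataphin-style acceptsURL check: accept only jdbc:dataphin:// URLs.
    static boolean acceptsUrl(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(acceptsUrl("jdbc:dataphin://host:80/project?tenant_id=1")); // true
        System.out.println(acceptsUrl("jdbc:mysql://host:3306/db"));                   // false
    }
}
```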

com.aliyun.dataphin.jdbc.DataphinConnection

createStatement: Creates a Statement object.

  Statement createStatement(int resultSetType, int resultSetConcurrency) throws SQLException;

prepareStatement: Creates a PreparedStatement object.

  PreparedStatement prepareStatement(String sql, int resultSetType, int resultSetConcurrency) throws SQLException;
  PreparedStatement prepareStatement(String sql, int resultSetType, int resultSetConcurrency, int resultSetHoldability) throws SQLException;

com.aliyun.dataphin.jdbc.DataphinStatement

executeQuery: Executes an SQL statement and returns a ResultSet object.

  ResultSet executeQuery(String sql) throws SQLException;

setFetchSize: Fetches result data in batches based on the specified number of rows. If not set or set to 0, the default fetch size is 1,000.

  void setFetchSize(int rows) throws SQLException;

cancel: Cancels the execution of the Statement object.

  void cancel() throws SQLException;
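A typical query path through these interfaces: create a statement, set a fetch size, iterate the ResultSet. The fetch-size defaulting below mirrors the description above (0 or unset falls back to 1,000; treating negatives the same way is an assumption), and the query loop assumes an open Dataphin connection:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class BatchedQuery {
    static final int DEFAULT_FETCH_SIZE = 1000;

    // Mirror the documented behavior: 0 (or unset) means a 1,000-row batch.
    static int effectiveFetchSize(int requested) {
        return requested <= 0 ? DEFAULT_FETCH_SIZE : requested;
    }

    // Run a query, fetching rows in batches; assumes an open Dataphin connection.
    static void run(Connection conn, String sql, int fetchSize) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(effectiveFetchSize(fetchSize));
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // first column of each row
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(effectiveFetchSize(0));    // falls back to 1000
        System.out.println(effectiveFetchSize(5000)); // kept as requested
    }
}
```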

com.aliyun.dataphin.jdbc.DataphinPrepareStatement

executeQuery: Executes an SQL statement and returns a ResultSet object.

  ResultSet executeQuery(String sql) throws SQLException;

com.aliyun.dataphin.jdbc.DataphinResultSetMetaData

getColumnCount: Gets the number of columns in the result schema.

  int getColumnCount() throws SQLException;

getColumnName: Gets the name of a column in the result schema.

  String getColumnName(int column) throws SQLException;

com.aliyun.dataphin.jdbc.ResultSet

next: Retrieves SQL query results row by row.

  boolean next() throws SQLException;

com.aliyun.dataphin.jdbc.DatabaseMetaData

getTables: Retrieves table metadata.

  ResultSet getTables(String catalog, String schemaPattern, String tableNamePattern, String[] types) throws SQLException;

  • Parameters:

    • catalog: Defaults to default.

    • schemaPattern: Project name or data domain name.

    • tableNamePattern: Table name. Wildcard matching is supported, but regular expressions are not.

    • types: This parameter is not currently supported.

  • Results:

    • Returns a ResultSet object.

    • You can use the next() method to retrieve results row by row. Each row contains the metadata of a single table (only the table name is supported).

  • Get table name: resultSet.getString("table_name")

getColumns: Retrieves column metadata for a table.

  ResultSet getColumns(String catalog, String schemaPattern, String tableNamePattern, String columnNamePattern) throws SQLException;

  • Parameters:

    • catalog: Project name or data domain name.

    • schemaPattern: This parameter is not currently supported.

    • tableNamePattern: The full name of the table. Wildcard matching and regular expressions are not supported.

    • columnNamePattern: This parameter is not currently supported.

  • Results:

    • Returns a ResultSet object.

    • You can use the next() method to retrieve results row by row. Each row contains the metadata for one table column (only column name and data type are supported).

  • Get column name: resultSet.getString("column_name").

  • Get column data type: resultSet.getString("data_type").
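Per the parameter notes above, getTables takes "default" as the catalog and the project (or data domain) name as schemaPattern. The sketch below lists tables and columns over an open connection; the helper only packages the documented argument values, and the class name is illustrative:

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

public class MetadataLookup {
    // Package the documented getTables arguments: catalog defaults to "default",
    // schemaPattern is the project or data domain name, types is unsupported (null).
    static String[] tablesArgs(String projectOrDomain, String tablePattern) {
        return new String[] { "default", projectOrDomain, tablePattern, null };
    }

    // List table names in a project (assumes an open Dataphin connection).
    static void listTables(Connection conn, String project) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        String[] a = tablesArgs(project, "%"); // wildcard match, not a regex
        try (ResultSet rs = meta.getTables(a[0], a[1], a[2], null)) {
            while (rs.next()) {
                System.out.println(rs.getString("table_name"));
            }
        }
    }

    // List column names and types of one table (full table name, no wildcards).
    static void listColumns(Connection conn, String project, String table) throws SQLException {
        try (ResultSet rs = conn.getMetaData().getColumns(project, null, table, null)) {
            while (rs.next()) {
                System.out.println(rs.getString("column_name") + " : " + rs.getString("data_type"));
            }
        }
    }
}
```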

Examples

Use the Dataphin JDBC driver to retrieve catalog information.

1. Get a list of tables

Gets a list of the tables and views in a project, including Physical Tables, Physical Views, Logical Tables, and Logical Views.

Syntax

SHOW TABLES
    [FROM db_name]
    [LIKE 'pattern']

Parameters

db_name:

  • Dataphin project name: Lists the Physical Tables in this project.

  • For development projects and data domains, you must explicitly add the _Dev suffix.

  • If db_name is not specified, the tables in the project_name set in the URL are listed by default.

Results

Name                      Type             Comment
dim_user                  Logical Table    User table.
ods_user                  Physical Table   User source table.
ods_user_logical_view     Logical View     Logical view.
ods_user_physical_view2   Physical View    Physical view.
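The SHOW TABLES statement above can be built and executed like any other query. The builder below handles the optional FROM and LIKE clauses; the execution helper assumes an open connection, and the result column labels (Name, Type) follow the result table above:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ShowTables {
    // Build SHOW TABLES with optional FROM db_name and LIKE 'pattern' clauses.
    static String build(String dbName, String likePattern) {
        StringBuilder sql = new StringBuilder("SHOW TABLES");
        if (dbName != null) {
            sql.append(" FROM ").append(dbName);
        }
        if (likePattern != null) {
            sql.append(" LIKE '").append(likePattern).append("'");
        }
        return sql.toString();
    }

    // Execute and print Name/Type for each table (assumes an open connection).
    static void run(Connection conn, String dbName) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(build(dbName, null))) {
            while (rs.next()) {
                System.out.println(rs.getString("Name") + " | " + rs.getString("Type"));
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(build(null, null));
        // Development projects need the explicit _Dev suffix, per the parameter notes.
        System.out.println(build("projectname_Dev", "ods_%"));
    }
}
```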

2. Get the structure of a table

Gets the field details of a Physical Table or Physical View.

Syntax

{DESCRIBE | DESC} table_name;

Note

Only Physical Tables and Physical Views are supported.

Parameters

table_name: The name of the Physical Table or Physical View.

Results

Name   Type     Comment
ID     BigInt   User ID.
Name   String   User name.
DS     String   Partition time.

Server-side connection control

  • Maximum connections: The default is 100.

  • Idle connection timeout: The default is 288,000 seconds (80 hours). If a connection is idle for this period, it is automatically closed, and subsequent operations on the connection fail.

JDBC task information in MaxCompute

When a Dataphin JDBC task is submitted to a MaxCompute instance, Dataphin-related information is also passed to the instance.

logical_project: The name of the Dataphin project where the JDBC task is executed.

EXT_JDBC_TASKRUN_ID: The task ID of the JDBC task.

EXT_DPN_TENANT_ID: The ID of the Dataphin tenant where the JDBC task is executed.

EXT_PLATFORM_ID: The ID of the upper-level platform that submitted the task to MaxCompute. The default is Dataphin.

biz_id: The Dataphin member ID of the user who executed the JDBC task.

odps.idata.userenv: User environment information, including the Java SDK revision, Java version, IP address, and device MAC address. For example:

JavaSDK Revision:fcedc4d,Version:0.37.6,JavaVersion:1.8.0_152,IP:11.**.***.**,MAC:00-**-**-**-**-25.

You can use this information for scenarios such as analyzing MaxCompute bills and viewing job execution times. For more information, see Collect statistics on TOP N cost accounts and time-consuming jobs.