This topic describes how to use ApsaraDB for SelectDB with a Java Database Connectivity (JDBC) data source. This connection lets you perform federated analysis on JDBC-compatible data sources.
Overview
A JDBC Catalog lets you connect to other data sources through the standard JDBC protocol. After a connection is established, SelectDB automatically synchronizes database and table metadata from the data source. This allows for quick access to external data.
SelectDB supports the following JDBC data sources: MySQL, PostgreSQL, Oracle, SQLServer, ClickHouse, Doris, SAP HANA, Trino/Presto, and OceanBase.
Prerequisites
Ensure that all nodes in the data source cluster and the SelectDB instance can communicate with each other over the network.
All nodes in the data source cluster and the SelectDB instance are in the same Virtual Private Cloud (VPC). If they are not, you must first resolve the network connectivity issue. For more information, see How do I resolve network connectivity issues between a SelectDB instance and a data source?
The IP addresses of all nodes in the data source cluster are added to the whitelist of the SelectDB instance. For more information, see Configure a whitelist.
If the source cluster uses a whitelist, add the CIDR block of the SelectDB instance to the source cluster's whitelist.
To obtain the IP address of the SelectDB instance in the VPC to which the SelectDB instance belongs, you can perform the operations provided in How do I view the IP addresses in the VPC to which my ApsaraDB SelectDB instance belongs?
To obtain the public IP address of the SelectDB instance, you can run the ping command to access the public endpoint of the SelectDB instance and obtain the IP address of the instance.
You have a basic understanding of catalogs and their operations. For more information, see Data lakehouse.
Syntax
CREATE CATALOG <catalog_name>
PROPERTIES ("key"="value", ...)Parameters
Parameter | Required | Default value | Description |
user | Yes | None | The username for the database account. |
password | Yes | None | The password for the database account. |
jdbc_url | Yes | None | The JDBC connection string. |
driver_url | Yes | None | The name of the JDBC driver JAR file. |
driver_class | Yes | None | The class name of the JDBC driver. |
lower_case_table_names | No | "false" | Specifies whether to synchronize the database and table names from the external JDBC data source in lowercase. |
only_specified_database | No | "false" | Specifies whether to synchronize only specified databases. |
include_database_list | No | "" | If |
exclude_database_list | No | "" | If |
Driver package path
When you create a JDBC catalog, you must specify the path of the driver package for the corresponding database. You can specify the driver_url parameter in one of the following ways:
File name. For example,
mysql-connector-java-8.0.25.jar. SelectDB automatically searches for the corresponding JAR file in the localjdbc_drivers/directory. By default, this directory contains the following four common driver packages that are available for use:mysql-connector-java-8.0.25.jar,postgresql-42.5.1.jar,mssql-jdbc-11.2.3.jre8.jar, andojdbc8.jar.HTTP URL. For example,
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar. SelectDB downloads the driver file from this URL. Only HTTP services that do not require authentication are supported.
Lowercase table name synchronization
When lower_case_table_names is set to true, SelectDB can query databases and tables with uppercase or mixed-case names by maintaining a mapping from lowercase names to the actual names in the remote system.
SelectDB 2.X supports only Oracle databases and transforms all database and table names to uppercase when executing a query. For example,
In SelectDB 2.X, this feature is effective only for Oracle databases. When you run a query, all database and table names are converted to uppercase before the query is sent to Oracle. For example, if an Oracle database has a table named TEST in the TEST schema and you set
lower_case_table_names=truewhen you create the catalog in SelectDB, you can query the TEST table by runningSELECT * FROM oracle_catalog.test.test. SelectDB automatically convertstest.testtoTEST.TESTbefore sending the query to Oracle. Note that this is the default behavior, which means you cannot query tables with lowercase names in Oracle.For other databases, you must specify the actual database and table names in your queries.
In SelectDB 3.X and later, this feature is effective for all databases. When you run a query, all database and table names are converted to their actual names before the query is executed. If you upgrade to version 3.X from an earlier version, you must run
Refresh <catalog_name>for this change to take effect.However, if database or table names differ only in case, such as
SelectDBandselectdb, SelectDB cannot query them because of the ambiguity.When the
lower_case_table_namesparameter of the frontend (FE) is set to 1 or 2, thelower_case_table_namesparameter for the JDBC catalog must be set to true. If thelower_case_table_namesparameter of the FE is set to 0, the parameter for the JDBC catalog can be set to true or false, and the default value is false. This ensures consistency and predictability when SelectDB handles internal and foreign table configurations.
Synchronize specified databases
The following parameters are related to synchronizing specified databases:
only_specified_database: When you connect through JDBC, you can specify which database or schema to connect to. For example, you can specify a database in thejdbc_urlfor MySQL, or specify `currentSchema` in thejdbc_urlfor PostgreSQL.include_database_list: This parameter takes effect only whenonly_specified_database=true. It specifies the databases to synchronize. Separate multiple database names with commas (,). Database names are case-sensitive.exclude_database_list: This parameter takes effect only whenonly_specified_database=true. It specifies the databases to exclude from synchronization. Separate multiple database names with commas (,). Database names are case-sensitive.
If a database is configured in both
include_database_listandexclude_database_list,exclude_database_listtakes precedence.If you use these parameters to connect to an Oracle database, you must use
ojdbc8.jaror a later version of the JAR package.
Examples
MySQL
Example
CREATE CATALOG jdbc_mysql PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
"driver_url" = "mysql-connector-java-8.0.25.jar",
"driver_class" = "com.mysql.cj.jdbc.Driver"
)Level mapping
SelectDB | MySQL |
Catalog | MySQL Server |
Database | Database |
Table | Table |
Type mapping
MYSQL Type | SelectDB Type | Notes |
BOOLEAN | TINYINT | None |
TINYINT | TINYINT | None |
SMALLINT | SMALLINT | None |
MEDIUMINT | INT | None |
INT | INT | None |
BIGINT | BIGINT | None |
UNSIGNED TINYINT | SMALLINT | SelectDB does not have an UNSIGNED data type, so the range is expanded to the next level. |
UNSIGNED MEDIUMINT | INT | SelectDB does not have an UNSIGNED data type, so the range is expanded to the next level. |
UNSIGNED INT | BIGINT | SelectDB does not have an UNSIGNED data type, so the range is expanded to the next level. |
UNSIGNED BIGINT | LARGEINT | None |
FLOAT | FLOAT | None |
DOUBLE | DOUBLE | None |
DECIMAL | DECIMAL | None |
UNSIGNED DECIMAL(p,s) | DECIMAL(p+1,s) / STRING | If |
DATE | DATE | None |
TIMESTAMP | DATETIME | None |
DATETIME | DATETIME | None |
YEAR | SMALLINT | None |
TIME | STRING | None |
CHAR | CHAR | None |
VARCHAR | VARCHAR | None |
JSON | JSON | None |
SET | STRING | None |
BIT | BOOLEAN/STRING | BIT(1) is mapped to BOOLEAN. Other BIT types are mapped to STRING. |
TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT | STRING | None |
BLOB, MEDIUMBLOB, LONGBLOB, TINYBLOB | STRING | None |
TINYSTRING, STRING, MEDIUMSTRING, LONGSTRING | STRING | None |
BINARY, VARBINARY | STRING | None |
Other | UNSUPPORTED | None |
PostgreSQL
Example
CREATE CATALOG jdbc_postgresql PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:postgresql://127.0.0.1:5432/demo",
"driver_url" = "postgresql-42.5.1.jar",
"driver_class" = "org.postgresql.Driver"
);Level mapping
When you map PostgreSQL, a SelectDB database corresponds to a schema in the specified PostgreSQL catalog, such as the schemas in the demo database from the preceding example's jdbc_url. A table in a SelectDB database corresponds to a table in a PostgreSQL schema. The mapping is as follows:
SelectDB | PostgreSQL |
Catalog | Database |
Database | Schema |
Table | Table |
SelectDB runs the SQL statement SELECT nspname FROM pg_namespace WHERE has_schema_privilege('<UserName>', nspname, 'USAGE'); to retrieve all schemas that the PostgreSQL user can access and maps them to SelectDB databases.
Type mapping
POSTGRESQL Type | SelectDB Type | Notes |
boolean | BOOLEAN | None |
smallint/int2 | SMALLINT | None |
integer/int4 | INT | None |
bigint/int8 | BIGINT | None |
decimal/numeric | DECIMAL | None |
real/float4 | FLOAT | None |
double precision | DOUBLE | None |
smallserial | SMALLINT | None |
serial | INT | None |
bigserial | BIGINT | None |
char | CHAR | None |
varchar/text | STRING | None |
timestamp | DATETIME | None |
date | DATE | None |
json/jsonb | JSON | None |
time | STRING | None |
interval | STRING | None |
point/line/lseg/box/path/polygon/circle | STRING | None |
cidr/inet/macaddr | STRING | None |
bit | BOOLEAN/STRING | bit(1) is mapped to BOOLEAN. Other bit types are mapped to STRING. |
uuid | STRING | None |
Other | UNSUPPORTED | None |
Oracle
Example
CREATE CATALOG jdbc_oracle PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:oracle:thin:@127.0.0.1:1521:helowin",
"driver_url" = "ojdbc8.jar",
"driver_class" = "oracle.jdbc.driver.OracleDriver"
);Level mapping
When you map Oracle, a SelectDB database corresponds to a user in Oracle. A table in a SelectDB database corresponds to a table that the user has permissions to access in Oracle. The mapping is as follows:
SelectDB | Oracle |
Catalog | Database |
Database | User |
Table | Table |
Synchronization of Oracle SYNONYM TABLEs is not supported.
Type mapping
ORACLE Type | SelectDB Type | Notes |
number(p) / number(p,0) | TINYINT/SMALLINT/INT/BIGINT/LARGEINT | SelectDB selects the corresponding type based on the value of p:
|
number(p,s),[ if(s>0 && p>s) ] | DECIMAL(p,s) | None |
number(p,s),[ if(s>0 && p < s) ] | DECIMAL(s,s) | None |
number(p,s),[ if(s<0) ] | TINYINT/SMALLINT/INT/BIGINT/LARGEINT | If |
number | None | SelectDB does not currently support Oracle types where p and s are not specified. |
decimal | DECIMAL | None |
float/real | DOUBLE | None |
DATE | DATETIME | None |
TIMESTAMP | DATETIME | None |
CHAR/NCHAR | STRING | None |
VARCHAR2/NVARCHAR2 | STRING | None |
LONG/ RAW/ LONG RAW/ INTERVAL | STRING | None |
Other | UNSUPPORTED | None |
SQLServer
Example
If you use SelectDB 3.0.8 or later, you must include the encrypt=false parameter in the jdbc_url connection string to ensure that you can access data in SQLServer.
CREATE CATALOG jdbc_sqlserver PROPERTIES (
"type"="jdbc",
"user"="SA",
"password"="SelectDB123456",
"jdbc_url" = "jdbc:sqlserver://localhost:1433;DataBaseName=SelectDB_test;encrypt=false",
"driver_url" = "mssql-jdbc-11.2.3.jre8.jar",
"driver_class" = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
);Level mapping
When you map SQLServer, a SelectDB database corresponds to a schema in the specified SQLServer database, such as a schema in the SelectDB_test database from the preceding example's jdbc_url. A table in a SelectDB database corresponds to a table in a SQLServer schema. The mapping is as follows:
SelectDB | SQLServer |
Catalog | Database |
Database | Schema |
Table | Table |
Type mapping
SQLServer Type | SelectDB Type |
bit | BOOLEAN |
tinyint | SMALLINT |
smallint | SMALLINT |
int | INT |
bigint | BIGINT |
real | FLOAT |
float | DOUBLE |
money | DECIMAL(19,4) |
smallmoney | DECIMAL(10,4) |
decimal/numeric | DECIMAL |
date | DATE |
datetime/datetime2/smalldatetime | DATETIMEV2 |
char/varchar/text/nchar/nvarchar/ntext | STRING |
binary/varbinary | STRING |
time/datetimeoffset | STRING |
Other | UNSUPPORTED |
Doris
The SelectDB JDBC Catalog also supports connections to Doris databases.
Example
CREATE CATALOG jdbc_doris PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:mysql://127.0.0.1:9030?useSSL=false",
"driver_url" = "mysql-connector-java-8.0.25.jar",
"driver_class" = "com.mysql.cj.jdbc.Driver"
)Type mapping
Doris Type | JDBC Catalog SelectDB Type | Notes |
BOOLEAN | BOOLEAN | None |
TINYINT | TINYINT | None |
SMALLINT | SMALLINT | None |
INT | INT | None |
BIGINT | BIGINT | None |
LARGEINT | LARGEINT | None |
FLOAT | FLOAT | None |
DOUBLE | DOUBLE | None |
DECIMALV3 | DECIMALV3/STRING | The type is selected based on the precision and scale of the DECIMAL field. |
DATE | DATE | None |
DATETIME | DATETIME | None |
CHAR | CHAR | None |
VARCHAR | VARCHAR | None |
STRING | STRING | None |
TEXT | STRING | None |
HLL | HLL | To query HLL, set |
Array | Array | The internal type mapping for Array follows the logic of the types listed above. Nested complex types are not supported. |
BITMAP | BITMAP | To query BITMAP, set |
Other | UNSUPPORTED | None |
ClickHouse
Creation example
CREATE CATALOG jdbc_clickhouse PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:clickhouse://127.0.0.1:8123/demo",
"driver_url" = "clickhouse-jdbc-0.4.2-all.jar",
"driver_class" = "com.clickhouse.jdbc.ClickHouseDriver"
);Level mapping
SelectDB | ClickHouse |
Catalog | ClickHouse Server |
Database | Database |
Table | Table |
Type mapping
ClickHouse Type | SelectDB Type |
Bool | BOOLEAN |
String | STRING |
Date/Date32 | DATE |
DateTime/DateTime64 | DATETIME |
Float32 | FLOAT |
Float64 | DOUBLE |
Int8 | TINYINT |
Int16/UInt8 | SMALLINT |
Int32/UInt16 | INT |
Int64/Uint32 | BIGINT |
Int128/UInt64 | LARGEINT |
Int256/UInt128/UInt256 | STRING |
DECIMAL | DECIMALV3/STRING |
Enum/IPv4/IPv6/UUID | STRING |
Array | ARRAY |
Other | UNSUPPORTED |
SAP HANA
Example
CREATE CATALOG jdbc_hana PROPERTIES (
"type"="jdbc",
"user"="SYSTEM",
"password"="SAPHANA",
"jdbc_url" = "jdbc:sap://localhost:31515/TEST",
"driver_url" = "ngdbc.jar",
"driver_class" = "com.sap.db.jdbc.Driver"
)Level mapping
SelectDB | SAP HANA |
Catalog | Database |
Database | Schema |
Table | Table |
Type mapping
SAP HANA Type | SelectDB Type |
BOOLEAN | BOOLEAN |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
SMALLDECIMAL | DECIMALV3 |
DECIMAL | DECIMALV3/STRING |
REAL | FLOAT |
DOUBLE | DOUBLE |
DATE | DATE |
TIME | STRING |
TIMESTAMP | DATETIME |
SECONDDATE | DATETIME |
VARCHAR | STRING |
NVARCHAR | STRING |
ALPHANUM | STRING |
SHORTTEXT | STRING |
CHAR | CHAR |
NCHAR | CHAR |
OceanBase
Example
CREATE CATALOG jdbc_oceanbase PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:oceanbase://127.0.0.1:2881/demo",
"driver_url" = "oceanbase-client-2.4.2.jar",
"driver_class" = "com.oceanbase.jdbc.Driver"
)When SelectDB connects to OceanBase, it automatically detects whether OceanBase is in MySQL or Oracle mode. For information about level mapping and type mapping, see the MySQL and Oracle sections in this topic.
Query data
Example
SELECT * FROM mysql_catalog.mysql_database.mysql_table WHERE k1 > 1000 AND k3 ='term';A field name may be a keyword reserved by the database. To ensure that SelectDB can still execute queries correctly in this case, SelectDB automatically adds escape characters to field and table names in SQL statements based on the standards of each database. For example, backticks (``) are used for MySQL, double quotation marks ("") for PostgreSQL and Oracle, and square brackets ([]) for SQLServer. This may cause field names to become case-sensitive. You can run the EXPLAIN SQL command to view the escaped query statement sent to the remote database.
Predicate pushdown
When you run a query with a WHERE clause, such as WHERE dt = '2022-01-01', SelectDB can push these filter conditions down to the external data source. This process filters out unqualified data at the source, reducing unnecessary data retrieval and network transfer. This greatly improves query performance and reduces the load on the external data source.
When the enable_func_pushdown session variable is set to true, SelectDB also pushes down functions in the WHERE clause to the external data source. This feature is currently supported only for MySQL. If you use functions that are not supported by MySQL, you can set this parameter to false. SelectDB automatically identifies some functions that are not supported by MySQL and filters them from the pushdown conditions. You can run the EXPLAIN SQL command to view the specific query statement.
The following functions are not supported for pushdown: DATE_TRUNC and MONEY_FORMAT.
Row count limit
If a query contains the LIMIT keyword, SelectDB translates it into the appropriate semantics for different data sources.
Write data
After you create a JDBC catalog in SelectDB, you can write data using the INSERT INTO or INSERT INTO...SELECT statement. You can also write query results from SelectDB to a JDBC catalog or import data from one JDBC catalog to another.
The INSERT INTO statement is inefficient for writing large amounts of data. You should use the INSERT INTO...SELECT statement instead.
Example
INSERT INTO mysql_catalog.mysql_database.mysql_table VALUES(1, "doris");
INSERT INTO mysql_catalog.mysql_database.mysql_table SELECT * FROM table;Transactions
Data from SelectDB is written to a JDBC catalog in batches. If an import is interrupted, previously written data may need to be rolled back. To handle this, the JDBC Catalog supports transactions for data writes. To enable transaction support, you can set the enable_odbc_transcation session variable.
SET enable_odbc_transcation = TRUE; Transactions ensure the atomicity of data writes to JDBC foreign tables but may slightly decrease write performance. You should enable this feature only when necessary.