This topic describes how to develop an efficient data write program by using the Client SDK. This program enables you to write data into AnalyticDB for PostgreSQL by calling API operations.

The Client SDK supports both custom development and integration of data write programs. A custom data write program developed by using the Client SDK simplifies the data write process with its internal mechanisms such as parallel processing and performs multiple times better than COPY and INSERT statements. You do not need to worry about connection pool or cache issues.

Note The Client SDK used with AnalyticDB for PostgreSQL is designed for efficient data writes. It does not read or process raw data.

Maven repositories

You can use Maven to configure the version of the Client SDK. The configuration of Maven is as follows:

<dependency>
  <groupId>com.alibaba.cloud.analyticdb</groupId>
  <artifactId>adb4pgclient</artifactId>
  <version>1.0.0</version>
</dependency>
Note
  • The Client SDK used with AnalyticDB for PostgreSQL depends on the following packages: druid (version 1.1.17), PostgreSQL JDBC (version 42.2.5), commons-lang3 (version 3.4), slf4j-api (version 1.7.24), and slf4j-log4j12 (version 1.7.24).
  • If the system prompts version conflicts with the Client SDK, you must verify that you are using the correct versions of these packages.

Related operations

Table 1. Operations used for the DatabaseConfig class
Operation Description
setHost(String adbHost) Specifies the endpoint used to connect to the AnalyticDB for PostgreSQL database.
setPort(int port) Specifies the port used to connect to the AnalyticDB for PostgreSQL database. The default port is 3432.
setDatabase(String database) Specifies the name of the AnalyticDB for PostgreSQL database.
setUser(String username) Specifies the username used to log on to the AnalyticDB for PostgreSQL database.
setPassword(String pwd) Specifies the password used to log on to the AnalyticDB for PostgreSQL database.
addTable(List<String> table, String schema) Specifies the tables into which you want to insert data. This operation can be called multiple times. Each time you can only specify tables that use the same schema. However, after you use the DatabaseConfig class to create an Adb4pgClient object, this operation no longer takes effect.
setColumns(List<String> columns, String tableName, String schemaName) Specifies the columns you want to insert into a table. If you want to insert all columns into a table, set the columns parameter to columnList.add("*") for that table. You must specify columns for all tables listed in the table parameter. Otherwise, the operation fails the system check.
setInsertIgnore(boolean insertIgnore) Specifies whether to ignore the rows with conflicted primary keys. This operation takes effect on all specified tables. Default value: true.
setEmptyAsNull(boolean emptyAsNull) Specifies whether to set empty values to null. This operation takes effect on all specified tables. Default value: false.
setParallelNumber(int parallelNumber) Specifies the maximum number of concurrent threads used to write data into the AnalyticDB for PostgreSQL database. This operation takes effect on all specified tables. Default value: 4. In normal cases, we recommend that you retain the default value.
setLogger(Logger logger) Specifies the logger used by the Adb4pgClient object. In AnalyticDB for PostgreSQL, slf4j.Logger is used.
setRetryTimes(int retryTimes) Specifies the maximum number of retries to commit data in the event of an error. Default value: 3.
setRetryIntervalTime(long retryIntervalTime) Specifies the interval between every two retries. Unit: milliseconds. Default value: 1000.
setCommitSize(long commitSize) Specifies the volume of data automatically committed. Unit: bytes. Default value: 1000000. In normal cases, we recommend that you retain the default value.
Table 2. Operations used for the Row class
Operation Description
setColumn(int index, Object value) Specifies the value of a column in the row. You can call this operation multiple times based on the sequence of columns in the row to specify multiple columns. This operation does not support the reuse of row objects. Each data record must be associated with a specific row object.
setColumnValues(List<Object> values) Specifies the values of multiple columns in the row.
updateColumn(int index, Object value) Updates the value of a column in the row. This operation allows you to reuse a row object by updating the data in it.
Table 3. Operations used for the Adb4pgClient class
Operation Description
addRow(Row row, String tableName, String schemaName)/addRows(List<Row> rows, String tableName, String schemaName) Inserts one or more row-formatted data records into a table. The data records are stored in the buffer of the Client SDK until they are committed. If the volume of data records stored in the buffer reaches the value of the commitSize parameter, the system automatically commits the data records when you call the addRow or addRows operation. If the automatic commit fails, the system reports an error that contains the failed data records. You must handle the error based on the error information.
addMap(Map<String, String> dataMap,String tableName, String schemaName)/addMaps(List<Map<String, String>> dataMaps, String tableName, String schemaName) Inserts one or more map-formatted data records into a table. The data records are stored in the buffer of the Client SDK. If the volume of data records stored in the buffer reaches the value of the commitSize parameter, the system automatically commits the data records when you call the addMap or addMaps operation. If the automatic commit fails, the system reports an error that contains the failed data records. You must handle the error based on the error information.
commit() Commits the data records stored in the buffer of the Client SDK. If the commit fails, the system reports an error that contains the failed statements. You must handle the error based on the error information.
TableInfo getTableInfo(String tableName, String schemaName) Obtains the schema of a table.
List<ColumnInfo> getColumnInfo(String tableName, String schemaName) Obtains the columns of a table. You can call the columnInfo.isNullable() operation to check whether the ColumnInfo class can be null.
stop() Releases the internal thread pools and resources used by the Adb4pgClient object after you no longer need it. If data records in the buffer of the Client SDK are not committed when you call this operation, the system reports an error. To forcibly release the internal thread pools and resources, call the forceStop() operation.
forceStop() Forcibly releases the internal thread pools and resources used by the Adb4pgClient object. After you call this operation, data records that have not been committed in the buffer of the Client SDK are lost. Therefore, we recommend that you do not call this operation unless necessary.
Connection getConnection() throws SQLException Retrieves the connection with the AnalyticDB for PostgreSQL database from the connection pool of the Adb4pgClient object. The retrieved connection works the same as a Java Database Connectivity (JDBC) connection. You can use the retrieved connection to perform write operations except COPY.
Note You must release the resources such as ResultSet, Statement, and Connection used by the Adb4pgClient object after you no longer need it.
Table 4. Operations used for the ColumnInfo class
Operation Description
boolean isNullable() Specifies whether the ColumnInfo class can be null.
Error code HTTP status code Description
COMMIT_ERROR_DATA_LIST 101 The error returned because some data records have failed to be committed.
Note You can call the e.getErrData() operation to obtain the list of failed data records (List<String>). This error may be returned for the addMap, addMaps, addRow, addRows, or commit operation. You must handle this error for each operation separately.
COMMIT_ERROR_OTHER 102 The error returned because an exception other than a commit failure has occurred.
ADD_DATA_ERROR 103 The error returned because data records have failed to be added.
CREATE_CONNECTION_ERROR 104 The error returned because a connection has failed to be established.
CLOSE_CONNECTION_ERROR 105 The error returned because a connection has failed to be terminated.
CONFIG_ERROR 106 The error returned because the configuration of the DatabaseConfig class has failed.
STOP_ERROR 107 The error returned because the Adb4pgClient object has failed to be stopped.
OTHER 999 The default error returned.

Sample code

public class Adb4pgClientUsage {
    public void demo() {
        DatabaseConfig databaseConfig = new DatabaseConfig();
        //Should set your database real host or url
        databaseConfig.setHost("100.100.100.100");
        //Should set your database real port
        databaseConfig.setPort(8888);
        //Specify the username used to log on to your AnalyticDB for PostgreSQL database.
        databaseConfig.setUser("your user name");
        //Specify the password used to log on to your AnalyticDB for PostgreSQL database.
        databaseConfig.setPassword("your password");
      //Specify the name of your AnalyticDB for PostgreSQL database.
        databaseConfig.setDatabase("your database name");
        //Specify the tables into which you want to insert data.
        List<String> tables = new ArrayList<String>();
        tables.add("your table name 1");
        tables.add("your table name 2");

        //You can call the addTable operation to specify tables that use the same schema. However, after you use the DatabaseConfig class to create an Adb4pgClient object, the specified tables cannot be changed.
        //By default, if the table schema name parameter is set to null, the public schema is used.
        databaseConfig.addTable(tables, "table schema name");

        //Specify the columns you want to insert into a table.
        List<String> columns = new ArrayList<String>();
        columns.add("column1");
        columns.add("column2");
        //If you want to insert all columns into a table, set the columns parameter to columns.add("*") for that table.
        databaseConfig.setColumns(columns, "your table name 1", "table schema name");
        databaseConfig.setColumns(Collections.singletonList("*"),"your table name 2", "table schema name");


        //If the value of column is empty, set null
        databaseConfig.setEmptyAsNull(false);
        //Specify whether to ignore the rows with conflicted primary keys.
        databaseConfig.setInsertIgnore(true);
        //Specify to retry up to three times if data fails to be committed into AnalyticDB for PostgreSQL.
        databaseConfig.setRetryTimes(3);
        //Specify a 1000-millisecond retry interval.
        databaseConfig.setRetryIntervalTime(1000);
        //Specify to initialize the Adb4pgClient object. After the Adb4pgClient object is initialized, the configuration of the DatabaseConfig class cannot be modified.
        Adb4pgClient adbClient = new Adb4pgClient(databaseConfig);

        //Specify to store data records to the buffer of the Client SDK and commit them simultaneously. For more information, see the "Precautions" section.
        for (int i = 0; i < 10; i++) {
            //Add row(s) to buffer. One instance for one record
            Row row = new Row(columns.size());
            //Set column
            //the column index must be same as the sequence of columns
            //the column value can be any type, internally it will be formatted according to column type
            row.setColumn(0, i); // Number value
            row.setColumn(1, "string value"); // String value
            //If the volume of data records in the buffer reaches the upper limit, the system automatically commits them when you call the addRow or addMap operation.
            //If the automatic commit fails, the system returns the "COMMIT_ERROR_DATA_LIST" error.
            adbClient.addRow(row, "your table name 1", "table schema name");
        }

        Row row = new Row();
        row.setColumn(0, 10); // Number value
        row.setColumn(1, "2018-01-01 08:00:00"); // Date/Timestamp/Time value
        adbClient.addRow(row, "your table name 1", "table schema name");
        //Update column. A Row object can be reused.
        row.updateColumn(0, 11);
        row.updateColumn(1, "2018-01-02 08:00:00");
        adbClient.addRow(row, "your table name 1", "table schema name");

        //Add map(s) to buffer
        Map<String, String> rowMap = new HashMap<String, String>();
        rowMap.put("t1", "12");
        rowMap.put("t2", "string value");
        //If you want to commit more than one data record, we recommend that you store these data records to the buffer of the Client SDK and commit them at a time.
        adbClient.addMap(rowMap, "your table name 2", "table schema name");

        //Commit buffer to ADS
        //Buffer is cleaned after successfully commit to ADS
        try {
            adbClient.commit();
        } catch (Exception e) {
            // TODO: Handle exception here
        } finally {
            adbClient.stop();
        }
    }

}

Precautions

  • AnalyticDB for PostgreSQL is not thread-safe. If you call more than one thread, make sure that each thread maintains its own Adb4pgClient object.
    Notice If multiple threads share one Adb4pgClient object, security issues arise and data write performance is constrained.
  • A data record is only considered to be written into AnalyticDB for PostgreSQL after it is committed.
  • If the Adb4pgClient object reports an error, you must handle the error based on the error information. If a data record cannot be committed, you can commit it again. You also have the option to skip the failed data record after you note it down.
  • When you write data into AnalyticDB for PostgreSQL, the data records may be stored to the buffer and batch committed. This increases CPU utilization. Therefore, we recommend that you maintain an appropriate number of threads and watch the garbage collection status of your application.
  • We recommend that you commit 10,000 data records at a time.
  • You must complete the configuration of the DatabaseConfig class before you create an Adb4pgClient object. After the Adb4pgClient object is created, the configuration of the DatabaseConfig class cannot be modified.
  • The Client SDK aims to optimize data writes (INSERT statements). To optimize the execution of other SQL statements, you can call the getConnection() operation to establish a JDBC connection.