edit-icon download-icon

Configure DRDS reader

Last Updated: Apr 17, 2018

The Drds Reader plug‑in provides the ability to read data from DRDS (distributed RDS). At the underlying implementation level, Drds Reader connects to a remote DRDS database through JDBC and runs corresponding SQL statements to select data from the DRDS database.

Currently, Drds plug‑in is only adapted to the Mysql engine. Drds is a distributed Mysql database, and most of the communication protocols are applicable to Mysql use cases.

Specifically, Drds Reader connects to a remote Drds database through the JDBC connector. The SELECT SQL query statements are generated and sent to the remote DRDS database based on your configurations. Then, the SQL statements are run and the returned results are assembled into abstract datasets by using the custom data types of data synchronization. Datasets are passed to the downstream writer for processing.

Drds Reader concatenates the table, column, and WHERE information you configured into SQL statements and sends them to the Drds database. Unlike the Mysql database, Drds as a distributed database is unable to adapt to all Mysql protocols, and does not support complex statements, such as Join.

Drds Reader supports most data types in MySQL. Check whether your data type is supported.

Drds Reader converts Mysql data types as follows.

Category MySQL data type
Integer int, tinyint, smallint, mediumint, int, bigint
Floating point float, double, decimal
String varchar, char, tinytext, text, mediumtext, longtext
Date and time date, datetime, timestamp, time, year
Boolean bit, bool
Binary tinyblob, mediumblob, blob, longblob, varbinary

Note: Apart from the field types listed here, other types are not supported.

Parameter description

Parameter Description Required Default value
datasource Data source name. It must be identical to the data source name added. Adding data source is supported in script mode. Yes None
table The table selected for extraction. Yes None
column The column name set to be synchronized in the configured table. Field information is described with arrays in JSON. [‘*’] indicates all columns by default.
- Column pruning is supported, which means you can select some columns to export. - Change of column order is supported, which means you can export the columns in an order different from the schema order of the table.
- Constant configuration is supported. You must follow the Mysql SQL syntax format, for example [“id”, “`table`“, “1”, “‘bazhen.csy’”, “null”, “to_char(a + 1)”, “2.3” , “true”], where id refers to the ordinary column name, `table` is the name of the column containing reserved words, 1 is an integer constant, ‘bazhen.csy’ is a string constant, null refers to null pointer, CHARLENGTH(s) is the function expression to calculate the string length, 2.3 is a floating point, and true is a boolean value.
- Column must contain the specified column set to be synchronized and it cannot be blank.
Yes None
where Filtering condition. Drds Reader concatenates an SQL command based on specified column, table, and WHERE conditions and extracts data according to the SQL. For example, you can set the where condition during a test. In actual business scenarios, the data on the current day is usually required to be synchronized, in which case you can set the where condition as STRTODATE(‘${bdp.system.bizdate}’, ‘%Y%m%d’) <= taday AND taday < DATEADD(STRTODATE(‘${bdp.system.bizdate}’, ‘%Y%m%d’), interval 1 day).
- The where condition can be effectively used for incremental synchronization.
- If the where condition is not set or is left null, full table data synchronization is applied.
No None

Development in wizard mode

WizardModeDRDSReader

  • Data Sources: Select drds.

  • Table: Select a table to be synchronized.

  • Data Filtering: You can enter a WHERE filtering statement (with WHERE keyword not required) based on the corresponding SQL syntax. This filtering condition can be used as an incremental synchronization criterion.

  • the split button: In this case, column splitting is not required. The efficiency of data synchronization is improved if splitting is implemented according to topology rules.

Development in script mode

Configure a job to synchronously extract data from a Drds database.

  1. {
  2. "type": "job",
  3. "traceId": "You can enter a trace ID for your job here. We recommend you to organize your trace ID in the following format: service name + job ID",
  4. "version": "1.0",
  5. "configuration": {
  6. "reader": {
  7. "plugin": "drds",
  8. "parameter": {
  9. "datasource": "datasourceName",
  10. "table": "tableName",
  11. "column": [
  12. "id",
  13. "age",
  14. "name"
  15. ],
  16. "where": "id > 100"
  17. }
  18. },
  19. "writer": {
  20. }
  21. }
  22. }

Supplemental instructions

Consistency view

As a distributed database, Drds cannot provide a consistency view of multiple tables in multiple databases. Different from MySQL where data is synchronized in a single table of one database, Drds Reader cannot extract the snapshot of database/table sharding at the same time slice, that is to say, Drds Reader obtains different snapshots of table shards when extracting data from different underlying table shards. Therefore, strong consistency cannot be ensured.

Database encoding

Drds provides flexible encoding options, including database‑level, table‑level, and field‑level encoding. Different encodings can also be configured. The priority (from high to low) is field, table, database, and instance. We recommend that you use UTF-8 for database encoding at the database level.

Drds Reader extracts data by using JDBC at the underlying level. JDBC is applicable to all types of encodings and can complete transcoding at the underlying level. Therefore, Drds Reader can identify the encoding and complete transcoding automatically without the need to specify the encoding.

Drds Reader cannot identify the inconsistency between the encoding written to the underlying layer of Drds and the configured encoding, nor provide a solution. Due to this issue, the exported codes may contain garbage codes.

Incremental data synchronization

Since Drds Reader extracts data by using JDBC SELECT statements, you can extract incremental data by using the SELECT…WHERE…conditions in the following ways.

  • When database online applications write data into the database, the modify field is filled with the modification timestamp, including addition, update, and deletion (logical deletion). For this type of applications, Drds Reader only requires the WHERE condition followed by the timestamp of the last synchronization phase.

  • For new streamline data, Drds Reader requires the WHERE condition followed by the maximum auto-increment ID of the last synchronization phase.

In case that no field is provided for the business to identify the addition or modification of data, Drds Reader cannot perform incremental data synchronization and can only perform full data synchronization.

SQL security

Drds Reader provides querySql statements for you to select data by yourself. Drds Reader conducts no security verification on querySql. The security during use is ensured by data synchronization users.

Thank you! We've received your feedback.