Advanced options

Last Updated: Oct 11, 2019

When you create a schema by using the wizard, you can customize settings in the advanced options, such as filtering by field or table and controlling the number of connections used for table synchronization.

Filter by field

Setting method: sensitive-columns=<table_name>.<column_name>...<table_name>.<column_name>. To specify multiple fields, separate them with commas (,).

For example, sensitive-columns=tbl01.col1,tbl01.col2,tbl02.col3 indicates that col1 and col2 in tbl01 and col3 in tbl02 are sensitive fields. col1, col2, and col3 are not synchronized to Object Storage Service (OSS) during schema creation.
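The effect of this option can be illustrated with a minimal Python sketch. It is not DLA's implementation: the helper names parse_sensitive_columns and columns_to_sync are hypothetical, and the sketch only assumes that each entry has the form <table_name>.<column_name> as described above.

```python
from collections import defaultdict


def parse_sensitive_columns(value):
    """Parse 'tbl.col,tbl.col' into a dict of {table_name: {column_name, ...}}."""
    result = defaultdict(set)
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        table, sep, column = item.partition(".")
        if not sep or not table or not column:
            raise ValueError(f"expected <table_name>.<column_name>, got {item!r}")
        result[table].add(column)
    return dict(result)


def columns_to_sync(table, columns, sensitive):
    """Return the columns of `table` that are not marked as sensitive."""
    skipped = sensitive.get(table, set())
    return [c for c in columns if c not in skipped]


sensitive = parse_sensitive_columns("tbl01.col1,tbl01.col2,tbl02.col3")
# The column list below is made up for illustration.
print(columns_to_sync("tbl01", ["id", "col1", "col2", "col4"], sensitive))
# prints ['id', 'col4']
```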

Synchronize only some tables

Setting method: include-tables=<table_name>, where table_name is either a plain table name or a table name that contains the wildcard %.

For example, include-tables=tbl01,view_% indicates that only the tbl01 table and the tables prefixed with view_ are synchronized.

Filter by table

Setting method: exclude-tables=<table_name>, where table_name is either a plain table name or a table name that contains the wildcard %.

For example, exclude-tables=tbl01,view_% indicates that the tbl01 table and all tables prefixed with view_ are not synchronized.

Note:

  • We recommend that you configure either include-tables or exclude-tables, not both.

  • When both include-tables and exclude-tables are configured, exclude-tables takes precedence over include-tables.
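The interaction between include-tables and exclude-tables can be sketched in Python as follows. This is an illustration under stated assumptions, not DLA's actual matching code: it assumes that % matches any sequence of characters (as in SQL LIKE), that every other character, including the underscore, is literal, and that all tables are synchronized when neither option is set. The function names are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Turn a name pattern with the % wildcard into a compiled regex.

    Assumption: % matches any sequence of characters; everything else is literal.
    """
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(parts))


def matches_any(table, patterns):
    """Return True if the whole table name matches at least one pattern."""
    return any(like_to_regex(p).fullmatch(table) for p in patterns)


def should_sync(table, include=None, exclude=None):
    """Decide whether a table is synchronized; exclude wins over include."""
    if exclude and matches_any(table, exclude):
        return False
    if include:
        return matches_any(table, include)
    return True  # assumption: no filter means every table is synchronized


tables = ["tbl01", "tbl02", "view_a", "view_b"]
selected = [
    t for t in tables
    if should_sync(t, include=["tbl01", "view_%"], exclude=["view_b"])
]
print(selected)  # prints ['tbl01', 'view_a']
```

Checking exclude-tables first is what makes it take precedence: view_b matches the include pattern view_% but is still skipped.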

Specify the number of connections used for single table synchronization

By default, Data Lake Analytics (DLA) uses 20 connections to synchronize data. If an ApsaraDB for RDS (RDS) table has a numeric auto-increment primary key and contains a large amount of data, you can set the number of connections used to synchronize that table.

Setting method: connections-per-job=<number of connections>.

For example, connections-per-job=100.

Set the total number of connections

You can set the total number of connections used for data synchronization in DLA, to prevent synchronization tasks from using all connections and affecting other tasks.

Setting method: total-allowed-connections=<number of connections>. Use this option together with connections-per-job=<number of connections>.

For example, the following sample indicates that each synchronization task uses 100 connections and that all tasks together use at most 1,000 connections. In this case, DLA can synchronize 10 tables at a time.

  connections-per-job=100
  total-allowed-connections=1000
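The arithmetic behind "10 tables at a time" can be checked with a short Python sketch. It assumes one option per line in key=value form, as in the sample, and rounds down when the total is not an exact multiple of the per-job value; that rounding is an assumption, not documented behavior. The helper names are hypothetical.

```python
def parse_options(text):
    """Parse 'key=value' lines into a dict of strings."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {line!r}")
        options[key.strip()] = value.strip()
    return options


def max_concurrent_tables(options):
    """Number of tables that fit within the total connection budget."""
    per_job = int(options["connections-per-job"])
    total = int(options["total-allowed-connections"])
    if per_job <= 0:
        raise ValueError("connections-per-job must be positive")
    return total // per_job  # assumption: partial jobs are not started


sample = """
connections-per-job=100
total-allowed-connections=1000
"""
print(max_concurrent_tables(parse_options(sample)))  # prints 10
```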