All Products
Search
Document Center

OpenSearch:Configure a PolarDB for MySQL data source

Last Updated:May 31, 2023

PolarDB for MySQL is an online database service developed by Alibaba Cloud. PolarDB for MySQL is stable, reliable, and scalable, and provides out-of-box capabilities. For more information, see What is PolarDB for MySQL Enterprise Edition?

Before you purchase a PolarDB for MySQL cluster, take note of the following items:

  • OpenSearch supports only PolarDB for MySQL 5.6, 5.7, and 8.0.

  • You must use the Alibaba Cloud account that you use to log on to the OpenSearch console to create the PolarDB for MySQL cluster. Otherwise, OpenSearch cannot access this cluster.

  • The PolarDB for MySQL cluster must reside in the same region as the OpenSearch application for which you want to configure a data source.

  • By default, the binary logging feature is disabled after the PolarDB for MySQL cluster is created. In this case, you may fail to add a database in the cluster as a data source. Therefore, the binary logging feature must be enabled. To enable the binary logging feature, set the loose_polar_log_bin parameter to ON_WITH_GTID and retain the default value FULL for the binlog_row_image parameter. By default, the binlog_row_image parameter is set to FULL. The default value of the loose_polar_log_bin parameter is OFF.

  • The PolarDB for MySQL cluster supports cloned instance.

  • The PolarDB for MySQL cluster supports read/write splitting.

Supported features

  • All the data in the specified database or table can be manually pulled or automatically pulled at the specified time.

  • You can merge rows of multiple tables from one or more data sources. To merge the rows, make sure that the source tables use the same schema and that the configurations of the data processing plug-ins are the same in the data sources. In addition, the primary key values cannot be duplicated. Data with duplicate primary key values can be overwritten. This feature can be used in the following scenarios:

    • Data is synchronized from multiple source tables in a data source to a table in an application.

    • Multiple data sources are configured for a table in an application. Data is synchronized from one or more source tables in each data source to the table in the application.

  • You can use plug-ins to convert the values of data source fields.

  • You can specify filter conditions to filter full data.

  • You can use an asterisk (*) to match tables whose names have the same prefix.

Limits

  • The binlog_row_image parameter must be set to FULL for a PolarDB for MySQL cluster. To enable the binary logging feature, set the loose_polar_log_bin parameter to ON_WITH_GTID and retain the default value FULL for the binlog_row_image parameter. The default value of the loose_polar_log_bin parameter is OFF.

  • Only PolarDB for MySQL 5.6, 5.7, and 8.0 are supported.

  • You must use the Alibaba Cloud account that you use to log on to the OpenSearch console to create a PolarDB for MySQL cluster.

  • The PolarDB for MySQL cluster must reside in the same region as the OpenSearch application for which you want to configure a data source.

  • After you configure a PolarDB for MySQL data source for a standard application, you cannot use SDKs or call API operations to push incremental data to the application.

  • If you configure a PolarDB for MySQL data source for a standard application, you cannot specify filter conditions to filter the data of the data source.

  • REPLACE INTO statements are not supported.

  • You cannot use TRUNCATE or DROP statements to delete data. To delete data, use DELETE statements.

  • The password that is used to access the PolarDB for MySQL data source cannot contain the percent sign (%). Otherwise, the reindexing fails.

  • You cannot merge columns of source tables from different databases.

  • We recommend that you set both the loose_max_statement_time and connect_timeout parameters to 0. You can change the values of the parameters after reindexing or offline changes are performed and full data is synchronized.

Usage notes

  • If an ApsaraDB RDS for MySQL or PolarDB data source is attached to a PolarDB-X instance, you must specify the name of the database shard that is actually used on the PolarDB-X instance when you configure the data source. After database sharding, a database on the PolarDB-X instance has one shadow database and eight database shards. Data is randomly written to the database shards.

  • You can switch between the internal and public endpoints of a PolarDB for MySQL cluster. No traffic fee is generated when OpenSearch accesses the data of the PolarDB for MySQL cluster.

  • OpenSearch allows you to pull full data only from the primary database. We recommend that you perform reindexing and import full data during off-peak hours based on your business requirements.

  • If a source table from a PolarDB for MySQL cluster contains fields of the DATETIME or TIMESTAMP format, the values of these fields are converted to the values in milliseconds. You must set the type of destination fields in your application to TIMESTAMP.

  • After full data is synchronized, the documents that do not meet the filter conditions are filtered out. If specific documents have the same primary key values as those in the destination table of an application, these documents are also filtered out.

  • If no incremental data is generated for a long period of time (15 days or longer), a data synchronization error may occur. In this case, you can manually perform reindexing or offline changes.

  • If SSL encryption is enabled for PolarDB for MySQL, make sure that SSL certificates are valid. If the SSL certificates expire, a connection error may occur. You must update the validity periods of the SSL certificates in a timely manner.

  • When you use OpenSearch to synchronize data from a PolarDB for MySQL cluster, you must add the IP address and CIDR block of the server in the region in which OpenSearch resides to the IP address whitelist of the PolarDB for MySQL cluster. The following table provides the IP addresses of the servers in different regions in which OpenSearch resides.

    Region

    IP address

    China (Hangzhou)

    100.104.190.128/26,100.104.241.128/26

    China (Beijing)

    100.104.16.192/26,100.104.179.0/26

    China (Shanghai)

    100.104.37.0/26,100.104.46.0/26

    China (Shenzhen)

    100.104.87.192/26,1100.104.132.192/26

    China (Qingdao)

    100.104.240.128/26,1100.104.111.128/26

    China (Zhangjiakou)

    100.104.155.192/26,100.104.238.64/26

    Germany (Frankfurt)

    100.104.127.0/26,100.104.35.192/26

    US (Silicon Valley) and US (Virginia)

    100.104.193.128/26,100.104.119.128/26

    Singapore

    100.104.58.192/26,100.104.74.192/26

Account authorization

  • To connect to a PolarDB for MySQL cluster, you must obtain the required permissions to access the cluster, and enter an account and a password. Exercise caution when you enter a specific account and password to connect to the data source for the first time.

  • Make sure that your account has the permissions to query all the tables in the source database in Data Transmission Service (DTS). In addition, make sure that the SHOW CREATE TABLE *. * statement can be executed as expected. If your account does not have the permissions, real-time synchronization may fail.

  • Make sure that the account and permissions are not changed unless necessary. Account changes may lead to a failure of real-time synchronization. In addition, if changes are made to the account, a new version of the application may fail to be created. If the account and password are changed, you must remove the existing data source and add the PolarDB for MySQL cluster again as a new data source.

Troubleshooting

  • After a PolarDB for MySQL data source is configured, reindexing may be suspended. In this case, create a test table in the database in which source tables reside. Then, write one or two data entries to or update one or two data entries in the test table every minute. This ensures that continuous binary logs are generated when data is reindexed.

  • If an advanced application for which a PolarDB for MySQL database source is configured has overdue payments, you can manually trigger reindexing after you pay the overdue payments.

  • The password that is used to access the PolarDB for MySQL cluster cannot contain the percent sign (%). Otherwise, the reindexing fails and the error message "Illegal hex characters in escape (%) pattern" appears.

  • The primary key value of a table in an application must be unique. If data is synchronized from multiple tables with the same table schema, duplicate primary key values may exist. In this case, data may be overwritten. To prevent this issue, you can use the StringCatenateExtractor plug-in to concatenate the values of multiple fields. The source fields are pk and $table. Replace pk with the primary key field of a table in the PolarDB for MySQL cluster. $table is the default system variable, which indicates the table name. $table can be used only when a table-sharding wildcard needs to be configured. Concatenate the source fields by using a hyphen (-) or a custom character.

For example, a table in the PolarDB for MySQL cluster is named my_table_0 and the primary key value is 123456. In this case, the new primary key value obtained after concatenation is 123456-my_table_0.

  • The fields of the DATE or DATETIME type in a source table are filtered. For example, if the name of a source field in the database table is createtime, the time format that is specified in the filter conditions to filter the data of the data source must be createtime>'2018-03-01 00:00:00'. If the value of the field is in the createtime>'2018-3-1 00:00:00' format, an error is reported.

Configure a PolarDB for MySQL data source

  • You can configure a PolarDB for MySQL data source when you configure a new application.

  • If you want to configure a PolarDB for MySQL data source for an existing application, you can go to the details page of the application and perform offline changes.

Procedure and usage notes

1. In the Select Data Source panel, select PolarDB and click Connect to Database.

image..png

2. Configure the parameters of the PolarDB for MySQL data source, and click Connect. image..png

Parameter

Description

Cluster ID

The ID of the PolarDB for MySQL cluster. You can obtain the ID in the PolarDB console. The ID is case-sensitive. Example: pc-uf6c056ny9tiaj1l7.

Database Name

The name of the database to be connected in the PolarDB for MySQL cluster. The name is not case-sensitive.

Username

The database account that is used to obtain the table schemas and full data of the database. The username is case-sensitive.

Password

The password of the database account.

Then, OpenSearch attempts to connect to the data source and returns an error message if the connection fails.

Error message

Solution

The PolarDB for MySQL cluster does not reside in the same region as the OpenSearch application.

Check whether the cluster ID is correct and make sure that the PolarDB for MySQL cluster resides in the same region as the OpenSearch application. If the PolarDB for MySQL cluster and the OpenSearch application reside in the same region but the error message is still returned, submit a ticket.

The connection to a PolarDB for MySQL data source failed.

Check whether the cluster ID, database name, username, and password in the connection string that is used to connect to the data source are valid.

The table does not exist in the current PolarDB for MySQL cluster.

Check whether the table name is valid and whether the PolarDB for MySQL cluster contains the table.

The configurations of the PolarDB for MySQL cluster are invalid.

Go to the parameter configuration page of this cluster in the PolarDB console to modify the configurations. Then, try again later.

3. After the data source is connected, select a table. The following figure shows the page that is displayed after the connection is established. After you select a table, click OK. image..png

  • Select or enter the name of the table to be accessed. The name is case-sensitive.

  • You can use an asterisk (*) to match multiple tables with the same table schema in a database. The names of these tables have the same prefix. For example, you can enter table_* to match tables such as table_a and table_b.

4. After the connection is successful, configure field mappings. OpenSearch automatically synchronizes source fields from the table. For more information about data processing plug-ins, see Data processing plug-ins. image..png

5. Specify filter conditions to filter the source data. You cannot specify filter conditions for a standard application. After you complete the configurations in the Data Source step, click Submit. The application schema is configured. image..png