This topic provides answers to frequently asked questions (FAQ) about PolarDB for MySQL.
- What is PolarDB?
PolarDB is a cloud-based relational database service. PolarDB has been deployed in data centers in more than 10 regions around the world. PolarDB provides out-of-the-box online database services. PolarDB supports three independent engines. This allows PolarDB to be fully compatible with MySQL and PostgreSQL and highly compatible with Oracle syntax. A PolarDB cluster supports a maximum storage space of 100 TB. You can purchase PolarDB on a pay-as-you-go basis based on your needs. For more information, see Overview.
- Why does PolarDB outperform traditional databases?
Compared with traditional databases, PolarDB can store hundreds of terabytes of data. It also provides a wide array of features, such as high availability, high reliability, rapid elastic upgrades and downgrades, and lock-free backups. For more information, see Benefits.
- When was PolarDB released? When was it available for commercial use?
It was released for public preview in September 2017, and available for commercial use in March 2018.
- What are clusters and nodes?
PolarDB Cluster Edition uses a multi-node cluster architecture. A cluster has one primary node and multiple read-only nodes. A single PolarDB cluster can be deployed across zones but not across regions. The PolarDB service is managed based on clusters, and you are charged for the service based on clusters. For more information, see Glossary.
- Which programming languages are supported?
PolarDB supports programming languages, including Java, Python, PHP, Golang, C, C++, .NET, and Node.js. PolarDB for MySQL supports all the programming languages that are supported by native MySQL. For more information, see the MySQL official website.
- Which storage engines are supported?
PolarDB supports three product editions. The following items describe the storage engines that are supported by different editions:
- All the tables in PolarDB for MySQL Cluster Edition and Single Node are stored in the InnoDB storage engine. When you create a table, PolarDB for MySQL automatically converts non-InnoDB engines, such as MyISAM, Memory, and CSV, to InnoDB engines. Therefore, the tables that are not stored in the InnoDB engine can be migrated to PolarDB for MySQL as expected.
- By default, PolarDB for MySQL Archive Database uses X-Engine. X-Engine provides powerful data compression capabilities and allows you to use archive databases at a low storage cost. For more information, see Overview of Archive Database.
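One way to observe the engine conversion described above is to create a table that requests a non-InnoDB engine and then check which engine was recorded. This is a minimal sketch; the table name engine_demo is hypothetical and not part of the PolarDB documentation.

```sql
-- Hypothetical table; the MyISAM engine is requested on purpose.
CREATE TABLE engine_demo (
  id   INT PRIMARY KEY,
  note VARCHAR(64)
) ENGINE = MyISAM;

-- Check which storage engine the table actually uses.
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'engine_demo';
```

According to the conversion rule above, the engine column is expected to report InnoDB on Cluster Edition and Single Node clusters.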
- Is a user-created secondary instance supported? How do I implement a primary/secondary architecture?
Yes, it is supported. To implement a primary/secondary architecture, you can enable the binary log feature to synchronize data from a PolarDB for MySQL cluster to a user-created MySQL instance. To facilitate subsequent maintenance, we recommend that you use Data Transmission Service (DTS) to synchronize data. For more information, see Synchronize data from an Apsara PolarDB for MySQL cluster to an ApsaraDB RDS for MySQL instance.
- Is PolarDB a distributed database?
Yes, PolarDB is a distributed storage cluster based on the Parallel Raft protocol. The computing engine consists of 1 to 16 compute nodes that are distributed on different servers. A cluster supports a maximum storage space of 100 TB and a maximum of 88 cores and 710 GB of memory. You can dynamically scale out storage and computing resources online. The services can run as expected during the scale-out.
- After I purchase PolarDB, do I need to purchase database middleware to implement sharding?
- Does PolarDB support table partitioning?
Yes, PolarDB supports table partitioning.
- Does PolarDB automatically include a partition mechanism?
PolarDB implements partitioning at the storage layer. This is transparent and imperceptible to users.
- Compared with native MySQL, what is the maximum size of data that a single table can
store in PolarDB?
PolarDB has no limits on the size of a single table. However, the size of a single table is limited by the size of disk space. For more information, see Limits.
- Is PolarDB for MySQL compatible with MySQL Community Edition?
Yes, PolarDB for MySQL is fully compatible with MySQL Community Edition.
- Which transaction isolation levels are supported?
PolarDB for MySQL supports three isolation levels: READ_UNCOMMITTED, READ_COMMITTED, and REPEATABLE_READ. The default isolation level is READ_COMMITTED. The SERIALIZABLE isolation level is not supported.
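The following sketch uses standard MySQL syntax to inspect and change the isolation level of the current session. Whether a session-level change is appropriate for your cluster endpoint settings is not covered by this topic, so treat it as an illustration only.

```sql
-- Show the isolation level of the current session.
-- MySQL 8.0 exposes it as transaction_isolation.
-- On MySQL 5.6 and early 5.7 releases, use @@tx_isolation instead.
SELECT @@transaction_isolation;

-- Switch the current session from the default to REPEATABLE READ.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
```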
- Are the query results of the SHOW PROCESSLIST statement in PolarDB for MySQL the same
as those in MySQL Community Edition?
If you use a primary endpoint to execute the SHOW PROCESSLIST statement, the query results are the same. If you use a cluster endpoint to execute the SHOW PROCESSLIST statement, the query results are different between PolarDB for MySQL and MySQL Community Edition. In the query results of the statement in PolarDB for MySQL, you can find multiple records that have the same thread ID. Each of these records corresponds to a node in the PolarDB for MySQL cluster.
- Is the lock mechanism of PolarDB for MySQL different from that of MySQL Community Edition?
Yes, the lock mechanism of PolarDB for MySQL is different from that of MySQL Community Edition. The difference stems from data storage: in PolarDB for MySQL, the primary node and the read-only nodes share the stored data. As a result, when the primary node performs a data definition language (DDL) operation, a read-only node could otherwise read the intermediate data that the DDL operation generates, and an error would occur. To prevent this, PolarDB for MySQL uses redo logs to synchronize the exclusive metadata locks (MDLs) that are involved in DDL operations to the read-only nodes. The read-only nodes hold the locks until the DDL operations are complete. This prevents other user threads on the read-only nodes from accessing the data in the affected tables while the DDL operations are in progress.
- Is the binary log format of PolarDB for MySQL the same as the native binary log format of MySQL?
Yes, the binary log format of PolarDB for MySQL is the same as the native binary log format of MySQL.
- Are the performance schema and the sys schema supported?
Yes, they are supported.
- Are the table statistics in PolarDB for MySQL consistent with those in MySQL Community Edition?
Yes, the table statistics in the primary node of PolarDB for MySQL are consistent with those in MySQL Community Edition. Each update of table statistics in the primary node is synchronized to the read-only nodes to ensure that execution plans are consistent between the primary node and the read-only nodes. You can also perform the ANALYZE TABLE operation on the read-only nodes to proactively load the latest statistics from disks.
- Does PolarDB support extended architecture (XA) transactions? Does PolarDB for MySQL support XA
transactions in the same way as the native MySQL system?
Yes, PolarDB for MySQL supports XA transactions in the same way as the native MySQL system.
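For reference, a two-phase XA transaction in native MySQL syntax looks like the sketch below. The table xa_demo and the transaction identifier demo_xid are hypothetical names chosen for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE xa_demo (
  id     INT PRIMARY KEY,
  amount DECIMAL(10, 2)
);

XA START 'demo_xid';                      -- begin the XA transaction branch
INSERT INTO xa_demo VALUES (1, 100.00);
XA END 'demo_xid';                        -- end the active phase of the branch
XA PREPARE 'demo_xid';                    -- first phase: prepare
XA COMMIT 'demo_xid';                     -- second phase: commit
```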
- Does PolarDB support full-text indexes?
Yes, PolarDB supports full-text indexes.
Note: When you query data by using full-text indexes, index caches are used on read-only nodes. Due to the index caches, you cannot retrieve the latest data based on the indexes. We recommend that you use primary endpoints to read and write data based on full-text indexes. This ensures that you can retrieve the latest data.
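A minimal full-text example in standard MySQL syntax is sketched below. The articles table and its columns are hypothetical. In line with the recommendation above, such a query would typically be run over the primary endpoint.

```sql
-- Hypothetical table with a full-text index over two columns.
CREATE TABLE articles (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(200),
  body  TEXT,
  FULLTEXT KEY ft_title_body (title, body)
);

-- Search both indexed columns for a term.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);
```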
- Is Percona Toolkit supported?
Yes, it is supported. However, we recommend that you use online DDL.
- Is gh-ost supported?
Yes, it is supported. However, we recommend that you use online DDL.
- What are the billing items for a PolarDB cluster?
The billing items include the storage space, compute nodes, data backup feature (with a free quota), and SQL Explorer feature (optional). For more information, see Billable items.
- Which files are stored in the storage space that incurs fees?
The storage space that incurs fees stores database table files, index files, undo log files, redo log files, binary log files, slowlog files, and a few system files. For more information, see Billable items.
- How do I use storage plans of PolarDB?
You can purchase storage plans to deduct the storage fees of clusters that use the subscription or pay-as-you-go billing method. For example, if you have three clusters and each cluster has a storage capacity of 40 GB, the total storage capacity is 120 GB. The three clusters can share a 100 GB storage plan. You are charged for the excess 20 GB of storage space on a pay-as-you-go basis. For more information, see Purchase a storage plan.
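The arithmetic in the storage plan example can be written out as a single query. The column names are illustrative only.

```sql
-- Three clusters of 40 GB each share one 100 GB storage plan.
SELECT
  3 * 40                    AS total_gb,            -- 120
  GREATEST(3 * 40 - 100, 0) AS pay_as_you_go_gb;    -- 20
```

GREATEST keeps the billed excess at zero when total usage stays within the plan.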
Cluster access (read/write splitting)
- How do I implement read/write splitting in PolarDB?
You need only to use a cluster endpoint in your application so that read/write splitting can be implemented based on the specified reader nodes. For more information, see Custom cluster endpoints.
- How many read-only nodes are supported in a PolarDB cluster?
PolarDB uses a distributed cluster architecture. A cluster consists of one primary node and a maximum of 15 read-only nodes. At least one read-only node must be contained to ensure high availability.
- Why are loads unbalanced among read-only nodes?
One of the possible reasons is that only a small number of connections to read-only nodes exist. Another possible reason is that one of the read-only nodes is not specified as a reader node when you create a custom cluster endpoint.
- What are the causes of heavy or light loads on the primary node?
Heavy loads on the primary node may occur due to the following possible causes: 1. The primary endpoint is used to connect your applications to the cluster. 2. The primary node accepts read requests. 3. A large number of transaction requests exist. 4. Requests are routed to the primary node because of a high primary/secondary replication delay. 5. Read requests are routed to the primary node due to read-only node exceptions.
The possible cause of light loads on the primary node is that the Offload Reads from Primary Node feature is enabled.
- How do I reduce the loads on the primary node?
You can reduce the loads on the primary node by using the following methods:
- You can use a cluster endpoint to connect to a PolarDB cluster. For more information, see Modify and delete a cluster endpoint.
- If a large number of transactions cause heavy loads on the primary node, you can enable the transaction splitting feature in the console. This way, part of queries in the transactions are routed to read-only nodes. For more information, see Transaction splitting.
- If requests are routed to the primary node because of a replication delay, you can decrease the consistency level. For example, you can use the eventual consistency level. For more information, see Consistency levels.
- If the primary node accepts read requests, the loads on the primary node may also become heavy. In this case, you can disable the feature that allows the primary node to accept read requests in the console. This reduces the number of read requests that are routed to the primary node. For more information, see Feature.
- Why am I unable to immediately retrieve the newly inserted data?
The possible cause is that the specified consistency level does not allow you to immediately retrieve the newly inserted data. The cluster endpoints of PolarDB support the following consistency levels:
- Eventual consistency: This consistency level does not ensure that you can immediately retrieve the newly inserted data, whether from the same session (connection) or from a different session.
- Session consistency: This consistency level ensures that you can immediately retrieve the newly inserted data based on the same session.
- Global consistency: This consistency level ensures that you can immediately retrieve the latest data based on the same session or different sessions.
Note: A high consistency level results in heavy loads on the primary node. This compromises the performance of the primary node. Use caution when you select a consistency level. In most scenarios, the session consistency level can ensure service availability. For a few SQL statements that require strong consistency, you can add the /* FORCE_MASTER */ hint to the SQL statements to meet the consistency requirements. For more information, see Consistency levels.
- How do I force an SQL statement to be executed on the primary node?
If you use a cluster endpoint, add /* FORCE_MASTER */ or /* FORCE_SLAVE */ before an SQL statement to forcibly specify where the SQL statement is routed. For more information, see Hint syntax.
/* FORCE_MASTER */ forcibly routes requests to the primary node. This method applies to a few scenarios where strong consistency is required for read requests.
/* FORCE_SLAVE */ forcibly routes requests to a read-only node. This method applies to scenarios where the PolarDB proxy would otherwise route special syntax to the primary node to ensure accuracy. For example, statements that call stored procedures and multi-statement requests are routed to the primary node by default.
- If you need to execute a statement that contains a hint on the official command line of MySQL, add the -c parameter on the command line. Otherwise, the hint becomes invalid because the official command line of MySQL filters out the hint. For more information, see the official command line of MySQL.
- Hints have the highest routing priority and are not limited by consistency levels or transaction splitting. Perform an evaluation before you use hints.
- Do not use hints with statements that change environment variables, such as /*FORCE_SLAVE*/ set names utf8;. Such statements may cause unexpected query results.
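The two hints can be sketched as follows. The accounts table and its columns are hypothetical. If you run these statements in the mysql command-line client, start the client with the -c option so that the comments are sent to the server.

```sql
-- A read that requires the latest data: force routing to the primary node.
/*FORCE_MASTER*/ SELECT balance FROM accounts WHERE id = 42;

-- A read that tolerates replication delay: force routing to a read-only node.
/*FORCE_SLAVE*/ SELECT COUNT(*) FROM accounts;
```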
- Can I assign different endpoints to different services? Can I use different endpoints
to isolate my services?
Yes, you can create multiple custom endpoints and assign them to different services. If the underlying nodes are different, the custom cluster endpoints can be used to isolate the services and do not affect each other. For more information about how to create a custom endpoint, see Specify a cluster endpoint.
- How do I separately create a single-node endpoint for one of the read-only nodes if
multiple read-only nodes exist?
You can create a single-node endpoint only if the Read/write Mode parameter for the cluster endpoint is set to Read Only and the cluster has three or more nodes. For more information, see Specify a cluster endpoint.
Warning: If you create a single-node endpoint for a read-only node and the read-only node becomes faulty, the single-node endpoint may be unavailable for up to 1 hour. We recommend that you do not create single-node endpoints in your production environment.
- What is the maximum number of single-node endpoints that I can create in a cluster?
If your cluster has three nodes, you can create a single-node endpoint for only one of the read-only nodes. If your cluster has four nodes, you can create single-node endpoints for two of the read-only nodes, one for each. Similar rules apply if your cluster has five or more nodes.
- Read-only nodes have loads when I use only the primary endpoint. Does the primary
endpoint support read/write splitting?
No, the primary endpoint does not support read/write splitting. The primary endpoint is always connected to only the primary node. Read-only nodes may have a small number of queries per second (QPS). This is a normal case and is irrelevant to the primary endpoint.
Management and maintenance
- How do I add fields online?
You can use tools such as the native online DDL of MySQL, pt-osc, and gh-ost to add fields online. We recommend that you use the native online DDL of MySQL.
- How do I add indexes online?
You can use tools such as the native online DDL of MySQL, pt-osc, and gh-ost to add indexes online. We recommend that you use the native online DDL of MySQL.
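With native online DDL, adding a field or an index can be sketched as follows. The orders table, the remark column, and the index name are hypothetical. Specifying ALGORITHM and LOCK explicitly is a common precaution in MySQL: if the requested online mode is not possible for the operation, the statement fails instead of silently falling back to a more disruptive mode.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE orders (
  id     BIGINT PRIMARY KEY,
  status VARCHAR(16)
);

-- Add a field online.
ALTER TABLE orders
  ADD COLUMN remark VARCHAR(255) NULL,
  ALGORITHM = INPLACE, LOCK = NONE;

-- Add an index online.
ALTER TABLE orders
  ADD INDEX idx_status (status),
  ALGORITHM = INPLACE, LOCK = NONE;
```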
- Is the bulk insert feature supported?
Yes, it is supported.
- Can I bulk insert data if I write data to only a write-only node? What is the maximum
number of values that I can insert at a time?
Yes, you can. The maximum number of values you can insert at a time is determined by the value of the max_allowed_packet parameter. For more information, see Replication and max_allowed_packet.
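As a rough illustration, you can check the current packet limit and then insert several rows in one statement. The bulk_demo table is hypothetical. The whole INSERT statement must fit within max_allowed_packet.

```sql
-- Show the current limit, in bytes, for a single packet or statement.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Hypothetical table used only for this illustration.
CREATE TABLE bulk_demo (
  id   INT PRIMARY KEY,
  name VARCHAR(32)
);

-- One multi-row INSERT instead of three single-row statements.
INSERT INTO bulk_demo (id, name) VALUES
  (1, 'a'),
  (2, 'b'),
  (3, 'c');
```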
- Can I use cluster endpoints to perform the bulk insert operation?
Yes, you can.
- Does a replication delay occur when I replicate data from the primary node to the read-only nodes?
Yes, a replication delay of a few milliseconds occurs.
- When does a replication delay increase?
A replication delay increases in the following scenarios:
- The primary node processes a large number of write requests and generates excessive redo logs. As a result, these redo logs cannot be replayed on the read-only nodes in time.
- To process heavy loads, the read-only nodes occupy a large number of resources that are used to replay redo logs.
- The system reads and writes redo logs at a low rate due to I/O bottlenecks.
- How do I ensure the consistency of query results if a replication delay occurs?
You can use a cluster endpoint and select an appropriate consistency level for the cluster endpoint. The following consistency levels are listed in descending order: global consistency (strong consistency), session consistency, and eventual consistency. For more information, see Specify a cluster endpoint.
- Can the recovery point objective (RPO) be zero if a single node fails?
Yes, the RPO can be zero if a single node fails.
- How are node specifications upgraded in the backend, for example, upgrading node specifications
from 2 cores and 8 GB memory to 4 cores and 16 GB memory? What are the impacts of
the upgrade on my services?
The proxy and database nodes of PolarDB must be upgraded to the latest configurations. A rolling upgrade method is used to upgrade multiple nodes to minimize the impacts on your services. It takes about 10 to 15 minutes for each upgrade. The impacts on your services last for no more than 30 seconds. During this period, one to three transient connection errors may occur. For more information, see Change configurations.
- How long does it take to add a node? Are my services affected when the node is added?
It takes about 5 minutes to add a node. Your services are not affected when the node is added. For more information about how to add a node, see Add a read-only node.
Note: A read/write splitting connection that is created after a read-only node is added forwards requests to the read-only node. A read/write splitting connection that is created before a read-only node is added does not forward requests to the read-only node. You must close the connection and establish the connection again. For example, you can restart the application to establish the connection.
- How long does it take to upgrade a kernel minor version to the latest revision version?
Does the version upgrade affect my services?
PolarDB uses a rolling upgrade method to upgrade multiple nodes to minimize the impacts on your services. It takes about 10 to 15 minutes for each upgrade. The impacts on your services last for no more than 30 seconds. During this period, one to three transient connection errors may occur. For more information, see Upgrade versions.
- How is an automatic failover implemented?
PolarDB uses an active-active high-availability cluster architecture. This architecture supports automatic failovers between the primary node that supports reads and writes and the read-only nodes. The system automatically elects a new primary node. Each node in a PolarDB cluster has a failover priority. This priority determines the probability at which a node is elected as the primary node during a failover. If multiple nodes have the same failover priority, they all have the same probability of being elected as the primary node. For more information, see Switch over services between primary and read-only nodes.
Backup and restoration
- How does PolarDB back up data?
PolarDB uses snapshots to back up data. For more information, see Back up data.
- How fast can a database be restored?
It takes 40 minutes to restore or clone 1 TB of data in a database based on backup sets or snapshots. If you want to restore data to a specific time point, you must include the time required to replay the redo logs. It takes about 20 to 70 seconds to replay 1 GB of redo log data. The total restoration time is the sum of the time required to restore data based on backup sets and the time required to replay the redo logs.
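The estimate above can be worked through for a hypothetical case: 2 TB of data restored from a backup set, plus 50 GB of redo logs to replay. The sizes are assumptions chosen for illustration.

```sql
-- 40 minutes per TB for the backup set,
-- plus 20 to 70 seconds per GB of redo logs (converted to minutes).
SELECT
  2 * 40 + 50 * 20 / 60 AS min_minutes,
  2 * 40 + 50 * 70 / 60 AS max_minutes;
```

For these inputs, the total comes to roughly 97 to 138 minutes.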
Performance and capacity
- Why does PolarDB for MySQL fail to show significant performance improvements when I compare PolarDB for MySQL
with ApsaraDB RDS for MySQL?
Before you compare the performance of PolarDB for MySQL with that of ApsaraDB RDS for MySQL, take note of the following considerations to obtain accurate and reasonable performance comparison results:
- Use PolarDB for MySQL and ApsaraDB RDS for MySQL of the same specifications to compare performance.
- Use PolarDB for MySQL and ApsaraDB RDS for MySQL of the same version to compare performance.
The reason is that implementation mechanisms vary based on versions. For example, MySQL 8.0 is optimized for multi-core CPUs by splitting log processing into separate threads, such as log_writer, log_flusher, log_checkpointer, and log_write_notifier. However, if only a few CPU cores are used, the performance of MySQL 8.0 is lower than that of MySQL 5.6 or MySQL 5.7. We recommend that you do not compare PolarDB for MySQL 5.6 with ApsaraDB RDS for MySQL 5.7 or 8.0. This is because the optimizer of PolarDB for MySQL 5.6 is less capable than that of the later versions of PolarDB for MySQL.
- We recommend that you simulate the loads in actual online environments or use the sysbench benchmark suite to compare the performance. This makes the obtained performance data closer to that obtained in actual online scenarios.
- We recommend that you do not use a single SQL statement to compare the read performance
between PolarDB for MySQL and ApsaraDB RDS for MySQL.
This is because PolarDB uses an architecture where computing is decoupled from storage and the network latency affects the response time of a single SQL statement. Therefore, the read performance of PolarDB for MySQL is lower than that of ApsaraDB RDS for MySQL. However, the cache hit ratio for an online database is greater than 99% in most cases. Only the first read request consumes I/O resources, and the read performance is compromised. The subsequent read requests do not consume I/O resources because the data is stored in a buffer pool. For the subsequent read requests, PolarDB for MySQL and ApsaraDB RDS for MySQL offer the same read performance.
- We recommend that you do not use a single SQL statement to compare the write performance.
Instead, we recommend that you simulate a production environment and perform stress testing.
We recommend that you compare the primary node and the read-only nodes in PolarDB with a primary instance and read-only instances in ApsaraDB RDS for MySQL for which semi-synchronous replication is implemented. This is because the architecture of PolarDB uses the quorum mechanism for data writes by default. If the data is written to at least two of the three replicas, the system determines that the write operation is successful. PolarDB implements data redundancy at the storage layer, and ensures strong consistency and high reliability for the three replicas. Therefore, an appropriate comparison method is to compare PolarDB for MySQL with ApsaraDB RDS for MySQL where semi-synchronous replication instead of asynchronous replication is implemented.
For information about the performance comparison results between PolarDB for MySQL and ApsaraDB RDS for MySQL, see Comparison with ApsaraDB RDS for MySQL.
- Why does a deleted database occupy a large amount of storage space?
This is because the redo log files of the deleted database occupy storage space. In most cases, the redo log files occupy 2 GB to 11 GB of storage space. If a total of 11 GB of storage space is occupied, 8 GB is occupied by the eight redo log files in the buffer pool. The remaining 3 GB is evenly occupied by the redo log file that is being written, the pre-created redo log file, and the latest redo log file.
The loose_innodb_polar_log_file_max_reuse parameter specifies the number of redo log files in the buffer pool. The default value of this parameter is 8. You can change the value of this parameter to reduce the storage space that is occupied by log files. In this case, periodic performance fluctuations may occur if heavy loads need to be processed.
- What is the maximum number of tables? What is the upper limit for the number of tables
if I need to ensure that the performance is not compromised?
The maximum number of tables depends on the number of files. For more information, see Limits.
- Can table partitioning improve the query performance of PolarDB?
In most cases, if the SQL query statement falls into a partition, the performance can be improved.
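Whether a query falls into a partition can be checked with EXPLAIN, as in the sketch below. The events table and its partition layout are hypothetical. In MySQL 5.7 and later, the EXPLAIN output includes a partitions column. In MySQL 5.6, use EXPLAIN PARTITIONS instead.

```sql
-- Hypothetical table partitioned by month on a DATE column.
CREATE TABLE events (
  id      BIGINT NOT NULL,
  created DATE   NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (id, created)
)
PARTITION BY RANGE COLUMNS (created) (
  PARTITION p202401 VALUES LESS THAN ('2024-02-01'),
  PARTITION p202402 VALUES LESS THAN ('2024-03-01'),
  PARTITION p202403 VALUES LESS THAN ('2024-04-01')
);

-- The partitions column shows which partitions the query scans.
EXPLAIN SELECT * FROM events
WHERE created >= '2024-02-01' AND created < '2024-03-01';
```

A filter on the partitioning column lets the optimizer skip partitions that cannot contain matching rows, which is where the performance gain comes from.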
- Can I create 10,000 databases in PolarDB? What is the maximum number of databases that I can create?
Yes, you can create 10,000 databases in PolarDB. The maximum number of databases you can create depends on the number of files. For more information, see Limits.
- Is the number of read-only nodes relevant to the maximum number of connections? Can
I increase the maximum number of connections by adding read-only nodes?
The number of read-only nodes is irrelevant to the maximum number of connections. The maximum number of connections of PolarDB is determined by node specifications. For more information, see Limits. Upgrade specifications if you need more connections.
- How are the input/output operations per second (IOPS) limited and isolated? Do the
multiple nodes of a PolarDB cluster compete for I/O resources?
The IOPS is specified for each node of a PolarDB cluster based on the node specifications. The IOPS of each node is isolated from that of the other nodes and does not affect each other.
- Is the primary node affected if the performance of the read-only nodes is compromised?
Yes, the memory consumption of the primary node is slightly increased if the loads on the read-only nodes are excessively heavy and the replication delay increases.
- What is the impact on the database performance if I enable the binary log feature?
After you enable the binary log feature, only the write and update (INSERT, UPDATE, and DELETE) performance is affected and the query (SELECT) performance is not affected. In most cases, if you enable the binary log feature for the database in which read and write requests are balanced, the database performance decreases by no more than 10%.
- What is the impact on the database performance if I enable the SQL Explorer (full
SQL log audit) feature?
If you enable the SQL Explorer feature, the database performance is not affected.
- Which high-speed network protocol does PolarDB use?
PolarDB uses dual-port Remote Direct Memory Access (RDMA) to ensure high I/O throughput between compute nodes and storage nodes, and between data replicas. Each port provides a data rate of up to 25 Gbit/s at a low latency.
- What is the maximum bandwidth that I can use if I access PolarDB from the Internet?
If you access PolarDB from the Internet, the maximum bandwidth is 10 Gbit/s.
- What can I do if it takes a long time to restart nodes?
The more files your cluster contains, the longer it takes to restart nodes. In this case, you can set the innodb_fast_startup parameter to ON to accelerate the restart process. For more information about how to modify the parameter, see Specify cluster parameters.
Note: You can specify this parameter only for PolarDB for MySQL 8.0 clusters.
- What are the advantages of the large tables in PolarDB for MySQL over the local disks of traditional databases?
A large table in a PolarDB for MySQL database is split and stored across N physical storage servers. Therefore, the I/O operations for the large table are allocated to multiple disks. The overall throughput (rather than the I/O latency) of I/O read operations in the PolarDB for MySQL database is higher than that of the database where all I/O operations are scheduled to local disks.
- How do I optimize large tables?
We recommend that you use partitioned tables to optimize large tables.
- What are the application scenarios of partitioned tables?
You can use partitioned tables when you need to prune large tables to control the amount of scanned data for queries and do not want to modify the service code. For example, you can use partitioned tables to clear the historical data of your services at regular intervals. You can delete the partitions that are created in the earliest month and create partitions for the next month, and retain only the data of the latest six months.
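The monthly rotation described above can be sketched with standard MySQL partition maintenance statements. The service_log table, its columns, and the partition names are hypothetical.

```sql
-- Hypothetical table that keeps six monthly partitions.
CREATE TABLE service_log (
  id      BIGINT NOT NULL,
  created DATE   NOT NULL,
  message VARCHAR(255),
  PRIMARY KEY (id, created)
)
PARTITION BY RANGE COLUMNS (created) (
  PARTITION p202401 VALUES LESS THAN ('2024-02-01'),
  PARTITION p202402 VALUES LESS THAN ('2024-03-01'),
  PARTITION p202403 VALUES LESS THAN ('2024-04-01'),
  PARTITION p202404 VALUES LESS THAN ('2024-05-01'),
  PARTITION p202405 VALUES LESS THAN ('2024-06-01'),
  PARTITION p202406 VALUES LESS THAN ('2024-07-01')
);

-- Clear the data of the earliest month by dropping its partition.
ALTER TABLE service_log DROP PARTITION p202401;

-- Create the partition for the next month.
ALTER TABLE service_log
  ADD PARTITION (PARTITION p202407 VALUES LESS THAN ('2024-08-01'));
```

Dropping a partition removes its rows without a row-by-row DELETE, which is why this pattern suits the regular cleanup of historical data.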
- What method is suitable if I copy a table that has a large amount of data in the same
PolarDB for MySQL database, for example, copy all the data of table A to table B?
You can execute the following SQL statement to directly copy data:
create table B as select * from A
- Can I optimize PHP short-lived connections in high concurrency scenarios?
Yes, you can optimize PHP short-lived connections in high concurrency scenarios. To optimize PHP short-lived connections, enable the session-level connection pool in the settings of cluster endpoints. For more information, see Specify a cluster endpoint.
- How do I prevent slow SQL queries from decreasing the performance of the entire database?
If you use PolarDB for MySQL 5.6 or 8.0 clusters, you can use the statement concurrency control feature to implement rate limiting and throttling on the specified SQL statements. For more information about this feature, see Statement Concurrency Control.
- Does PolarDB support the idle session time-out feature?
Yes, PolarDB supports the idle session time-out feature. You can change the value of the wait_timeout parameter to specify a time-out period for idle sessions. For more information, see Specify cluster parameters.
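For illustration, the following standard MySQL statements show the current value and override it for one session. The value 600 is an arbitrary example. Cluster-wide changes are made through the cluster parameter settings, as described above.

```sql
-- Show the idle time-out, in seconds, that applies to this session.
SHOW VARIABLES LIKE 'wait_timeout';

-- Override the time-out for the current session only.
SET SESSION wait_timeout = 600;
```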
- How do I identify slow SQL queries?
You can identify slow SQL queries by using the following two methods:
- Retrieve slow SQL queries in the console. For more information, see Slow SQL queries.
- Connect to a database cluster and execute the show processlist; statement to find the SQL statements that take a long time to be executed. For more information about how to connect to database clusters, see Connect to a PolarDB for MySQL cluster.
- How do I terminate slow SQL queries?
After you identify a slow SQL query, find the ID of the slow SQL query and run the kill <Id> command to terminate the SQL query.
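The two steps can be combined as in the following sketch, which uses standard MySQL syntax. The 10-second threshold and the ID 12345 are placeholders, not recommended values.

```sql
-- List statements that have been running for more than 10 seconds, longest first.
SELECT id, user, host, db, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
  AND time > 10
ORDER BY time DESC;

-- Terminate the statement with the given ID but keep its connection open.
KILL QUERY 12345;
```

KILL QUERY stops only the running statement. A plain KILL with the same ID also closes the connection.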