Performance test: notes worth attention in self-built databases v.s. RDS
Created#More Posted time:Apr 11, 2017 14:46 PM
Many users often test the performance differences between self-built databases and RDS, and they find that RDS is not as good as self-built DBs on ECS. They wonder why this happens?
First, the test must be fair to deliver a convincing result. Providing public database services, RDS must be highly available and secure, based on which the high performance makes sense. I believe few users are willing to use unstable and unsafe services no matter how powerful they are. First, to ensure stability, RDS must have master and slave nodes which are even set up in different computer rooms. This enables it to switch to another room quickly in case of issues in one room. Second, RDS must ensure data security. Are you willing to continue using RDS if your data suddenly disappears or if your database falls victim to SQL injection? As a service vendor, RDS must consider data security for its users. It adds an intermediate layer which will intercept SQL injection requests; it uses the highest security level on base-level data writes to ensure data loss doesn't occur when the host powers down abnormally. Last, regarding performance, the RDS source code team continues to optimize MySQL. Both the performance and stability are higher than the community edition, evidenced by standard benchmark tests.
Here we will summarize issues with the performance tests between RDS and self-built databases:
I. Network differences
1. Available zones
RDS can be divided into the single-zone RDS and the multi-zone RDS. For the single-zone RDS, its master and slave databases are in the same computer room, while for the multi-zone RDS, its master and slave databases are in different computer rooms. As a result, the ECS and RDS must be in the same zone during RDS testing.
2. Network links
ECS to RDS contains many network links, such as ECS > DNS > SLB > Proxy > DB. Self-built databases on ECS have 2 (ECS > ECS), meaning RDS has 3 more links than self-built databases on ECS.
3. Case analysis
1. An e-business company found the performance of his system relatively low after the system was migrated to the cloud server.
2. The application code and database configuration are identical.
RDS access links:
II. Configuration differences
1. Specification configuration
The RDS specification configuration mainly includes memory and CPU. The number of CPU cores of the ECS and those of the RDS must be the same during RDS testing.
2. Parameter configuration
(1) Security configuration: In order to ensure data security, RDS uses the highest protection level for committing transactions and flushing binlogs.
a. The innodb_flush_log_at_trx_commit parameters specify the log write frequency after the transaction is committed by InnoDB. When the value is 1, the log buffer will be written to the log file and flushed to disk during each transaction commit. 1 is the default value. This configuration is the safest, but there is a certain performance loss due to the operation of disk I/O during each transaction commit.
b. The sync_binlog is the frequency that the MySQL binlog syncs to disk. After the binlog is written sync_binlog times, it will be flushed to disk by MySQL. The value of 1 is the safest. It will synchronize the binary log once after writing each statement or transaction, so one statement or transaction log, at most, will be lost even in the event of failure. As a trade-off, it is the slowest performing configuration.
(2) Performance configuration: RDS allows user configured parameters except for specification parameters. Most of the parameters have been optimized by the official team, so the users can operate the database normally without adjusting the parameters. However, these parameters are only suitable for most application scenarios. For some specific user scenarios, the users need to perform customization.
Function: This parameter is used to determine the maximum value of the internal temporary memory table, which will be assigned to each thread (The actual limits are the values of tmp_table_size and max_heap_table_size). If the temporary memory table exceeds the limit, MySQL will automatically convert it to the disk-based MyISAM table. When optimizing the query statement, avoid using the temporary tables. If you really cannot avoid using them, those temporary tables must be ensured in memory.
Phenomenon: If the temporary tables are used when a complex SQL statement contains group by/distinct, which cannot be optimized through an index, it will lead to longer SQL execution.
Suggestion: If the application has a lot of group by/distinct statements, and the database has enough memory, you can increase the value of tmp_table_size (max_heap_table_size) to improve the query performance.
Function: This parameter is used to control the size of MySQL query cache memory; if MySQL opens query cache, it will first lock the query cache during execution of each query, and then determine whether the query is in the query cache. If it is, the result will be returned immediately. If not, the engine query and other operations can be carried out. At the same time, operations like insert, update and delete will make the query cache failure. The failure also includes any changes in structure or index. The cost for maintaining the cache failure is high, which results in heavy pressure on MySQL. When the database is not updated so frequently, query cache is very useful. On the contrary, when you write data very frequently and focus on a few tables, the query cache lock mechanism will cause frequent lock conflicts. You can't write and read on this table until the query cache lock is unlocked, which will decrease the select query efficiency.
Phenomenon: There are many connection statuses in the database, including checking query cache for query, waiting for query cache lock and storing results in query cache;
Suggestion: The RDS closes the query cache function by default. If you enable the query cache in the instance, you can disable the query cache when the above-mentioned statuses appear. Certainly, you can enable query cache in some cases, for example: use query cache to resolve database performance issues skillfully.
3. Case analysis
1. A user is migrating the local service system to the cloud
2. The execution time on RDS is doubled over that in the offline self-built database
Users' local parameter configuration:
join_buffer_size = 128M
read_rnd_buffer_size = 128M
tmp_table_size = 128M
RDS parameter configuration
join_buffer_size = 1M
read_buffer_size = 1M
III. Framework differences
1. Master-slave mechanism
RDS adopts a high availability model, the master-slave mechanism, and enables a semi-synchronous mechanism at the same time. The semi-synchronous mechanism is an improvement upon the MySQL asynchronous mechanism. After the master database finishes executing the transaction submitted by the client, it will wait to receive acknowledgment from the slave database which writes the data to the relay log, and then reply to the client instead of replying immediately. Compared with the asynchronous mechanism, the semi-synchronous mechanism improves data security, but it also causes a certain degree of delay, which is at least TCP/IP round-trip time. This means that the semi-synchronous mechanism increases the response time of the transaction.
Note: The same problems also exist when the high availability SQLSERVER uses mirror.
[Adolph edited the post at Apr 11, 2017 15:52 PM]