Database Selection
Alibaba Cloud has a rich selection of cloud database products, covering relational databases, non-relational databases, data warehouses, and database ecosystem tools in four major parts, providing services throughout the entire data lifecycle for business production and integration, real-time processing, analysis and discovery, and development and management. The main relational databases include Alibaba Cloud RDS MySQL, Alibaba Cloud RDS PostgreSQL, Alibaba Cloud RDS SQL Server, Alibaba Cloud PolarDB MySQL, Alibaba Cloud PolarDB PostgreSQL, Alibaba polarDB distributed version (PolarDB-X) and Alibaba Cloud MariaDB. The main NoSQL databases include Alibaba Cloud Redis, Alibaba Cloud MongoDB, Alibaba Cloud native memory database Tair and Alibaba Cloud native multi-model database Lindorm. The choice of database needs to take into consideration various stages of customer cloud adoption and post-cloud business development, different access volumes, business scales, and varying business characteristics to better fit the customer's business demands. For example, using MySQL, in the startup or just launched on cloud phase, the RDS MySQL High Availability instance can meet basic disaster recovery demands using master-slave replication. As the business develops, in production environments, business read/write pressure may be borne by RDS read/write separation, and in scenarios with high concurrency, elastic deployment and massive storage, the PolarDB series may be needed to meet business requirements.
Common Problems and Optimization
Slow SQL Issues & Optimization: There are numerous reasons for slow SQL, and its optimization needs to be based on actual conditions. Common issues are as follows: SQL architecture design and table index design will affect query performance; unreasonable designs, such as unreasonable table structure design, missing indexes, excessive numbers of scanned lines, etc., may all lead to slow SQL. SQL needs to be optimized according to actual business conditions; the selected database instance may not meet the needs of business growth, resulting in full load resource usage and reaching the instance bottleneck. In this case, the instance specification needs to be upgraded; if there are a lot of data queries and import operations, it may also lead to slow SQL execution. You can consider performing large-volume operations during off-peak business hours, or breaking up large-scale operations to be executed in batches. These are some common issues, and more problems and optimization can be accomplished using the Database Autonomy Service (DAS).
High Active Thread Count Issue & Optimization: If there are a large number of active connection counts, it indicates that SQL accumulation and response are slowing, which can cause the instance to stop responding if severe. At the same time, active connections consume the connection pool, and when it reaches its limit, it will cause new connections to fail to be established, and the database will not be able to process new application requests. Common issues are as follows: slow SQL accumulation issues - if there are a lot of SQLs with many scanned lines, it is easy to lead to high active connection counts. SQL throttling functions can be used, or sessions can be ended to reduce the impact of slow SQL; when the table cache (Table Cache) is insufficient, it will lead to a large number of SQLs being in the Opening table state. It is common in scenarios with high QPS or a large number of tables and can be resolved by adjusting the parameters table_open_cache and table_open_cache_instances; if a metadata lock (MDL) occurs, it will cause a large number of SQLs to be in the Waiting for table metadata lock state. During the DDL prepare and commit stages, DDL sentences need to obtain MDL locks. If there are uncommitted transactions or slow SQLs on the table, it will block DDL operations. DDL operations will then block other SQLs, ultimately leading to a rise in the active thread count. The problem can be solved by abolishing uncommitted transactions, slow SQLs, or DDLs that are being executed.
Memory, Disk, I/O, etc. Issues & Optimization: Other typical issues include memory, disk, I/O, etc., which might be related to Buffer Pool, temporary tables, indexes, and Binlog, among many other factors. More problems and optimization can be accomplished using the Database Autonomy Service (DAS).