Database management often feels like a constant battle against three enemies: slow queries, unexpected storage spikes, and invisible locking issues.
The Database Autonomy Service (DAS) O&M Service transforms this experience. By enabling this feature, you move from reactive troubleshooting to proactive, automated management. This guide walks you through how to activate the service and, more importantly, how to use its powerful tools to safeguard your business.
Follow these steps to enable the DAS O&M Service:
● Step 1: Log in to the DAS Console.
○ DAS Console: https://hdm.console.alibabacloud.com/?#/general
● Step 2: Select your target instance.
(Currently supported: ApsaraDB RDS for MySQL, PolarDB for MySQL, ApsaraDB for Redis, and Tair.)
● Step 3: Navigate to the O&M Service section to activate it.
Pricing: A limited-time promotion is currently available.
● List Price: 45 USD/Instance/Month
● Promotion Price: 15 USD/Instance/Month



We will use ApsaraDB RDS for MySQL as the primary example to demonstrate how the O&M Service solves critical database challenges.
Console: https://hdm.console.alibabacloud.com/?#/dbMonitor/MySQL

Notes: The O&M Service offers a wide range of features. Here, we will highlight only those that are most frequently used and offer the most unique value.
Doc: https://www.alibabacloud.com/help/en/das/user-guide/slow-query-logs-7
Scenario: Your application slows down. You know it's a database issue, but digging through thousands of logs to find the bad query takes hours.
Before: Slow Query Log Statistics is not available.

After: Once you have purchased the O&M Service, you can use Slow Query Log Statistics to do the following:
● SQL Diagnosis and Automated Optimization
○ Get Optimization Suggestions: By clicking Optimize in the "Actions" column, the system runs an SQL diagnostic engine.
○ Review Expected Benefits: The diagnostics provide results and expected performance gains if you apply the changes.
○ Apply Fixes: You can copy the optimized SQL statement directly (via the Copy button) to paste into your database client or management tool.
● Detailed Performance Analysis: The interface provides granular data to help you understand why a query is slow.
○ View Key Metrics: You can analyze queries based on specific statistics visible in the table:
■ Execution Frequency: See how often a bad query runs ("Executions").
■ Time Consumption: Check "Average Execution Duration" and "Maximum Execution Duration."
■ Resource Usage: Analyze "Average Scanned Rows" vs. "Returned Rows" (a high ratio here usually indicates a missing index).
○ Inspect SQL Templates: Click Details to view the specific "Slow Log Sample" and the full SQL template to understand the structure of the query.
● Custom Filtering
○ Refine Your View: Use "Configure filter conditions" to search by SQL ID, time range, or database, so you can cut through the noise and focus on the most critical issues.

User Value: Stop guessing which index to add. DAS analyzes the query structure and data distribution to recommend a concrete optimization, together with its expected benefit.
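To make the "Scanned Rows vs. Returned Rows" heuristic concrete, here is a minimal sketch of how you might triage exported slow-query statistics yourself. It is not how DAS ranks queries internally. The `SlowQueryStat` fields, the sample rows, and the `min_ratio` threshold of 100 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlowQueryStat:
    # Hypothetical record mirroring the columns described above.
    sql_template: str
    executions: int
    avg_duration_s: float
    avg_scanned_rows: float
    avg_returned_rows: float


def scan_ratio(stat: SlowQueryStat) -> float:
    # Rows examined per row returned; treat "0 rows returned" as 1 to avoid dividing by zero.
    return stat.avg_scanned_rows / max(stat.avg_returned_rows, 1.0)


def rank_index_candidates(stats, min_ratio=100.0):
    # Keep templates that scan far more rows than they return (assumed threshold),
    # then order them by total time spent: executions * average duration.
    candidates = [s for s in stats if scan_ratio(s) >= min_ratio]
    return sorted(candidates, key=lambda s: s.executions * s.avg_duration_s, reverse=True)


# Made-up sample data, for illustration only.
sample = [
    SlowQueryStat("SELECT * FROM orders WHERE user_id = ?", 1200, 2.5, 500000, 10),
    SlowQueryStat("SELECT * FROM logs WHERE day = ?", 30, 8.0, 90000, 80000),
    SlowQueryStat("SELECT id FROM users WHERE email = ?", 5000, 1.1, 200000, 1),
]
ranked = rank_index_candidates(sample)
for stat in ranked:
    print(f"{scan_ratio(stat):>10.0f}x  {stat.sql_template}")
```

The `logs` query is slow but returns most of what it scans, so an index is unlikely to help and the sketch drops it. The other two scan tens of thousands of rows per returned row, which is the pattern the Resource Usage metric is meant to surface.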
Doc: https://www.alibabacloud.com/help/en/das/user-guide/storage-analysis-7
Scenario: A sudden "Disk Full" error stops your business at 3:00 AM. Or, you are paying for 2TB of storage but only using 500GB of actual data due to hidden fragmentation.
Before: Only basic storage monitoring is available. Storage Analysis cannot be used, so you cannot get instant insights.


After: With the O&M Service enabled, you can get storage insights through one-click analysis. The analysis report can also be generated automatically on a custom schedule (daily, weekly, or on specific days of the week).

Storage Analysis tells you exactly when you will run out of space and which tables are eating up your storage. Here is what you can expect:
● Accurate "Doomsday" Prediction (Capacity Planning): Instead of guessing when you need to buy more storage, the system calculates it for you.
○ Why you need it: This gives you a precise timeline. You know, for example, that you have about 2.5 months to archive data or upgrade storage, which prevents a sudden "Disk Full" outage that stops your business.
● Identify the "Space Hogs" (Granular Visibility): Storage bills are often high, but it's hard to know why. This report names and shames the specific culprits.
○ Why you need it: You can immediately target these specific tables for cleanup or archiving. Without this view, you might waste time optimizing small tables that don't move the needle.
● Distinguish Between Data and "Waste": Database storage isn't just user data; it's also logs, temp files, and fragmentation. The report breaks all of this down in detail, with trend-based predictions. DAS identifies "holes" in your storage (fragmentation) caused by frequent deletes. You can often reclaim gigabytes of space simply by optimizing tables, saving money on unnecessary storage upgrades.
○ Why you need it: If your storage spikes, the report tells you if it's because of actual business growth (Good) or an exploding log file due to a bad configuration or stuck transaction (Bad).
● Automated Hygiene (Fragmentation Analysis): Over time, deleting data leaves "holes" (fragmentation) in database files, meaning you pay for space you aren't using.
○ Why you need it: It acts like an automated DBA. If tables are fragmented, you can click a button to reclaim gigabytes of space without manual calculation.
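The arithmetic behind a "doomsday" estimate and a fragmentation check can be sketched in a few lines. This is not the prediction model DAS uses, which the source does not describe. It is a plain linear-trend extrapolation over daily usage samples, plus a fragmentation ratio that assumes your inputs come from the `data_length`, `index_length`, and `data_free` columns of `information_schema.TABLES`. The function names and sample numbers are hypothetical.

```python
def days_until_full(daily_used_gb, capacity_gb):
    """Fit a least-squares line to daily usage samples and extrapolate to capacity.

    Returns the estimated number of days left, or None if usage is flat or shrinking.
    """
    n = len(daily_used_gb)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_gb) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_used_gb))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    growth_per_day = numerator / denominator
    if growth_per_day <= 0:
        return None
    return max((capacity_gb - daily_used_gb[-1]) / growth_per_day, 0.0)


def fragmentation_ratio(data_length, index_length, data_free):
    """Share of a table's allocated bytes that is free space (reclaimable "holes")."""
    total = data_length + index_length + data_free
    return data_free / total if total else 0.0


# Made-up numbers: usage grows 2 GB per day on a 500 GB disk.
print(days_until_full([100, 102, 104, 106, 108], capacity_gb=500))
# Made-up table sizes in bytes.
print(fragmentation_ratio(data_length=600, index_length=200, data_free=200))
```

A straight line is the simplest possible trend and will mislead you when growth is seasonal or bursty, so treat the result as a rough planning figure rather than a deadline.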
Doc: https://www.alibabacloud.com/help/en/das/user-guide/deadlock-analysis-1
Scenario: The database CPU is low, but the application is freezing. This is usually a lock problem such as a "Deadlock" or a "Metadata Lock", where one query waits on a lock held by another.
Before: The Create Analysis Task option for lock analysis is not available.

After: You can run lock analysis with one click and get immediate, in-depth troubleshooting for database freezes and blocks.

Why is automatic lock analysis important to your business?
● Real-Time "Stuck" Query Diagnosis: Instead of looking at historical logs of what happened, this task analyzes the "current sessions... in real time."
○ Value: If your application is currently hanging or unresponsive, this task tells you exactly what is happening right now.
● Identification of Invisible Blocks (Metadata Locks): It specifically looks for "metadata locks" and "transaction blockages."
○ Value: Sometimes a database isn't slow because of a complex query (CPU usage), but because one simple query is waiting for another query to finish holding a lock. This tool identifies these "traffic jams" that standard monitoring might miss.
● Visualization of Complex Dependencies: Database locks often form chains (Transaction A is waiting for B, which is waiting for C).
○ Value: By generating a "lock analysis relationship graph," it turns complex technical data into a visual map. You can hover over sessions to see exactly who is blocking whom, making it much faster to find the "root culprit" session that needs to be killed to free up the system.
● High-Precision Data: The analysis relies on information_schema and performance_schema.
○ Value: It bases its conclusions on the database's internal truth (the most accurate system tables available), ensuring the diagnostic results are precise and reliable.
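The idea behind finding the "root culprit" in a lock chain can be illustrated with a small graph walk. This is a sketch of the general technique, not the DAS implementation. It assumes you already have (waiting session, blocking session) pairs, for example from a view such as `sys.innodb_lock_waits` if your MySQL version provides it. The function name and the session IDs are hypothetical.

```python
def find_root_blockers(wait_edges):
    """Map each root blocker to the set of sessions stuck behind it.

    wait_edges: iterable of (waiting_session, blocking_session) pairs.
    A root blocker is a session that blocks others but waits on nobody.
    """
    blocked_by = {}
    for waiter, blocker in wait_edges:
        blocked_by.setdefault(waiter, set()).add(blocker)

    roots = {}
    for waiter in blocked_by:
        stack, seen = [waiter], set()
        while stack:
            session = stack.pop()
            if session in seen:
                continue  # already visited; also stops us looping on a cycle
            seen.add(session)
            blockers = blocked_by.get(session)
            if blockers:
                stack.extend(blockers)  # keep following the chain upward
            else:
                roots.setdefault(session, set()).add(waiter)
    return roots


# Session 1 waits for 2, 2 waits for 3, and 4 also waits for 3.
chains = find_root_blockers([(1, 2), (2, 3), (4, 3)])
for root, victims in chains.items():
    print(f"session {root} is blocking {sorted(victims)}")
```

In this example, ending session 3 releases all three waiters, which is the same conclusion you would read off the relationship graph. A pure cycle (a true deadlock) has no root in this sense, so the sketch returns nothing for it.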
Doc: https://www.alibabacloud.com/help/en/das/user-guide/session-management-5
In addition, you can also use Session Management to help you monitor, diagnose, and control database connections in real-time.
This feature isn't exclusive to the O&M Service, but it is used very frequently.
Sometimes, you need this feature to prevent database outages caused by stuck or resource-intensive queries. It gives you immediate visibility into what is running on your database and the control to terminate harmful processes without needing to log in via a command-line interface.
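For comparison, this is roughly what the manual equivalent looks like without a console. The sketch below assumes you have fetched rows shaped like the output of MySQL's `SHOW PROCESSLIST` (columns `Id`, `User`, `Command`, `Time`, `Info`) as dictionaries. The 60-second threshold, the protected user list, and the function name are illustrative assumptions. It only builds `KILL` statements for you to review and does not execute anything.

```python
def sessions_to_kill(processlist, max_seconds=60,
                     protected_users=("system user", "event_scheduler")):
    """Return KILL statements for long-running, non-idle sessions, longest first."""
    victims = []
    for row in processlist:
        if row["Command"] == "Sleep":
            continue  # idle connection, not a running query
        if row["User"] in protected_users:
            continue  # never touch internal threads (assumed list)
        if row["Time"] >= max_seconds:
            victims.append(row)
    victims.sort(key=lambda r: r["Time"], reverse=True)
    return [f"KILL {int(r['Id'])};" for r in victims]


# Made-up rows for illustration.
rows = [
    {"Id": 11, "User": "app", "Command": "Query", "Time": 320, "Info": "UPDATE orders ..."},
    {"Id": 12, "User": "app", "Command": "Sleep", "Time": 900, "Info": None},
    {"Id": 13, "User": "system user", "Command": "Connect", "Time": 5000, "Info": None},
    {"Id": 14, "User": "report", "Command": "Query", "Time": 75, "Info": "SELECT ..."},
]
print(sessions_to_kill(rows))
```

Reviewing the generated statements before running them matters: the longest-running session is not always the harmful one, which is why a view of who is blocking whom is worth checking first.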

The O&M Service is equally powerful for Redis architectures, addressing the two most common causes of Redis failure: Network Saturation and Memory Exhaustion. You can use it with your Tair and Redis instances.
As shown below, with the O&M Service enabled, you can see more advanced metrics, including hot keys by traffic and big keys by memory usage.
Before:


After:
● Hot Key (Traffic): Tells you what is clogging your Network.
● Big Key (Memory): Tells you what is clogging your RAM and CPU.
You can use these features to prevent two specific types of database failures: Network Saturation (choked bandwidth) and Memory Outage (OOM). They tell you which specific data is causing these problems so you can fix them, for example, moving from "My database is slow" to "Key user_rankings is too big and needs to be split."


Pro Tip: Use the "History" mode to perform weekly health checks. Look for patterns (e.g., "Key X always becomes a Big Key on Monday mornings") to refactor code before it breaks production.
Use Case: The Optimization Mode (Use "History")
● When to use: You want to perform a weekly health check or investigate why the system was slow yesterday at 3:00 PM.
● Action:
a. Click the History tab.
b. Look for patterns. Do certain keys become "Big" or "Hot" every Monday morning?
c. Refactor Code:
■ For Big Keys: Tell developers to "shard" the data (e.g., instead of one big list, break it into list_1, list_2).
■ For Hot Keys: Implement caching strategies (like CDN or local memory cache) so the database isn't hammered by heavy traffic repeatedly.
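The two refactoring ideas above can be sketched as follows. This is one possible approach under stated assumptions, not a prescribed design. `shard_key` spreads members of one big key across several smaller keys using a CRC32 hash, and `LocalTTLCache` is a tiny in-process cache that absorbs repeated reads of a hot key. Both names, the shard count, and the TTL are hypothetical, and the loader stands in for whatever call your application uses to read from Redis.

```python
import time
import zlib


def shard_key(base_key, member, shard_count=8):
    """Pick a stable shard for a member, e.g. 'list' -> 'list_0' .. 'list_7'."""
    shard = zlib.crc32(member.encode("utf-8")) % shard_count
    return f"{base_key}_{shard}"


class LocalTTLCache:
    """Minimal in-process cache: serve a hot key locally until its TTL expires."""

    def __init__(self, ttl_seconds=1.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no round trip to the database
        value = loader(key)  # loader is your own read function (assumption)
        self._entries[key] = (now + self._ttl, value)
        return value


# The same member always maps to the same shard, so reads know where to look.
print(shard_key("user_rankings", "user:42", shard_count=4))

cache = LocalTTLCache(ttl_seconds=5.0)
print(cache.get_or_load("hot_item", lambda key: f"value-for-{key}"))
```

Shard numbering here starts at 0 rather than 1, which is only a naming choice. Note that changing `shard_count` later moves members between shards, so pick it with growth in mind or plan a migration.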
The DAS O&M Service changes the role of a database administrator. It shifts your focus from fixing outages to preventing them.
● For Performance: Use Slow Log Analysis.
● For Cost & Stability: Use Storage Analysis.
● For Emergencies: Use Lock Analysis and Session Management.