
ApsaraDB RDS:Diagnose and optimize memory pressure in RDS SQL Server

Last Updated:Mar 28, 2026

Memory pressure in RDS SQL Server can silently degrade performance long before it triggers an obvious alert. Unlike CPU spikes, memory issues often appear first as elevated disk I/O or slow queries — symptoms that point elsewhere until you trace them back to the buffer pool or query workspace. This topic explains how to confirm memory pressure, identify which of the four pressure types is occurring, and apply the right fix for each.

Common signals that bring you here:

  • Error 701: insufficient memory to run a query

  • Error 802: failure to allocate buffer pool pages

  • RESOURCE_SEMAPHORE waits in query wait statistics

  • Page Life Expectancy (PLE) dropping without an obvious cause

  • Disk I/O rising while data volume stays constant

  • Queries spilling to tempdb

How SQL Server uses memory

SQL Server uses dynamic memory management. By default, it claims as much available memory as possible — using the buffer pool to cache data pages and the plan cache to store execution plans — and releases memory only when the OS signals low-memory conditions. High memory usage is therefore expected and normal in a well-utilized instance.
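A quick way to see how this memory is split is to query the Memory Manager performance counters (a sketch using the standard sys.dm_os_performance_counters DMV):

```sql
-- Buffer pool, stolen memory, and committed memory, in MB
SELECT
    counter_name,
    cntr_value / 1024.0 AS value_mb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN (
        'Database Cache Memory (KB)',  -- buffer pool data pages
        'Stolen Server Memory (KB)',   -- plan cache, locks, CLR, other internal consumers
        'Total Server Memory (KB)',    -- memory currently committed by SQL Server
        'Target Server Memory (KB)'    -- memory SQL Server aims to commit
  );
```

If Total Server Memory sits close to Target Server Memory, the instance has claimed its full allowance, which by itself is normal.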

Memory pressure occurs when the current workload exceeds what memory can support. It appears in four forms:

| Pressure type | What happens | Key signals |
| --- | --- | --- |
| Buffer pool pressure | Data pages are evicted before they can be reused, forcing disk reads | PLE drops, disk I/O rises, cache hit ratio falls |
| Query workspace pressure | Queries cannot get enough memory for sorting or hashing | RESOURCE_SEMAPHORE waits, tempdb spills, slow query execution |
| Non-buffer pool (Stolen Memory) pressure | Internal components (plan cache, locks, connections, Common Language Runtime (CLR)) consume memory that would otherwise serve the buffer pool | Stolen_Server_Memory_Kb grows, buffer pool shrinks, PLE drops indirectly |
| External memory pressure | The host OS runs low on memory and forces SQL Server to release its cache | Performance fluctuations during backups or on small instances with large datasets |

RDS automatically configures parameters to protect OS memory, but external pressure can still occur on instances where memory is undersized for the workload — for example, a 4-core, 8 GB instance managing 2 TB of data, particularly when a physical backup starts.
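External pressure can be checked from inside SQL Server (a sketch; sys.dm_os_sys_memory reports the host's view of memory):

```sql
-- OS-level memory state as seen by SQL Server
SELECT
    total_physical_memory_kb / 1024     AS total_physical_mb,
    available_physical_memory_kb / 1024 AS available_physical_mb,
    system_memory_state_desc            -- e.g. 'Available physical memory is low'
FROM sys.dm_os_sys_memory;
```

A low-memory state here while SQL Server's own counters look stable points to external pressure rather than an internal bottleneck.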

Monitoring tools availability

The diagnostic paths in this topic use different tools. Check which are available on your instance before starting.

| Tool | Where to find it | Availability |
| --- | --- | --- |
| Monitoring and Alerts | Left navigation pane of the instance details page | All editions and versions |
| Autonomy Service > Performance Insight | Left navigation pane > Autonomy Service > Performance Optimization | Supported regions and versions only; not available on RDS SQL Server 2008 R2 instances using cloud disks |
| DMV queries (SQL console) | Run directly against the instance | All editions and versions |

Determine whether memory pressure exists

Check key metrics in Monitoring and Alerts

  1. Go to the RDS instance list, select a region, and click the target instance ID.

  2. In the left navigation pane, click Monitoring and Alerts.

  3. Review the following metrics:

| Metric | What to look for |
| --- | --- |
| mem_usage | Values above 90% are generally normal. See the instance-specific thresholds below. |
| Page_life_expectancy | Sustained drops below the calculated threshold signal buffer pool pressure. |
| bufferpool_hit_ratio | A declining ratio alongside rising Page_Reads confirms buffer pool pressure. |

Instance-specific `mem_usage` thresholds:

| Instance memory | Alert threshold |
| --- | --- |
| 512 GB | 97% |
| 256 GB | 96% |
| 192 GB and smaller | 95% |

If high-memory-overhead features are enabled — such as Linked Server, CLR, In-Memory OLTP, or large numbers of Extended Events (XEvents) or Traces — actual memory consumption may significantly exceed the configured max server memory value. In that case, lower max server memory to reserve headroom for the OS and management services.

Calculate the healthy PLE threshold for your instance:

The traditional threshold of 300 seconds does not scale to large-memory instances. Use this formula instead:

PLE threshold = (Buffer pool memory in GB / 4) × 300

Examples:

  • 16 GB instance (buffer pool approximately 12 GB): healthy PLE > 900 seconds

  • 128 GB instance (buffer pool approximately 110 GB): healthy PLE > 8,250 seconds
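The current PLE can be compared against this formula directly from the DMVs (a sketch; PLE lives under the Buffer Manager counter object and the buffer pool size under Memory Manager):

```sql
-- Current PLE versus the size-adjusted healthy threshold
SELECT
    ple.cntr_value AS ple_seconds,
    -- Database Cache Memory (KB) -> GB, then (GB / 4) * 300
    (cache.cntr_value / 1024.0 / 1024.0 / 4.0) * 300 AS ple_threshold_seconds
FROM sys.dm_os_performance_counters AS ple
CROSS JOIN sys.dm_os_performance_counters AS cache
WHERE ple.object_name LIKE '%Buffer Manager%'
  AND ple.counter_name = 'Page life expectancy'
  AND cache.object_name LIKE '%Memory Manager%'
  AND cache.counter_name = 'Database Cache Memory (KB)';
```

A single reading below the threshold is not conclusive; sample over time and look for a sustained gap.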

Analyze memory composition in Performance Insight

If Monitoring and Alerts shows a potential problem, use Performance Insight to get a breakdown of what memory is being used for.

  1. In the left navigation pane, choose Autonomy Service > Performance Optimization to open the Performance Insight tab.

  2. Click Custom Metrics in the upper-right corner. Add metrics from the Memory Usage Categorization and AdvancedMemUsage groups.

Identify the pressure type and optimize

The workflow so far follows how DBAs typically approach these problems: confirm pressure at the server level, then isolate it by memory composition, then act. The scenarios below cover the final stage for each pressure type.

Scenario 1: insufficient buffer pool cache

Symptoms

  • Page_Reads rises significantly in Performance Insight alongside a drop in PLE

  • Disk I/O throughput grows while query data volume has not changed

  • PLE stays consistently below the instance-specific threshold

Diagnostic steps

  1. Confirm correlation. Verify that PLE drops and Page_Reads spikes are occurring at the same time and during business peak hours. If PLE drops during backups, index rebuilds, or checkpoints but recovers afterward, this is normal fluctuation and does not require action.

  2. Rule out short-term fluctuations. A brief PLE dip is not pressure. Look for sustained low PLE across multiple samples.

  3. Assess capacity. Compare Database_Cache_Memory_Kb against the estimated size of your active (hot) dataset. If the buffer pool can only hold a small fraction of the hot data, you have a capacity bottleneck.
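One way to estimate where buffer pool space is going is a per-database breakdown (a sketch; scanning sys.dm_os_buffer_descriptors can take a while on very large instances):

```sql
-- Buffer pool usage per database, largest first
SELECT
    DB_NAME(database_id) AS database_name,
    COUNT(*) * 8 / 1024  AS cached_mb   -- each buffer page is 8 KB
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;
```

If one cold or archival database dominates this list, archiving it frees cache for the hot dataset.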

Optimization options

| Option | When to use |
| --- | --- |
| Scale up memory | Hot dataset is significantly larger than available memory; I/O latency is the primary bottleneck |
| Add or optimize indexes | Full table scans are evicting hot pages; adding a covering index converts scans to seeks |
| Archive cold data | Historical data is competing with hot data for buffer pool space |
| Enable data compression | More data pages fit into the same buffer pool memory |
| Rebuild fragmented indexes | Fragmented indexes waste buffer pool space; rebuilding increases page density |
| Drop unused indexes | Unused indexes consume buffer pool memory during reads and increase write overhead |

Scenario 2: plan cache bloat

Symptoms

  • Stolen_Server_Memory_Kb grows continuously and remains elevated, close to the internal quota limit for the Plan Cache

  • Database_Cache_Memory_Kb is squeezed downward as Stolen Memory grows

  • CPU usage rises from frequent cache eviction and plan recompilation

  • Disk I/O rises because the buffer pool is shrinking

  • The application sends many structurally identical SQL statements with different literal values (for example, WHERE id = 123, WHERE id = 456)

Diagnostic steps

Step 1: Check the proportion of single-use plans.

A high percentage of plans executed only once indicates that the cache is filling with non-reusable plans.

WITH PlanStats AS
(
    SELECT
        cp.usecounts,
        cp.size_in_bytes / 1024.0 / 1024.0 AS size_mb
    FROM sys.dm_exec_cached_plans AS cp
    WHERE cp.cacheobjtype = 'Compiled Plan'
)
SELECT
    total_plans           = COUNT(*),
    single_use_plans      = SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END),
    single_use_ratio_pct  =
        CAST(
            100.0 * SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) / NULLIF(COUNT(*), 0)
            AS DECIMAL(5,2)
        ),
    total_plan_mb         = CAST(SUM(size_mb) AS DECIMAL(18,2)),
    single_use_plan_mb    = CAST(SUM(CASE WHEN usecounts = 1 THEN size_mb ELSE 0 END) AS DECIMAL(18,2)),
    single_use_mem_pct    =
        CAST(
            100.0 * SUM(CASE WHEN usecounts = 1 THEN size_mb ELSE 0 END)
            / NULLIF(SUM(size_mb), 0)
            AS DECIMAL(5,2)
        )
FROM PlanStats;

If single-use plans account for more than 50% of plan cache memory, the cache is being wasted on non-reusable plans.

Step 2: Identify the SQL statements generating single-use plans.

-- Top 20 ad hoc queries consuming the most plan cache memory
SELECT TOP 20
    cp.usecounts AS [execution_count],
    cp.size_in_bytes / 1024 AS [plan_size_kb],
    cp.objtype AS [object_type],
    st.text AS [sql_text]
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Adhoc'
  AND cp.usecounts = 1
ORDER BY cp.size_in_bytes DESC;

If the results show many statements with the same structure but different literal values, the root cause is unparameterized queries.

Optimization options

Option 1: Enable `optimize for ad hoc workloads` (recommended)

This parameter caches only a lightweight stub on the first execution of an ad hoc query. The full plan is stored only if the query runs again. This reduces the Stolen Memory consumed by single-use plans; note that it does not reduce compilation CPU, because each query is still compiled on its first execution.

  1. Go to the RDS instance list and open the instance details page.

  2. In the left navigation pane, click Parameter Settings.

  3. Search for optimize for ad hoc workloads and enable it.

Option 2: Parameterize queries in the application

Modify application code to use parameterized queries instead of string concatenation. If the application uses an Object-Relational Mapping (ORM) framework, check whether automatic parameterization or a second-level cache is available.

Option 3: Enable forced parameterization at the database level

If modifying the application is not feasible, enable forced parameterization on the affected database:

  1. Go to the RDS instance list and open the instance details page.

  2. In the left navigation pane, click Database Management.

  3. Click View Details for the target database.

  4. In the Basic Information section, set parameterization to FORCED and click Submit.

Important

Test forced parameterization in a staging environment first. For some queries with complex predicates, changing the execution plan can cause performance regression.
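On deployments where you can run DDL directly, the console setting in option 3 corresponds to the standard ALTER DATABASE statement (the database name below is a placeholder):

```sql
-- Enable forced parameterization for one database (test in staging first)
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;

-- Revert to the default behavior if regressions appear
-- ALTER DATABASE [YourDatabase] SET PARAMETERIZATION SIMPLE;
```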

Scenario 3: insufficient query memory grant

Symptoms

Queries that involve sorting or hashing — such as ORDER BY, GROUP BY, DISTINCT, or hash joins — request a block of memory called a memory grant. If this grant cannot be satisfied, two failure modes occur:

| Failure mode | What you observe |
| --- | --- |
| Queuing (wait) | Query stays in SUSPENDED state; RESOURCE_SEMAPHORE wait type appears; Memory Grants Pending counter rises |
| Spilling to disk | Query runs but very slowly; tempdb I/O writes rise sharply; execution plan shows a yellow warning icon on Sort or Hash Match operators with the message "Operator used tempdb to spill data" |

This pressure type does not cause PLE to drop. If PLE looks healthy but you see the above symptoms, the issue is in the query workspace, not the buffer pool.
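A quick server-level check for grant queuing (a sketch using the standard Memory Manager counters):

```sql
-- A sustained non-zero 'Memory Grants Pending' means queries are
-- queued on RESOURCE_SEMAPHORE waiting for workspace memory
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN ('Memory Grants Pending', 'Memory Grants Outstanding');
```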

Diagnostic steps

Find queries currently waiting for or consuming large memory grants:

-- Real-time memory grants
SELECT
    mg.session_id,
    mg.request_time,
    mg.grant_time, -- NULL means the query is waiting for a grant (RESOURCE_SEMAPHORE)
    (mg.requested_memory_kb / 1024.0) AS requested_mb,
    (mg.granted_memory_kb / 1024.0)   AS granted_mb,
    (mg.required_memory_kb / 1024.0)  AS required_mb,
    mg.queue_id,
    mg.wait_order,
    st.text AS sql_text,
    qp.query_plan
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qp
ORDER BY mg.granted_memory_kb DESC;

Find historical queries that have requested the largest grants:

SELECT TOP 20
    qs.execution_count,
    (qs.max_grant_kb / 1024.0)                             AS max_grant_mb,
    (qs.total_grant_kb / qs.execution_count / 1024.0)      AS avg_grant_mb,
    (qs.total_worker_time / qs.execution_count / 1000.0)   AS avg_cpu_ms,
    qs.last_execution_time,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
          WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset
          END - qs.statement_start_offset)/2) + 1) AS query_text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.max_grant_kb DESC;

In the query_plan column, search for the Spill keyword or look for warning icons on Sort or Hash Match operators.

Root causes and fixes

| Root cause | Symptoms | Fix |
| --- | --- | --- |
| Outdated statistics | Optimizer estimates 1 row but processes 1 million; tiny grant request causes frequent spills | Update statistics regularly, especially after large data changes |
| Concurrent large queries | Multiple memory-intensive reports run simultaneously; later queries queue on RESOURCE_SEMAPHORE | Schedule memory-intensive report jobs to run sequentially during off-peak hours |
| Missing indexes on sort/hash operations | Complex JOIN, GROUP BY, or ORDER BY without a covering index forces SQL Server to build large in-memory hash tables | Add covering or ordered indexes so the optimizer can avoid in-memory sorting |
| Wide row selection | SELECT * or long text columns inflate memory grant size (memory grant size = estimated rows × average row width) | Select only needed columns; break large multi-join queries into simpler steps |
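For the outdated-statistics case, a minimal fix might look like this (table name is a placeholder):

```sql
-- Refresh statistics on one table after a large data change
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Or refresh all out-of-date statistics in the current database
-- using default sampling, which is lighter than FULLSCAN
EXEC sp_updatestats;
```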

SOSNODE memory growth issue

On RDS SQL Server High-availability Edition instances with a Mirroring architecture, particularly small instances (2-core 4 GB or 4-core 8 GB), MEMORYCLERK_SOSNODE_KB may show a slow, continuous upward trend over weeks or months. This growth is unrelated to business workload peaks and does not decrease on its own.

This is a known SOSNODE object memory leak in SQL Server. Microsoft has not completely resolved it across affected versions. The leaked memory counts toward max server memory and reduces the buffer pool, eventually causing a measurable memory bottleneck. A 701 error may appear in the error log as pressure increases.
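You can track the leak directly from the memory clerk DMV (a sketch; record the value periodically to estimate the growth rate and plan the restart interval):

```sql
-- Current SOSNODE clerk size, in MB
SELECT
    type,
    SUM(pages_kb) / 1024.0 AS clerk_mb
FROM sys.dm_os_memory_clerks
WHERE type = 'MEMORYCLERK_SOSNODE'
GROUP BY type;
```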

Mitigation steps:

  1. Upgrade to a later SQL Server version to reduce exposure.

  2. Based on the observed leak rate, schedule a planned restart every 2 to 4 months during off-peak hours to release accumulated leaked memory.

RDS monitors SOSNODE growth automatically. When the metric reaches the threshold, RDS triggers a proactive operations and maintenance (O&M) task. View the corresponding task in the Event Center in the console.