
ApsaraDB RDS:Diagnose and optimize memory pressure in RDS SQL Server

Last Updated:Mar 28, 2026

Memory pressure in RDS SQL Server can silently degrade performance long before it triggers an obvious alert. Unlike CPU spikes, memory issues often appear first as elevated disk I/O or slow queries — symptoms that point elsewhere until you trace them back to the buffer pool or query workspace. This topic explains how to confirm memory pressure, identify which of the four pressure types is occurring, and apply the right fix for each.

Common signals that bring you here:

  • Error 701: insufficient memory to run a query

  • Error 802: failure to allocate buffer pool pages

  • RESOURCE_SEMAPHORE waits in query wait statistics

  • Page Life Expectancy (PLE) dropping without an obvious cause

  • Disk I/O rising while data volume stays constant

  • Queries spilling to tempdb

How SQL Server uses memory

SQL Server uses dynamic memory management. By default, it claims as much available memory as possible — using the buffer pool to cache data pages and the plan cache to store execution plans — and releases memory only when the OS signals low-memory conditions. High memory usage is therefore expected and normal in a well-utilized instance.
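A quick way to see how this memory is split is to query the Memory Manager performance counters (a sketch using the standard sys.dm_os_performance_counters DMV):

```sql
-- Buffer pool, stolen memory, and committed memory, in MB
SELECT
    counter_name,
    cntr_value / 1024.0 AS value_mb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN (
        'Database Cache Memory (KB)',  -- buffer pool data pages
        'Stolen Server Memory (KB)',   -- plan cache, locks, CLR, other internal consumers
        'Total Server Memory (KB)',    -- memory currently committed by SQL Server
        'Target Server Memory (KB)'    -- memory SQL Server aims to commit
  );
```

If Total Server Memory sits close to Target Server Memory, the instance has claimed its full allowance, which by itself is normal.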

Memory pressure occurs when the current workload exceeds what memory can support. It appears in four forms:

| Pressure type | What happens | Key signals |
| --- | --- | --- |
| Buffer pool pressure | Data pages are evicted before they can be reused, forcing disk reads | PLE drops, disk I/O rises, cache hit ratio falls |
| Query workspace pressure | Queries cannot get enough memory for sorting or hashing | RESOURCE_SEMAPHORE waits, tempdb spills, slow query execution |
| Non-buffer pool (Stolen Memory) pressure | Internal components (plan cache, locks, connections, Common Language Runtime (CLR)) consume memory that would otherwise serve the buffer pool | Stolen_Server_Memory_Kb grows, buffer pool shrinks, PLE drops indirectly |
| External memory pressure | The host OS runs low on memory and forces SQL Server to release its cache | Performance fluctuations during backups or on small instances with large datasets |

RDS automatically configures parameters to protect OS memory, but external pressure can still occur on instances where memory is undersized for the workload — for example, a 4-core, 8 GB instance managing 2 TB of data, particularly when a physical backup starts.
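External pressure can be checked from inside SQL Server (a sketch; sys.dm_os_sys_memory reports the host's view of memory):

```sql
-- OS-level memory state as seen by SQL Server
SELECT
    total_physical_memory_kb / 1024     AS total_physical_mb,
    available_physical_memory_kb / 1024 AS available_physical_mb,
    system_memory_state_desc            -- e.g. 'Available physical memory is low'
FROM sys.dm_os_sys_memory;
```

A low-memory state here while SQL Server's own counters look stable points to external pressure rather than an internal bottleneck.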

Monitoring tools availability

The diagnostic paths in this topic use different tools. Check which are available on your instance before starting.

| Tool | Where to find it | Availability |
| --- | --- | --- |
| Monitoring and Alerts | Left navigation pane of the instance details page | All editions and versions |
| Autonomy Service > Performance Insight | Left navigation pane > Autonomy Service > Performance Optimization | Supported regions and versions only; not available on RDS SQL Server 2008 R2 instances using cloud disks |
| DMV queries (SQL console) | Run directly against the instance | All editions and versions |

Determine whether memory pressure exists

Check key metrics in Monitoring and Alerts

  1. Go to the RDS instance list, select a region, and click the target instance ID.

  2. In the left navigation pane, click Monitoring and Alerts.

  3. Review the following metrics:

| Metric | What to look for |
| --- | --- |
| mem_usage | Values above 90% are generally normal. See the instance-specific thresholds below. |
| Page_life_expectancy | Sustained drops below the calculated threshold signal buffer pool pressure. |
| bufferpool_hit_ratio | A declining ratio alongside rising Page_Reads confirms buffer pool pressure. |

Instance-specific `mem_usage` thresholds:

| Instance memory | Alert threshold |
| --- | --- |
| 512 GB | 97% |
| 256 GB | 96% |
| 192 GB and smaller | 95% |

If high-memory-overhead features are enabled — such as Linked Server, CLR, In-Memory OLTP, or large numbers of Extended Events (XEvents) or Traces — actual memory consumption may significantly exceed the configured max server memory value. In that case, lower max server memory to reserve headroom for the OS and management services.

Calculate the healthy PLE threshold for your instance:

The traditional threshold of 300 seconds does not scale to large-memory instances. Use this formula instead:

PLE threshold = (Buffer pool memory in GB / 4) × 300

Examples:

  • 16 GB instance (buffer pool approximately 12 GB): healthy PLE > 900 seconds

  • 128 GB instance (buffer pool approximately 110 GB): healthy PLE > 8,250 seconds
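The current PLE can be compared against this formula directly from the DMVs (a sketch; PLE lives under the Buffer Manager counter object and the buffer pool size under Memory Manager):

```sql
-- Current PLE versus the size-adjusted healthy threshold
SELECT
    ple.cntr_value AS ple_seconds,
    -- Database Cache Memory (KB) -> GB, then (GB / 4) * 300
    (cache.cntr_value / 1024.0 / 1024.0 / 4.0) * 300 AS ple_threshold_seconds
FROM sys.dm_os_performance_counters AS ple
CROSS JOIN sys.dm_os_performance_counters AS cache
WHERE ple.object_name LIKE '%Buffer Manager%'
  AND ple.counter_name = 'Page life expectancy'
  AND cache.object_name LIKE '%Memory Manager%'
  AND cache.counter_name = 'Database Cache Memory (KB)';
```

A single reading below the threshold is not conclusive; sample over time and look for a sustained gap.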

Analyze memory composition in Performance Insight

If Monitoring and Alerts shows a potential problem, use Performance Insight to get a breakdown of what memory is being used for.

  1. In the left navigation pane, choose Autonomy Service > Performance Optimization to open the Performance Insight tab.

  2. Click Custom Metrics in the upper-right corner. Add metrics from the Memory Usage Categorization and AdvancedMemUsage groups.

Identify the pressure type and optimize

The workflow so far follows how DBAs typically approach these problems: confirm pressure at the server level, then isolate it by memory composition, then act. The scenarios below cover the final stage for each pressure type.

Scenario 1: insufficient buffer pool cache

Symptoms

  • Page_Reads rises significantly in Performance Insight alongside a drop in PLE

  • Disk I/O throughput grows while query data volume has not changed

  • PLE stays consistently below the instance-specific threshold

Diagnostic steps

  1. Confirm correlation. Verify that PLE drops and Page_Reads spikes are occurring at the same time and during business peak hours. If PLE drops during backups, index rebuilds, or checkpoints but recovers afterward, this is normal fluctuation and does not require action.

  2. Rule out short-term fluctuations. A brief PLE dip is not pressure. Look for sustained low PLE across multiple samples.

  3. Assess capacity. Compare Database_Cache_Memory_Kb against the estimated size of your active (hot) dataset. If the buffer pool can only hold a small fraction of the hot data, you have a capacity bottleneck.
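One way to estimate where buffer pool space is going is a per-database breakdown (a sketch; scanning sys.dm_os_buffer_descriptors can take a while on very large instances):

```sql
-- Buffer pool usage per database, largest first
SELECT
    DB_NAME(database_id) AS database_name,
    COUNT(*) * 8 / 1024  AS cached_mb   -- each buffer page is 8 KB
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;
```

If one cold or archival database dominates this list, archiving it frees cache for the hot dataset.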

Optimization options

| Option | When to use |
| --- | --- |
| Scale up memory | Hot dataset is significantly larger than available memory; I/O latency is the primary bottleneck |
| Add or optimize indexes | Full table scans are evicting hot pages; adding a covering index converts scans to seeks |
| Archive cold data | Historical data is competing with hot data for buffer pool space |
| Enable data compression | More data pages fit into the same buffer pool memory |
| Rebuild fragmented indexes | Fragmented indexes waste buffer pool space; rebuilding increases page density |
| Drop unused indexes | Unused indexes consume buffer pool memory during reads and increase write overhead |

Scenario 2: plan cache bloat

Symptoms

  • Stolen_Server_Memory_Kb grows continuously and remains elevated, close to the internal quota limit for the Plan Cache

  • Database_Cache_Memory_Kb is squeezed downward as Stolen Memory grows

  • CPU usage rises from frequent cache eviction and plan recompilation

  • Disk I/O rises because the buffer pool is shrinking

  • The application sends many structurally identical SQL statements with different literal values (for example, WHERE id = 123, WHERE id = 456)

Diagnostic steps

Step 1: Check the proportion of single-use plans.

A high percentage of plans executed only once indicates that the cache is filling with non-reusable plans.

WITH PlanStats AS
(
    SELECT
        cp.usecounts,
        cp.size_in_bytes / 1024.0 / 1024.0 AS size_mb
    FROM sys.dm_exec_cached_plans AS cp
    WHERE cp.cacheobjtype = 'Compiled Plan'
)
SELECT
    total_plans           = COUNT(*),
    single_use_plans      = SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END),
    single_use_ratio_pct  =
        CAST(
            100.0 * SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) / NULLIF(COUNT(*), 0)
            AS DECIMAL(5,2)
        ),
    total_plan_mb         = CAST(SUM(size_mb) AS DECIMAL(18,2)),
    single_use_plan_mb    = CAST(SUM(CASE WHEN usecounts = 1 THEN size_mb ELSE 0 END) AS DECIMAL(18,2)),
    single_use_mem_pct    =
        CAST(
            100.0 * SUM(CASE WHEN usecounts = 1 THEN size_mb ELSE 0 END)
            / NULLIF(SUM(size_mb), 0)
            AS DECIMAL(5,2)
        )
FROM PlanStats;

If single-use plans account for more than 50% of plan cache memory, the cache is being wasted on non-reusable plans.

Step 2: Identify the SQL statements generating single-use plans.

-- Top 20 ad hoc queries consuming the most plan cache memory
SELECT TOP 20
    cp.usecounts AS [execution_count],
    cp.size_in_bytes / 1024 AS [plan_size_kb],
    cp.objtype AS [object_type],
    st.text AS [sql_text]
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Adhoc'
  AND cp.usecounts = 1
ORDER BY cp.size_in_bytes DESC;

If the results show many statements with the same structure but different literal values, the root cause is unparameterized queries.

Optimization options

Option 1: Enable `optimize for ad hoc workloads` (recommended)

This parameter caches only a lightweight stub on the first execution of an ad hoc query. The full plan is stored only if the query runs again. This reduces the Stolen Memory consumed by single-use plans; note that it does not reduce compilation CPU, because each query is still compiled on its first execution.

  1. Go to the RDS instance list and open the instance details page.

  2. In the left navigation pane, click Parameter Settings.

  3. Search for optimize for ad hoc workloads and enable it.

Option 2: Parameterize queries in the application

Modify application code to use parameterized queries instead of string concatenation. If the application uses an Object-Relational Mapping (ORM) framework, check whether automatic parameterization or a second-level cache is available.

Option 3: Enable forced parameterization at the database level

If modifying the application is not feasible, enable forced parameterization on the affected database:

  1. Go to the RDS instance list and open the instance details page.

  2. In the left navigation pane, click Database Management.

  3. Click View Details for the target database.

  4. In the Basic Information section, set parameterization to FORCED and click Submit.

Important

Test forced parameterization in a staging environment first. For some queries with complex predicates, changing the execution plan can cause performance regression.
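On deployments where you can run DDL directly, the console setting in option 3 corresponds to the standard ALTER DATABASE statement (the database name below is a placeholder):

```sql
-- Enable forced parameterization for one database (test in staging first)
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;

-- Revert to the default behavior if regressions appear
-- ALTER DATABASE [YourDatabase] SET PARAMETERIZATION SIMPLE;
```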

Scenario 3: insufficient query memory grant

Symptoms

Queries that involve sorting or hashing — such as ORDER BY, GROUP BY, DISTINCT, or hash joins — request a block of memory called a memory grant. If this grant cannot be satisfied, two failure modes occur:

| Failure mode | What you observe |
| --- | --- |
| Queuing (wait) | Query stays in SUSPENDED state; RESOURCE_SEMAPHORE wait type appears; Memory Grants Pending counter rises |
| Spilling to disk | Query runs but very slowly; tempdb I/O writes rise sharply; execution plan shows a yellow warning icon on Sort or Hash Match operators with the message "Operator used tempdb to spill data" |

This pressure type does not cause PLE to drop. If PLE looks healthy but you see the above symptoms, the issue is in the query workspace, not the buffer pool.
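A quick server-level check for grant queuing (a sketch using the standard Memory Manager counters):

```sql
-- A sustained non-zero 'Memory Grants Pending' means queries are
-- queued on RESOURCE_SEMAPHORE waiting for workspace memory
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN ('Memory Grants Pending', 'Memory Grants Outstanding');
```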

Diagnostic steps

Find queries currently waiting for or consuming large memory grants:

-- Real-time memory grants
SELECT
    mg.session_id,
    mg.request_time,
    mg.grant_time, -- NULL means the query is waiting for a grant (RESOURCE_SEMAPHORE)
    (mg.requested_memory_kb / 1024.0) AS requested_mb,
    (mg.granted_memory_kb / 1024.0)   AS granted_mb,
    (mg.required_memory_kb / 1024.0)  AS required_mb,
    mg.queue_id,
    mg.wait_order,
    st.text AS sql_text,
    qp.query_plan
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qp
ORDER BY mg.granted_memory_kb DESC;

Find historical queries that have requested the largest grants:

SELECT TOP 20
    qs.execution_count,
    (qs.max_grant_kb / 1024.0)                             AS max_grant_mb,
    (qs.total_grant_kb / qs.execution_count / 1024.0)      AS avg_grant_mb,
    (qs.total_worker_time / qs.execution_count / 1000.0)   AS avg_cpu_ms,
    qs.last_execution_time,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
          WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset
          END - qs.statement_start_offset)/2) + 1) AS query_text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.max_grant_kb DESC;

In the query_plan column, search for the Spill keyword or look for warning icons on Sort or Hash Match operators.

Root causes and fixes

| Root cause | Symptoms | Fix |
| --- | --- | --- |
| Outdated statistics | Optimizer estimates 1 row but processes 1 million; tiny grant request causes frequent spills | Update statistics regularly, especially after large data changes |
| Concurrent large queries | Multiple memory-intensive reports run simultaneously; later queries queue on RESOURCE_SEMAPHORE | Schedule memory-intensive report jobs to run sequentially during off-peak hours |
| Missing indexes on sort/hash operations | Complex JOIN, GROUP BY, or ORDER BY without a covering index forces SQL Server to build large in-memory hash tables | Add covering or ordered indexes so the optimizer can avoid in-memory sorting |
| Wide row selection | SELECT * or long text columns inflate memory grant size (memory grant size = estimated rows × average row width) | Select only needed columns; break large multi-join queries into simpler steps |
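For the outdated-statistics case, a minimal fix might look like this (table name is a placeholder):

```sql
-- Refresh statistics on one table after a large data change
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Or refresh all out-of-date statistics in the current database
-- using default sampling, which is lighter than FULLSCAN
EXEC sp_updatestats;
```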

SOSNODE memory growth issue

On RDS SQL Server High-availability Edition instances with a Mirroring architecture, particularly small instances (2-core 4 GB or 4-core 8 GB), MEMORYCLERK_SOSNODE_KB may show a slow, continuous upward trend over weeks or months. This growth is unrelated to business workload peaks and does not decrease on its own.

This is a known SOSNODE object memory leak in SQL Server. Microsoft has not completely resolved it across affected versions. The leaked memory counts toward max server memory and reduces the buffer pool, eventually causing a measurable memory bottleneck. A 701 error may appear in the error log as pressure increases.
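You can track the leak directly from the memory clerk DMV (a sketch; record the value periodically to estimate the growth rate and plan the restart interval):

```sql
-- Current SOSNODE clerk size, in MB
SELECT
    type,
    SUM(pages_kb) / 1024.0 AS clerk_mb
FROM sys.dm_os_memory_clerks
WHERE type = 'MEMORYCLERK_SOSNODE'
GROUP BY type;
```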

Mitigation steps:

  1. Upgrade to a later SQL Server version to reduce exposure.

  2. Based on the observed leak rate, schedule a planned restart every 2 to 4 months during off-peak hours to release accumulated leaked memory.

RDS monitors SOSNODE growth automatically. When the metric reaches the threshold, RDS triggers a proactive operations and maintenance (O&M) task. View the corresponding task in the Event Center in the console.