Query memory usage of various system components - ApsaraDB for ClickHouse

Use SQL queries against ApsaraDB for ClickHouse system tables to identify what is consuming memory across your cluster—caches, merges, in-memory data parts, dictionaries, special-engine tables, and queries.

Prerequisites

Before you begin, make sure you have:

An ApsaraDB for ClickHouse cluster
Access to run SQL queries against the cluster

Sample environment

The queries in this document use s-2-r-0 as the sample node name. Replace it with your actual node name before running the queries.

To find your node names:

On the Cluster Monitoring tab in the ApsaraDB for ClickHouse console. For details, see View cluster monitoring information.
Run the following SQL statement:
```
SELECT * FROM system.clusters;
```

View system memory and caches

The system.asynchronous_metrics table provides a snapshot of overall memory and cache usage. Query it to understand the baseline memory state of a node before investigating specific components.

Query memory and cache metrics

The following query returns all memory- and cache-related metrics for the s-2-r-0 node, including a human-readable size of each metric:

SELECT metric,
       value,
       formatReadableSize(value)
FROM clusterAllReplicas('default', system.asynchronous_metrics)
WHERE (metric LIKE '%Cach%'
  OR metric LIKE '%Mem%')
  AND substring(hostname(), 38, 8) = 's-2-r-0'
ORDER BY metric;

Sample output:

┌─metric───────────────────────┬───value─────┬─formatReadableSize(value)─┐
│ CGroupMemoryTotal            │ 34359738368 │ 32.00 GiB                 │
│ CGroupMemoryUsed             │  4641132544 │ 4.32 GiB                  │
│ CompiledExpressionCacheBytes │       16384 │ 16.00 KiB                 │
│ CompiledExpressionCacheCount │           2 │ 2.00 B                    │
│ FilesystemCacheBytes         │           0 │ 0.00 B                    │
│ FilesystemCacheFiles         │           0 │ 0.00 B                    │
│ HashTableStatsCacheEntries   │           8 │ 8.00 B                    │
│ HashTableStatsCacheHits      │      528011 │ 515.64 KiB                │
│ HashTableStatsCacheMisses    │          15 │ 15.00 B                   │
│ IndexMarkCacheBytes          │           0 │ 0.00 B                    │
│ IndexMarkCacheFiles          │           0 │ 0.00 B                    │
│ IndexUncompressedCacheBytes  │           0 │ 0.00 B                    │
│ IndexUncompressedCacheCells  │           0 │ 0.00 B                    │
│ MMapCacheCells               │           0 │ 0.00 B                    │
│ MarkCacheBytes               │        7968 │ 7.78 KiB                  │
│ MarkCacheFiles               │          24 │ 24.00 B                   │
│ MemoryCode                   │   390758400 │ 372.66 MiB                │
│ MemoryDataAndStack           │ 14373392384 │ 13.39 GiB                 │
│ MemoryResident               │   761221120 │ 725.96 MiB                │
│ MemoryShared                 │   377688064 │ 360.19 MiB                │
│ MemoryVirtual                │ 18072178688 │ 16.83 GiB                 │
│ OSMemoryAvailable            │ 32480075776 │ 30.25 GiB                 │
│ OSMemoryCached               │  4239949824 │ 3.95 GiB                  │
│ OSMemoryFreePlusCached       │ 32480169984 │ 30.25 GiB                 │
│ OSMemoryFreeWithoutCached    │ 28240220160 │ 26.30 GiB                 │
│ OSMemoryTotal                │ 32881352704 │ 30.62 GiB                 │
│ QueryCacheBytes              │           0 │ 0.00 B                    │
│ QueryCacheEntries            │           0 │ 0.00 B                    │
│ UncompressedCacheBytes       │      349487 │ 341.30 KiB                │
│ UncompressedCacheCells       │         114 │ 114.00 B                  │
└──────────────────────────────┴─────────────┴───────────────────────────┘

Key metrics for memory analysis:

Metric	Description
`MarkCacheBytes`	Size of the mark cache in bytes. Stores index marks within data files, allowing ClickHouse to locate data blocks without scanning entire files.
`UncompressedCacheBytes`	Size of the uncompressed data cache in bytes. Caches data blocks after they are read from disk and decompressed, so repeat access to the same blocks skips disk I/O entirely.

Clear caches

After identifying oversized caches, run the following DDL statements to release memory.

Important

ApsaraDB for ClickHouse does not support clearing caches on a single node. These commands execute as distributed DDL and clear caches across all nodes in the cluster. Cache clearing reduces available memory for ongoing queries and may slow down query execution temporarily. Proceed with caution.

Clear the mark cache:

SYSTEM DROP MARK CACHE;

Clear the uncompressed data cache:

SYSTEM DROP UNCOMPRESSED CACHE;

View memory used by merge operations

Merges run continuously in the background to consolidate data parts and improve query performance. Heavy merge activity can consume significant memory.

Check total merge memory on a node

SELECT formatReadableSize(sum(memory_usage))
FROM clusterAllReplicas('default', system.merges)
WHERE substring(hostname(), 38, 8) = 's-2-r-0';

Check merge memory within a time range

Use system.part_log to review completed merge events and their peak memory usage. The following query retrieves the most memory-intensive merge on the s-2-r-0 node between 01:15:00 and 01:30:00 on September 17, 2023:

SELECT *
FROM clusterAllReplicas('default', system.part_log)   -- Records of past part operations
WHERE (event_type = 'MergeParts')                     -- Filter for merge events only
  AND (event_time >= '2023-09-17 01:15:00')           -- Start of the time range
  AND (event_time <= '2023-09-17 01:30:00')           -- End of the time range
  AND substring(hostname(), 38, 8) = 's-2-r-0'
ORDER BY peak_memory_usage DESC
LIMIT 1;

To get the full list of available event types for your cluster, run:

SELECT DISTINCT event_type
FROM clusterAllReplicas('default', system.part_log)
WHERE substring(hostname(), 38, 8) = 's-2-r-0';

Key parameters:

Parameter	Description
`event_type`	The type of part operation. Use `MergeParts` to filter for merge events.
`event_time`	The time the event occurred. Use this to scope the query to a specific time range.
`peak_memory_usage`	The maximum memory consumed during the event. Sort by this field to find the most memory-intensive merges.

View memory used by in-memory data parts

ClickHouse supports storing data parts directly in memory (part_type = 'InMemory') to accelerate queries. Use system.parts to check how much memory these parts occupy.

SELECT sum(data_uncompressed_bytes)
FROM clusterAllReplicas('default', system.parts)
WHERE part_type = 'InMemory'
  AND substring(hostname(), 38, 8) = 's-2-r-0';

View memory used by dictionaries

Dictionaries in ApsaraDB for ClickHouse are loaded entirely into memory to speed up lookups. Use system.dictionaries to check their memory footprint.

SELECT formatReadableSize(sum(bytes_allocated))
FROM clusterAllReplicas('default', system.dictionaries)
WHERE substring(hostname(), 38, 8) = 's-2-r-0';

View memory used by Memory, Set, and Join engine tables

Tables that use the Memory, Set, and Join engines hold their data in RAM. Monitor their memory usage to avoid out-of-memory (OOM) errors when these tables grow unexpectedly.

SELECT
    `database`,
    name,
    formatReadableSize(total_bytes)
FROM clusterAllReplicas('default', system.tables)
WHERE engine IN ('Memory', 'Set', 'Join');

View memory used by queries

Use system.processes for currently running queries and system.query_log for historical analysis.

Check total memory used by running queries

SELECT formatReadableSize(sum(memory_usage))
FROM clusterAllReplicas('default', system.processes)
WHERE substring(hostname(), 38, 8) = 's-2-r-0';

Find the top memory-consuming running queries

The following query returns the 10 queries with the highest peak memory usage currently running on the s-2-r-0 node:

SELECT initial_query_id,
       elapsed,                                -- Seconds elapsed since the query started
       formatReadableSize(memory_usage),       -- Current memory used
       formatReadableSize(peak_memory_usage),  -- Peak memory used
       query                                   -- Query text
FROM clusterAllReplicas('default', system.processes)
WHERE substring(hostname(), 38, 8) = 's-2-r-0'
ORDER BY peak_memory_usage DESC
LIMIT 10;

Analyze historical query memory usage

Use system.query_log to review memory usage for past queries.

The following query returns the 10 most memory-intensive queries on the s-2-r-0 node between 17:00:00 and 17:40:00 on January 5, 2024:

SELECT type,
       event_time,
       initial_query_id,
       formatReadableSize(memory_usage),
       query
FROM clusterAllReplicas('default', system.query_log)
WHERE (event_time >= '2024-01-05 17:00:00')
  AND (event_time <= '2024-01-05 17:40:00')
  AND substring(hostname(), 38, 8) = 's-2-r-0'
ORDER BY memory_usage DESC
LIMIT 10;