When you deploy an ApsaraMQ for Confluent cluster, throughput requirements, partition count, data retention needs, and latency tolerance all affect the resources you need. This topic walks through how to estimate resources for each cluster component and provides recommended specifications for production and test environments.
Architecture overview
An ApsaraMQ for Confluent cluster consists of seven components. The following table lists each component with its default replica count and role.
| Component | Default replicas | Role |
|---|---|---|
| Kafka Broker | 3 | Handles produce and consume requests |
| ZooKeeper | 3 | Manages cluster metadata and coordination |
| Connect | 2 | Runs source and sink connectors for data integration |
| SchemaRegistry | 2 | Enforces schema management and compatibility |
| ControlCenter | 1 | Provides web-based monitoring and management |
| KsqlDB | 2 | Processes streams with a SQL interface |
| KafkaRestProxy | 2 | Exposes an HTTP/REST interface for producing and consuming messages |
Adjust the replica count for each component based on your workload.
Estimate Kafka Broker resources
Kafka Broker sizing has the largest impact on cluster performance. Define your workload parameters first, then use the formulas in this section to calculate broker count, Compute Units (CUs) per broker, and disk size.
Define workload parameters
| Parameter | Description | Default |
|---|---|---|
| Fan-out factor | Number of independent consumer groups reading the same data. Excludes inter-broker replication traffic. | -- |
| Peak data inflow | Maximum producer throughput (MB/s) | -- |
| Average data inflow | Average producer throughput (MB/s) | -- |
| Data retention period | How long messages are stored before deletion | 7 days |
| Replication factor | Number of copies per partition | 3 |
Calculate broker count
A single Kafka Broker supports up to 100 MB/s of throughput. Plan for 50% I/O bandwidth headroom to handle traffic spikes.
Production clusters require at least 4 brokers:
Broker count = Max(4, peak_inflow x (fan_out + 2 x replication_factor - 1) x 2 / 400)

Test clusters require at least 3 brokers:
Broker count = Max(3, peak_inflow x (fan_out + 2 x replication_factor - 1) x 2 / 300)

Partition limits also constrain broker count:
Each broker: no more than 2,000 partition replicas
Entire cluster: no more than 200,000 partition replicas
If your total partition replica count is high, size the broker count based on partition limits rather than throughput alone.
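The production and test formulas above can be expressed as a short sketch. This is an illustrative helper, not part of the product; rounding the raw result up to a whole broker is an assumption, since the formulas in this topic do not state a rounding rule.

```python
import math

def broker_count(peak_inflow_mbps, fan_out, replication_factor=3, production=True):
    """Estimate the Kafka Broker count from this topic's sizing formula:

        Max(min_brokers, peak x (fan_out + 2 x RF - 1) x 2 / divisor)

    where (min_brokers, divisor) is (4, 400) for production clusters
    and (3, 300) for test clusters.
    """
    min_brokers, divisor = (4, 400) if production else (3, 300)
    raw = peak_inflow_mbps * (fan_out + 2 * replication_factor - 1) * 2 / divisor
    # Round up to a whole broker (assumption: fractional brokers make no sense).
    return max(min_brokers, math.ceil(raw))

# The worked example below: 200 MB/s peak inflow, fan-out 2, replication factor 3.
print(broker_count(200, 2))  # -> 7
```

Remember to check the result against the per-broker (2,000) and per-cluster (200,000) partition replica limits before settling on a count.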
Worked example
Consider a production workload with:
Peak inflow: 200 MB/s
Fan-out factor: 2 (two consumer groups)
Replication factor: 3 (default)
Broker count = Max(4, 200 x (2 + 2 x 3 - 1) x 2 / 400)
= Max(4, 200 x 7 x 2 / 400)
= Max(4, 7)
= 7 brokers

Choose CUs per broker
CU requirements depend on cluster configuration, client behavior, partition count, and the number of producers and consumers. Use these guidelines as a starting point:
| Environment | CUs per broker | Partition limits per broker (at 4 CUs) |
|---|---|---|
| Production | 8 or more | 100 leader replicas or 300 partition replicas (including leaders) |
| Development and testing | 4 | 100 leader replicas or 300 partition replicas (including leaders) |
Calculate disk size per broker
Disk per broker = Max(1 TB, average_inflow x retention_period x replication_factor / broker_count)

Convert all values to consistent units before calculating. If average_inflow is in MB/s, convert retention_period to seconds (1 day = 86,400 seconds).
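The disk formula and the unit conversion can be sketched as follows. The 1 TB floor is expressed here as 1,048,576 MB (binary convention), an assumption chosen to match the 12.4 TB rounding in the worked example below.

```python
def disk_per_broker_mb(avg_inflow_mbps, retention_days, replication_factor, brokers):
    """Disk per broker in MB:
    Max(1 TB, average_inflow x retention_period x replication_factor / broker_count).
    """
    retention_s = retention_days * 86_400          # 1 day = 86,400 seconds
    raw_mb = avg_inflow_mbps * retention_s * replication_factor / brokers
    return max(1_048_576, raw_mb)                  # floor at 1 TB (assumed binary: 2**20 MB)

# 50 MB/s average inflow, 7-day retention, replication factor 3, 7 brokers.
mb = disk_per_broker_mb(50, 7, 3, 7)
print(f"{mb / 1_048_576:.1f} TB per broker")  # -> 12.4 TB per broker
```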
Worked example
With the following parameters:
Average inflow: 50 MB/s
Retention period: 7 days (604,800 seconds)
Replication factor: 3
Broker count: 7
Disk per broker = Max(1 TB, 50 MB/s x 604,800 s x 3 / 7)
= Max(1 TB, ~12,960,000 MB)
= Max(1 TB, 12.4 TB)
= 12.4 TB per broker

Estimate resources for other components
Connect
| Resource | Recommendation |
|---|---|
| Nodes | 2 or more for high availability |
| CUs per node | 8 or more |
SchemaRegistry
| Resource | Recommendation |
|---|---|
| Nodes | 2 |
| CUs per node | 2 |
ControlCenter
| Resource | Recommendation |
|---|---|
| Nodes | 1 |
| CUs | More than 4 |
| Storage | 300 GB or more |
KsqlDB
| Resource | Recommendation |
|---|---|
| Nodes | 2 or more for high availability |
| CUs per node | 5 or more |
| Storage | 100 GB (default). Increase based on your aggregation statement count and concurrent query volume. |
KafkaRestProxy
| Resource | Recommendation |
|---|---|
| Nodes | 2 or more for high availability |
| CUs per node | 8 or more for continuous produce/consume workloads; 4 for light usage |
Performance benchmarks
The following benchmarks were measured on a 4-broker cluster with a single topic, 300 partitions, 60 producers, and 1 KB messages.
Throughput and latency by CU count
| Broker spec | Total cluster throughput (unthrottled) | Avg. producer throughput (unthrottled) | Avg. latency (unthrottled) | Total throughput (latency < 100 ms) |
|---|---|---|---|---|
| 4 CU per broker | 370 MB/s | 5.95 MB/s | 9,718 ms | 130 MB/s |
| 8 CU per broker | 400 MB/s | 7.33 MB/s | 8,351 ms | 195 MB/s |
| 12 CU per broker | 400 MB/s | 7.39 MB/s | 8,343 ms | 240 MB/s |
| 16 CU per broker | 400 MB/s | 7.47 MB/s | 8,335 ms | 290 MB/s |
| 20 CU per broker | 400 MB/s | 7.58 MB/s | 8,237 ms | 305 MB/s |
With 4 brokers at 8 or more CUs each, the cluster reaches a baseline throughput of 400 MB/s. Additional CUs primarily improve latency-constrained throughput rather than peak throughput.
Actual performance varies depending on message size, partition count, consumer count, and client configuration. Use these benchmarks as a baseline, not a guarantee.
Scale out with additional brokers
Each additional broker adds approximately 100 MB/s of throughput. Allocate at least 8 CUs per broker when scaling out to prevent compute from becoming a bottleneck.
| Brokers | Cluster throughput | Messages per hour | Partitions supported (at 1 MB/s per partition) |
|---|---|---|---|
| 4 | 400 MB/s | 14.7 billion | 400 |
| 8 | 800 MB/s | 29.5 billion | 800 |
| 12 | 1,200 MB/s | 44.2 billion | 1,200 |
| 16 | 1,600 MB/s | 59 billion | 1,600 |
| 20 | 2,000 MB/s | 73.7 billion | 2,000 |
The recommended throughput per partition is 1--5 MB/s. For low-latency workloads, keep per-partition throughput at the lower end. As partition count grows, cluster throughput decreases and tail latency increases.
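The scale-out relationship above (about 100 MB/s per added broker, partitions sized by per-partition throughput) can be sketched as a small planning helper. The function name and defaults are illustrative, not part of the product; the 100 MB/s per broker and the 1-5 MB/s per-partition range come from this topic.

```python
def scale_out_plan(brokers, per_partition_mbps=1.0, per_broker_mbps=100):
    """Return (cluster throughput in MB/s, supported partition count)
    for a given broker count, assuming ~100 MB/s per broker and a
    target per-partition throughput (recommended range: 1-5 MB/s).
    """
    cluster_mbps = brokers * per_broker_mbps
    partitions = int(cluster_mbps / per_partition_mbps)
    return cluster_mbps, partitions

print(scale_out_plan(8))                        # -> (800, 800)
print(scale_out_plan(8, per_partition_mbps=5))  # -> (800, 160)
```

At the 5 MB/s end of the range the same cluster supports far fewer partitions, which is why low-latency workloads should stay near 1 MB/s per partition.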
Recommended cluster specifications
The following tables provide starting-point specifications for production and test environments. Adjust based on your workload characteristics.
Production environment
Total cluster throughput: 400 MB/s (excluding replication traffic).
| Component | CUs per node | Disk per node | Nodes |
|---|---|---|---|
| Kafka Broker | 12 | 2,400 GB | 4 |
| ZooKeeper | 4 | 100 GB | 3 |
| Connect | 12 | -- | 2 |
| ControlCenter | 12 | 300 GB | 1 |
| SchemaRegistry | 2 | -- | 2 |
| KafkaRestProxy | 16 | -- | 2 |
| KsqlDB | 5 | 100 GB | 2 |
Test environment
Total cluster throughput: 300 MB/s (excluding replication traffic).
| Component | CUs per node | Disk per node | Nodes |
|---|---|---|---|
| Kafka Broker | 4 | 800 GB | 3 |
| ZooKeeper | 2 | 100 GB | 3 |
| Connect | 4 | -- | 2 |
| ControlCenter | 4 | 300 GB | 1 |
| SchemaRegistry | 2 | -- | 2 |
| KafkaRestProxy | 4 | -- | 2 |
| KsqlDB | 5 | 100 GB | 2 |
After you create a cluster, you can adjust resource configurations by scaling individual components up or out.
Component resource ranges
The following table lists the supported configuration ranges for each component.
| Component | Edition | Replicas (default / min / max) | CUs per node (default / min / max) | Disk per node (default; range) |
|---|---|---|---|---|
| Kafka Broker | Professional, Enterprise | 3 / 3 / 20 | 4 / 4 / 20 | 800 GB; 800--30,000 GB |
| ZooKeeper | Professional, Enterprise | 3 / 3 / 3 | 2 / 2 / 20 | 100 GB; 100--30,000 GB |
| ControlCenter | Professional, Enterprise | 1 / 1 / 1 | 8 / 8 / 20 | 300 GB; 300--30,000 GB |
| SchemaRegistry | Professional, Enterprise | 2 / 2 / 3 | 1 / 1 / 20 | No storage |
| Connect | Professional, Enterprise (optional) | 2 / 1 / 20 | 4 / 1 / 20 | No storage |
| KsqlDB | Professional, Enterprise (optional) | 2 / 1 / 20 | 5 / 5 / 20 | 100 GB; 100--30,000 GB |
| KafkaRestProxy | Professional, Enterprise (optional) | 2 / 2 / 20 | 4 / 4 / 20 | No storage |
References
Sizing Calculator for Apache Kafka and Confluent Platform -- model your specific workload with Confluent's interactive calculator.
Confluent Platform system requirements -- additional hardware guidance from Confluent.