This topic describes how to configure client parameters for ApsaraMQ for Kafka. Properly configured parameters directly affect message throughput, delivery reliability, and consumer stability. The following sections cover producer and consumer parameters with recommended values and tuning guidance for production workloads.
Producer parameters
Message delivery
acks
Controls how many broker acknowledgments the producer requires before considering a send successful.
| Value | Behavior | Trade-off |
|---|---|---|
| 0 | No acknowledgment from the broker. | Highest throughput, highest risk of data loss. |
| 1 | Acknowledgment after the leader writes the data. | Balanced throughput and durability. Data loss is possible if the leader fails before followers replicate the data. |
| all | Acknowledgment after the leader and all in-sync replicas write the data. | Lowest throughput, strongest durability. Data loss occurs only if the leader and all in-sync replicas fail simultaneously. |
Recommended value: 1 for most workloads that prioritize throughput over strict durability.
retries
Maximum number of times the producer retries a failed send. A higher value helps the producer recover from transient broker failures such as leader elections. Combine with retry.backoff.ms to control retry pacing.
retry.backoff.ms
Delay between send retries. A value that is too low can cause retry storms during broker failovers.
| Recommended | Default | Unit |
|---|---|---|
| 1000 | 100 | milliseconds |
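The delivery settings above can be collected into a single producer configuration. This is a sketch using standard Kafka configuration key names in a plain dict (the dict-of-keys form matches librdkafka-style clients; the `retries` value of 3 is illustrative, not from the original text):

```python
# Producer delivery settings, combining the acks, retries, and
# retry.backoff.ms recommendations above.
producer_delivery_config = {
    "acks": "1",               # leader-only acknowledgment (throughput-leaning)
    "retries": 3,              # illustrative; set per availability requirements
    "retry.backoff.ms": 1000,  # recommended backoff to avoid retry storms
}
```

For strict durability, switch `"acks"` to `"all"` and accept the throughput cost described above.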
Batching and throughput
Batching amortizes network overhead by combining multiple records into a single request. Two parameters control when a batch is sent: size and time.
batch.size
Maximum size of a single batch per partition. When a batch reaches this size, the producer sends it immediately.
| Type | Default | Valid values | Unit |
|---|---|---|---|
| int | 16384 | [0,...] | bytes |
Keep the default value of 16384 for most workloads. A smaller value increases the number of network requests and reduces throughput. If you increase batch.size, make sure buffer.memory is large enough to accommodate the larger batches.
linger.ms
Maximum time the producer waits for a batch to fill before sending it. This works like Nagle's algorithm in TCP: once a batch reaches batch.size, it is sent immediately regardless of the linger timer. If the batch is still below batch.size when linger.ms elapses, the producer sends whatever has accumulated.
| Recommended | Default | Unit |
|---|---|---|
| 100 to 1000 | 0 | milliseconds |
A higher linger.ms increases batching efficiency and throughput at the cost of per-message latency.
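As a sketch, the two batching triggers map onto configuration like this (standard Kafka key names; the dict form is an assumption matching librdkafka-style clients):

```python
# Batching configuration: a batch is sent when it reaches batch.size
# OR when linger.ms elapses, whichever comes first.
producer_batching_config = {
    "batch.size": 16384,  # default; per-partition batch ceiling in bytes
    "linger.ms": 100,     # low end of the recommended 100-1000 ms range
}
```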
Memory management
buffer.memory
Total memory the producer allocates for buffering unsent records across all partitions. If this pool is exhausted, send() blocks or throws an exception depending on max.block.ms. An undersized buffer causes slow memory allocation, reduced throughput, or send timeouts.
Unit: bytes. Default: 33554432 (32 MB).
Sizing formula:
buffer.memory >= batch.size x number_of_partitions x 2

For example, with batch.size=16384 and 50 partitions:

16384 x 50 x 2 = 1,638,400 bytes (~1.6 MB minimum)

If you increase batch.size for throughput, scale buffer.memory proportionally.
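The sizing formula above can be expressed as a small helper (the function name is illustrative):

```python
def min_buffer_memory(batch_size: int, num_partitions: int) -> int:
    """Lower bound for buffer.memory: batch.size x partitions x 2."""
    return batch_size * num_partitions * 2

# Worked example from the text: batch.size=16384 and 50 partitions.
needed = min_buffer_memory(16384, 50)  # 1,638,400 bytes (~1.6 MB)
default_buffer = 33554432              # producer default (32 MB)
assert needed <= default_buffer        # the default comfortably covers this case
```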
Partitioning
partitioner.class
Determines how the producer assigns records to partitions. The sticky partitioning strategy reduces the number of incomplete batches by filling one partition's batch before moving to the next.
| Kafka client version | Default strategy |
|---|---|
| 2.4 and later | Sticky partitioning (default) |
| Earlier than 2.4 | Round-robin |
If your producer client is earlier than version 2.4, explicitly set the sticky partitioner to improve batching efficiency.
Consumer parameters
Fetch tuning
These parameters control how much data the consumer retrieves per fetch request. Tuning them affects both throughput and latency.
fetch.min.bytes
Minimum amount of data the broker accumulates before returning a fetch response. A larger value reduces fetch frequency and broker CPU overhead, which improves throughput but increases end-to-end message latency. Unit: bytes.
Evaluate your producer's message rate before setting this value. If messages arrive slowly, a large fetch.min.bytes adds unnecessary delay.
fetch.max.wait.ms
Maximum time the broker waits to accumulate fetch.min.bytes before returning a response. Unit: milliseconds.
Behavior varies by storage type:
- Local storage: The broker waits until fetch.min.bytes is reached or fetch.max.wait.ms elapses, whichever comes first.
- Cloud storage: The broker returns a response immediately when new data arrives, regardless of fetch.min.bytes.
max.partition.fetch.bytes
Maximum amount of data the broker returns per partition in a single fetch. Unit: bytes.
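The three fetch parameters above work together; a sketch of a starting configuration using their defaults (standard Kafka key names; the dict form is an assumption matching librdkafka-style clients):

```python
# Consumer fetch tuning: all three values below are the Kafka defaults.
# Raise fetch.min.bytes only for high-rate topics where added latency
# is acceptable, per the guidance above.
consumer_fetch_config = {
    "fetch.min.bytes": 1,                  # broker responds as soon as data exists
    "fetch.max.wait.ms": 500,              # cap on broker wait time
    "max.partition.fetch.bytes": 1048576,  # 1 MB per-partition fetch cap
}
```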
Session management and rebalancing
Misconfigured session and polling parameters are the most common cause of unexpected consumer rebalances. A rebalance pauses all consumption in the group until partitions are reassigned, so avoid triggering unnecessary rebalances.
session.timeout.ms
Maximum time between heartbeats before the broker considers the consumer dead and triggers a rebalance.
| Recommended | Valid range | Default | Unit |
|---|---|---|---|
| 30000 to 60000 | 6000 to 300000 | 10000 | milliseconds |
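The heartbeat interval should stay well below the session timeout; the common rule of thumb used in this document is one-third. As a quick sizing helper (the function name is illustrative):

```python
def max_heartbeat_interval_ms(session_timeout_ms: int) -> int:
    """Heartbeat ceiling: one-third of session.timeout.ms."""
    return session_timeout_ms // 3

# Worked example from the text: a 45-second session timeout.
assert max_heartbeat_interval_ms(45000) == 15000
```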
In Java client 0.10.1 and later, heartbeats are sent on a background thread, independent of poll(). In earlier Java versions or non-Java clients, heartbeats are sent during poll() calls, so session.timeout.ms must account for both data processing time and the heartbeat interval. Set heartbeat.interval.ms to no more than one-third of session.timeout.ms. For example, if session.timeout.ms is 45000, set heartbeat.interval.ms to 15000 or lower.
max.poll.records
Maximum number of records returned in a single poll() call. If the consumer cannot process this many records before the next poll() deadline, the broker considers it dead and triggers a rebalance.
Sizing formula:
max.poll.records < messages_per_thread_per_second x consumer_threads x session_timeout_seconds

For example, with 500 msg/s per thread, 4 threads, and a 45-second session timeout:

500 x 4 x 45 = 90,000

Set max.poll.records below this value to ensure the consumer always finishes processing before the session times out.
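The sizing formula above as a helper (the function name is illustrative):

```python
def max_poll_records_bound(msgs_per_thread_per_sec: int,
                           threads: int,
                           session_timeout_sec: int) -> int:
    """Upper bound for max.poll.records from the sizing formula above."""
    return msgs_per_thread_per_sec * threads * session_timeout_sec

# Worked example from the text: 500 msg/s per thread, 4 threads, 45 s timeout.
bound = max_poll_records_bound(500, 4, 45)  # 90,000
# Choose max.poll.records safely below the bound; the Kafka default of 500
# is far inside it for this workload.
assert 500 < bound
```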
max.poll.interval.ms
Maximum interval between consecutive poll() calls before the broker removes the consumer from the group. This parameter applies only to Java client 0.10.1 and later, where heartbeats run on a separate thread.
| Default | Unit |
|---|---|
| 300000 | milliseconds |
Sizing formula:
max.poll.interval.ms > time_per_record x max.poll.records

In most cases, the default of 300000 (5 minutes) is sufficient. Increase it only if your processing logic is exceptionally slow.
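The check above as a helper (the function name and the per-record time are illustrative assumptions):

```python
def min_poll_interval_ms(time_per_record_ms: float, max_poll_records: int) -> float:
    """Lower bound for max.poll.interval.ms from the formula above."""
    return time_per_record_ms * max_poll_records

# Illustrative numbers: 10 ms per record with the default 500 records per poll.
needed = min_poll_interval_ms(10, 500)  # 5,000 ms
assert needed < 300000                  # the default 5-minute interval suffices
```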
Offset management
enable.auto.commit
Controls whether the consumer automatically commits offsets at a fixed interval.
| Value | Behavior |
|---|---|
| true (default) | Offsets are committed automatically every auto.commit.interval.ms milliseconds. Simpler to manage, but may cause duplicate processing after a crash. |
| false | Your application must call commitSync() or commitAsync() explicitly. Use this for at-least-once or exactly-once semantics when combined with idempotent processing. |
auto.commit.interval.ms
Interval for automatic offset commits when enable.auto.commit is true.
| Default | Unit |
|---|---|
| 1000 | milliseconds |
A shorter interval reduces the window for duplicate messages after a crash but increases the number of commit requests to the broker.
auto.offset.reset
Determines what happens when the consumer has no committed offset or the committed offset is invalid (e.g., the offset has been deleted due to retention policies).
| Value | Behavior |
|---|---|
| latest | Start consuming from the most recent offset. New messages only. |
| earliest | Start consuming from the oldest available offset. Reprocesses all retained messages. |
| none | Throw an exception. Use this when your application manages offsets manually. |
Recommended value: latest. Using earliest causes the consumer to reprocess all retained messages whenever an invalid offset is encountered, which can lead to duplicate processing and a spike in consumer lag.
If your application handles offset management manually (e.g., storing offsets in an external database), set this to none.
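The offset-management guidance above yields two typical profiles. A sketch using standard Kafka key names (the dict form is an assumption matching librdkafka-style clients):

```python
# Profile 1: automatic commits with the recommended reset policy.
auto_offset_config = {
    "enable.auto.commit": True,       # default
    "auto.commit.interval.ms": 1000,  # default commit interval
    "auto.offset.reset": "latest",    # recommended: new messages only
}

# Profile 2: manual commits with externally stored offsets, per the
# guidance above; the application calls commitSync()/commitAsync() itself.
manual_offset_config = {
    "enable.auto.commit": False,  # application commits explicitly
    "auto.offset.reset": "none",  # fail fast if no valid offset exists
}
```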
Parameter quick reference
Producer parameters
| Parameter | Default | Recommended | Unit |
|---|---|---|---|
| acks | -- | 1 (throughput) or all (durability) | -- |
| retries | -- | Set based on availability requirements | -- |
| retry.backoff.ms | 100 | 1000 | ms |
| batch.size | 16384 | 16384 (default) | bytes |
| linger.ms | 0 | 100 to 1000 | ms |
| buffer.memory | 33554432 | >= batch.size x partitions x 2 | bytes |
| partitioner.class | Sticky (2.4+) | Sticky partitioning | -- |
Consumer parameters
| Parameter | Default | Recommended | Unit |
|---|---|---|---|
| fetch.min.bytes | 1 | Tune based on producer message rate | bytes |
| fetch.max.wait.ms | 500 | -- | ms |
| max.partition.fetch.bytes | 1048576 | -- | bytes |
| session.timeout.ms | 10000 | 30000 to 60000 | ms |
| heartbeat.interval.ms | 3000 | <= 1/3 of session.timeout.ms | ms |
| max.poll.records | 500 | See sizing formula | -- |
| max.poll.interval.ms | 300000 | 300000 (default) | ms |
| enable.auto.commit | true | Depends on delivery semantics | -- |
| auto.commit.interval.ms | 1000 | 1000 (default) | ms |
| auto.offset.reset | latest | latest | -- |