The Kafka service is supported in E-MapReduce (EMR) V3.4.0 and later.

Create a Kafka cluster

In the EMR console, create a Kafka cluster. For more information, see Create a Dataflow-Kafka cluster.

Kafka clusters with local disks

When you deploy the Kafka service on instances that use local disks, you must configure the parameters described in the following table on the Configure tab of the Kafka service in the EMR console.
Parameter                     Description
default.replication.factor    The value is fixed to 3, which indicates that each partition of a topic has three replicas.
min.insync.replicas           The value is fixed to 2, which indicates that at least two in-sync replicas must acknowledge a write.

When the producer sets the acks parameter (request.required.acks in legacy producer clients) to all or -1, a write succeeds only if at least two in-sync replicas acknowledge it.
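
The interaction between the two fixed values can be illustrated with a small simulation. The following Python sketch is not Kafka client code: the function name write_outcome and its return strings are hypothetical, and it assumes standard Apache Kafka behavior, in which the broker enforces min.insync.replicas only for writes sent with acks set to all or -1.

```python
def write_outcome(acks, in_sync_replicas, min_insync_replicas=2):
    """Model how a broker treats a write under min.insync.replicas.

    acks: the producer's acks setting ("all", "-1", "1", or "0").
    in_sync_replicas: number of replicas currently in sync for the partition.
    """
    if str(acks) not in ("all", "-1"):
        # Assumption: for other acks values the minimum is not enforced,
        # so the write carries no two-replica durability guarantee.
        return "accepted without min.insync.replicas guarantee"
    if in_sync_replicas >= min_insync_replicas:
        return "committed"
    return "rejected"


REPLICATION_FACTOR = 3   # default.replication.factor on local-disk clusters
MIN_INSYNC_REPLICAS = 2  # min.insync.replicas on local-disk clusters

# Replicas that can fall out of sync while acks=all writes still succeed.
tolerated_failures = REPLICATION_FACTOR - MIN_INSYNC_REPLICAS

print(write_outcome("all", 3))
print(write_outcome("all", 1))
print(tolerated_failures)
```

With three replicas and a minimum of two in-sync replicas, one replica can be unavailable without blocking writes that use acks=all.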

Parameters

You can view the configurations of the Kafka service on the Configure tab of the Kafka service in the EMR console.
Parameter              Description
zookeeper.connect      The hostname and port of the ZooKeeper server that the Kafka cluster connects to.
kafka.heap.opts        The heap memory size of the Kafka broker.
num.io.threads         The number of I/O threads of the Kafka broker. The default value is twice the number of CPU cores of the master node.
num.network.threads    The number of network threads of the Kafka broker. The default value is the number of CPU cores of the master node.
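
The default thread counts described above follow directly from the CPU core count. As a minimal sketch, assuming a hypothetical helper named default_thread_settings (it is not part of EMR or Kafka), the values can be derived as follows:

```python
def default_thread_settings(cpu_cores):
    """Return the default broker thread counts for a given CPU core count."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be a positive integer")
    return {
        "num.io.threads": 2 * cpu_cores,   # twice the number of CPU cores
        "num.network.threads": cpu_cores,  # equal to the number of CPU cores
    }


# Example: a master node with 8 CPU cores (a hypothetical size).
for name, value in default_thread_settings(8).items():
    print(f"{name}={value}")
```

The actual values in effect for a cluster are the ones shown on the Configure tab, which may differ if they have been changed from the defaults.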