How to Configure an Apache Kafka Cluster on Ubuntu 16.04

In this tutorial, we will learn how to configure an Apache Kafka cluster for stream-processing on an Alibaba Cloud ECS Ubuntu 16.04 instance.

By Hitesh Jethva, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

Apache Kafka is a free and open source stream-processing platform developed by the Apache Software Foundation and written in Scala. It is a distributed message broker designed to handle large volumes of real-time data efficiently. Compared to other message brokers such as ActiveMQ and RabbitMQ, Apache Kafka offers much higher throughput. Apache Kafka is built around a commit log: applications publish data to it, and any number of systems or real-time applications can subscribe to it. Apache Kafka can be deployed on a single server or in a distributed cluster. Apache Kafka has four major APIs: the Producer API, Consumer API, Connector API, and Streams API.

Apache Kafka features:

  1. Support for parallel data load into Hadoop.
  2. High throughput, supporting hundreds of thousands of messages per second, even with modest hardware.
  3. Persistent messaging with O(1) disk structures that provide constant time performance, even with terabytes of stored messages.
  4. The distributed system scales easily with no downtime.

In this tutorial, we will learn how to configure an Apache Kafka cluster for stream-processing on an Alibaba Cloud Elastic Compute Service (ECS) Ubuntu 16.04 instance.

Prerequisites

  1. A fresh Alibaba Cloud instance with Ubuntu 16.04 server installed.
  2. A static IP address (192.168.0.103 in this tutorial) configured on the instance.
  3. A root password set up on the server.

Launch an Alibaba Cloud ECS Instance

First, log in to your Alibaba Cloud ECS Console. Create a new ECS instance, choosing Ubuntu 16.04 as the operating system with at least 2GB RAM. Connect to your ECS instance and log in as the root user.

Once you are logged into your Ubuntu 16.04 instance, run the following command to update your base system with the latest available packages.

apt-get update -y

Install Java

Apache Kafka needs a Java runtime environment, so you will need to install Java on your system. By default, the latest version of Java is not available in the Ubuntu 16.04 repository, so you will need to add a Java repository to your system. You can do this by running the following command:

add-apt-repository ppa:webupd8team/java

Next, update the package index and install Java by running the following commands:

apt-get update -y
apt-get install oracle-java8-installer -y

Once Java is installed, you can check the Java version using the following command:

java -version

Output:

java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
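If you script this setup, you may want to confirm the Java version automatically rather than reading it by eye. The following is a minimal sketch of one way to extract the major version from the first line of the output above. The function name java_major is hypothetical and is not part of Java or Kafka.

```shell
# Hypothetical helper: print the major Java version found in the first
# line of `java -version` output (passed in as the first argument).
java_major() {
    local line="$1" ver
    # Pull out the quoted version string, e.g. 1.8.0_161
    ver=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        # Legacy scheme: 1.8.0_161 -> 8
        1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;
        # Newer scheme: 11.0.2 -> 11
        *)   printf '%s\n' "${ver%%[.-]*}" ;;
    esac
}
```

Because java -version writes to standard error, a call might look like java_major "$(java -version 2>&1 | head -n 1)", which would print 8 for the output shown above.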

Install ZooKeeper

Apache Kafka depends on ZooKeeper for maintaining configuration information, naming, distributed synchronization, and group services. So, you will need to install ZooKeeper on your system. You can install it by running the following command:

apt-get install zookeeperd -y

By default, ZooKeeper listens on port 2181. You can check this by running the following command:

netstat -nlpt | grep ':2181'

You should see the following output:

tcp6       0      0 :::2181                 :::*                    LISTEN
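Note that a plain grep for ':2181' would also match a port such as 21810 or a remote address. The following is a small sketch of a stricter check, assuming the column layout of the netstat output shown above (local address in column 4, state in column 6). The function name port_listening is hypothetical.

```shell
# Hypothetical helper: succeed if stdin (netstat -nlt style lines) contains
# a socket in LISTEN state whose local address ends in ":<port>".
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

A possible use is netstat -nlt | port_listening 2181 && echo "ZooKeeper is listening".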

Install Apache Kafka

First, you will need to download Kafka from an Apache mirror. This tutorial uses version 1.1.0. You can download it by running the following command:

wget  http://redrockdigimark.com/apachemirror/kafka/1.1.0/kafka_2.12-1.1.0.tgz 
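Since the archive is fetched over plain HTTP from a mirror, you may want to check its integrity before extracting it. The sketch below compares a file's SHA-512 digest with an expected value. It assumes that sha512sum (part of GNU coreutils) is installed and that you have obtained the expected digest yourself from the official Apache download page. The function name verify_sha512 is hypothetical.

```shell
# Hypothetical helper: succeed only if the SHA-512 digest of FILE equals
# the EXPECTED hex string (case and embedded spaces are ignored).
verify_sha512() {
    local file="$1" expected="$2" actual
    actual=$(sha512sum "$file" 2>/dev/null | awk '{ print $1 }')
    # An empty digest means the file could not be read
    [ -n "$actual" ] || return 1
    # Normalize the expected value: strip whitespace, lowercase the hex digits
    expected=$(printf '%s' "$expected" | tr -d ' \n' | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}
```

For example, verify_sha512 kafka_2.12-1.1.0.tgz "<digest from the Apache site>" || echo "checksum mismatch" would flag a corrupted download.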

Once the download is completed, extract the downloaded file using the following command:

tar -xvzf kafka_2.12-1.1.0.tgz 

Next, copy the extracted directory to /opt/Kafka:

cp -r kafka_2.12-1.1.0 /opt/Kafka

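Next, start the Kafka server by running the following script. It runs in the foreground, so leave this terminal open and use a second terminal session for the remaining steps: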

/opt/Kafka/bin/kafka-server-start.sh /opt/Kafka/config/server.properties

You should see the following output:

[2018-05-20 08:13:54,271] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
[2018-05-20 08:13:54,449] INFO Kafka version : 1.1.0 (org.apache.kafka.common.utils.AppInfoParser)
[2018-05-20 08:13:54,461] INFO Kafka commitId : fdcf75ea326b8e07 (org.apache.kafka.common.utils.AppInfoParser)
[2018-05-20 08:13:54,466] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)

Kafka server is now up and listening on port 9092.
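The steps above start a single broker. To grow this into a multi-broker cluster, each additional broker needs its own configuration file with a unique broker.id and, when brokers share a host, a different listener port and log directory. The following is a minimal sketch of how such files could be derived from the default server.properties. The function name make_broker_config and the /tmp/kafka-logs-N paths are illustrative assumptions, not something this tutorial prescribes.

```shell
# Hypothetical helper: derive a per-broker config from a base file.
# Usage: make_broker_config BASE_FILE BROKER_ID PORT OUTPUT_FILE
make_broker_config() {
    local base="$1" id="$2" port="$3" out="$4"
    # Drop existing (or commented-out) values for the three per-broker keys
    sed -E '/^#?(broker\.id|listeners|log\.dirs)=/d' "$base" > "$out" || return 1
    # Append the values for this broker
    {
        printf 'broker.id=%s\n' "$id"
        printf 'listeners=PLAINTEXT://:%s\n' "$port"
        printf 'log.dirs=/tmp/kafka-logs-%s\n' "$id"   # illustrative path
    } >> "$out"
}
```

For example, make_broker_config /opt/Kafka/config/server.properties 1 9093 /opt/Kafka/config/server-1.properties would write a second broker's configuration, which could then be started in the background with /opt/Kafka/bin/kafka-server-start.sh -daemon /opt/Kafka/config/server-1.properties.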

Test Apache Kafka

Now, create your first topic named Topic1 with a single partition and only one replica by running the following command:

/opt/Kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1  --partitions 1 --topic Topic1

You should see the following output:

Created topic "Topic1".

You can now list the topics on the Kafka server by running the following command:

/opt/Kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181

You should see the following output:

Topic1
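In a script, the same listing can be used to check whether a topic already exists before trying to create it. The following is a small sketch of that check. It assumes the listing prints one topic name per line, as in the output above, and the function name topic_exists is hypothetical.

```shell
# Hypothetical helper: succeed if the topic name given as $1 appears as a
# whole line in the topic listing read from stdin.
topic_exists() {
    # -x matches the whole line, -F treats the name as a literal string
    grep -qxF -- "$1"
}
```

A possible use is /opt/Kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181 | topic_exists Topic1 || echo "Topic1 is missing".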

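Now, post some sample messages to the Apache Kafka topic named Topic1 with the following command. The console producer reads from standard input and sends each line you type as a separate message: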

/opt/Kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Topic1

Hello Kafka
How R You
Ok

Press Ctrl+C to stop the producer. Next, run the Kafka console consumer to read the messages from the topic and print them to standard output:

/opt/Kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic Topic1 --from-beginning

You should see your posted messages in the following output:

Hello Kafka
How R You
Ok
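The manual producer and consumer test can also be scripted as a round trip. The sketch below is one possible version, assuming a broker on localhost:9092 as set up in this tutorial. It uses the console consumer's --bootstrap-server option instead of --zookeeper, since the ZooKeeper-based consumer is, as far as we are aware, deprecated in the Kafka 1.x line. The function name kafka_round_trip and the 10-second timeout are illustrative choices.

```shell
# Hypothetical helper: publish MSG to TOPIC and confirm it can be read back.
# Usage: kafka_round_trip KAFKA_BIN_DIR TOPIC MSG
kafka_round_trip() {
    local bin="$1" topic="$2" msg="$3" got
    # Publish one line; the console producer sends each stdin line as a message
    printf '%s\n' "$msg" |
        "$bin/kafka-console-producer.sh" --broker-list localhost:9092 \
            --topic "$topic" >/dev/null || return 1
    # Read the topic from the start; --timeout-ms lets the consumer exit once idle
    got=$("$bin/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
        --topic "$topic" --from-beginning --timeout-ms 10000 2>/dev/null) || true
    # Succeed only if the message appears as a whole line in the output
    printf '%s\n' "$got" | grep -qxF -- "$msg"
}
```

For example, kafka_round_trip /opt/Kafka/bin Topic1 "smoke-test-$(date +%s)" && echo "round trip ok" would send a unique message and report whether it was read back.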

That's it! You have successfully set up Apache Kafka on your Alibaba Cloud Elastic Compute Service (ECS) instance.

Related Alibaba Cloud Products

Auto Scaling is a service that automatically adjusts computing resources based on your volume of user requests. When demand for computing resources increases, Auto Scaling automatically adds ECS instances to serve additional user requests, and it removes instances when user requests decrease. The service itself is free of charge. You are charged only the standard cost of the additional ECS resources.

Auto Scaling comes with three cool features: elastic scale-out, elastic scale-in, and elastic self-health.

Elastic Scale-Out: During peak periods, Auto Scaling automatically adds computing resources to the pool.
Elastic Scale-In: When user requests decrease, Auto Scaling automatically releases ECS resources to cut down your costs.
Elastic Self-Health: When an unhealthy instance has been detected, the auto-scaling service automatically replaces the instance with a new one to ensure uninterrupted service.

Alibaba Cloud Container Service for Kubernetes is a fully managed cloud container management service that supports native Kubernetes and integrates with other Alibaba Cloud products. It replaces the need to install, operate and scale your container cluster infrastructure. Being a fully-managed service, Container Service for Kubernetes helps you to focus on your applications rather than managing container infrastructure.

Alibaba Cloud Container Service can be integrated with Server Load Balancer, VPC, and other cloud services, allowing you to manage container applications from the console or terminal. The product maintains compatibility with native Kubernetes and provides security, high availability, and stable upgrading services.

Comments

Raja_KT February 9, 2019 at 6:49 am

Good one. Will it not be better to do by a tool?