How to Set Up Apache Cassandra on Ubuntu 16.04

In this tutorial, we will learn how to install and configure a single-node Apache Cassandra setup on an Alibaba Cloud ECS instance running Ubuntu 16.04.

By Hitesh Jethva, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

Apache Cassandra is a free and open source NoSQL database management system for storing large amounts of data in a decentralized, highly available cluster. It is designed to handle large amounts of data across many servers and to provide high availability with no single point of failure. Cassandra's data model is inspired by Google Bigtable, and the project was originally developed at Facebook to power its inbox search feature. It differs sharply from relational database management systems.

Features

  1. Data is distributed across the cluster, so each node holds a different portion of the data and there is no single point of failure. Failed nodes can be replaced with no downtime.
  2. Designed for scalable read and write throughput: you can add new nodes to increase capacity without any downtime or interruption.
  3. Supports MapReduce and Hadoop integration.
  4. Cassandra is designed as a distributed system, so you can deploy large numbers of nodes across multiple data centers.
  5. Supports strong or eventual data consistency across a widely distributed cluster.
  6. It performs fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
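
The choice between strong and eventual consistency in feature 5 comes down to simple arithmetic: a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (RF). The following is a minimal sketch of that rule in shell. The function names quorum and is_strong are made up for this illustration and are not part of Cassandra.

```shell
# quorum RF: size of a QUORUM for replication factor RF (a majority of replicas).
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# is_strong RF R W: succeeds when R + W > RF, meaning every set of replicas
# read overlaps every set of replicas written in at least one node.
is_strong() {
    [ $(( $2 + $3 )) -gt "$1" ]
}

rf=3
q=$(quorum "$rf")
if is_strong "$rf" "$q" "$q"; then
    echo "RF=$rf: QUORUM reads + QUORUM writes ($q + $q) are strongly consistent"
fi
if ! is_strong "$rf" 1 1; then
    echo "RF=$rf: ONE reads + ONE writes (1 + 1) are only eventually consistent"
fi
```

On the single-node setup built in this tutorial the replication factor is 1, so every read and write touches the only replica and the distinction does not arise.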

In this tutorial, we will install and configure a single-node Apache Cassandra setup on Ubuntu 16.04, using an Alibaba Cloud Elastic Compute Service (ECS) instance.

Prerequisites

  1. A fresh Alibaba Cloud Ubuntu 16.04 instance with a minimum of 2GB RAM.
  2. A static IP address (192.168.0.103 in this tutorial) configured on the instance.
  3. A root password set up on the server.
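
If you want to confirm the first prerequisite from the command line, a small check along the following lines may help. This is only a sketch: the function names are hypothetical, and the MIN_KB threshold is an assumption chosen slightly below 2GB because the kernel reserves some memory, so a 2GB instance usually reports a little less in /proc/meminfo.

```shell
# is_ubuntu_1604 FILE: succeeds if FILE (in os-release format) describes Ubuntu 16.04.
is_ubuntu_1604() {
    grep -q '^ID=ubuntu$' "$1" && grep -q '^VERSION_ID="16.04"$' "$1"
}

# has_enough_ram KB MIN_KB: succeeds if KB is at least MIN_KB.
has_enough_ram() {
    [ "$1" -ge "$2" ]
}

# Assumed threshold in kB, a little under 2GB (see the note above).
MIN_KB=1800000

if [ -r /etc/os-release ] && [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    is_ubuntu_1604 /etc/os-release && echo "OS: Ubuntu 16.04" || echo "OS: not Ubuntu 16.04"
    has_enough_ram "$total_kb" "$MIN_KB" && echo "RAM: ok (${total_kb} kB)" || echo "RAM: below 2GB (${total_kb} kB)"
fi
```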

Launch Alibaba Cloud ECS Instance

First, log in to your Alibaba Cloud ECS Console (https://ecs.console.aliyun.com). Create a new ECS instance, choosing Ubuntu 16.04 as the operating system with at least 2GB RAM. Connect to your ECS instance and log in as the root user.

Once you are logged into your Ubuntu 16.04 instance, run the following command to update your base system with the latest available packages.

apt-get update -y

Install Java

Apache Cassandra is a cross-platform application written in Java, so you will need to install Java on your server. Oracle Java 8, which this tutorial uses, is not available in the default Ubuntu 16.04 repository, so you will need to add a repository that provides it.

You can do it by running the following commands:

apt-get install python-software-properties -y
add-apt-repository ppa:webupd8team/java -y

Next, update the repository and install Java with the following commands:

apt-get update -y
apt-get install oracle-java8-installer -y

Once Java is installed, check the Java version with the following command:

java -version

Output:

java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
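
If you script this step, the version check can be automated. The sketch below extracts the major version from the output shown above. The helper name java_major_version is hypothetical, and the check for version 8 simply mirrors the Java release installed in this tutorial.

```shell
# java_major_version: reads `java -version` output on stdin and prints the
# major version (8 for "1.8.0_171", 11 for "11.0.2").
java_major_version() {
    awk -F'"' '/version/ {
        split($2, v, ".")
        if (v[1] == "1") print v[2]; else print v[1]
        exit
    }'
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it into the pipe.
    major=$(java -version 2>&1 | java_major_version)
    if [ "$major" = "8" ]; then
        echo "Java 8 detected"
    else
        echo "Found Java $major, expected 8"
    fi
fi
```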

Install Apache Cassandra

By default, Apache Cassandra is not available in the Ubuntu 16.04 repository, so you will need to add the Apache Software Foundation repository to your server.

First, add the repository with the following command:

echo "deb http://www.apache.org/dist/cassandra/debian 36x main" | tee -a /etc/apt/sources.list.d/cassandra.list
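
Because tee -a appends, running the command above a second time leaves a duplicate entry in the list file. One way to make the step repeatable is to append the line only when it is missing. The following is a sketch under that assumption: add_line_once is a made-up helper, and LIST_FILE defaults to a scratch file so you can try it without touching your APT configuration.

```shell
# add_line_once LINE FILE: append LINE to FILE unless an identical line exists.
add_line_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}

REPO_LINE="deb http://www.apache.org/dist/cassandra/debian 36x main"
# Scratch file by default; override LIST_FILE to target the real list file.
LIST_FILE="${LIST_FILE:-$(mktemp)}"

add_line_once "$REPO_LINE" "$LIST_FILE"
add_line_once "$REPO_LINE" "$LIST_FILE"   # second call changes nothing
cat "$LIST_FILE"
```

To apply it for real, run the script as root with LIST_FILE=/etc/apt/sources.list.d/cassandra.list set in the environment.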

Next, add the public key for Cassandra with the following command:

curl https://www.apache.org/dist/cassandra/KEYS | apt-key add -

Next, update the repository and install Cassandra using the following commands:

apt-get update -y
apt-get install cassandra -y

Once Cassandra is installed, start the Cassandra service and enable it to start at boot with the following commands:

systemctl start cassandra
systemctl enable cassandra

You can check the status of Cassandra with the following command:

systemctl status cassandra

You should see the following output:

cassandra.service - LSB: distributed storage system for structured data
   Loaded: loaded (/etc/init.d/cassandra; bad; vendor preset: enabled)
   Active: active (running) since Sun 2018-07-08 17:02:50 IST; 15s ago
     Docs: man:systemd-sysv-generator(8)
   CGroup: /system.slice/cassandra.service
           ├─6617 /bin/sh /usr/sbin/cassandra -p /var/run/cassandra/cassandra.pid -H /var/lib/cassandra/java_1531049570.hprof -E /var/lib/cassa
           ├─6817 java -cp /etc/cassandra:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/
           └─6818 grep -q Error: Exception thrown by the agent : java.lang.NullPointerException

Jul 08 17:02:50 Node1 systemd[1]: Starting LSB: distributed storage system for structured data...
Jul 08 17:02:50 Node1 systemd[1]: Started LSB: distributed storage system for structured data.
Jul 08 17:03:00 Node1 systemd[1]: Started LSB: distributed storage system for structured data.
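
Note that systemd reports the service as running as soon as the process starts, while Cassandra itself may need some more time before it accepts client connections on port 9042 (the port that appears in the cqlsh banner later in this tutorial). If you automate the installation, a small wait loop can bridge that gap. This is a sketch only: wait_for_port is a hypothetical helper, it relies on the /dev/tcp redirection feature of bash, and the 60 attempts are an arbitrary assumption.

```shell
#!/bin/bash
# wait_for_port HOST PORT TRIES: try to open a TCP connection once per second,
# up to TRIES times. Returns 0 as soon as a connection succeeds, 1 otherwise.
wait_for_port() {
    local host=$1 port=$2 tries=$3 i=0
    while [ "$i" -lt "$tries" ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Only wait when Cassandra is actually installed on this machine.
if command -v cassandra >/dev/null 2>&1; then
    if wait_for_port 127.0.0.1 9042 60; then
        echo "Cassandra is accepting connections on port 9042"
    else
        echo "Port 9042 did not open in time"
    fi
fi
```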

Test Cassandra Cluster

Now that Apache Cassandra is installed, it's time to verify the Cassandra cluster. You can test it using nodetool:

nodetool status

You should see the following output:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  112.09 KiB  256          100.0%            a4539bba-2394-4cff-b85d-d9c0dd564b0a  rack1
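
The two letters at the start of each node row combine the status (U for Up, D for Down) with the state (N, L, J or M, as listed in the legend), so UN means the node is up and operating normally. For scripted health checks, the rows can be counted with awk. The helper name nodes_summary below is made up for this sketch.

```shell
# nodes_summary: reads `nodetool status` output on stdin and prints how many
# node rows are UN (Up/Normal) and how many are in any other status/state.
nodes_summary() {
    awk '$1 ~ /^[UD][NLJM]$/ {
             if ($1 == "UN") un++; else other++
         }
         END { printf "UN=%d other=%d\n", un, other }'
}

if command -v nodetool >/dev/null 2>&1; then
    nodetool status | nodes_summary
fi
```

For the output shown above, the summary would read UN=1 other=0.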

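Cassandra comes with a built-in command-line interface tool, cqlsh. Before using cqlsh, you will need to install the Cassandra Python driver on your system. The CQLSH_NO_BUNDLED variable tells cqlsh to use this installed driver instead of its bundled copy. You can set everything up with the following commands: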

apt-get install python-pip -y
pip install cassandra-driver
export CQLSH_NO_BUNDLED=true

Now, you can connect to the Cassandra cluster using the following command:

cqlsh

After connecting to the Cassandra cluster, you should see the following output:

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.6 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Use Cassandra

Now that Cassandra is installed, it's time to use it.

Let's create a test keyspace, which is Cassandra's counterpart to a database in a relational system. First, connect to the Cassandra cluster using the following command:

cqlsh

Next, create a keyspace named testdb:

cqlsh> CREATE KEYSPACE testdb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

Next, switch to the testdb keyspace:

cqlsh> use testdb;

Next, create a table named mybooks:

cqlsh:testdb> CREATE TABLE mybooks (id int PRIMARY KEY, title text, year text);

Next, describe the table using the following command:

cqlsh:testdb> DESC mybooks;

Output:

CREATE TABLE testdb.mybooks (
    id int PRIMARY KEY,
    title text,
    year text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
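
With the table in place, a natural next step is to store and read back a few rows. The sketch below writes some CQL statements to a file and runs them with cqlsh -f when cqlsh is available. The book titles and years are made-up sample data, and CQL_FILE is a hypothetical scratch path, not something the installation creates.

```shell
# Scratch file for the statements; override CQL_FILE to choose your own path.
CQL_FILE="${CQL_FILE:-$(mktemp)}"

# Made-up sample rows for the testdb.mybooks table created above.
cat > "$CQL_FILE" <<'EOF'
INSERT INTO testdb.mybooks (id, title, year) VALUES (1, 'Sample Book One', '2017');
INSERT INTO testdb.mybooks (id, title, year) VALUES (2, 'Sample Book Two', '2018');
SELECT id, title, year FROM testdb.mybooks WHERE id = 1;
SELECT * FROM testdb.mybooks;
EOF

# Run the statements only when cqlsh is installed; -f executes a file of CQL.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -f "$CQL_FILE" || echo "cqlsh could not run the file (is Cassandra up and testdb created?)"
else
    echo "cqlsh not found; statements saved in $CQL_FILE"
fi
```

Keep in mind that an INSERT with an existing primary key overwrites that row rather than failing, so running the file twice leaves two rows in the table, not four.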

Related Alibaba Cloud Products

You can combine your newly deployed Cassandra database with Alibaba Cloud products for big data development.

ECS Bare Metal Instance is based on next-generation virtualization technology independently developed by Alibaba Cloud, featuring both the elasticity of a virtual server and the high performance and comprehensive features of a physical server.

Super Computing Cluster, based on Elastic Bare Metal (EBM) instances and high-speed interconnectivity of RDMA (Remote Direct Memory Access) technology, provides ultimate computing performance and parallel computing cluster services for high-performance computing.
