Setting up Hadoop with Docker's built-in DNS | with code

Date: Oct 25, 2022


Abstract: The author recently put together a Docker-based Hadoop cluster setup; the final image is a little over 1.4 GB. Because the containers join a user-defined Docker subnet, where Docker provides built-in DNS, restarting a container does not require reconfiguring the /etc/hosts file.

Preparation



1: Download and unpack the jdk1.8 folder
2: Download and unpack the Hadoop 2.8.5 folder
3: docker pull the centos:7 base image
4: Create a mydocker folder
5: Move the jdk1.8 folder and the Hadoop folder into the mydocker folder (the Hadoop 2.8.5 folder is renamed to hadoop here for brevity)
6: Edit the configuration files in the hadoop folder up front (this avoids having to modify them three times after the containers are created)

* First enter the hadoop/etc/hadoop folder and modify hadoop-env.sh
* Set JAVA_HOME in it to the JDK path inside the image
* Modify yarn-site.xml
* Modify mapred-site.xml (the file ships as mapred-site.xml.template; copy it to mapred-site.xml first)
* Modify hdfs-site.xml (with the datanode configuration)

The point of modifying the Hadoop configuration files here is that the hadoop folder is copied directly into the image at build time, so the three containers created from that image do not need to repeat the configuration work.
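As a rough sketch of these edits, assume the JDK and Hadoop will live inside the image at /usr/local/jdk1.8 and /usr/local/hadoop, with HDFS storage under /root/hdfs (all paths and values here are illustrative, not the author's exact ones; the XML properties go inside the <configuration> element of each file).

In hadoop-env.sh, point JAVA_HOME at the JDK inside the image:

export JAVA_HOME=/usr/local/jdk1.8

In mapred-site.xml (copied from mapred-site.xml.template), run MapReduce on YARN:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

In yarn-site.xml, name the ResourceManager host and enable the shuffle service:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop0</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

In hdfs-site.xml (datanode side), set the local storage directory and the replication factor:

<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/root/hdfs/data</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>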

Build the image with a Dockerfile



1. Write the Dockerfile, basing the image on the centos:7 base image pulled earlier

2. Run the build command from inside the mydocker folder (sudo may be needed because of file permissions; note the trailing dot at the end of the command, do not leave it out)
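A minimal Dockerfile along these lines (a sketch: the installed packages, copy destinations, and image name are assumptions, not the author's exact file):

FROM centos:7
# ssh server and client for inter-container login; which is needed by the Hadoop scripts
RUN yum install -y openssh-server openssh-clients which && ssh-keygen -A
# copy the prepared JDK and Hadoop folders from mydocker into the image
COPY jdk1.8 /usr/local/jdk1.8
COPY hadoop /usr/local/hadoop
CMD ["/bin/bash"]

Then build from inside mydocker (note the trailing dot):

sudo docker build -t hadoop-centos:1.0 .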

Create the containers



1. Create a subnet named hadoopnetwork

2. Create the containers, specifying the subnet and a fixed IP for each

* Create the hadoop0 container
* Create the hadoop1 container
* Create the hadoop2 container
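A sketch of the commands, assuming the image built above is tagged hadoop-centos:1.0 and a 172.19.0.0/16 subnet (the subnet range and IP addresses are illustrative):

sudo docker network create --subnet=172.19.0.0/16 hadoopnetwork

sudo docker run -itd --name hadoop0 --hostname hadoop0 --net hadoopnetwork --ip 172.19.0.2 hadoop-centos:1.0
sudo docker run -itd --name hadoop1 --hostname hadoop1 --net hadoopnetwork --ip 172.19.0.3 hadoop-centos:1.0
sudo docker run -itd --name hadoop2 --hostname hadoop2 --net hadoopnetwork --ip 172.19.0.4 hadoop-centos:1.0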

Configure ssh



1. Enter Hadoop0

2. Configure ssh keys


To generate the key, type ssh-keygen -t rsa and press Enter three times to accept the defaults

The generated public key is in the /root/.ssh/id_rsa.pub file; run a command to append it to the /root/.ssh/authorized_keys file
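A typical command for this (a sketch; the author's exact command is not shown):

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys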

Next, modify the sshd_config configuration (the author merges the commands here to keep things concise). Change the corresponding lines in the configuration file, for example:
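A plausible set of changes (an assumption; the exact lines the author edits are not shown): allow root login with key authentication and disable PAM so sshd runs cleanly inside the container.

vi /etc/ssh/sshd_config

PermitRootLogin yes
PubkeyAuthentication yes
UsePAM no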

Press Esc, then type :wq to save and exit
In the /etc/ssh/ssh_config file, change StrictHostKeyChecking ask to no, so the first ssh connection to each host does not prompt for confirmation

Press Ctrl+P+Q to detach from the hadoop0 container without stopping it

Enter the hadoop1 and hadoop2 containers and repeat the same steps to generate ssh keys and apply the same configuration

3. Add each container's public key to the others

The /root/.ssh/authorized_keys file in each container must contain the public keys of all three containers

After the above operations, open the file in the hadoop2 container and copy out the hadoop2 key:

[root@hadoop2 /]# vim /root/.ssh/authorized_keys
Press Ctrl+P+Q to detach, copy the hadoop1 and hadoop0 keys in the same way, then paste all three keys into the /root/.ssh/authorized_keys file of each of the three containers.

Execute the /usr/sbin/sshd command once in each container after the copy is complete.

At this point, the containers can reach each other over ssh; test it
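For example, from hadoop0 (the container names resolve through Docker's built-in DNS on the user-defined network):

[root@hadoop0 /]# ssh hadoop1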

Ctrl+D to log out and return




Complete the Hadoop configuration



Enter Hadoop0 and modify the hdfs-site.xml file as follows:
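A sketch of the namenode version, reusing the illustrative paths from the datanode snippet shown earlier:

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/root/hdfs/name</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>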

Because the datanode configuration is what was COPYed into the image, turning it into the namenode configuration only requires changing the three occurrences of data to name.
Enter hadoop1 over ssh, then delete and recreate the hdfs directory

Ctrl+D to exit; enter hadoop2 the same way and delete and recreate the hdfs directory.
Ctrl+D to exit again, go back to hadoop0, and delete and recreate the hdfs directory there as well; note that on hadoop0 the subdirectory is name, not data.
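A sketch of the commands, using the illustrative /root/hdfs path from the configuration above:

# on hadoop1 and hadoop2 (datanodes)
rm -rf /root/hdfs && mkdir -p /root/hdfs/data

# on hadoop0 (namenode)
rm -rf /root/hdfs && mkdir -p /root/hdfs/name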

Modify the slaves file (in hadoop/etc/hadoop) so that it lists the two datanodes.
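The file then contains just the two hostnames:

hadoop1
hadoop2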

Format HDFS. (An error was reported here: JAVA_HOME could not be found, because there was a stray space after JAVA_HOME= in hadoop-env.sh; after deleting the space the format ran successfully.)
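The format command, assuming the hadoop bin directory is not yet on the PATH (it is added to /etc/profile in the next step):

/usr/local/hadoop/bin/hdfs namenode -format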

Modify the /etc/profile file; after this you can run jps and hadoop fs commands directly

Add the following code to the end of the file, save and exit
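The lines to append (the JDK and Hadoop paths are the illustrative ones assumed throughout this writeup):

export JAVA_HOME=/usr/local/jdk1.8
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin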

Run source to make the configuration take effect:

[root@hadoop1 ~]# source /etc/profile
ssh into hadoop1 and hadoop2 and make the same modification.
Go back to hadoop0, enter the hadoop/sbin directory, and execute start-all.sh to start the cluster
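For example, with the illustrative install path:

cd /usr/local/hadoop/sbin
./start-all.sh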

You can run the following command to view the node status
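One possibility (an assumption; the author's exact command is not shown) is the HDFS admin report, which lists the live datanodes:

hdfs dfsadmin -report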

Test the wordcount program
After Hadoop has started (be sure it is running first), you can run the built-in wordcount program to test the cluster

Enter the hadoop folder and create an input directory in HDFS (the full command sequence for this test is sketched after these steps)

Create a my_wordcount.txt file in the container, type in a few words, then press Esc and :wq to save

Upload the local file to HDFS

Run the wordcount program, specifying the input file and output directory (change the version number in the jar name to match your Hadoop release)

View the results
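A minimal end-to-end sketch of the test, assuming the illustrative /usr/local/hadoop install path and example file names (the HDFS paths /input and /output are also just examples):

cd /usr/local/hadoop
# create the input directory in HDFS
bin/hadoop fs -mkdir -p /input
# create a small text file containing a few words
vi my_wordcount.txt
# upload the file to HDFS
bin/hadoop fs -put my_wordcount.txt /input
# run the built-in wordcount example (adjust the jar version to your release)
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar wordcount /input/my_wordcount.txt /output
# view the result
bin/hadoop fs -cat /output/part-r-00000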
