By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
This tutorial is different from the first 3: there is no copying and pasting, no shell, and no commands to run.
To follow the steps in this tutorial, make sure you have access to an Alibaba Cloud Elastic Compute Service instance with a recent version of Docker already installed. You can refer to this tutorial to learn how to install Docker on your Linux server.
This tutorial consists of these parts:
The Dockerfile snippets are reviewed in no particular order.
Note that several of the snippets are very basic and offer only a few basic insights. Not to worry: when you combine many of those basic concepts, you can create a quality Dockerfile.
From
# 7000: intra-node communication
# 7001: TLS intra-node communication
# 7199: JMX
# 9042: CQL
# 9160: thrift service
EXPOSE 7000 7001 7199 9042 9160
CMD ["cassandra", "-f"]
This snippet is taken from right at the bottom of the Dockerfile.
EXPOSE right at the bottom is useful: I do not have to read the whole Dockerfile just to find EXPOSE instructions scattered at random, as other Dockerfiles have them.
Port numbers are neatly in numeric order and all briefly documented. Brief is perfect: I just need a short reminder of which port does what.
Correct and complete: the port numbers in the # comment lines and in the EXPOSE line match. No inconsistencies.
Your EXPOSEs should look like this: professional and beginner friendly.
Just out of interest - What is Cassandra?
From https://en.wikipedia.org/wiki/Apache_Cassandra
Apache Cassandra is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low latency operations for all clients.
From
# explicitly set user/group IDs
RUN groupadd -r cassandra --gid=999 && useradd -r -g cassandra --uid=999 cassandra
RUN groupadd -r sonarqube && useradd -r -g sonarqube sonarqube
Neo4j is a highly scalable, robust native graph database.
RUN addgroup -S neo4j && adduser -S -H -h /var/lib/neo4j -G neo4j neo4j
I found several other Dockerfiles - all do this the same way: addgroup and adduser all on one line.
Below I changed neo4j to be split over 2 lines: it just looks slightly better.
RUN addgroup -S neo4j \
    && adduser -S -H -h /var/lib/neo4j -G neo4j neo4j
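The same two-line split can be applied to the Debian-style Cassandra line shown earlier. This is only a layout sketch: the commands and IDs are copied unchanged from that snippet, and nothing else about the original Dockerfile is assumed.

```dockerfile
# explicitly set user/group IDs
RUN groupadd -r cassandra --gid=999 \
    && useradd -r -g cassandra --uid=999 cassandra
```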
From
Nextcloud: A safe home for all your data. Access & share your files, calendars, contacts, mail & more from any device, on your terms.
RUN set -ex; \
    \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        rsync \
        bzip2 \
        busybox-static \
    ; \
    rm -rf /var/lib/apt/lists/*
The most readable apt-get instructions that I could find.
Compare the text above to the other examples below to see how this is superior.
Apache Maven is a software project management and comprehension tool.
RUN apt-get update && \
    apt-get install -y \
      curl procps \
  && rm -rf /var/lib/apt/lists/*
Inconsistent, unaligned indentations.
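For comparison, here is a sketch of the same Maven snippet laid out in the Nextcloud style: one command per line, one package per line, and consistent indentation. The packages and flags are unchanged; the only addition is `set -ex`, which makes the build stop at the first failing command now that the commands are joined with `;` instead of `&&`.

```dockerfile
RUN set -ex; \
    \
    apt-get update; \
    apt-get install -y \
        curl \
        procps \
    ; \
    rm -rf /var/lib/apt/lists/*
```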
From
Logstash is a tool that can be used to collect, process and forward events and log messages.
# install plugin dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        apt-transport-https \
        libzmq5 \
    && rm -rf /var/lib/apt/lists/*
apt-get update && apt-get install are both on one line. Neo4j (20 lines below) has only one instruction per line. See how easily that reads.
Java is a concurrent, class-based, object-oriented language.
RUN apt-get update && apt-get install -y --no-install-recommends \
        bzip2 \
        unzip \
        xz-utils \
    && rm -rf /var/lib/apt/lists/*
apt-get update && apt-get install all in one line. Others have only one instruction per line. See below how easily that reads.
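As a sketch of the one-instruction-per-line layout, the Java snippet could be rewritten like this. The packages and flags are exactly those of the snippet above; only the line breaks change.

```dockerfile
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        bzip2 \
        unzip \
        xz-utils \
    && rm -rf /var/lib/apt/lists/*
```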
Neo4j is a highly scalable, robust native graph database.
RUN apk add --no-cache --quiet \
    bash \
    curl \
    tini \
    su-exec \
    && curl --fail --silent --show-error --location --remote-name ${NEO4J_URI} \
    && echo "${NEO4J_SHA256} ${NEO4J_TARBALL}" | sha256sum -csw - \
    && tar --extract --file ${NEO4J_TARBALL} --directory /var/lib \
    && mv /var/lib/neo4j-* /var/lib/neo4j \
    && rm ${NEO4J_TARBALL} \
    && mv /var/lib/neo4j/data /data \
    && chown -R neo4j:neo4j /data \
    && chmod -R 777 /data \
    && chown -R neo4j:neo4j /var/lib/neo4j \
    && chmod -R 777 /var/lib/neo4j \
    && ln -s /data /var/lib/neo4j/data \
    && apk del curl
Perfectly aligned, one instruction per line. Very readable.
The list of apk packages to install is nearly in alphabetical order (su-exec would come before tini).
Curl is used on line 6 of the snippet and then deleted on the last line. It is not needed anymore, therefore it is deleted.
${NEO4J_TARBALL} is extracted on line 8 of the snippet and deleted on line 10. Cleanup.
From
What is Nextcloud?
A safe home for all your data. Access & share your files, calendars, contacts, mail & more from any device, on your terms.
RUN { \
        echo 'opcache.enable=1'; \
        echo 'opcache.enable_cli=1'; \
        echo 'opcache.interned_strings_buffer=8'; \
        echo 'opcache.max_accelerated_files=10000'; \
        echo 'opcache.memory_consumption=128'; \
        echo 'opcache.save_comments=1'; \
        echo 'opcache.revalidate_freq=1'; \
    } > /usr/local/etc/php/conf.d/opcache-recommended.ini; \
    \
    echo 'apc.enable_cli=1' >> /usr/local/etc/php/conf.d/docker-php-ext-apcu.ini; \
    \
    echo 'memory_limit=512M' > /usr/local/etc/php/conf.d/memory-limit.ini; \
    \
    mkdir /var/www/data; \
    chown -R www-data:root /var/www; \
    chmod -R g=u /var/www
Very neat and professional: the seven echo lines at the top write the settings to opcache-recommended.ini.
The empty continuation lines cleanly separate the 4 different purposes.
From
https://github.com/jenkinsci/docker/blob/587b2856cd225bb152c4abeeaaa24934c75aa460/Dockerfile
Jenkins Continuous Integration and Delivery server.
FROM openjdk:8-jdk
RUN apt-get update && apt-get install -y git curl && rm -rf /var/lib/apt/lists/*
ARG user=jenkins
ARG group=jenkins
ARG uid=1000
ARG gid=1000
ARG http_port=8080
ARG agent_port=50000
ENV JENKINS_HOME /var/jenkins_home
ENV JENKINS_SLAVE_AGENT_PORT ${agent_port}
# Jenkins is run with user `jenkins`, uid = 1000
# If you bind mount a volume from the host or a data container,
# ensure you use the same uid
RUN groupadd -g ${gid} ${group} \
    && useradd -d "$JENKINS_HOME" -u ${uid} -g ${gid} -m -s /bin/bash ${user}
# for main web interface:
EXPOSE ${http_port}
# will be used by attached slave agents:
EXPOSE ${agent_port}
Professional looking:
PostgreSQL object-relational database system
ENV PATH $PATH:/usr/lib/postgresql/$PG_MAJOR/bin
ENV PGDATA /var/lib/postgresql/data
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA" # this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
VOLUME /var/lib/postgresql/data
Long RUN squashed between other lines.
My improved version:
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" \
    && chown -R postgres:postgres "$PGDATA" \
    && chmod 777 "$PGDATA"
The rest of their Dockerfile looks great - see link above. They also have more comments than most others.
From https://hub.docker.com/_/httpd/
# install httpd runtime dependencies
# https://httpd.apache.org/docs/2.4/install.html#requirements
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libapr1 \
        libaprutil1 \
        libaprutil1-ldap \
        libapr1-dev \
        libaprutil1-dev \
        liblua5.2-0 \
        libnghttp2-14=$NGHTTP2_VERSION \
        libpcre++0 \
        libssl1.0.0=$OPENSSL_VERSION \
        libxml2 \
    && rm -r /var/lib/apt/lists/*
Sorted list of apt packages to install.
Here is the mess if unsorted:
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libxml2 \
        libaprutil1 \
        libnghttp2-14=$NGHTTP2_VERSION \
        libapr1-dev \
        liblua5.2-0 \
        libaprutil1-dev \
        libpcre++0 \
        libapr1 \
        libssl1.0.0=$OPENSSL_VERSION \
        libaprutil1-ldap \
    && rm -r /var/lib/apt/lists/*
You do not have to sort that by hand. Your editor probably has the functionality where you can just highlight the list and click - SORT.
Even during development it helps to have such lists sorted - to help you find a specific libapr. Right now the messy list must be carefully read from top to bottom to avoid missing any data. If this happens even once then the one-click SORT would have been a sound time investment.
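For reference, a strict alphabetical sort of the same package list would look roughly like the sketch below. Compared with the httpd original, it moves the two -dev packages next to their runtime libraries. The fragment assumes, as the original does, that NGHTTP2_VERSION and OPENSSL_VERSION are defined earlier in the Dockerfile.

```dockerfile
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libapr1 \
        libapr1-dev \
        libaprutil1 \
        libaprutil1-dev \
        libaprutil1-ldap \
        liblua5.2-0 \
        libnghttp2-14=$NGHTTP2_VERSION \
        libpcre++0 \
        libssl1.0.0=$OPENSSL_VERSION \
        libxml2 \
    && rm -r /var/lib/apt/lists/*
```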
Now that we have seen what good and bad Dockerfiles look like, let's create a Dockerfile to be proud of.
Your Dockerfile must use each of these instructions below at least once.
Your Dockerfile does not have to result in a cool finished application. You are just experimenting with the syntax and functionality possible with these commands.
Run any commands, add any files, make a workdir, add environment variables and arguments. Expose some ports and label it all. Have an ENTRYPOINT. Our purpose here is just to be familiar with the commands.
Copy any text snippets you can find at https://hub.docker.com/explore/
The more different software you use as input, the more interesting your learning process will be.
FROM
COPY
ADD
RUN
LABEL
EXPOSE
ENV
ENTRYPOINT
VOLUME
USER
WORKDIR
ARG
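As a starting point, a practice Dockerfile that touches every instruction in the list might look like the minimal sketch below. Everything specific in it is an assumption made up for illustration: the alpine base tag, the demo user, the /opt/demo directory, port 8080, and the files hello.txt and extras.tar.gz, which would have to exist in your build context. CMD is not in the list but is included to give ENTRYPOINT a default argument.

```dockerfile
# Practice Dockerfile: all names, files, and ports are hypothetical.
FROM alpine:3.19

# Build-time argument with a default; override with --build-arg APP_USER=...
ARG APP_USER=demo

LABEL description="Practice image that exercises common Dockerfile instructions"

# Environment variable, available at build time and at run time
ENV APP_HOME=/opt/demo

# Creates the directory if needed and makes it the working directory
WORKDIR ${APP_HOME}

# COPY takes files from the build context (hello.txt is a hypothetical file)
COPY hello.txt .

# ADD also unpacks a local tar archive (extras.tar.gz is a hypothetical file)
ADD extras.tar.gz ./extras/

# One command per line; system group and user created the Alpine way
RUN addgroup -S ${APP_USER} \
    && adduser -S -G ${APP_USER} ${APP_USER} \
    && chown -R ${APP_USER}:${APP_USER} ${APP_HOME}

VOLUME /data

# 8080: hypothetical web port
EXPOSE 8080

# Everything from here on runs as the unprivileged user
USER ${APP_USER}

# Prints hello.txt by default
ENTRYPOINT ["cat"]
CMD ["hello.txt"]
```

From here, swap in snippets from other Dockerfiles and rebuild to see what each instruction changes.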
The purpose of the text below is for you to test how well you understand Dockerfile terminology.
It contains sentences stuffed with Dockerfile concepts.
It is not meant to teach you anything new.
If you can understand most of what is said below, you are comfortable with Dockerfile terminology.
Around 50% of the text below is from:
https://docs.docker.com/glossary/
That text got edited to include more Docker terms.
. . .
The docker build command builds Docker images using a Dockerfile.
A Dockerfile is a text file that contains all the Linux commands you would normally run at the Linux command shell in order to build a Docker image. Docker can build images by reading the instructions from a Dockerfile.
An image is a layered collection of all the software needed to run your software application in an isolated container. An image is not running - it's just software files in directories.
A container is a runtime instance of a docker image. You can use the docker run command to create a container from an image.
A Docker container includes the Docker image used to create it. A container is like a mini VM.
The Docker Hub - https://hub.docker.com - is a website that stores Docker images.
A registry is a website service containing repositories of Docker images. The Docker Hub is a registry.
The default registry - https://hub.docker.com - can be accessed using a browser at Docker Hub or using the docker search command.
A repository is a set of Docker images. A repository can be shared by pushing it to a registry server - using the docker push command. We did not use this docker push command during this set of 4 tutorials.
Thousands of other people have used docker push to add thousands of public images at https://hub.docker.com.
Docker images are the basis of containers. An image is a layered collection of root filesystem changes and the corresponding execution parameters for use within a container runtime.
If you want your Dockerfile to be runnable without specifying additional arguments to the docker run command, you must specify either ENTRYPOINT, CMD, or both.
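A minimal sketch of how ENTRYPOINT and CMD work together is shown below. The base image tag and the strings are illustrative assumptions only.

```dockerfile
# Hypothetical image used only to illustrate ENTRYPOINT and CMD.
FROM alpine:3.19

# ENTRYPOINT fixes the executable (and leading arguments) that always run
ENTRYPOINT ["echo", "Hello"]

# CMD supplies default arguments; anything typed after the image name
# on the docker run command line replaces it
CMD ["world"]
```

Running a container from this image with no extra arguments prints "Hello world". Running it with one extra argument, for example `docker run <image> Docker`, prints "Hello Docker".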
A named volume is a volume which Docker manages: you can use docker volume list to get a list of those volumes.
You can specify a friendly text name when you create a named volume.
An anonymous volume is similar to a named volume; however, because it has no name, it can be difficult to refer to the same volume over time. Docker handles where the files are stored.
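The difference can be sketched with a single VOLUME instruction. The image, the /data path, and the volume name mydata are assumptions chosen for illustration.

```dockerfile
# Hypothetical image; /data is an illustrative mount point.
FROM alpine:3.19

# Declares /data as a volume. With a plain `docker run <image>`,
# Docker creates an anonymous volume for it, identified only by a generated ID.
VOLUME /data

# To get a named volume instead, supply the name at run time, for example:
#   docker run -v mydata:/data <image>
# The volume is then listed under the name "mydata".
CMD ["ls", "-la", "/data"]
```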
From
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
Below are one-liner summaries of several Dockerfile best practices.
Phase 1: Review the demo Dockerfile you created. Fix things you did wrong.
Phase 2: Visit https://hub.docker.com/explore/
Click on the first package name, nginx, find the first Dockerfile listed there, and click again to see that Dockerfile. Then briefly scan it and see if you can spot best practices followed and broken.
Also consider the neat / pro snippets briefly discussed above. Study those Dockerfiles and see if you can spot similar types of problems or good practices.
Do this for as many official packages as you have time for.
Also enter your favourite Linux software's name in the top left box on that web page. Investigate how well 'your' software dockerized itself.
Now apply all you have learnt in these 4 tutorials at your workplace.
You are now ready to read the full Dockerfile reference at https://docs.docker.com/engine/reference/builder/
You should be familiar with nearly all concepts mentioned there. Based on the first 3 tutorials you have practically experimented with most Dockerfile instructions.
That should all be VERY easy reading now.
Everything just said applies equally well to the best practices below:
https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
You can now start building cool apps with Docker containers on Alibaba Cloud Elastic Compute Service (ECS) instances.