Notes on a Udemy course concerning Docker
Author: Chris Postma
Docker
Why use Docker?
- Makes it easy to install and run software on any computer without worrying about setup or dependency installation.
It is more lightweight, economical, and scalable than a virtual machine.
Docker containers share the host operating system's kernel, whereas VMs run a full guest operating system on top of the host operating system.
What is Docker?
- Docker is an entire ecosystem
- Docker client, server, hub, compose, machine, images
Image
- A single file with all of the deps and config required to run a program
- I can use an image to create a container
Container
- A container is a program with its own set of isolated hardware resources
- A container is an instance of an image, its sole purpose is to run one very specific program
Docker Client
- Used to issue commands
- Helps me interact with Docker server
Docker Server (aka Docker Daemon)
- Tool that is responsible for creating images, running containers, etc
Example of running a command: the Docker client (e.g. the Docker CLI) communicates with the Docker server, which does the heavy lifting.
docker run <image name>, e.g. docker run hello-world
What is a container?
Need to understand how OS runs on computer.
Most OSes have a Kernel, a running software process that governs access between all the programs running on your computer and all the physical hardware that is connected to your computer.
Chrome -> System Call -> Kernel -> CPU, Memory, HDD, etc.
Kernel is an intermediate layer that governs access between programs and your actual hardware.
Running programs interact with kernel through system calls, which are just like function invocations or APIs from the OS.
Container includes:
- Namespacing
- Control groups
- A process or set of processes that has a grouping of resources specifically assigned to it (e.g. CPU, Memory, Disk, Network, etc.)
An image is a filesystem snapshot. An image also contains a startup command.
The image is placed inside the container's filesystem, i.e. the portion of the hard drive assigned to that container.
The process is isolated within the container.
A container is a running process along with a subset of physical resources allocated specifically to that process. The resources are assigned via namespacing; control groups (cgroups) limit the amount of resources each process can use. Both are Linux-specific features.
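A quick way to see both ideas in action (a rough sketch using the busybox image):
docker run --rm busybox ps                 # PID namespacing: the container sees only its own process(es)
docker run --rm -m 256m busybox echo hi    # a cgroup memory limit caps this container at 256 MB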
On macOS and Windows, Docker actually runs a lightweight Linux virtual machine. The containers are created inside that virtual machine, where the Linux kernel is in charge of isolating and limiting access to the hardware on my computer.
An image is a snapshot of the file system along with a very specific startup command.
Docker commands
Run a container with image
docker run <image name>
- docker references the Docker Client
- The image name is the name of the image to use for this container
docker run <image name> <command>
- docker run busybox echo hi there
- docker run busybox ls
The override command must exist within the image's filesystem, so I cannot run ls against every image; it only works if ls is part of that image's filesystem.
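For instance (hello-world is a tiny image containing only a single executable, so this should fail):
docker run busybox ls        # works: busybox includes an ls binary
docker run hello-world ls    # fails: no ls executable exists in that image's filesystem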
List running containers
docker ps
- Gives me useful information about all running containers: CONTAINER ID, IMAGE, COMMAND, CREATED, STATUS, PORTS, NAMES
- Most often used to get a container's ID so that I can issue commands on a specific container
docker ps --all
- shows me all containers that have ever been used on this machine
Container lifecycle
When does a container actually get shut down?
Creating and starting a container are two separate processes.
docker run = docker create + docker start
docker create <image name>
docker start <containerId>
- Creating a container is about the file system
- Starting it is about the startup command
docker start -a <containerId>
- -a tells Docker to watch for output from the container and print it to the terminal (attach to the container and show me its output)
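For example, a rough walk-through of the two-step flow (using hello-world as a stand-in image):
docker create hello-world        # prints the id of the newly created container
docker start -a <containerId>    # starts it, attaches, and prints the program's output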
docker system prune
- Deletes all stopped containers and the build cache
- This will also clear all Docker images downloaded from Docker Hub
Retrieving log outputs
Say you start up a container without the -a flag and don't want to stop and restart the container with the flag.
docker logs <containerId>
- This will present all the logs that the container has emitted
Stopping containers
Some containers will have processes that continue to run, even when detached from terminal output.
docker stop <containerId>
- SIGTERM message sent to the container
- Gives the process time to shut itself down gracefully
docker kill <containerId>
- SIGKILL message sent to the container
- Kills the process immediately: shut down now, do no additional work, no grace period
Prefer the stop command. If the container has not stopped within 10 seconds of docker stop, Docker will issue a kill.
Multi-command containers
Executing commands in running containers
Outside a container, I have no access to anything inside the container through simple means. For example, if my container is running redis, I cannot use the global redis-cli from my terminal to interact with the redis instance within the container.
docker exec -it <containerId> <command>
- Starts up a second program inside the running container
- -it gives me the ability to enter text, i.e. type input directly into the container
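For instance, if a redis container is running (the container name my-redis is assumed for illustration), a sketch of starting a second program inside it:
docker exec -it my-redis redis-cli   # opens the redis CLI inside the container, connected to that redis instance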
The purpose of the -it flag
- All of my commands within a container are running in Linux environment
- Linux envs have three communication channels attached
- STDIN, STDOUT, STDERR
- Things I type into my terminal are sent to STDIN
- -it is actually two separate flags
- -i allows me to type input directly into the container
- -t attaches to the container's terminal so that the text I enter and the output show up nicely formatted; without it, the output is pretty bare bones
Getting a command prompt in a container
- This is the most common way to get shell access to my running container
- Run any number of commands inside my container without having to run docker exec for each one
docker exec -it <containerId> sh
- sh is a command processor, i.e. a shell
Starting with a shell
docker run -it <image name> sh
- Downside: the shell replaces the image's default startup command, so I am not running any other process, e.g. my web server
Container isolation
Containers do not share their filesystems: a file created in one container is not visible in another, even if both were created from the same image.
Building custom images through Docker
Docker file -> Docker client -> Docker server -> Usable image
- Docker server interprets Docker file and creates an image
Most Dockerfiles follow a specific format:
- Specify a base image
- Run some commands to install additional programs
- Specify command to run on container start
Building a docker file
- Goal: create an image that runs redis-server
FROM <baseImage>
# Specify a base image (the OS / starting filesystem)
RUN <command>
# Run some commands to install additional programs
CMD <command>
# Specify the command to run on container startup
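Filled in for the redis-server goal above, a minimal sketch (assuming Alpine's apk package manager provides redis):
FROM alpine
# Use apk, Alpine's package manager, to install redis
RUN apk add --update redis
# Run redis-server when a container is started
CMD ["redis-server"]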
cd into the directory containing the Dockerfile
docker build .
- The . is the current directory, which is the build context: the set of files we want to encapsulate in the image.
Each command results in a step in the build process.
- Fetch the base image specified by the FROM command
- Docker Daemon downloads the image
- Docker server runs the RUN command, it looks for the image from the previous step
- That image was used to create a temporary container
- The RUN command was executed inside the temporary container
- It installed Redis inside the temporary container's file system
- We then took a snapshot of that container's new file system
- We then shut down temporary container and got the image ready for next instruction. We were left with an image of the original base image with Redis installed. We took a snapshot of that file system.
- CMD then records on the image what command should run when it is started up as a container.
- We end up with an image with a modified primary command.
- The output from all of the steps is whatever image was generated from the last step.
What is a base image?
Analogy: writing a Dockerfile is like being given a brand new computer with no OS and being told to install Google Chrome on it.
- Install OS
- Install Chrome
- Execute Chrome
Rebuilds with cache
This is a big source of Docker's performance.
Each instruction produces a new intermediate image.
Docker uses a cache when steps are the same
The FROM step may use a base image that is already downloaded.
Installed deps may also be cached and reused when new RUN steps are added.
If the order of steps changes, such as reordering dep installations, Docker cannot use the cache from that point onward.
Each image has:
- File system snapshot e.g. bin, etc, dev, home, proc
- Startup command
Lesson: put the instructions most likely to change as far down in the Dockerfile as possible, so Docker can keep using the cache for everything above them.
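As an illustration, a Dockerfile fragment for a Node project ordered so that source changes don't bust the dependency cache (file names assumed):
COPY package.json ./
# Only re-runs if package.json changed
RUN npm install
# Source changes only invalidate the cache from here down
COPY ./ ./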
Tagging an image with a name
docker build -t <name> .
- -t allows me to tag the image
docker build -t postmac/redis:latest .
The convention for name is:
myDockerId/repoName:version
version can be a number too, but the most recent build is usually latest
Community images have shorter names (no Docker ID prefix) as they have been open sourced
- The . at the end is the build context; it specifies the directory of files/folders to use for the build.
When running, if I don't specify the version, then the latest is used by default
docker run postmac/redis
This entire process is called tagging the image.
Technically, the version is the tag.
I can manually do what a Dockerfile does: start a container, run commands inside it by hand, and then snapshot it (docker commit) to generate an image that I can use in the future.
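A rough sketch of that manual flow (the container id is whatever docker ps reports):
docker run -it alpine sh
# inside the container: apk add --update redis
docker commit -c 'CMD ["redis-server"]' <containerId>   # prints the id of a brand new image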
Making real projects with Docker
- Create NodeJS web app
- Create Dockerfile
- Build an image from the Dockerfile
- Run the image as a container
- Connect to web app from a browser
Flow
- Specify a base image
- Run some commands to install additional programs
- Specify a command to run on container startup
e.g.
FROM alpine
RUN npm install
CMD ["npm", "start"]
In the Docker ecosystem, alpine is synonymous with a small image. It means the image of whatever it is you're pulling is as compact as possible.
Many images have alpine versions. It is a very stripped down image. You might get simple programs like ping, ls, a small text editor like nano, etc.
- None of the files in my project directory are available inside the container by default. This must be manually specified.
COPY ./ ./
- Without a WORKDIR, this places all my files in the container's root directory as siblings of bin, etc, home, and so on, which is bad practice.
- The first ./ is the path to the folder to copy from on my machine, relative to the build context.
- The second ./ is the place to copy to inside the container.
Port mapping/forwarding
By default, no traffic that is coming into my computer, such as a port, is routed into the container.
The container has its own isolated set of ports but by default no traffic is directed there.
We have to set up a port mapping from my computer to the container.
The above is only with respect to incoming requests. The Docker container can make outgoing requests.
Port forwarding is strictly a runtime configuration: it is something we do when starting up the container, using the -p flag to specify the port mapping, e.g. -p 8080:8080.
docker run -p 5000:8080 <imageId>
- Take incoming requests to port 5000 on my machine and route them to port 8080 inside the container
- The right-hand (container) port must match the port the application listens on; e.g. if the app calls app.listen(5000), the mapping must target 5000, such as -p 8080:5000.
Specifying a working directory
WORKDIR <referenceToAppFolder>
Example:
WORKDIR /usr/app
- There isn't consensus on where to put the working directory.
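Putting these pieces together, a sketch of a Dockerfile for the Node web app (the base image tag and working directory are assumptions, not prescribed):
FROM node:alpine
WORKDIR /usr/app
COPY ./ ./
RUN npm install
CMD ["npm", "start"]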
Cache busting and rebuilds
- I cannot update an app file inside the container without rebuilding the image.
- We can avoid full rebuilds with some extra configuration though (see Docker volumes below).
Docker compose with multiple local containers
We have two options when we need to set up some networking between two containers.
Use Docker CLI's networking features. This is unpleasant as you have to re-run commands each time. This is almost never done.
Use Docker compose. This is much easier to use and is the recommended way to set up networking.
Docker compose
Exists to keep you from having to run repetitive commands using the Docker CLI.
Makes it easy to start up multiple docker containers at the same time and connect them together with some networking.
Functions like the Docker CLI, but lets me issue multiple commands much more quickly.
Automate commands
Startup multiple containers at the same time and connect them together
Uses a special syntax in a yaml file
Feed the file to the Docker CLI
By specifying services in the same Docker compose file, Docker automatically networks them together so that they can communicate.
- There is no need to open ports manually.
Docker will automatically resolve the hostnames of services defined in the file; e.g. a node app can connect to a database service by using the service name as the host string.
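A sketch of a compose file with two networked services (service names and ports are placeholders):
version: '3'
services:
  redis-server:
    image: 'redis'
  node-app:
    build: .
    ports:
      - '4001:8081'
The node-app code would then connect to redis using host 'redis-server' and the default redis port.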
Running
Docker automatically looks for a compose file when running the following command:
docker-compose up
Run in the background:
docker-compose up -d
Rebuild
docker-compose up --build
- Tells Docker compose to start up containers and also rebuild
- Use this whenever you change code
Stopping containers
Stop all running containers started by Docker Compose:
docker-compose down
How to deal with containers that crashed
Containers whose app crashed are stopped.
We can automatically restart them with a restart policy.
The exit status code matters for the restart policy: on-failure only restarts when the process exits with a non-zero code.
Restart policies: "no", always, on-failure, unless-stopped
always is a good choice for a web server that should be up at all times.
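A sketch of how a restart policy looks in a compose service (service name assumed):
services:
  node-app:
    build: .
    restart: on-failure   # "no" must be quoted in YAML, since a bare no is parsed as a boolean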
Print status of my containers from docker compose
Must be run from the directory where the docker-compose.yml file is located. It needs the docker compose file to determine which containers to get the status of.
docker-compose ps
Creating a production-grade workflow
Cycle through the following flow:
Development -> Testing -> Deployment
Docker is just a tool in the normal development flow, e.g. part of pushing, merging a PR, CI, and deployment.
I will want to create multiple docker files
- One for each environment
Building with a custom Dockerfile name
docker build -f Dockerfile.dev .
- -f allows me to specify the Dockerfile directly when it is not named exactly Dockerfile
Because Create React App comes with node_modules, and I am installing deps via the Dockerfile, I end up with two sets of deps.
It is best to delete the node_modules folder that came with Create React App.
Getting changes inside my container
I want changes made to my source code to get automatically propagated into my container
- Rebuild the image...not a great option
- Do something smarter?
Docker volumes
- Making a change in a source file and seeing it reflected in the container
Volumes
- A volume acts as a placeholder inside the Docker container
- We no longer copy over the entire directory
- It is more like a reference inside the container
- The reference points back to a folder on the local machine
- e.g. map /src and map /public
- In general: map a folder inside the container to a folder outside the container
- The -v flag stands for volume
docker run -p <localPort>:<containerPort> -v /app/node_modules -v $(pwd):/app <imageId>
-v $(pwd):/app
- $(pwd) interpolates the present working directory
- With the colon, this maps the pwd on my machine to the /app folder inside the container
- In general, -v creates a mapping between a file/folder in the container and a file/folder on my machine, sort of like port mapping: local:container
-v /app/node_modules
- With no colon, this is just a placeholder ("bookmark") for the node_modules folder inside the container: don't map it up to anything
- node_modules was already installed inside the container, and we don't want the second mapping (our app directory outside the container, which has no node_modules) to override it
Changes now get propagated to the container from the project directory.
Docker Compose
We can use docker compose to make it easier to start up our volumes
- one volume for node_modules
- one volume for my source code, map it from my machine to container
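A sketch of how those two volumes might look in the compose file (service name, ports, and Dockerfile.dev are assumptions):
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - '3000:3000'
    volumes:
      - /app/node_modules   # bookmark: leave the container's node_modules alone
      - .:/app              # map my project directory into /app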
Executing tests
- build container
- execute command
docker run -it <imageId> npm run test
- -it hooks up the standard input and output of the container to the terminal
This is good for running tests locally
Live updating tests
I could attach to the existing running container, which would allow me to run the tests there.
- Run docker-compose up
- Get id of running container with docker ps
docker exec -it <containerId> npm run test
- this reuses the existing container and starts up the test suite inside there
This might be a nice option for adding as a script to package.json
Docker compose for running tests
Creating another service inside my docker compose file is an option; however, it isn't perfect. The test output is mixed in with the other services' output and I cannot type commands in.
Each process has its own stdin and stdout. It is per process.
Docker attach
Attaches to a container and forwards input from my terminal into the container.
docker attach <containerId>
docker attach always connects to the container's primary process (PID 1), not to any other processes in the container, so it cannot route terminal input into those other processes.
docker exec -it 1577a221603a sh
- Running ps in that shell prints all processes running inside the container
- PID = process id
Need for Nginx
In production we only need to serve the static files produced by the build step; nginx is a lightweight web server well suited to that.
Multi-step build process
- Build phase
- Run phase
Allows me to use different base images for different phases
- Do not want all of the deps, just the bundle
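A sketch of a multi-stage Dockerfile along these lines (image tags and paths are assumptions):
# Build phase: install deps and produce the production bundle
FROM node:alpine as builder
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build

# Run phase: copy only the bundle into a small nginx image
FROM nginx
COPY --from=builder /app/build /usr/share/nginx/html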
Continuous integration and deployment
These steps assume I am in root dir, with GHA, I might need a ../
- Lint step
- Tell GHA we need a copy of Docker running
- Build our image using Dockerfile.dev
docker build -t postmac/repo-name -f Dockerfile.dev .
- I could name the tag whatever I want, e.g. my-image or test-image
- Tell GHA to run our test suite
docker run postmac/repo-name npm run test
- Tell GHA how to deploy our code to Heroku using the Dockerfile (prod version)
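A rough sketch of what such a GitHub Actions workflow could look like (not the course's exact file; the image name and the CI flag are assumptions):
name: CI
on: push
jobs:
  build-and-test:
    runs-on: ubuntu-latest   # Docker is preinstalled on this runner
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t postmac/repo-name -f Dockerfile.dev .
      - run: docker run -e CI=true postmac/repo-name npm run test   # CI=true makes react-scripts run tests once and exit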
Tips
For docker compose, depending upon the environment I am working in, I may need to specify a specific docker compose file
docker-compose -f docker-compose-dev.yml up
docker-compose -f docker-compose-dev.yml up --build
docker-compose -f docker-compose-dev.yml down
Exposing ports via the Dockerfile
EXPOSE 8080
Some deployment platforms make use of it, but as far as Docker itself is concerned, it is just documentation for other developers who read the Dockerfile.
Multi-container application
Env vars
- Build image
- Create instance of container from image (run time is when container is started up)
- The env var is not encoded inside the image
- The image doesn't have memory of the env var
- Only when the container is created is the env var set up inside of it
variableName=value
- Sets a variable in the container at run time
variableName
- Sets a variable in the container at run time where the value is taken from your computer's environment
- A .env file can also be used
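A sketch of both forms in a compose service (the service and variable names are made up for illustration):
services:
  api:
    build: .
    environment:
      - REDIS_HOST=redis-server   # explicit value, set when the container starts
      - PGPASSWORD                # no value given: taken from my machine's environment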
Multi-container deployments
# syntax=docker/dockerfile:1
FROM node:12-alpine
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --production
COPY . .
CMD ["node", "src/index.js"]
Copying package-lock.json or yarn.lock on the same line as package.json (as above) locks in the dep versions.