Dockerizing a Node.js Web Application


Several months ago, I wrote about how you might go about ‘dockerizing’ a node.js web application. I was able to write an expanded version of this post for Semaphore CI’s Community site. I am re-posting it here — enjoy!


Dockerizing a Node.js Web Application

Introduction

If you’ve ever developed anything that needs to ‘live’ somewhere besides your local machine, you know that getting an application up and running on a different machine is no simple task. There are countless considerations, from the basics of getting your environment variables set, to which runtimes you’ll need and which dependencies those rely on, to the need to automate the whole process. It’s simply not feasible for software teams to rely on a manual deploy process anymore.

A number of technologies have sought to solve this problem of differing environments, automation, and deployment configuration, but the most well-known and perhaps the most notable attempt in recent years is Docker.

By the end of this tutorial you should be able to:

  • understand what Docker is and what it does
  • create a simple Dockerfile
  • run a Node.js application using Docker
  • deploy your Node.js application

What is Docker, Anyway?

Docker’s homepage describes Docker as follows:

“Docker is an open platform for building, shipping and running distributed applications. It gives programmers, development teams and operations engineers the common toolbox they need to take advantage of the distributed and networked nature of modern applications.”

Put differently, Docker is an abstraction on top of low-level operating system tools that allows you to run one or more containerized processes or applications within one or more virtualized Linux instances.

Advantages of Using Docker

Before we dive in, it’s important to stress again the potential usefulness of Docker in your software development workflow. It’s not a “silver bullet”, but it can be hugely helpful in certain cases. Note the many potential benefits it can bring, including:

  • Rapid application deployment
  • Portability across machines
  • Version control and component reuse
  • Sharing of images/dockerfiles
  • Lightweight footprint and minimal overhead
  • Simplified maintenance

Prerequisites

Before you begin this tutorial, make sure you have Node.js (which includes npm) and Git installed on your system; we’ll install Docker itself as part of the tutorial.

Directory Structure

We’ll be using a basic Express application as our example Node.js application to run in our Docker container. To keep things moving, we’ll use Express’s scaffolding tool to generate our directory structure and basic files.

# This will make the generator available to use anywhere
$ npm i -g express-generator
$ cd <your project directory>
$ git init # (if you haven't set up your repository already)
$ express
# ...
$ npm install

This should have created a number of files and directories in your project, including bin, views, and routes. Make sure to run npm install so that npm can get all of your Node.js modules set up and ready to use.

Setting Up Express

Now that we’ve got our basic Express files generated for us, let’s write some basic tests to ensure that we’re working with good development practices and can have some idea of when we’re done.

To run our tests, we’ll use just two tools: SuperTest and tape.

Let’s install them first as development dependencies, so they won’t be installed in production:

$ npm install --save-dev tape supertest

Since we’re not focusing on Express in this tutorial, we won’t go too deeply into how it works or test it extensively. At this point, we just want to know that the application will send back some basic JSON responses when we make GET requests. SuperTest will spin up an instance of our application, assign it an ephemeral port, and let us send requests to it with a fluent API. We also get a couple of assertions we can run; here, we run them against the response status and the ‘Content-Type’ header.

test/routes.js We can set up the tests that will focus on our routes in this file. Normally, we would break related tests into several different files, but our application is so lightweight that this will suffice.

const supertest = require("supertest");
const app = require("../app");
const api = supertest(app);

const test = require("tape");

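// We describe our test and send a GET request to the /health path, which we
// expect to return a JSON response with a healthy property that equals true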
test("GET /health", (t) => {
    api.get("/health")
        .expect("Content-type", /json/)
        .expect(200)
        .end((err, res) => {
            if (err) {
                t.fail(err);
                t.end();
            } else {
                t.ok(res.body, "It should have a response body");
                t.equals(
                    res.body.healthy,
                    true,
                    "It should return a healthy parameter and it should be true"
                );
                t.end();
            }
        });
});

// We describe our test and send a GET request to the /docker path, which we
// expect to return a JSON response with a docker property that equals 'rocks!'
test("GET /docker", (t) => {
    api.get("/docker")
        .expect("Content-type", /json/)
        .expect(200)
        .end((err, res) => {
            if (err) {
                t.fail(err);
                t.end();
            } else {
                t.ok(res.body, "It should have a response body");
                t.equals(
                    res.body.docker,
                    "rocks!",
                    "It should return a docker parameter with value rocks!"
                );
                t.end();
            }
        });
});

// Ensure we get the proper 404 when trying to GET an unknown route
test("GET unknown route", (t) => {
    api.get(`/${Math.random() * 10}`)
        .expect(404)
        .end((err, res) => {
            if (err) {
                t.fail(err);
                t.end();
            } else {
                t.end();
            }
        });
});

We can run our tests with node test/*.js, but it’s better to make sure anyone can run them, so we’ll use our package.json file to standardize the test command to npm test:

package.json A package.json file lets npm and end users of your application know which dependencies your application requires, and provides other useful metadata.

  "scripts": {
    "start": "node ./bin/www",
    "test": "node test/routes.js" // we added this line
  },

Run your tests with npm test, and you should see two failing tests. Let’s get them passing by adding some routes to our bare bones application.

app.js This is the main file for our Express application. The bin/www file will do the simple work of running our server, but this is where we set up our middleware, application configuration, and other options.

const express = require("express");
const path = require("path");
const favicon = require("serve-favicon");
const logger = require("morgan");
const cookieParser = require("cookie-parser");
const bodyParser = require("body-parser");

const health = require("./routes/health");
const docker = require("./routes/docker");

const app = express();

app.use(logger("dev"));
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: false }));
app.use(cookieParser());
app.use(express.static(path.join(__dirname, "public")));

app.use("/health", health);
app.use("/docker", docker);

// catch 404 and forward to error handler
app.use((req, res, next) => {
    const err = new Error("Not Found");
    err.status = 404;
    next(err);
});

// error handlers

// development error handler
// will print stacktrace
if (app.get("env") === "development") {
    app.use((err, req, res, next) => {
        res.status(err.status || 500);
        // send the error message and stack trace back in development
        res.json({
            message: err.message,
            error: err.stack,
        });
    });
}

// production error handler
// no stacktraces leaked to user
app.use((err, req, res, next) => {
    res.status(err.status || 500);
    // send only the error message; no stack trace in production
    res.json({
        message: err.message,
    });
});

module.exports = app;

routes/health.js This is a simple health route that will send back a JSON-encoded payload when clients visit <our url>/health.

const router = require("express").Router();

router.get("/", (req, res, next) => {
    return res.json({
        healthy: true,
    });
});

module.exports = router;

routes/docker.js This is another simple route that will send back a JSON-encoded payload when clients visit <our url>/docker.

const router = require("express").Router();

router.get("/", (req, res, next) => {
    return res.json({
        docker: "rocks!",
    });
});

module.exports = router;

We’ve got our basic Node.js application all set up and ready to go. If you want to run it directly, rather than through npm test, you can start it with:

$ node bin/www

Setting Up PM2

While running our Node.js application with node bin/www is fine for most cases, we want a more robust solution to keep everything running smoothly in production. We recommend using pm2, since it gives you a lot of tunable features, such as clustering and automatic restarts. We can’t go too deep into how pm2 works or how to use it, but we will create a basic processes.json file that pm2 can use to run our application in production.

$ npm install --save pm2
$ cd <your project directory> && touch processes.json

processes.json To make it easier to run our Node.js application and understand what parameters we are giving to PM2, we can use an arbitrarily named JSON file (processes.json here) to set up our production configuration.

{
    "apps": [
        {
            "name": "api",
            "script": "./bin/www",
            "merge_logs": true,
            "max_restarts": 20,
            "instances": 4,
            "max_memory_restart": "200M",
            "env": {
                "PORT": 4500,
                "NODE_ENV": "production"
            }
        }
    ]
}

Our processes.json file names our application, tells PM2 which script to run, configures clustering and restart behavior, and sets environment variables. Moving all of our environment variables out of our code keeps the application stateless, which is how it should be, and will let us easily scale horizontally if we need to.
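
For reference, the PORT value in processes.json only works because the bin/www script that express-generator created reads its port from the environment. Here is a simplified sketch of that pattern; the generated file adds port normalization and error handling on top of this:

// Simplified sketch of bin/www: the generated file does roughly this,
// plus port normalization and error handling.
const http = require("http");
const app = require("../app");

// PM2 injects PORT=4500 from processes.json; fall back to 3000 locally
const port = process.env.PORT || 3000;
app.set("port", port);

http.createServer(app).listen(port, () => {
    console.log(`Listening on port ${port}`);
});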

Installing Docker

With one of the core tenets of Docker being platform freedom and portability, you’d expect it to run on a wide variety of platforms. You would be correct: the Docker installation page lists over 17 supported cloud and Linux platforms.

We can’t go through every installation possibility, but we’ll walk through installing Docker using Docker Machine.

We’ll install Docker Machine using Homebrew. This is generally preferred over installing binaries and packages in an inconsistent and/or scatter-shot way, since you will probably end up littering your computer with old versions, upgrading will be difficult, and you might end up using sudo when you don’t necessarily need to. If you prefer not to use Homebrew, there are further installation instructions available here.

So, once you have Homebrew installed, you can run the following:

$ brew update && brew upgrade --all && brew cleanup && brew prune # makes sure everything is up to date and cleans out old files
$ brew install docker-machine

Now that Docker Machine is installed, we can use it to create some virtual machines and run Docker clients. You can run docker-machine from your command line to see what options you have available. You’ll notice that the general idea of docker-machine is to give you tools to create and manage Docker clients. This means you can easily spin up a virtual machine and use that to run whatever Docker containers we want or need on it.

We’re going to create a VirtualBox virtual machine and specify how many CPUs and how much disk space it should have. It’s generally best to try to mirror the production environment you’ll be using as closely as possible. For this case, we’ve chosen a machine with 2 CPUs, 4GB of memory, and 5GB of disk space since that matches the cloud instance we have most recently worked with. We named the machine ‘dev2’:

$ docker-machine create --driver virtualbox --virtualbox-disk-size "5000" --virtualbox-cpu-count 2 --virtualbox-memory "4112" dev2

This will spin up your machine and let you know when everything is finished. The next step is to use Docker Machine’s env command to finish your setup:

$ docker-machine env dev2

export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://123.456.78.910:1112"
export DOCKER_CERT_PATH="/Users/user/.docker/machine/machines/dev2"
export DOCKER_MACHINE_NAME="dev2"
# Run this command to configure your shell:
# eval "$(docker-machine env dev2)"

Those are the environment variables Docker Machine will need to let you interact and work with your new machine. Finish up the setup by running eval "$(docker-machine env <the name of your machine>)", and check docker-machine ls to ensure your new machine is up and running. You should now be able to run docker from your command line and see feedback — we’re almost ready to dockerize all the things.

Creating a Dockerfile

There are many ways to use Docker, but one of the most useful is through the creation of Dockerfiles. These are files that essentially give build instructions to Docker when you build a container image. This is where the magic happens: we can declaratively specify what we want to have happen, and Docker will ensure our container gets created according to our specifications. Let’s create a Dockerfile in the root of our project directory:

$ cd <your project root>
$ touch Dockerfile && touch .dockerignore

Note that we also created a .dockerignore file. This is similar to a .gitignore file and lets us safely ignore files or directories that shouldn’t be included in the final Docker build. A side benefit is that we also eliminate a set of possible errors by only including the files we really care about.

Along those lines, let’s add some files to our .dockerignore file:

.dockerignore This file acts like a .gitignore and tells Docker which files it should ignore.

.git
.gitignore
node_modules

Now we’re ready to create a Dockerfile. You can think of a Dockerfile as a set of instructions to Docker for how to create our container, very much like a procedural piece of code.

To get started, we need to choose which base image to pull from. We are essentially telling Docker “Start with this.” This can be hugely useful if you want to create a customized base image and later create other, more-specific containers that ‘inherit’ from a base container. We’ll be using the debian:jessie base image, since it gives us what we need to run our application and has a smaller footprint than the Ubuntu base image. This will end up saving us some time during builds and let us only use what we really need.

Dockerfile Using a Dockerfile is one way to tell Docker how to build images for us.

# The FROM directive sets the Base Image for subsequent instructions
FROM debian:jessie

Next, let’s take care of a couple of minor housekeeping tasks so we can later use nvm to choose whatever version of Node.js we want, and then set an environment variable:

# ...
# Replace shell with bash so we can source files
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
# Set environment variables
ENV appDir /var/www/app/current

The RUN command executes any commands in a new layer on top of the current image and then commits the results. The resulting image will then be used in the next steps.

This command starts to get us into the incremental aspect of Docker that we mentioned briefly as one of its benefits. Each RUN command acts as a sort of git-commit-like action: it takes the current image, executes commands on top of it, and then returns a new image with the committed changes. This makes the build process highly granular (any point in the build phases should be a valid image) and lets us think of the build more atomically, where each step is self-contained.

With that in mind, let’s install some packages that we’ll need to run our Node.js application later:

# ...
# Run updates and install deps
RUN apt-get update

# Install needed deps and clean up after
RUN apt-get install -y -q --no-install-recommends \
    apt-transport-https \
    build-essential \
    ca-certificates \
    curl \
    g++ \
    gcc \
    git \
    make \
    nginx \
    sudo \
    wget \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get -y autoclean

Note that we grouped all of the apt-get install-related actions into a single RUN command, so that phase of the build does only one thing: install the packages we need with apt-get and clean up afterwards.

Next, we’ll install nvm so we can install any version of Node.js that we want. There are base images out there that come with Node.js already installed, but there are several reasons why you might not want to use them:

  • Speed: nvm lets you upgrade to the latest version of Node.js immediately. Critical security fixes are sometimes released, and you shouldn’t have to wait for an updated base image
  • Clean separation of concerns: changing to/from a version of Node.js is done with nvm, which is dedicated to managing Node.js installations
  • Lightweight: you get what you need with a simple curl-to-bash installation

Dockerfile We are adding Node.js-related commands to our Dockerfile.

# ...
ENV NVM_DIR /usr/local/nvm
ENV NODE_VERSION 6.0.0

# Install nvm with node and npm
RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.29.0/install.sh | bash \
    && source $NVM_DIR/nvm.sh \
    && nvm install $NODE_VERSION \
    && nvm alias default $NODE_VERSION \
    && nvm use default

# Set up our PATH correctly so we don't have to long-reference npm, node, &c.
ENV NODE_PATH $NVM_DIR/versions/node/v$NODE_VERSION/lib/node_modules
ENV PATH      $NVM_DIR/versions/node/v$NODE_VERSION/bin:$PATH

We just ran the basic nvm setup instructions, installed the version of Node.js we want, made sure it is set as the default for later, and set some environment variables to use later (PATH and NODE_PATH).

One thing to note: We highly recommend downloading a copy of the nvm install script and hosting it yourself if you’re going to use this setup in production, since you really don’t want to be relying on the persistence of a hosted file for your entire build process.

Now that we have Node.js installed and ready to use, we can add our files and get ready to run everything. First, we need to create a directory to hold our application files. Then, we’ll set the workdir, so Docker knows where to add files later. This affects RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow it in the Dockerfile. We waited to set it till now because our commands have not needed to be run from a particular directory.

# Set the work directory
RUN mkdir -p /var/www/app/current
WORKDIR ${appDir}

# Add our package.json and install *before* adding our application files
ADD package.json ./
RUN npm i --production

# Install pm2 *globally* so we can run our application
RUN npm i -g pm2

# Add application files
ADD . /var/www/app/current

This part is crucial for understanding how to speed up our container builds. Since Docker caches each build step (layer) between incremental builds, the further down the Dockerfile we can push the steps that change most often, the better. Docker won’t re-run commits (RUNs and other commands) whose inputs haven’t changed, but once a step does change, every step after it has to be rebuilt.

So, we add only our package.json file and run npm install --production. Once that’s done, we can add our application files using ADD. Since we ordered the steps this way and chose to have Docker ignore our local node_modules directory, the costly npm install --production step will only be re-run when package.json has changed. This will save build time and hopefully result in a speedier deploy process.

The last two commands are quite important: they handle access to our container and what happens when we run our container, respectively.

Dockerfile These final Dockerfile commands are what Docker will use to actually run our application.

# ...
#Expose the port
EXPOSE 4500

CMD ["pm2", "start", "processes.json", "--no-daemon"]
# the --no-daemon is a minor workaround to prevent the docker container from thinking pm2 has stopped running and ending itself

EXPOSE tells Docker which port our container will listen on; it does not publish that port on the host system. Remember, these instructions are for Docker, not the host environment.

We can map ports to external ports later, so choosing a privileged port like 80 or 443 isn’t absolutely necessary here.

CMD is what will happen when you run your container using docker run from the command line. It takes arguments as an array, somewhat similar to how Node’s child_process#spawn() API works.
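
To make that analogy concrete, here is a rough, purely illustrative Node.js equivalent of the exec-form CMD above, expressed with child_process#spawn:

// Roughly what CMD ["pm2", "start", "processes.json", "--no-daemon"] amounts to,
// expressed with Node's child_process API (illustrative only).
const { spawn } = require("child_process");

const pm2 = spawn("pm2", ["start", "processes.json", "--no-daemon"], {
    stdio: "inherit", // forward the child process's output to our terminal
});

pm2.on("exit", (code) => console.log(`pm2 exited with code ${code}`));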

Our final Dockerfile should look more or less as follows:

Dockerfile

# using debian:jessie for its smaller size over ubuntu
FROM debian:jessie

# Replace shell with bash so we can source files
RUN rm /bin/sh && ln -s /bin/bash /bin/sh

# Set environment variables
ENV appDir /var/www/app/current

# Run updates and install deps
RUN apt-get update

RUN apt-get install -y -q --no-install-recommends \
    apt-transport-https \
    build-essential \
    ca-certificates \
    curl \
    g++ \
    gcc \
    git \
    make \
    nginx \
    sudo \
    wget \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get -y autoclean

ENV NVM_DIR /usr/local/nvm
ENV NODE_VERSION 6.0.0

# Install nvm with node and npm
RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.29.0/install.sh | bash \
    && source $NVM_DIR/nvm.sh \
    && nvm install $NODE_VERSION \
    && nvm alias default $NODE_VERSION \
    && nvm use default

# Set up our PATH correctly so we don't have to long-reference npm, node, &c.
ENV NODE_PATH $NVM_DIR/versions/node/v$NODE_VERSION/lib/node_modules
ENV PATH      $NVM_DIR/versions/node/v$NODE_VERSION/bin:$PATH

# Set the work directory
RUN mkdir -p /var/www/app/current
WORKDIR ${appDir}

# Add our package.json and install *before* adding our application files
ADD package.json ./
RUN npm i --production

# Install pm2 so we can run our application
RUN npm i -g pm2

# Add application files
ADD . /var/www/app/current

#Expose the port
EXPOSE 4500

CMD ["pm2", "start", "processes.json", "--no-daemon"]

# voila!

Bundling and Running the Docker Container

We’re almost there. To run our container locally, we need to do two things:

  1. Build the container:
$ cd <your project directory>
$ docker build -t markthethomas/dockerizing-nodejs-app .
#          ^    ^                                 ^
#        build  w/ tag              this directory
# ... lots of output

Before moving on, try running the build command again and see how much faster it is with everything cached.

  2. Run it:
$ docker run -p 4500:4500 markthethomas/dockerizing-nodejs-app
#               ^^^^^^^^^
#          bind the exposed container port to host port (on the virtual machine)

Since we’re running locally, we can get the IP that Docker Machine set up for us with docker-machine ip dev2, then visit http://<that IP>:4500/docker, and we should get a JSON response.
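
If you’d prefer to script that check, a small Node.js one-off does the trick. The IP below is a placeholder; substitute whatever docker-machine ip dev2 prints for you:

// check.js: a quick sanity check against the running container.
// MACHINE_IP is a placeholder; use the output of `docker-machine ip dev2`.
const http = require("http");

const MACHINE_IP = "192.168.99.100";

http.get(`http://${MACHINE_IP}:4500/docker`, (res) => {
    let body = "";
    res.on("data", (chunk) => (body += chunk));
    res.on("end", () => console.log(res.statusCode, body)); // expect: 200 {"docker":"rocks!"}
}).on("error", (err) => console.error(err.message));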

Conclusion

We have looked at Docker: what it is, how it works, how we can use it, and how we might run a simple Node.js application in a container. Hopefully, you feel able and ready to create your own Dockerfile and take advantage of the many powerful features it brings to your development life.
