Fast Deployments with Docker

April 14, 2017

A couple of weeks ago I upgraded my server and moved this Jekyll blog from Heroku’s servers to my own dedicated machine. In the process I also overhauled the blog: I added SSL encryption and made it more efficient by minifying more resources and following Google’s PageSpeed best practices.

Simplifying Deployments with Docker

The new setup is based on the awesome Dokku, a Docker-powered Platform as a Service (PaaS). Coming from Heroku, I can still deploy changes to my website quickly, but now I host it on my own machine. I can also run several applications that are isolated from each other (anagram, screen, sudoku). But Docker provides even more benefits:

Rapid application deployment – containers include the minimal runtime requirements of the application, reducing their size and allowing them to be deployed quickly.
Portability across machines – an application and all its dependencies can be bundled into a single container that is independent of the host's Linux kernel version, platform distribution, or deployment model. This container can be transferred to another machine that runs Docker and executed there without compatibility issues.
Version control and component reuse – you can track successive versions of a container, inspect differences, or roll back to previous versions. Containers reuse components from the preceding layers, which makes them noticeably lightweight.
Sharing – you can use a remote repository to share your container with others. Red Hat provides a registry for this purpose, and it is also possible to configure your own private repository.
Lightweight footprint and minimal overhead – Docker images are typically very small, which facilitates rapid delivery and reduces the time to deploy new application containers.
Simplified maintenance – Docker reduces effort and risk of problems with application dependencies.

To set up an application with Dokku I just had to create a simple Dockerfile. It is based on the official ruby image and adds node and yarn to the mix.

FROM ruby

# ... install node.js and yarn (see https://github.com/nodejs/docker-node/blob/a82c9dcd3f85ff8055f56c53e6d8f31c5ae28ed7/7.9/Dockerfile)

Afterwards I install all dependencies through yarn, gem and bower.

ADD . /app
WORKDIR /app

RUN yarn global add node-gyp
RUN yarn install
RUN bundle install
RUN yarn run bower
RUN yarn run gulp

Finally I can build the site with the jekyll command.

ENV JEKYLL_ENV production
RUN bundle exec jekyll build

EXPOSE 5000
CMD ["bundle","exec","unicorn","-p","5000","-c","./unicorn.rb"]

Docker Caching Issues

One major issue I encountered was that changes often took a long time to deploy. Every time I changed a small file, e.g. corrected a typo in a blog post, Docker would reinstall the entire blog with all its dependencies. Usually Docker is smart about this and caches the result of each instruction in the Dockerfile. But once a file that was added to the image changes, the cache is invalidated from that instruction onwards and every later step has to run again. Since I added my complete project folder at once, any change, however small, triggered a complete rebuild of my application.
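To make the invalidation concrete, here is the naive layout annotated with what the build cache does at each step. It is an illustrative sketch rather than the exact file that was deployed, and it leaves out the node.js and yarn installation, so it will not build as-is on the plain ruby image.

```dockerfile
FROM ruby

# (node.js and yarn installation omitted in this sketch)

# The contents of every file in the project folder are checksummed for this
# step. Editing a single blog post changes the checksum, so the cached layer
# can no longer be reused ...
ADD . /app
WORKDIR /app

# ... and once one step misses the cache, every later instruction runs again,
# even though package.json, Gemfile and bower.json are unchanged.
RUN yarn install
RUN bundle install
RUN yarn run bower
```

The expensive installs sit below the instruction that changes most often, which is exactly the wrong way round for caching.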

Dependency Jungle

The steps of the process that took the most time were the dependency installations. For my blog I use three different commands that take a long time to execute:

The ruby gem command
The node.js yarn package manager
The bower tool for html components

Ideally I would still like to test my blog under the same conditions as on the server. If I had to reinstall all dependencies every time I made a minor change, that would slow me down drastically. So what do we do about this?

Solution

To ensure that dependencies are only reinstalled when the corresponding configuration changes, we have to add the files to the Docker image selectively.

For my blog these files are:

Gemfile (Ruby)
package.json (Node.js)
bower.json (bower)

Dependency Files

We have to add these files before the rest of the project and install the dependencies in a global or temporary location, so that we can bring them back together with the rest of the application later.

Gemfile

ADD Gemfile /tmp/Gemfile
ADD Gemfile.lock /tmp/Gemfile.lock
RUN cd /tmp && bundle install

package.json

RUN yarn global add node-gyp
ADD package.json /tmp/package.json
RUN cd /tmp && yarn install
RUN mkdir -p /app && cp -a /tmp/node_modules /app/

bower.json

ADD bower.json /tmp/bower.json
RUN cd /tmp && yarn run bower

Ordering

Depending on which configuration file changes, all the steps that come afterwards have to be executed again. If we want to be smart about our approach, we can order the different installation steps in a more efficient way. In my case this would be:

yarn
gem
bower

For my application, the bower dependencies change most frequently, so I install them last. Less often I add new Jekyll plugins, so the gems come before that. The yarn packages are mostly development requirements like gulp that almost never change, so I install them right at the beginning.

Final Words

Of course you can use a similar setup for all kinds of dependencies and installation steps. The core idea is to add files selectively and build your Docker image step by step.
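As an illustration of how the idea carries over to another ecosystem, here is a minimal sketch for a hypothetical Python project. The file names requirements.txt and app.py and the python:3 base image are assumptions made for this example and are not part of my blog setup. COPY is used instead of ADD; for plain local files the two behave the same.

```dockerfile
# Hypothetical Python project; requirements.txt and app.py are assumed names.
FROM python:3

WORKDIR /app

# Copy only the dependency manifest first. This layer and the install step
# below stay cached until requirements.txt itself changes.
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt

# Copy the rest of the project last, so that source edits only rebuild
# from this point on.
COPY . /app

CMD ["python", "app.py"]
```

The pattern is the same as with the Gemfile, package.json and bower.json: the manifest goes in first, the install runs, and the frequently changing sources are added at the end.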

Below you can find the final version of my Dockerfile. Don’t hesitate to ask if you have any questions!

FROM ruby

RUN groupadd --gid 1000 node \
  && useradd --uid 1000 --gid node --shell /bin/bash --create-home node

# gpg keys listed at https://github.com/nodejs/node
RUN set -ex \
  && for key in \
    9554F04D7259F04124DE6B476D5A82AC7E37093B \
    94AE36675C464D64BAFA68DD7434390BDBE9B9C5 \
    0034A06D9D9B0064CE8ADF6BF1747F4AD2306D93 \
    FD3A5288F042B6850C66B31F09FE44734EB7990E \
    71DCFD284A79C3B38668286BC97EC7A07EDE3FC1 \
    DD8F2338BAE7501E3DD5AC78C273792F7D83545D \
    B9AE9905FFD7803F25714661B63B535A4C206CA9 \
    C4F0DFFF4E8C1A8236409D08E73BC641CC11F4C8 \
    56730D5401028683275BD23C23EFEFE93C4CFFFE \
  ; do \
    gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$key"; \
  done

ENV NPM_CONFIG_LOGLEVEL info
ENV NODE_VERSION 7.7.2

RUN curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/node-v$NODE_VERSION-linux-x64.tar.xz" \
  && curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/SHASUMS256.txt.asc" \
  && gpg --batch --decrypt --output SHASUMS256.txt SHASUMS256.txt.asc \
  && grep " node-v$NODE_VERSION-linux-x64.tar.xz\$" SHASUMS256.txt | sha256sum -c - \
  && tar -xJf "node-v$NODE_VERSION-linux-x64.tar.xz" -C /usr/local --strip-components=1 \
  && rm "node-v$NODE_VERSION-linux-x64.tar.xz" SHASUMS256.txt.asc SHASUMS256.txt \
  && ln -s /usr/local/bin/node /usr/local/bin/nodejs

ENV YARN_VERSION 0.21.3

RUN set -ex \
  && for key in \
    6A010C5166006599AA17F08146C2130DFD2497F5 \
  ; do \
    gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$key"; \
  done \
  && curl -fSL -o yarn.js "https://yarnpkg.com/downloads/$YARN_VERSION/yarn-legacy-$YARN_VERSION.js" \
  && curl -fSL -o yarn.js.asc "https://yarnpkg.com/downloads/$YARN_VERSION/yarn-legacy-$YARN_VERSION.js.asc" \
  && gpg --batch --verify yarn.js.asc yarn.js \
  && rm yarn.js.asc \
  && mv yarn.js /usr/local/bin/yarn \
  && chmod +x /usr/local/bin/yarn

RUN yarn global add node-gyp
ADD package.json /tmp/package.json
RUN cd /tmp && yarn install
RUN mkdir -p /app && cp -a /tmp/node_modules /app/

ADD Gemfile /tmp/Gemfile
ADD Gemfile.lock /tmp/Gemfile.lock
RUN cd /tmp && bundle install

ADD bower.json /tmp/bower.json
RUN cd /tmp && yarn run bower

ADD . /app
WORKDIR /app

RUN mv /tmp/bower_components source/_assets/components
RUN yarn run gulp

ENV JEKYLL_ENV production
RUN bundle exec jekyll build

EXPOSE 5000
CMD ["bundle","exec","unicorn","-p","5000","-c","./unicorn.rb"]

By Cecil Wöbker
