Docker Image Building – Lessons learned

My cps-WhatsNew script has a new permanent page on the blog site now, as well as a new Docker image!  Check out the links and let me know what you think.

While the cps-WhatsNew script is relatively simple, I was thinking about how I could make the solution easier to implement. Docker was a clear choice. As I've only ever been a consumer of Docker images, this "short" project turned out to be very interesting and, embarrassingly enough, took a long time to complete.

It took close to three full days, off and on, to get everything put together and optimized. The optimization took the longest, and I'm still not happy with the size of the Docker image now that I've published everything.

Here are some hard-won lessons:

Have a good directory structure

Initially, my repo consisted of one root directory that contained all my application files and a subdirectory for my email template. The root directory had everything from README files to configuration files to Python scripts. Everything under the sun! If you ever plan on sharing your code in a Docker image, you need to think about your project's directory structure a bit more thoroughly.

A better design turns out to be documented in some of the "best practices" out there for structuring your GitHub repos. First, put all your code in a subdirectory off of your root folder. Then create another directory for all the files that end users of your script might want to configure. In my case, I created a /custom directory and placed the application configuration files, the logging configuration files, and my email template subdirectory in it.

Also, think about using environment variables to help with some of this restructuring. The more you can get your script to rely on environment variables from the start, the easier it will be to adapt to changes later. In my case, I put all the configurable bits into the application configuration file and the logging configuration file. In the end, to make the script easier to work with inside a Docker image, I needed only two environment variables: one for the /custom folder and another for the location of the /logs directory.
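To give a rough idea of how that maps into a Dockerfile, here's a minimal sketch (not my actual dockerfile – the base image, the app/ directory name, and the variable names are just placeholders):

```dockerfile
# Hypothetical layout: code lives in app/, user-configurable files in custom/.
FROM python:3-alpine

# The script looks these up to find its configuration and its log directory,
# so a user can remap them with -e or mount volumes over the defaults.
ENV CUSTOM_DIR=/custom \
    LOG_DIR=/logs

COPY app/ /app/
COPY custom/ ${CUSTOM_DIR}/
RUN mkdir -p ${LOG_DIR}
```

With the repo split up this way, the COPY lines stay short and nothing from the repo root (README files, notes, and so on) sneaks into the image.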

Make sure you modify your project's code to reflect the new directory structure and environment variables. Then run the scripts to make sure everything is still working. Or, better yet, start building some automated tests so that you don't have to continually run them yourself.

Helpful tools

So I started this process with my two standard tools – iTerm2 (work) and Safari (research – Google is your friend). While this worked, I found I could have made life a lot easier for myself by using some other tools. I was basically creating and editing files right from the console as I was learning how to build a good dockerfile. nano was my go-to editor. UGH!

First, GitHub. If you're half-way serious about building a Docker image, you'll want to create a GitHub project to store the dockerfile you're going to create – and eventually all of the scripts you'll build to automate the Docker image and all the documentation you write. This is different from the GitHub repo you would have for your code/application; this one is just for the Docker image.

Yes – I'm sure there are cases where you can build a Docker image without any supporting scripts. However, you're more than likely going to have to create some if you want to give users of your image a smoother experience. In my case, I wanted to make sure the user had a default email template and configuration files after spinning up a container from my image. The only way to do this was to create a script that copies the necessary files to the mount points the user provides. In my research, I found some Docker images that had just as much scripting around automating the image as in the actual project! Thankfully I was able to limit mine to two scripts.
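For what it's worth, here's the shape of that idea as a sketch (simplified, with the same placeholder layout as above; entrypoint.sh stands in for one of my supporting scripts): keep a pristine copy of the defaults inside the image, and have the entrypoint seed the user's mount point with them before starting the script.

```dockerfile
# Hypothetical sketch of seeding a mounted /custom directory with defaults.
FROM python:3-alpine

# Pristine copy of the defaults, kept out of the way of whatever gets mounted at /custom.
COPY custom/ /defaults/custom/
COPY app/ /app/

# entrypoint.sh (not shown) would copy /defaults/custom into /custom only when the
# mounted directory is empty, then hand off to the main process.
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```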

Second, Visual Studio Code – or some other kick-ass editor. I'm starting to fall in love with Visual Studio Code. It just seems to have every extension I could ever want. In this case, it helped me build not only the shell scripts, but also my README.md file and the dockerfile itself! The awesome bit is not just the convenience, but the built-in help. As I edited my previously nano-edited dockerfile, Visual Studio Code would tell me that some of the commands I was using were deprecated and what I should be using instead. Plus, it had links to online help for all the dockerfile commands. I swear, Visual Studio Code has an identity crisis – it can't be called a text editor anymore; it's more akin to a full IDE!

Optimize, test, rinse, repeat

So you'll see a lot of 'best practices' out there for Docker images, and one recurring rule is to have your Docker image do ONLY ONE thing. I don't know if I completely agree. In my case, the image exists to house the script and provide the user with default templates. I could have stopped there; however, I thought, why not add a default cron job to execute the code? Configuring another Docker image just to run a cron job seemed like a complete waste of time. It just didn't make sense. So my container does two things, as it's a bit more optimized that way.
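If you're curious what that looks like, here's a rough sketch (again, not my exact dockerfile – the schedule and the whatsnew.py name are placeholders, and I'm assuming an Alpine base where busybox crond is available):

```dockerfile
# Hypothetical sketch: run the script on a schedule from the same container.
FROM python:3-alpine
COPY app/ /app/

# Busybox crond on Alpine reads /etc/crontabs/root.
RUN echo "0 6 * * * python3 /app/whatsnew.py" >> /etc/crontabs/root

# Run cron in the foreground so it becomes the container's long-lived process.
CMD ["crond", "-f", "-l", "2"]
```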

Also, during the process of creating the image, I believe I rebuilt it at least 50 times, if not more. Each time I made iterative improvements, until I painted myself into a corner and had to back up two or three steps.

One such time was when I attempted to reduce the size of the Docker image by removing dev and build tools. My code is written in Python, so I have a requirements file that pulls in all of the libraries my code needs to run. Some of these libraries have to be compiled during the install, and for them to build I had to download even more packages that would have no use in the completed image. Hence, my dockerfile had a line to remove those packages after the needed libraries were built. This took my 350 MB image down to 150 MB! I thought I was so clever until I went to execute my code. Turns out I had removed far too much. My code wouldn't work at all. It bitterly complained about everything until I started putting back the libraries I had removed.
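The pattern itself is sound – it's just easy to delete too much. Something like the following is the shape of it on an Alpine base (the package names here are placeholders; what you actually need depends on your requirements file): install the compilers and headers as a named virtual group, build the Python packages, and delete only that group, all in a single layer, while leaving the runtime libraries alone.

```dockerfile
# Hypothetical sketch: install build tools, build, then remove only the build tools.
FROM python:3-alpine
COPY app/requirements.txt /app/requirements.txt

RUN apk add --no-cache --virtual .build-deps gcc musl-dev libffi-dev \
 && pip install --no-cache-dir -r /app/requirements.txt \
 && apk del .build-deps
```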

Eventually (after an embarrassingly long period of time), I got the image down to about 325 MB (150 MB compressed on Docker Hub!). The lesson here is that you have to test everything, even after you think you're done with all your coding.

Thanks!

JT
