Setting the stage
The goal of this short post is to show how to use a Docker container as your development environment in VSCode, both locally and on a remote server. I decided to put it together because the information I found online on the topic was very fragmented, and I wanted to consolidate it into a concise go-to post for my future self. Hoping this will benefit others as well!
First things first. Why would you want to code inside a container?
Let’s take a step back. I am far from being a Docker wizard, having started using the technology less than a year ago (shame on me). Probably due to my lack of expertise, I have always considered Docker a utility that comes into play only at deployment time. So, you have an ML model you have trained and, at some point, you’ll likely ask yourself: “How do I serve it to users?”. To which the almost unanimous answer will be: “You wrap it up behind a REST API inside a Docker container”. I couldn’t agree more. As a testament to this, I have written a couple of posts on the topic.
Now, let’s say you are part of a team of devs and someone asks you to add a feature/fix a bug in a specific repo. Diligently, you clone the repo on your laptop, and then, before even writing a single line of code, you want to make sure tests are passing (the repo has tests, right?). Wait, where are you going to run those? You don’t have an environment with all the proper dependencies installed. Not a problem. You look for a `requirements.txt` file somewhere in the root folder and create a virtual env on top of it. This is easier said than done, though. I have spent countless hours fighting machine-related conflicts (even starting off from a blank env!).
Wait a sec. Isn’t this what Docker was created for? To solve “this works on my machine” problems? Yes, indeed. Which is why you should definitely go for it. If the repo you have just cloned has a Dockerfile (it has one, right?), you can build it, spin up a container on top of the image, and tell your IDE (VSCode in our case) to run inside it. All in a matter of minutes. This approach is the only bulletproof guarantee that the code you are writing runs in the exact same environment as your production application. If tests pass there, you are good to go. Now, let’s see how to accomplish that.
To showcase a real-life example, I will walk through two separate scenarios:
- Running VSCode on top of an IceVision Docker container on my laptop (macOS, so no GPU available).
- Running VSCode on top of an IceVision Docker container on an AWS EC2 instance, with a GPU.
In order to achieve CPU/GPU access according to the hardware we are running on, we’ll leverage the flexibility of Docker Compose. Its true superpower is orchestrating a stack of multiple services, but, as we will see, Docker Compose also shines when it comes to managing a single image.
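As a concrete preview (using the file names we’ll meet in a moment), Compose lets you layer a GPU-specific file on top of a base one with repeated `-f` flags, so the same project definition serves both machines:

# CPU-only (my laptop): the base file is all we need
docker-compose -f docker-compose.yaml up -d

# GPU (EC2): layer the GPU overrides on top of the base file
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml up -d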
Dockerfile(s)
Along the way, we will need 4 files:
- `Dockerfile`: this is the core of our environment, containing all the instructions for its successful creation.
- `docker-compose.yaml`: this is the CPU version of the Docker Compose orchestration. It specifies the name of the service (`icevision`), how to build it, the name of the image in case it already exists (`ice` in my case, but you can pick one yourself), and the volumes to mount (crucial if we want any code changes we make inside the container to persist after it is terminated, which we indeed want). We’ll use this file locally (on my Mac).
- `docker-compose.gpu.yaml`: this file complements `docker-compose.yaml`, adding GPU access. We’ll use it on the EC2 instance to let Docker know there is a GPU available and that we want to use it.
- `devcontainer.json`: this file is VSCode-specific. It works in combination with the Remote Development extension pack (which you need to install) and contains all sorts of fully customisable instructions on how to operate the IDE inside the container of choice. This VSCode feature is excellently covered in this Microsoft tutorial, which on its own covers the “developing inside a container locally” part of this post. I highly encourage you to go through it; it is an invaluable resource.
# Dockerfile: with instructions on what to add to the IceVision image
FROM python:3.8
# quotes prevent the shell from treating ">" as a redirect
RUN pip install "PyYAML>=5.1" -U
RUN pip install datascience -U
RUN pip install torchtext==0.9.0 -U
# PyTorch/torchvision wheels built against CUDA 10.1
RUN pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html -U
# https, as the unauthenticated git:// protocol is no longer supported by GitHub
RUN pip install "git+https://github.com/airctic/icevision.git#egg=icevision[all]" -U
RUN pip install git+https://github.com/airctic/icedata.git -U
RUN pip install fastai==2.3.1 -U
RUN pip install yolov5-icevision -U
RUN pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html -U
RUN pip install mmdet==2.13.0 -U
RUN pip install ipywidgets -U
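If you want to sanity-check the image on its own, outside Compose, a plain build works too. The tag below matches the `image: ice` entry we’ll see in `docker-compose.yaml`:

# optional: build the image standalone, tagged "ice" as docker-compose.yaml expects
docker build -t ice .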
# docker-compose.yaml: used to build the IceVision Dockerfile and run a container
version: "3"
services:
  icevision:
    build:
      dockerfile: Dockerfile
      context: .
    image: ice
    tty: true # attaches a terminal to the container, so it stays up
    volumes:
      - ../:/root/ # mount the parent folder (the repo root) so code changes persist
# docker-compose.gpu.yaml: used to add GPU access to docker-compose.yaml
version: "3"
services:
  icevision:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
// devcontainer.json: used to instruct VSCode on what container to run and how
{
    "name": "Icevision",
    "dockerComposeFile": ["docker-compose.yaml"],
    "service": "icevision",
    "runServices": ["icevision"],
    "workspaceFolder": "/root",
    "extensions": [
        "ms-python.python",
        "ms-azuretools.vscode-docker"
    ]
}
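One detail worth spelling out: VSCode looks for `devcontainer.json` inside a `.devcontainer` folder at the root of the repository (a top-level `.devcontainer.json` also works), and the paths in `dockerComposeFile` are resolved relative to it. A layout consistent with the files above (and with the `../:/root/` volume, which then mounts the repo root) would be:

repo-root/
└── .devcontainer/
    ├── Dockerfile
    ├── docker-compose.yaml
    ├── docker-compose.gpu.yaml
    └── devcontainer.json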
Develop inside a container in VSCode on your local machine
Option 1: using `devcontainer.json` and the Remote Development extension
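In short, assuming the four files above sit in a `.devcontainer` folder, this route is a two-step affair: open the repo in VSCode and let the extension do the rest.

# from the root of the cloned repo
code .

# then, from the Command Palette (Cmd/Ctrl+Shift+P), run:
#   Remote-Containers: Reopen in Container
# VSCode builds the image via docker-compose.yaml, starts the
# container, and reopens the window inside it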
Option 2: running the container manually and connecting to it
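Roughly, and reusing the `icevision` service defined above, the manual route boils down to starting the container yourself and then attaching VSCode to it:

# build the image (if needed) and start the container in the background
docker-compose -f docker-compose.yaml up -d

# confirm the container is running and note its name
docker ps

# finally, from the Command Palette, run:
#   Remote-Containers: Attach to Running Container...
# and pick the container from the list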
Develop inside a container in VSCode on a remote host
This is surprisingly easy as we’ll basically go through the same steps enumerated in Option 2 above. For some reason, I could not get Option 1 (`devcontainer.json`) to work on a remote host, but the manual way is still a very effective one. It will let us unlock GPU access in no time.
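Concretely, assuming you are already on the instance (e.g. via the Remote - SSH extension) with the repo cloned, the only difference from the local flow is the extra Compose file:

# on the EC2 instance: layer the GPU overrides on top of the base file
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml up -d

# sanity check: the GPU should now be visible from inside the container
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml exec icevision nvidia-smi

# then attach VSCode to the running container exactly as in Option 2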
Connect to a running container (on a remote host) via SSH directly
Sheik Mohamed Imran (@sheikmohdimran) shared a walkthrough of this approach on Twitter (July 18, 2021): https://t.co/2vP0kh3otI
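For completeness, here is a minimal sketch of the one-hop idea, assuming you can already SSH into the host and that the container is named `ice_container` (both the user and the container name below are placeholders):

# -t forces a pseudo-terminal, which the interactive shell needs
ssh -t ubuntu@<ec2-public-ip> docker exec -it ice_container bash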