Setting the stage
The goal of this short post is to show how to use a Docker container as your development environment in VSCode, both locally and on a remote server. I decided to put it together because the information I found online on the topic was very fragmented, and I wanted to consolidate it into a concise go-to post for my future self. Hoping this will benefit others as well!
First things first. Why would you want to code inside a container?
Let’s take a step back. I am far from being a Docker wizard, having started using the technology less than a year ago (shame on me). Probably due to my lack of expertise, I have always considered Docker a utility that comes into play only at deployment time. So, you have an ML model you have trained and, at some point, you’ll likely ask yourself: “How do I serve it to users?”. To which the almost unanimous answer will be: “You wrap it up behind a REST API inside a Docker container”. I couldn’t agree more. As a testament to this, I have written a couple of posts on the topic.
Now, let’s say you are part of a team of devs and someone asks you to add a feature/fix a bug in a specific repo. Diligently, you clone the repo on your laptop, and then, before even writing a single line of code, you want to make sure tests are passing (the repo has tests, right?). Wait, where are you going to run those? You don’t have an environment with all the proper dependencies installed. Not a problem. You look for a `requirements.txt` file somewhere in the root folder and create a virtual env on top of it. This is easier said than done, though. I have spent countless hours fighting machine-related conflicts (even starting off from a blank env!).
Wait a sec. Isn’t this what Docker was created for? To solve “this works on my machine” problems? Yes, indeed. Which is why you should definitely go for it. If the repo you have just cloned has a Dockerfile (it has one, right?), you can build it, spin up a container on top of the image, and tell your IDE (VSCode in our case) to run inside it. All in a matter of minutes. This approach is the only bulletproof guarantee that the code you are writing runs in the exact same environment as your production application. If tests pass there, you are good to go. Now, let’s see how to accomplish that.
To showcase a real-life example, I will walk through two separate scenarios:
- Running VSCode on top of an IceVision Docker container on my laptop (macOS, so no GPU available).
- Running VSCode on top of an IceVision Docker container on an AWS EC2 instance, with a GPU.
In order to achieve CPU/GPU access according to the hardware we are running on, we’ll leverage the flexibility of Docker Compose. Its true superpower is orchestrating a stack of multiple services, but, as we will see, Docker Compose also shines when it comes to managing a single image.
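As a concrete preview (using the file names we’ll meet in a moment), Compose lets you layer a GPU-specific file on top of a base one with repeated `-f` flags, so the same project definition serves both machines:

# CPU-only (my laptop): the base file is all we need
docker-compose -f docker-compose.yaml up -d

# GPU (EC2): layer the GPU overrides on top of the base file
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml up -d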
Dockerfile(s)
Along the way, we will need 4 files:
- `Dockerfile`: this is the core of our environment, containing all the instructions for its successful creation.
- `docker-compose.yaml`: this is the CPU version of the Docker Compose orchestration. It specifies the name of the service (`icevision`), how to build it, the name of the image in case it already exists (`ice` in my case, but you can pick one yourself), and the volumes to mount (crucial if we want any code changes we make inside the container to persist after it is terminated, which we indeed want). We’ll use this file locally (on my Mac).
- `docker-compose.gpu.yaml`: this file complements `docker-compose.yaml`, adding GPU access. We’ll use it on the EC2 instance to let Docker know there is a GPU available and that we want to use it.
- `devcontainer.json`: this file is VSCode-specific. It works in combination with the Remote Development extension pack (which you need to install) and contains all sorts of fully customisable instructions on how to operate the IDE inside the container of choice. This VSCode feature is excellently covered in this Microsoft tutorial, which on its own covers the “developing inside a container locally” part of this post. I highly encourage you to go through it; it is an invaluable resource.
# Dockerfile: with instructions on what to add to the IceVision image
FROM python:3.8
# quotes prevent the shell from treating ">" as a redirect
RUN pip install "PyYAML>=5.1" -U
RUN pip install datascience -U
RUN pip install torchtext==0.9.0 -U
# PyTorch/torchvision wheels built against CUDA 10.1
RUN pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html -U
# https, as the unauthenticated git:// protocol is no longer supported by GitHub
RUN pip install "git+https://github.com/airctic/icevision.git#egg=icevision[all]" -U
RUN pip install git+https://github.com/airctic/icedata.git -U
RUN pip install fastai==2.3.1 -U
RUN pip install yolov5-icevision -U
RUN pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html -U
RUN pip install mmdet==2.13.0 -U
RUN pip install ipywidgets -U
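If you want to sanity-check the image on its own, outside Compose, a plain build works too. The tag below matches the `image: ice` entry we’ll see in `docker-compose.yaml`:

# optional: build the image standalone, tagged "ice" as docker-compose.yaml expects
docker build -t ice .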
# docker-compose.yaml: used to build the IceVision Dockerfile and run a container
version: "3"
services:
  icevision:
    build:
      dockerfile: Dockerfile
      context: .
    image: ice
    tty: true # attaches a terminal to the container, so it stays up
    volumes:
      - ../:/root/ # mount the parent folder (the repo root) so code changes persist
# docker-compose.gpu.yaml: used to add GPU access to docker-compose.yaml
version: "3"
services:
  icevision:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
// devcontainer.json: used to instruct VSCode on what container to run and how
{
    "name": "Icevision",
    "dockerComposeFile": ["docker-compose.yaml"],
    "service": "icevision",
    "runServices": ["icevision"],
    "workspaceFolder": "/root",
    "extensions": [
        "ms-python.python",
        "ms-azuretools.vscode-docker"
    ]
}
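One detail worth spelling out: VSCode looks for `devcontainer.json` inside a `.devcontainer` folder at the root of the repository (a top-level `.devcontainer.json` also works), and the paths in `dockerComposeFile` are resolved relative to it. A layout consistent with the files above (and with the `../:/root/` volume, which then mounts the repo root) would be:

repo-root/
└── .devcontainer/
    ├── Dockerfile
    ├── docker-compose.yaml
    ├── docker-compose.gpu.yaml
    └── devcontainer.json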
Develop inside a container in VSCode on your local machine
Option 1: using `devcontainer.json` and the Remote Development extension
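In short, assuming the four files above sit in a `.devcontainer` folder, this route is a two-step affair: open the repo in VSCode and let the extension do the rest.

# from the root of the cloned repo
code .

# then, from the Command Palette (Cmd/Ctrl+Shift+P), run:
#   Remote-Containers: Reopen in Container
# VSCode builds the image via docker-compose.yaml, starts the
# container, and reopens the window inside it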
Option 2: running the container manually and connecting to it
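Roughly, and reusing the `icevision` service defined above, the manual route boils down to starting the container yourself and then attaching VSCode to it:

# build the image (if needed) and start the container in the background
docker-compose -f docker-compose.yaml up -d

# confirm the container is running and note its name
docker ps

# finally, from the Command Palette, run:
#   Remote-Containers: Attach to Running Container...
# and pick the container from the list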
Develop inside a container in VSCode on a remote host
This is surprisingly easy as we’ll basically go through the same steps enumerated in Option 2 above. For some reason, I could not get Option 1 (`devcontainer.json`) to work on a remote host, but the manual way is still a very effective one. It will let us unlock GPU access in no time.
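Concretely, assuming you are already on the instance (e.g. via the Remote - SSH extension) with the repo cloned, the only difference from the local flow is the extra Compose file:

# on the EC2 instance: layer the GPU overrides on top of the base file
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml up -d

# sanity check: the GPU should now be visible from inside the container
docker-compose -f docker-compose.yaml -f docker-compose.gpu.yaml exec icevision nvidia-smi

# then attach VSCode to the running container exactly as in Option 2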
Connect to a running container (on a remote host) via SSH directly
Sheik Mohamed Imran (@sheikmohdimran) shared a walkthrough of this approach on Twitter (July 18, 2021): https://t.co/2vP0kh3otI
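For completeness, here is a minimal sketch of the one-hop idea, assuming you can already SSH into the host and that the container is named `ice_container` (both the user and the container name below are placeholders):

# -t forces a pseudo-terminal, which the interactive shell needs
ssh -t ubuntu@<ec2-public-ip> docker exec -it ice_container bash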