Keypoint Detection with IceVision: my first contribution to open-source

If you haven’t contributed to an open-source library yet, you have no idea how much you are missing.

Over the last couple of months, I had the privilege of contributing to the amazing IceVision library, a labor of love and passion from Farid Hassainia and Lucas Vazquez. Farid and Lucas, on top of being bright Machine Learning and Deep Learning experts, are also just genuinely kind, open-minded, and welcoming guys, working super hard to create an inclusive and vibrant community on their Discord server. Join it! Trust me, it is totally worth it.

What is IceVision?

Copy-pasting from the official documentation:

IceVision is an Object-Detection Framework that connects to different libraries/frameworks such as Fastai, Pytorch Lightning, and Pytorch with more to come.
Features a Unified Data API with out-of-the-box support for common annotation formats (COCO, VOC, etc.)
The IceData repo hosts community maintained parsers and custom datasets
Provides flexible model implementations with pluggable backbones
Helps researchers reproduce, replicate, and go beyond published models
Enables practitioners to get moving with object detection technology quickly

The slide below (one of Farid’s works of art, also shamelessly stolen from the docs) is a concise, superbly condensed version of the previous bullet points.

In a nutshell, IceVision checks all the boxes for the Computer Vision library you are looking for. If you have ever worked on an Object Detection (or related) task, you are fully aware of how painful it is to get it done. You generally have to write a lot of boilerplate code (almost) from scratch to:

parse the dataset to turn it into a form your model likes
visualize images and annotations (and predictions when the time comes)
pick a model, load it and train it effectively (with all the additional headaches coming with it)

IceVision has your back, making it really smooth to iterate over the above three points in a reproducible and concise way. After using (and contributing to) it, I cannot recommend it enough.
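
To give a flavor of the first kind of boilerplate listed above (parsing), here is a minimal sketch in plain Python. It is not IceVision's API: the `ParsedAnnotation` class and the `parse_coco_annotation` and `labelled_keypoints` helpers are hypothetical names made up for illustration. The only thing it assumes is the COCO annotation layout, where a box is stored as `[x, y, width, height]` and keypoints as a flat `[x1, y1, v1, x2, y2, v2, ...]` list.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ParsedAnnotation:
    """Hypothetical container for illustration, not an IceVision class."""
    bbox_xyxy: Tuple[float, float, float, float]
    keypoints: List[Tuple[float, float, int]]  # (x, y, visibility)


def parse_coco_annotation(ann: Dict) -> ParsedAnnotation:
    # COCO stores boxes as [x, y, width, height]; many detection models
    # expect corner coordinates [x_min, y_min, x_max, y_max] instead.
    x, y, w, h = ann["bbox"]
    bbox_xyxy = (x, y, x + w, y + h)

    # COCO stores keypoints as a flat list [x1, y1, v1, x2, y2, v2, ...],
    # where v is 0 (not labelled), 1 (labelled but not visible) or 2 (visible).
    flat = ann.get("keypoints", [])
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    keypoints = [
        (flat[i], flat[i + 1], int(flat[i + 2]))
        for i in range(0, len(flat), 3)
    ]
    return ParsedAnnotation(bbox_xyxy, keypoints)


def labelled_keypoints(parsed: ParsedAnnotation) -> List[Tuple[float, float, int]]:
    # Keep only the keypoints that were actually annotated (v > 0).
    return [kp for kp in parsed.keypoints if kp[2] > 0]
```

Multiply this by every annotation format, add visualization and training code on top, and it becomes clear why having a library handle it for you is such a relief.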

Keypoint Detection (aka my contribution to IceVision)

Without further ado, I am very excited to unveil what Farid and Lucas helped me achieve in the last month of work: Keypoint Detection. The work involved implementing the entire end2end pipeline, from data parsers to visualization, and of course modeling and inference. Overall, it has been an incredible experience. I learned a tremendous amount, from digging into ML science to handling practical software development. For me, the latter was the most important skill to acquire, as I had never really had the chance to participate in a large-scale project like this.

I mainly come from business-oriented data science environments, the kind of places where software development best practices are generally overlooked in favor of speedier deliverables. I tried to compensate for this gap in my skillset by working on programming-heavy weekend gigs, but since those were mostly solo projects, there was no way to reproduce the complex scenarios a team effort comes with (Git being a very good example of that). I had long thought about open source as a potential solution to my problem. To be honest, I had never found the courage to actually make the leap and get my hands dirty. Until I stumbled upon a library Lucas had started working on, which he briefly told me about on the fastai forums. That was it. That was my chance.

Once again, it took me a while before taking the step of forking the repo and kickstarting my adventure, but at some point (around 2 months ago) I did it. It can be very intimidating to land in the middle of an unknown repo and try to get something done. My suggestion: start from open GitHub issues. Some of them are quite accessible (Lucas and Farid did a great job of flagging those). I committed to solving a couple of issues before engaging in a bigger thing (aka keypoints), and it turned out to be a wise decision. Taking this path allowed me to explore the library and get more comfortable with the code base, to the point where stepping up my efforts and jumping onto a bigger scope was the next logical thing to do. All in all, a crazy and exciting ride!

But let’s get back on track to the new keypoints functionality. How does it fit into the IceVision framework? The best way to show that is to dive into the embedded notebook below, which provides an end2end tutorial on the (fastai) BIWI dataset. If you are looking for something more challenging, feel free to check out this other Colab notebook (not as thoroughly commented as the first one) showcasing IceVision in action on the OCHuman dataset. For a sneak peek, here are some samples showing the model’s predictions. It is quite interesting to notice how the network (far from being perfect) is able to build abstractions and spot humans (together with their keypoints) even when they were not explicitly labelled in the ground-truth image.