Whitening a black box: how to interpret a ML model

Introduction

I find the domain of interpretable ML incredibly fascinating. Being able to open a black box and understand why it made the decisions it made is not only mind-blowing in its own right but also extremely useful in practice. Especially in a business environment, where the vast majority of ML models are developed and deployed, being able to answer why a prediction looks the way it does is of paramount importance. Most stakeholders don’t really care if some obscure metric is maximized. Of course, achieving high accuracy or AUC helps to “sell” the model, but at some point the model’s outputs need to make sense. If they don’t, the data scientist must explain why and pinpoint, in a clear and succinct way, how the most relevant features are driving the apparent nonsense. It is a matter of trust. In highly regulated domains, such as finance, failing to provide a satisfactory answer can even have legal implications.

All models, in one way or another, are built to address a business problem. Nevertheless, even in contexts where pressure from business stakeholders is not a primary concern, being able to interpret a model’s behavior should be a top priority for the data scientist. Understanding predictions helps uncover potentially hidden issues, builds confidence in the black box’s outputs, and often surfaces very valuable customer and product insights.

Given the importance of the subject, I started diving into it more than a year ago, in a first post about demystifying the random forest. Today’s write-up goes several steps further. In this one, I review a set of common techniques used to gain insight into a model’s global behavior (averaged over the entire dataset) and local behavior (explaining a specific prediction). I touch upon Partial Dependence Plots (PDP) and Individual Conditional Expectation (ICE) curves, gain- and permutation-based feature importance, together with LIME and SHAP. I also extend my exploration to XGBoost and CatBoost, on top of scikit-learn, writing wrapper functions (interpretable_ml_utils.py) in an attempt to make whitening a black box easier. The dataset I chose for my experiments is the Adult one from UCI, a binary classification problem aimed at identifying whether an individual earned more or less than $50K/year. Here is the link to the Jupyter notebook embedded below in the post.
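
To give a flavor of what this looks like in practice, here is a minimal sketch (not the actual wrapper functions from interpretable_ml_utils.py) of two of the global techniques, permutation-based feature importance and PDP/ICE, applied to the Adult dataset with scikit-learn; the random forest model and the “age” feature are purely illustrative choices.

```python
# Minimal sketch: permutation importance + PDP/ICE with scikit-learn on Adult.
# Assumes scikit-learn >= 1.0; the model and the "age" feature are illustrative.
from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# The UCI Adult dataset is also hosted on OpenML under the name "adult".
X, y = fetch_openml("adult", version=2, as_frame=True, return_X_y=True)
X = X.dropna()          # drop rows with missing values, just to keep things simple
y = y.loc[X.index]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# One-hot encode the categorical columns, pass the numeric ones through untouched.
categorical = X.select_dtypes(include=["category", "object"]).columns.tolist()
preprocess = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)
model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
])
model.fit(X_train, y_train)

# Global view #1: permutation-based feature importance, measured on held-out data.
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=42)
ranked = sorted(zip(X.columns, perm.importances_mean), key=lambda t: -t[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.4f}")

# Global view #2: partial dependence plus ICE curves for a single feature.
PartialDependenceDisplay.from_estimator(model, X_test, ["age"], kind="both")
```

The notebook and the wrapper functions walk through the same ideas, plus LIME and SHAP for local explanations, across scikit-learn, XGBoost, and CatBoost models.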

Last but not least, a huge shout-out to Christoph Molnar, whose amazing book, Interpretable Machine Learning, is the source of most of the material I built upon. Thanks!

Notebook
