Whitening a black box: how to interpret a ML model


Investigated methods

  • Awesome blog post on Interpreting a ML model: here
  • Adult dataset description: here
  • Adult dataset modeling (R): here

Still unstable methods

  • Accumulated Local Effects (ALE) Plots (pip version not up to date; the author admits the library needs more work): theory, ALEPython
  • Scoped Rules (Anchors) from the creators of LIME (pip version not up to date; issues with outdated dependencies, e.g. sklearn): theory, GitHub

Note

I really struggled to get the sklearn, XGBoost and CatBoost APIs to work smoothly with the PDPBox, eli5, lime and shap packages. Some accept only pd.DataFrame but then the explanation part is compatible with np.array only, or the other way around. Also, not all classifiers output probabilities by default. XGBoost, for instance, produces a raw score which needs to be fed through a sigmoid function to be converted into a probability. Nothing dramatic per se; it just means some extra care must be taken when turning to those methods.

For the time being, I managed to get both sklearn and XGBoost to work with all the relevant explanation-related packages, with the caveat that PDPBox only accepts models trained on pd.DataFrame. For CatBoost, instead, I could not get a lime pipeline to work, even though shap does just fine.
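As an aside on the XGBoost score-vs-probability point above, this is roughly the conversion involved (a minimal sketch; clf and X are hypothetical placeholders for a fitted xgb.XGBClassifier and its feature matrix):

import numpy as np
from scipy.special import expit   # the sigmoid function

margins = clf.predict(X, output_margin=True)   # raw, untransformed scores (log-odds)
probs = expit(margins)                         # sigmoid turns them into probabilities

# for a binary classifier this matches the positive-class column of predict_proba
assert np.allclose(probs, clf.predict_proba(X)[:, 1])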

References

Molnar, Christoph. "Interpretable machine learning. A Guide for Making Black Box Models Explainable", 2019. 
https://christophm.github.io/interpretable-ml-book/.

Imports

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

from interpretable_ml_utils import *
shap.initjs()

path = Path("C:/Users/francesco.pochetti/Notebooks/data/Varied")

Loading and processing the UCI adult dataset

In [2]:
fname = path/"adult.feather"
if not os.path.isfile(fname):
    cols = ['age', 'workclass', 'fnlwgt', 'education', 'education_num', 
            'marital_status', 'occupation', 'relationship', 'race', 'sex', 
            'capital_gain', 'capital_loss', 'hours_per_week', 'native_country', 'income']
    df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data', 
                     names=cols)
    df.to_feather(fname)
else: 
    df = pd.read_feather(fname)
In [3]:
df.shape
Out[3]:
(32561, 15)
In [4]:
df.head()
Out[4]:
age workclass fnlwgt education education_num marital_status occupation relationship race sex capital_gain capital_loss hours_per_week native_country income
0 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
1 50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
2 38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
3 53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
4 28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba <=50K
In [5]:
missing_values(df)
Out[5]:
Total Percent
In [6]:
features = ['age', 'workclass', 'education', 'marital_status', 'occupation', 
            'relationship', 'race', 'sex', 'hours_per_week', 'native_country',
            'capital_gain', 'capital_loss']
In [7]:
for c in features+['income']:
    if not is_numeric_dtype(df[c]):
        df[c] = df[c].str.strip()
        df = reduce_cardinality(df, c)
In [8]:
raw_train, raw_valid = train_test_split(df[features+['income']], train_size=0.8, random_state=1234)
raw_train.shape, raw_valid.shape
Out[8]:
((26048, 13), (6513, 13))
In [9]:
display_all(raw_train.head())
29187 23535 14450 2727 18474
age 31 59 21 32 20
workclass Private ? Private Private Private
education HS-grad Bachelors HS-grad HS-grad Some-college
marital_status Divorced Married-civ-spouse Never-married Never-married Never-married
occupation Other-service ? Adm-clerical Adm-clerical Adm-clerical
relationship Unmarried Husband Not-in-family Not-in-family Own-child
race White White White White White
sex Female Male Female Female Female
hours_per_week 99 40 40 40 20
native_country United-States United-States United-States United-States United-States
capital_gain 0 0 0 0 0
capital_loss 0 0 0 0 0
income >50K >50K <=50K <=50K <=50K

One-hot-encoding categories (needed for PDPs) and splitting data into train/validation

In [10]:
cats = train_cats(raw_train)  
X_train, y_train, nas = proc_df(raw_train, y_fld='income', max_n_cat=15)

apply_cats(raw_valid, raw_train)
X_valid, y_valid, _ = proc_df(raw_valid, y_fld='income', na_dict=nas, max_n_cat=15)
feats = list(X_train.columns)

raw_train.shape, raw_valid.shape, X_train.shape, X_valid.shape
Out[10]:
((26048, 13), (6513, 13), (26048, 56), (6513, 56))
In [11]:
cats['workclass']
Out[11]:
['?', 'Local-gov', 'Private', 'Self-emp-not-inc', 'State-gov', 'UNK']

Training model on one-hot-encoded data

In [12]:
clf = xgb.XGBClassifier(learning_rate=0.03, n_estimators=300, n_jobs=-1)
clf.fit(X_train, y_train)
_ = print_report(clf, X_valid, y_valid, t=0.5, cal=False, roc=False, 
                 conf=False, classes=cats['income'])

#clf = RandomForestClassifier(n_estimators=150, min_samples_leaf=5, n_jobs=-1)
#clf.fit(X_train, y_train)
#_ = print_report(clf, X_valid, y_valid, t=0.5, cal=False, roc=False, 
#                 conf=False, classes=cats['income'])
AUC valid: 0.9069511545106526 
F1 valid: 0.649270913277053
              precision    recall  f1-score   support

           0       0.87      0.95      0.91      4985
           1       0.78      0.55      0.65      1528

    accuracy                           0.86      6513
   macro avg       0.83      0.75      0.78      6513
weighted avg       0.85      0.86      0.85      6513

Permutation-based feature importance

The questions this approach answers are:

  • How much does the performance of the model decrease on average when we randomly shuffle one feature at a time (the Y-axis in the chart below shows AUC)?
  • Which features affect performance the most?

Advantages (source)

  • Nice interpretation: Feature importance is the increase in model error when the feature’s information is destroyed.
  • The importance measure automatically takes into account all interactions with other features.

Disadvantages (source)

  • It is very unclear whether you should use training or test data to compute the feature importance.
  • If features are correlated, the permutation feature importance can be biased by unrealistic data instances.
  • Adding a correlated feature can decrease the importance of the associated feature by splitting the importance between both features.
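To make the procedure concrete, here is a minimal standalone sketch of the shuffling loop (not necessarily what get_permutation_imp does internally; that wrapper is the notebook's own):

import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def permutation_importance(clf, X, y, n_repeats=3, seed=1234):
    rng = np.random.RandomState(seed)
    baseline = roc_auc_score(y, clf.predict_proba(X)[:, 1])   # AUC with intact data
    drops = {}
    for col in X.columns:
        scores = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            X_shuffled[col] = rng.permutation(X_shuffled[col].values)  # destroy one feature's information
            scores.append(roc_auc_score(y, clf.predict_proba(X_shuffled)[:, 1]))
        drops[col] = baseline - np.mean(scores)                # average AUC decrease
    return pd.Series(drops).sort_values(ascending=False)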
In [13]:
try: imp = get_permutation_imp(clf, X_train, y_train, feats=feats)
except: imp = get_permutation_imp(clf, X_train.values, y_train, feats=feats) 
feature weight std
0 marital_status_Married-civ-spouse 0.108362 0.001111
1 capital_gain 0.049074 0.000693
2 age 0.037389 0.001519
3 hours_per_week 0.013571 0.000667
4 capital_loss 0.008937 0.000274
5 occupation_Exec-managerial 0.005758 0.000382
6 education_Bachelors 0.005716 0.000193
7 occupation_Prof-specialty 0.004898 0.000319
8 occupation_Other-service 0.002495 0.000199
9 education_Prof-school 0.002048 0.000153
10 education_Masters 0.001880 0.000165
11 relationship_Own-child 0.001627 0.000279
12 education_HS-grad 0.001546 0.000145
13 occupation_Farming-fishing 0.001539 0.000127
14 education_7th-8th 0.001523 0.000194

Gain-based feature importance

Methodology currently implemented in scikit-learn. Source SO

It is sometimes called "gini importance" or "mean decrease impurity" and it is defined as the total decrease in node impurity (weighted by the probability of reaching that node (which is approximated by the proportion of samples reaching that node)) averaged over all trees of the ensemble

Advantages

Compared to permutation-based feature importance, this method comes at no additional cost (the gains are directly calculated as part of training) and does not suffer from the drawbacks of having potentially correlated features, as datapoints are not shuffled.

Disadvantages

Probably the biggest disadvantage is the fact that the absolute importances are less interpretable, compared to the more concise and transparent increase in error rate we had before.

Note

As expected, the two illustrated feature importance methods yield slightly different results in terms of the ordering of the top variables.
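For reference, the gain-based importances are exposed directly by the fitted model; a rough equivalent of the table below (an assumption about what get_styled_feat_importance reads out):

import pandas as pd

# impurity/gain-based importances computed during training, no extra passes over the data
gain_imp = (pd.Series(clf.feature_importances_, index=feats)
              .sort_values(ascending=False)
              .head(15))
gain_imp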

In [14]:
get_styled_feat_importance(clf, feats, 15)
Out[14]:
Weight Feature
0.2949 marital_status_Married-civ-spouse
0.0747 occupation_Prof-specialty
0.0608 capital_gain
0.0543 occupation_Exec-managerial
0.0396 education_Bachelors
0.0365 education_Masters
0.0316 occupation_Other-service
0.0267 hours_per_week
0.0263 education_HS-grad
0.0241 relationship_Own-child
0.0240 capital_loss
0.0220 age
0.0214 education_Prof-school
0.0213 occupation_Farming-fishing
0.0193 education_7th-8th
… 41 more …

Partial Dependency Plots (PDP) + Individual Conditional Expectation (ICE)

The idea behind PDPs is to inspect the relationship between the dependent variable and a feature of interest, in a way that isolates the dependence of the output on that input, all other things being equal.

This is achieved in the following way:

  1. For a numerical variable, select N values which split the feature into quantile-based or equally spaced intervals. For a categorical variable, it suffices to identify the unique levels present in the dataset.
  2. For each one of the N values, e.g. x:
    1. Replace the entire feature in the dataset with x, keeping the other variables fixed.
    2. Run the newly created dataset through the model and record predictions.
  3. Plot predictions (Y-axis) over the N values in #1 (X-axis).

Say that your dataset is composed of n data points. Each time you run 2B, you get n predictions, i.e. you end up with N x n predictions at the end of the procedure.

Now, you have two options:

  1. Plotting n lines over N data points on the X-axis. This is what ICE plots look like.
  2. Plotting 1 average line over N data points on the X-axis. This is what PDP plots look like.

The charts below show both all lines and the average at the same time.
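Before looking at the PDPBox output, a minimal sketch of the recipe above for a numeric feature (a hypothetical standalone function, not what PDPBox does internally):

import numpy as np

def ice_and_pdp(clf, X, feature, n_grid=20):
    # step 1: grid of N values spanning the feature (quantile-based here)
    grid = np.quantile(X[feature], np.linspace(0.05, 0.95, n_grid))
    ice = []
    for value in grid:
        X_mod = X.copy()
        X_mod[feature] = value                         # step 2A: force the feature to x
        ice.append(clf.predict_proba(X_mod)[:, 1])     # step 2B: n predictions per grid value
    ice = np.column_stack(ice)                         # shape (n, N): one ICE line per instance
    pdp = ice.mean(axis=0)                             # the PDP is the average of the ICE lines
    return grid, ice, pdp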

Advantages (source, source)

  • The computation of partial dependence plots is intuitive: the partial dependence function at a particular feature value represents the average prediction if we force all data points to assume that feature value.
  • If the feature for which you computed the PDP is not correlated with the other features, then the PDPs perfectly represent how the feature influences the prediction on average. In the uncorrelated case, the interpretation is clear.
  • Unlike partial dependence plots, ICE curves can uncover heterogeneous relationships.

Disadvantages (source, source)

  • The realistic maximum number of features in a partial dependence function is two.
  • The assumption of independence is the biggest issue with PD plots.
  • To meaningfully plot both PDPs and ICE for categorical variables, the variable must be one-hot-encoded. This is a constraint as tree-ensemble algorithms (the de-facto go-to solution in the non-deep-learning world) can easily handle features which are not encoded in this way.

WARNING! As you can see from the first plot below (capital_gain), the technique does not work well for highly skewed features. I could not get PDPBox to restrict the range of the variable in question by cutting off the highest quantiles.

In [15]:
plot_pdp(clf, X_valid.sample(2000), feats, 'capital_gain')
In [16]:
plot_pdp(clf, X_valid.sample(2000), feats, 'age', center=False)
In [17]:
plot_pdp(clf, X_valid.sample(2000), feats, 'hours_per_week')
In [18]:
cat = 'marital_status'
f = [c for c in feats if c.startswith(cat)]
if len(f)>1: plot_pdp(clf, X_valid.sample(2000), feats, f, classes=cats[cat], plot_params={'title': f'PDP for feature {cat}'}, center=False)
In [19]:
cat = 'occupation'
f = [c for c in feats if c.startswith(cat)]
if len(f)>1: plot_pdp(clf, X_valid.sample(2000), feats, f, classes=cats[cat], plot_params={'title': f'PDP for feature {cat}'}, center=False)

Training model on NOT one-hot-encoded data and...

In [20]:
cats = train_cats(raw_train)
X_train, y_train, nas = proc_df(raw_train, y_fld='income', max_n_cat=None)

apply_cats(raw_valid, raw_train)
X_valid, y_valid, _ = proc_df(raw_valid, y_fld='income', na_dict=nas, max_n_cat=None)
feats = list(X_train.columns)

raw_train.shape, raw_valid.shape, X_train.shape, X_valid.shape
Out[20]:
((26048, 13), (6513, 13), (26048, 12), (6513, 12))

Notice that in the cell below, in the case of XGBoost, we call clf.fit(X_train.values, y_train), i.e. we fit on np.arrays and not on a pd.DataFrame. This is because the downstream packages (lime, eli5, shap) are not always compatible with pandas when paired with XGBoost or CatBoost. The sklearn API is always compatible with pandas, though.

In [21]:
clf = xgb.XGBClassifier(learning_rate=0.03, n_estimators=300, n_jobs=-1)
clf.fit(X_train.values, y_train)
_ = print_report(clf, X_valid.values, y_valid, t=0.5, cal=False, roc=False, 
                 conf=False, classes=cats['income'])

#clf = RandomForestClassifier(n_estimators=150, min_samples_leaf=5, n_jobs=-1)
#clf.fit(X_train, y_train)
#_ = print_report(clf, X_valid, y_valid, t=0.5, cal=False, roc=False, 
#                 conf=False, classes=cats['income'])
AUC valid: 0.9047545778697349 
F1 valid: 0.6396327467482785
              precision    recall  f1-score   support

           0       0.87      0.95      0.91      4985
           1       0.77      0.55      0.64      1528

    accuracy                           0.86      6513
   macro avg       0.82      0.75      0.77      6513
weighted avg       0.85      0.86      0.85      6513

... showing feature importances as before

In [22]:
imp = get_permutation_imp(clf, X_train.values, y_train, feats=feats)
feature weight std
0 capital_gain 0.051877 0.001179
1 marital_status 0.048065 0.001179
2 age 0.040742 0.000520
3 education 0.021083 0.000325
4 occupation 0.016995 0.000420
5 hours_per_week 0.013954 0.000576
6 relationship 0.010013 0.000934
7 capital_loss 0.009647 0.000272
8 sex 0.002908 0.000149
9 workclass 0.001521 0.000091
10 native_country 0.000271 0.000014
11 race 0.000152 0.000030
In [23]:
get_styled_feat_importance(clf, feats)
Out[23]:
Weight Feature
0.3290 relationship
0.1466 marital_status
0.1243 capital_gain
0.0928 occupation
0.0592 education
0.0551 hours_per_week
0.0540 capital_loss
0.0494 age
0.0404 sex
0.0260 workclass
0.0152 native_country
0.0079 race

Features contributions to individual predictions: the treeinterpreter approach

The following approach is not known in the community under a specific name and it does not come with a paper.

Even though it is not widely advertised, it is incredibly powerful, and it has been integrated (in an independent implementation) into the eli5 library.

Ando Saabas proposed the method in this series of posts. The idea is that we can decompose the contribution of each feature to a specific prediction by following the path from the root of the tree down to the leaf where the instance of interest ends up. This is done by recording, at each split of the feature space, the difference in the proportion of the positive class (in case of classification) between the parent and the child node, and assigning that difference to the variable the split is made on. If we start from the root, the percentage of the positive class simply represents the average across the entire training set. Let's assume this is 70%. Now assume that the first split happens on feature X_1, that our instance of interest sits on the right branch of the tree and that within this subset the percentage of the positive class is 80%. Then we can safely state that X_1 drives an increase in probability of 10 percentage points. Imagine doing this all the way down to the leaf, and averaging the results across all trees (in case of a forest).

This is how the method works in a nutshell.

The contributions, by definition, add up to the final model's outcome, making them easy to interpret and to visualize. Both a tabular form, as eli5 proposes, and a more business-oriented waterfall chart, are nice ways of doing so.

Note

This method relies on the metrics which are actually computed at each tree node.

In the case of XGBoost, the algorithm calculates a score which cannot be directly interpreted as a probability, as happens in a RandomForest, so beware of the numbers eli5 displays.
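For reference, one way to obtain a similar contribution table directly with eli5 (an assumption about what explain_pred_contrib wraps; the wrapper itself is the notebook's own):

import eli5

eli5.show_prediction(clf,
                     X_valid.loc[id].values,      # the single instance to explain
                     feature_names=feats,         # map column indices back to names
                     show_feature_values=True)    # also display the raw feature values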

In [24]:
# id = X_valid.sample().index.values[0]

# fixing the ID value for illustration purposes
id = 20150 
explain_pred_contrib(id, clf, X_valid, feats, cats=cats, waterfall={'rotation_value':60, 'threshold': None})
Prediction explanation for ID: 20150; Probability of event (y=1): 0.6039999723434448
Model used: <class 'xgboost.sklearn.XGBClassifier'>
Out[24]:

y=1 (probability 0.604, score 0.421) top features

Contribution Feature Value
+0.549 relationship: Husband 0.000
+0.434 marital_status: Married-civ-spouse 1.000
+0.393 education: Masters 7.000
+0.327 age 48.000
+0.296 occupation: Exec-managerial 3.000
+0.026 workclass: Private 2.000
+0.006 native_country: United-States 6.000
+0.003 race: White 4.000
-0.028 sex: Male 1.000
-0.060 hours_per_week 40.000
-0.078 capital_loss 0.000
-0.172 capital_gain 0.000
-1.275 <BIAS> 1.000

Local interpretable model-agnostic explanations (LIME)

The principle behind LIME is based on fitting a local explainable model (e.g. a linear regression) on the predictions of the black-box model. First thing to notice: LIME does not fit on the actuals, but on the black-box model's outputs. Those are the ones we have to explain, after all.

The recipe for training local surrogate models:

  • Select your instance of interest for which you want to have an explanation of its black box prediction.
  • Perturb your dataset and get the black box predictions for these new points.
  • Weight the new samples according to their proximity to the instance of interest.
  • Train a weighted, interpretable model on the dataset with the variations.
  • Explain the prediction by interpreting the local model.

Source
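A minimal sketch of this recipe with the lime library (a simplified, hypothetical version of the get_lime_explainer / explain_pred_lime wrappers used below; for label-encoded categoricals one would normally also pass categorical_features and categorical_names):

from lime.lime_tabular import LimeTabularExplainer

# build the explainer on the training data distribution
explainer = LimeTabularExplainer(X_train.values,
                                 feature_names=feats,
                                 class_names=['<=50K', '>50K'],
                                 discretize_continuous=True)

# explain one validation instance: perturb it, query the black box, fit a weighted linear model
exp = explainer.explain_instance(X_valid.loc[id].values,
                                 clf.predict_proba,
                                 num_features=5)
exp.show_in_notebook()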

Advantages (source)

  • Explanations are short (= selective) and possibly contrastive.
  • It works for tabular data, text and images.
  • The fidelity measure (how well the interpretable model approximates the black box predictions) gives us a good idea of how reliable the interpretable model is.

Disadvantages (source)

  • The correct definition of the neighborhood is a very big, unsolved problem when using LIME with tabular data. In my opinion it is the biggest problem with LIME and the reason why I would recommend to use LIME only with great care. [...] LIME currently uses an exponential smoothing kernel to define the neighborhood. A smoothing kernel is a function that takes two data instances and returns a proximity measure. The kernel width determines how large the neighborhood is: A small kernel width means that an instance must be very close to influence the local model, a larger kernel width means that instances that are farther away also influence the model. If you look at LIME’s Python implementation (file lime/lime_tabular.py) you will see that it uses an exponential smoothing kernel (on the normalized data) and the kernel width is 0.75 times the square root of the number of columns of the training data.
  • Another really big problem is the instability of the explanations. In an article the authors showed that the explanations of two very close points varied greatly in a simulated setting.

Conclusion: Local surrogate models, with LIME as a concrete implementation, are very promising. But the method is still in development phase and many problems need to be solved before it can be safely applied.

WARNING! The below explanation makes, unfortunately, no sense, showing LIME's limitations. I decided to keep it for illustration purposes.

When asking to explain an instance, the LIME library produces 3 outputs (see below):

  • Left (Output Probabilities): these are the probabilities of the event (>50K) and the non-event (<=50K) occurring.
  • Center: these are the top 5 reasons behind the prediction. As also discussed in this tutorial, the purpose of the bar chart is to show which variables are pushing the prediction in the direction of which class. In the case of the first blue capital_gain bar extending to the left of the black vertical line, if capital_gain were not <=0 (it is 0 for this specific data point), the probability of the event >50K would be 0.63 higher. Differently stated, capital_gain<=0 pushes the prediction towards the non-event by 0.63. The second, orange bar can instead be interpreted as: if marital_status had been different from Married-civ-spouse, the probability of the event would be 0.13 lower, so this feature is acting in favour of >50K. Note that this explanation contradicts the prediction that the person earns more than 50K with probability 60%. Another resource to look at to understand how to interpret those numbers is the official LIME GitHub repo: the screenshots section provides a couple of interesting examples.
  • Right: those are just the actual values the top 5 features hold for the instance in question.

An interesting fact to call out is the significant difference between the LIME local prediction and the black-box model prediction. There is no guarantee the two will be equal, but such a big gap should hold us back from trusting the explanations, as they might be inconsistent, as is actually the case here.

In [25]:
explainer = get_lime_explainer(X_train, cats, feats, kernel_width=None)
In [26]:
exp = explain_pred_lime(id, clf, explainer, X_valid, num_features=5)
Model used: <class 'xgboost.sklearn.XGBClassifier'>
LIME local prediction: 0.2846695468893861; Model prediction: 0.6038179397583008; R2 score: 0.6850372854111757

Fiddling around with kernel_width to try to get better explanations...

If we reduce the kernel_width to 0.5, things improve considerably: the LIME local prediction moves much closer to the model's prediction.

In [27]:
explainer = get_lime_explainer(X_train, cats, feats, kernel_width=0.5)
exp = explain_pred_lime(id, clf, explainer, X_valid, num_features=5)
Model used: <class 'xgboost.sklearn.XGBClassifier'>
LIME local prediction: 0.4140323350257734; Model prediction: 0.6038179397583008; R2 score: 0.3646826645193938
In [28]:
#if 'XGB' in str(type(clf)): print(measure_lime_accuracy(clf, explainer, X_valid.sample(2000), num_features=5))

SHAP (SHapley Additive exPlanations)

Quotes taken from here and here

Goal

The goal of SHAP is to explain the prediction of an instance x by computing the contribution of each feature to the prediction. The SHAP explanation method computes Shapley values from coalitional game theory.[...] A prediction can be explained by assuming that each feature value of the instance is a “player” in a game where the prediction is the payout. Shapley values [...] tells us how to fairly [e.g. in an additive fashion] distribute the “payout” among the features.

Conditions Shapley values meet

source Consider the following scenario: a group of people are playing a game. As a result of playing this game, they receive a certain reward; how can they divide this reward between themselves in a way which reflects each of their contributions? There are a few things which everyone can agree on; meeting the following conditions will mean the game is fair according to Shapley values:

  • [Efficiency] The sum of what everyone receives should equal the total reward
  • [Symmetry] If two people contributed the same value, then they should receive the same amount from the reward
  • [Dummy] Someone who contributed no value should receive nothing
  • [Additivity] If the group plays two games, then an individual’s reward from both games should equal their reward from their first game plus their reward from the second game

Intuition

An intuitive way to understand the Shapley value is the following illustration: The feature values enter a room in random order. All feature values in the room participate in the game (= contribute to the prediction). The Shapley value of a feature value is the average change in the prediction that the coalition already in the room receives when the feature value joins them.

source Shapley values calculate the importance of a feature by comparing what a model predicts with and without the feature. However, since the order in which a model sees features can affect its predictions, this is done in every possible order, so that the features are fairly compared.
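For completeness, this averaging over coalitions corresponds to the standard Shapley value formula (standard game-theory notation, not quoted verbatim from the sources above), where F is the full feature set, S a coalition of features not containing j, and val(S) the prediction obtained using only the features in S:

$$\phi_j = \sum_{S \subseteq F \setminus \{j\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\,\big(\mathrm{val}(S \cup \{j\}) - \mathrm{val}(S)\big)$$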

Interpretation

The interpretation of the Shapley value for feature value j is: The value of the j-th feature contributed ϕj to the prediction of this particular instance compared to the average prediction for the dataset.

Be careful to interpret the Shapley value correctly: The Shapley value is the average contribution of a feature value to the prediction in different coalitions. The Shapley value is NOT the difference in prediction when we would remove the feature from the model.

What does the SHAP package offer?

  1. Single instance force plots

    Each feature value is a force that either increases or decreases the prediction. The prediction starts from the baseline. The baseline for Shapley values is the average of all predictions. In the plot, each Shapley value is an arrow that pushes to increase (positive value) or decrease (negative value) the prediction. These forces balance each other out at the actual prediction of the data instance.

  2. Aggregated SHAP-based feature importance: sort features based on mean absolute Shapley values.
  3. Instance-level SHAP-based feature importance: sort features by aggregated SHAP-based importance and show instance-granular Shapley values dispersed on the y-axis.
  4. Clustering instances based on Shapley values
  5. Dependence plots: showing the relationship between the Shapley values of feature j and the values of feature j itself, with the additional possibility of colour-coding by the value of a second variable.

Advantages (source)

  • The difference between the prediction and the average prediction is fairly distributed among the feature values of the instance [i.e. it must add up to that difference] – the Efficiency property of Shapley values. This property distinguishes the Shapley value from other methods such as LIME. LIME does not guarantee that the prediction is fairly distributed among the features.
  • The Shapley value is the only explanation method with a solid theory. The axioms – efficiency, symmetry, dummy, additivity – give the explanation a reasonable foundation. Methods like LIME assume linear behavior of the machine learning model locally, but there is no theory as to why this should work.

Disadvantages (source)

  • The Shapley value requires a lot of computing time.
  • The Shapley value can be misinterpreted. The Shapley value of a feature value is not the difference of the predicted value after removing the feature from the model training. The interpretation of the Shapley value is: Given the current set of feature values, the contribution of a feature value to the difference between the actual prediction and the mean prediction is the estimated Shapley value.
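A sketch of the raw shap calls that the get_shap_* helpers used below presumably wrap (an assumption), covering points 2, 3 and 5 of the list above for the XGBoost model:

import shap

# TreeExplainer is the fast, exact implementation for tree ensembles
explainer = shap.TreeExplainer(clf)
shap_vals = explainer.shap_values(X_valid)              # one row of Shapley values per instance

shap.summary_plot(shap_vals, X_valid, plot_type='bar')  # 2. mean(|SHAP|) feature importance
shap.summary_plot(shap_vals, X_valid)                   # 3. instance-level dispersion per feature
shap.dependence_plot('age', shap_vals, X_valid,         # 5. SHAP values of age vs age,
                     interaction_index='hours_per_week')#    coloured by hours_per_week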
In [29]:
shap_explainer = get_shap_explainer('tree', clf)
shap_values = get_shap_values(shap_explainer, X_valid)
Model used: <class 'xgboost.sklearn.XGBClassifier'>

SHAP force plots

The red "forces" in the below plot push the probability of the event high, whereas the "blue" ones work against them pushing in the opposite direction. Starting from the base_value, all forces add up to the final prediction.

In [30]:
# `link='logit'` only for classifiers such as XGBoost not producing probs directly
show_shap(shap_explainer.expected_value, shap_values, X_valid, id, cats=cats, 
          link='logit', matplotlib=True, feats=feats, clf=clf)
Printing top/bottom 5 features by SHAP values
Predicted probability of event: 0.6038179397583008
SHAP value Feat value
marital_status 0.488455 Married-civ-spouse
relationship 0.447789 Husband
education 0.373692 Masters
age 0.368265 48
occupation 0.320931 Exec-managerial
... ... ...
workclass -0.006547 Private
sex -0.010044 Male
hours_per_week -0.064090 40
capital_loss -0.066898 0
capital_gain -0.162112 0

12 rows × 2 columns

Training a CatBoost classifier

In [31]:
cat_cols = [c for c in cats.keys() if c!='income']
In [32]:
train_data = Pool(
    data=X_train,
    label=y_train,
    cat_features=cat_cols
)
holdout_data = Pool(
    data=X_valid,
    label=y_valid,
    cat_features=cat_cols
)
In [33]:
clf_c = CatBoostClassifier(
    iterations=100,
    random_seed=random_state,
    learning_rate=0.1,
    custom_loss=['AUC']
)

_ = clf_c.fit(train_data, eval_set=holdout_data, plot=True, verbose=False)

#clf_c.fit(X_train, y_train, cat_features=cat_cols, eval_set=(X_valid, y_valid), plot=True, verbose=False)
In [34]:
_ = print_report(clf_c, X_valid.values, y_valid, t=0.5, cal=False, 
                 roc=False, conf=False, classes=cats['income'])
AUC valid: 0.9094371465180883 
F1 valid: 0.6729996346364633
              precision    recall  f1-score   support

           0       0.89      0.94      0.91      4985
           1       0.76      0.60      0.67      1528

    accuracy                           0.86      6513
   macro avg       0.82      0.77      0.79      6513
weighted avg       0.86      0.86      0.86      6513

In [35]:
shap_explainer = get_shap_explainer('tree', clf_c)
shap_values = get_shap_values(shap_explainer, holdout_data)
Model used: <class 'catboost.core.CatBoostClassifier'>
In [36]:
shap_imp = get_shap_feat_importance(shap_values, holdout_data, feats)
In [37]:
get_shap_feat_importance(shap_values, X_valid)
In [38]:
show_shap(shap_explainer.expected_value, shap_values, X_valid, id, cats=cats, 
          matplotlib=True, link='logit', feats=feats, clf=clf_c)
Printing top/bottom 5 features by SHAP values
Predicted probability of event: 0.7942827265979207
SHAP value Feat value
marital_status 1.147313 Married-civ-spouse
education 0.853373 Masters
occupation 0.812718 Exec-managerial
age 0.691886 48
sex 0.108034 Male
... ... ...
workclass 0.008833 Private
race 0.007692 White
hours_per_week -0.042783 40
capital_loss -0.062791 0
capital_gain -0.206041 0

12 rows × 2 columns

In [39]:
clf_c.predict_proba(X_valid.loc[id].values)[1]
Out[39]:
0.7942827265979207

Checking additivity of Shapley values

In [40]:
assert_shap_proba(clf_c, shap_explainer, shap_values, X_valid, id)
Out[40]:
(0.7942827265979204, 0.7942827265979207)
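A sketch of what assert_shap_proba presumably checks (an assumption): for CatBoost, expected_value and the Shapley values live in log-odds space, so pushing their sum through the sigmoid should reproduce the model's predicted probability (the Efficiency property):

from scipy.special import expit   # the sigmoid function

row = X_valid.index.get_loc(id)                               # positional index of the instance
reconstructed = expit(shap_explainer.expected_value + shap_values[row].sum())
model_prob = clf_c.predict_proba(X_valid.loc[id].values)[1]   # as in the cell above
reconstructed, model_prob                                     # the two should match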
In [41]:
show_shap(shap_explainer.expected_value, shap_values, X_valid, id=None)
In [42]:
show_shap(shap_explainer.expected_value, shap_values, X_valid, id=None, x="hours_per_week", 
          y='sex', disp=raw_train)