Interpretable Machine Learning is a comprehensive guide to making machine learning models interpretable.
"Pretty convinced this is the best book out there on the subject"
– Brian Lewis, Data Scientist at Cornerstone Research
Summary
This book covers a range of interpretability methods, from inherently interpretable models to methods that can make any model interpretable, such as SHAP, LIME and permutation feature importance. It also includes interpretation methods specific to deep neural networks, and discusses why interpretability is important in machine learning. All interpretation methods are explained in depth and discussed critically. How do they work under the hood? What are their strengths and weaknesses? How can their outputs be interpreted?
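To give a flavor of the model-agnostic approach: permutation feature importance shuffles one feature at a time and measures how much the model's score drops. The book's own examples build on R packages (see the table of contents below); the following is only an illustrative Python sketch using scikit-learn on synthetic data, not code from the book.

```python
# Illustrative sketch (not from the book): permutation feature importance.
# Shuffle each feature in turn and measure the drop in the test score;
# a large drop means the model relied on that feature.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# n_repeats controls how often each feature is reshuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in range(X.shape[1]):
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```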
"What I love about this book is that it starts with the big picture instead of diving immediately into the nitty gritty of the methods (although all of that is there, too)."
– Andrea Farnham, Researcher at Swiss Tropical and Public Health Institute
Who the book is for
This book is essential for machine learning practitioners, data scientists, statisticians, and anyone interested in making their machine learning models interpretable. It will help readers select and apply the appropriate interpretation method for their specific project.
"This one has been a life saver for me to interpret models. ALE plots are just too good!"
– Sai Teja Pasul, Data Scientist at Kohl's
You'll learn about
- The concepts of machine learning interpretability
- Inherently interpretable models
- Methods to make any machine learning model interpretable, such as SHAP, LIME and permutation feature importance (a short sketch follows this list)
- Interpretation methods specific to deep neural networks
- Why interpretability is important and what lies behind the concept
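As a taste of the Shapley-value material covered in the book, here is a minimal, illustrative sketch of explaining a single prediction with the third-party `shap` package (again Python rather than the book's R examples, and not code from the book):

```python
# Illustrative sketch (not from the book): attribute one prediction to
# individual features with SHAP. Requires the third-party `shap` package.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeSHAP computes exact Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])

# One additive contribution per feature; together with the expected value
# they sum to the model's raw (log-odds) prediction for this instance.
print(explainer.expected_value, shap_values)
```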
About the author
The author, Christoph Molnar, is an expert in machine learning and statistics, with a Ph.D. in interpretable machine learning.
Conditions of Use
This book is licensed under a Creative Commons License (CC BY-NC-SA). You can download the ebook Interpretable Machine Learning for free.
- Title: Interpretable Machine Learning
- Subtitle: A Guide for Making Black Box Models Explainable
- Author: Christoph Molnar
- Published: 2024-05-26
- Edition: 2
- Format: eBook (PDF, EPUB, MOBI)
- Pages: 328
- Language: English
- ASIN: B09TMWHVB4
- ISBN-13: 9798411463330
- License: CC BY-NC-SA
- Book Homepage: Free eBook, Errata, Code, Solutions, etc.
Table of Contents

Summary
1 Preface by the Author
2 Introduction
  2.1 Story Time
    Lightning Never Strikes Twice
    Trust Fall
    Fermi's Paperclips
  2.2 What Is Machine Learning?
  2.3 Terminology
3 Interpretability
  3.1 Importance of Interpretability
  3.2 Taxonomy of Interpretability Methods
  3.3 Scope of Interpretability
    3.3.1 Algorithm Transparency
    3.3.2 Global, Holistic Model Interpretability
    3.3.3 Global Model Interpretability on a Modular Level
    3.3.4 Local Interpretability for a Single Prediction
    3.3.5 Local Interpretability for a Group of Predictions
  3.4 Evaluation of Interpretability
  3.5 Properties of Explanations
  3.6 Human-friendly Explanations
    3.6.1 What Is an Explanation?
    3.6.2 What Is a Good Explanation?
4 Datasets
  4.1 Bike Rentals (Regression)
  4.2 YouTube Spam Comments (Text Classification)
  4.3 Risk Factors for Cervical Cancer (Classification)
5 Interpretable Models
  5.1 Linear Regression
    5.1.1 Interpretation
    5.1.2 Example
    5.1.3 Visual Interpretation
    5.1.4 Explain Individual Predictions
    5.1.5 Encoding of Categorical Features
    5.1.6 Do Linear Models Create Good Explanations?
    5.1.7 Sparse Linear Models
    5.1.8 Advantages
    5.1.9 Disadvantages
  5.2 Logistic Regression
    5.2.1 What is Wrong with Linear Regression for Classification?
    5.2.2 Theory
    5.2.3 Interpretation
    5.2.4 Example
    5.2.5 Advantages and Disadvantages
    5.2.6 Software
  5.3 GLM, GAM and more
    5.3.1 Non-Gaussian Outcomes - GLMs
    5.3.2 Interactions
    5.3.3 Nonlinear Effects - GAMs
    5.3.4 Advantages
    5.3.5 Disadvantages
    5.3.6 Software
    5.3.7 Further Extensions
  5.4 Decision Tree
    5.4.1 Interpretation
    5.4.2 Example
    5.4.3 Advantages
    5.4.4 Disadvantages
    5.4.5 Software
  5.5 Decision Rules
    5.5.1 Learn Rules from a Single Feature (OneR)
    5.5.2 Sequential Covering
    5.5.3 Bayesian Rule Lists
    5.5.4 Advantages
    5.5.5 Disadvantages
    5.5.6 Software and Alternatives
  5.6 RuleFit
    5.6.1 Interpretation and Example
    5.6.2 Theory
    5.6.3 Advantages
    5.6.4 Disadvantages
    5.6.5 Software and Alternative
  5.7 Other Interpretable Models
    5.7.1 Naive Bayes Classifier
    5.7.2 K-Nearest Neighbors
6 Model-Agnostic Methods
7 Example-Based Explanations
8 Global Model-Agnostic Methods
  8.1 Partial Dependence Plot (PDP)
    8.1.1 PDP-based Feature Importance
    8.1.2 Examples
    8.1.3 Advantages
    8.1.4 Disadvantages
    8.1.5 Software and Alternatives
  8.2 Accumulated Local Effects (ALE) Plot
    8.2.1 Motivation and Intuition
    8.2.2 Theory
    8.2.3 Estimation
    8.2.4 Examples
    8.2.5 Advantages
    8.2.6 Disadvantages
    8.2.7 Implementation and Alternatives
  8.3 Feature Interaction
    8.3.1 Feature Interaction?
    8.3.2 Theory: Friedman's H-statistic
    8.3.3 Examples
    8.3.4 Advantages
    8.3.5 Disadvantages
    8.3.6 Implementations
    8.3.7 Alternatives
  8.4 Functional Decomposition
    8.4.1 How not to Compute the Components I
    8.4.2 Functional Decomposition
    8.4.3 How not to Compute the Components II
    8.4.4 Functional ANOVA
    8.4.5 Generalized Functional ANOVA for Dependent Features
    8.4.6 Accumulated Local Effect Plots
    8.4.7 Statistical Regression Models
    8.4.8 Bonus: Partial Dependence Plot
    8.4.9 Advantages
    8.4.10 Disadvantages
  8.5 Permutation Feature Importance
    8.5.1 Theory
    8.5.2 Should I Compute Importance on Training or Test Data?
    8.5.3 Example and Interpretation
    8.5.4 Advantages
    8.5.5 Disadvantages
    8.5.6 Alternatives
    8.5.7 Software
  8.6 Global Surrogate
    8.6.1 Theory
    8.6.2 Example
    8.6.3 Advantages
    8.6.4 Disadvantages
    8.6.5 Software
  8.7 Prototypes and Criticisms
    8.7.1 Theory
    8.7.2 Examples
    8.7.3 Advantages
    8.7.4 Disadvantages
    8.7.5 Code and Alternatives
9 Local Model-Agnostic Methods
  9.1 Individual Conditional Expectation (ICE)
    9.1.1 Examples
    9.1.2 Advantages
    9.1.3 Disadvantages
    9.1.4 Software and Alternatives
  9.2 Local Surrogate (LIME)
    9.2.1 LIME for Tabular Data
    9.2.2 LIME for Text
    9.2.3 LIME for Images
    9.2.4 Advantages
    9.2.5 Disadvantages
  9.3 Counterfactual Explanations
    9.3.1 Generating Counterfactual Explanations
    9.3.2 Example
    9.3.3 Advantages
    9.3.4 Disadvantages
    9.3.5 Software and Alternatives
  9.4 Scoped Rules (Anchors)
    9.4.1 Finding Anchors
    9.4.2 Complexity and Runtime
    9.4.3 Tabular Data Example
    9.4.4 Advantages
    9.4.5 Disadvantages
    9.4.6 Software and Alternatives
  9.5 Shapley Values
    9.5.1 General Idea
    9.5.2 Examples and Interpretation
    9.5.3 The Shapley Value in Detail
    9.5.4 Advantages
    9.5.5 Disadvantages
    9.5.6 Software and Alternatives
  9.6 SHAP (SHapley Additive exPlanations)
    9.6.1 Definition
    9.6.2 KernelSHAP
    9.6.3 TreeSHAP
    9.6.4 Examples
    9.6.5 SHAP Feature Importance
    9.6.6 SHAP Summary Plot
    9.6.7 SHAP Dependence Plot
    9.6.8 SHAP Interaction Values
    9.6.9 Clustering Shapley Values
    9.6.10 Advantages
    9.6.11 Disadvantages
    9.6.12 Software
10 Neural Network Interpretation
  10.1 Learned Features
    10.1.1 Feature Visualization
    10.1.2 Network Dissection
    10.1.3 Advantages
    10.1.4 Disadvantages
    10.1.5 Software and Further Material
  10.2 Pixel Attribution (Saliency Maps)
    10.2.1 Vanilla Gradient (Saliency Maps)
    10.2.2 DeconvNet
    10.2.3 Grad-CAM
    10.2.4 Guided Grad-CAM
    10.2.5 SmoothGrad
    10.2.6 Examples
    10.2.7 Advantages
    10.2.8 Disadvantages
    10.2.9 Software
  10.3 Detecting Concepts
    10.3.1 TCAV: Testing with Concept Activation Vectors
    10.3.2 Example
    10.3.3 Advantages
    10.3.4 Disadvantages
    10.3.5 Bonus: Other Concept-based Approaches
    10.3.6 Software
  10.4 Adversarial Examples
    10.4.1 Methods and Examples
    10.4.2 The Cybersecurity Perspective
  10.5 Influential Instances
    10.5.1 Deletion Diagnostics
    10.5.2 Influence Functions
    10.5.3 Advantages of Identifying Influential Instances
    10.5.4 Disadvantages of Identifying Influential Instances
    10.5.5 Software and Alternatives
11 A Look into the Crystal Ball
  11.1 The Future of Machine Learning
  11.2 The Future of Interpretability
12 Contribute to the Book
13 Citing this Book
14 Translations
15 Acknowledgements
Impressum
References
R Packages Used