The book was originally released by Andrew Ng in 13 parts, and is also available as a single consolidated volume containing all the parts. In this book you will learn how to align your team on ML strategy, as well as how to set up development (dev) sets and test sets. Recommendations for setting up dev/test sets have been changing as machine learning moves toward bigger datasets, and this book explains how to do it for modern ML projects.
Conditions of Use
This book is licensed under a Creative Commons license (CC BY-NC-SA). You can download the ebook Machine Learning Yearning for free.
- Title: Machine Learning Yearning
- Author(s): Andrew Ng
- Published: 2018-12-20
- Edition: 1
- Format: eBook (PDF, ePub, Mobi)
- Pages: 118
- Language: English
- License: CC BY-NC-SA
- Book Homepage: Free eBook, Errata, Code, Solutions, etc.
Table of Contents

1. Why Machine Learning Strategy
2. How to use this book to help your team
3. Prerequisites and Notation
4. Scale drives machine learning progress
5. Your development and test sets
6. Your dev and test sets should come from the same distribution
7. How large do the dev/test sets need to be?
8. Establish a single-number evaluation metric for your team to optimize
9. Optimizing and satisficing metrics
10. Having a dev set and metric speeds up iterations
11. When to change dev/test sets and metrics
12. Takeaways: Setting up development and test sets
13. Build your first system quickly, then iterate
14. Error analysis: Look at dev set examples to evaluate ideas
15. Evaluating multiple ideas in parallel during error analysis
16. Cleaning up mislabeled dev and test set examples
17. If you have a large dev set, split it into two subsets, only one of which you look at
18. How big should the Eyeball and Blackbox dev sets be?
19. Takeaways: Basic error analysis
20. Bias and Variance: The two big sources of error
21. Examples of Bias and Variance
22. Comparing to the optimal error rate
23. Addressing Bias and Variance
24. Bias vs. Variance tradeoff
25. Techniques for reducing avoidable bias
26. Error analysis on the training set
27. Techniques for reducing variance
28. Diagnosing bias and variance: Learning curves
29. Plotting training error
30. Interpreting learning curves: High bias
31. Interpreting learning curves: Other cases
32. Plotting learning curves
33. Why we compare to human-level performance
34. How to define human-level performance
35. Surpassing human-level performance
36. When you should train and test on different distributions
37. How to decide whether to use all your data
38. How to decide whether to include inconsistent data
39. Weighting data
40. Generalizing from the training set to the dev set
41. Identifying Bias, Variance, and Data Mismatch Errors
42. Addressing data mismatch
43. Artificial data synthesis
44. The Optimization Verification test
45. General form of Optimization Verification test
46. Reinforcement learning example
47. The rise of end-to-end learning
48. More end-to-end learning examples
49. Pros and cons of end-to-end learning
50. Choosing pipeline components: Data availability
51. Choosing pipeline components: Task simplicity
52. Directly learning rich outputs
53. Error analysis by parts
54. Attributing error to one part
55. General case of error attribution
56. Error analysis by parts and comparison to human-level performance
57. Spotting a flawed ML pipeline
58. Building a superhero team - Get your teammates to read this