The book was originally released by Andrew Ng in 13 parts, and is also available as a single consolidated volume containing all the parts. In this book you will learn how to align your team on ML strategy, as well as how to set up development (dev) sets and test sets. Recommendations for setting up dev/test sets have been changing as machine learning moves toward bigger datasets, and this book explains how to do it for modern ML projects.
Conditions of Use
This book is licensed under a Creative Commons license (CC BY-NC-SA). You can download the ebook Machine Learning Yearning for free.
- Title: Machine Learning Yearning
- Author(s): Andrew Ng
- Published: 2018-12-20
- Edition: 1
- Format: eBook (PDF, ePub, Mobi)
- Pages: 118
- Language: English
- License: CC BY-NC-SA
- Book Homepage: Free eBook, Errata, Code, Solutions, etc.
Table of Contents

1. Why Machine Learning Strategy
2. How to use this book to help your team
3. Prerequisites and Notation
4. Scale drives machine learning progress
5. Your development and test sets
6. Your dev and test sets should come from the same distribution
7. How large do the dev/test sets need to be?
8. Establish a single-number evaluation metric for your team to optimize
9. Optimizing and satisficing metrics
10. Having a dev set and metric speeds up iterations
11. When to change dev/test sets and metrics
12. Takeaways: Setting up development and test sets
13. Build your first system quickly, then iterate
14. Error analysis: Look at dev set examples to evaluate ideas
15. Evaluating multiple ideas in parallel during error analysis
16. Cleaning up mislabeled dev and test set examples
17. If you have a large dev set, split it into two subsets, only one of which you look at
18. How big should the Eyeball and Blackbox dev sets be?
19. Takeaways: Basic error analysis
20. Bias and Variance: The two big sources of error
21. Examples of Bias and Variance
22. Comparing to the optimal error rate
23. Addressing Bias and Variance
24. Bias vs. Variance tradeoff
25. Techniques for reducing avoidable bias
26. Error analysis on the training set
27. Techniques for reducing variance
28. Diagnosing bias and variance: Learning curves
29. Plotting training error
30. Interpreting learning curves: High bias
31. Interpreting learning curves: Other cases
32. Plotting learning curves
33. Why we compare to human-level performance
34. How to define human-level performance
35. Surpassing human-level performance
36. When you should train and test on different distributions
37. How to decide whether to use all your data
38. How to decide whether to include inconsistent data
39. Weighting data
40. Generalizing from the training set to the dev set
41. Identifying Bias, Variance, and Data Mismatch Errors
42. Addressing data mismatch
43. Artificial data synthesis
44. The Optimization Verification test
45. General form of Optimization Verification test
46. Reinforcement learning example
47. The rise of end-to-end learning
48. More end-to-end learning examples
49. Pros and cons of end-to-end learning
50. Choosing pipeline components: Data availability
51. Choosing pipeline components: Task simplicity
52. Directly learning rich outputs
53. Error analysis by parts
54. Attributing error to one part
55. General case of error attribution
56. Error analysis by parts and comparison to human-level performance
57. Spotting a flawed ML pipeline
58. Building a superhero team - Get your teammates to read this