An Introduction to Statistical Learning, 2nd Edition: with Applications in R

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform.

Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

This Second Edition features new chapters on deep learning, survival analysis, and multiple testing, as well as expanded treatments of naïve Bayes, generalized linear models, Bayesian additive regression trees, and matrix completion. R code has been updated throughout to ensure compatibility.

Conditions of Use

This book is licensed under a Creative Commons License (CC BY-NC-SA). You can download the ebook An Introduction to Statistical Learning, 2nd Edition: with Applications in R for free.

Title: An Introduction to Statistical Learning, 2nd Edition: with Applications in R
Publisher: Springer
Author(s): Daniela Witten, Gareth James, Robert Tibshirani, Trevor Hastie
Published: 2022-07-30
Edition: 2
Format: eBook (pdf, epub, mobi)
Pages: 622
Language: English
ISBN-10: 1071614207
ISBN-13: 9781071614204
License: CC BY-NC-SA
Book Homepage: Free eBook, Errata, Code, Solutions, etc.

Preface
Introduction
Statistical Learning
	What Is Statistical Learning?
		Why Estimate f?
		How Do We Estimate f?
		The Trade-Off Between Prediction Accuracy and Model Interpretability
		Supervised Versus Unsupervised Learning
		Regression Versus Classification Problems
	Assessing Model Accuracy
		Measuring the Quality of Fit
		The Bias-Variance Trade-Off
		The Classification Setting
	Lab: Introduction to R
		Basic Commands
		Graphics
		Indexing Data
		Loading Data
		Additional Graphical and Numerical Summaries
	Exercises
Linear Regression
	Simple Linear Regression
		Estimating the Coefficients
		Assessing the Accuracy of the Coefficient Estimates
		Assessing the Accuracy of the Model
	Multiple Linear Regression
		Estimating the Regression Coefficients
		Some Important Questions
	Other Considerations in the Regression Model
		Qualitative Predictors
		Extensions of the Linear Model
		Potential Problems
	The Marketing Plan
	Comparison of Linear Regression with K-Nearest Neighbors
	Lab: Linear Regression
		Libraries
		Simple Linear Regression
		Multiple Linear Regression
		Interaction Terms
		Non-linear Transformations of the Predictors
		Qualitative Predictors
		Writing Functions
	Exercises
Classification
	An Overview of Classification
	Why Not Linear Regression?
	Logistic Regression
		The Logistic Model
		Estimating the Regression Coefficients
		Making Predictions
		Multiple Logistic Regression
		Multinomial Logistic Regression
	Generative Models for Classification
		Linear Discriminant Analysis for p=1
		Linear Discriminant Analysis for p>1
		Quadratic Discriminant Analysis
		Naive Bayes
	A Comparison of Classification Methods
		An Analytical Comparison
		An Empirical Comparison
	Generalized Linear Models
		Linear Regression on the Bikeshare Data
		Poisson Regression on the Bikeshare Data
		Generalized Linear Models in Greater Generality
	Lab: Classification Methods
		The Stock Market Data
		Logistic Regression
		Linear Discriminant Analysis
		Quadratic Discriminant Analysis
		Naive Bayes
		K-Nearest Neighbors
		Poisson Regression
	Exercises
Resampling Methods
	Cross-Validation
		The Validation Set Approach
		Leave-One-Out Cross-Validation
		k-Fold Cross-Validation
		Bias-Variance Trade-Off for k-Fold Cross-Validation
		Cross-Validation on Classification Problems
	The Bootstrap
	Lab: Cross-Validation and the Bootstrap
		The Validation Set Approach
		Leave-One-Out Cross-Validation
		k-Fold Cross-Validation
		The Bootstrap
	Exercises
Linear Model Selection and Regularization
	Subset Selection
		Best Subset Selection
		Stepwise Selection
		Choosing the Optimal Model
	Shrinkage Methods
		Ridge Regression
		The Lasso
		Selecting the Tuning Parameter
	Dimension Reduction Methods
		Principal Components Regression
		Partial Least Squares
	Considerations in High Dimensions
		High-Dimensional Data
		What Goes Wrong in High Dimensions?
		Regression in High Dimensions
		Interpreting Results in High Dimensions
	Lab: Linear Models and Regularization Methods
		Subset Selection Methods
		Ridge Regression and the Lasso
		PCR and PLS Regression
	Exercises
Moving Beyond Linearity
	Polynomial Regression
	Step Functions
	Basis Functions
	Regression Splines
		Piecewise Polynomials
		Constraints and Splines
		The Spline Basis Representation
		Choosing the Number and Locations of the Knots
		Comparison to Polynomial Regression
	Smoothing Splines
		An Overview of Smoothing Splines
		Choosing the Smoothing Parameter
	Local Regression
	Generalized Additive Models
		GAMs for Regression Problems
		GAMs for Classification Problems
	Lab: Non-linear Modeling
		Polynomial Regression and Step Functions
		Splines
		GAMs
	Exercises
Tree-Based Methods
	The Basics of Decision Trees
		Regression Trees
		Classification Trees
		Trees Versus Linear Models
		Advantages and Disadvantages of Trees
	Bagging, Random Forests, Boosting, and Bayesian Additive Regression Trees
		Bagging
		Random Forests
		Boosting
		Bayesian Additive Regression Trees
		Summary of Tree Ensemble Methods
	Lab: Decision Trees
		Fitting Classification Trees
		Fitting Regression Trees
		Bagging and Random Forests
		Boosting
		Bayesian Additive Regression Trees
	Exercises
Support Vector Machines
	Maximal Margin Classifier
		What Is a Hyperplane?
		Classification Using a Separating Hyperplane
		The Maximal Margin Classifier
		Construction of the Maximal Margin Classifier
		The Non-separable Case
	Support Vector Classifiers
		Overview of the Support Vector Classifier
		Details of the Support Vector Classifier
	Support Vector Machines
		Classification with Non-Linear Decision Boundaries
		The Support Vector Machine
		An Application to the Heart Disease Data
	SVMs with More than Two Classes
		One-Versus-One Classification
		One-Versus-All Classification
	Relationship to Logistic Regression
	Lab: Support Vector Machines
		Support Vector Classifier
		Support Vector Machine
		ROC Curves
		SVM with Multiple Classes
		Application to Gene Expression Data
	Exercises
Deep Learning
	Single Layer Neural Networks
	Multilayer Neural Networks
	Convolutional Neural Networks
		Convolution Layers
		Pooling Layers
		Architecture of a Convolutional Neural Network
		Data Augmentation
		Results Using a Pretrained Classifier
	Document Classification
	Recurrent Neural Networks
		Sequential Models for Document Classification
		Time Series Forecasting
		Summary of RNNs
	When to Use Deep Learning
	Fitting a Neural Network
		Backpropagation
		Regularization and Stochastic Gradient Descent
		Dropout Learning
		Network Tuning
	Interpolation and Double Descent
	Lab: Deep Learning
		A Single Layer Network on the Hitters Data
		A Multilayer Network on the MNIST Digit Data
		Convolutional Neural Networks
		Using Pretrained CNN Models
		IMDb Document Classification
		Recurrent Neural Networks
	Exercises
Survival Analysis and Censored Data
	Survival and Censoring Times
	A Closer Look at Censoring
	The Kaplan–Meier Survival Curve
	The Log-Rank Test
	Regression Models With a Survival Response
		The Hazard Function
		Proportional Hazards
		Example: Brain Cancer Data
		Example: Publication Data
	Shrinkage for the Cox Model
	Additional Topics
		Area Under the Curve for Survival Analysis
		Choice of Time Scale
		Time-Dependent Covariates
		Checking the Proportional Hazards Assumption
		Survival Trees
	Lab: Survival Analysis
		Brain Cancer Data
		Publication Data
		Call Center Data
	Exercises
Unsupervised Learning
	The Challenge of Unsupervised Learning
	Principal Components Analysis
		What Are Principal Components?
		Another Interpretation of Principal Components
		The Proportion of Variance Explained
		More on PCA
		Other Uses for Principal Components
	Missing Values and Matrix Completion
	Clustering Methods
		K-Means Clustering
		Hierarchical Clustering
		Practical Issues in Clustering
	Lab: Unsupervised Learning
		Principal Components Analysis
		Matrix Completion
		Clustering
		NCI60 Data Example
	Exercises
Multiple Testing
	A Quick Review of Hypothesis Testing
		Testing a Hypothesis
		Type I and Type II Errors
	The Challenge of Multiple Testing
	The Family-Wise Error Rate
		What is the Family-Wise Error Rate?
		Approaches to Control the Family-Wise Error Rate
		Trade-Off Between the FWER and Power
	The False Discovery Rate
		Intuition for the False Discovery Rate
		The Benjamini–Hochberg Procedure
	A Re-Sampling Approach to p-Values and False Discovery Rates
		A Re-Sampling Approach to the p-Value
		A Re-Sampling Approach to the False Discovery Rate
		When Are Re-Sampling Approaches Useful?
	Lab: Multiple Testing
		Review of Hypothesis Tests
		The Family-Wise Error Rate
		The False Discovery Rate
		A Re-Sampling Approach
	Exercises
Index