What is regression in the context of Machine Learning?
- A type of classification
- Analyzing computer performance
- Predicting a categorical value
- Predicting a numerical value
Regression in Machine Learning is the task of predicting a continuous numerical outcome (the dependent variable) from one or more independent variables. It plays a crucial role in many fields: in economics it might be used to predict prices, and in biology to forecast growth rates.
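As a minimal sketch of this idea (using scikit-learn and made-up housing data, both of which are illustrative assumptions, not part of the question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: square footage (independent variable) vs. price (dependent variable)
X = np.array([[600], [800], [1000], [1200], [1400]])          # square feet
y = np.array([150_000, 200_000, 250_000, 300_000, 350_000])   # price in dollars

model = LinearRegression().fit(X, y)

# Regression predicts a continuous numerical value for a new observation
print(model.predict(np.array([[1100]])))  # roughly 275000
```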
In KNN, how does an increase in the value of K generally affect the bias and variance of the model?
- Decreases bias, increases variance
- Decreases both bias and variance
- Increases bias, decreases variance
- Increases both bias and variance
Increasing the value of K generally increases bias and decreases variance in the KNN model: averaging over more neighbors smooths the decision boundary, making predictions less sensitive to individual training points but less able to capture fine-grained local structure.
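A quick way to see this trade-off empirically (a sketch assuming scikit-learn and synthetic data; the specific values of K are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification problem (illustrative only)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 100):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # Small K: low bias, high variance (train accuracy far above test accuracy)
    # Large K: higher bias, lower variance (the two scores converge)
    print(k, round(knn.score(X_train, y_train), 3), round(knn.score(X_test, y_test), 3))
```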
You've trained a model with a small training set and a large testing set. What challenges might you encounter, and how could they be addressed?
- Both Overfitting and Underfitting
- Data is perfectly balanced
- Overfitting
- Underfitting
A small training set can lead to overfitting, where the model memorizes noise in the few examples it sees, but it can also lead to underfitting if those examples are too sparse for the model to capture the underlying pattern. Cross-validation, bootstrapping, or augmenting the training set with additional relevant data can help the model generalize.
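For instance, k-fold cross-validation makes better use of limited data by rotating which samples are held out. A minimal sketch, assuming scikit-learn and a synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Small synthetic dataset standing in for a limited training set
X, y = make_classification(n_samples=100, n_features=20, random_state=0)

# 5-fold CV: every sample is used for both training and validation,
# giving a more stable performance estimate than a single small split
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```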
In Supervised Learning, _________ and ___________ are the two main types of problems.
- Classification; Clustering
- Classification; Regression
- Regression; Clustering
- Regression; Ensemble Learning
In Supervised Learning, the two main types of problems are Classification and Regression. Classification categorizes data into predefined classes, while Regression predicts a continuous outcome.
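The difference is easy to see side by side (a sketch assuming scikit-learn; the tiny datasets are made up for illustration):

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: predict one of a set of predefined class labels (here 0 or 1)
clf = LogisticRegression().fit([[1], [2], [3], [4]], [0, 0, 1, 1])
print(clf.predict([[2.5]]))   # a class label, e.g. array([0]) or array([1])

# Regression: predict a continuous outcome
reg = LinearRegression().fit([[1], [2], [3], [4]], [1.1, 1.9, 3.2, 3.8])
print(reg.predict([[2.5]]))   # a real number, roughly 2.5
```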
You've built a multiple linear regression model and found that two or more predictors are highly correlated. What problems might this cause, and how can you solve them?
- High bias, Address by increasing the model complexity
- High variance, Address by using Lasso regression
- Overfitting, Address by removing correlated features or using Ridge regression
- Underfitting, Address by adding more features
Multicollinearity, where predictors are highly correlated, inflates the variance of the coefficient estimates, making them unstable and hard to interpret, and can contribute to overfitting. It can be addressed by removing correlated features or by using Ridge regression, whose L2 penalty shrinks large coefficients and dampens the impact of multicollinearity.
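A small demonstration (a sketch assuming scikit-learn and NumPy; the near-duplicate predictor is constructed deliberately to show the effect):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)   # nearly identical to x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=200)

# OLS coefficients can blow up in opposite directions when predictors are collinear
print(LinearRegression().fit(X, y).coef_)
# Ridge's L2 penalty shrinks them toward a stable, similar-sized pair
print(Ridge(alpha=1.0).fit(X, y).coef_)
```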
What method is commonly used to estimate the coefficients in Simple Linear Regression?
- Maximum Likelihood Estimation
- Minimizing the Sum of Absolute Errors
- Minimizing the Sum of the Squares of the Residuals
- Neural Networks Training
In Simple Linear Regression, the coefficients are estimated by minimizing the sum of the squares of the residuals, a method known as Ordinary Least Squares (OLS).
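For simple linear regression the OLS solution even has a closed form, which can be computed directly (a sketch with made-up data; NumPy is assumed):

```python
import numpy as np

# Toy data (illustrative): roughly y = 2x + 1 plus noise
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])

# OLS closed form: slope = cov(x, y) / var(x); intercept from the means
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
print(slope, intercept)   # close to 2 and 1
```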
In the context of building a model, the _________ are carefully selected and processed to improve the model's performance.
- features
- parameters
- testing set
- training set
"Features" are the input variables that are carefully selected and processed (e.g., through feature engineering or scaling) to enhance the model's predictive performance.
How does the objective function differ between Ridge, Lasso, and ElasticNet?
- No difference
- Ridge and Lasso have the same objective
- Ridge uses L1, Lasso uses L2, ElasticNet uses neither
- Ridge uses L2, Lasso uses L1, ElasticNet uses both
Ridge's objective function includes an L2 penalty, Lasso's includes an L1 penalty, and ElasticNet's includes both L1 and L2 penalties.
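In scikit-learn terms (a sketch; the alpha and l1_ratio values are arbitrary illustrations):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=0)

# Ridge:      squared error + alpha * ||w||_2^2        (L2 penalty)
# Lasso:      squared error + alpha * ||w||_1          (L1 penalty)
# ElasticNet: squared error + a mix of both; l1_ratio sets the L1/L2 balance
for model in (Ridge(alpha=1.0), Lasso(alpha=1.0), ElasticNet(alpha=1.0, l1_ratio=0.5)):
    print(type(model).__name__, model.fit(X, y).coef_.round(2))
```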
How do Precision and Recall trade-off in a classification problem, and when might you prioritize one over the other?
- Increasing Precision decreases Recall, prioritize Precision when false positives are costly
- Increasing Precision increases Recall, prioritize Recall when false positives are costly
- Precision and Recall are independent, no trade-off
Precision and Recall often trade-off; increasing one can decrease the other. You might prioritize Precision when false positives are more costly (e.g., spam detection) and Recall when false negatives are more costly (e.g., fraud detection).
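A tiny numeric example makes the definitions concrete (a sketch assuming scikit-learn; the labels are made up):

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]   # one false negative, one false positive

# Precision = TP / (TP + FP): of everything flagged positive, how much was right
# Recall    = TP / (TP + FN): of all true positives, how many were caught
print(precision_score(y_true, y_pred))  # 2/3
print(recall_score(y_true, y_pred))     # 2/3
```

Raising the classification threshold typically trades recall for precision, and lowering it does the reverse.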
Can you detail how to prevent overfitting in Polynomial Regression?
- By ignoring the test set
- By increasing the degree
- By using all features
- By using regularization techniques like Ridge and Lasso
Overfitting in Polynomial Regression can be prevented by using regularization techniques like Ridge and Lasso. These techniques add a penalty term to the loss function, constraining the coefficients and reducing the complexity of the model.
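A minimal sketch of that combination, assuming scikit-learn (the degree, alpha, and synthetic data are illustrative choices):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = 0.5 * X.ravel() ** 2 + rng.normal(scale=0.5, size=50)

# A degree-10 polynomial alone would overfit 50 points; the Ridge (L2)
# penalty keeps the many polynomial coefficients small and the fit smooth
model = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=1.0))
model.fit(X, y)
print(model.score(X, y))   # R^2 on the training data
```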