How does the objective function differ between Ridge, Lasso, and ElasticNet?

  • No difference
  • Ridge and Lasso have the same objective
  • Ridge uses L1, Lasso uses L2, ElasticNet uses neither
  • Ridge uses L2, Lasso uses L1, ElasticNet uses both
Ridge's objective function adds an L2 penalty (the sum of squared coefficients) to the least-squares loss, Lasso's adds an L1 penalty (the sum of absolute coefficient values), and ElasticNet's combines both penalties.
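As a sketch of how the three penalties look in practice, here is how scikit-learn exposes them; the alpha and l1_ratio values below are arbitrary placeholders, not recommendations:

```python
# Sketch of the three penalized least-squares objectives via scikit-learn.
# The alpha and l1_ratio values are arbitrary placeholders.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

X, y = make_regression(n_samples=100, n_features=10, noise=10, random_state=0)

ridge = Ridge(alpha=1.0)                    # loss + alpha * ||w||_2^2
lasso = Lasso(alpha=1.0)                    # loss + alpha * ||w||_1
enet = ElasticNet(alpha=1.0, l1_ratio=0.5)  # loss + a weighted mix of both

for model in (ridge, lasso, enet):
    model.fit(X, y)
```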

In a case where your regression model is suffering from high variance, what regularization technique might you apply, and why?

  • Increase model complexity
  • L1 regularization
  • L2 regularization (Ridge)
  • Reduce model complexity
High variance in a regression model often signals overfitting, where the model performs well on training data but poorly on unseen data. L2 regularization (Ridge regression) can help by penalizing large coefficients, reducing overfitting, and improving generalization.
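A minimal sketch of the idea, assuming scikit-learn and a synthetic dataset deliberately sized to overfit (40 features for only 50 samples):

```python
# Minimal sketch: Ridge vs. unregularized least squares on a small,
# noisy synthetic dataset (a setting prone to high variance).
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=50, n_features=40, noise=25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ols = LinearRegression().fit(X_tr, y_tr)
ridge = Ridge(alpha=10.0).fit(X_tr, y_tr)

# Ridge typically narrows the train/test gap when variance is the problem.
print("OLS   train/test R^2:", ols.score(X_tr, y_tr), ols.score(X_te, y_te))
print("Ridge train/test R^2:", ridge.score(X_tr, y_tr), ridge.score(X_te, y_te))
```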

How can centering variables help in interpreting interaction effects in Multiple Linear Regression?

  • By increasing model accuracy
  • By increasing prediction speed
  • By reducing multicollinearity between main effects and interaction terms
  • By simplifying the model
Centering variables (subtracting the mean) can reduce multicollinearity between main effects and interaction terms, making it easier to interpret the individual and combined effects of the variables.
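A small numeric sketch of the effect; the variable names and distributions are invented for illustration:

```python
# Sketch: centering two predictors before forming their interaction term.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(50, 5, 500)   # raw predictor with a mean far from zero
x2 = rng.normal(30, 3, 500)

raw_interaction = x1 * x2
centered_interaction = (x1 - x1.mean()) * (x2 - x2.mean())

# The raw interaction is strongly correlated with the main effect x1;
# the centered interaction typically is not.
print(np.corrcoef(x1, raw_interaction)[0, 1])
print(np.corrcoef(x1, centered_interaction)[0, 1])
```

With that correlation removed, the main-effect coefficients can be read as effects at the mean of the other variable rather than at an often meaningless zero point.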

The process of training a Machine Learning model involves using a dataset known as the _________ set, while evaluating it involves the _________ set.

  • testing, validation
  • training, testing
  • validation, testing
  • validation, training
In supervised learning, a "training" set is used to train the model, and a "testing" set is used to evaluate its predictive performance on unseen data.
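As a sketch, the split is usually a single call in scikit-learn; the 80/20 ratio below is a common convention, not a rule:

```python
# Sketch of the standard train/test split with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# 80% of rows train the model; the held-out 20% evaluate it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```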

Which Machine Learning approach allows the system to learn and make decisions from experience?

  • Reinforcement Learning
  • Semi-Supervised Learning
  • Supervised Learning
  • Unsupervised Learning
Reinforcement Learning allows a system to learn through trial and error: it takes actions, receives rewards or penalties, and adjusts its behavior based on that experience in order to achieve a specific goal.
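A toy sketch of the idea using tabular Q-learning; the five-state corridor environment and all hyperparameters here are invented purely for illustration:

```python
# Minimal tabular Q-learning sketch: the agent learns from the rewards
# it receives after each action, i.e., it learns from experience.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for _ in range(2000):
    s = 0
    while s != n_states - 1:        # rightmost state is the goal
        # epsilon-greedy: mostly exploit, occasionally explore
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: nudge Q(s, a) toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q)  # the learned values should favor action 1 (right) in every state
```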

What are the implications of using R-Squared vs. Adjusted R-Squared in a multiple regression model with many predictors?

  • R-Squared favors complex models; Adjusted R-Squared is more sensitive to noise
  • R-Squared favors more predictors without penalty; Adjusted R-Squared penalizes unnecessary predictors
  • R-Squared is better for small datasets; Adjusted R-Squared is only applicable to linear models
  • R-Squared provides better interpretability; Adjusted R-Squared favors simple models
In a multiple regression model with many predictors, R-Squared never decreases when a predictor is added, so it rewards larger models even when the extra predictors are irrelevant, which can encourage overfitting. Adjusted R-Squared applies a penalty for each additional predictor, providing a more balanced assessment of the model's performance and helping to avoid increasing complexity without meaningful gains in explanatory power.
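The relationship is easy to state in code; a minimal sketch of the standard formula, with n observations and p predictors:

```python
# Sketch: adjusted R^2 computed from ordinary R^2.
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """n = number of observations, p = number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# With many predictors, a modest R^2 shrinks noticeably once penalized:
print(adjusted_r2(0.80, n=50, p=20))   # ~0.66
```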

Can you detail how to prevent overfitting in Polynomial Regression?

  • By ignoring the test set
  • By increasing the degree
  • By using all features
  • By using regularization techniques like Ridge and Lasso
Overfitting in Polynomial Regression can be prevented by using regularization techniques like Ridge and Lasso. These techniques add a penalty term to the loss function, constraining the coefficients and reducing the complexity of the model.
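A sketch combining the two with a scikit-learn pipeline; the degree-10 expansion, alpha value, and synthetic sine data are deliberately arbitrary:

```python
# Sketch: Ridge regularization applied on top of polynomial features.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, 100)

# The penalty keeps the high-degree coefficients from exploding.
model = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=1.0))
model.fit(X, y)
print(model.score(X, y))
```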

When using the Elbow Method in K-Means, the optimal number of clusters is typically found where the plot shows a(n) _________, indicating a point of diminishing returns.

  • Elbow
  • Foot
  • Hand
  • Knee
In the context of K-Means, the "elbow" refers to the point in the plot where adding more clusters does not significantly reduce the within-cluster sum of squares. It indicates a point of diminishing returns in terms of cluster separation.
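A sketch of producing the elbow plot, assuming scikit-learn and matplotlib with a synthetic blob dataset:

```python
# Sketch of the Elbow Method: plot within-cluster sum of squares
# (KMeans .inertia_) against k and look for the bend.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 10)]
plt.plot(range(1, 10), inertias, marker="o")
plt.xlabel("k")
plt.ylabel("within-cluster sum of squares")
plt.show()
```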

You have a dataset with a clear elbow point, but the K-Means clustering is still not performing well. How could centroid initialization be contributing to this issue?

  • Centroids initialized too far from the data
  • Centroids initialized within one cluster
  • Initializing centroids based on mean
  • Poor centroid initialization causing slow convergence
Poor centroid initialization can cause K-Means to converge slowly or to settle into a suboptimal local optimum, even when the data has a clear elbow point, leaving the clustering worse than the data's structure warrants.
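A sketch of the contrast, assuming scikit-learn: a single random initialization against the k-means++ scheme, which spreads the starting centroids apart. Whether the random run visibly lands in a worse optimum depends on the seed:

```python
# Sketch: random initialization (one run) vs. k-means++ initialization.
# With n_init=1, a bad random start can get stuck in a poor local optimum.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

bad = KMeans(n_clusters=4, init="random", n_init=1, random_state=3).fit(X)
good = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=3).fit(X)
print("random init inertia:   ", bad.inertia_)
print("k-means++ init inertia:", good.inertia_)
```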

The point in the ROC Curve where the True Positive Rate equals the False Positive Rate is known as the __________ point.

  • Break-even
  • Equilibrium
  • Random
The Break-even point on the ROC Curve is where the True Positive Rate equals the False Positive Rate. Along that TPR = FPR diagonal, the classifier's hit rate matches its false-alarm rate, which is exactly what random guessing achieves, so a useful classifier's curve should sit above it.
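A sketch of computing the curve's (FPR, TPR) points with scikit-learn, against which the TPR = FPR diagonal can be compared; the dataset and model here are placeholders:

```python
# Sketch: computing ROC curve points for a probabilistic classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, scores)
print(np.c_[fpr, tpr][:5])  # curve points; TPR = FPR along the diagonal
```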

Can you explain the main concept behind boosting algorithms?

  • Boosting always uses Random Forest
  • Boosting combines models sequentially, giving more weight to misclassified instances
  • Boosting focuses on the strongest predictions
  • Boosting involves reducing model complexity
Boosting is an ensemble method where models are combined sequentially, with each model focusing more on the instances that were misclassified by the previous models. This iterative process helps in correcting the mistakes of earlier models, leading to improved performance.
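As a sketch, AdaBoost is the textbook example of this reweighting scheme; the dataset and hyperparameters below are illustrative:

```python
# Sketch: AdaBoost, a classic boosting algorithm that reweights
# misclassified samples between sequentially fitted weak learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```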

Cross-validation, such as _______-fold cross-validation, can help in detecting and preventing overfitting.

  • 10
  • 3
  • 5
  • any number
Any number of folds can be used in cross-validation, though 5 and 10 are the most common choices. By rotating which fold is held out for evaluation, cross-validation gives a more reliable estimate of generalization performance and helps detect (and guard against) overfitting.
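A minimal sketch with scikit-learn using 5 folds; the cv argument accepts any sensible fold count:

```python
# Sketch: 5-fold cross-validation of a simple classifier.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())  # mean and spread across the 5 folds
```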