Data modeling tools such as ERWin or Visio help in visualizing and designing ________.

  • Data Flow Diagrams (DFDs)
  • Entity-Relationship Diagrams (ERDs)
  • Flowcharts
  • UML diagrams
Data modeling tools like ERWin or Visio primarily aid in visualizing and designing Entity-Relationship Diagrams (ERDs), which depict the entities, attributes, and relationships in a database schema.

What is a broadcast variable in Apache Spark, and how is it used?

  • A variable cached in memory for faster access
  • A variable replicated to every executor node
  • A variable shared across all nodes in a cluster
  • A variable used for inter-process communication
A broadcast variable in Apache Spark is a read-only variable replicated to every executor node for efficient data distribution. It is cached on each executor once, rather than shipped with every task, and is typically used to make large read-only datasets (such as lookup tables) available to all tasks across the cluster without excessive data movement or shuffling.
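A minimal PySpark sketch of the idea; the lookup table, sample records, and application name are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("broadcast-demo").getOrCreate()
sc = spark.sparkContext

# Small read-only lookup table, broadcast once to every executor.
country_names = sc.broadcast({"US": "United States", "DE": "Germany"})

rdd = sc.parallelize([("US", 10), ("DE", 7), ("US", 3)])

# Tasks read the broadcast value locally instead of the driver shipping
# the dictionary along with every task.
resolved = rdd.map(lambda kv: (country_names.value.get(kv[0], "Unknown"), kv[1]))
print(resolved.collect())
```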

How does Extract-Transform-Load (ETL) differ from Extract-Load-Transform (ELT) in terms of processing order?

  • Data is extracted from the target system back to the source system
  • Data is extracted in real-time from the source system
  • Data is loaded into the target system before transformation
  • Data is transformed before loading into the target system
ETL involves extracting data, then transforming it, and finally loading it into the target system, whereas ELT involves extracting data first, then loading it into the target system, and finally transforming it.
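As an illustration of the ordering difference only, here is a toy Python sketch; the extract/transform/load functions and sample data are invented placeholders, not a real pipeline.

```python
def extract(source):
    return list(source)                       # pull rows from the source system

def transform(rows):
    return [r.strip().upper() for r in rows]  # clean/standardize the rows

def load(rows, target):
    target.extend(rows)                       # write rows into the target store

source = ["  alice ", " bob "]

# ETL: transform on the way in, before loading into the target.
etl_target = []
load(transform(extract(source)), etl_target)

# ELT: land the raw data in the target first, then transform it there.
elt_target = []
load(extract(source), elt_target)
elt_target[:] = transform(elt_target)         # transformation runs "inside" the target

print(etl_target, elt_target)
```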

________ assesses the accuracy of data in comparison to a trusted reference source.

  • Data accuracy
  • Data consistency
  • Data integrity
  • Data validity
Data accuracy assesses the correctness and precision of data by comparing it to a trusted reference source. It involves verifying that the data values are correct, free from errors, and aligned with the expected standards or definitions. This process ensures that decisions and analyses made based on the data are reliable and trustworthy.
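A small Python sketch of what such a check can look like; the reference values, record fields, and accuracy metric are hypothetical.

```python
# Trusted reference source: the values we consider correct.
reference = {"C001": "94105", "C002": "10001", "C003": "60614"}

# Records under assessment.
records = [
    {"customer_id": "C001", "zip": "94105"},
    {"customer_id": "C002", "zip": "10002"},   # mismatch -> accuracy issue
    {"customer_id": "C003", "zip": "60614"},
]

# Compare each record against the reference and compute a simple accuracy rate.
mismatches = [r for r in records if reference.get(r["customer_id"]) != r["zip"]]
accuracy = 1 - len(mismatches) / len(records)
print(f"accuracy: {accuracy:.0%}, mismatches: {mismatches}")
```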

Which execution mode in Apache Spark provides fault tolerance for long-running applications?

  • Kubernetes mode
  • Mesos mode
  • Standalone mode
  • YARN mode
In Apache Spark, running applications in YARN mode provides fault tolerance for long-running applications. YARN manages cluster resources and contributes to fault tolerance by restarting failed executors (and, where configured, the application master) on other nodes, while Spark itself re-runs the affected tasks.
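A hedged PySpark sketch of configuring a long-running job for YARN; the application name and config values are illustrative, and in practice jobs are usually launched with spark-submit rather than built in code like this.

```python
# Typical launch command (shown for context):
#   spark-submit --master yarn --deploy-mode cluster my_job.py
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("long-running-etl")
    .master("yarn")                              # requires a Hadoop/YARN client config (HADOOP_CONF_DIR)
    .config("spark.yarn.maxAppAttempts", "4")    # YARN re-attempts the application if it fails
    .config("spark.task.maxFailures", "8")       # Spark retries individual failed tasks
    .getOrCreate()
)
```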

What is the purpose of a foreign key in a relational database?

  • Defining table constraints
  • Enforcing data uniqueness
  • Establishing relationships between tables
  • Performing calculations on data
A foreign key in a relational database establishes relationships between tables: a column (or set of columns) in one table references the primary key of another table, and the database enforces referential integrity on that link.
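A minimal sketch using SQLite (via Python's standard sqlite3 module) to show a foreign key rejecting a row with no matching parent; the tables and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite requires enabling FK enforcement

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)  -- foreign key
    )
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")       # OK: parent exists

try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 99)")  # no such customer
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # FOREIGN KEY constraint failed
```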

In data modeling, what is the significance of forward engineering as supported by tools like ERWin or Visio?

  • It allows for collaborative editing of the data model
  • It analyzes existing databases to generate a model
  • It creates a visual representation of data structures
  • It generates database schema from a model
Forward engineering in data modeling tools like ERWin or Visio generates a database schema (DDL) from a logical or physical model, streamlining the step from design to implementation.
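A toy Python sketch of the idea behind forward engineering, turning a simple in-memory model into CREATE TABLE statements; the model format is invented for illustration and is far simpler than what these tools actually maintain.

```python
# Hypothetical model: table name -> {column name: column definition}.
model = {
    "customer": {"id": "INTEGER PRIMARY KEY", "name": "TEXT NOT NULL"},
    "order":    {"id": "INTEGER PRIMARY KEY",
                 "customer_id": "INTEGER REFERENCES customer(id)"},
}

def forward_engineer(model):
    """Generate DDL from the model (the 'forward engineering' step)."""
    ddl = []
    for table, columns in model.items():
        cols = ",\n  ".join(f"{name} {type_}" for name, type_ in columns.items())
        ddl.append(f'CREATE TABLE "{table}" (\n  {cols}\n);')
    return "\n\n".join(ddl)

print(forward_engineer(model))
```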

Scenario: You need to schedule and monitor daily ETL jobs for your organization's data warehouse. Which features of Apache Airflow would be particularly useful in this scenario?

  • Automated data quality checks, Schema evolution management, Data lineage tracking, Integrated data catalog
  • Built-in data transformation functions, Real-time data processing, Machine learning integration, No-code ETL development
  • DAG scheduling, Task dependencies, Monitoring dashboard, Retry mechanism
  • Multi-cloud deployment, Serverless architecture, Managed Spark clusters, Cost optimization
Features such as DAG scheduling, task dependencies, the monitoring dashboard, and the retry mechanism make Apache Airflow well suited to scheduling and monitoring daily ETL jobs. DAG scheduling runs workflows on a defined cadence, task dependencies ensure steps execute in the required order, the monitoring dashboard provides visibility into job status, and the retry mechanism handles transient failures automatically so pipelines complete successfully.
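A hedged sketch of what such a daily ETL DAG can look like in Airflow; the dag_id, commands, and retry settings are placeholders, and exact import paths and parameters vary by Airflow version.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_warehouse_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",          # DAG scheduling: run once per day
    catchup=False,
    default_args={
        "retries": 2,                    # retry mechanism for transient failures
        "retry_delay": timedelta(minutes=10),
    },
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")
    load = BashOperator(task_id="load", bash_command="echo load")

    # Task dependencies: extract -> transform -> load; run status is visible
    # in the Airflow monitoring UI.
    extract >> transform >> load
```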

What is the primary difference between batch processing and streaming processing in pipeline architectures?

  • Data processing complexity
  • Data processing timing
  • Data source variety
  • Data storage mechanism
The primary difference between batch processing and streaming processing in pipeline architectures lies in the timing of data processing. Batch processing involves processing data in discrete chunks or batches at scheduled intervals, while streaming processing involves continuously processing data in real-time as it becomes available. Batch processing is suited for scenarios where data can be collected over time before processing, whereas streaming processing is ideal for handling data that requires immediate analysis or actions as it arrives.
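A hedged PySpark sketch contrasting the two modes; the input path, column name, and output location are hypothetical, and the streaming half uses Spark's built-in rate source just to have a runnable unbounded input.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-vs-stream").getOrCreate()

# Batch: read a bounded dataset at a scheduled time, process it, write once.
batch_df = spark.read.parquet("/data/events/2024-01-01/")           # hypothetical path
(batch_df.groupBy("event_type").count()
    .write.mode("overwrite").parquet("/data/daily_counts/"))        # hypothetical output

# Streaming: process records continuously as they arrive.
stream_df = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
query = (stream_df.writeStream
         .format("console")      # results are emitted incrementally, not all at once
         .outputMode("append")
         .start())
query.awaitTermination(30)       # let the stream run briefly for the demo
```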

Metadata management plays a crucial role in ________ by providing insights into data lineage and dependencies.

  • Data analysis
  • Data governance
  • Data integration
  • Data storage
Metadata management is essential for effective data governance, as it enables organizations to manage, control, and ensure the quality and usability of their data assets. Well-maintained metadata gives insight into data lineage, dependencies, and relationships, which underpin informed decisions about data usage, compliance, and risk management.

Scenario: Your team is tasked with designing a complex database schema for a large-scale project. Which data modeling tool would you recommend and why?

  • ERWin
  • Lucidchart
  • PowerDesigner
  • Visio
PowerDesigner is recommended due to its robust features for handling complex database schemas, including advanced visualization capabilities, support for large-scale projects, and collaboration features.

Scenario: You are tasked with designing a data extraction process for a legacy mainframe system. What factors would you consider when choosing the appropriate extraction technique?

  • Data freshness, data structure, encryption standards, data storage options
  • Data latency, data governance policies, data visualization tools, data quality assurance measures
  • Data redundancy, data distribution, data modeling techniques, data transformation requirements
  • Data volume, data complexity, mainframe system capabilities, network bandwidth
When designing a data extraction process for a legacy mainframe system, factors such as data volume, complexity, mainframe system capabilities, and network bandwidth must be considered. These factors influence the choice of extraction technique, ensuring efficient and effective extraction of data from the legacy system.