You have a dataset with clusters of varying densities. How would you configure the Epsilon and MinPts in DBSCAN to handle this?
- Increase Epsilon; Decrease MinPts
- Increase both Epsilon and MinPts
- Reduce both Epsilon and MinPts
- Use a different clustering algorithm
DBSCAN's Epsilon and MinPts are global parameters that apply to all clusters. If clusters have varying densities, tuning these parameters to fit one density might not suit others, leading to misclustering. In such a scenario, a different clustering algorithm that can handle varying densities, such as OPTICS or HDBSCAN, might be more appropriate.
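As a quick illustration (a minimal sketch using scikit-learn and synthetic data; the parameter values are illustrative), a single global eps tuned for a dense cluster tends to label most points of a much sparser cluster as noise:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# One tight cluster around (0, 0) and a much sparser one around (10, 10)
dense = rng.normal(loc=0.0, scale=0.3, size=(100, 2))
sparse = rng.normal(loc=10.0, scale=3.0, size=(100, 2))
X = np.vstack([dense, sparse])

# A single global eps tuned for the dense cluster...
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
n_noise = np.sum(labels == -1)
print(n_noise)  # most sparse-cluster points are typically labelled noise (-1)
```

Because eps and min_samples are global, no single setting recovers both clusters here: an eps large enough for the sparse cluster would start merging structure in the dense one.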
What is the main difference between Ridge and Lasso regularization?
- Both use L1 penalty
- Both use L2 penalty
- Ridge uses L1 penalty, Lasso uses L2 penalty
- Ridge uses L2 penalty, Lasso uses L1 penalty
Ridge regularization uses an L2 penalty, which shrinks coefficients but keeps them non-zero, while Lasso uses an L1 penalty, leading to some coefficients being exactly zero.
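The practical consequence is easy to see on synthetic data (a sketch assuming scikit-learn is available; the alpha values are illustrative, not tuned):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features actually matter
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print(np.sum(ridge.coef_ == 0))  # Ridge shrinks but keeps coefficients non-zero
print(np.sum(lasso.coef_ == 0))  # Lasso sets several irrelevant ones exactly to zero
```

This is why Lasso is often used for feature selection: the L1 penalty's exact zeros tell you which features the model discarded.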
In PCA, the Eigenvectors are also known as the ________ of the data.
- components
- directions
- eigendata
- principal directions
In PCA, the Eigenvectors, also known as the "principal directions," define the directions in which the data varies the most. They form the axes of the new feature space and capture the essential structure of the data.
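For instance (a minimal sketch with scikit-learn on synthetic 2-D data), the rows of a fitted PCA's `components_` attribute are exactly these eigenvectors; for data stretched along the 45-degree diagonal, the first principal direction comes out close to [1/√2, 1/√2] (up to sign):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Data stretched mostly along the 45-degree direction
t = rng.normal(size=300)
X = np.column_stack([t, t]) + rng.normal(scale=0.1, size=(300, 2))

pca = PCA(n_components=2).fit(X)
# Rows of components_ are the eigenvectors (principal directions), unit-length
first_direction = pca.components_[0]
print(first_direction)
```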
What is the intercept in Simple Linear Regression, and how is it interpreted?
- Maximum Value of Y
- Minimum Value of X
- Start of the Line on X-axis
- Value of Y when X is Zero
The intercept in Simple Linear Regression is the value of the dependent variable (Y) when the independent variable (X) is zero. It represents where the regression line crosses the Y-axis.
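A short sketch (assuming scikit-learn; the data is synthetic with a known true intercept of 5) shows that the fitted intercept is literally the prediction at X = 0:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 5.0 + 2.0 * X[:, 0] + rng.normal(scale=0.5, size=100)  # true intercept is 5

model = LinearRegression().fit(X, y)
print(model.intercept_)            # approximately 5: the value of Y when X is 0
print(model.predict([[0.0]])[0])   # identical to the intercept by construction
```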
Why might one prefer to use MAE over MSE in evaluating a regression model?
- MAE considers the direction of errors
- MAE gives more weight to larger errors
- MAE is less sensitive to outliers
- MAE is more computationally expensive
One might prefer to use Mean Absolute Error (MAE) over Mean Squared Error (MSE) because MAE is less sensitive to outliers. While MSE squares the differences and thus gives more weight to larger errors, MAE takes the absolute value of the differences, weighting every error in proportion to its magnitude rather than its square. This makes MAE more robust when there are outliers or when one doesn't want to overly penalize larger deviations from the true values.
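A small deterministic example (a sketch using scikit-learn's metric functions on made-up predictions) makes the difference concrete: injecting one large error inflates MSE far more than MAE.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.1, 2.1, 2.9, 4.1, 5.1])   # small, uniform errors
y_out = y_pred.copy()
y_out[-1] = 15.0                                # one large outlying error

# How much does each metric grow when the single outlier is introduced?
mae_ratio = mean_absolute_error(y_true, y_out) / mean_absolute_error(y_true, y_pred)
mse_ratio = mean_squared_error(y_true, y_out) / mean_squared_error(y_true, y_pred)
print(f"MAE grew {mae_ratio:.0f}x, MSE grew {mse_ratio:.0f}x")
```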
What challenges might arise when using Hierarchical Clustering on very large datasets?
- Computationally intensive and requires high memory
- Less accurate and requires more hyperparameters
- Less sensitive to distance metrics and more prone to noise
- Prone to overfitting and less interpretable
Hierarchical Clustering can be computationally intensive and memory-hungry, especially when dealing with very large datasets. The standard algorithms must compute and store a pairwise distance matrix whose size grows as O(n^2), where n is the number of data points. This leads to challenges in computational efficiency and memory usage, making the method less suitable for large-scale applications.
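A back-of-envelope calculation (a sketch; it only counts the condensed pairwise-distance matrix in float64, ignoring the algorithm's other overheads) shows how quickly this blows up:

```python
# Memory needed just for the condensed pairwise-distance matrix (float64)
def distance_matrix_gib(n: int) -> float:
    entries = n * (n - 1) // 2    # O(n^2) pairwise distances
    return entries * 8 / 2**30    # 8 bytes per float64, converted to GiB

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} points -> {distance_matrix_gib(n):.3f} GiB")
```

At 100,000 points the distances alone need roughly 37 GiB, before any linkage bookkeeping, which is why hierarchical methods are rarely applied directly at that scale.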
Imagine a scenario where you want to assess the stability of a statistical estimator. How would Bootstrapping help in this context?
- By fixing the bias in the estimator
- By increasing the size of the dataset
- By repeating the sampling process with replacement and calculating the variance
- By repeating the sampling process without replacement
Bootstrapping assesses the stability of a statistical estimator by repeating the sampling process with replacement and calculating variance, standard error, or other statistics. By creating numerous "bootstrap samples," it allows insights into the estimator's distribution, thereby providing a measure of its stability and reliability.
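A minimal sketch in NumPy (synthetic data; the estimator here is the sample mean, and the number of bootstrap replicates is illustrative) shows the idea: resample with replacement, recompute the estimator each time, and take the spread of those replicates as the standard error.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=200)  # the one sample we observed

# Resample WITH replacement and recompute the estimator (here, the mean)
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(2000)
])

se = boot_means.std(ddof=1)  # bootstrap standard error of the mean
print(se)  # should sit near the theoretical 10 / sqrt(200) ~ 0.71
```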
Why might pruning be necessary in the construction of a Decision Tree?
- Determine Leaf Nodes
- Increase Complexity
- Increase Size
- Reduce Overfitting
Pruning is necessary to remove unnecessary branches, simplifying the model and reducing the risk of overfitting the training data.
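In scikit-learn, one way to do this is cost-complexity pruning via `ccp_alpha` (a sketch on synthetic data; the alpha value is illustrative, and in practice it would be chosen by cross-validation):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unrestricted tree grows until it fits the training data perfectly
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# Cost-complexity pruning removes branches that buy too little impurity reduction
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X_tr, y_tr)

print(full.get_n_leaves(), pruned.get_n_leaves())    # pruned tree is smaller
print(full.score(X_te, y_te), pruned.score(X_te, y_te))
```

The full tree memorizes the training set (training accuracy 1.0), while the pruned tree is simpler and usually generalizes at least as well.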
What is Bootstrapping, and how does it differ from Cross-Validation?
- A method for resampling data with replacement
- A technique for training ensemble models
- A technique to reduce bias
- A type of Cross-Validation
Bootstrapping is a method for resampling data with replacement, used to estimate statistics about a population from a sample. It differs from Cross-Validation, where data is split without replacement to validate the model. Bootstrapping is more about estimating the properties of an estimator, while Cross-Validation assesses the model's performance.
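The with/without-replacement distinction can be made concrete in a few lines (a sketch using NumPy and scikit-learn's KFold on a toy index array):

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.arange(10)
rng = np.random.default_rng(0)

# Bootstrap: sample WITH replacement -> duplicates likely, some points left out
boot = rng.choice(data, size=data.size, replace=True)
print(sorted(boot))

# Cross-validation: split WITHOUT replacement -> across the 5 folds,
# every point lands in the test set exactly once
kf = KFold(n_splits=5)
test_indices = np.concatenate([test for _, test in kf.split(data)])
print(sorted(test_indices))
```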
What are the main challenges in training a Machine Learning model with imbalanced datasets?
- Computational complexity
- Dimensionality reduction
- Lack of suitable algorithms
- Overfitting to the majority class
Training on imbalanced datasets can lead to models that are biased towards the majority class, since they have seen more examples of it. This can make the model perform poorly on the minority class even while overall accuracy looks deceptively high.
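The accuracy trap is easy to demonstrate (a sketch with scikit-learn on a synthetic 95/5 split, using a baseline that always predicts the majority class):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# Roughly 95% majority class (0), 5% minority class (1)
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)

# A "classifier" that learns nothing and always predicts the majority class
clf = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = clf.predict(X)

print(accuracy_score(y, pred))   # high, despite being useless
print(recall_score(y, pred))     # 0.0 — every minority example is missed
```

Metrics such as recall, precision, or F1 on the minority class (or resampling/class weighting during training) are therefore preferred over plain accuracy in this setting.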