Machine Learning is a branch of AI that includes algorithms that learn patterns in data, while Deep Learning is a subset of _________ that involves multi-layered neural networks.
- AI
- Deep Learning
- Machine Learning
Deep Learning is a subset of Machine Learning, focusing on algorithms that utilize multi-layered neural networks.
ElasticNet is a regularized regression method that linearly combines the L1 penalty of _________ and the L2 penalty of _________.
- Lasso, Ridge
- Linear, Polynomial
- Polynomial, Linear
- Ridge, Lasso
ElasticNet is a regularized regression method that combines the L1 penalty of Lasso and the L2 penalty of Ridge, incorporating the properties of both methods.
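As a minimal sketch, scikit-learn's ElasticNet exposes this mix directly: l1_ratio controls the balance between the L1 and L2 penalties, and alpha the overall strength (the data and parameter values below are purely illustrative).

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Illustrative data: 100 samples, 5 features, two truly-zero coefficients
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=100)

# l1_ratio=0.5 weighs the L1 (Lasso) and L2 (Ridge) penalties equally
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)
print(model.coef_)  # the L1 component can drive some coefficients to exactly zero
```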
Imagine you're working on a binary classification problem, and the model is performing well in terms of accuracy but poorly in terms of recall. What might be the issue and how would you address it?
- Issue with data imbalance; Use resampling techniques
- Issue with precision; Improve accuracy
- Threshold is too high; Lower the threshold
- Threshold is too low; Increase the threshold
The issue might be that the classification threshold is set too high, so many actual positives are predicted as negatives (false negatives), which reduces recall. Lowering the threshold can improve recall without sacrificing too much precision.
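A minimal sketch of threshold tuning with scikit-learn (the dataset is synthetic, and the 0.3 cutoff is an arbitrary illustrative value, not a recommendation):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# Imbalanced synthetic data: roughly 90% negatives, 10% positives
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression().fit(X, y)

proba = clf.predict_proba(X)[:, 1]  # predicted probability of the positive class
for threshold in (0.5, 0.3):  # default cutoff vs. a lowered one
    pred = (proba >= threshold).astype(int)
    print(threshold, recall_score(y, pred), precision_score(y, pred))
# Lowering the threshold typically raises recall at some cost to precision.
```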
Why is clustering considered an unsupervised learning method?
- Because it groups data without the need for labeled responses
- Because it predicts continuous outcomes
- Because it requires labeled data
- Because it uses decision trees
Clustering is considered unsupervised because it finds patterns and groups data without using labeled responses or guidance.
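A minimal sketch makes this concrete: KMeans below is fit on the features alone, with no target vector passed in (the blobs data is synthetic and illustrative).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels discarded

# fit() receives only X -- no labels -- which is what makes this unsupervised
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])  # group assignments discovered from the data itself
```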
How does the bagging technique reduce the variance in a model?
- By averaging the predictions of multiple models trained on different subsets of data
- By focusing on the mean prediction
- By increasing complexity
- By reducing the number of features
Bagging reduces variance by averaging the predictions of multiple models, each trained on a different random subset of the data (with replacement). This averaging process smooths out individual variations, leading to a more stable and robust model.
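A minimal sketch with scikit-learn's BaggingClassifier (parameters are illustrative; the estimator keyword assumes scikit-learn 1.2 or later, where it replaced base_estimator):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# A single fully-grown tree is a high-variance learner
tree = DecisionTreeClassifier(random_state=0)

# Bagging trains many trees on bootstrap samples (drawn with replacement)
# and averages their votes, smoothing out individual variations
bag = BaggingClassifier(estimator=tree, n_estimators=100, random_state=0)

print("single tree:", cross_val_score(tree, X, y).mean())
print("bagged trees:", cross_val_score(bag, X, y).mean())
```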
If a point in DBSCAN has fewer than MinPts within its Epsilon neighborhood, it's considered a _________ point.
- border point
- cluster
- core point
- noise point
If a point in DBSCAN has fewer than MinPts within its Epsilon neighborhood, it's considered a noise point. Noise points are those that are not part of any cluster and are isolated or in low-density regions.
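A minimal sketch with scikit-learn's DBSCAN, where eps corresponds to the Epsilon neighborhood radius and min_samples to MinPts (the values are illustrative):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.1, random_state=0)

# eps is the Epsilon neighborhood radius; min_samples is MinPts
db = DBSCAN(eps=0.2, min_samples=5).fit(X)

# Points labeled -1 had too few neighbors and belong to no cluster: noise points
print("noise points:", np.sum(db.labels_ == -1))
```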
Explain the mathematical difference between MSE and RMSE and their interpretation.
- MSE is the square of RMSE; RMSE is less interpretable
- MSE is the square root of RMSE; RMSE emphasizes larger errors more
- RMSE is the square of MSE; MSE provides values in the original unit
- RMSE is the square root of MSE; MSE is in squared units
The Mean Squared Error (MSE) measures the average of the squared differences between the predicted values and the actual values, resulting in squared units. The Root Mean Squared Error (RMSE) is the square root of MSE, thus providing a value in the same unit as the original data. RMSE is often considered more interpretable for this reason.
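A minimal numeric sketch (the values are made up for illustration):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)  # squared units of the target
rmse = np.sqrt(mse)                    # same units as the target itself
print(mse, rmse)  # 0.875 and roughly 0.935
```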
In a situation where the features in your dataset are at very different scales, which regularization technique would you choose and why?
- L1 Regularization because of complexity
- L1 Regularization because of sparsity
- L2 Regularization because of scalability
- L2 Regularization because of sensitivity to noise
L2 Regularization (Ridge) would be chosen when features are at very different scales because it shrinks coefficients smoothly without eliminating any of them, preserving information from all features. It can prevent overfitting while keeping every feature in the model.
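In practice, features are usually standardized before the penalty is applied; a minimal sketch combining StandardScaler with Ridge (the pipeline and values are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

# Scaling first keeps the L2 penalty from unfairly punishing
# coefficients that are large only because their feature's scale is small
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)
print(model.named_steps["ridge"].coef_)  # every feature retained, smoothly shrunk
```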
What are the advantages and limitations of using Ridge regression over ordinary linear regression?
- Increases bias, Reduces variance, Reduces multicollinearity, Can cause overfitting
- Increases bias, Reduces variance, Tackles multicollinearity, Can cause underfitting
- Reduces overfitting, Increases variance, Lower bias, Lower variance
- Reduces overfitting, Tackles multicollinearity, Lower bias, Lower variance
Ridge regression helps in reducing overfitting by penalizing large coefficients through L2 regularization. It can handle multicollinearity but increases bias, potentially leading to underfitting. Ordinary linear regression lacks these regularization properties.
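As a minimal sketch of the multicollinearity point (the data is synthetic: the second column is a near-copy of the first):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
X = np.hstack([x, x + rng.normal(scale=0.01, size=(200, 1))])  # nearly collinear
y = 3 * x.ravel() + rng.normal(scale=0.5, size=200)

# OLS splits the signal between the collinear columns in an unstable way;
# Ridge's L2 penalty keeps both coefficients small and stable
print("OLS:  ", LinearRegression().fit(X, y).coef_)
print("Ridge:", Ridge(alpha=1.0).fit(X, y).coef_)
```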
What is the Elbow Method in the context of K-Means clustering?
- A centroid initialization technique
- A clustering visualization tool
- A method to determine the number of clusters
- A way to calculate distance between points
The Elbow Method in K-Means clustering is used to find a suitable number of clusters by plotting the within-cluster sum of squares (inertia) against the number of clusters and locating the "elbow" point, where adding more clusters yields diminishing returns.
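A minimal sketch of the plot (synthetic data with four true clusters; inertia_ is scikit-learn's name for the within-cluster sum of squares):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Fit K-Means for a range of k and record the within-cluster sum of squares
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 9)]

plt.plot(range(1, 9), inertias, marker="o")
plt.xlabel("number of clusters k")
plt.ylabel("inertia (within-cluster sum of squares)")
plt.show()  # the bend -- the "elbow" -- suggests a good k, here around 4
```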