What is the Mean Squared Error (MSE) in the context of regression models?
- Average of absolute differences between predictions and actuals
- Average of squared differences between predictions and actuals
- Sum of absolute differences between predictions and actuals
- Sum of squared differences between predictions and actuals
The Mean Squared Error (MSE) is the average of the squared differences between the predicted values and the actual values. It is a common metric for evaluating regression models. Because each error is squared before averaging, larger errors are weighted more heavily than smaller ones.
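The definition can be checked by hand with a minimal sketch. The function name `mean_squared_error` and the sample values are illustrative assumptions, not part of the quiz.

```python
def mean_squared_error(actual, predicted):
    """Return the average of squared differences between two sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only, not taken from any real dataset.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

# Errors are 1, 0, -2, -1, so the squared errors are 1, 0, 4, 1.
mse = mean_squared_error(actual, predicted)
print(mse)  # 1.5
```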
The __________ distance metric calculates the distance between points by summing the absolute differences in each dimension.
- Cosine
- Euclidean
- Hamming
- Manhattan
The Manhattan distance metric calculates the distance by summing the absolute differences in each dimension.
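As a small illustration, the sketch below computes the Manhattan distance between two made-up points and, for contrast, the Euclidean distance from the standard library. The helper name `manhattan_distance` is an assumption chosen for this example.

```python
import math


def manhattan_distance(p, q):
    """Sum of absolute coordinate differences (also known as L1 distance)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))


# Illustrative points only.
point_a = [1, 2, 3]
point_b = [4, 0, 3]

l1 = manhattan_distance(point_a, point_b)  # |1-4| + |2-0| + |3-3| = 5
l2 = math.dist(point_a, point_b)           # sqrt(9 + 4 + 0), shown for contrast
print(l1, l2)
```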
________ is a metric that measures the average magnitude of errors in a set of predictions, without considering their direction.
- Adjusted R-Squared
- MAE
- R-Squared
- RMSE
The Mean Absolute Error (MAE) measures the average magnitude of errors without considering their direction. It is the average of the absolute differences between predicted and actual values. Because the errors are not squared, large errors are not given extra weight, which makes MAE less sensitive to outliers than MSE or RMSE. It is a useful measure when an occasional large error should not dominate the evaluation.
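To make the outlier behaviour concrete, the sketch below compares MAE with MSE on made-up data in which one prediction is far off. Both helper functions and all values are assumptions for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of absolute differences between predictions and actuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of squared differences between predictions and actuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values: the last prediction misses by 10, the others by at most 1.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [11.0, 12.0, 10.0, 23.0]

mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 1 + 10) / 4
mse = mean_squared_error(actual, predicted)   # (1 + 0 + 1 + 100) / 4
print(mae, mse)  # 3.0 25.5
```

The single large error contributes 10 of the 12 units of absolute error, but 100 of the 102 units of squared error, which is why MSE reacts more strongly to it.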
What mathematical criterion is used in LDA to find the directions that maximize the between-class variance?
- Eigenvalue decomposition
- Gradient ascent
- Ratio of between-class scatter to within-class scatter
- Ratio of determinants
LDA finds its projection directions by maximizing the ratio of between-class scatter to within-class scatter, often called Fisher's criterion. Maximizing this ratio pushes the class means apart while keeping each class compact, which leads to better separation between classes.
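The ratio is usually written as follows. The symbols are assumptions introduced here: w is a projection direction, S_B and S_W are the between-class and within-class scatter matrices, mu_c and n_c are the mean and size of class c, and mu is the overall mean.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \, \mathbf{w}}{\mathbf{w}^{\top} S_W \, \mathbf{w}},
\qquad
S_B = \sum_{c} n_c \, (\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^{\top},
\qquad
S_W = \sum_{c} \sum_{i \in c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^{\top}
```

When S_W is invertible, the directions that maximize J are the leading eigenvectors of S_W^{-1} S_B, which is why eigenvalue decomposition appears as a computational step rather than as the criterion itself.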
You are given a dataset where the features have different units and scales. How would this affect KNN, and what should be done to handle this scenario?
- Ignore the scaling
- Increase the value of K
- Perform feature engineering
- Scale the features
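Different units and scales distort the distance measures that KNN relies on: a feature with a large numeric range dominates the distance, regardless of how informative it is. Scaling the features to a common range, for example with min-max scaling or standardization, remedies this problem.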
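The effect can be seen in a small sketch with two hypothetical features, age in years and income in dollars. The data, the helper names, and the choice of min-max scaling are all assumptions for illustration.

```python
import math


def min_max_scale(rows, lows, highs):
    """Rescale each feature to [0, 1] using per-feature minimums and maximums.

    Assumes every feature has hi > lo; a constant feature would divide by zero.
    """
    return [
        [(value - lo) / (hi - lo) for value, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_index(train_rows, query_row):
    """Index of the training row closest to the query (Euclidean distance)."""
    return min(range(len(train_rows)),
               key=lambda i: math.dist(train_rows[i], query_row))


# Hypothetical rows: [age in years, income in dollars].
train = [[31, 51000], [65, 50100], [20, 30000], [70, 90000]]
query = [30, 50000]

# Unscaled: income differences swamp age differences.
nearest_raw = nearest_index(train, query)

# Fit the ranges on the training rows only, then apply them to both sets.
lows = [min(column) for column in zip(*train)]
highs = [max(column) for column in zip(*train)]
train_scaled = min_max_scale(train, lows, highs)
query_scaled = min_max_scale([query], lows, highs)[0]
nearest_scaled = nearest_index(train_scaled, query_scaled)

print(nearest_raw, nearest_scaled)  # 1 0
```

Before scaling, the 65-year-old with a nearly identical income is the nearest neighbour. After scaling, the 31-year-old with a similar income is nearest, because age now contributes on the same footing as income.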
Can you explain the differences between Leave-One-Out Cross-Validation (LOOCV) and k-fold Cross-Validation?
- LOOCV is a specific case of k-fold with k equal to the number of observations
- LOOCV is a specific case of k-fold with k=1
- LOOCV is faster than k-fold
- LOOCV uses k folds, while k-fold uses LOOCV folds
Leave-One-Out Cross-Validation (LOOCV) is a specific case of k-fold Cross-Validation in which k equals the number of observations in the dataset. In LOOCV, each observation serves as the entire validation set exactly once, whereas in k-fold the dataset is divided into k folds of approximately equal size and each fold serves as the validation set once. LOOCV is computationally more intensive because it fits one model per observation. It may provide a less biased estimate, though that estimate can have higher variance.
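The relationship can be sketched with a small index splitter. The function `k_fold_indices` is a hypothetical helper written for this example, and it does not shuffle the data.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, validation_indices) pairs.

    Folds are contiguous and differ in size by at most one observation.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    splits = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, validation))
        start += size
    return splits


five_fold = k_fold_indices(10, 5)  # 5 models, 2 validation rows each
loocv = k_fold_indices(10, 10)     # k == n: 10 models, 1 validation row each

print([len(val) for _, val in five_fold])  # [2, 2, 2, 2, 2]
print([val for _, val in loocv][:3])       # [[0], [1], [2]]
```

Setting k to the number of observations turns every fold into a single held-out row, which is exactly the LOOCV scheme, and it also shows why the number of model fits grows with the dataset size.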
How does stratified k-fold Cross-Validation differ from regular k-fold Cross-Validation?
- Stratified preserves the overall class proportions in each fold
- Stratified reduces computation time
- Stratified uses a different loss function
- Stratified uses a different optimization algorithm
Stratified k-fold Cross-Validation differs from regular k-fold Cross-Validation by keeping the proportion of each target class in every fold approximately the same as in the full dataset. Each fold is therefore a more representative sample of the data, which gives more reliable validation results, especially for imbalanced datasets where a regular split could leave a fold with few or no examples of a minority class.
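One simple way to obtain such folds is to deal the indices of each class out round-robin. The sketch below is an illustrative assumption, not the method of any particular library, and it omits shuffling for brevity.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds while preserving class proportions."""
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for class_indices in indices_by_class.values():
        # Deal each class out one index at a time across the folds.
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    return folds


# Illustrative imbalanced labels: 8 negatives and 4 positives (2:1).
labels = ["neg"] * 8 + ["pos"] * 4
folds = stratified_folds(labels, k=4)
fold_counts = [Counter(labels[i] for i in fold) for fold in folds]

for counts in fold_counts:
    print(dict(counts))  # {'neg': 2, 'pos': 1} for every fold
```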
Boosting reduces bias and variance by building a sequence of weak learners and combining them into a strong __________.
- Learner
- Model
- Predictor
Boosting combines a sequence of weak learners into a strong learner. Each new learner concentrates on the mistakes of the previous ones, for example by giving more weight to misclassified instances. This process primarily reduces bias and can reduce variance as well.
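The reweighting step can be sketched in the style of AdaBoost. The quiz does not name a specific algorithm, so the choice of AdaBoost, the function name, and the toy labels are assumptions for illustration.

```python
import math


def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost-style round: return (alpha, new_weights).

    Labels and predictions are expected to be +1 or -1, and the
    incoming weights are expected to sum to 1.
    """
    # Weighted error of the current weak learner.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0 < error < 1:
        raise ValueError("weighted error must be strictly between 0 and 1")

    # Learner weight: larger when the weak learner is more accurate.
    alpha = 0.5 * math.log((1 - error) / error)

    # t * p is +1 for a correct prediction and -1 for a mistake,
    # so mistakes are scaled up and correct predictions scaled down.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Toy round: five samples with equal weight, only the last is misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
weights = [0.2] * 5

alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

After the update, the single misclassified sample carries about half of the total weight, so the next weak learner is pushed to get it right.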
A medical imaging company is trying to diagnose diseases from X-ray images. Considering the spatial structure and patterns in these images, which type of neural network would be most appropriate?
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
- Feedforward Neural Network
- Radial Basis Function Network
A Convolutional Neural Network (CNN) is designed to capture spatial patterns and structures in images effectively, making it suitable for image analysis, such as X-ray diagnosis.
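The core operation that lets a CNN respond to spatial patterns is sliding a small kernel across the image. The sketch below uses a hand-picked kernel on a tiny made-up image. In a real CNN the kernel values are learned, and most deep-learning libraries implement this step as cross-correlation, as done here.

```python
def cross_correlate_2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Made-up 4x4 "image": dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = cross_correlate_2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only in the column where the edge sits, which illustrates how the same small filter detects a local pattern wherever it appears in the image.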
ICA is often used to separate ________ that have been mixed into a single data source.
- Signals
- Components
- Patterns
- Features
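Independent Component Analysis (ICA) separates a mixed data source into statistically independent components, making 'Components' the correct answer. A classic application is blind source separation, where the recovered components correspond to the original source signals.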
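The usual linear model behind ICA can be written compactly. The symbols are assumptions introduced here: x is the observed mixture, s holds the independent sources, A is the unknown mixing matrix, and W is the unmixing matrix that ICA estimates.

```latex
\mathbf{x} = A \, \mathbf{s},
\qquad
\hat{\mathbf{s}} = W \, \mathbf{x},
\qquad
W \approx A^{-1}
```

The sources are recovered only up to their order and scale, since permuting or rescaling the components leaves the observed mixture unchanged.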