Bagging stands for Bootstrap __________, which involves creating subsets of the original dataset and training individual models on them.
- Adjustment
- Aggregation
- Algorithm
- Alignment
Bagging, or Bootstrap Aggregation, involves creating subsets of the original dataset through bootstrapping and training individual models on these subsets; the models' predictions are then combined to make the final prediction.
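A minimal sketch of bagging with scikit-learn (assuming scikit-learn is available; the dataset and hyperparameters below are purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 50 base estimators (decision trees by default) is trained on a
# bootstrap sample of the training set; their predictions are aggregated by voting.
bagging = BaggingClassifier(n_estimators=50, random_state=0)
bagging.fit(X_train, y_train)
print(bagging.score(X_test, y_test))
```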
How can you test for multicollinearity in Multiple Linear Regression, and why is it important?
- By Checking Accuracy, Improves Prediction
- By Checking Residuals, Reduces Overfitting
- By Checking Variance Inflation Factor (VIF), Prevents Unstable Estimates
- By Examining Correlations between Variables, Prevents Confounding Effects
Multicollinearity can be detected by checking the Variance Inflation Factor (VIF). It is important as multicollinearity can lead to unstable estimates and make it difficult to interpret individual coefficients.
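A hedged sketch of computing VIF with statsmodels (assuming statsmodels and pandas are installed; the synthetic columns below are illustrative):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 * 0.9 + rng.normal(scale=0.1, size=200)  # deliberately correlated with x1
X = add_constant(pd.DataFrame({"x1": x1, "x2": x2}))

# A VIF above roughly 5-10 is a common rule of thumb for problematic multicollinearity.
for i, col in enumerate(X.columns):
    if col != "const":
        print(col, variance_inflation_factor(X.values, i))
```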
If closer neighbors in KNN should count for more than farther ones, implementing __________ gives neighbors varying importance based on distance.
- distance weighting
- equal weighting
- random weighting
- uniform weighting
Implementing distance weighting in KNN gives varying importance based on distance, assigning more weight to closer neighbors.
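A short sketch of distance-weighted KNN in scikit-learn (assumed installed; the dataset and choice of k are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# weights="distance" makes closer neighbors count more than distant ones;
# the default weights="uniform" treats all k neighbors equally.
knn = KNeighborsClassifier(n_neighbors=5, weights="distance")
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))
```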
Explain the concept of regularization in Machine Learning. What are some common techniques?
- Increasing complexity, Gradient Boosting
- Increasing complexity, L1/L2
- Reducing complexity, Gradient Descent
- Reducing complexity, L1/L2
Regularization is a technique to reduce overfitting by adding a penalty term to the loss function. Common techniques include L1 (lasso) and L2 (ridge) regularization, which penalize large coefficients in a model.
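A brief sketch contrasting L1 (Lasso) and L2 (Ridge) regularization in scikit-learn (assumed installed; the alpha values are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# alpha controls the strength of the penalty on large coefficients.
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks coefficients toward zero
lasso = Lasso(alpha=1.0).fit(X, y)  # L1: can set some coefficients exactly to zero

print("Ridge non-zero coefficients:", (ridge.coef_ != 0).sum())
print("Lasso non-zero coefficients:", (lasso.coef_ != 0).sum())
```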
In a dataset with fluctuating values, you've applied Polynomial Regression, and the model seems to fit even the noise. What are the potential risks, and how could you mitigate them?
- Add more noise
- Ignore the noise
- Reduce model complexity through lower degree or regularization
- Use a linear model
The risk is overfitting the noise, which will harm the model's generalization ability. Reducing the polynomial degree or using regularization techniques can mitigate this by constraining the model's complexity.
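A minimal sketch of both mitigations, a lower degree plus L2 regularization, using scikit-learn (assumed installed; the synthetic data and hyperparameters are illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)  # noisy, fluctuating signal

# A moderate degree plus a Ridge penalty constrains the model's flexibility,
# so it tracks the underlying trend instead of chasing the noise.
model = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=1.0))
model.fit(X, y)
print(model.score(X, y))
```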
How does Deep Learning model complexity typically compare to traditional Machine Learning models, and what are the implications of this?
- Less complex and easier to train
- Less complex and requires less data
- More complex and easier to interpret
- More complex and requires more data and computation
Deep Learning models are typically more complex, requiring more data and computational resources, which can make training and tuning more challenging.
What term in Machine Learning refers to the input variables that the model uses to make predictions?
- Features
- Labels
- Predictors
- Targets
In Machine Learning, "Features" refer to the input variables that the model uses to make predictions. They are characteristics or attributes used to describe data instances.
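A tiny illustration of features versus the label they predict (the column names are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "sqft": [850, 1200, 1500],              # feature
    "bedrooms": [2, 3, 3],                  # feature
    "price": [200_000, 300_000, 360_000],   # label / target
})

X = df[["sqft", "bedrooms"]]  # features: inputs the model learns from
y = df["price"]               # label: the value the model predicts
```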
What is the purpose of testing a Machine Learning model?
- To cluster the data
- To evaluate the model's performance on unseen data
- To label the data
- To train the model
The purpose of testing a Machine Learning model is to evaluate its performance on unseen data. This helps in assessing how well the model generalizes to new instances.
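A sketch of holding out a test set to estimate generalization, using scikit-learn (assumed installed; the dataset and split size are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# The test split is never seen during training, so its score estimates
# how the model will behave on new, unseen data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```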
The DBSCAN algorithm groups together points that are closely packed, forming clusters, and treats more isolated points as _________.
- Centroids
- Clusters
- Noise
- Outliers
In DBSCAN, isolated points that do not fall within any dense region are treated as outliers; the algorithm labels them as noise.
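A minimal DBSCAN sketch with scikit-learn (assumed installed; eps and min_samples are illustrative):

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)
# Points that do not belong to any dense region receive the label -1,
# which scikit-learn uses to mark noise/outliers.
print("Clusters found:", len(set(db.labels_)) - (1 if -1 in db.labels_ else 0))
print("Noise points:", (db.labels_ == -1).sum())
```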
How is the amount of variance explained related to Eigenvalues in PCA?
- Eigenvalues are unrelated to variance
- Eigenvalues represent the mean of the data
- Larger eigenvalues explain more variance
- Smaller eigenvalues explain more variance
In PCA, the amount of variance explained by each principal component is directly related to its corresponding eigenvalue. Larger eigenvalues mean that more variance is explained by that particular component.
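A short sketch relating eigenvalues to explained variance with scikit-learn's PCA (assumed installed; the dataset is illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA().fit(X)

# explained_variance_ holds the eigenvalues of the covariance matrix;
# explained_variance_ratio_ normalizes them to fractions of the total variance.
print(pca.explained_variance_)
print(pca.explained_variance_ratio_)
```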