The K-Means clustering algorithm iteratively updates the _________ to minimize the sum of squared distances within each cluster.
- Centroids
- Distance metric
- Learning rate
- Number of clusters
The K-Means algorithm alternates between two steps: assigning each point to its nearest centroid, then recomputing each centroid as the mean of its assigned points. Each iteration reduces the sum of squared distances from points to their assigned centroids, forming cohesive clusters.
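A minimal NumPy sketch of that loop (the initialization strategy and convergence check here are illustrative choices, and it assumes no cluster ever goes empty):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-Means: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct points from the data.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # Converged: another pass would not move the centroids.
        centroids = new_centroids
    return centroids, labels
```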
The R-Squared value can be artificially inflated by adding more predictors, but the ________ helps mitigate this issue.
- Adjusted R-Squared
- MAE
- MSE
- RMSE
The R-Squared value can be artificially increased by adding irrelevant predictors. Adjusted R-Squared mitigates this by accounting for the number of predictors, penalizing models for unnecessary complexity. It gives a fairer measure of model fit and discourages inflating the score simply by adding more predictors.
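For reference, the standard Adjusted R-Squared formula, where $n$ is the number of observations and $p$ the number of predictors:

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

The factor $\frac{n-1}{n-p-1}$ grows as $p$ increases, so a new predictor only raises Adjusted R-Squared if it improves $R^2$ by more than the penalty it incurs.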
How does LDA differ from Principal Component Analysis (PCA)?
- LDA and PCA have the same goal and method
- LDA focuses on unsupervised learning while PCA focuses on supervised learning
- LDA is concerned with maximizing between-class variance, while PCA focuses on maximizing total variance
- LDA uses Eigenvalues, while PCA uses Eigenvectors
LDA aims to maximize between-class variance and minimize within-class variance for classification, while PCA maximizes total variance without considering class labels. PCA is used mainly for dimensionality reduction and does not account for class separation.
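A short scikit-learn sketch of the contrast (the Iris dataset and two components are illustrative choices): PCA is fit on the features alone, while LDA also consumes the class labels.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA: unsupervised -- fit on the features alone, maximizing total variance.
X_pca = PCA(n_components=2).fit_transform(X)

# LDA: supervised -- needs the labels, maximizing between-class separation.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
```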
What are the assumptions that must be met in Simple Linear Regression?
- Homogeneity, Variability, Linearity
- Independence, Homoscedasticity, Linearity, Normality
- Linearity, Categorization, Independence
- Linearity, Quadratic, Exponential
The assumptions in Simple Linear Regression include Independence (of errors), Homoscedasticity (equal variance), Linearity, and Normality (of errors).
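These assumptions are typically checked on the residuals of the fitted model. A minimal diagnostic sketch, assuming statsmodels and scipy are available (the synthetic data is illustrative):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=200)  # Illustrative linear data.

model = sm.OLS(y, sm.add_constant(x)).fit()

# Normality of errors: Shapiro-Wilk test on the residuals.
print("Shapiro-Wilk p-value:", stats.shapiro(model.resid).pvalue)
# Independence of errors: Durbin-Watson near 2 suggests no autocorrelation.
print("Durbin-Watson:", durbin_watson(model.resid))
# Homoscedasticity and linearity are usually checked visually by plotting
# model.resid against model.fittedvalues and looking for fanning or curvature.
```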
Explain the difference between parametric and non-parametric models.
- The ability to update parameters during training
- The flexibility in form
- The number of features used
- The use of hyperparameters
Parametric models assume a specific form for the function they're approximating, such as a linear relationship, and have a fixed number of parameters. Non-parametric models make fewer assumptions about the function's form, often resulting in more flexibility but also requiring more data.
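A small scikit-learn contrast (the models and synthetic data are illustrative choices): linear regression commits to a fixed functional form with two parameters, while k-nearest neighbors lets the data dictate the shape.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.1, size=200)  # Nonlinear target.

# Parametric: assumes y = a*x + b, so only two numbers to learn.
linear = LinearRegression().fit(X, y)

# Non-parametric: no fixed form; predictions come from nearby training points.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)

print("Linear R^2:", linear.score(X, y))  # Underfits the sine curve.
print("KNN R^2:   ", knn.score(X, y))     # Tracks it more flexibly.
```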
What is the bias-variance tradeoff in Machine Learning?
- A tradeoff between supervised and unsupervised learning
- A tradeoff between the complexity and the size of a model
- A tradeoff between the learning rate and the number of epochs
- A tradeoff between underfitting and overfitting
The bias-variance tradeoff refers to the balance between underfitting (high bias, low variance) and overfitting (low bias, high variance). A model with high bias oversimplifies the problem, while one with high variance fits the noise in the training data and generalizes poorly.
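For squared-error loss, this is captured by the standard decomposition of expected test error:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \mathrm{Bias}\big[\hat{f}(x)\big]^2
  + \mathrm{Var}\big[\hat{f}(x)\big]
  + \sigma^2
```

where $\sigma^2$ is irreducible noise. Increasing model complexity typically lowers the bias term but raises the variance term, so total error is minimized at an intermediate complexity.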
How does the Logit function transform the probability in Logistic Regression?
- Maps odds to log-odds
- Maps odds to probability
- Maps probability to log-odds
- Maps probability to odds
The Logit function in Logistic Regression takes a probability and maps it to log-odds. It's the inverse of the Sigmoid function used to model probabilities.
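Concretely, for a probability $p \in (0, 1)$, the odds are $\frac{p}{1-p}$ and:

```latex
\mathrm{logit}(p) = \ln\!\left(\frac{p}{1 - p}\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}} = \mathrm{logit}^{-1}(z)
```

The Logit stretches probabilities from $(0, 1)$ onto the whole real line, which is what lets Logistic Regression model log-odds as a linear function of the features.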
In unsupervised learning, the model learns to find patterns and structures from _________ data, where no specific output values are provided.
- Balanced
- Labelled
- Sparse
- Unlabelled
In unsupervised learning, the model learns from unlabelled data, finding hidden patterns and structures without specific output values or guidance.
What is the main difference between supervised and unsupervised learning?
- Application
- Complexity
- Data size
- Use of labeled data
The main difference is the use of labeled data. Supervised Learning uses labeled data, while Unsupervised Learning does not.
Explain the concept of k-fold Cross-Validation. What does "k" signify?
- Number of equally-sized folds the data is divided into
- Number of features in the dataset
- Number of iterations in training
- Number of layers in a deep learning model
In k-fold Cross-Validation, "k" signifies the number of equally-sized folds the data is divided into. The model is trained on (k-1) folds and validated on the remaining fold, repeating this process k times. The average performance across all k trials gives a more reliable estimate of the model's capability than a single train/test split.
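A minimal scikit-learn sketch (the classifier, dataset, and k = 5 are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# k = 5: train on 4 folds, validate on the held-out fold, 5 times over.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```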