How does the ROC Curve illustrate the performance of a binary classification model?
- Plots accuracy vs. error rate, shows overall performance
- Plots precision vs. recall, shows trade-off between sensitivity and specificity
- Plots true positive rate vs. false positive rate, shows trade-off between sensitivity and specificity
The ROC Curve plots the true positive rate against the false positive rate at different classification thresholds. Since the false positive rate equals 1 − specificity, the curve illustrates the trade-off between sensitivity (true positive rate) and specificity (true negative rate), helping to choose the threshold that best balances the two.
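The threshold sweep behind a ROC curve can be sketched in plain Python; the labels and scores below are made-up toy values for illustration:

```python
# Minimal ROC sketch: sweep thresholds over predicted scores and compute
# the true positive rate (sensitivity) and false positive rate
# (1 - specificity) at each one.
def roc_points(labels, scores):
    thresholds = sorted(set(scores), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))  # (FPR, TPR)
    return points

labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_points(labels, scores))
```

Lowering the threshold only adds predicted positives, so the TPR is non-decreasing along the curve, which ends at (1, 1) when everything is classified positive.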
You are tasked with creating a model that can adapt and optimize its strategy through trial and error. Which type of learning would you employ?
- Reinforcement Learning
- Semi-Supervised Learning
- Supervised Learning
- Unsupervised Learning
Reinforcement Learning employs trial and error by learning from rewards and penalties, making it suitable for adaptive and optimization tasks.
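A minimal sketch of this trial-and-error loop is an epsilon-greedy agent on a multi-armed bandit; the reward probabilities below are assumptions chosen for illustration:

```python
import random

# Epsilon-greedy bandit sketch: the agent learns by trial and error
# which arm pays the highest average reward, exploring at random with
# probability epsilon and otherwise exploiting its current best estimate.
random.seed(0)
true_means = [0.2, 0.5, 0.8]          # arm 2 pays best (unknown to the agent)
counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]              # running average reward per arm
epsilon = 0.1

for _ in range(2000):
    if random.random() < epsilon:
        arm = random.randrange(3)              # explore
    else:
        arm = values.index(max(values))        # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

best_arm = values.index(max(values))
print(best_arm, [round(v, 2) for v in values])
```

With enough interactions the value estimates converge toward the true payout rates, so the agent's strategy adapts to favor the best arm without ever being told which one it is.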
How does Ridge regression differ in the way it penalizes large coefficients compared to Lasso?
- Both eliminate coefficients
- Both reduce coefficients
- Ridge eliminates coefficients, Lasso reduces them
- Ridge reduces coefficients, Lasso eliminates them
Ridge regularization reduces the size of coefficients but keeps them non-zero, while Lasso can eliminate some coefficients by setting them to zero.
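The difference is visible in how each penalty shrinks a single coefficient; a minimal sketch using the per-coordinate update rules (the proximal operators of the two penalties, assuming unit curvature):

```python
# Ridge rescales a coefficient toward zero but never reaches it;
# Lasso soft-thresholds, setting small coefficients exactly to zero.
def ridge_shrink(w, lam):
    return w / (1.0 + lam)             # shrinks, stays non-zero for w != 0

def lasso_shrink(w, lam):
    if abs(w) <= lam:
        return 0.0                      # small coefficients are eliminated
    return (abs(w) - lam) * (1 if w > 0 else -1)

print(ridge_shrink(0.3, 0.5))  # 0.2  -> smaller, but still non-zero
print(lasso_shrink(0.3, 0.5))  # 0.0  -> eliminated entirely
```

This is why Lasso is often used for feature selection: coefficients below the threshold drop out of the model, while Ridge keeps every feature with a reduced weight.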
The __________ method in K-Means clustering helps in determining the optimal number of clusters by plotting the variance as a function of the number of clusters.
- Elbow
- Gap Statistic
- Initialization
- Random
The Elbow Method is used to determine the optimal number of clusters in K-Means by plotting the variance as a function of the number of clusters. An "elbow" in the plot indicates the point beyond which additional clusters add little value.
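The idea can be sketched with a tiny hand-rolled k-means on 1-D toy data (a deliberately simple, deterministic implementation, not a production clusterer): with two obvious groups, inertia collapses from k=1 to k=2 and barely improves afterward.

```python
# Elbow Method sketch: within-cluster variance (inertia) as a function
# of k, using a minimal 1-D k-means with deterministic initialization.
def kmeans_inertia(data, k, iters=20):
    data = sorted(data)
    # spread initial centroids across the sorted data
    centroids = [data[i * (len(data) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            j = min(range(k), key=lambda c: (x - centroids[c]) ** 2)
            clusters[j].append(x)
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return sum(min((x - c) ** 2 for c in centroids) for x in data)

data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]   # two well-separated groups
inertias = [kmeans_inertia(data, k) for k in (1, 2, 3)]
print(inertias)  # sharp drop from k=1 to k=2, then marginal gains
```

Plotting these inertia values against k gives the characteristic bend: the "elbow" at k=2 marks the point beyond which extra clusters add little value.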
You have fitted a Simple Linear Regression model and discovered heteroscedasticity in the residuals. What impact could this have, and how might you correct it?
- Always Leads to Overfitting, No Correction Possible
- Biased Estimates, Increase Sample Size
- Inefficiency in Estimates, Transform the Dependent Variable
- No Impact, No Correction Required
Heteroscedasticity could lead to inefficiency in the estimates, making them less reliable. Transforming the dependent variable or using weighted least squares can help correct this issue.
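The weighted least squares fix can be sketched with NumPy; the synthetic data below (error variance growing with x) is an assumption for illustration:

```python
import numpy as np

# Weighted least squares sketch: with weights w_i = 1 / sigma_i^2,
# rescaling each row by sqrt(w_i) and running ordinary least squares
# down-weights the high-variance observations.
rng = np.random.default_rng(0)
n = 200
x = np.linspace(1, 10, n)
sigma = 0.5 * x                          # heteroscedastic: noise grows with x
y = 2.0 + 3.0 * x + rng.normal(0, sigma)

X = np.column_stack([np.ones(n), x])
sw = np.sqrt(1.0 / sigma**2)
beta_wls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
print(beta_wls)  # should land close to the true (intercept, slope) = (2, 3)
```

Ordinary least squares would still be unbiased here, but its estimates would be less efficient and its standard errors unreliable; the weighting restores efficiency when the variance structure is known or well approximated.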
You have limited computational resources but need to develop a predictive model. What would you choose between AI, Machine Learning, or Deep Learning, and why?
- AI, for its flexibility and lower computational demands
- Deep Learning, for its high accuracy
- Machine Learning, for its data efficiency
Traditional AI approaches, such as rule-based systems, often demand fewer computational resources than the data-driven models of Machine Learning and, especially, Deep Learning.
Explain how overfitting manifests itself in Polynomial Regression.
- Through fitting data too loosely
- Through fitting data with low-degree polynomials
- Through fitting noise and showing oscillatory behavior
- Through underfitting the model
Overfitting in Polynomial Regression is characterized by fitting the noise in the data and showing oscillatory behavior. A high-degree polynomial can capture minute fluctuations, leading to a complex model that doesn't generalize well.
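A hedged sketch of this effect with NumPy: fitting degree-1 and degree-9 polynomials to ten noisy points from a linear trend (toy data, fixed seed). The high-degree fit interpolates the noise, so its training error is near zero while its held-out error grows.

```python
import numpy as np

# Overfitting sketch: a degree-9 polynomial through 10 noisy points
# chases the noise and oscillates between them, generalizing worse
# than a simple degree-1 fit.
rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.1, 10)
x_test = np.linspace(0.05, 0.95, 10)          # held-out points between nodes
y_test = 2 * x_test + rng.normal(0, 0.1, 10)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

low = np.polyfit(x_train, y_train, 1)
high = np.polyfit(x_train, y_train, 9)         # interpolates every point

print("train:", mse(low, x_train, y_train), mse(high, x_train, y_train))
print("test: ", mse(low, x_test, y_test), mse(high, x_test, y_test))
```

The oscillatory behavior is worst between and beyond the training points, which is exactly where the test points sit, so the complex model's apparent perfection on the training set does not carry over.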
How can the Eigenvalues in PCA be used to determine the significance of the corresponding Eigenvectors?
- By defining the direction of the eigenvectors
- By indicating the mean of each eigenvector
- By representing the amount of variance captured
- By showing the noise in the data
In PCA, eigenvalues are used to determine the significance of the corresponding eigenvectors by representing the amount of variance captured by each component. The larger the eigenvalue, the more significant the component.
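This can be sketched directly from the covariance matrix with NumPy; the correlated 2-D toy data below is an assumption for illustration:

```python
import numpy as np

# PCA sketch: eigenvalues of the covariance matrix measure the variance
# captured along each eigenvector; normalizing them gives the
# explained-variance ratio of each principal component.
rng = np.random.default_rng(1)
x = rng.normal(0, 3, 500)
y = 0.5 * x + rng.normal(0, 0.5, 500)   # y mostly follows x
data = np.column_stack([x, y])

cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
ratios = eigvals[::-1] / eigvals.sum()   # largest eigenvalue first
print(ratios)  # first component should capture most of the variance
```

Ranking components by eigenvalue is exactly how dimensionality reduction is done in practice: keep the eigenvectors whose cumulative explained-variance ratio reaches some target, and discard the rest.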
How does DBSCAN differentiate between border points and noise points?
- By analyzing their density within Epsilon radius
- By assigning different weights to them
- By clustering them separately
- By the number of points within Epsilon distance and their relation to MinPts
In DBSCAN, border points have fewer than MinPts within their Epsilon radius but are reachable from a core point. Noise points don't satisfy either condition. Differentiating between these allows DBSCAN to form clusters without being influenced by noise and to create clusters of varying shapes by including border points.
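The core/border/noise taxonomy (not the full clustering algorithm) can be sketched in a few lines of plain Python on 1-D toy data:

```python
# DBSCAN point taxonomy sketch: a point is CORE if it has at least
# min_pts neighbors within eps (counting itself), BORDER if it is
# within eps of a core point but not core itself, and NOISE otherwise.
def classify_points(points, eps, min_pts):
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if abs(points[i] - q) <= eps]
    core = {i for i in range(len(points)) if len(neighbors(i)) >= min_pts}
    labels = []
    for i in range(len(points)):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors(i)):
            labels.append("border")
        else:
            labels.append("noise")
    return labels

points = [0.0, 0.1, 0.2, 0.35, 5.0]    # 1-D toy data
print(classify_points(points, eps=0.2, min_pts=3))
```

Here 0.35 has too few neighbors to be core but sits within eps of the core point 0.2, so it becomes a border point, while the isolated point at 5.0 is noise and would be excluded from any cluster.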
What is the primary purpose of the hyperplane in SVM?
- Clustering
- Data Compression
- Data Transformation
- Separation of Classes
The hyperplane in SVM is used to separate classes in the feature space.
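A minimal sketch of a linear SVM trained by subgradient descent on the hinge loss (a simplification of standard solvers; the 2-D points are toy data): the learned weights and bias define the separating hyperplane w · x + b = 0.

```python
# Linear SVM sketch via hinge-loss subgradient descent with an L2
# penalty.  Points with margin < 1 pull the hyperplane toward correct
# separation; well-classified points only contribute weight shrinkage.
def train_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), yi in zip(X, y):
            margin = yi * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:                       # inside margin or misclassified
                w[0] += lr * (yi * x1 - lam * w[0])
                w[1] += lr * (yi * x2 - lam * w[1])
                b += lr * yi
            else:                                # only apply the L2 shrinkage
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-3, -2)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_svm(X, y)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for x1, x2 in X]
print(preds)
```

Classification then reduces to which side of the hyperplane a point falls on, i.e. the sign of w · x + b, which is why the hyperplane's placement (maximizing the margin between classes) is the whole point of the method.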