Your linear regression model has a high bias. What could be the reasons behind this, and how would you try to fix it?
- High variance in data; address by using more data
- Irrelevant features; address by using Lasso regression
- Oversimplified model; address by increasing model complexity
- Too complex model; address by reducing model complexity
High bias often stems from an oversimplified model that fails to capture the underlying patterns in the data. Increasing model complexity by adding polynomial terms, interaction terms, or more features can reduce bias and help the model better fit the data.
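As a rough illustration of how an added polynomial term can reduce bias, the sketch below fits a straight line and a quadratic to the same toy data and compares their training error. The data (y = x^2 on seven points) and the helper names `solve_linear_system`, `fit_polynomial`, and `mean_squared_error` are assumptions made up for this example, not part of the question.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares fit of y = w0 + w1*x + ... + wd*x^d via the normal equations."""
    features = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean_squared_error(xs, ys, weights):
    preds = [sum(w * x ** p for p, w in enumerate(weights)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Toy data with a purely quadratic trend (an assumption for illustration).
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x for x in xs]

linear_mse = mean_squared_error(xs, ys, fit_polynomial(xs, ys, 1))
quadratic_mse = mean_squared_error(xs, ys, fit_polynomial(xs, ys, 2))
print(f"linear MSE: {linear_mse:.3f}, quadratic MSE: {quadratic_mse:.3f}")
```

The straight line cannot bend to follow the curve, so its error stays high no matter how much data is added; the quadratic model removes that systematic error. On real data the comparison should be made on held-out data, since extra complexity can also overfit.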
Deep Learning models often require substantial computational resources, such as __________, due to their complexity.
- All of the above
- CPUs
- GPUs
- RAM
GPUs (Graphics Processing Units) are widely used in Deep Learning because they execute many operations in parallel, which suits the large matrix computations involved in training neural networks.
How can one effectively determine the optimal value of K in the KNN algorithm for a given dataset?
- Always choosing K=5
- Cross-validation
- Guessing
- Only using an odd value
The optimal value of K can be determined by using cross-validation to test different values and selecting the one that performs best.
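One way this selection could look in practice is sketched below in plain Python: each candidate K is scored by k-fold cross-validation and the best-scoring one is kept. The toy dataset (two clusters with one deliberately mislabeled point), the interleaved fold assignment, and the function names `knn_predict`, `cross_validated_accuracy`, and `best_k` are all assumptions for illustration.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def cross_validated_accuracy(xs, ys, k, n_folds=5):
    """Mean KNN accuracy over `n_folds` interleaved folds."""
    fold_scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        hits = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        fold_scores.append(hits / len(test_idx))
    return sum(fold_scores) / n_folds


def best_k(xs, ys, candidates, n_folds=5):
    """Return the candidate with the highest cross-validated accuracy, plus all scores."""
    scores = {k: cross_validated_accuracy(xs, ys, k, n_folds) for k in candidates}
    return max(scores, key=scores.get), scores


# Two well-separated clusters on a small grid (toy data, assumed for the example).
cluster_a = [(i % 3 * 0.5, i // 3 * 0.5) for i in range(10)]
cluster_b = [(5 + x, 5 + y) for x, y in cluster_a]
xs = cluster_a + cluster_b
ys = [0] * 10 + [1] * 10
ys[4] = 1  # one mislabeled point, to show why K=1 can be fragile

chosen_k, cv_scores = best_k(xs, ys, candidates=[1, 3, 5])
print(chosen_k, cv_scores)
```

With K=1 a single noisy neighbor can flip a prediction, while a larger K lets the vote absorb it; cross-validation makes that difference measurable instead of leaving it to a rule of thumb.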
You've built a classification model, but it's highly sensitive to changes in the test data. What could be the issue and how would you fix it?
- Overfitting; Cross-validation
- Overfitting; Increase regularization
- Underfitting; Add more features
- Underfitting; Use different model
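The issue is likely overfitting: the model fits the training data closely but performs poorly, and inconsistently, on unseen data. Cross-validation exposes this by evaluating the model on several held-out splits, and it guides choices such as model complexity or regularization strength that help the model generalize to new data.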
In a medical diagnosis scenario, how would you evaluate a model using Precision, Recall, and the ROC Curve? Explain the considerations you would take into account.
- Focus equally on Precision and Recall, use ROC for sensitivity
- Focus on Precision to minimize false positives, use ROC for specificity
- Focus on Recall to minimize false negatives, use ROC for overall trade-off
In medical diagnosis, minimizing false negatives (missing a true condition) is often crucial, so Recall is highly valued. The ROC Curve is used to understand the trade-off between sensitivity and specificity, providing a comprehensive view of the model's performance.
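The trade-off described above can be made concrete with a small sketch that computes Precision, Recall (sensitivity), and the false positive rate (1 - specificity) at different decision thresholds. The labels, scores, and the helper name `rates_at_threshold` are invented for illustration; a real evaluation would use the model's predicted probabilities on held-out patients.

```python
def rates_at_threshold(y_true, scores, threshold):
    """Return (precision, recall, false_positive_rate) at one decision threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)
    fp = sum(1 for t, s in pairs if t == 0 and s >= threshold)
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)
    tn = sum(1 for t, s in pairs if t == 0 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return precision, recall, false_positive_rate


# 1 = condition present, 0 = condition absent (toy values, assumed).
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]

default_rates = rates_at_threshold(y_true, scores, threshold=0.5)
cautious_rates = rates_at_threshold(y_true, scores, threshold=0.35)

# Each threshold contributes one (FPR, TPR) point to the ROC curve.
roc_points = []
for t in sorted(set(scores), reverse=True):
    _, recall, fpr = rates_at_threshold(y_true, scores, t)
    roc_points.append((fpr, recall))

print(default_rates, cautious_rates)
```

Lowering the threshold from 0.5 to 0.35 catches the one positive case that was previously missed, which is the kind of adjustment a Recall-focused medical setting would favor; the ROC points show what each such choice costs in false positives.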
Underfitting occurs when a model is too _________ and fails to capture the underlying trend of the data.
- complex
- noisy
- regularized
- simple
Underfitting happens when a model is too simple to capture the underlying patterns in the data, leading to poor predictions.
How is the within-class scatter matrix computed in LDA?
- By multiplying the covariances of each class
- By multiplying the means of each class
- By summing the covariances of each class
- By summing the means of each class
The within-class scatter matrix in LDA is computed by summing the scatter (covariance) matrices of the individual classes. It captures the spread of the data within each class, which LDA seeks to keep small relative to the separation between classes.
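A minimal sketch of that computation, assuming the common unnormalized definition in which each class contributes the sum of outer products (x - mean)(x - mean)^T over its samples. The function name `within_class_scatter` and the four 2-D points are made up for the example.

```python
def within_class_scatter(samples, labels):
    """Sum, over classes, of each class's scatter matrix around its own mean."""
    dim = len(samples[0])
    s_w = [[0.0] * dim for _ in range(dim)]
    for cls in set(labels):
        members = [x for x, y in zip(samples, labels) if y == cls]
        mean = [sum(column) / len(members) for column in zip(*members)]
        for x in members:
            deviation = [xi - mi for xi, mi in zip(x, mean)]
            # Accumulate the outer product deviation * deviation^T.
            for i in range(dim):
                for j in range(dim):
                    s_w[i][j] += deviation[i] * deviation[j]
    return s_w


samples = [(1.0, 2.0), (3.0, 4.0), (6.0, 1.0), (6.0, 3.0)]
labels = ["a", "a", "b", "b"]

s_w = within_class_scatter(samples, labels)
print(s_w)
```

Dividing each class's contribution by its sample count (or count minus one) would give covariance matrices instead; the two conventions differ only by per-class scaling.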
The _________ in Simple Linear Regression represents the value of the dependent variable when the independent variable is zero.
- Coefficient
- Intercept
- Residual
- Slope
The intercept in Simple Linear Regression represents the value of the dependent variable when the independent variable is zero. It's the point where the regression line crosses the Y-axis.
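To make the role of the intercept concrete, here is a small sketch using the usual least-squares formulas for one predictor. The function name `fit_simple_linear_regression` and the data points are illustrative assumptions.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line passes through (mean_x, mean_y), which fixes the intercept.
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 7.0, 9.0, 11.0]  # generated from y = 3 + 2x

slope, intercept = fit_simple_linear_regression(xs, ys)
prediction_at_zero = intercept + slope * 0.0
print(slope, intercept, prediction_at_zero)
```

The prediction at x = 0 equals the intercept itself, which is exactly the Y-axis crossing described above.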
In healthcare, Machine Learning can help in early detection of ____________ and ____________.
- Diseases, Treatment Planning
- Fraud Detection, Risk Management
- Personalized Recommendations, Text Classification
- Traffic Flow, Weather Prediction
In healthcare, Machine Learning is applied for the early detection of diseases and planning appropriate treatment, leveraging predictive analytics and pattern recognition.
Imagine you've built a spam email classifier. How would you utilize the Confusion Matrix to understand the model's performance?
- Analyze TP, FP, TN, FN to understand the type and frequency of errors
- Focus on FP and FN to understand only the mistakes made
- Focus only on TP and TN as they represent correct classifications
In spam email classification, a Confusion Matrix helps by showing True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN), thus allowing an understanding of the type and frequency of errors, not just the correct classifications.
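As a sketch of how those four counts might be tallied, the snippet below compares true and predicted labels for a handful of emails. The labels and the function name `confusion_counts` are hypothetical; "spam" is treated as the positive class.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


y_true = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam", "ham", "ham", "spam"]

counts = confusion_counts(y_true, y_pred)
print(counts)
```

Here FP counts legitimate emails sent to the spam folder and FN counts spam that reached the inbox; looking at both, alongside TP and TN, shows which kind of mistake the classifier tends to make and how often.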