What is the significance of the slope in Simple Linear Regression?
- It Describes the Rate of Change in Y for a One-Unit Change in X
- It Indicates the Intercept
- It Predicts the Error
- It Shows the Starting Point of the Line
The slope in Simple Linear Regression describes the rate of change in the dependent variable (Y) for a one-unit change in the independent variable (X).
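A minimal sketch of reading the slope off a fitted line, assuming scikit-learn and synthetic data (neither is specified by the question); the true slope of 2.0 is an illustrative choice:

```python
# Fit y = b0 + b1*x on synthetic data and read off the slope.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))                # independent variable
y = 3.0 + 2.0 * X.ravel() + rng.normal(0, 1, 50)    # true slope is 2.0

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0])        # ~2.0: change in y per one-unit change in x
print("intercept:", model.intercept_)  # ~3.0: value of y when x = 0
```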
How is overfitting specifically related to Polynomial Regression?
- It's not related
- Overfitting can occur with high-degree polynomials
- Overfitting only happens with linear models
- Polynomial Regression prevents overfitting
Overfitting in Polynomial Regression can occur when using high-degree polynomials. The model becomes too flexible and fits the training data, noise included, so closely that it loses the ability to generalize to unseen data.
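A small sketch of this effect, assuming scikit-learn and a synthetic noisy sine curve (the degrees 3 and 15 are illustrative choices, not part of the question):

```python
# A high-degree polynomial can drive training error to nearly zero by chasing noise.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(-3, 3, size=(20, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.3, 20)       # smooth signal plus noise

for degree in (3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X, y)
    train_mse = mean_squared_error(y, model.predict(X))
    print(f"degree {degree}: training MSE = {train_mse:.4f}")
# The degree-15 training MSE is typically close to zero, but that fit is memorizing
# the noise rather than learning the underlying sine curve.
```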
What are the potential risks of using too high a degree in Polynomial Regression?
- Decreased complexity
- Increased bias
- Increased variance and overfitting
- Simplified model
Using too high a degree in Polynomial Regression leads to increased variance and overfitting: the model becomes overly complex, fits the noise in the training data, and consequently fails to generalize to unseen data. A variance sketch follows below.
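One way to see the increased variance, as a hedged sketch with scikit-learn and synthetic data (the quadratic signal, the degrees, and the query point x = 1.5 are all illustrative assumptions): refit the same model on several noisy samples of the same process and watch how much its prediction swings.

```python
# High variance: the same high-degree model, retrained on different noisy samples
# of the same underlying function, gives very different predictions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def fit_and_predict(degree, seed, x0=1.5):
    rng = np.random.default_rng(seed)
    X = np.linspace(-3, 3, 30).reshape(-1, 1)
    y = X.ravel() ** 2 + rng.normal(0, 1.0, 30)      # same signal, different noise
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X, y)
    return model.predict([[x0]])[0]

for degree in (2, 15):
    preds = [fit_and_predict(degree, seed) for seed in range(5)]
    print(f"degree {degree}: prediction spread at x=1.5 -> {max(preds) - min(preds):.2f}")
# The degree-15 predictions typically swing far more from sample to sample;
# that sample-to-sample spread is exactly the variance being described.
```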
Lasso regularization can lead to sparse solutions where some coefficients become exactly ________.
- Negative
- Positive
- Zero
- NaN
Lasso regularization adds an L1 penalty, causing some coefficients to be exactly zero, leading to sparsity in the model's solution.
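A brief sketch of that sparsity, assuming scikit-learn, synthetic data in which only two of ten features carry signal, and an illustrative penalty strength of alpha = 0.5:

```python
# Lasso's L1 penalty drives the coefficients of uninformative features to exactly zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, 200)  # only features 0 and 1 matter

lasso = Lasso(alpha=0.5).fit(X, y)
print(lasso.coef_)                                   # most entries are exactly 0.0
print("non-zero coefficients:", np.sum(lasso.coef_ != 0))
```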
Explain the assumption of homoscedasticity in Simple Linear Regression.
- All Errors are Zero
- All Variables are Independent
- Equal Variance of Errors for All Values of X
- Linearity between Variables
Homoscedasticity is the assumption that the variance of the errors is constant across all levels of the independent variable(s). If this assumption is violated, the coefficient estimates become inefficient and the standard errors unreliable.
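As a rough, informal check (not a formal test), one can compare the residual spread in different regions of X; the example below is a sketch with scikit-learn and synthetic homoscedastic data, and the split at X = 5 is an arbitrary illustrative choice:

```python
# Rough homoscedasticity check: compare residual spread in the lower and upper halves of X.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = np.linspace(0, 10, 200).reshape(-1, 1)
y = 1.0 + 2.0 * X.ravel() + rng.normal(0, 1.0, 200)  # constant error variance

residuals = y - LinearRegression().fit(X, y).predict(X)
low, high = residuals[X.ravel() < 5], residuals[X.ravel() >= 5]
print(f"residual std for X < 5: {low.std():.2f}, for X >= 5: {high.std():.2f}")
# Similar values are consistent with homoscedasticity; if the spread grew with X,
# the second value would be clearly larger. A residuals-vs-fitted plot or a
# Breusch-Pagan test is the usual diagnostic in practice.
```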
In SVM, the _________ kernel allows for complex transformations of data, making it possible to find a hyperplane even in non-linearly separable data.
- Linear
- Polynomial
- RBF
- Sigmoid
The Radial Basis Function (RBF) kernel allows for complex transformations, making it suitable for non-linearly separable data.
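A minimal sketch contrasting the linear and RBF kernels on data that no straight line can separate, assuming scikit-learn's `make_moons` dataset and default SVC hyperparameters (all illustrative choices):

```python
# An RBF-kernel SVM separates the two interleaved "moons"; a linear kernel cannot.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(f"{kernel} kernel training accuracy: {clf.score(X, y):.2f}")
# The RBF kernel typically scores noticeably higher because it implicitly maps the
# data into a space where a separating hyperplane exists.
```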
What are some common performance metrics used in evaluating classification models?
- Clustering Coefficient, Density
- Eigenvalues, Eigenvectors
- Mean Squared Error, R-squared
- Precision, Recall, F1 Score
Common performance metrics for classification include Precision (positive predictive value), Recall (sensitivity), and F1 Score (harmonic mean of precision and recall). These metrics help to assess the model's ability to correctly classify positive cases.
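A short sketch of computing these metrics with scikit-learn; the tiny label vectors below are made up purely for illustration:

```python
# Precision, recall, and F1 from predicted vs. true binary labels.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("precision:", precision_score(y_true, y_pred))  # of predicted positives, how many are correct
print("recall:   ", recall_score(y_true, y_pred))     # of actual positives, how many were found
print("f1 score: ", f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```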
If a Polynomial Regression model is suspected of overfitting, you can perform _________ to validate the model's performance across different subsets of the data.
- accuracy testing
- cross-validation
- noise filtering
- stability testing
Cross-validation evaluates the model's performance across different subsets of the data, and a large gap between the training score and the cross-validated scores helps detect overfitting.
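A sketch of that workflow, assuming scikit-learn, synthetic quadratic data, and an intentionally high degree of 12 (all illustrative assumptions):

```python
# 5-fold cross-validation of a polynomial model; a large gap between the training
# score and the cross-validated scores is a typical overfitting signal.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(40, 1))
y = X.ravel() ** 2 + rng.normal(0, 1.0, 40)

model = make_pipeline(PolynomialFeatures(12), LinearRegression())
cv_scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2 per fold:", np.round(cv_scores, 2))
print("training R^2:", model.fit(X, y).score(X, y))
```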
Which learning paradigm does not require labeled data and finds hidden patterns in the data?
- Reinforcement Learning
- Semi-supervised Learning
- Supervised Learning
- Unsupervised Learning
Unsupervised Learning does not require labeled data and works by finding hidden patterns and structures in the data.
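As one concrete illustration (a sketch assuming scikit-learn, k-means, and synthetic blob data; the labels produced by `make_blobs` are deliberately discarded):

```python
# k-means groups unlabeled points by structure alone; no target labels are used.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)   # true labels are discarded

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("discovered cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])
print("cluster centers:\n", kmeans.cluster_centers_)
```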
You have a dataset with hundreds of features, some of which are redundant. How would you approach reducing the dimensionality?
- Remove all redundant features manually
- Apply PCA
- Use only the first few features
- Normalize the data
Applying Principal Component Analysis (PCA) would be the most efficient way to reduce dimensionality in this scenario. PCA transforms the data into a new set of uncorrelated features, effectively capturing the most important variance in fewer dimensions, and thus removing redundancy. Manually removing redundant features may not be practical with hundreds of features, and other options do not directly address dimensionality reduction.
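A minimal sketch with scikit-learn on a synthetic matrix built to be redundant (the 20 informative features, 80 derived columns, and 95% variance threshold are all illustrative assumptions):

```python
# PCA on a feature matrix with redundant (correlated) columns; keep enough
# components to explain 95% of the variance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
base = rng.normal(size=(500, 20))                    # 20 informative features
redundant = base @ rng.normal(size=(20, 80))         # 80 linear combinations of them
X = np.hstack([base, redundant])                     # 100 features, heavily redundant

pca = PCA(n_components=0.95).fit(X)                  # keep 95% of the variance
print("original features:", X.shape[1])
print("components kept:  ", pca.n_components_)       # roughly 20, matching the true rank
X_reduced = pca.transform(X)
print("reduced shape:    ", X_reduced.shape)
```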
How is the Adjusted R-Squared value computed, and why is it often preferred over R-Squared?
- Adjusted R-Squared adds a penalty for more predictors; preferred for its robustness to outliers
- Adjusted R-Squared considers bias; preferred for simplicity
- Adjusted R-Squared includes a penalty for more predictors; preferred for its consideration of model complexity
- Adjusted R-Squared includes mean error; preferred for interpretability
The Adjusted R-Squared value is computed by adding a penalty term for the number of predictors in the model, unlike the regular R-Squared. It is therefore often preferred when a model has multiple predictors, because it accounts for model complexity: Adjusted R-Squared rises only when a new predictor improves the fit more than would be expected by chance, avoiding R-Squared's tendency to increase whenever more variables are added.
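A small sketch of the standard formula and its effect (the example numbers, 50 observations and an R-Squared of 0.80, are illustrative):

```python
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where n is the number of observations and p the number of predictors.
def adjusted_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same R^2 of 0.80 on 50 observations, with 2 vs. 20 predictors:
print(adjusted_r2(0.80, n=50, p=2))    # ~0.79: little penalty
print(adjusted_r2(0.80, n=50, p=20))   # ~0.66: heavy penalty for the extra complexity
```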
A city is facing issues with traffic congestion and wants to use Machine Learning to manage traffic flow. What kind of data and algorithms would you suggest?
- Drug Development, Weather Data
- Image Recognition, Financial Data
- Recommender Systems, Text Data
- Time-Series Analysis, Traffic Data
Time-Series Analysis and Traffic Data, including real-time traffic conditions, vehicle counts, and traffic camera feeds, can be used to predict congestion patterns and optimize traffic flow using algorithms like ARIMA or LSTM.
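A hypothetical sketch of the time-series side, assuming statsmodels, a synthetic hourly vehicle-count series with a rush-hour cycle, and an arbitrary ARIMA order of (2, 1, 2); a real deployment would use the city's own sensor data and tuned model settings:

```python
# Forecast the next few hours of vehicle counts with an ARIMA model.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
hours = pd.date_range("2024-01-01", periods=24 * 14, freq="h")      # two weeks of hourly data
daily_cycle = 200 + 150 * np.sin(2 * np.pi * hours.hour / 24)       # rush-hour pattern
counts = daily_cycle + rng.normal(0, 20, len(hours))                 # observed vehicle counts
series = pd.Series(counts, index=hours)

model = ARIMA(series, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=6)        # expected traffic volume for the next six hours
print(forecast.round(0))
```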