In a Multiple Linear Regression model, you discovered a significant interaction effect between two variables. How would you interpret this finding, and what implications might it have for the model?
- Add more variables
- Ignore the interaction
- No change to the model
- The effect of one variable depends on the level of the other
A significant interaction effect indicates that the effect of one variable on the response depends on the level of another variable. The relationship is therefore not purely additive: the interaction term should be kept in the model, and the main-effect coefficients can no longer be read in isolation, since each one describes the effect of its variable only when the other variable equals zero.
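As a minimal sketch of what "depends on the level of the other" means, the snippet below uses a two-variable linear model with an interaction term. The coefficient values and the function names `predict` and `slope_of_x1` are made up for illustration and do not come from any fitted model.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Linear model with an interaction term b3 * x1 * x2.

    All coefficients are hypothetical values chosen for illustration.
    """
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_of_x1(x2, b1=2.0, b3=1.5):
    """Change in the prediction per unit increase in x1, with x2 held fixed."""
    return b1 + b3 * x2


# The effect of x1 is not a single number: it changes with x2.
print(slope_of_x1(0.0))  # 2.0
print(slope_of_x1(2.0))  # 5.0
```

With `b3 = 0` the slope of `x1` would be the same at every value of `x2`, which is the purely additive case.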
What is regression in the context of Machine Learning?
- A method for classification
- A method for clustering
- A method for predicting a continuous output
- A method for text analysis
Regression in Machine Learning is a supervised learning technique used to predict a continuous output or numerical value based on input variables.
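For illustration, here is a small sketch of the simplest regression case: fitting a straight line to one input variable by ordinary least squares and using it to predict a numeric value. The data points and the helper name `fit_line` are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # continuous output for a new input
print(slope, intercept, prediction)
```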
What term is used to refer to a set of input variables and their corresponding target values used to evaluate a Machine Learning model's performance?
- Evaluation set
- Testing set
- Training set
- Validation set
The "Testing set" is a held-out set of input variables and corresponding target values used to evaluate a trained model's performance. Because the model never sees these examples during training, the result estimates how well the model will perform on unseen data.
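One common way to obtain such a set is a random hold-out split. The sketch below is a hand-written illustration, not a particular library's API; the function name `train_test_split`, the 25% test fraction, and the toy rows are assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = rows[:]          # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Each row pairs an input value with its target value.
rows = [(x, 2 * x + 1) for x in range(8)]
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))  # 6 2
```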
Explain how the F1-Score is computed and why it is used.
- Arithmetic mean of Precision and Recall, balances both metrics
- Geometric mean of Precision and Recall, emphasizes Recall
- Harmonic mean of Precision and Recall, balances both metrics
F1-Score is the harmonic mean of Precision and Recall. It helps balance both metrics, particularly when there's an uneven class distribution. It's often used when both false positives and false negatives are important to minimize.
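The computation can be sketched directly from confusion-matrix counts. The function name `f1_score` and the example counts are chosen for illustration; returning 0.0 when a denominator is zero is a convention assumed here.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # assumed convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision = 0.8, recall = 0.5.
score = f1_score(tp=8, fp=2, fn=8)
arithmetic_mean = (0.8 + 0.5) / 2
print(round(score, 3), arithmetic_mean)
```

The harmonic mean lands below the arithmetic mean of the same precision and recall, because it is pulled toward the weaker of the two metrics.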
Why is Bootstrapping an essential technique in statistical analysis?
- It allows training deep learning models
- It enables the estimation of the distribution of a statistic
- It provides a method for feature selection
- It speeds up computation
Bootstrapping is essential in statistical analysis because it allows estimating the distribution of a statistic, even with a small sample. By repeatedly resampling with replacement, it creates numerous "bootstrap samples," enabling the calculation of standard errors, confidence intervals, and other statistical properties.
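A minimal sketch of the resampling loop, using only the standard library. The sample values, the number of resamples, and the helper names `bootstrap_means` and `percentile_interval` are assumptions for illustration; the interval is the simple percentile interval, which is one of several bootstrap interval methods.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Resample with replacement and record the mean of each bootstrap sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


# Small made-up sample of 10 observations.
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7, 12.4, 10.1]

means = bootstrap_means(sample)
standard_error = statistics.stdev(means)  # spread of the bootstrap means
low, high = percentile_interval(means)
print(standard_error, (low, high))
```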
What is the role of a decision boundary in classification problems?
- Separating classes in the feature space
- Separating data into clusters
- Separating features
- Separating training and test data
A decision boundary is a hypersurface that partitions the underlying feature space into regions, one per class. A new data point is assigned its class label according to which side of the boundary it falls on.
In the context of a Confusion Matrix, _________ represents the cases where the model correctly predicted the negative class.
- False Negatives
- False Positives
- True Negatives
- True Positives
True Negatives (TN) in a Confusion Matrix are the cases where the actual class is negative and the model also predicted the negative class.
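To make the four cells concrete, the sketch below counts them for a small binary example. The labels, the choice of `1` as the positive class, and the function name `confusion_counts` are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN and TN for a binary classification result."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 4}
```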
You are given a complex dataset with a large amount of unstructured data. Which among AI, Machine Learning, or Deep Learning would be best suited to analyze this, and why?
- AI, for its simplicity
- Deep Learning, for its ability to handle complex and unstructured data
- Machine Learning, for its structured data analysis
Deep Learning models are adept at handling unstructured data and finding complex patterns, making them suitable for such a dataset.
How does choosing the value of K in the K-Nearest Neighbors (KNN) algorithm impact the decision boundary?
- Either more complex or smoother, depending on the value of K
- Makes it more complex
- Makes it smoother
A smaller K value results in a more complex decision boundary, while a larger K value makes it smoother.
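The effect can be seen in a tiny hand-rolled example with one feature. The training points (including one deliberately placed "B" inside the "A" region), the query value, and the function name `knn_predict` are assumptions for illustration.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query (1-D)."""
    neighbors = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Mostly "A" on the left and "B" on the right, with one stray "B" at x = 3.5.
train = [
    (1.0, "A"), (2.0, "A"), (3.0, "A"), (3.5, "B"),
    (4.0, "A"), (6.0, "B"), (7.0, "B"), (8.0, "B"),
]

print(knn_predict(train, 3.4, k=1))  # B: the boundary bends around the stray point
print(knn_predict(train, 3.4, k=5))  # A: the stray point is outvoted
```

With K = 1 every individual training point carves out its own region, so the boundary follows noise; with a larger K isolated points are outvoted and the boundary becomes smoother.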
How does the Elbow Method determine the optimal number of clusters, and what are its limitations?
- By evaluating the model's accuracy
- By finding the point of maximum curvature on a plot of within-cluster variance vs. number of clusters
- By maximizing the cluster distances
- By minimizing the inter-cluster distances
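The Elbow Method determines the optimal number of clusters by finding the "elbow" on a plot of within-cluster variance (for example, the within-cluster sum of squares) against the number of clusters: the point beyond which adding more clusters yields only small reductions. Limitations include ambiguity in identifying the exact "elbow", and the curve itself can vary with the initialization of the clustering algorithm.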
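As a rough sketch, the curve can be produced by running a clustering algorithm for several values of k and recording the within-cluster sum of squares (WCSS). The code below assumes k-means as the clustering algorithm, uses a tiny one-dimensional implementation with a deterministic initialization (so it does not show the sensitivity to initialization), and runs on made-up data with three obvious groups; the name `kmeans_wcss` is hypothetical.

```python
def kmeans_wcss(points, k, n_iter=50):
    """Run a tiny 1-D k-means and return the within-cluster sum of squares."""
    ordered = sorted(points)
    # Deterministic initialization: k evenly spaced points from the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
    return sum(min((p - c) ** 2 for c in centers) for p in points)


# Made-up data with three well-separated groups.
points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2, 8.8, 9.0, 9.2]

wcss = {k: kmeans_wcss(points, k) for k in range(1, 6)}
for k, value in wcss.items():
    print(k, round(value, 2))
```

On this data the WCSS drops sharply up to k = 3 and barely changes afterwards, so the elbow sits at 3. On real data the bend is often much less pronounced, which is the ambiguity mentioned above.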