The percentage of total variance explained by a principal component in PCA can be calculated by dividing the Eigenvalue of that component by the ________.
- magnitude of Eigenvectors
- number of Eigenvectors
- number of components
- sum of all Eigenvalues
The percentage of total variance explained by a principal component is calculated by dividing its Eigenvalue by the "sum of all Eigenvalues." This ratio gives the proportion of the dataset's total variance that is captured by that specific component.
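For illustration, a minimal sketch of that calculation in Python (the eigenvalues below are made up):

```python
import numpy as np

# Hypothetical eigenvalues of a covariance matrix, sorted in descending order
eigenvalues = np.array([4.2, 1.8, 0.7, 0.3])

# Variance explained by each component = its eigenvalue / sum of all eigenvalues
explained_variance_pct = eigenvalues / eigenvalues.sum() * 100
print(explained_variance_pct)  # [60.0, ~25.7, 10.0, ~4.3] -- PC1 explains 60%
```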
How does the Elbow Method determine the optimal number of clusters, and what are its limitations?
- By evaluating the model's accuracy
- By finding the point of maximum curvature on a plot of variance vs. clusters
- By maximizing the cluster distances
- By minimizing the inter-cluster distances
The Elbow Method plots within-cluster variance (inertia) against the number of clusters and selects the "elbow" point where adding more clusters stops reducing variance sharply. Its limitations include ambiguity in identifying the exact elbow and sensitivity to the clustering algorithm's initialization.
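A minimal sketch of the method using scikit-learn's KMeans (the data here is synthetic; the "elbow" would be read off a plot of these inertia values):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))  # hypothetical data; substitute your own dataset

# Inertia = within-cluster sum of squares; it always decreases as k grows,
# but the rate of decrease flattens past the "elbow"
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 10)]
```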
How does choosing the value of K in the K-Nearest Neighbors (KNN) algorithm impact the decision boundary?
- Both (more complex or smoother), depending on the value
- Makes it more complex
- Makes it smoother
A smaller K value results in a more complex decision boundary that closely traces individual training points (and is sensitive to noise), while a larger K value averages over more neighbors and makes the boundary smoother.
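To see both effects, a hedged sketch using scikit-learn (the dataset and K values are illustrative choices, not part of the original question):

```python
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

# K=1: the boundary traces individual training points -> complex, noise-sensitive
knn_complex = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# K=25: each prediction averages over many neighbors -> smoother boundary
knn_smooth = KNeighborsClassifier(n_neighbors=25).fit(X, y)
```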
You are given a complex dataset with a large amount of unstructured data. Which among AI, Machine Learning, or Deep Learning would be best suited to analyze this, and why?
- AI, for its simplicity
- Deep Learning, for its ability to handle complex and unstructured data
- Machine Learning, for its structured data analysis
Deep Learning models are adept at handling unstructured data (such as images, audio, and free text) and at automatically learning complex patterns, making them the best fit for such a dataset.
In the context of a Confusion Matrix, _________ represents the cases where the model correctly predicted the negative class.
- False Negatives
- False Positives
- True Negatives
- True Positives
True Negatives (TN) in a Confusion Matrix are the cases where the model correctly predicted the negative class, i.e., actual negative instances that were classified as negative.
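A quick sketch with hypothetical labels, showing where TN sits in scikit-learn's layout:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 0 = negative class, 1 = positive class
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 0, 1, 0, 1]

# For binary labels {0, 1}, sklearn returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn)  # 3 -- three actual negatives were correctly predicted as negative
```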
Why is the choice of distance metric significant in the K-Nearest Neighbors (KNN) algorithm?
- It affects clustering efficiency
- It defines the complexity of the model
- It determines the similarity measure
- It influences feature selection
The choice of distance metric (e.g., Euclidean vs. Manhattan) determines how similarity between instances is measured, which directly affects which neighbors are selected and therefore the prediction.
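As a sketch (the Iris dataset is used purely for illustration), the metric is just a constructor argument in scikit-learn, and swapping it can change the neighbor set:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Same data, same K -- only the notion of "closeness" differs
knn_euclidean = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X, y)
knn_manhattan = KNeighborsClassifier(n_neighbors=5, metric="manhattan").fit(X, y)
# The two models may select different neighbors for the same query point,
# and therefore produce different predictions.
```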
How does the Kernel Trick help in SVM?
- Enhances data visualization
- Reduces data size
- Speeds up computation
- Transforms data into higher dimension
The Kernel Trick lets an SVM behave as if the data were mapped into a higher-dimensional space, where it may become linearly separable, without ever computing that mapping explicitly: the kernel function returns the inner products in that space directly.
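A hedged sketch of the effect, using scikit-learn's SVC on concentric circles (an illustrative dataset, not from the original question):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the original 2-D space
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)  # no implicit mapping
rbf_svm = SVC(kernel="rbf").fit(X, y)        # RBF kernel = implicit high-dim mapping
print(linear_svm.score(X, y), rbf_svm.score(X, y))  # RBF scores much higher here
```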
Hierarchical Clustering can be either agglomerative, where clusters are built from the bottom up, or divisive, where clusters are split from the top down. The most common method used is _________.
- Agglomerative
- Complete Linkage
- Divisive
- Single Linkage
Agglomerative method is the most commonly used approach in Hierarchical Clustering. It builds clusters from the bottom up, starting with individual data points and merging them into progressively larger clusters. This method allows for the creation of a dendrogram, which can be analyzed to choose the optimal number of clusters and understand the hierarchical relationships within the data.
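A minimal sketch of agglomerative clustering and its dendrogram with SciPy (random data used purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))  # hypothetical data points

Z = linkage(X, method="ward")  # bottom-up merges, one row per merge
dendrogram(Z)                  # inspect the tree to choose a cut / cluster count
plt.show()
```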
The _________ hyperplane in SVM maximizes the margin between the support vectors of different classes.
- Decision
- Fixed
- Optimal
- Random
The optimal hyperplane in SVM is the one that maximizes the margin between support vectors of different classes.
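For a linear SVM, the margin width can be recovered from the fitted weights as 2 / ||w||; a sketch under an assumed, separable toy dataset:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)  # illustrative data

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C approximates a hard margin
w = clf.coef_[0]
print(2 / np.linalg.norm(w))  # width of the maximized margin between the classes
```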
What is an interaction effect in Multiple Linear Regression?
- A combined effect of two variables
- Linear relationship between variables
- Model optimization
- Removing irrelevant features
An interaction effect occurs when the effect of one predictor on the dependent variable depends on the level of another predictor; the two variables have a combined effect beyond the sum of their individual contributions.
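A sketch of fitting an interaction term with statsmodels (the data-generating process below is hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
# The true effect of x1 on y depends on x2 via the 4*x1*x2 term
df["y"] = 2 * df.x1 + 3 * df.x2 + 4 * df.x1 * df.x2 + rng.normal(size=200)

# "x1 * x2" expands to x1 + x2 + x1:x2, where x1:x2 is the interaction term
model = smf.ols("y ~ x1 * x2", data=df).fit()
print(model.params)  # the x1:x2 coefficient estimates the combined effect
```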
In which type of Machine Learning does the model learn from labeled data?
- Reinforcement Learning
- Semi-Supervised Learning
- Supervised Learning
- Unsupervised Learning
Supervised Learning involves training a model on a labeled dataset, meaning the desired output for each input is known. The model learns to predict the output from the input data.
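A minimal supervised-learning sketch (Iris chosen only for illustration): the labels y are the known desired outputs the model learns from:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # X = inputs, y = known labels

model = LogisticRegression(max_iter=1000).fit(X, y)  # learn the input -> label mapping
print(model.predict(X[:3]))  # predict labels for inputs
```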
Your Logistic Regression model is suffering from separation, causing some estimated Odds Ratios to be extremely large. How could you handle this issue?
- By adding more variables
- By applying regularization techniques
- By increasing the size of the dataset
- By removing all predictors
Separation in Logistic Regression can lead to overly large coefficient estimates. Applying regularization techniques, such as Ridge or Lasso, can help in constraining the coefficient estimates and mitigate this issue.
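A hedged sketch of the fix: the tiny hypothetical dataset below is completely separated, so the unpenalized coefficient estimate would diverge, while an L2 (Ridge) penalty keeps it finite:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data with complete separation: x < 0 is always class 0
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Smaller C = stronger L2 penalty; coefficients (and odds ratios) stay bounded
model = LogisticRegression(penalty="l2", C=0.5).fit(X, y)
print(model.coef_, np.exp(model.coef_))  # finite coefficient and implied odds ratio
```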