How does the Elbow Method determine the optimal number of clusters, and what are its limitations?

  • By evaluating the model's accuracy
  • By finding the point of maximum curvature on a plot of variance vs. clusters
  • By maximizing the cluster distances
  • By minimizing the inter-cluster distances
The Elbow Method plots within-cluster variance (e.g., the sum of squared distances from points to their cluster centroids) against the number of clusters and looks for the "elbow": the point of maximum curvature, beyond which adding clusters yields only diminishing reductions in variance. Limitations include ambiguity in identifying the exact elbow (the curve is often smooth, with no sharp bend) and sensitivity to initialization.
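
A minimal sketch of the method with scikit-learn's KMeans; the synthetic dataset and the range of k values are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Illustrative synthetic data with 4 true clusters
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

ks = range(1, 10)
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)  # within-cluster sum of squares

plt.plot(ks, inertias, marker="o")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Within-cluster sum of squares")
plt.show()  # the "elbow" is where the curve flattens
```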

The percentage of total variance explained by a principal component in PCA can be calculated by dividing the Eigenvalue of that component by the ________.

  • magnitude of Eigenvectors
  • number of Eigenvectors
  • number of components
  • sum of all Eigenvalues
The percentage of total variance explained by a principal component is calculated by dividing its Eigenvalue by the "sum of all Eigenvalues." This ratio gives the proportion of the dataset's total variance that is captured by that specific component.
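
A short sketch of this ratio with scikit-learn's PCA; the random data is an illustrative assumption. `explained_variance_` holds the eigenvalues, and dividing by their sum reproduces `explained_variance_ratio_`.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative random data: 100 samples, 5 features
X = np.random.RandomState(0).rand(100, 5)

pca = PCA().fit(X)
eigenvalues = pca.explained_variance_
ratios = eigenvalues / eigenvalues.sum()  # eigenvalue / sum of all eigenvalues

print(ratios)
print(pca.explained_variance_ratio_)  # same values, computed by sklearn
```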

In ElasticNet regularization, the mixing parameter 'alpha' balances the effects of ________ and ________.

  • L1, L2
  • L1, L3
  • L2, L3
The mixing parameter in ElasticNet regularization balances the effects of the L1 and L2 penalties, providing a compromise between Ridge (pure L2) and Lasso (pure L1). Note that naming varies by library: R's glmnet calls this mixing parameter alpha, while scikit-learn calls it l1_ratio and uses alpha for the overall penalty strength.
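
A minimal sketch with scikit-learn's ElasticNet; the synthetic data and parameter values are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

# Illustrative regression data
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

# l1_ratio=0.5 mixes the L1 and L2 penalties equally;
# alpha here is sklearn's overall penalty strength
model = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)
print(model.coef_)  # the L1 part can drive some coefficients exactly to zero
```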

The _________ hyperplane in SVM maximizes the margin between the support vectors of different classes.

  • Decision
  • Fixed
  • Optimal
  • Random
The optimal hyperplane in SVM is the one that maximizes the margin, i.e., the distance between the hyperplane and the nearest points (the support vectors) of each class.
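
A small sketch of a linear SVM's maximum-margin hyperplane with scikit-learn's SVC; the toy data and the large C value (approximating a hard margin) are illustrative assumptions. For a linear SVM with weight vector w, the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Illustrative linearly separable data
X, y = make_blobs(n_samples=100, centers=2, random_state=6)

clf = SVC(kernel="linear", C=1000).fit(X, y)  # large C ~ hard margin
w = clf.coef_[0]

print("margin width:", 2 / np.linalg.norm(w))  # maximized by training
print("number of support vectors:", len(clf.support_vectors_))
```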

Hierarchical Clustering can be either agglomerative, where clusters are built from the bottom up, or divisive, where clusters are split from the top down. The most common method used is _________.

  • Agglomerative
  • Complete Linkage
  • Divisive
  • Single Linkage
The agglomerative method is the most commonly used approach in Hierarchical Clustering. It builds clusters from the bottom up, starting with each data point as its own cluster and repeatedly merging the closest pairs into progressively larger clusters. This process produces a dendrogram, which can be cut at a chosen height to select the number of clusters and to understand the hierarchical relationships within the data.
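
A brief sketch of bottom-up clustering and its dendrogram using SciPy; the synthetic data and Ward linkage are illustrative choices.

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.datasets import make_blobs

# Illustrative data with 3 true clusters
X, _ = make_blobs(n_samples=30, centers=3, random_state=42)

Z = linkage(X, method="ward")  # merge the closest clusters, bottom up
dendrogram(Z)  # cut the tree at a chosen height to pick the cluster count
plt.show()
```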

How does the Kernel Trick help in SVM?

  • Enhances data visualization
  • Reduces data size
  • Speeds up computation
  • Transforms data into higher dimension
The Kernel Trick implicitly maps the data into a higher-dimensional space where it becomes linearly separable, without ever computing the coordinates in that space: the SVM only needs the pairwise kernel values (inner products), which keeps training tractable.
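
A minimal sketch contrasting a linear and an RBF-kernel SVM on concentric circles, which no straight line can separate; the dataset and parameters are illustrative assumptions.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in 2D
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)  # implicit higher-dimensional mapping

print("linear accuracy:", linear.score(X, y))  # struggles on this data
print("rbf accuracy:   ", rbf.score(X, y))     # separates the circles
```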

Why is the choice of distance metric significant in the K-Nearest Neighbors (KNN) algorithm?

  • It affects clustering efficiency
  • It defines the complexity of the model
  • It determines the similarity measure
  • It influences feature selection
The choice of distance metric in KNN determines how similarity between instances is measured, and therefore which neighbors are selected; for example, Euclidean and Manhattan distance can rank the same candidate neighbors differently, especially when features are on different scales.
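
A small sketch of swapping the `metric` parameter in scikit-learn's KNeighborsClassifier; the synthetic data and the list of metrics are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Illustrative classification data
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

for metric in ["euclidean", "manhattan", "chebyshev"]:
    knn = KNeighborsClassifier(n_neighbors=5, metric=metric).fit(X, y)
    # different metrics can pick different neighbors, changing the fit
    print(metric, knn.score(X, y))
```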

What is an interaction effect in Multiple Linear Regression?

  • A combined effect of two variables
  • Linear relationship between variables
  • Model optimization
  • Removing irrelevant features
An interaction effect occurs when the effect of one variable on the dependent variable depends on the level of another variable. It is typically modeled by adding the product of the two variables as an extra regression term, capturing their combined effect.
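
A minimal sketch of fitting an interaction term; the synthetic data, in which y genuinely depends on the product x1*x2, is an illustrative assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
x1, x2 = rng.rand(200), rng.rand(200)
# True model includes an interaction term with coefficient 5
y = 2 * x1 + 3 * x2 + 5 * x1 * x2 + rng.normal(0, 0.1, 200)

# Add the product x1*x2 as an explicit interaction feature
X = np.column_stack([x1, x2, x1 * x2])
model = LinearRegression().fit(X, y)
print(model.coef_)  # third coefficient ~5 recovers the interaction
```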

How can you assess the accuracy and reliability of a regression model's predictions?

  • Through classification metrics
  • Through clustering metrics
  • Through regression metrics like RMSE, R-Squared, MAE
  • Through text analysis
The accuracy and reliability of a regression model's predictions can be assessed through regression metrics like Root Mean Squared Error (RMSE), R-Squared, Mean Absolute Error (MAE), etc. These metrics provide quantitative measures of how well the model's predictions align with the actual values, considering both the direction and magnitude of errors.
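
A short sketch computing these metrics with scikit-learn; the dataset, model, and train/test split are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pred = LinearRegression().fit(X_tr, y_tr).predict(X_te)
print("RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
print("R^2: ", r2_score(y_te, pred))
print("MAE: ", mean_absolute_error(y_te, pred))
```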

Differentiate between feature selection and feature extraction in the context of dimensionality reduction.

  • Both are the same
  • Depends on the data
  • Feature selection picks, extraction transforms
  • Feature selection transforms, extraction picks
Feature selection involves picking a subset of the original features, whereas feature extraction involves transforming the original features into a new set. Feature extraction usually leads to new features that are combinations of the original ones, while feature selection maintains the original features but reduces their number.
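
A compact sketch contrasting the two on the same dataset; the Iris data and the choice of k=2 features / 2 components are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Selection: keep the 2 original features most associated with y
X_sel = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Extraction: build 2 new features as linear combinations of all 4
X_ext = PCA(n_components=2).fit_transform(X)

print(X_sel.shape, X_ext.shape)  # same shape, different feature meaning
```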