Explain how a Decision Tree works in the context of Machine Learning.

  • Based on complexity, combines data at each node
  • Based on distance, groups data at each node
  • Based on entropy, splits data at each node
  • Based on gradient, organizes data at each node
A Decision Tree works by splitting the data into subsets based on feature values. This is done recursively at each node by selecting the feature that provides the best split according to a metric such as entropy (information gain) or Gini impurity. The process continues until a stopping criterion is met (for example, a maximum depth or a minimum number of samples per leaf), producing a tree-like structure of decision rules.
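
As a rough illustration (assuming scikit-learn is available), a tree can be fit so that each node's split is chosen by entropy reduction:

```python
# Minimal sketch: fit a tree whose nodes split by entropy (information gain).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# criterion="entropy" chooses, at each node, the feature and threshold that
# most reduce entropy; criterion="gini" would use Gini impurity instead.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

# Print the recursive splits as a readable tree structure.
print(export_text(tree))
```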

How can you tune hyperparameters in SVM to prevent overfitting?

  • Changing the color of hyperplane
  • Increasing data size
  • Reducing feature dimensions
  • Using appropriate kernel and regularization
Tuning hyperparameters such as the choice of kernel and the regularization strength C (and, for RBF kernels, gamma) controls model complexity and helps prevent overfitting in SVM.
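
A hedged sketch (assuming scikit-learn; the grid values are only illustrative) of searching over the kernel and regularization strength C:

```python
# Grid-search the SVM's kernel and regularization strength with cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

pipe = make_pipeline(StandardScaler(), SVC())
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],            # smaller C = stronger regularization
    "svc__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```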

How does adding regularization help in avoiding overfitting?

  • By adding noise to the training data
  • By fitting the model closely to the training data
  • By increasing model complexity
  • By reducing model complexity
Regularization helps in avoiding overfitting by "reducing model complexity." It adds a penalty to the loss function, constraining the weights and preventing the model from fitting too closely to the training data.
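
A small sketch (assuming scikit-learn) of the effect of an L2 penalty: Ridge adds a term proportional to the squared weights to the least-squares loss, which shrinks the coefficients compared with plain linear regression.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))            # more features than the signal really needs
y = X[:, 0] + 0.1 * rng.normal(size=50)  # only the first feature matters

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)      # larger alpha = stronger penalty

# The penalized model has visibly smaller (less complex) weights.
print(np.abs(ols.coef_).sum(), np.abs(ridge.coef_).sum())
```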

The ___________ regression technique can be used when the relationship between the independent and dependent variables is not linear.

  • L1 Regularization
  • Logistic
  • Polynomial
  • Simple Linear
Polynomial Regression can model non-linear relationships between independent and dependent variables by transforming the predictors into a polynomial form, allowing for more complex fits.
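
A minimal sketch (assuming scikit-learn): the predictor is expanded into polynomial terms and then fit with an ordinary linear model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + rng.normal(scale=0.3, size=100)  # quadratic relationship

# degree=2 adds x and x^2 columns, letting a linear model capture the curve.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R^2 on the training data
```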

You reduced the complexity of your model to prevent overfitting, but it led to underfitting. How would you find a balance between complexity and fit?

  • Add regularization
  • All of the above
  • Increase dataset size
  • Try cross-validation
Finding a balance typically involves using cross-validation to systematically identify the level of complexity that fits the training folds well while still generalizing to the validation folds. This lets you select hyperparameters without ever touching the test data.
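
As a sketch (assuming scikit-learn; the depth values are illustrative), cross-validation can compare complexity levels, here tree depth, and pick the one with the best validation score:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

for depth in [1, 2, 4, 8, None]:   # None = grow the tree fully
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(model, X, y, cv=5)
    print(depth, scores.mean())
# The depth with the best mean validation score balances fit and complexity
# without ever using the held-out test set.
```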

How does LDA maximize the separation between different classes in a dataset?

  • By maximizing between-class variance and minimizing within-class variance
  • By maximizing both within-class and between-class variance
  • By minimizing between-class variance and maximizing within-class variance
  • By minimizing both within-class and between-class variance
LDA maximizes the separation between different classes by "maximizing between-class variance and minimizing within-class variance." This process ensures that different classes are far apart, while data points within the same class are close together, resulting in better class separation.
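
A short sketch (assuming scikit-learn): LDA projects the data onto directions that spread the class means apart while keeping each class compact.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# With 3 classes, at most 2 discriminant components are available.
lda = LinearDiscriminantAnalysis(n_components=2)
X_proj = lda.fit_transform(X, y)

print(X_proj.shape)                   # (150, 2)
print(lda.explained_variance_ratio_)  # share of between-class variance per component
```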

_________ clustering builds a tree-like diagram called a dendrogram, allowing you to visualize the relationships between clusters.

  • DBSCAN
  • Hierarchical
  • K-Means
  • Spectral
Hierarchical clustering builds a dendrogram, which allows visualization of the relationships between clusters, showing how the clusters are connected.
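
A sketch (assuming SciPy and Matplotlib): build a linkage matrix from the agglomerative merges and draw the dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, size=(10, 2)),
               rng.normal(3, 0.5, size=(10, 2))])   # two loose groups

Z = linkage(X, method="ward")   # agglomerative merging with Ward's criterion
dendrogram(Z)
plt.xlabel("sample index")
plt.ylabel("merge distance")
plt.show()
```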

To detect multicollinearity in a dataset, one common method is to calculate the ___________ Inflation Factor (VIF).

  • Validation
  • Variable
  • Variance
  • Vector
The Variance Inflation Factor (VIF) is a measure used to detect multicollinearity. It quantifies how much the variance (and hence the standard error) of a regression coefficient is inflated due to that variable's correlation with the other predictors. A high VIF (commonly taken as above 5-10) indicates multicollinearity.
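
A sketch (assuming statsmodels and pandas) of computing VIF for each predictor; the near-duplicate column is only there to trigger a high VIF.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)    # nearly collinear with x1
x3 = rng.normal(size=100)
X = add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# x1 and x2 should show large VIFs; x3 should stay close to 1.
for i, col in enumerate(X.columns):
    if col == "const":
        continue
    print(col, variance_inflation_factor(X.values, i))
```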

The risk of overfitting can be increased if the same data is used for both _________ and _________ of the Machine Learning model.

  • evaluation, processing
  • training, testing
  • training, validation
  • validation, training
If the same data is used for both "training" and "testing," the model may perform well on that data yet poorly on unseen data, and the evaluation will hide rather than reveal the overfitting.
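
A minimal sketch (assuming scikit-learn): splitting the data keeps the test score as an honest estimate of generalization.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # often near-perfect
print("test accuracy:", model.score(X_test, y_test))     # the honest estimate
```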

You have applied PCA to your dataset and the first three principal components explain 95% of the variance. What does this signify, and how would you proceed?

  • This indicates an error in the PCA process
  • This means that 5% of the data is missing, so you should reapply PCA
  • This means that 95% of the variance is captured, so you may choose to proceed with these components
  • This means that the data is uniformly distributed and PCA is not needed
The first three principal components explaining 95% of the variance means that most of the original information is captured, and you may proceed with these components if the loss of 5% is acceptable.
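
A sketch (assuming scikit-learn): inspect the cumulative explained variance and keep only as many components as needed to reach a chosen threshold such as 95%.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

pca = PCA().fit(X_scaled)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_components = int(np.argmax(cumulative >= 0.95)) + 1   # first index reaching 95%
print(n_components, cumulative[n_components - 1])

# Re-fit keeping just those components for the downstream model.
X_reduced = PCA(n_components=n_components).fit_transform(X_scaled)
print(X_reduced.shape)
```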

In the context of DBSCAN, if two core points are within the Epsilon distance of each other, they are said to be __________.

  • directly density-reachable
  • indirectly density-reachable
  • separate
  • unrelated
In DBSCAN, if two core points are within the Epsilon distance of each other, they are said to be directly density-reachable. This relation links core points together and is what chains nearby points and their neighborhoods into a single cluster.
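
A sketch (assuming scikit-learn; eps and min_samples are illustrative): core points within Epsilon of each other end up in the same cluster because the density-reachability chain links them.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)
print(set(db.labels_))               # cluster ids; -1 marks noise points
print(db.core_sample_indices_[:10])  # indices of points identified as core points
```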

How can the 'k-distance graph' be used in selecting the optimal Epsilon for DBSCAN?

  • By calculating the average distance to k-nearest neighbors
  • By determining the distance between k centroids
  • By displaying k clusters' distances
  • By plotting the distance to the kth nearest neighbor of each point
The 'k-distance graph' can be used to select the optimal Epsilon by computing each point's distance to its kth nearest neighbor, plotting those distances in sorted order, and looking for an "elbow" or point of inflection. The distance at that elbow is a good estimate for Epsilon, balancing the density requirement without overly segmenting the data.
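
A sketch (assuming scikit-learn and Matplotlib): sort each point's distance to its kth nearest neighbor and read Epsilon off the elbow of the curve.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
k = 5   # typically set to the intended min_samples

# Query k+1 neighbors because each point's nearest neighbor is itself (distance 0);
# the last column is then the distance to the kth nearest *other* point.
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
distances, _ = nn.kneighbors(X)
k_distances = np.sort(distances[:, -1])

plt.plot(k_distances)
plt.xlabel("points sorted by k-distance")
plt.ylabel(f"distance to {k}th nearest neighbor")
plt.show()   # the 'elbow' of this curve suggests a good eps
```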