Scenario: A large organization is facing challenges in ensuring data consistency across departments. How can a data governance framework help in addressing this issue?
- By conducting regular data audits and implementing access controls to enforce data integrity.
- By defining standardized data definitions and establishing data stewardship roles to oversee data quality and consistency.
- By deploying real-time data synchronization solutions to maintain consistency across distributed systems.
- By implementing data encryption techniques to prevent unauthorized access and ensure data security.
A data governance framework helps address cross-departmental consistency by defining standardized data definitions, formats, and structures, and by establishing policies and procedures so that data is interpreted and used the same way throughout the organization. Assigning data stewardship roles and responsibilities adds ongoing oversight of data quality, keeping data accurate, complete, and reliable across departments.
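As a minimal sketch of what a "standardized data definition" can look like in practice (the field names and rules here are hypothetical, not tied to any particular governance tool), departments can validate their records against a shared data dictionary:

```python
# Hypothetical shared data dictionary: one standard definition of customer fields
# that every department validates against.
from datetime import date

CUSTOMER_DICTIONARY = {
    "customer_id": {"type": str, "required": True},   # canonical ID, never reused
    "signup_date": {"type": date, "required": True},  # a real date, not free text
    "region_code": {"type": str, "required": False},  # code list agreed by data stewards
}

def validate_record(record: dict) -> list[str]:
    """Return a list of violations of the standardized definitions."""
    issues = []
    for field, rule in CUSTOMER_DICTIONARY.items():
        if field not in record:
            if rule["required"]:
                issues.append(f"missing required field: {field}")
            continue
        if not isinstance(record[field], rule["type"]):
            issues.append(f"{field} has wrong type: expected {rule['type'].__name__}")
    return issues

print(validate_record({"customer_id": "C-001", "signup_date": "2024-01-01"}))
# -> ['signup_date has wrong type: expected date']
```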
How does Kafka ensure fault tolerance and high availability?
- Enforcing strict data retention policies
- Implementing strict message ordering
- Increasing network bandwidth
- Replication of data across multiple brokers
Kafka ensures fault tolerance and high availability by replicating each topic partition across multiple brokers. If a broker fails, one of the surviving replicas is promoted to partition leader, so the data remains available and producers and consumers can continue working without interruption or data loss.
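A small sketch of how replication is configured, assuming the confluent-kafka Python client and a cluster with at least three brokers reachable at localhost:9092 (the topic name and sizing are illustrative):

```python
# Create a topic whose partitions are each replicated to three brokers.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topic = NewTopic("orders", num_partitions=6, replication_factor=3)
futures = admin.create_topics([topic])

for name, future in futures.items():
    try:
        future.result()  # raises if topic creation failed
        print(f"created topic {name}")
    except Exception as exc:
        print(f"failed to create {name}: {exc}")
```

In practice, a replication factor of 3 is commonly paired with `min.insync.replicas=2` on the topic and `acks=all` on producers, so writes are acknowledged only after reaching a quorum of replicas.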
How does Data Lake architecture facilitate data exploration and analysis?
- Centralized data storage, Schema-on-read approach, Scalability, Flexibility
- Data duplication, Data redundancy, Data isolation, Data normalization
- Schema-on-write approach, Predefined schemas, Data silos, Tight integration with BI tools
- Transactional processing, ACID compliance, Real-time analytics, High consistency
Data Lake architecture facilitates data exploration and analysis through centralized storage, a schema-on-read approach, scalability, and flexibility. This allows users to analyze diverse data sets without predefined schemas, promoting agility and innovation.
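A brief sketch of the schema-on-read idea, assuming PySpark is available; the lake path and column names are hypothetical:

```python
# Raw JSON files sit in the lake as-is; a schema is applied only when the data
# is read for analysis, not when it is written.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("lake-exploration").getOrCreate()

# No schema was enforced at write time; it is inferred (or supplied) at read time.
events = spark.read.json("s3a://example-lake/raw/clickstream/")

daily_counts = (
    events
    .withColumn("day", F.to_date("event_time"))
    .groupBy("day", "event_type")
    .count()
)
daily_counts.show()
```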
Which storage solution in the Hadoop ecosystem is designed for handling small files and is used as a complementary storage layer alongside HDFS? ________
- HBase
- Hadoop Archives (HAR)
- Hive
- Kudu
Hadoop Archives (HAR) are designed to address the small-files problem in the Hadoop ecosystem. Storing many small files directly in HDFS puts pressure on the NameNode, which must track metadata for every file and block; a HAR packs many small files into a larger archive layered on top of HDFS, acting as a complementary storage layer while the archived files remain accessible through the har:// filesystem scheme. (Kudu, by contrast, is a columnar storage engine complementary to HDFS that targets fast random reads, writes, and analytical scans, not the small-files problem.)
Scenario: You are tasked with designing a real-time analytics application using Apache Flink. Which feature of Apache Flink would you utilize for exactly-once processing semantics?
- Checkpointing
- Savepoints
- State TTL (Time-To-Live)
- Watermarking
Checkpointing is the Apache Flink feature used to achieve exactly-once processing semantics. Checkpoints capture a consistent snapshot of the application's state at regular intervals, so after a failure Flink can restore the latest checkpoint and replay the source from that point, applying each record's effect on state exactly once. End-to-end exactly-once delivery additionally requires replayable sources and transactional or idempotent sinks.
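A minimal sketch of enabling this in a PyFlink job, assuming the apache-flink (PyFlink) package is installed; the interval is illustrative:

```python
# Enable periodic checkpoints with exactly-once state semantics.
from pyflink.datastream import StreamExecutionEnvironment, CheckpointingMode

env = StreamExecutionEnvironment.get_execution_environment()

# Snapshot all operator state every 30 seconds; on failure, Flink restores the
# latest checkpoint and replays the source from that point.
env.enable_checkpointing(30_000, CheckpointingMode.EXACTLY_ONCE)
```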
Which of the following is NOT an authentication factor?
- Something you are
- Something you have
- Something you know
- Something you need
The concept of authentication factors revolves around verifying the identity of a user before granting access to resources. "Something you need" does not align with the typical authentication factors. The correct factors are: something you know (like a password), something you have (like a security token or smart card), and something you are (biometric identifiers such as fingerprints or facial recognition).
________ is a principle of data protection that requires organizations to limit access to sensitive data only to authorized users.
- Data anonymization
- Data confidentiality
- Data minimization
- Data segregation
The correct answer is Data confidentiality. Data confidentiality is the data protection principle that restricts access to sensitive information to authorized users only. It is enforced through measures such as encryption, access controls, and authentication. Maintaining confidentiality protects sensitive data from unauthorized access and disclosure, reduces the risk of breaches and privacy violations, and supports compliance with regulatory requirements.
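A toy sketch of the access-control side of confidentiality, with hypothetical roles and dataset names (a real deployment would rely on the platform's IAM, plus encryption at rest and in transit):

```python
# Only users whose role is authorized for a dataset may read it.
AUTHORIZED_ROLES = {
    "payroll_records": {"hr_admin"},
    "customer_pii": {"hr_admin", "privacy_officer"},
}

def read_dataset(dataset: str, user_role: str) -> str:
    if user_role not in AUTHORIZED_ROLES.get(dataset, set()):
        raise PermissionError(f"role '{user_role}' may not read '{dataset}'")
    return f"contents of {dataset}"

print(read_dataset("customer_pii", "privacy_officer"))  # allowed
# read_dataset("payroll_records", "analyst")            # would raise PermissionError
```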
What role does data profiling play in the data extraction phase of a data pipeline?
- Encrypting sensitive data
- Identifying patterns, anomalies, and data quality issues
- Loading data into the target system
- Transforming data into a standardized format
Data profiling in the data extraction phase involves analyzing the structure and quality of the data to identify patterns, anomalies, and issues, which helps in making informed decisions during the data pipeline process.
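A lightweight example of what such profiling can look like with pandas; the file name, columns, and business range are hypothetical:

```python
# Profile newly extracted data to surface patterns, anomalies, and quality
# issues before transformation.
import pandas as pd

df = pd.read_csv("extracted_orders.csv")

profile = {
    "row_count": len(df),
    "null_counts": df.isna().sum().to_dict(),        # completeness
    "duplicate_rows": int(df.duplicated().sum()),    # uniqueness
    "dtypes": df.dtypes.astype(str).to_dict(),       # structural expectations
}
print(profile)

# Simple anomaly check: order amounts outside an expected business range.
out_of_range = df[(df["order_amount"] <= 0) | (df["order_amount"] > 100_000)]
print(f"{len(out_of_range)} rows with suspicious order_amount values")
```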
What is the significance of consistency in data quality metrics?
- It ensures that data is uniform and coherent across different sources and applications
- It focuses on the timeliness of data updates
- It measures the completeness of data within a dataset
- It validates the accuracy of data through manual verification
Consistency in data quality metrics refers to the uniformity and coherence of data across various sources, systems, and applications. It ensures that data elements have the same meaning and format wherever they are used, reducing the risk of discrepancies and errors in data analysis and reporting. Consistent data facilitates interoperability, data integration, and reliable decision-making processes within organizations.
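One way to quantify this, sketched with pandas and hypothetical file and column names, is to compare a shared field between two sources and report the share of agreeing records:

```python
# Check that every customer's country code matches between the CRM and billing extracts.
import pandas as pd

crm = pd.read_csv("crm_customers.csv")          # columns: customer_id, country_code
billing = pd.read_csv("billing_customers.csv")  # columns: customer_id, country_code

merged = crm.merge(billing, on="customer_id", suffixes=("_crm", "_billing"))
mismatches = merged[merged["country_code_crm"] != merged["country_code_billing"]]

consistency_rate = 1 - len(mismatches) / len(merged) if len(merged) else 1.0
print(f"country_code consistency: {consistency_rate:.1%} ({len(mismatches)} mismatches)")
```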
________ is a common technique used in monitoring data pipelines to identify patterns indicative of potential failures.
- Anomaly detection
- Data encryption
- Data masking
- Data replication
Anomaly detection is a prevalent technique used in monitoring data pipelines to identify unusual patterns or deviations from expected behavior. By analyzing metrics such as throughput, latency, error rates, and data quality, anomaly detection algorithms can flag potential issues such as system failures, data corruption, or performance degradation, allowing data engineers to take proactive measures to mitigate them.
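A minimal sketch of one such technique, a rolling z-score over a throughput metric; the series below is synthetic, whereas a real setup would read per-minute counts from a monitoring store:

```python
# Flag anomalous pipeline throughput using a rolling z-score.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
throughput = pd.Series(rng.normal(10_000, 500, 120))  # rows/minute over 2 hours
throughput.iloc[100] = 1_200                          # simulated failure: a sudden drop

rolling_mean = throughput.rolling(30).mean()
rolling_std = throughput.rolling(30).std()
z_scores = (throughput - rolling_mean) / rolling_std

anomalies = throughput[z_scores.abs() > 3]
print(anomalies)  # the drop at index 100 should be flagged
```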
________ is a data extraction technique that involves querying data from web pages and web APIs.
- Data Wrangling
- ETL (Extract, Transform, Load)
- Streaming
- Web Scraping
Web Scraping is a data extraction technique that involves querying data from web pages and web APIs. It allows for automated retrieval of data from various online sources for further processing and analysis.
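A short sketch using requests and BeautifulSoup; the URLs are placeholders, and real scraping should respect robots.txt and each site's terms of service:

```python
# Extract data from a web page and from a web API.
import requests
from bs4 import BeautifulSoup

# Web page: parse the HTML and pull out the text of every <h2> heading.
page = requests.get("https://example.com/products", timeout=10)
soup = BeautifulSoup(page.text, "html.parser")
headings = [h2.get_text(strip=True) for h2 in soup.find_all("h2")]
print(headings)

# Web API: structured data is usually returned directly as JSON.
api = requests.get("https://example.com/api/products", params={"page": 1}, timeout=10)
products = api.json()
print(len(products))
```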
How do data modeling tools like ERWin or Visio support reverse engineering in the context of existing databases?
- Data lineage tracking, Data migration, Data validation, Data cleansing
- Data profiling, Data masking, Data transformation, Data visualization
- Importing database schemas, Generating entity-relationship diagrams, Metadata extraction, Schema synchronization
- Schema comparison, Code generation, Query execution, Database optimization
Data modeling tools like ERWin or Visio support reverse engineering by enabling tasks such as importing existing database schemas, generating entity-relationship diagrams, extracting metadata, and synchronizing the schema with changes made in the tool.
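ERWin and Visio perform this reverse engineering through their database import wizards; as a rough programmatic stand-in for the same idea (not how those tools work internally), SQLAlchemy's inspector can extract comparable schema metadata. The connection string is hypothetical and a suitable database driver would be needed:

```python
# Reverse engineer an existing database by reading its schema metadata.
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://user:pass@localhost:5432/sales")
inspector = inspect(engine)

for table in inspector.get_table_names():
    print(f"Table: {table}")
    for column in inspector.get_columns(table):
        print(f"  {column['name']}: {column['type']}")
    for fk in inspector.get_foreign_keys(table):
        print(f"  FK -> {fk['referred_table']} ({fk['referred_columns']})")
```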