What type of learning algorithm utilizes labeled data to make predictions?

  • Reinforcement Learning
  • Semi-supervised Learning
  • Supervised Learning
  • Unsupervised Learning
Supervised Learning uses labeled data, where the output is known, to train the algorithm and make predictions.

The slope coefficient in Simple Linear Regression gives the _________ change in the dependent variable for a one-unit change in the independent variable.

  • Absolute
  • Constant
  • Incremental
  • Marginal
The slope coefficient in Simple Linear Regression gives the marginal change in the dependent variable for a one-unit change in the independent variable.

What is Accuracy in the context of classification metrics?

  • False Positives / Total predictions
  • Total correct predictions / Total predictions
  • True Negatives / (True Negatives + False Positives)
  • True Positives / (True Positives + False Negatives)
Accuracy is the ratio of correct predictions to the total number of predictions. It gives an overall measure of how well the model is performing, but may not be suitable for imbalanced datasets where one class dominates.

You are facing an overfitting problem in a linear model. How would you use Ridge, Lasso, or ElasticNet to address this issue?

  • Decrease regularization strength
  • Increase regularization strength
  • Remove all regularization
  • nan
Increasing the regularization strength can help to prevent overfitting by constraining the model complexity and reducing variance.

Can you list some applications of Machine Learning?

  • Finance, Cooking
  • Games, Cooking
  • Games, Healthcare
  • Healthcare, Finance, Marketing
Machine Learning is applied in various domains such as healthcare (for predicting diseases, personalizing treatments), finance (for fraud detection, risk management), marketing (for customer segmentation, targeted advertising), and more. Its versatility has made it an essential tool in modern technology.

When multicollinearity is present in a dataset, it can make the coefficients of the variables ___________ and hard to interpret.

  • insignificant
  • reliable
  • stable
  • unstable
Multicollinearity can make the coefficients of the variables unstable and sensitive to small changes in the data. This makes the interpretation of individual coefficients unreliable and the model difficult to interpret.

A core point in DBSCAN is a point that has at least MinPts within _________ distance from itself.

  • Epsilon
  • border point
  • cluster
  • noise point
A core point in DBSCAN has at least MinPts within the Epsilon distance from itself. The Epsilon value defines the radius of the neighborhood around the point, and if there are enough points (MinPts or more) within this radius, the point is considered a core point.

You've developed a Polynomial Regression model with a high-degree polynomial, and it's performing exceptionally well on the training data but poorly on the test data. What might be the issue, and how would you address it?

  • Add more features
  • Increase the degree
  • Reduce the degree or apply regularization
  • Use a different algorithm entirely
The issue likely is overfitting due to the high-degree polynomial. Reducing the degree or applying regularization techniques like Ridge or Lasso can help to reduce the model's complexity and improve generalization to unseen data.

In classification, when a model is biased toward predicting one class over another, it is known as a(n) ________ problem.

  • Clustering
  • Imbalanced classification
  • Multiclass classification
  • Overfitting
When a model consistently predicts one class over another, particularly when the classes are not equally represented, this is known as an imbalanced classification problem.

You notice that a Decision Tree is providing inconsistent results on different runs. How might you investigate and correct the underlying issue, possibly involving entropy, Gini Index, or pruning techniques?

  • Analyze the randomness in splitting and apply consistent pruning techniques
  • Change to a different algorithm
  • Ignore inconsistent results
  • Increase tree depth
Inconsistent results may stem from the randomness in splitting the data. Analyzing this aspect and applying consistent pruning techniques can help create more stable, reproducible results. Attention to the splitting criteria, such as entropy or Gini Index, can further refine the model's behavior.