In what way does Machine Learning support the pharmaceutical industry in drug discovery and development?

  • Drug Discovery and Development
  • Image Recognition
  • Marketing Strategies
  • Supply Chain Management
Machine Learning supports the pharmaceutical industry by analyzing biological data to predict potential drug interactions, identifying promising compounds, enhancing drug design, and accelerating the overall drug discovery and development process.

Can you explain the concept of 'density reachability' in clustering?

  • Based on Hierarchical Structure
  • Based on Number of Clusters
  • Defines How Points Are Connected Through Density
  • Defines How Points Are Directly Connected
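Density reachability in clustering describes how points are connected through density: a point q is density-reachable from a point p if there is a chain of points from p to q in which each point lies within Epsilon of the previous one and every point in the chain, except possibly q, is a core point (it has at least MinPts neighbors within Epsilon).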
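
The definition can be sketched as a breadth-first search that only continues through core points. This is a minimal illustration, not a library implementation: the function names, the sample points, and the convention that a point counts as its own neighbor are all assumptions made for the example.

```python
from collections import deque
from math import dist


def neighbors(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def is_density_reachable(points, src, dst, eps, min_pts):
    """True if points[dst] is density-reachable from points[src].

    A chain must exist from src to dst in which each step stays within eps
    and every point except possibly the last is a core point.
    """
    seen = {src}
    queue = deque([src])
    while queue:
        i = queue.popleft()
        nbrs = neighbors(points, i, eps)
        if len(nbrs) < min_pts:
            continue  # not a core point, so the chain cannot pass through it
        for j in nbrs:
            if j == dst:
                return True
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return False


# Hypothetical data: four points in a row plus one far-away point.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
eps, min_pts = 1.1, 3

forward = is_density_reachable(points, 1, 3, eps, min_pts)    # core -> border
backward = is_density_reachable(points, 3, 1, eps, min_pts)   # border -> core
to_outlier = is_density_reachable(points, 1, 4, eps, min_pts)
print(forward, backward, to_outlier)
```

With these values the relation holds from index 1 to index 3 but not in the reverse direction, because index 3 is not a core point. Density reachability is therefore not symmetric in general.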

What is the primary goal of Machine Learning?

  • Data cleaning
  • Data prediction and generalization
  • Data storage
  • Data visualization
The primary goal of Machine Learning is to build models that generalize from training data, so that they make accurate predictions or decisions on new, unseen inputs.

You are working with a dataset containing many irrelevant features. Which regularization technique would you prefer and why?

  • ElasticNet
  • Lasso
  • Ridge
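Lasso is the preferred choice here. It adds an L1 penalty, which can shrink some coefficients to exactly zero and so effectively removes irrelevant features from the model. Ridge's L2 penalty, by contrast, shrinks coefficients toward zero without eliminating them.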
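
To see why the L1 penalty produces exact zeros, consider the special case of an orthonormal design matrix, where both penalties have a closed form in terms of the ordinary least-squares coefficient b. The sketch below assumes the objectives 1/2 * ||y - Xb||^2 + lam * ||b||_1 for Lasso and 1/2 * ||y - Xb||^2 + (lam / 2) * ||b||^2 for Ridge. The coefficient values and function names are made up for illustration.

```python
def soft_threshold(b, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # any coefficient with |b| <= lam is set exactly to zero


def ridge_shrink(b, lam):
    """Ridge solution for one coefficient under an orthonormal design."""
    return b / (1.0 + lam)  # scaled toward zero, but never exactly zero


# Hypothetical least-squares coefficients: two strong features, three weak ones.
ols = [3.0, -0.2, 0.05, -1.5, 0.1]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)
print("ridge:", [round(r, 4) for r in ridge])
```

The three weak coefficients become exactly zero under the Lasso rule, while the Ridge rule keeps all five nonzero. With a general (non-orthonormal) design the solution no longer has this closed form, but the same thresholding step appears inside coordinate-descent solvers.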

________ is a type of classification where there are more than two classes.

  • Binary classification
  • Imbalanced classification
  • Multiclass classification
  • Overfitting
Multiclass classification refers to classification problems in which there are more than two classes to predict. This contrasts with binary classification, which involves just two classes.

In what situations would RMSE be a more appropriate metric than MAE?

  • When larger errors are more critical to penalize
  • When smaller errors are more critical to penalize
  • When the model needs to be robust to outliers
  • When the model requires a metric in squared units
RMSE can be more appropriate than MAE when larger errors are more critical to penalize. Since RMSE squares the errors before averaging them, it gives more weight to larger errors compared to MAE. This characteristic of RMSE can be more suitable in applications where large deviations from the actual values are considered more detrimental than smaller ones.
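
A small numeric comparison makes the difference concrete. The two prediction sets below are invented for illustration and share the same total absolute error, spread evenly in one case and concentrated in a single large miss in the other.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: errors are squared before averaging."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


y_true = [10.0, 10.0, 10.0, 10.0]
even_errors = [12.0, 8.0, 12.0, 8.0]       # four errors of size 2
one_big_error = [10.0, 10.0, 10.0, 18.0]   # a single error of size 8

print(mae(y_true, even_errors), rmse(y_true, even_errors))      # 2.0 2.0
print(mae(y_true, one_big_error), rmse(y_true, one_big_error))  # 2.0 4.0
```

MAE cannot distinguish the two cases, while RMSE doubles for the prediction set that contains the single large miss.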

When using Bootstrapping for estimating the standard error of a statistic, the process involves repeatedly resampling the data ________ times.

  • infinite
  • k
  • multiple
  • n
When using Bootstrapping for estimating the standard error of a statistic, the process involves repeatedly resampling the data "n" times, where "n" here denotes the chosen number of bootstrap resamples (often written B elsewhere) rather than the sample size. Each resample is drawn with replacement and has the same size as the original data. The statistic is computed on every bootstrap sample, which gives an empirical distribution whose standard deviation estimates the standard error.
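
The procedure can be sketched with the standard library alone. The data values, the default of 2000 resamples, and the function name are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by bootstrapping."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size as the data, with replacement.
        resample = rng.choices(data, k=len(data))
        estimates.append(statistic(resample))
    # The spread of the bootstrap estimates approximates the standard error.
    return statistics.stdev(estimates)


# Hypothetical sample of ten measurements.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

se_mean = bootstrap_standard_error(data, statistics.mean)
se_median = bootstrap_standard_error(data, statistics.median)
print(se_mean, se_median)
```

For the mean, the result can be checked against the usual formula s / sqrt(len(data)). The bootstrap is most useful for statistics such as the median, where no simple formula is available.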

How would you optimize the hyperparameters in an SVM to achieve the best performance on a specific dataset?

  • Guess the hyperparameters
  • Optimize the kernel only
  • Use grid search or random search with cross-validation
  • Use only the default values
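Grid search or random search with cross-validation explores hyperparameter combinations systematically (for an SVM, typically the kernel, the regularization parameter C, and kernel parameters such as gamma) and selects the combination with the best cross-validated score.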
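
As one possible way to do this, the sketch below uses scikit-learn's GridSearchCV, assuming scikit-learn is installed. The Iris dataset, the candidate values in the grid, and the 5-fold setting are illustrative choices, not recommendations for any particular dataset.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Scale inside the pipeline so each CV fold is scaled using only its training part.
pipeline = make_pipeline(StandardScaler(), SVC())

# make_pipeline names the SVC step "svc", hence the "svc__" prefix.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1, 1],  # has no effect for the linear kernel
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

For larger search spaces, RandomizedSearchCV from the same module samples a fixed number of combinations instead of trying them all.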

The term _________ refers to a situation where a regression model fits the training data too closely, resulting in poor performance on new data.

  • Bias
  • Overfitting
  • Regularization
  • Underfitting
Overfitting refers to a situation where a regression model fits the training data too closely, capturing noise and resulting in poor performance on unseen data.

Is DBSCAN sensitive to the choice of Epsilon and MinPts? Why or why not?

  • No, they are auto-calculated parameters
  • No, they have minimal effect on the outcome
  • Yes, they define the shape of the clusters
  • Yes, they influence the density of clusters
DBSCAN is indeed sensitive to the choice of Epsilon and MinPts. Together these parameters define what counts as a dense region: Epsilon sets the radius of each point's neighborhood, and MinPts sets the minimum number of points that neighborhood must contain. If Epsilon is too small or MinPts too large, most points are labeled as noise; if Epsilon is too large, distinct clusters merge into one.
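
The sensitivity can be demonstrated with a deliberately small DBSCAN written in plain Python. This is a teaching sketch under stated assumptions (brute-force neighbor search, a point counts as its own neighbor, invented data), not a substitute for a library implementation.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def region(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = region(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(nbrs)
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reclaimed as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = region(j)
            if len(j_nbrs) >= min_pts:
                frontier.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len(set(labels) - {-1}), labels.count(-1)


# Hypothetical data: two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (10, 0)]

for eps, min_pts in [(1.5, 3), (0.5, 3), (8, 3), (1.5, 5)]:
    print(eps, min_pts, summarize(dbscan(points, eps, min_pts)))
```

On this data the setting (1.5, 3) recovers the two groups and flags the isolated point as noise, (0.5, 3) and (1.5, 5) label every point as noise, and (8, 3) merges everything into a single cluster.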