In PCA, if an Eigenvalue is close to zero, it indicates that the corresponding Eigenvector may ________.
- be a principal component
- be discarded
- be of high magnitude
- explain high variance
If an Eigenvalue in PCA is close to zero, the corresponding Eigenvector (principal direction) explains very little of the variance in the data and may therefore "be discarded". Dropping such components reduces dimensionality while retaining the essential information.
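The idea can be sketched with a toy eigendecomposition in plain NumPy (the synthetic data, the near-constant third column, and the 1e-3 variance threshold are all illustrative assumptions, not part of any standard recipe):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                 # two genuinely informative axes
X = np.column_stack([X, X[:, 0] * 1e-6])      # third axis carries almost no variance

# PCA via eigendecomposition of the covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, _ = np.linalg.eigh(cov)              # returned in ascending order
eigvals = eigvals[::-1]                       # largest first
explained = eigvals / eigvals.sum()

# Eigenvalues near zero explain almost no variance; those directions can be dropped.
keep = explained > 1e-3                       # illustrative threshold
print(keep)                                   # [ True  True False]
```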
A colleague is assessing a regression model using only the Adjusted R-Squared. What considerations or additional metrics might you suggest, and why?
- Include MAE; because it's less sensitive to outliers
- Include MSE; because it's the standard metric
- Include RMSE; because it's more interpretable
- Include both RMSE and MAE; for a more comprehensive assessment
While Adjusted R-Squared is useful, including both Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) provides a more comprehensive assessment. RMSE penalizes larger errors more heavily, while MAE is less sensitive to outliers and reflects the typical error magnitude. Together, they offer a more nuanced view of the model's performance.
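A minimal sketch of how the two metrics differ, using made-up predictions (pure NumPy, no particular library API assumed):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0, 4.0])   # illustrative targets
y_pred = np.array([2.5, 5.0, 4.0, 8.0, 4.5])   # illustrative predictions

errors = y_true - y_pred
mae = np.mean(np.abs(errors))                  # typical error; robust to outliers
rmse = np.sqrt(np.mean(errors ** 2))           # squares errors, so large misses dominate

print(mae)    # 0.7
print(rmse)   # ~0.866; RMSE >= MAE always, and the gap widens with large errors
```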
The ________ technique in classification helps in enhancing the model's ability to generalize by using different subsets of data during training.
- Clustering
- Cross-validation
- Feature extraction
- Overfitting
Cross-validation is a technique where the dataset is partitioned into different subsets (folds), and the model is trained and tested on different combinations of these folds. It helps in assessing the model's ability to generalize to unseen data.
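As a sketch, 5-fold cross-validation might look like this with scikit-learn (assuming scikit-learn is available; the iris dataset and logistic regression model are just placeholders):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold CV: train on four folds, score on the held-out fold, rotate five times.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())  # average generalization estimate across the folds
```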
Consider a situation where you're applying DBSCAN to a high-dimensional dataset. What challenges might you face, and how could you address them?
- All of the above
- Difficulty in visualizing; Reduce dimensionality
- High computational cost; Optimize the algorithm
- Risk of overfitting; Increase MinPts
High-dimensional data can present several challenges for DBSCAN: distance measures become less meaningful as dimensionality grows (the curse of dimensionality), there is a risk of overfitting to noise, visualization is difficult, and computational costs are high. Increasing MinPts can help prevent spurious clusters, dimensionality reduction techniques like PCA aid both distance quality and visualization, and optimizing the algorithm helps reduce computational demands.
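One possible way to combine these mitigations (scikit-learn assumed; the synthetic data and the eps/min_samples values are illustrative choices, not tuned recommendations):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two well-separated groups embedded in 50 dimensions.
X = np.vstack([rng.normal(0, 0.3, size=(50, 50)),
               rng.normal(5, 0.3, size=(50, 50))])

# Reduce dimensionality first so Euclidean distances stay meaningful.
X_red = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

# A higher min_samples makes core-point status harder to earn, curbing spurious clusters.
labels = DBSCAN(eps=1.0, min_samples=10).fit_predict(X_red)
print(len(set(labels) - {-1}))  # number of clusters found (noise is labeled -1)
```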
How does clustering differ from classification?
- Clustering and Classification are the same
- Clustering is supervised; Classification is unsupervised
- Clustering is unsupervised; Classification is supervised
- Clustering uses regression
Clustering is an unsupervised learning technique that groups similar data points, whereas Classification is a supervised learning technique that assigns predefined labels to instances.
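The contrast can be shown side by side (scikit-learn assumed; iris is just a convenient example dataset):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Clustering (unsupervised): only X is seen; the algorithm invents its own group ids.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Classification (supervised): the predefined labels y are part of training.
clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict(X)

print(sorted(set(clusters.tolist())))  # [0, 1, 2] -- arbitrary ids with no meaning
print(sorted(set(preds.tolist())))     # [0, 1, 2] -- these are the actual class labels
```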
What are the limitations of Deep Learning as compared to other Machine Learning techniques?
- Easier interpretability and requires more data
- More interpretable and less efficient
- Requires less data and is more complex
- Requires more data and is often less interpretable
Deep Learning typically requires more data for effective training and often results in models that are less interpretable compared to traditional Machine Learning models.
Can you explain the assumptions underlying linear regression?
- Independence of features, Normality of target variable, Linearity of relationship, Constant variance
- Normal distribution of errors, Linearity of relationship, Independence of residuals, Constant variance
- Normality of residuals, Constant variance, Independence of residuals, Linearity of relationship
- Normality of residuals, Linearity of relationship, Multicollinearity, Independence of features
Linear regression assumes that the relationship between the dependent and independent variables is linear, that errors are normally distributed, that residuals are independent, and that the variance of residuals is constant (homoscedasticity) across all levels of the independent variables. Violations of these assumptions can distort the model's estimates and their interpretation.
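Since the assumptions are stated on the residuals, a quick sanity check after fitting might look like this (a pure-NumPy sketch on synthetic data that satisfies the assumptions by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=200)  # linear, constant-variance noise

# Fit ordinary least squares and inspect the residuals.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# The mean residual is ~0 by construction of OLS; the spread should be roughly
# constant across x (homoscedasticity) and bell-shaped (normality) -- in practice
# one would confirm this with residual plots or a formal normality test.
print(abs(resid.mean()) < 1e-8)   # True
print(0.8 < resid.std() < 1.2)    # True: close to the noise scale of 1.0
```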
How does Polynomial Regression differ from Simple Linear Regression?
- It fits a polynomial curve
- It fits a straight line
- It is used only for classification
- It uses more variables
While Simple Linear Regression fits a straight line to the data, Polynomial Regression fits a polynomial curve, allowing for more flexibility in modeling non-linear relationships.
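A small NumPy sketch of the difference on deliberately curved data (the quadratic data and the degree choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
y = x ** 2 + rng.normal(0, 0.1, size=100)   # clearly non-linear relationship

# Degree 1 fits a straight line; degree 2 fits a polynomial curve.
line = np.polyval(np.polyfit(x, y, 1), x)
curve = np.polyval(np.polyfit(x, y, 2), x)

sse_line = np.sum((y - line) ** 2)
sse_curve = np.sum((y - curve) ** 2)
print(sse_curve < sse_line)  # True: the curve captures curvature the line cannot
```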
Autonomous vehicles rely on Machine Learning algorithms for tasks like ____________ and ____________.
- Disease Prediction, Weather Forecasting
- Object Detection, Path Planning
- Risk Management, Drug Development
- Text Classification, Fraud Detection
Autonomous vehicles use Machine Learning for Object Detection and Path Planning, recognizing obstacles and determining optimal routes.
In what scenario would the AUC be a more informative metric than simply using Accuracy?
- When the class distribution is balanced
- When the class distribution is imbalanced
- When the model has only one class
The AUC (Area Under the ROC Curve) can be more informative than Accuracy when dealing with an imbalanced class distribution. It measures the model's ability to rank positive instances above negative ones across all classification thresholds, whereas Accuracy can be inflated by a trivial model that always predicts the majority class.
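The classic failure case can be made concrete (scikit-learn metrics assumed; the 95/5 split and the always-negative model are deliberately extreme):

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Heavily imbalanced labels: 95 negatives, 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A degenerate model that always predicts the majority class,
# with constant scores (no ability to rank positives above negatives).
y_pred = np.zeros(100, dtype=int)
scores = np.zeros(100)

acc = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, scores)
print(acc)  # 0.95 -- looks excellent despite learning nothing
print(auc)  # 0.5  -- reveals chance-level discrimination
```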