How do interpretability and explainability vary between AI, Machine Learning, and Deep Learning?
- AI and Deep Learning are equally interpretable, Machine Learning is least
- AI is least interpretable, Machine Learning and Deep Learning are equally interpretable
- AI is most interpretable, Machine Learning is moderate, Deep Learning is least
- Machine Learning is most interpretable, AI is moderate, Deep Learning is least
Generally, interpretability varies widely across AI techniques; traditional Machine Learning models tend to be more interpretable, while Deep Learning models are usually the least interpretable because of their many layers and parameters.
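As a minimal sketch of this contrast (assuming scikit-learn and its bundled iris dataset), a shallow decision tree exposes human-readable feature importances in a way a deep network does not:

```python
# Sketch: a classic "interpretable" ML model, assuming scikit-learn is available.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

# A shallow decision tree's rules and feature importances can be read off directly.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
for name, importance in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.2f}")
# A deep neural network trained on the same data would offer no such
# direct, human-readable summary of its decision process.
```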
Can you explain what separates support vectors from other data points in SVM?
- Cluster membership
- Color
- Distance to hyperplane
- Size
Support vectors are the training points that lie closest to the separating hyperplane (on or inside the margin); they alone determine the hyperplane's position and orientation in SVM.
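A minimal sketch, assuming scikit-learn and a synthetic two-class dataset, showing how the fitted support vectors and each point's distance to the hyperplane can be inspected:

```python
# Sketch: fit a linear SVM and inspect which points became support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# decision_function gives each point's signed distance to the hyperplane
# (after dividing by the weight-vector norm); support vectors are the closest points.
distances = clf.decision_function(X) / np.linalg.norm(clf.coef_)
print("support vectors per class:", clf.n_support_)
print("smallest |distance| to hyperplane:", np.abs(distances).min())
print("first support vectors:\n", clf.support_vectors_[:3])
```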
How do training techniques differ between traditional Machine Learning and Deep Learning?
- Deep Learning relies on neural networks with multiple layers
- Deep Learning uses fewer hyperparameters
- Machine Learning requires more data
- Machine Learning uses more complex algorithms
Deep Learning trains neural networks with multiple layers using gradient-based techniques such as backpropagation, while traditional Machine Learning draws on a broader range of algorithms (e.g., decision trees, SVMs), each with its own training procedure.
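A small illustrative comparison, assuming scikit-learn (its `MLPClassifier` is a simple backpropagation-trained network standing in for Deep Learning; the dataset and hyperparameters are arbitrary choices):

```python
# Sketch contrasting training approaches, assuming scikit-learn is available.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Traditional ML: a tree ensemble trained by greedy splitting, no gradients.
rf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Deep-learning-style: a multi-layer network trained with backpropagation.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                    random_state=0).fit(X_train, y_train)

print("random forest accuracy:", rf.score(X_test, y_test))
print("neural network accuracy:", mlp.score(X_test, y_test))
```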
How does the Kernel Trick transform the input data?
- Compresses data
- Maps data into higher-dimensional space
- Normalizes data
- Reduces dimensionality
The Kernel Trick implicitly maps the input data into a higher-dimensional feature space where a linear separator can exist, without ever computing the mapping explicitly; the SVM only needs the inner products (kernel values) between points.
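A minimal sketch, assuming scikit-learn: on concentric circles, which no straight line can separate, the RBF kernel's implicit mapping makes linear separation possible in the higher-dimensional space:

```python
# Sketch of the Kernel Trick: data that is not linearly separable in the
# original space becomes separable after an (implicit) RBF mapping.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # no mapping: struggles on circles
rbf = SVC(kernel="rbf").fit(X, y)        # implicit high-dimensional mapping

print("linear kernel accuracy:", linear.score(X, y))
print("RBF kernel accuracy:", rbf.score(X, y))
```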
The _________ is a single summary value that illustrates the ability of a classification model to discriminate between positive and negative classes.
- AUC
- F1-Score
- Precision
- Recall
The AUC (Area Under the ROC Curve) provides a single summary value representing the model's ability to differentiate between positive and negative classes, aggregated across all classification thresholds.
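A minimal sketch of computing AUC, assuming scikit-learn and a synthetic binary dataset:

```python
# Sketch: compute AUC from predicted probabilities with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# AUC summarizes discrimination across all possible thresholds.
print("AUC:", roc_auc_score(y_test, scores))
```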
How would you handle a multi-class classification problem with LDA, and what is the importance of maximizing between-class variance in this context?
- Apply LDA separately to each class; no importance of between-class variance
- Apply LDA to all classes; maximize between-class variance for class separability
- Ignore between-class variance and focus on within-class variance
- Use another method entirely
For a multi-class classification problem, you apply LDA to all classes jointly and maximize between-class variance. Pushing the class means apart relative to the within-class scatter is what makes the classes separable from each other, which improves classification performance.
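A minimal sketch, assuming scikit-learn and the iris dataset (three classes):

```python
# Sketch: LDA applied jointly to all three iris classes.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# LDA finds projections that maximize between-class variance relative to
# within-class variance; with 3 classes it yields at most 2 components.
lda = LinearDiscriminantAnalysis(n_components=2)
X_proj = lda.fit_transform(X, y)

print("projected shape:", X_proj.shape)
print("classification accuracy:", lda.score(X, y))
```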
How do multi-class classification problems differ from binary classification problems?
- Multi-class has more features
- Multi-class has multiple classes; binary has two
- Multi-class has two classes; binary has multiple
- No difference
Binary classification deals with two classes, while multi-class classification deals with more than two. Multi-class problems can be more complex: some algorithms handle them natively, while others must be adapted through strategies such as one-vs-rest or one-vs-one.
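One common adaptation is the one-vs-rest strategy, which trains one binary classifier per class; a minimal sketch assuming scikit-learn:

```python
# Sketch: adapting a binary classifier to a multi-class problem
# with the one-vs-rest strategy.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(LinearSVC(max_iter=10000, random_state=0)).fit(X, y)
print("underlying binary classifiers:", len(ovr.estimators_))  # one per class
print("accuracy:", ovr.score(X, y))
```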
How does the average linkage method differ from single and complete linkage in Hierarchical Clustering?
- Uses the mean of all distances between pairs in clusters
- Uses the median of all distances between pairs in clusters
- Uses the mode of all distances between pairs in clusters
- Uses the total of all distances between pairs in clusters
The average linkage method calculates the mean of all pairwise distances between the points in the clusters to determine the linkage. Single linkage uses the minimum distance, while complete linkage uses the maximum distance. Average linkage typically results in more balanced clusters, as it considers the overall distribution of distances.
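A minimal sketch, assuming scikit-learn and synthetic blob data, comparing the three linkage criteria:

```python
# Sketch: compare linkage criteria in hierarchical (agglomerative) clustering.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

for linkage in ("single", "complete", "average"):
    # single: minimum pairwise distance between clusters; complete: maximum;
    # average: mean of all pairwise distances between clusters.
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage).fit(X)
    sizes = [list(model.labels_).count(c) for c in range(3)]
    print(linkage, "cluster sizes:", sizes)
```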
The regularization parameter 'C' in SVM controls the trade-off between maximizing the margin and minimizing the _________.
- Kernel size
- Margin
- Misclassification
- Variance
The 'C' parameter controls the trade-off between maximizing the margin and minimizing misclassification: a small C tolerates more misclassified points in exchange for a wider margin, while a large C penalizes misclassification heavily, producing a narrower margin.
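A minimal sketch of this trade-off, assuming scikit-learn (the C values are arbitrary illustrations), using the number of support vectors as a proxy for margin width:

```python
# Sketch: the effect of C on the margin, seen via the number of support vectors.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Small C -> wide, soft margin -> more points on or inside it (more SVs);
    # large C -> narrow, hard margin -> fewer support vectors.
    print(f"C={C}: support vectors = {clf.n_support_.sum()}")
```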
Explain the process of selecting the number of principal components in PCA.
- By choosing an arbitrary number
- By selecting all eigenvectors
- By using only the first eigenvector
- By using the elbow method and the cumulative explained variance
The number of principal components in PCA can be selected by considering the cumulative explained variance and looking for an "elbow" in the plot, where adding more components does not significantly increase the explained variance.
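A minimal sketch, assuming scikit-learn and its digits dataset (the 95% threshold is an arbitrary illustrative cutoff):

```python
# Sketch: choose the number of components from cumulative explained variance.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Pick the smallest number of components reaching, say, 95% variance;
# the "elbow" is where this curve visibly flattens.
n_components = int(np.argmax(cumulative >= 0.95)) + 1
print("components for 95% variance:", n_components)
```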
Consider a situation where you're applying DBSCAN to a high-dimensional dataset. What challenges might you face, and how could you address them?
- All of the above
- Difficulty in visualizing; Reduce dimensionality
- High computational cost; Optimize the algorithm
- Risk of overfitting; Increase MinPts
High-dimensional data can present several challenges in clustering, including the risk of overfitting, difficulty in visualization, and high computational costs. Increasing MinPts can help prevent overfitting, while dimensionality reduction techniques like PCA can aid visualization. Optimizing the algorithm can help to reduce computational demands.
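A minimal sketch combining these remedies, assuming scikit-learn (the eps and min_samples values are illustrative and would need tuning on real data):

```python
# Sketch: reduce dimensionality with PCA before running DBSCAN.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, n_features=50, centers=4, random_state=0)

# Distances lose contrast in high dimensions, so project to a few components
# first; MinPts (min_samples) may also need to grow with dimensionality.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=5).fit_transform(X_scaled)
labels = DBSCAN(eps=1.5, min_samples=10).fit_predict(X_reduced)
print("clusters found (excluding noise):", len(set(labels) - {-1}))
```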
The ________ technique in classification helps in enhancing the model's ability to generalize by using different subsets of data during training.
- Clustering
- Cross-validation
- Feature extraction
- Overfitting
Cross-validation is a technique where the dataset is partitioned into different subsets (folds), and the model is trained and tested on different combinations of these folds. It helps in assessing the model's ability to generalize to unseen data.
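A minimal sketch of 5-fold cross-validation, assuming scikit-learn and the iris dataset:

```python
# Sketch: k-fold cross-validation with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the 5 folds serves once as the held-out test set.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```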