How do interpretability and explainability vary between AI, Machine Learning, and Deep Learning?
- AI and Deep Learning are equally interpretable, Machine Learning is least
- AI is least interpretable, Machine Learning and Deep Learning are equally interpretable
- AI is most interpretable, Machine Learning is moderate, Deep Learning is least
- Machine Learning is most interpretable, AI is moderate, Deep Learning is least
Generally, interpretability varies widely across AI techniques; traditional Machine Learning models tend to be more interpretable, while Deep Learning models are usually the least interpretable because of their many layers and parameters.
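As a minimal sketch of this contrast (assuming scikit-learn and its bundled iris dataset), a shallow decision tree exposes human-readable feature importances in a way a deep network does not:

```python
# Sketch: a classic "interpretable" ML model, assuming scikit-learn is available.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

# A shallow decision tree's rules and feature importances can be read off directly.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
for name, importance in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.2f}")
# A deep neural network trained on the same data would offer no such
# direct, human-readable summary of its decision process.
```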
Can you explain what separates support vectors from other data points in SVM?
- Cluster membership
- Color
- Distance to hyperplane
- Size
Support vectors are the training points that lie closest to the separating hyperplane (on or inside the margin); they alone determine the hyperplane's position and orientation in SVM.
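A minimal sketch, assuming scikit-learn and a synthetic two-class dataset, showing how the fitted support vectors and each point's distance to the hyperplane can be inspected:

```python
# Sketch: fit a linear SVM and inspect which points became support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# decision_function gives each point's signed distance to the hyperplane
# (after dividing by the weight-vector norm); support vectors are the closest points.
distances = clf.decision_function(X) / np.linalg.norm(clf.coef_)
print("support vectors per class:", clf.n_support_)
print("smallest |distance| to hyperplane:", np.abs(distances).min())
print("first support vectors:\n", clf.support_vectors_[:3])
```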
How do training techniques differ between traditional Machine Learning and Deep Learning?
- Deep Learning relies on neural networks with multiple layers
- Deep Learning uses fewer hyperparameters
- Machine Learning requires more data
- Machine Learning uses more complex algorithms
Deep Learning trains neural networks with multiple layers using gradient-based techniques such as backpropagation, while traditional Machine Learning draws on a broader range of algorithms (e.g., decision trees, SVMs), each with its own training procedure.
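A small illustrative comparison, assuming scikit-learn (its `MLPClassifier` is a simple backpropagation-trained network standing in for Deep Learning; the dataset and hyperparameters are arbitrary choices):

```python
# Sketch contrasting training approaches, assuming scikit-learn is available.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Traditional ML: a tree ensemble trained by greedy splitting, no gradients.
rf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Deep-learning-style: a multi-layer network trained with backpropagation.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                    random_state=0).fit(X_train, y_train)

print("random forest accuracy:", rf.score(X_test, y_test))
print("neural network accuracy:", mlp.score(X_test, y_test))
```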
How does the Kernel Trick transform the input data?
- Compresses data
- Maps data into higher-dimensional space
- Normalizes data
- Reduces dimensionality
The Kernel Trick implicitly maps the input data into a higher-dimensional feature space where a linear separator can exist, without ever computing the mapping explicitly; the SVM only needs the inner products (kernel values) between points.
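A minimal sketch, assuming scikit-learn: on concentric circles, which no straight line can separate, the RBF kernel's implicit mapping makes linear separation possible in the higher-dimensional space:

```python
# Sketch of the Kernel Trick: data that is not linearly separable in the
# original space becomes separable after an (implicit) RBF mapping.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # no mapping: struggles on circles
rbf = SVC(kernel="rbf").fit(X, y)        # implicit high-dimensional mapping

print("linear kernel accuracy:", linear.score(X, y))
print("RBF kernel accuracy:", rbf.score(X, y))
```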
The _________ is a single summary value that illustrates the ability of a classification model to discriminate between positive and negative classes.
- AUC
- F1-Score
- Precision
- Recall
The AUC (Area Under the ROC Curve) provides a single summary value representing the model's ability to differentiate between positive and negative classes, aggregated across all classification thresholds.
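A minimal sketch of computing AUC, assuming scikit-learn and a synthetic binary dataset:

```python
# Sketch: compute AUC from predicted probabilities with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# AUC summarizes discrimination across all possible thresholds.
print("AUC:", roc_auc_score(y_test, scores))
```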
How would you handle a multi-class classification problem with LDA, and what is the importance of maximizing between-class variance in this context?
- Apply LDA separately to each class; no importance of between-class variance
- Apply LDA to all classes; maximize between-class variance for class separability
- Ignore between-class variance and focus on within-class variance
- Use another method entirely
For a multi-class classification problem, you apply LDA to all classes jointly and maximize between-class variance. Pushing the class means apart relative to the within-class scatter is what makes the classes separable from each other, which improves classification performance.
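A minimal sketch, assuming scikit-learn and the iris dataset (three classes):

```python
# Sketch: LDA applied jointly to all three iris classes.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# LDA finds projections that maximize between-class variance relative to
# within-class variance; with 3 classes it yields at most 2 components.
lda = LinearDiscriminantAnalysis(n_components=2)
X_proj = lda.fit_transform(X, y)

print("projected shape:", X_proj.shape)
print("classification accuracy:", lda.score(X, y))
```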
How do multi-class classification problems differ from binary classification problems?
- Multi-class has more features
- Multi-class has multiple classes; binary has two
- Multi-class has two classes; binary has multiple
- No difference
Binary classification deals with two classes, while multi-class classification deals with more than two. Multi-class problems can be more complex: some algorithms handle them natively, while others must be adapted through strategies such as one-vs-rest or one-vs-one.
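One common adaptation is the one-vs-rest strategy, which trains one binary classifier per class; a minimal sketch assuming scikit-learn:

```python
# Sketch: adapting a binary classifier to a multi-class problem
# with the one-vs-rest strategy.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(LinearSVC(max_iter=10000, random_state=0)).fit(X, y)
print("underlying binary classifiers:", len(ovr.estimators_))  # one per class
print("accuracy:", ovr.score(X, y))
```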
How does the average linkage method differ from single and complete linkage in Hierarchical Clustering?
- Uses the mean of all distances between pairs in clusters
- Uses the median of all distances between pairs in clusters
- Uses the mode of all distances between pairs in clusters
- Uses the total of all distances between pairs in clusters
The average linkage method calculates the mean of all pairwise distances between the points in the clusters to determine the linkage. Single linkage uses the minimum distance, while complete linkage uses the maximum distance. Average linkage typically results in more balanced clusters, as it considers the overall distribution of distances.
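A minimal sketch, assuming scikit-learn and synthetic blob data, comparing the three linkage criteria:

```python
# Sketch: compare linkage criteria in hierarchical (agglomerative) clustering.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

for linkage in ("single", "complete", "average"):
    # single: minimum pairwise distance between clusters; complete: maximum;
    # average: mean of all pairwise distances between clusters.
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage).fit(X)
    sizes = [list(model.labels_).count(c) for c in range(3)]
    print(linkage, "cluster sizes:", sizes)
```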
The regularization parameter 'C' in SVM controls the trade-off between maximizing the margin and minimizing the _________.
- Kernel size
- Margin
- Misclassification
- Variance
The 'C' parameter controls the trade-off between maximizing the margin and minimizing misclassification: a small C tolerates more misclassified points in exchange for a wider margin, while a large C penalizes misclassification heavily, producing a narrower margin.
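A minimal sketch of this trade-off, assuming scikit-learn (the C values are arbitrary illustrations), using the number of support vectors as a proxy for margin width:

```python
# Sketch: the effect of C on the margin, seen via the number of support vectors.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Small C -> wide, soft margin -> more points on or inside it (more SVs);
    # large C -> narrow, hard margin -> fewer support vectors.
    print(f"C={C}: support vectors = {clf.n_support_.sum()}")
```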
Explain the process of selecting the number of principal components in PCA.
- By choosing an arbitrary number
- By selecting all eigenvectors
- By using only the first eigenvector
- By using the elbow method and the cumulative explained variance
The number of principal components in PCA can be selected by considering the cumulative explained variance and looking for an "elbow" in the plot, where adding more components does not significantly increase the explained variance.
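A minimal sketch, assuming scikit-learn and its digits dataset (the 95% threshold is an arbitrary illustrative cutoff):

```python
# Sketch: choose the number of components from cumulative explained variance.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Pick the smallest number of components reaching, say, 95% variance;
# the "elbow" is where this curve visibly flattens.
n_components = int(np.argmax(cumulative >= 0.95)) + 1
print("components for 95% variance:", n_components)
```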
Consider a situation where you're applying DBSCAN to a high-dimensional dataset. What challenges might you face, and how could you address them?
- All of the above
- Difficulty in visualizing; Reduce dimensionality
- High computational cost; Optimize the algorithm
- Risk of overfitting; Increase MinPts
High-dimensional data can present several challenges in clustering, including the risk of overfitting, difficulty in visualization, and high computational costs. Increasing MinPts can help prevent overfitting, while dimensionality reduction techniques like PCA can aid visualization. Optimizing the algorithm can help to reduce computational demands.
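A minimal sketch combining these remedies, assuming scikit-learn (the eps and min_samples values are illustrative and would need tuning on real data):

```python
# Sketch: reduce dimensionality with PCA before running DBSCAN.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, n_features=50, centers=4, random_state=0)

# Distances lose contrast in high dimensions, so project to a few components
# first; MinPts (min_samples) may also need to grow with dimensionality.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=5).fit_transform(X_scaled)
labels = DBSCAN(eps=1.5, min_samples=10).fit_predict(X_reduced)
print("clusters found (excluding noise):", len(set(labels) - {-1}))
```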
The ________ technique in classification helps in enhancing the model's ability to generalize by using different subsets of data during training.
- Clustering
- Cross-validation
- Feature extraction
- Overfitting
Cross-validation is a technique where the dataset is partitioned into different subsets (folds), and the model is trained and tested on different combinations of these folds. It helps in assessing the model's ability to generalize to unseen data.
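A minimal sketch of 5-fold cross-validation, assuming scikit-learn and the iris dataset:

```python
# Sketch: k-fold cross-validation with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the 5 folds serves once as the held-out test set.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```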