Which clustering method assigns data points to the nearest cluster center and recalculates the center until convergence?

  • Agglomerative
  • DBSCAN
  • Hierarchical
  • K-Means
K-Means clustering is an iterative algorithm that assigns each data point to the nearest cluster center, recalculating these centers until they converge.
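The two alternating steps can be sketched in NumPy (a toy implementation on made-up blob data, not a library API):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Toy K-Means: assign each point to its nearest center, then recompute
    each center as the mean of its points, until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: nearest center for every point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute centers (keep the old one if a cluster is empty)
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return centers, labels

# Two well-separated 2-D blobs
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, labels = kmeans(X, k=2)
```

On well-separated data like this, the two recovered clusters line up with the two blobs.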

t-SNE is particularly known for preserving which kind of structures from the high-dimensional data in the low-dimensional representation?

  • Global Structures
  • Local Structures
  • Numerical Structures
  • Geometric Structures
t-SNE is known for preserving local structures in the low-dimensional representation, making it effective for visualization and capturing fine-grained relationships.

When both precision and recall are important for a problem, one might consider optimizing the ________ score.

  • Accuracy
  • F1 Score
  • ROC AUC
  • Specificity
The F1 Score is the harmonic mean of precision and recall, so it balances the two. It is especially useful when both false positives and false negatives matter in your classification problem.
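Computed directly from confusion-matrix counts (the helper function and the example counts here are illustrative):

```python
def precision_recall_f1(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 8 true positives, 2 false positives, 4 false negatives
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
# p = 0.8, r = 2/3 ≈ 0.667, f1 = 8/11 ≈ 0.727
```

Note that F1 sits between precision and recall but closer to the smaller of the two, which is exactly why it penalizes models that sacrifice one for the other.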

When using K-means clustering, why is it sometimes recommended to run the algorithm multiple times with different initializations?

  • To ensure deterministic results.
  • To make the algorithm run faster.
  • To mitigate sensitivity to initial cluster centers.
  • To reduce the number of clusters.
K-means clustering is sensitive to initial cluster centers. Running it multiple times with different initializations helps find a more stable solution.
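A sketch of the restart strategy, assuming a toy K-Means and keeping the run with the lowest within-cluster sum of squares (this is analogous to scikit-learn's `n_init` parameter):

```python
import numpy as np

def kmeans_inertia(X, k, seed, n_iters=50):
    """Run toy K-Means once from a seeded random init; return its inertia
    (within-cluster sum of squared distances)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = ((X[:, None] - centers[None]) ** 2).sum(axis=2).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    labels = ((X[:, None] - centers[None]) ** 2).sum(axis=2).argmin(axis=1)
    return ((X - centers[labels]) ** 2).sum()

# Three tight blobs; a bad initialization can merge two of them.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.05, (15, 2)) for c in (0.0, 3.0, 6.0)])

# Restart with different seeds and keep the best (lowest-inertia) solution.
inertias = [kmeans_inertia(X, k=3, seed=s) for s in range(10)]
best = min(inertias)
```

Comparing the inertias across seeds makes the sensitivity visible: poor initializations converge to noticeably worse local optima.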

________ is a technique where during training, random subsets of neurons are ignored, helping to make the model more robust.

  • Dropout
  • Regularization
  • Batch Normalization
  • Activation Function
Dropout is a regularization technique that involves randomly deactivating a fraction of neurons during training. This helps prevent overfitting, making the model more robust and less dependent on specific neurons.
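The mechanism can be shown in a few lines of NumPy ("inverted" dropout, the variant used by most frameworks; the function here is a sketch, not a framework API):

```python
import numpy as np

def dropout(x, p, rng, training=True):
    """Inverted dropout: zero each unit with probability p during training and
    scale survivors by 1/(1-p) so the expected activation is unchanged.
    At inference time (training=False) it is a no-op."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(10_000)
y = dropout(x, p=0.5, rng=rng)
# Roughly half the units are zeroed; the mean stays near 1.0 in expectation.
```

The 1/(1-p) rescaling is what lets the same network run unchanged at inference time, with no neurons dropped.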

In the context of decision trees, what is "information gain" used for?

  • To assess the tree's overall accuracy
  • To calculate the depth of the tree
  • To determine the number of leaf nodes
  • To measure the purity of a split
Information gain is used to measure the purity of a split in a decision tree. It helps decide which feature to split on by evaluating how much it reduces uncertainty or entropy.
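Concretely, information gain is the parent's entropy minus the size-weighted entropy of the children. A small worked sketch on made-up labels:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = ["yes"] * 5 + ["no"] * 5   # maximally impure: entropy = 1 bit

# A perfect split yields pure children: gain = 1.0 bit
perfect = information_gain(parent, ["yes"] * 5, ["no"] * 5)

# A split that leaves both children as impure as the parent: gain = 0.0
useless = information_gain(parent,
                           ["yes", "yes", "no", "no"],
                           ["yes", "yes", "yes", "no", "no", "no"])
```

A decision tree builder evaluates candidate features this way and splits on the one with the highest gain.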

Which algorithm is a popular choice for solving the multi-armed bandit problem when the number of arms is large and some structure can be assumed on the rewards?

  • Epsilon-Greedy
  • UCB1
  • Thompson Sampling
  • Greedy
UCB1 (Upper Confidence Bound 1) is a popular choice for the multi-armed bandit problem when some structure can be assumed on the rewards and the number of arms is large. It balances exploration and exploitation by always pulling the arm with the highest empirical mean reward plus a confidence bonus, a bonus that shrinks as an arm is pulled more often.
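A minimal sketch of UCB1 on simulated Bernoulli arms (the reward probabilities and horizon are made up for illustration):

```python
import math
import random

def ucb1(reward_fns, horizon):
    """UCB1: pull each arm once, then always pick the arm maximizing
    empirical mean + sqrt(2 ln t / n_pulls)."""
    k = len(reward_fns)
    counts = [0] * k       # pulls per arm
    sums = [0.0] * k       # total reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1    # initialization: try every arm once
        else:
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        counts[arm] += 1
        sums[arm] += reward_fns[arm]()
    return counts

random.seed(0)
# Bernoulli arms with success probabilities 0.2, 0.5, 0.8
arms = [lambda p=p: 1.0 if random.random() < p else 0.0 for p in (0.2, 0.5, 0.8)]
counts = ucb1(arms, horizon=2000)
# The best arm (p = 0.8) typically receives the large majority of pulls.
```

The shrinking confidence bonus is what drives the behavior: rarely pulled arms look optimistic and get explored, while well-sampled good arms get exploited.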

In ________ learning, the algorithm isn't provided with the correct answers but discovers them through exploration and exploitation.

  • Reinforcement
  • Semi-supervised
  • Supervised
  • Unsupervised
Reinforcement learning relies on exploration and exploitation: the algorithm is never given the correct answers up front, but discovers them by trial and error, guided by reward signals over time.
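Trial-and-error discovery can be illustrated with a tiny Q-learning sketch (the corridor environment and hyperparameters are invented for the example):

```python
import random

# A 5-cell corridor: only reaching the right end (state 4) gives reward 1.
# The agent is never told the answer ("go right"); it finds it through
# epsilon-greedy exploration plus value updates.
N, GOAL = 5, 4
ACTIONS = (-1, +1)  # step left, step right

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Exploration vs. exploitation
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2 = min(max(s + a, 0), N - 1)          # walls clamp the move
            r = 1.0 if s2 == GOAL else 0.0
            target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
# After training, the greedy policy in every non-goal state is "move right".
```

No labeled "correct action" ever appears in the training loop; the preference for moving right emerges purely from reward feedback.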

In the context of text classification, Naive Bayes often works well because it can handle what type of data?

  • High-Dimensional and Sparse Data
  • Images and Videos
  • Low-Dimensional and Dense Data
  • Numeric Data
Naive Bayes is effective with high-dimensional and sparse data as it assumes independence between features, making it suitable for text data with numerous attributes.
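A toy multinomial Naive Bayes over bag-of-words counts shows why this works (an illustrative sketch with a made-up four-document corpus, not a library API):

```python
import math
from collections import Counter, defaultdict

# Each document becomes a high-dimensional, sparse word-count vector;
# words are assumed conditionally independent given the class.
train = [("spam", "win money now"), ("spam", "win a prize now"),
         ("ham", "meeting at noon"), ("ham", "lunch meeting today")]

class_words = defaultdict(list)
for label, text in train:
    class_words[label].extend(text.split())

word_counts = {c: Counter(ws) for c, ws in class_words.items()}
totals = {c: sum(cnt.values()) for c, cnt in word_counts.items()}
vocab = {w for ws in class_words.values() for w in ws}
doc_counts = Counter(label for label, _ in train)

def predict(text):
    scores = {}
    for c in word_counts:
        score = math.log(doc_counts[c] / len(train))  # log prior
        for w in text.split():
            # Laplace smoothing keeps unseen words from zeroing the probability
            score += math.log((word_counts[c][w] + 1) / (totals[c] + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)
```

Because only the words that actually occur in a document contribute terms to the sum, sparsity is handled naturally: the cost scales with document length, not vocabulary size.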

In CNNs, the ________ layer is responsible for detecting features in an image.

  • Convolutional
  • Pooling
  • Fully Connected
  • Input
The 'Convolutional' layer in a Convolutional Neural Network (CNN) is responsible for detecting features in an image using convolution operations.
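The feature-detection idea can be shown with a hand-rolled 2-D convolution and a tiny edge-detector kernel (a sketch in NumPy, not a deep-learning framework call):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most CNN
    libraries): slide the kernel over the image and take dot products."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# A vertical-edge detector fires where intensity changes left-to-right.
image = np.zeros((5, 5))
image[:, 2:] = 1.0                   # left half dark, right half bright
kernel = np.array([[-1.0, 1.0]])     # simple horizontal-gradient filter
fmap = conv2d(image, kernel)
# The feature map is nonzero only along the dark-to-bright boundary.
```

In a trained CNN the kernels are learned rather than hand-written, but each convolutional filter detects its feature in exactly this sliding-window fashion.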