In the k-NN algorithm, as the value of k increases, the decision boundary becomes __________.

Linear
More complex
More simplified
Non-existent

As the value of k in k-NN increases, the decision boundary becomes more simplified (smoother) because each prediction is a vote over more neighboring data points, so individual noisy points have less influence.
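The smoothing effect of a larger k can be seen in a minimal sketch. The 1-D toy data and the `knn_predict` helper below are hypothetical and only meant to illustrate the majority vote:

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: class "A" on the left, class "B" on the right,
# plus one noisy "B" point sitting inside the "A" region.
train = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (2.5, "B"),
         (6.0, "B"), (7.0, "B"), (8.0, "B")]

pred_k1 = knn_predict(train, 2.6, k=1)  # follows the single noisy neighbor
pred_k5 = knn_predict(train, 2.6, k=5)  # the noisy point is outvoted
```

With k=1 the query next to the noisy point takes its label, which carves a small island into the boundary. With k=5 the surrounding "A" points win the vote and the island disappears.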

A company wants to segment its customers based on their purchasing behavior. They have a fair idea that there are around 5 distinct segments but want to confirm this. Which clustering algorithm might they start with?

K-Means Clustering
Agglomerative Hierarchical Clustering
Mean-Shift Clustering
Spectral Clustering

The company might start with K-Means Clustering to confirm their idea of five distinct segments. K-Means is often used for partitioning data into a pre-specified number of clusters and can be a good choice when you have a rough idea of the number of clusters.
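One way to check a guessed cluster count is to compare the within-cluster sum of squares (inertia) for several values of k. The sketch below is a plain-Python Lloyd's algorithm on hypothetical 1-D spend values with a deterministic initialization. It is an illustration under those assumptions, not a production implementation:

```python
def kmeans_1d(xs, k, iters=50):
    """Lloyd's algorithm on 1-D data; returns (centers, inertia)."""
    xs = sorted(xs)
    # Deterministic init: pick k evenly spaced points from the sorted data.
    centers = [xs[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    inertia = sum(min((x - c) ** 2 for c in centers) for x in xs)
    return centers, inertia

# Hypothetical spend values with five visibly separate groups.
spend = [v + d for v in (10, 30, 50, 70, 90) for d in (-1, 0, 1)]
inertias = {k: kmeans_1d(spend, k)[1] for k in range(1, 8)}
```

On this toy data the inertia falls far more between k=4 and k=5 than between k=5 and k=6, which is the kind of "elbow" that would support the five-segment hypothesis.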

Variational autoencoders (VAEs) introduce a probabilistic spin to autoencoders by associating a ________ with the encoded representations.

Probability Distribution
Singular Value Decomposition
Principal Component
Regression Function

VAEs introduce a probabilistic element to autoencoders by associating a probability distribution (typically Gaussian) with the encoded representations. This allows for generating new data points.
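A minimal sketch of what "associating a distribution with the encoding" means in practice, assuming a diagonal Gaussian posterior. The `mu` and `log_var` values stand in for a hypothetical encoder output. No network is trained here:

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

# Hypothetical encoder output for one input: a mean and log-variance per dimension.
mu, log_var = [0.5, -1.0], [0.0, -2.0]
rng = random.Random(0)
z = sample_latent(mu, log_var, rng)      # one latent sample for the decoder
kl = kl_to_standard_normal(mu, log_var)  # regularization term in the VAE loss
```

The KL term pulls each encoded distribution toward a standard normal, which is what lets new data be generated later by decoding samples drawn from N(0, 1).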

Which regression technique is primarily used for predicting a continuous outcome variable (like house price)?

Decision Tree Regression
Linear Regression
Logistic Regression
Polynomial Regression

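Linear Regression is the standard first choice for predicting a continuous outcome variable, such as house prices. It models the output as a linear combination of the input features. Decision tree and polynomial regression can also predict continuous values, whereas logistic regression, despite its name, is a classification method.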
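For a single feature, the least-squares line has a closed form. The sketch below uses hypothetical area/price numbers that were constructed to lie exactly on a line, so the fit is easy to check by hand:

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

# Hypothetical data: price = 2 * area + 30 (arbitrary units).
areas = [50, 80, 100, 120]
prices = [130, 190, 230, 270]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # prediction for an unseen area of 90
```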

The Actor-Critic model combines value-based and ________ methods to optimize its decision-making process.

Policy-Based
Model-Free
Model-Based
Q-Learning

The Actor-Critic model combines value-based (critic) and policy-based (actor) methods to optimize decision-making. The actor is a parameterized policy that selects actions, and the critic estimates a value function that is used to evaluate those actions. The critic's feedback then drives the actor's policy update, combining the two approaches for improved learning.
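The interplay between the two parts can be sketched in a deliberately simplified setting: a single state with one-step episodes, so the critic's error reduces to reward minus estimated value. The reward table, learning rates, and function names are hypothetical choices for illustration:

```python
import math
import random

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train_actor_critic(rewards, steps=2000, lr_actor=0.1, lr_critic=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)   # actor: action preferences (policy parameters)
    value = 0.0                    # critic: estimated value of the single state
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        td_error = reward - value          # one-step episode, so no bootstrap term
        value += lr_critic * td_error      # critic update
        for a in range(len(prefs)):        # actor update along grad of log pi(action)
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr_actor * td_error * grad
    return softmax(prefs), value

# Hypothetical two-action problem: action 1 always pays 1, action 0 pays 0.
policy, value = train_actor_critic(rewards=[0.0, 1.0])
```

The policy-based part is the softmax over `prefs`. The value-based part is `value`, whose error signal scales every policy update.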

For text classification problems, the ________ variant of Naive Bayes is often used.

K-Means
Multinomial
Random Forest
SVM

In text classification, the Multinomial variant of Naive Bayes is commonly used due to its suitability for modeling discrete data like word counts.
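A compact sketch of Multinomial Naive Bayes on word counts with Laplace smoothing. The four training messages and the helper names are hypothetical. A real project would more likely use a library implementation:

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, alpha=1.0):
    """docs: list of (tokens, label). Returns log-priors and smoothed log-likelihoods."""
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((word_counts[c][w] + alpha) / total)
                       for w in vocab}
    return log_prior, log_like

def predict(tokens, log_prior, log_like):
    """Pick the class with the highest log-posterior; unseen words are skipped."""
    def score(c):
        return log_prior[c] + sum(log_like[c][w] for w in tokens if w in log_like[c])
    return max(log_prior, key=score)

# Hypothetical training messages.
docs = [("win cash now".split(), "spam"), ("cash prize win".split(), "spam"),
        ("meeting at noon".split(), "ham"), ("lunch meeting today".split(), "ham")]
log_prior, log_like = train_multinomial_nb(docs)
label_a = predict("win cash".split(), log_prior, log_like)
label_b = predict("meeting today".split(), log_prior, log_like)
```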

________ is a technique where during training, random subsets of neurons are ignored, helping to make the model more robust.

Dropout
Regularization
Batch Normalization
Activation Function

Dropout is a regularization technique that involves randomly deactivating a fraction of neurons during training. This helps prevent overfitting, making the model more robust and less dependent on specific neurons.
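A minimal sketch of the commonly used "inverted" form of dropout, assuming a flat list of activations. Surviving units are rescaled during training so that the expected activation matches what the network sees at inference, when nothing is dropped:

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop during training
    and rescale survivors by 1 / (1 - p_drop); identity at inference."""
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
layer = [1.0] * 1000                                   # hypothetical activations
train_out = dropout(layer, 0.5, rng)                   # about half zeroed, rest doubled
eval_out = dropout(layer, 0.5, rng, training=False)    # unchanged at inference
```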

In the context of decision trees, what is "information gain" used for?

To assess the tree's overall accuracy
To calculate the depth of the tree
To determine the number of leaf nodes
To measure the purity of a split

Information gain is used to measure the purity of a split in a decision tree. It is the reduction in entropy (uncertainty) from the parent node to the weighted average of its child nodes, and the tree splits on the feature that gives the largest reduction.
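The calculation can be sketched directly from the definition: parent entropy minus the size-weighted entropy of the children. The labels below are hypothetical:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["yes"] * 4 + ["no"] * 4
pure_split = [["yes"] * 4, ["no"] * 4]                  # children are perfectly pure
mixed_split = [["yes", "yes", "no", "no"]] * 2          # children look like the parent

gain_pure = information_gain(parent, pure_split)
gain_mixed = information_gain(parent, mixed_split)
```

A split that separates the classes completely recovers the full parent entropy, while a split whose children mirror the parent's class mix gains nothing.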

Which algorithm is a popular choice for solving the multi-armed bandit problem when the number of arms is large and some structure can be assumed on the rewards?

Epsilon-Greedy
UCB1
Thompson Sampling
Greedy

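UCB1 (Upper Confidence Bound 1) is a popular choice for the multi-armed bandit problem when you can assume some structure on the rewards and the number of arms is large. At each step UCB1 plays the arm with the highest upper confidence bound, that is, its average observed reward plus a bonus that shrinks the more often the arm has been tried. This balances exploration and exploitation without relying on random exploration.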
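A short sketch of the UCB1 index rule, assuming rewards in [0, 1]. The Bernoulli arms and their success probabilities are hypothetical, and this basic form scores every arm independently:

```python
import math
import random

def ucb1(pull, n_arms, n_rounds):
    """Run UCB1 and return how many times each arm was played."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialize its estimate
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_a).
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts

# Hypothetical Bernoulli arms; the last one is best.
rng = random.Random(0)
probs = [0.2, 0.5, 0.8]
counts = ucb1(lambda a: 1.0 if rng.random() < probs[a] else 0.0,
              n_arms=len(probs), n_rounds=5000)
```

Arms that have been tried rarely keep a large bonus and get revisited, while the plays concentrate on the arm with the best observed mean as evidence accumulates.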

In ________ learning, the algorithm isn't provided with the correct answers but discovers them through exploration and exploitation.

Reinforcement
Semi-supervised
Supervised
Unsupervised

In reinforcement learning, the algorithm is never given the correct answers. It learns by trial and error from reward signals, balancing exploration (trying new actions) against exploitation (repeating actions already known to pay off), and discovers good behavior over time.
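As an illustration of learning from rewards alone, here is a small tabular Q-learning sketch with epsilon-greedy exploration. The corridor environment (start at the left end, reward 1 on reaching the right end) and all hyperparameters are hypothetical:

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor; actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)                  # explore: random action
            else:
                a = 1 if q[s][1] >= q[s][0] else 0    # exploit (ties go right)
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # The goal state's values are never updated, so they stay at 0.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

q = q_learning()
greedy_policy = [1 if right > left else 0 for left, right in q[:-1]]
```

No state is ever labeled with a correct action. The preference for moving right emerges only from the reward propagating backward through the value estimates.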
