In SVM, the data points that are closest to the decision boundary and influence its orientation are called ______.

  • Decision Points
  • Influence Points
  • Margin Points
  • Support Vectors
The data points that are closest to the decision boundary are known as "Support Vectors" in Support Vector Machines (SVM). These points play a crucial role in determining the orientation of the decision boundary.
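
A minimal sketch (assuming scikit-learn and a made-up toy 2-D dataset) showing how a fitted SVM exposes exactly those boundary-defining points:

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (illustrative values only)
X = np.array([[0, 0], [1, 1], [1, 0], [0, 1], [2, 2], [3, 3]])
y = np.array([0, 0, 0, 0, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The points closest to the separating hyperplane:
print(clf.support_vectors_)  # coordinates of the support vectors
print(clf.support_)          # indices of those points in X
```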

GRUs are often considered a middle ground between basic RNNs and ________ in terms of complexity and performance.

  • LSTMs
  • CNNs
  • Autoencoders
  • K-Means Clustering
GRUs (Gated Recurrent Units) sit between basic RNNs and LSTMs: they use gating like LSTMs, but with fewer gates and parameters, so they are less complex than LSTMs while typically outperforming plain RNNs.
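
A short sketch (assuming PyTorch, arbitrary layer sizes) comparing parameter counts, which is one concrete way the "middle ground" shows up:

```python
import torch.nn as nn

input_size, hidden_size = 32, 64
rnn = nn.RNN(input_size, hidden_size)
gru = nn.GRU(input_size, hidden_size)
lstm = nn.LSTM(input_size, hidden_size)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print("RNN :", n_params(rnn))   # smallest (no gates)
print("GRU :", n_params(gru))   # in between (3 gates)
print("LSTM:", n_params(lstm))  # largest (4 gates plus a cell state)
```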

A data scientist notices that their model performs exceptionally well on the training set but poorly on the validation set. What might be the reason, and what can be a potential solution?

  • Data preprocessing is the reason, and fine-tuning hyperparameters can be a potential solution.
  • Overfitting is the reason, and regularization techniques can be a potential solution.
  • The model is working correctly, and no action is needed.
  • Underfitting is the reason, and collecting more data can be a potential solution.
Overfitting occurs when the model learns the training data too well, leading to poor generalization. Regularization techniques like L1 or L2 regularization can prevent overfitting by adding penalties to the model's complexity, helping it perform better on the validation set.
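
A hedged sketch (assuming scikit-learn and synthetic data) of the same linear model with and without an L2 penalty, illustrating how regularization shrinks coefficients to curb overfitting:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))           # few samples, many features -> prone to overfitting
y = X[:, 0] + 0.1 * rng.normal(size=30)

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)     # alpha controls the strength of the L2 penalty

print("unregularized coefficient norm:", np.linalg.norm(plain.coef_))
print("ridge (L2) coefficient norm   :", np.linalg.norm(ridge.coef_))
```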

If a model has low bias and high variance, it is likely that the model is ________.

  • Optimally Fitted
  • Overfitting
  • Underfitting
  • Well-fitted
A model with low bias and high variance is likely overfitting. Low bias means the model fits the training data very closely (often too closely), while high variance means it is highly sensitive to fluctuations in that data, which leads to poor generalization on new data.
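
An illustrative sketch (assuming scikit-learn and synthetic data): a high-degree polynomial has low bias but high variance, so its training error is tiny while its validation error is much larger.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.2 * rng.normal(size=40)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.5, random_state=1)

for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(f"degree={degree:2d}",
          "train MSE:", round(mean_squared_error(y_tr, model.predict(X_tr)), 4),
          "val MSE:",   round(mean_squared_error(y_va, model.predict(X_va)), 4))
```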

A utility company wants to predict the demand for electricity for the next week based on historical data. They have data for the past ten years, recorded every hour. Which type of machine learning task is this, and what challenges might they face due to the nature of the data?

  • Time Series Forecasting
  • Clustering
  • Image Recognition
  • Reinforcement Learning
This is a Time Series Forecasting task because it involves predicting future values based on historical data recorded at regular intervals. Challenges could include handling seasonality, trends, and outliers within the time series data. Ensuring the right feature selection and model choice is crucial.
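
A minimal sketch (assuming pandas and scikit-learn, and a hypothetical `demand.csv` with hourly `timestamp` and `load` columns) of lag and calendar features that capture the daily and weekly seasonality typical of electricity demand:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("demand.csv", parse_dates=["timestamp"]).set_index("timestamp")
df["hour"] = df.index.hour                # daily seasonality
df["dayofweek"] = df.index.dayofweek      # weekly seasonality
df["lag_24"] = df["load"].shift(24)       # same hour yesterday
df["lag_168"] = df["load"].shift(168)     # same hour last week
df = df.dropna()

X, y = df[["hour", "dayofweek", "lag_24", "lag_168"]], df["load"]
model = LinearRegression().fit(X[:-168], y[:-168])  # hold out the final week
print("R^2 on held-out week:", model.score(X[-168:], y[-168:]))
```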

How do Policy Gradient Methods differ from value-based methods in their approach to reinforcement learning?

  • Value-based methods learn the policy directly, while Policy Gradient Methods learn value functions
  • They learn both the policy and the value function in the same way
  • Policy Gradient Methods learn the policy directly, while value-based methods learn value functions
  • They learn neither a policy nor a value function
Policy Gradient Methods focus on learning the policy directly, which means they determine the probability of taking actions. Value-based methods, on the other hand, learn the value of states or state-action pairs. This key difference is essential for understanding the approach to RL each method takes.
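
A schematic sketch (NumPy only, tabular and heavily simplified) contrasting the two update rules: a value-based method updates Q(s, a), while a policy-gradient method nudges policy parameters to make rewarded actions more probable.

```python
import numpy as np

n_states, n_actions, alpha, gamma = 5, 2, 0.1, 0.99

# Value-based (Q-learning): learn state-action values, then act greedily w.r.t. them.
Q = np.zeros((n_states, n_actions))
def q_update(s, a, r, s_next):
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# Policy gradient (REINFORCE): learn softmax policy parameters directly.
theta = np.zeros((n_states, n_actions))
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pg_update(s, a, G):                 # G = return observed after taking a in s
    probs = softmax(theta[s])
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0               # gradient of log pi(a|s) for a softmax policy
    theta[s] += alpha * G * grad_log_pi
```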

In the context of text classification, Naive Bayes often works well because it can handle what type of data?

  • Categorical Data
  • High-Dimensional Data
  • Numerical Data
  • Time Series Data
Naive Bayes works well in text classification because it can effectively handle high-dimensional data with numerous features (words or terms).
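
A minimal sketch (assuming scikit-learn, with made-up example texts): a bag-of-words matrix has one column per vocabulary word, so it is very high-dimensional and sparse, which Multinomial Naive Bayes handles efficiently.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["free prize, claim now", "meeting at noon tomorrow",
         "win cash instantly", "project update attached"]
labels = [1, 0, 1, 0]                   # 1 = spam, 0 = not spam

vec = CountVectorizer()
X = vec.fit_transform(texts)            # sparse, high-dimensional word-count matrix
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform(["claim your free cash"])))
```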

How do residuals, the differences between the observed and predicted values, relate to linear regression?

  • They are not relevant in linear regression
  • They indicate how well the model fits the data
  • They measure the strength of the relationship between predictors
  • They represent the sum of squared errors
Residuals in linear regression measure how well the model fits the data. Specifically, they represent the differences between the observed and predicted values. Smaller residuals indicate a better fit, while larger residuals suggest a poorer fit.
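
A quick sketch (assuming scikit-learn and synthetic data): residuals are simply observed minus predicted values, and the sum of their squares is what ordinary least squares minimizes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10).reshape(-1, 1)
y = 3 * X.ravel() + 2 + np.random.default_rng(0).normal(scale=0.5, size=10)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)        # observed - predicted
print(residuals)
print("sum of squared residuals:", np.sum(residuals ** 2))
```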

In a case where a company wants to detect abnormal patterns in vast amounts of transaction data, which type of neural network model would be particularly beneficial in identifying these anomalies based on data reconstructions?

  • Variational Autoencoder
  • Long Short-Term Memory (LSTM)
  • Feedforward Neural Network
  • Restricted Boltzmann Machine
Variational Autoencoders (VAEs) are well suited to anomaly detection because they learn the distribution of normal data and reconstruct typical inputs accurately; transactions that deviate from that distribution reconstruct poorly, so a high reconstruction error flags them as anomalies.
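
A heavily simplified sketch (assuming PyTorch, with made-up layer sizes): the VAE is trained on normal transactions, and at inference records with high reconstruction error are flagged as anomalies.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, n_features=20, latent=4):
        super().__init__()
        self.enc = nn.Linear(n_features, 16)
        self.mu = nn.Linear(16, latent)
        self.logvar = nn.Linear(16, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 16), nn.ReLU(), nn.Linear(16, n_features))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def anomaly_score(model, x):
    recon, _, _ = model(x)
    return ((x - recon) ** 2).mean(dim=1)   # per-record reconstruction error

# Usage idea: train on normal transactions with a reconstruction + KL loss, then
# flag records whose anomaly_score exceeds a threshold chosen on validation data.
```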

To avoid overfitting in large neural networks, one might employ a technique known as ________, which involves dropping out random neurons during training.

  • Batch Normalization
  • L2 Regularization
  • Gradient Descent
  • Dropout
The 'Dropout' technique involves randomly deactivating a fraction of neurons during training, which helps prevent overfitting in large neural networks.
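
A short sketch (assuming PyTorch, arbitrary layer sizes) of how dropout is added to a network and why it only acts during training:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # each hidden activation is zeroed with probability 0.5 during training
    nn.Linear(64, 10),
)

x = torch.randn(4, 128)
net.train()
out_train = net(x)       # dropout active; surviving activations are rescaled by 1/(1-p)
net.eval()
out_eval = net(x)        # dropout disabled at evaluation time
```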