While LSTMs have three gates, the GRU simplifies the model by using only ________ gates.

  • 1
  • 2
  • 3
  • 4
Gated Recurrent Units (GRUs) simplify the model by using only two gates, an update gate and a reset gate, as opposed to the three gates (input, forget, and output) in LSTMs.
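
To make the two gates concrete, here is a minimal sketch of one GRU step for a scalar input and a scalar hidden state. The parameter names (`w_z`, `u_z`, and so on) and their values are hypothetical, and the final blend uses one common convention; some references swap the roles of `z` and `1 - z`.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x: float, h_prev: float, p: dict) -> float:
    """One GRU step with a scalar input and a scalar hidden state."""
    # Update gate: how much of the new candidate state to let in.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Candidate state, computed from the input and the reset-scaled history.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend the old state and the candidate (convention varies by source).
    return (1.0 - z) * h_prev + z * h_cand

# Hypothetical parameter values, chosen only for illustration.
names = ("w_z", "u_z", "b_z", "w_r", "u_r", "b_r", "w_h", "u_h", "b_h")
params = {name: 0.5 for name in names}

h = 0.0
for x in [1.0, -0.5, 0.25]:
    h = gru_step(x, h, params)
```

Only two sigmoid gates appear (`z` and `r`); an LSTM step would need three, plus a separate cell state.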

In a situation where you have both numerical and categorical data, which clustering method might pose challenges, and why?

  • Agglomerative Clustering
  • DBSCAN Clustering
  • Hierarchical Clustering
  • K-Means Clustering
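K-Means may pose challenges in such a situation because it computes each centroid as the mean of the cluster's points and assigns points by Euclidean distance, neither of which is well-defined for categorical data. Other methods such as hierarchical clustering or DBSCAN may be more suitable, provided they are paired with a distance measure that handles mixed types (for example, Gower distance).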
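
The problem with averaging categories can be seen in a small sketch. The records, the `color_codes` mapping, and the use of the mode as a substitute are illustrative assumptions, not part of K-Means itself.

```python
from statistics import mean, mode

# A hypothetical cluster with one numerical and one categorical feature.
cluster = [
    {"price": 10.0, "color": "red"},
    {"price": 14.0, "color": "blue"},
    {"price": 12.0, "color": "blue"},
]

# The mean is meaningful for the numerical feature.
price_centroid = mean(r["price"] for r in cluster)

# Integer-coding the categories and averaging gives a number that
# corresponds to no real color; the result depends on arbitrary codes.
color_codes = {"red": 0, "green": 1, "blue": 2}
coded_mean = mean(color_codes[r["color"]] for r in cluster)

# One alternative (not K-Means): summarize the categorical feature
# by its most frequent value instead of a mean.
color_mode = mode(r["color"] for r in cluster)
```

Here `coded_mean` falls between the codes for "green" and "blue" even though no record is green, which is why a mean-based centroid is a poor summary of categorical features.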

An online retailer wants to create a hierarchical structure of product categories based on product descriptions and features. They want this hierarchy to be easily interpretable and visual. Which clustering approach would be most suitable?

  • Hierarchical Clustering
  • DBSCAN
  • Gaussian Mixture Model (GMM)
  • Affinity Propagation
For creating a hierarchical structure, Hierarchical Clustering is the most suitable approach. It builds a tree-like structure that is interpretable and can be easily visualized. This makes it ideal for organizing product categories based on descriptions and features.
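
As a rough illustration of how the tree is built, the sketch below runs agglomerative (bottom-up) clustering with single linkage on one-dimensional points and records each merge. The function name `single_linkage` and the sample values are assumptions made for this example; real product data would need a feature representation and a suitable distance.

```python
from itertools import combinations

def single_linkage(points):
    """Agglomerative clustering on 1-D points.

    Returns the merges in order as (left_cluster, right_cluster, distance).
    """
    clusters = [(p,) for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Single linkage: cluster distance is the closest pair of members.
        for a, b in combinations(clusters, 2):
            d = min(abs(x - y) for x in a for y in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged cluster replaces both parents
        merges.append((a, b, d))
    return merges

merges = single_linkage([1.0, 1.2, 5.0, 5.1, 9.0])
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 2))
```

The recorded merge order is exactly the information a dendrogram draws: the closest items join first, and the merge distance gives the height of each branch.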

What is the primary benefit of using transfer learning in deep learning models?

  • Improved training time
  • Better performance
  • Reduced data requirement
  • Enhanced model complexity
The primary benefit of transfer learning in deep learning is 'Better performance.' This technique reuses knowledge from a pre-trained model, so the new model can perform well even with limited data, and it typically needs less training time than a model trained from scratch.

Which type of neural network is specifically designed to handle image data?

  • Convolutional Neural Network
  • Recurrent Neural Network
  • Feedforward Network
  • Decision Tree
Convolutional Neural Networks (CNNs) are tailored for image data: their convolutional filters slide across the image and capture local spatial patterns such as edges and textures, which deeper layers combine into higher-level features.
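
To show what "capturing spatial patterns" means, here is a minimal sketch of the sliding-filter operation at the core of a CNN layer, written without padding or stride. The image and the `edge_kernel` values are made up for illustration; in a real network the kernel values are learned.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    This is cross-correlation, the operation deep learning libraries
    usually call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A tiny image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical filter that responds where brightness rises left to right.
edge_kernel = [[-1, 1]]

feature_map = conv2d_valid(image, edge_kernel)
print(feature_map)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The output is nonzero only at the column where the vertical edge sits, regardless of the row, which is the kind of location-aware pattern detection that makes these networks a good fit for images.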

One common regularization technique involves adding a penalty to the loss function based on the squared magnitude of the coefficients, known as ________ regularization.

  • L1 (Lasso)
  • L2 (Ridge)
  • Elastic Net
  • Mean Squared Error
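L2 (Ridge) regularization adds a penalty proportional to the sum of squared coefficients, which shrinks the coefficients toward zero and helps control the model's complexity and reduce overfitting. L1 (Lasso), by contrast, penalizes the sum of the absolute values of the coefficients.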
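
The penalty term can be sketched in a few lines. The weights, the targets, and the regularization strength `lam` below are arbitrary values chosen for illustration, and mean squared error is assumed as the data loss.

```python
def mse(y_true, y_pred):
    """Mean squared error, used here as the unregularized data loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def l2_penalty(weights, lam):
    """Ridge penalty: lam times the sum of squared coefficients."""
    return lam * sum(w * w for w in weights)

def l1_penalty(weights, lam):
    """Lasso penalty: lam times the sum of absolute coefficients."""
    return lam * sum(abs(w) for w in weights)

weights = [3.0, -4.0]  # hypothetical model coefficients
lam = 0.1              # hypothetical regularization strength

data_loss = mse([1.0, 2.0], [1.0, 4.0])
ridge_loss = data_loss + l2_penalty(weights, lam)
lasso_loss = data_loss + l1_penalty(weights, lam)
```

With these values the L2 penalty is driven by 3² + 4² = 25 while the L1 penalty is driven by 3 + 4 = 7, so large coefficients are punished much more heavily under Ridge.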

Support Vector Machines (SVM) aim to find a ______ that best divides a dataset into classes.

  • Cluster
  • Decision Boundary
  • Hyperplane
  • Mean
Support Vector Machines aim to find a hyperplane that best divides a dataset into classes. This hyperplane maximizes the margin between the classes, making it a powerful tool for binary classification tasks. The "support vectors" are the training points closest to the hyperplane; they alone determine its position and the width of the margin.
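
The sketch below shows how a linear SVM's hyperplane is used once it is known. The weight vector `w` and bias `b` are hypothetical values standing in for a trained model; no training is performed here.

```python
import math

# Hypothetical parameters of a hyperplane w . x + b = 0.
w = [3.0, 4.0]
b = -5.0

def decision_value(x):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x):
    """Class label from the side of the hyperplane the point falls on."""
    return 1 if decision_value(x) >= 0 else -1

# For the usual scaling where support vectors satisfy |w . x + b| = 1,
# the margin between the two classes has width 2 / ||w||.
margin_width = 2.0 / math.hypot(w[0], w[1])

print(predict([3.0, 4.0]), predict([0.0, 0.0]), margin_width)
```

Under that scaling, points such as `[2.0, 0.0]` (score +1) and `[0.0, 1.0]` (score -1) would lie exactly on the margin boundaries, which is where support vectors sit.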

In Gaussian Mixture Models, the "mixture" refers to the combination of ________ Gaussian distributions.

  • Different
  • Similar
  • Identical
  • Overlapping
In a Gaussian Mixture Model (GMM), the "mixture" implies that we combine multiple Gaussian (normal) distributions to model complex data distributions. The components are "different" in that each has its own mean, covariance, and mixing weight, even though they all belong to the same Gaussian family.
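
A mixture density is a weighted sum of its component densities. The sketch below evaluates a one-dimensional mixture of two Gaussians; the weights, means, and standard deviations are invented for illustration rather than fitted to data.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical components as (mixing weight, mean, standard deviation).
# The weights must sum to 1 so the mixture is itself a valid density.
components = [
    (0.3, -2.0, 0.5),
    (0.7, 1.0, 1.5),
]

def gmm_pdf(x):
    """Mixture density: weighted sum of the component densities."""
    return sum(weight * normal_pdf(x, mu, sigma) for weight, mu, sigma in components)

print(gmm_pdf(-2.0), gmm_pdf(1.0))
```

Because each component has its own parameters, the combined density can have several peaks, something a single Gaussian cannot represent.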

The weights and biases in a neural network are adjusted during the ________ process to minimize the loss.

  • Forward Propagation
  • Backpropagation
  • Initialization
  • Regularization
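Weights and biases in a neural network are adjusted during the 'Backpropagation' process to minimize the loss. Backpropagation propagates the error backward through the network to compute the gradient of the loss with respect to each weight and bias, and an optimizer such as gradient descent then uses those gradients to update the parameters.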
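
The forward pass, backward pass, and update can be sketched on the smallest possible case: a single linear neuron trained on one example with squared error. The input, target, and learning rate are arbitrary choices for illustration; in a deeper network the same chain rule is applied layer by layer.

```python
# One training example and hypothetical hyperparameters.
x, target = 2.0, 4.0
learning_rate = 0.05

w, b = 0.0, 0.0  # parameters to learn
losses = []

for _ in range(20):
    # Forward pass: compute the prediction and the loss.
    y = w * x + b
    loss = (y - target) ** 2
    losses.append(loss)

    # Backward pass: chain rule from the loss back to each parameter.
    dloss_dy = 2.0 * (y - target)
    dloss_dw = dloss_dy * x   # since dy/dw = x
    dloss_db = dloss_dy       # since dy/db = 1

    # Gradient descent update: step against the gradient.
    w -= learning_rate * dloss_dw
    b -= learning_rate * dloss_db

print(losses[0], losses[-1])
```

The loss shrinks on every iteration because each update moves the parameters in the direction that the backward pass identified as reducing the error.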

In the context of deep learning, what is the primary use case of autoencoders?

  • Image Classification
  • Anomaly Detection
  • Text Generation
  • Reinforcement Learning
The primary use case of autoencoders in deep learning is anomaly detection. An autoencoder trained on normal data learns to reconstruct normal patterns well, so inputs with a high reconstruction error can be flagged as anomalies. This makes autoencoders useful in applications such as fraud detection and fault diagnosis.
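
The reconstruction-error idea can be sketched without training a network. Below, a fixed projection onto one direction stands in for a trained autoencoder with a one-number bottleneck; the `direction`, the sample points, and the `threshold` are all assumptions made for illustration.

```python
import math

# Stand-in for a trained model: encode a 2-D point to a single number
# along a hypothetical learned direction, then decode it back.
direction = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))

def reconstruct(point):
    code = point[0] * direction[0] + point[1] * direction[1]  # encoder
    return (code * direction[0], code * direction[1])         # decoder

def reconstruction_error(point):
    rebuilt = reconstruct(point)
    return math.hypot(point[0] - rebuilt[0], point[1] - rebuilt[1])

# "Normal" points lie near the line y = x; the last point does not.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.05), (2.0, -2.0)]

threshold = 0.5  # hypothetical cut-off, normally tuned on validation data
anomalies = [p for p in points if reconstruction_error(p) > threshold]
print(anomalies)
```

Points that fit the pattern the model has learned are rebuilt almost exactly, while the off-pattern point loses most of its information in the bottleneck and is flagged.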