Which type of neural network is specifically designed to handle image data?
- Convolutional Neural Network
- Recurrent Neural Network
- Feedforward Network
- Decision Tree
Convolutional Neural Networks (CNNs) are tailored for image data: their convolutional filters capture local spatial patterns such as edges and textures, which pooling and deeper layers build up into higher-level features.
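For a concrete picture, here is a minimal PyTorch sketch of a small CNN (assuming PyTorch is available; the layer sizes and the 28x28 grayscale input are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

# Minimal CNN sketch for 28x28 grayscale images (sizes are illustrative).
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # learn local spatial filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
logits = model(torch.randn(8, 1, 28, 28))  # batch of 8 fake images
print(logits.shape)                        # torch.Size([8, 10])
```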
One common regularization technique involves adding a penalty to the loss function based on the squared magnitude of the coefficients, known as ________ regularization.
- L1 (Lasso)
- L2 (Ridge)
- Elastic Net
- Mean Squared Error
L2 (Ridge) regularization adds a penalty based on the sum of squared coefficients, helping to control the model's complexity and reduce overfitting.
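As a rough sketch of the idea in practice, using scikit-learn's Ridge with toy data invented for the example:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy data: y depends linearly on X plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=100)

# alpha scales the L2 penalty: loss = least-squares error + alpha * sum of squared coefficients.
model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_)  # coefficients shrunk toward zero relative to plain least squares
```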
Support Vector Machines (SVM) aim to find a ______ that best divides a dataset into classes.
- Cluster
- Decision Boundary
- Hyperplane
- Mean
Support Vector Machines aim to find a hyperplane that best divides a dataset into classes. This hyperplane maximizes the margin between the classes, making SVMs a powerful tool for binary classification. The support vectors are the training points closest to the hyperplane; they alone determine its position.
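A minimal scikit-learn sketch (with synthetic blob data) showing the learned hyperplane and its support vectors:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two linearly separable clusters.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

clf = SVC(kernel="linear").fit(X, y)

# The support vectors are the points closest to the separating hyperplane.
print(clf.support_vectors_.shape)
# For a linear kernel, coef_ and intercept_ define the hyperplane w.x + b = 0.
print(clf.coef_, clf.intercept_)
```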
In Gaussian Mixture Models, the "mixture" refers to the combination of ________ Gaussian distributions.
- Different
- Similar
- Identical
- Overlapping
In a Gaussian Mixture Model (GMM), the "mixture" refers to combining multiple different Gaussian (normal) distributions, each with its own mean, covariance, and mixing weight, to model complex data distributions that no single Gaussian could capture.
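A quick illustration with scikit-learn's GaussianMixture, fitting a two-component mixture to samples drawn from two different Gaussians (the means and scales here are made up):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(loc=-3.0, scale=0.5, size=(200, 1)),  # component 1
    rng.normal(loc=2.0, scale=1.0, size=(300, 1)),   # component 2
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
print(gmm.means_.ravel())  # recovered component means, roughly -3 and 2
print(gmm.weights_)        # mixing proportions, roughly 0.4 and 0.6
```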
A bioinformatics researcher is trying to visualize the similarities and differences between different genes in a 2D space. The data is high dimensional. Which technique would provide a good visualization emphasizing local similarities?
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Principal Component Analysis
- Linear Regression
- A* Search Algorithm
t-SNE is well suited to visualizing high-dimensional data because it preserves local similarities: it maps data points to a 2D space in a way that emphasizes neighborhood relationships, making it a natural choice for visualizing gene similarities.
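As a sketch, here is scikit-learn's TSNE applied to the 64-dimensional digits dataset, standing in for gene expression data (perplexity 30 is just the library's default-style choice):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 64-dimensional digit images stand in for high-dimensional gene data.
X, y = load_digits(return_X_y=True)

# perplexity controls the effective neighborhood size t-SNE tries to preserve.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(embedding.shape)  # (1797, 2) -- ready for a 2D scatter plot colored by y
```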
What is the primary goal of exploration in reinforcement learning?
- To gather information about the environment
- To maximize immediate rewards
- To stick with known actions
- To build a policy
Exploration's primary goal is to gather information about the environment, helping an RL agent learn and make better decisions in the long run.
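One common way to balance exploration against exploitation is the epsilon-greedy rule; here is a small self-contained sketch (the action values are invented for the example):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon, pick a random action to gather information
    (explore); otherwise pick the current best estimate (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

# Hypothetical action-value estimates for a 4-action problem.
q = [0.2, 0.5, 0.1, 0.4]
actions = [epsilon_greedy(q) for _ in range(1000)]
print(actions.count(1) / 1000)  # mostly action 1, with occasional exploration
```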
In the context of text classification, Naive Bayes often works well because it can handle what type of data?
- Categorical Data
- High-Dimensional Data
- Numerical Data
- Time Series Data
Naive Bayes works well in text classification because it handles the sparse, high-dimensional data produced by bag-of-words representations, where every distinct word or term becomes a feature.
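A minimal bag-of-words example with scikit-learn (the tiny corpus below is made up; real text data would have thousands of word features):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["cheap pills buy now", "meeting agenda attached",
         "win money fast", "project status update"]
labels = ["spam", "ham", "spam", "ham"]

# CountVectorizer turns each document into a high-dimensional sparse vector,
# one dimension per distinct word; Naive Bayes handles this comfortably.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["buy cheap now"]))  # expected: ['spam']
```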
How do residuals, the differences between the observed and predicted values, relate to linear regression?
- They are not relevant in linear regression
- They indicate how well the model fits the data
- They measure the strength of the relationship between predictors
- They represent the sum of squared errors
Residuals in linear regression indicate how well the model fits the data. Smaller residuals mean the predictions lie close to the observed values, while larger residuals suggest a poorer fit.
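A short numeric sketch (with synthetic data) showing how residuals are computed and how their squares sum to the error:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=50)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)   # observed minus predicted

# Small residuals indicate a tight fit; squaring and summing them gives the SSE.
print(np.sum(residuals ** 2))
```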
A company wants to detect abnormal patterns in vast amounts of transaction data. Which type of neural network model would be particularly well suited to identifying these anomalies based on data reconstructions?
- Variational Autoencoder
- Long Short-Term Memory (LSTM)
- Feedforward Neural Network
- Restricted Boltzmann Machine
Variational Autoencoders (VAEs) are excellent for anomaly detection because they learn to model the distribution of normal data and reconstruct it; transactions that reconstruct poorly (high reconstruction error) or fall outside the learned distribution stand out as anomalies.
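To keep the sketch short, the example below uses a plain autoencoder rather than a full VAE; the reconstruction-error idea is the same (all data here is synthetic, and the architecture is an arbitrary choice):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
basis = torch.randn(4, 16)
normal = torch.randn(512, 4) @ basis   # "normal" data lives on a 4-D subspace

class AutoEncoder(nn.Module):
    def __init__(self, n_features=16):
        super().__init__()
        self.encoder = nn.Linear(n_features, 4)
        self.decoder = nn.Linear(4, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):                   # learn to reconstruct normal patterns
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(normal), normal)
    loss.backward()
    opt.step()

# Score new points by reconstruction error: inputs that don't match the
# learned structure reconstruct poorly and get a high score.
with torch.no_grad():
    ok = torch.randn(1, 4) @ basis     # looks like the training data
    odd = torch.randn(1, 16) * 3       # off the learned structure
    print(nn.functional.mse_loss(model(ok), ok).item())    # low error
    print(nn.functional.mse_loss(model(odd), odd).item())  # much higher error
```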
To avoid overfitting in large neural networks, one might employ a technique known as ________, which involves dropping out random neurons during training.
- Batch Normalization
- L2 Regularization
- Gradient Descent
- Dropout
The 'Dropout' technique involves randomly deactivating a fraction of neurons during training, which helps prevent overfitting in large neural networks.
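A quick PyTorch illustration (layer sizes are arbitrary); note how dropout is active in training mode and disabled in evaluation mode:

```python
import torch
import torch.nn as nn

# p=0.5 means each hidden activation is zeroed with 50% probability per pass.
model = nn.Sequential(
    nn.Linear(100, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 10),
)

x = torch.randn(1, 100)
model.train()               # dropout active: repeated passes give different outputs
print(model(x)[0, :3])
print(model(x)[0, :3])
model.eval()                # dropout disabled at inference time
print(model(x)[0, :3])
```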