Which process involves transforming and creating new variables to improve a machine learning model's predictive performance?
- Data preprocessing
- Feature engineering
- Hyperparameter tuning
- Model training
Feature engineering is the process of transforming and creating new variables based on the existing data to enhance a model's predictive performance. This can involve scaling, encoding, or creating new features from existing ones.
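As a toy illustration (assumed example data, plain Python), min-max scaling an existing column and deriving a ratio feature are two common feature-engineering moves:

```python
# Minimal feature-engineering sketch on assumed toy data: scale an existing
# column to [0, 1] and derive a new ratio feature from two others.
rows = [
    {"income": 50_000, "debt": 10_000, "age": 30},
    {"income": 80_000, "debt": 20_000, "age": 45},
    {"income": 30_000, "debt": 15_000, "age": 25},
]

incomes = [r["income"] for r in rows]
lo, hi = min(incomes), max(incomes)
for r in rows:
    r["income_scaled"] = (r["income"] - lo) / (hi - lo)   # min-max scaling
    r["debt_to_income"] = r["debt"] / r["income"]         # new derived feature
```

Both new columns are computed purely from existing ones, which is the essence of feature engineering.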
You are developing a recommendation system for a music app. While the system's bias is low, it tends to offer very different song recommendations for slight variations in user input. This is an indication of which issue in the bias-variance trade-off?
- High Bias
- High Variance
- Overfitting
- Underfitting
This scenario points to high variance: with bias already low, a model that produces very different recommendations for slight input changes is fitting noise in the training data rather than generalizing to new user preferences. Low bias combined with high variance is the classic signature of overfitting.
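A minimal sketch of high variance, using a hypothetical 1-nearest-neighbor "recommender" that memorizes noisy training points, so tiny input changes can flip its output:

```python
import random

random.seed(0)

# Toy high-variance model: 1-nearest-neighbor memorizes every noisy
# training point, so predictions track noise rather than the true rule.
train = []
for _ in range(20):
    x = random.random()
    label = (x > 0.5) != (random.random() < 0.3)   # true rule + 30% label noise
    train.append((x, label))

def predict(x):
    # Always defer to the single closest memorized point.
    nearest = min(train, key=lambda p: abs(p[0] - x))
    return nearest[1]
```

Because the model reproduces every noisy training label exactly, nearby inputs that fall on opposite sides of a memorized noise point receive different predictions.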
When models are too simple and cannot capture the underlying trend of the data, it is termed ________.
- Misfitting
- Overfitting
- Simplification
- Underfitting
When a model is too simple to capture the underlying patterns in the data, it is referred to as "underfitting." Underfit models have high bias and low variance, making them ineffective for predictions.
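A quick sketch of underfitting: a constant (mean) predictor applied to perfectly linear toy data leaves a large error no matter how cleanly the data follows its trend:

```python
# Underfitting sketch: a constant (mean) predictor is too simple for
# data with a clear linear trend, leaving large errors (high bias).
xs = list(range(10))
ys = [2 * x + 1 for x in xs]          # perfectly linear trend

mean_y = sum(ys) / len(ys)            # the "model" ignores x entirely
mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
```

The mean squared error here is 33.0, even though the data is noise-free; the model class is simply incapable of expressing the trend.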
In the context of deep learning, what is the primary use case of autoencoders?
- Image Classification
- Anomaly Detection
- Text Generation
- Reinforcement Learning
Among these options, the primary use case of autoencoders is anomaly detection. An autoencoder learns to reconstruct the normal patterns in the data, so inputs that deviate from those patterns produce high reconstruction error and can be flagged as anomalies, which is useful in applications such as fraud detection and fault diagnosis.
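The reconstruction-error idea can be sketched without a deep learning framework. Here a hand-rolled linear "autoencoder" (assuming normal 2-D points lie near the line y = x) compresses each point to one number and back:

```python
# Toy linear "autoencoder": compress 2-D points to a 1-D bottleneck and
# reconstruct; points far from the learned pattern get large errors.
def encode(p):            # 2-D -> 1-D bottleneck
    return (p[0] + p[1]) / 2

def decode(t):            # 1-D -> 2-D reconstruction
    return (t, t)

def recon_error(p):
    q = decode(encode(p))
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

normal = (3.0, 3.1)       # close to y = x -> small reconstruction error
anomaly = (3.0, 9.0)      # far from the learned pattern -> large error
```

Thresholding the reconstruction error is exactly how autoencoder-based anomaly detection works; a real autoencoder just learns the encode/decode maps from data.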
The weights and biases in a neural network are adjusted during the ________ process to minimize the loss.
- Forward Propagation
- Backpropagation
- Initialization
- Regularization
Weights and biases in a neural network are adjusted during backpropagation, which propagates the error backward through the network to compute gradients; the parameters are then updated (typically via gradient descent) to minimize the loss.
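The idea can be shown with a single weight: a toy forward pass, a gradient computed by the chain rule, and an update step (all values here are illustrative):

```python
# One-weight backprop sketch: minimize loss = (w*x - target)**2 by
# nudging w against the gradient each step.
x, target = 2.0, 6.0     # the ideal weight is w = 3
w, lr = 0.0, 0.05

for _ in range(100):
    y = w * x                     # forward pass
    grad = 2 * (y - target) * x   # backward pass: d(loss)/dw via chain rule
    w -= lr * grad                # gradient-descent update
```

After 100 updates, w converges to 3.0; in a real network the same gradient computation is applied layer by layer.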
What is the primary goal of exploration in reinforcement learning?
- To gather information about the environment
- To maximize immediate rewards
- To stick with known actions
- To build a policy
Exploration's primary goal is to gather information about the environment, helping an RL agent learn and make better decisions in the long run.
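A common way to balance exploration against exploitation is an epsilon-greedy strategy. This toy two-armed-bandit sketch (assumed reward probabilities) explores at random 10% of the time:

```python
import random

random.seed(1)

# Epsilon-greedy sketch: with probability eps the agent explores a random
# arm instead of exploiting its current best estimate.
true_means = [0.2, 0.8]          # assumed reward probability per arm
counts = [0, 0]
estimates = [0.0, 0.0]
eps = 0.1

for _ in range(2000):
    if random.random() < eps:
        arm = random.randrange(2)                  # explore
    else:
        arm = estimates.index(max(estimates))      # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
```

Without the exploration step the agent can lock onto the first arm it tries; the occasional random pulls are what let it discover that the second arm pays better.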
A bioinformatics researcher is trying to visualize the similarities and differences between different genes in a 2D space. The data is high dimensional. Which technique would provide a good visualization emphasizing local similarities?
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Principal Component Analysis
- Linear Regression
- A* Search Algorithm
t-SNE is well suited to visualizing high-dimensional data because it preserves local similarities: it maps data points to a 2D space in a way that emphasizes neighborhood relationships, making it a good choice for visualizing gene similarities.
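A hedged sketch using scikit-learn's `TSNE` (assumes scikit-learn and NumPy are installed; the synthetic "gene" matrix is purely illustrative):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
genes = rng.normal(size=(50, 100))   # 50 genes, 100-D expression profiles

# Embed into 2-D; perplexity controls the effective neighborhood size
# and must be smaller than the number of samples.
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(genes)
print(emb.shape)  # (50, 2)
```

Each gene becomes a point in the plane, and genes with similar high-dimensional profiles land near each other, which is the "local similarity" property the question highlights.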
How do Policy Gradient Methods differ from value-based methods in their approach to reinforcement learning?
- Value-based methods learn the policy directly, while Policy Gradient Methods learn value functions
- They learn both the policy and the value function in the same way
- Policy Gradient Methods learn the policy directly, while value-based methods learn value functions
- They learn neither the policy nor the value function
Policy Gradient Methods focus on learning the policy directly, which means they determine the probability of taking actions. Value-based methods, on the other hand, learn the value of states or state-action pairs. This key difference is essential for understanding the approach to RL each method takes.
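The difference can be made concrete with a minimal REINFORCE-style sketch on a two-action bandit: the softmax policy's logits are updated directly from sampled rewards, with no value function anywhere (reward probabilities are assumed):

```python
import math
import random

random.seed(0)

# REINFORCE-style sketch: learn the logits of a softmax policy directly.
rewards = [0.1, 0.9]    # assumed success probability per action
logits = [0.0, 0.0]
lr = 0.1

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

for _ in range(3000):
    probs = softmax(logits)
    a = 0 if random.random() < probs[0] else 1
    r = 1.0 if random.random() < rewards[a] else 0.0
    # Softmax policy gradient: d log pi(a) / d logit_i = 1[i == a] - probs[i]
    for i in range(2):
        logits[i] += lr * r * ((1 if i == a else 0) - probs[i])
```

A value-based method such as Q-learning would instead maintain estimates of each action's value and derive behavior from them; here the action probabilities themselves are the learned object.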
A utility company wants to predict the demand for electricity for the next week based on historical data. They have data for the past ten years, recorded every hour. Which type of machine learning task is this, and what challenges might they face due to the nature of the data?
- Time Series Forecasting
- Clustering
- Image Recognition
- Reinforcement Learning
This is a time series forecasting task: it involves predicting future values from historical data recorded at regular intervals. Challenges include handling seasonality (daily and weekly demand cycles), long-term trends, outliers, and autocorrelation in the hourly data; appropriate feature selection and model choice are also crucial.
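A seasonal-naive baseline, which forecasts each hour as the value observed one week (168 hours) earlier, is a common starting point for such data. This sketch uses a toy series with a daily cycle:

```python
# Seasonal-naive baseline sketch for hourly demand: predict next week's
# value as the value observed 168 hours (one week) earlier.
hours_per_week = 168
history = [100 + 10 * (h % 24) for h in range(10 * hours_per_week)]  # toy daily cycle

forecast = history[-hours_per_week:]   # "same hour last week" prediction
```

Despite its simplicity, this baseline is hard to beat on strongly seasonal demand data, so any fancier model should be compared against it.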
If a model has low bias and high variance, it is likely that the model is ________.
- Optimally Fitted
- Overfitting
- Underfitting
- Well-fitted
A model with low bias and high variance is likely overfitting: low bias means it fits the training data very well (potentially too well), and high variance means it is sensitive to fluctuations in the data, which leads to poor generalization.
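The telltale gap between training and test performance can be sketched with a model that simply memorizes its training set (toy data, hypothetical labels):

```python
# Overfitting signature: a memorizing model scores perfectly on training
# data but no better than chance on unseen inputs.
train = {1: "A", 2: "B", 3: "A", 4: "B"}
test = {5: "A", 6: "B"}

def memorize_predict(x):
    return train.get(x, "A")   # falls back to a default off the training set

train_acc = sum(memorize_predict(x) == y for x, y in train.items()) / len(train)
test_acc = sum(memorize_predict(x) == y for x, y in test.items()) / len(test)
```

Perfect training accuracy (zero bias on the training set) paired with poor test accuracy is the practical symptom of the low-bias, high-variance regime.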
GRUs are often considered a middle ground between basic RNNs and ________ in terms of complexity and performance.
- LSTMs
- CNNs
- Autoencoders
- K-Means Clustering
GRUs (Gated Recurrent Units) are a compromise between basic RNNs and LSTMs, offering a balance between the complexity and performance of these two types of recurrent networks.
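For reference, here is a scalar sketch of the GRU update (toy 1-D weights): two gates, an update gate and a reset gate, and no separate cell state or output gate as in an LSTM:

```python
import math

# Scalar GRU cell sketch (toy 1-D weights, chosen for illustration only).
def sigmoid(v):
    return 1 / (1 + math.exp(-v))

def gru_step(x, h, w):
    z = sigmoid(w["wz"] * x + w["uz"] * h)               # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h)               # reset gate
    h_cand = math.tanh(w["wh"] * x + w["uh"] * (r * h))  # candidate state
    return (1 - z) * h + z * h_cand                      # blend old and new

w = {"wz": 1.0, "uz": 0.5, "wr": 1.0, "ur": 0.5, "wh": 1.0, "uh": 0.5}
h = 0.0
for x in [1.0, -1.0, 0.5]:
    h = gru_step(x, h, w)
```

An LSTM adds a separate cell state and a third (output) gate, which is where the extra complexity, and often the extra capacity, comes from.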
In SVM, the data points that are closest to the decision boundary and influence its orientation are called ______.
- Decision Points
- Influence Points
- Margin Points
- Support Vectors
The data points that are closest to the decision boundary are known as "Support Vectors" in Support Vector Machines (SVM). These points play a crucial role in determining the orientation of the decision boundary.
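A hedged scikit-learn sketch (assumes scikit-learn and NumPy are installed): fit a linear SVM on a tiny separable set and inspect which points become support vectors:

```python
import numpy as np
from sklearn.svm import SVC

# Four 1-D points; the boundary falls at x = 2, so the points at
# x = 1 and x = 3 (indices 1 and 2) are the ones closest to it.
X = np.array([[0.0], [1.0], [3.0], [4.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # large C ~ hard margin
print(clf.support_)   # indices of the support vectors
```

Only the margin points carry nonzero dual coefficients; the points at x = 0 and x = 4 could be moved or removed without changing the decision boundary.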