In reinforcement learning, what term describes the dilemma of choosing between trying out new actions and sticking with known actions that work?
- Exploration-Exploitation Dilemma
- Action Selection Dilemma
- Reinforcement Dilemma
- Policy Dilemma
The Exploration-Exploitation Dilemma is the challenge of balancing exploration (trying new actions to gather information) with exploitation (choosing actions already known to yield high reward). Striking this balance is central to optimal decision-making in reinforcement learning.
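A minimal sketch of this trade-off is the epsilon-greedy rule: with a small probability the agent explores a random action, otherwise it exploits the best-known one. The value estimates and epsilon below are illustrative.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon explore a random action; otherwise exploit the best-known one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])      # exploit

random.seed(0)
q = [0.2, 0.5, 0.1]     # illustrative value estimates for three actions
counts = [0, 0, 0]
for _ in range(1000):
    counts[epsilon_greedy(q)] += 1
print(counts)  # action 1 (highest estimate) dominates, but the others are still tried occasionally
```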
How do the generator and discriminator components of a GAN interact during training?
- The generator produces real data.
- The discriminator generates fake data.
- The generator tries to fool the discriminator.
- The discriminator generates real data.
In a GAN (Generative Adversarial Network), the generator creates fake data to deceive the discriminator, which aims to distinguish between real and fake data. This adversarial process improves the quality of the generated data.
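The adversarial interaction can be seen in the two loss terms: the discriminator is rewarded for scoring real data high and fake data low, while the generator is rewarded when its fakes score high. A minimal numpy sketch, with illustrative 1-D data and hypothetical discriminator parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w, b):
    """Toy 1-D discriminator: logistic score = estimated probability the sample is real."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

real = rng.normal(4.0, 1.0, size=256)   # "real" data distribution
fake = rng.normal(0.0, 1.0, size=256)   # generator output (untrained, so far from real)

w, b = 1.0, -2.0  # hypothetical discriminator parameters
d_loss = (-np.mean(np.log(discriminator(real, w, b)))
          - np.mean(np.log(1.0 - discriminator(fake, w, b))))  # discriminator: label real 1, fake 0
g_loss = -np.mean(np.log(discriminator(fake, w, b)))           # generator: make fakes look real
print(d_loss, g_loss)
```

During training each side takes gradient steps on its own loss, so improving one pressures the other to improve in turn.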
A company wants to develop a chatbot that learns how to respond to customer queries by interacting with them and getting feedback. The chatbot should improve its responses over time based on this feedback. This is an application of which type of learning?
- Online Learning
- Reinforcement Learning
- Supervised Learning
- Unsupervised Learning
This is an application of reinforcement learning. In reinforcement learning, an agent interacts with its environment and learns to make decisions to maximize a reward signal. The chatbot improves based on feedback (rewards) received.
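A hypothetical sketch of the feedback loop: the chatbot keeps a value estimate per candidate response and nudges it toward the reward each interaction produces. The responses, rewards, and learning rate are all illustrative.

```python
# Hypothetical value estimates for two candidate responses
values = {"greet": 0.0, "apologize": 0.0}
alpha = 0.5  # learning rate

def update(response, reward):
    """Move the estimate a fraction of the way toward the observed reward."""
    values[response] += alpha * (reward - values[response])

update("greet", 1.0)        # positive customer feedback
update("apologize", -1.0)   # negative customer feedback
print(values)  # {'greet': 0.5, 'apologize': -0.5}
```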
When regular Q-learning takes too much time to converge in a high-dimensional state space (e.g., autonomous vehicle parking), what modification could help it learn faster?
- Deep Q-Networks (DQNs)
- Policy Gradient Methods
- Fitted Q-Iteration (FQI)
- Temporal Difference (TD) Learning
Deep Q-Networks (DQNs) modify Q-learning by using a neural network to approximate the Q-values instead of storing them in a table, which handles high-dimensional state spaces efficiently and speeds up learning in complex environments.
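The core idea, swapping the Q-table for a parametric approximator trained with temporal-difference updates, can be sketched with a linear model in place of a DQN's deep network (a simplification for illustration; the update rule has the same shape):

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_actions = 4, 3
W = np.zeros((n_actions, n_features))  # weight matrix replaces the Q-table
alpha, gamma = 0.1, 0.9                # learning rate, discount factor

def q_values(state):
    return W @ state   # one Q estimate per action

def td_update(state, action, reward, next_state):
    """Semi-gradient TD update on the approximator's weights (a DQN does this with a deep net)."""
    target = reward + gamma * np.max(q_values(next_state))
    td_error = target - q_values(state)[action]
    W[action] += alpha * td_error * state  # gradient of linear Q w.r.t. weights is the state itself

s, s2 = rng.random(n_features), rng.random(n_features)
td_update(s, 0, 1.0, s2)
print(q_values(s))  # action 0's estimate has moved toward the observed reward
```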
Techniques like backward elimination, forward selection, and recursive feature elimination are used for ________ in machine learning.
- Cross-Validation
- Data Preprocessing
- Feature Selection
- Model Training
Techniques like backward elimination, forward selection, and recursive feature elimination are used for feature selection in machine learning. Feature selection helps identify the most relevant features for building accurate models and can improve model efficiency.
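Forward selection, for instance, greedily adds the feature that most reduces model error at each step. A minimal numpy sketch on synthetic data where only two of five features matter:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 1] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)  # only features 1 and 3 matter

def sse(features):
    """Residual sum of squares of a least-squares fit on the chosen feature columns."""
    A = X[:, features]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

selected, remaining = [], list(range(5))
for _ in range(2):  # greedily add the feature that reduces error most
    best = min(remaining, key=lambda f: sse(selected + [f]))
    selected.append(best)
    remaining.remove(best)
print(sorted(selected))  # [1, 3]
```

Backward elimination and recursive feature elimination work in the opposite direction, starting from all features and removing the least useful ones.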
Which tool or technique is often used to make complex machine learning models more understandable for humans?
- Explainable AI (XAI)
- Reinforcement Learning
- Principal Component Analysis
- Gradient Boosting
Explainable AI (XAI) techniques are employed to make complex machine learning models interpretable, providing insight into how their decisions are made.
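One common model-agnostic XAI technique is permutation importance: shuffle one feature's column and measure how much the model's error grows. A minimal sketch with a least-squares fit standing in for an arbitrary black-box model:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=500)  # only feature 0 drives the target

coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # stand-in "black-box" model
predict = lambda data: data @ coef

def permutation_importance(feature):
    """Error increase when one feature's column is shuffled: big increase = important feature."""
    base = np.mean((y - predict(X)) ** 2)
    Xp = X.copy()
    Xp[:, feature] = rng.permutation(Xp[:, feature])
    return np.mean((y - predict(Xp)) ** 2) - base

scores = [permutation_importance(f) for f in range(3)]
print(scores)  # feature 0's score dwarfs the others
```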
Ensuring that a machine learning model does not unintentionally favor or discriminate against certain groups is ensuring its ________.
- Fairness
- Accuracy
- Efficiency
- Robustness
Ensuring fairness in machine learning models means preventing biases and discrimination in model predictions across different groups.
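One simple fairness check is the demographic-parity gap: the difference in positive-prediction rates between groups. The predictions and group labels below are purely illustrative.

```python
import numpy as np

# Hypothetical model predictions and a binary group attribute (e.g. two demographic groups)
pred  = np.array([1, 1, 0, 1, 0, 0, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

rate_a = pred[group == 0].mean()   # positive-prediction rate, group 0
rate_b = pred[group == 1].mean()   # positive-prediction rate, group 1
parity_gap = abs(rate_a - rate_b)  # demographic-parity difference; 0 means equal rates
print(rate_a, rate_b, parity_gap)  # 0.75 0.25 0.5
```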
Which of the following techniques is primarily used for dimensionality reduction in datasets with many features?
- Apriori Algorithm
- Breadth-First Search (BFS)
- Linear Regression
- Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a dimensionality reduction technique used to reduce the number of features while preserving data variance.
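A minimal PCA sketch via the singular value decomposition of centered data; the dataset and component count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = 2 * X[:, 0]          # make one feature redundant
Xc = X - X.mean(axis=0)        # center before PCA

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
reduced = Xc @ Vt[:k].T        # project onto the top-k principal components
explained = (S**2) / (S**2).sum()  # fraction of variance carried by each component
print(reduced.shape, explained[:k].round(2))
```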
In the multi-armed bandit problem, the challenge is to balance between exploration of arms and ________ of the best-known arm.
- Exploitation
- Reward accumulation
- Arm selection
- Probability estimation
The multi-armed bandit problem involves the trade-off between exploration (trying new arms) and exploitation (selecting the best-known arm).
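One classic way to manage this trade-off is UCB1, which adds an exploration bonus that shrinks as an arm is sampled more often. The arm payout probabilities below are illustrative.

```python
import math, random

random.seed(0)
true_means = [0.2, 0.8, 0.4]   # hypothetical arm payout probabilities
counts = [0] * 3
totals = [0.0] * 3

for t in range(1, 2001):
    # UCB1: exploit high estimates, but an exploration bonus favors rarely-sampled arms
    arm = max(range(3), key=lambda a: float("inf") if counts[a] == 0
              else totals[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]))
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    totals[arm] += reward

print(counts)  # the best arm (index 1) is pulled most
```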
RNNs are particularly suitable for tasks like ________ because of their ability to handle sequences.
- Sentiment Analysis
- Image Classification
- Sequence Prediction
- Audio Recognition
RNNs excel at tasks that involve sequences, such as sequence prediction, where earlier elements in the sequence influence later ones.
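The mechanism behind this is the recurrent hidden state, which carries information from earlier steps into later ones. A minimal numpy forward pass with illustrative dimensions and random weights:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inp = 8, 3
Wxh = rng.normal(scale=0.1, size=(hidden, inp))     # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(hidden, hidden))  # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden)

def rnn_forward(sequence):
    """The hidden state carries information from earlier steps into later ones."""
    h = np.zeros(hidden)
    for x in sequence:
        h = np.tanh(Wxh @ x + Whh @ h + b)  # new state depends on the input AND the previous state
    return h

seq = rng.normal(size=(5, inp))    # a sequence of 5 input vectors
print(rnn_forward(seq).shape)      # (8,)
```

Because the state is threaded through every step, perturbing an early element changes the final state, which is exactly what makes RNNs sequence-aware.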