A bank wants to use transaction details to determine the likelihood that a transaction is fraudulent. The outcome is either "fraudulent" or "not fraudulent." Which regression method would be ideal for this purpose?

  • Decision Tree Regression
  • Linear Regression
  • Logistic Regression
  • Polynomial Regression
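Logistic Regression is the ideal choice for binary classification tasks such as fraud detection, where the outcome is either fraudulent or not fraudulent. It models the probability of the positive class by passing a linear combination of the features through the sigmoid function, so its output can be read directly as the likelihood of fraud.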
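As a rough illustration of how such a model turns a feature into a probability, here is a minimal sketch in plain Python. The single feature (a scaled transaction amount), the toy data, and the function names are assumptions made for this example; a real fraud model would use many features and a tested library.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit a weight and bias by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors (predicted probability minus true label).
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: scaled transaction amount, label 1 = fraudulent.
amounts = [0.1, 0.3, 0.5, 0.9, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(amounts, labels)
p_small = sigmoid(w * 0.2 + b)  # estimated fraud probability, small amount
p_large = sigmoid(w * 3.8 + b)  # estimated fraud probability, large amount
```

The output is always between 0 and 1, which is why the model suits a "fraudulent" versus "not fraudulent" outcome better than a method that predicts an unbounded number.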

How does the Random Forest algorithm handle the issue of overfitting seen in individual decision trees?

  • By aggregating predictions from multiple trees
  • By increasing the tree depth
  • By reducing the number of features
  • By using a smaller number of trees
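Random Forest reduces overfitting by aggregating predictions from multiple decision trees. Each tree is trained on a bootstrap sample of the data and considers a random subset of features at each split, so the trees make partly independent errors. Averaging their outputs (or taking a majority vote) lowers variance and limits the impact of any single overfitted tree.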
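The aggregation step alone can be sketched as follows. This is not a full Random Forest: the three "trees" are hypothetical hand-written threshold rules standing in for trained trees, and the feature names are assumptions for illustration.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def forest_predict(trees, sample):
    """Ask every tree for a prediction, then aggregate by majority vote."""
    return majority_vote([tree(sample) for tree in trees])

# Hypothetical stand-ins for trained trees (1 = fraudulent, 0 = not).
trees = [
    lambda s: int(s["amount"] > 100),
    lambda s: int(s["hour"] < 6),
    lambda s: int(s["amount"] > 500),
]

sample = {"amount": 250, "hour": 3}
prediction = forest_predict(trees, sample)  # votes are 1, 1, 0
```

One tree disagrees here, but the ensemble output follows the majority, which is how a single tree's mistake gets outvoted.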

In the context of transfer learning, what is the main advantage of using pre-trained models on large datasets like ImageNet?

  • Feature Extraction
  • Faster Training
  • Reduced Generalization
  • Lower Computational Cost
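The main advantage of using models pre-trained on large datasets such as ImageNet is feature extraction. These models have already learned general-purpose features (for example edges, textures, and shapes) that transfer to new tasks, which reduces the training time and the amount of labeled data the new task needs.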

The process of reducing the dimensions of a dataset while preserving as much variance as possible is known as ________.

  • Principal Component Analysis
  • Random Sampling
  • Mean Shift
  • Agglomerative Clustering
"Principal Component Analysis" is the correct answer. PCA is a dimensionality reduction technique that finds new orthogonal axes (principal components), ordered by how much of the data's variance each one captures. Keeping only the leading components reduces the number of dimensions while preserving as much variance as possible.
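For two-dimensional data the first principal component can be computed in closed form, which makes for a compact sketch. The helper name and the toy points are assumptions for this example; real datasets with more dimensions would normally use a linear algebra library.

```python
import math

def pca_2d(points):
    """Reduce 2-D points to 1-D along the direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    var_x = sum(dx * dx for dx, _ in centered) / (n - 1)
    var_y = sum(dy * dy for _, dy in centered) / (n - 1)
    cov_xy = sum(dx * dy for dx, dy in centered) / (n - 1)

    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(theta), math.sin(theta))

    # Largest eigenvalue = variance captured by the first component.
    top_eigenvalue = (var_x + var_y) / 2 + math.hypot((var_x - var_y) / 2, cov_xy)
    explained_ratio = top_eigenvalue / (var_x + var_y)

    # Project each centered point onto the first principal component.
    projected = [dx * direction[0] + dy * direction[1] for dx, dy in centered]
    return direction, explained_ratio, projected

# Hypothetical points lying close to the line y = x.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
direction, explained_ratio, projected = pca_2d(points)
```

Because the points lie almost on a line, a single component retains nearly all of the variance, so one coordinate per point is enough to describe the data.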

In reinforcement learning scenarios where rapid feedback is not available, which strategy, exploration or exploitation, could be potentially riskier?

  • Exploration
  • Exploitation
  • Both are equally risky
  • Neither is risky
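In scenarios with delayed feedback, excessive exploration can be riskier: the agent commits to untested actions without being able to tell quickly whether they are harmful, so poor choices can accumulate before they are corrected. Exploitation does not uncover new options, but it relies on actions whose outcomes are already known, which makes it comparatively less risky in such cases.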
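One common way to control this trade-off is an epsilon-greedy rule, sketched below. The value estimates are hypothetical, and epsilon-greedy is only one possible strategy, named here as an assumption rather than something the question prescribes.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: pick any action
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])

q_values = [0.2, 0.8, 0.5]  # hypothetical value estimates per action
greedy_action = epsilon_greedy(q_values, epsilon=0.0)  # pure exploitation
```

When feedback is slow, a small epsilon keeps most decisions on actions with known outcomes while still allowing occasional exploration.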

What potential problem might arise if you include a vast number of irrelevant features in your machine learning model?

  • Increased accuracy
  • Model convergence
  • Overfitting
  • Underfitting
Including a vast number of irrelevant features can lead to overfitting. The model starts fitting noise, such as chance correlations between the irrelevant features and the target in the training data, and then generalizes poorly to new data. Selecting relevant features is therefore essential for good model performance.

An online platform uses an algorithm to recommend songs to users. If the platform only suggests popular songs without ever introducing new or less-known tracks, it's predominantly using which strategy?

  • Popularity-Based Recommendation System
  • Content-Based System
  • Collaborative Filtering System
  • Hybrid Recommendation System
A "popularity-based recommendation system" recommends the items that are most popular overall. Because it ignores each user's individual preferences, it offers little diversity or novelty, and its suggestions stay limited to songs that are already popular.
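A minimal sketch of such a system, assuming a hypothetical play log of (user, song) pairs, shows why every user receives the same suggestions:

```python
from collections import Counter

def recommend_popular(play_log, k=2):
    """Return the k most-played songs, ignoring who is asking."""
    counts = Counter(song for _user, song in play_log)
    return [song for song, _count in counts.most_common(k)]

# Hypothetical play log: (user, song) pairs.
play_log = [
    ("ana", "Song A"), ("ben", "Song A"), ("cy", "Song A"),
    ("ana", "Song B"), ("ben", "Song B"),
    ("cy", "Song C"),
]

top_songs = recommend_popular(play_log)
```

The function never looks at the user, so a rarely played track like "Song C" is not recommended to anyone.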

Policy Gradient Methods aim to optimize the ________ directly in reinforcement learning.

  • Policy
  • Value function
  • Environment
  • Reward
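In reinforcement learning, Policy Gradient Methods optimize the policy directly: they adjust the policy's parameters along the gradient of the expected return instead of deriving the policy from a learned value function. The policy defines the agent's behavior in an environment, that is, how it maps states to actions.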
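To make "optimizing the policy directly" concrete, here is a small sketch of a REINFORCE-style update on a two-armed bandit with a softmax policy. The bandit setup, the fixed rewards, and all names are assumptions for illustration; practical implementations add baselines and handle full state-dependent policies.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(rewards, steps=500, lr=0.1, seed=0):
    """Learn action preferences by following the policy gradient."""
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # policy parameters, one per action
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Gradient of log pi(action) w.r.t. each preference is
        # (1 if a == action else 0) - probs[a].
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * reward * (indicator - probs[a])
    return softmax(prefs)

# Hypothetical bandit: action 0 always pays 0, action 1 always pays 1.
final_probs = reinforce_bandit(rewards=[0.0, 1.0])
```

No value function is estimated here. The update changes the action probabilities themselves, shifting probability mass toward the rewarded action.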

A real estate company wants to predict the selling price of houses based on features like square footage, number of bedrooms, and location. Which regression technique would be most appropriate?

  • Decision Tree Regression
  • Linear Regression
  • Logistic Regression
  • Polynomial Regression
Linear Regression is the most suitable technique here because the target, the selling price, is a continuous variable. It models the price as a linear function of the input features, such as square footage and number of bedrooms. A categorical feature like location must first be encoded numerically, for example with one-hot encoding.
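With a single feature, the least-squares line has a simple closed form. The sketch below uses made-up square footage and price figures, chosen only so the arithmetic is easy to follow; a real model would include the other features as well.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: square footage and selling price.
square_feet = [1000, 1500, 2000, 2500]
prices = [150_000, 200_000, 250_000, 300_000]

slope, intercept = fit_line(square_feet, prices)
predicted_price = slope * 1800 + intercept  # estimate for an 1800 sq ft house
```

The prediction is a continuous value on the fitted line rather than a class label, which is what separates this task from the fraud-detection question.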

Which type of learning would be best suited for categorizing news articles into topics without pre-defined categories?

  • Reinforcement learning
  • Semi-supervised learning
  • Supervised learning
  • Unsupervised learning
Unsupervised learning is the best choice for categorizing news articles into topics without predefined categories. Unsupervised learning algorithms can cluster similar articles based on patterns and topics discovered from the data without the need for labeled examples. Reinforcement learning is more suitable for scenarios with rewards and actions. Supervised learning requires labeled data, and semi-supervised learning combines labeled and unlabeled data.
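Clustering without labels can be sketched with a bare-bones k-means. The 2-D points stand in for article feature vectors (for instance, word-frequency features reduced to two numbers), and both the data and the initial centroids are assumptions for this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids

# Hypothetical article vectors forming two loose groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
```

No topic names are supplied anywhere. The groups emerge from the distances between the vectors, and a person would name the resulting clusters afterwards.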