How can biases in training data affect the fairness of a machine learning model?

  • Bias in training data can lead to underrepresented groups not being considered
  • Bias can lead to faster training
  • Bias has no impact on model fairness
  • Bias can improve model fairness
When certain groups are underrepresented in the training data, the model sees too few examples to learn their patterns, so its predictions tend to be less accurate, and therefore less fair, for those groups.
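
One way to make this effect visible is to break a metric down by group instead of reporting a single overall number. The sketch below is only an illustration: the function name `accuracy_by_group`, the group labels "A" and "B", and all of the labels and predictions are made up for the example.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} so gaps between groups become visible."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Hypothetical data: group "A" contributes 6 samples, group "B" only 2.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 1]
groups = ["A", "A", "A", "A", "A", "A", "B", "B"]

overall = sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_group(y_true, y_pred, groups)
print(overall, per_group)
```

In this toy data the overall accuracy looks acceptable while every prediction for the smaller group is wrong, which is the kind of gap a single aggregate score hides.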

Which type of regression is used to predict the probability of a categorical outcome?

  • Decision Tree Regression
  • Linear Regression
  • Logistic Regression
  • Polynomial Regression
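Logistic Regression models the probability of a categorical outcome by passing a linear combination of the features through the sigmoid function. It is most commonly used when the dependent variable is binary (e.g., spam or not spam); multinomial variants extend it to more than two classes.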
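
As a minimal sketch of how a logistic regression model turns features into a probability, the snippet below applies the sigmoid function to a weighted sum. The weights, bias, and feature values are hypothetical and chosen by hand; a real model would learn them from data.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical parameters, not learned from any data set.
weights = [1.5, -0.5]
bias = -1.0

p_spam = predict_proba([2.0, 1.0], weights, bias)
label = "spam" if p_spam >= 0.5 else "not spam"
print(round(p_spam, 3), label)
```

The 0.5 threshold used to turn the probability into a label is a common default, not a requirement; it can be moved depending on the cost of each kind of error.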

A financial institution wants to predict whether a loan applicant is likely to default on their loan. They have a mix of numerical data (like income, age) and categorical data (like occupation, marital status). Which algorithm might be well-suited for this task due to its ability to handle both types of data?

  • Decision Tree
  • Random Forest
  • Support Vector Machine
  • k-Nearest Neighbors
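The Random Forest algorithm is well-suited for this task because tree-based models split on one feature at a time and can therefore work with both numerical and categorical data. It trains many decision trees on random subsets of the data and features and takes a majority vote over their predictions, which makes it more robust to overfitting than a single decision tree. Note that some implementations require categorical features to be encoded numerically first.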
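
The voting step can be sketched in a few lines. The three "trees" below are hand-written one-rule stumps, not trees learned from data, and the feature names and thresholds are hypothetical; the point is only to show how individual votes are combined.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label among the individual tree votes."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for trained trees; each mixes numerical
# and categorical features the way a real tree split could.
def tree_income(applicant):
    return "default" if applicant["income"] < 30000 else "repay"


def tree_occupation(applicant):
    return "default" if applicant["occupation"] == "unemployed" else "repay"


def tree_age(applicant):
    return "default" if applicant["age"] < 25 else "repay"


forest = [tree_income, tree_occupation, tree_age]
applicant = {"income": 25000, "occupation": "engineer", "age": 40}

votes = [tree(applicant) for tree in forest]
prediction = majority_vote(votes)
print(votes, prediction)
```

Here one stump votes "default" and two vote "repay", so the ensemble predicts "repay" even though a single tree disagreed.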

Which of the following RNN variants uses both a forget gate and an input gate to regulate the flow of information?

  • LSTM (Long Short-Term Memory)
  • GRU (Gated Recurrent Unit)
  • Elman Network
  • Jordan Network
The LSTM (Long Short-Term Memory) variant uses both a forget gate and an input gate (together with an output gate) to manage information flow. The forget gate decides how much of the previous cell state to keep, and the input gate decides how much new information to write, which makes the LSTM effective at retaining information over long sequences. A GRU, by contrast, uses update and reset gates.
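
To illustrate how the gates interact, here is a sketch of a single LSTM time step with scalar inputs and states. The parameter dictionary `p` and its key names are assumptions made for this example; real implementations use weight matrices and vectors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; returns the new hidden and cell state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old state, add part of the new
    h = o * math.tanh(c)
    return h, c


# Hypothetical parameters: forget gate almost fully open (keep everything),
# input gate almost fully closed (write nothing).
params = {
    "wf": 0.0, "uf": 0.0, "bf": 20.0,
    "wi": 0.0, "ui": 0.0, "bi": -20.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 0.0, "ug": 0.0, "bg": 0.0,
}

h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, p=params)
print(h, c)
```

With these settings the cell state passes through the step almost unchanged, which is the mechanism that lets an LSTM carry information across many time steps.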

t-SNE is a technique primarily used for what kind of task in machine learning?

  • Dimensionality Reduction
  • Image Classification
  • Anomaly Detection
  • Reinforcement Learning
t-SNE (t-distributed Stochastic Neighbor Embedding) is primarily used for dimensionality reduction, reducing high-dimensional data to a lower-dimensional representation for visualization and analysis.

Which algorithm is commonly used for blind source separation or separating mixed signals?

  • Principal Component Analysis (PCA)
  • Support Vector Machine (SVM)
  • K-Means Clustering
  • Decision Trees
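Of the options listed, Principal Component Analysis (PCA) is the closest fit: it finds the orthogonal directions of maximum variance in the data and can decorrelate mixed signals. Decorrelation alone does not guarantee that the original sources are recovered, however. The standard technique for blind source separation is Independent Component Analysis (ICA), which often uses PCA as a whitening preprocessing step.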

SVMs aim to maximize the margin, which is the distance between the decision boundary and the nearest ______ from any class.

  • Decision Tree
  • Hyperplane
  • Outlier
  • Support Vector
SVMs aim to maximize the margin, which is the distance between the decision boundary and the nearest support vector from any class. Support vectors are the training points closest to the decision boundary, and they alone determine where it lies.
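
For a linear boundary w·x + b = 0, the distance from a point x to the boundary is |w·x + b| / ||w||, and the margin is the smallest such distance over the training points. The sketch below computes it for a fixed, hypothetical boundary and a handful of made-up points; it does not train an SVM.

```python
import math


def distance_to_hyperplane(point, w, b):
    """Perpendicular distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def margin(points, w, b):
    """Distance from the boundary to the closest point."""
    return min(distance_to_hyperplane(p, w, b) for p in points)


# Hypothetical boundary x1 + x2 - 3 = 0 and sample points on both sides.
w, b = [1.0, 1.0], -3.0
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]

m = margin(points, w, b)
closest = [p for p in points if math.isclose(distance_to_hyperplane(p, w, b), m)]
print(m, closest)
```

The points in `closest` play the role of support vectors here: moving any of the other points slightly would leave the margin unchanged.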

The equation y = mx + c is a simple representation of ________ regression.

  • Linear
  • Logistic
  • Polynomial
  • Ridge
The equation y = mx + c represents simple linear regression. In this equation, 'y' is the dependent variable, 'x' is the independent variable, 'm' is the slope, and 'c' is the intercept. It models a straight-line relationship between the two variables.
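
As a small worked example, the slope and intercept can be estimated from data with the ordinary least squares formulas. The function name `fit_line` and the sample points are made up for illustration; the points lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimate of slope m and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical points generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, c = fit_line(xs, ys)
print(m, c)
print(m * 4.0 + c)  # prediction for x = 4
```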

You are working on a fraud detection system where false negatives (failing to detect a fraud) can have severe financial implications. Which metric would you prioritize to ensure that as many actual fraud cases as possible are detected?

  • Accuracy
  • F1 Score
  • Precision
  • Recall
In this scenario, Recall is the metric to prioritize. Recall is the fraction of actual fraud cases that the model detects, TP / (TP + FN), so maximizing it directly minimizes false negatives. Accuracy can be misleading here because fraud is usually rare, and Precision focuses on false positives rather than missed fraud.
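
The difference between recall and accuracy is easy to see on a small example. The labels and predictions below are made up (1 = fraud, 0 = legitimate), and the function names are illustrative only.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives that were detected: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all predictions that were correct."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical labels: 4 fraud cases, of which the model catches only 1.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

r = recall(y_true, y_pred)
a = accuracy(y_true, y_pred)
print(r, a)
```

The model misses three of the four fraud cases, yet its accuracy still looks moderate because the legitimate transactions dominate the count.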

In GANs, what is the significance of the Nash Equilibrium?

  • It's a point where both the generator and discriminator are optimal.
  • It's a theoretical concept without practical relevance.
  • It's the point where only the generator is optimal.
  • It's the point where only the discriminator is optimal.
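The Nash Equilibrium in GANs is the point at which both the generator and the discriminator are optimal: neither can improve its own objective by changing its strategy while the other stays fixed. In theory, the generator then reproduces the data distribution and the discriminator can do no better than guessing. It signifies stability in GAN training, although in practice training often oscillates rather than converging to this point.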