Can you explain the concept of Semi-Supervised Learning and how it bridges the gap between supervised and unsupervised learning?

  • Combines labeled & unlabeled data
  • Uses only labeled data
  • Uses only unlabeled data
  • Uses rewards and penalties
Semi-Supervised Learning bridges the gap by combining a small amount of labeled data with a large amount of unlabeled data, leveraging the strengths of both supervised and unsupervised learning.
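The idea can be sketched with a toy self-training loop (an illustration, not a library API): a simple nearest-centroid classifier for two classes labeled 0 and 1 is fit on the labeled points, then unlabeled points it classifies confidently are pseudo-labeled and folded back in. The `margin` threshold is a made-up stand-in for model confidence.

```python
# Toy self-training on 1-D data with two classes (labels 0 and 1 assumed).

def centroids(points, labels):
    """Mean of the points belonging to each class label."""
    out = {}
    for lab in set(labels):
        vals = [p for p, l in zip(points, labels) if l == lab]
        out[lab] = sum(vals) / len(vals)
    return out

def predict(cents, x):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda lab: abs(x - cents[lab]))

def self_train(labeled_x, labeled_y, unlabeled_x, margin=1.0):
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    while pool:
        cents = centroids(x, y)
        # Keep only points whose distances to the two centroids differ by
        # more than `margin` -- a crude stand-in for prediction confidence.
        confident = [p for p in pool
                     if abs(abs(p - cents[0]) - abs(p - cents[1])) > margin]
        if not confident:
            break
        for p in confident:
            x.append(p)
            y.append(predict(cents, p))   # pseudo-label the unlabeled point
            pool.remove(p)
    return centroids(x, y)
```

With two labeled points and four unlabeled ones, the final centroids are pulled toward the unlabeled data, which is exactly the benefit semi-supervised learning aims for.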

Cross-Validation divides the dataset into "k" subsets, or _______, where one subset is used as the validation set, and the rest are used for training.

  • clusters
  • folds
  • groups
  • partitions
Cross-Validation involves dividing the dataset into "k" subsets, referred to as "folds." One fold is used as the validation set, while the remaining k - 1 folds are used for training. This process is repeated k times, with each fold serving exactly once as the validation set.
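The splitting step can be sketched in a few lines of plain Python (a hand-rolled illustration; real code would use `sklearn.model_selection.KFold`):

```python
# Yield (train_indices, val_indices) pairs for k-fold cross-validation.

def k_fold_splits(n_samples, k):
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for i in range(k):
        # Spread any remainder over the first `remainder` folds.
        stop = start + fold_size + (1 if i < remainder else 0)
        val = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, val
        start = stop
```

For 10 samples and k = 3 this produces folds of sizes 4, 3, and 3, and every index appears in a validation set exactly once.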

In Machine Learning, the term _________ refers to the values that the algorithm tries to predict, while _________ refers to the input variables.

  • data, parameters
  • features, targets
  • parameters, data
  • targets, features
In machine learning, "targets" are the values that a model tries to predict based on given "features," which are the input variables that represent the data.

What does DBSCAN stand for in the context of clustering algorithms?

  • Data-Based Scan Algorithm
  • Density-Based Spatial Clustering of Applications with Noise
  • Distribution-Based Scan Clustering
  • Dynamic-Based Scan Algorithm
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. It is a clustering algorithm that groups together points that are closely packed (points with many neighbors within a radius eps), while marking points that lie alone in low-density regions as noise.
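A compact sketch of the algorithm, assuming Euclidean distance (illustrative only; real use would call `sklearn.cluster.DBSCAN`). Clusters get ids 0, 1, ..., and noise points get the conventional label -1:

```python
from math import dist  # Euclidean distance, Python 3.8+

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)          # None = unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1                 # not a core point: tentatively noise
            continue
        cluster += 1                       # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # border point reachable from a core
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [m for m in range(len(points))
                           if dist(points[j], points[m]) <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is itself a core point: expand
    return labels
```

Two tight groups of points plus a far-away outlier come back as two clusters and one noise label, with no need to specify the number of clusters in advance.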

Artificial Intelligence encompasses both ________ and ________, including methods that may not involve learning from data.

  • AI, Deep Learning
  • Deep Learning, AI
  • Machine Learning, AI
  • Machine Learning, Deep Learning
Artificial Intelligence encompasses both Machine Learning and Deep Learning, as well as methods, such as rule-based systems, that do not learn from data.

What is the main function of the Gini Index in a Decision Tree?

  • Determine Leaf Nodes
  • Increase Complexity
  • Measure Purity
  • Reduce Overfitting
The Gini Index measures the impurity (or, conversely, the purity) of a node: when growing a Decision Tree, the split that minimizes the weighted Gini impurity of the resulting child nodes is chosen.
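The computation is short enough to write out directly. Gini impurity is 1 minus the sum of squared class proportions: 0.0 for a perfectly pure node, 0.5 at worst for two classes.

```python
def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gini(left_labels, right_labels):
    """Weighted Gini impurity of a candidate split, which the tree minimizes."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) \
         + (len(right_labels) / n) * gini(right_labels)
```

A split that puts each class entirely on one side scores 0.0, while a split that leaves both sides mixed scores higher, so the tree prefers the former.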

How does Principal Component Analysis (PCA) work as a method of dimensionality reduction?

  • By classifying features
  • By maximizing variance
  • By minimizing variance
  • By selecting principal features
Principal Component Analysis (PCA) works by transforming the original features into a new set of uncorrelated features called principal components. It does so by maximizing the variance along these new axes, meaning that the first principal component explains the most variance, the second explains the second most, and so on.
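A from-scratch sketch for 2-D data makes the "maximize variance" step concrete (real code would use `sklearn.decomposition.PCA` or `numpy.linalg.eigh`): center the data, form the covariance matrix, and find its leading eigenvector, which is the direction of maximum variance, via power iteration.

```python
def first_principal_component(points, iters=200):
    """Unit vector along the direction of maximum variance of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vx, vy = 1.0, 0.0
    for _ in range(iters):
        wx = cxx * vx + cxy * vy
        wy = cxy * vx + cyy * vy
        norm = (wx * wx + wy * wy) ** 0.5
        vx, vy = wx / norm, wy / norm
    return vx, vy
```

For points lying on the line y = 2x, the first principal component comes out parallel to (1, 2): projecting onto that single axis preserves all of the variance, which is exactly the dimensionality reduction PCA performs.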

What are some common challenges in high-dimensional data that dimensionality reduction aims to address?

  • All of the above
  • Computational efficiency
  • Curse of dimensionality
  • Overfitting
Dimensionality reduction aims to address several challenges in high-dimensional data, including the curse of dimensionality (where distance measures lose meaning), overfitting (where models fit noise), and computational efficiency (since fewer dimensions require less computing resources).
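The "distance measures lose meaning" part can be seen in a small experiment (a toy illustration with made-up sample sizes): as the dimension grows, the gap between the nearest and farthest point shrinks relative to the nearest distance, so distance-based methods like k-NN degrade.

```python
import random

def relative_contrast(dim, n_points=100, seed=0):
    """(max distance - min distance) / min distance to a random query point."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    dists = [sum((a - b) ** 2 for a, b in zip(p, query)) ** 0.5
             for p in points]
    return (max(dists) - min(dists)) / min(dists)
```

In 2 dimensions the contrast is large (the nearest point is much closer than the farthest), while in hundreds of dimensions all pairwise distances concentrate around the same value and the contrast collapses.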

Interaction effects in Multiple Linear Regression can be represented by adding a ___________ term for the interacting variables.

  • additive
  • divided
  • multiplied
  • subtractive
Interaction effects are represented by adding a multiplied (product) term for the interacting variables to the model. This term captures a combined effect that is not simply additive, reflecting how the response changes when both interacting variables change together.
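A small sketch with made-up coefficients shows why the product term matters: in the model y = b0 + b1*x1 + b2*x2 + b3*(x1*x2), the effect of a one-unit increase in x1 is (b1 + b3*x2), so it depends on the level of x2.

```python
# Illustrative coefficients only -- not fitted to any data.

def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=3.0):
    """Regression model with an interaction (product) term x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)

def slope_in_x1(x2, b1=2.0, b3=3.0):
    """Effect of a one-unit increase in x1, which varies with x2."""
    return b1 + b3 * x2
```

Without the b3 term the slope in x1 would be the constant b1 regardless of x2; with it, the two variables' effects are coupled.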

In Polynomial Regression, a higher degree can lead to ________, where the model learns the noise in the data.

  • accuracy
  • overfitting
  • stability
  • underfitting
A higher degree in Polynomial Regression may cause the model to fit the noise in the data, leading to overfitting.
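A deterministic toy example makes this concrete: three samples of the line y = x, with noise +0.5 added at x = 1. A degree-2 polynomial (built here by Lagrange interpolation) passes through all three points exactly, achieving zero training error, but it has memorized the noise and extrapolates far worse than a simple least-squares line.

```python
def lagrange_eval(pts, x):
    """Evaluate the unique polynomial passing through every point in pts."""
    total = 0.0
    for i, (xi, yi) in enumerate(pts):
        term = yi
        for j, (xj, _) in enumerate(pts):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(pts):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return slope, my - slope * mx

train = [(0, 0.0), (1, 1.5), (2, 2.0)]   # y = x, plus noise +0.5 at x = 1
slope, intercept = linear_fit(train)
```

The quadratic's perfect training fit is the warning sign: at x = 3 the true value is 3, the quadratic predicts 1.5 (error 1.5), while the humbler linear fit errs by only about 0.17.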