When a Polynomial Regression model is overfitted, it will perform well on the _________ data but poorly on the _________ data.

  • test, training
  • training, test
  • training, validation
  • validation, training
An overfitted Polynomial Regression model will fit the training data very well but perform poorly on unseen test data.
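
The gap between training and test error can be sketched in plain Python. This is only an illustration under stated assumptions: the data (a noisy straight line), the random seed, and the helper names `lagrange_predict` and `mse` are made up for the example, and Lagrange interpolation stands in for "a polynomial of the highest possible degree".

```python
import random


def lagrange_predict(xs, ys, x):
    """Evaluate at x the degree-(n-1) polynomial passing through all n points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable


def noisy_line(x):
    # Assumed ground truth: y = 2x + 1 plus Gaussian noise.
    return 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)


train_x = [float(i) for i in range(8)]
train_y = [noisy_line(x) for x in train_x]
test_x = [i + 0.5 for i in range(7)]  # unseen points between the training points
test_y = [noisy_line(x) for x in test_x]


def overfitted_model(x):
    # Degree-7 polynomial that memorises the 8 training points, noise included.
    return lagrange_predict(train_x, train_y, x)


train_mse = mse(overfitted_model, train_x, train_y)
test_mse = mse(overfitted_model, test_x, test_y)
print("training MSE:", train_mse)
print("test MSE:", test_mse)
```

Because the polynomial passes through every training point, its training error is zero, while the error on the unseen points is not.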

In the context of K-Means clustering, what challenges may arise with poorly initialized centroids?

  • Faster convergence
  • No convergence
  • No effect on clustering
  • Suboptimal clustering, Slow convergence
Poorly initialized centroids in K-Means may lead to suboptimal clustering and slow convergence. K-Means only converges to a local minimum of the within-cluster variance, so a bad starting point can leave the algorithm stuck in a poor solution.
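
A small sketch of the local-minimum problem, assuming one-dimensional data and a plain Lloyd-style loop. The function name `kmeans_1d`, the six data points, and both sets of starting centroids are hypothetical choices made for illustration.

```python
def kmeans_1d(points, initial_centroids, max_iter=100):
    """Lloyd-style K-Means on 1-D data; returns (centroids, inertia)."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[k] for k, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # no centroid moved, so the algorithm has converged
        centroids = new_centroids
    # Inertia: sum of squared distances to the nearest centroid.
    inertia = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, inertia


points = [0, 1, 10, 11, 20, 21]  # three obvious groups

good_centroids, good_inertia = kmeans_1d(points, [0, 10, 20])
poor_centroids, poor_inertia = kmeans_1d(points, [0, 1, 10])

print(good_centroids, good_inertia)  # [0.5, 10.5, 20.5] 1.5
print(poor_centroids, poor_inertia)  # [0.0, 1.0, 15.5] 101.0
```

With two of the three starting centroids placed inside the same group, the run converges to a solution that splits one group and merges the other two. Library implementations usually reduce this risk with smarter seeding or several restarts.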

You are faced with a multi-class classification problem. How would the choice of K and distance metric affect the KNN algorithm's ability to differentiate between the classes?

  • Choice of K affects precision, distance metric affects generalization
  • Choice of K affects recall, distance metric affects speed
  • Careful choice of K and distance metric affects differentiation between classes
  • It has no effect
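Both choices shape how well KNN separates the classes in a multi-class problem. K controls how smooth the decision boundaries are (a small K is sensitive to noise, a large K can blur the boundary between neighbouring classes), and the distance metric determines which training points count as neighbours in the first place.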
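
The following sketch shows both effects on a toy three-class dataset. Everything here is an assumption made for illustration: the points, the labels "A", "B" and "C", and the helper `knn_predict`, which uses a simple majority vote.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, metric):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(train, key=lambda item: metric(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (point, class label)
train = [
    ((1.0, 1.0), "A"),
    ((0.0, 1.8), "B"),
    ((-1.5, -1.5), "B"),
    ((3.0, 3.0), "C"),
    ((4.0, 0.0), "C"),
]
query = (0.0, 0.0)

print(knn_predict(train, query, k=1, metric=euclidean))  # A
print(knn_predict(train, query, k=3, metric=euclidean))  # B
print(knn_predict(train, query, k=1, metric=manhattan))  # B
```

The same query point receives a different label when only K changes (1 versus 3 with Euclidean distance) and when only the metric changes (Euclidean versus Manhattan with K = 1).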

What is the statistical interpretation of interaction effects in Multiple Linear Regression?

  • Combined effect of two variables affecting the dependent variable
  • Minimizing overfitting
  • Removing irrelevant features
  • Transforming variables into linear space
Interaction effects statistically represent the combined effect of two or more variables on the dependent variable. They capture relationships that are not apparent when variables are considered separately.
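
As a hedged illustration, consider a model with one interaction term, y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The coefficient values below are invented for the example; the point is only that the effect of x1 on the prediction depends on the value of x2.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=3.0):
    """Linear model with an interaction term b3 * x1 * x2 (made-up coefficients)."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2):
    """Change in the prediction when x1 goes from 0 to 1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)


print(effect_of_x1(0.0))  # 2.0  (just b1)
print(effect_of_x1(1.0))  # 5.0  (b1 + b3 * x2)
```

Without the interaction term, both calls would return the same value, because x1 would contribute b1 regardless of x2.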

You have two datasets: one with 10,000 instances and one with 100,000. How would the size of the datasets impact the choice of Machine Learning algorithm?

  • Larger datasets always require more complex models
  • Larger datasets may require more careful consideration of computational efficiency
  • Size doesn’t impact the choice of algorithm
  • Size only impacts the choice of hyperparameters
Larger datasets may require more computational resources, so choosing an algorithm that can handle the computational complexity might be critical. The choice might not be solely determined by the size, but it can certainly impact the decision.
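
A quick back-of-the-envelope calculation shows why efficiency matters. It assumes, purely for illustration, an algorithm that compares every pair of instances (as a naive pairwise-distance computation would); `pairwise_comparisons` is a hypothetical helper, not a library function.

```python
def pairwise_comparisons(n):
    """Number of distinct pairs among n instances: n * (n - 1) / 2."""
    return n * (n - 1) // 2


small = pairwise_comparisons(10_000)
large = pairwise_comparisons(100_000)

print(small)          # 49995000
print(large)          # 4999950000
print(large / small)  # roughly 100
```

Ten times more data means about a hundred times more work for such a quadratic-cost method, whereas a method whose cost grows linearly with the number of instances would only need about ten times more.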

How does Logistic Regression differ from Linear Regression?

  • Both are used for clustering
  • Models binary outcomes
  • No difference
  • Uses the same equations
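While Linear Regression predicts a continuous outcome, Logistic Regression models a binary outcome. It passes a linear combination of the features through the logistic (sigmoid) function, which yields a probability between 0 and 1 that can then be thresholded into one of two classes.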
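
The difference in output can be sketched with a single feature. The weight `w`, bias `b`, and the 0.5 threshold are arbitrary values chosen for the example, not fitted parameters.

```python
import math


def linear_output(x, w, b):
    """Linear Regression prediction: an unbounded real number."""
    return w * x + b


def logistic_output(x, w, b):
    """Logistic Regression prediction: sigmoid of the same linear score."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


w, b = 2.0, 0.0  # assumed parameters for illustration
for x in (-3.0, 0.0, 3.0):
    score = linear_output(x, w, b)
    probability = logistic_output(x, w, b)
    label = 1 if probability >= 0.5 else 0
    print(x, score, round(probability, 4), label)
```

The linear score ranges freely (here from -6 to 6), while the logistic output always stays strictly between 0 and 1 and equals 0.5 exactly where the linear score is 0.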

In what way does Machine Learning contribute to the field of autonomous driving?

  • Enabling Real-time Decision-making
  • Financial Fraud Detection
  • Recommending Products
  • Weather Prediction
Machine Learning contributes to autonomous driving by enabling real-time decision-making, recognizing objects, and processing vast amounts of data to navigate safely.

Explain the significance of choosing different linkage methods in the outcome of a Hierarchical Clustering algorithm.

  • Different linkage methods affect the shape and size of clusters
  • Different linkage methods affect the speed of clustering only
  • Different linkage methods affect the type of data that can be clustered
  • Different linkage methods yield similar results
Different linkage methods in Hierarchical Clustering significantly affect the shape and size of the resulting clusters. For example, single linkage may create chain-like clusters, complete linkage may lead to compact clusters, and average linkage often results in more balanced clusters. The choice of linkage method should be guided by the underlying data characteristics.
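
The three linkage criteria mentioned above differ only in how they measure the distance between two clusters. The sketch below assumes one-dimensional points and two hypothetical clusters; a full hierarchical algorithm would repeatedly merge the pair of clusters with the smallest such distance.

```python
def pairwise_distances(cluster_a, cluster_b):
    """All distances between a point in cluster_a and a point in cluster_b."""
    return [abs(a - b) for a in cluster_a for b in cluster_b]


def single_linkage(cluster_a, cluster_b):
    return min(pairwise_distances(cluster_a, cluster_b))  # closest pair


def complete_linkage(cluster_a, cluster_b):
    return max(pairwise_distances(cluster_a, cluster_b))  # farthest pair


def average_linkage(cluster_a, cluster_b):
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)  # mean over all pairs


cluster_a = [0, 1, 2]
cluster_b = [4, 9]

print(single_linkage(cluster_a, cluster_b))    # 2
print(complete_linkage(cluster_a, cluster_b))  # 9
print(average_linkage(cluster_a, cluster_b))   # 5.5
```

Single linkage only looks at the closest pair, which is why it can chain clusters together through a few nearby points; complete linkage looks at the farthest pair, which favours compact clusters.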

What are the components of a Confusion Matrix, and how do they relate to the True Positive, False Positive, True Negative, and False Negative rates?

  • TP, FN, FP, TN, associated with model accuracy
  • TP, FP, FN, TN, associated with specific classes
  • TP, FP, TN, FN, associated with different error types
A Confusion Matrix consists of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). They help in understanding the type of mistakes a classifier is making, providing insight into the model's ability to classify instances of specific classes.
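
A minimal sketch of how the four counts, and the rates derived from them, can be computed for a binary classifier. The label lists are invented for the example, and `confusion_counts` is a hypothetical helper rather than a library function.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary classification task."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
true_positive_rate = counts["TP"] / (counts["TP"] + counts["FN"])
false_positive_rate = counts["FP"] / (counts["FP"] + counts["TN"])

print(counts)               # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(true_positive_rate)   # 0.75
print(false_positive_rate)  # 0.25
```

The rates are the counts normalised by the number of actual positives (for the true positive rate) or actual negatives (for the false positive rate).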

The finance sector leverages Machine Learning for ____________ detection and risk management.

  • Disease Prediction
  • Fraud
  • Recommender Systems
  • Traffic Flow
In the finance sector, Machine Learning is used for fraud detection and risk management, for example by analyzing transaction data to identify suspicious activities.