A colleague is assessing a regression model using only the Adjusted R-Squared. What considerations or additional metrics might you suggest, and why?

  • Include MAE; because it's less sensitive to outliers
  • Include MSE; because it's the standard metric
  • Include RMSE; because it's more interpretable
  • Include both RMSE and MAE; for a more comprehensive assessment
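While Adjusted R-Squared is useful, it only reports the share of variance explained (adjusted for the number of predictors) and says nothing about how large the errors are in the units of the target. Including both Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) provides a more comprehensive assessment. RMSE squares each error before averaging, so it penalizes larger errors more heavily; MAE weights every error linearly, so it is less sensitive to outliers. A large gap between RMSE and MAE suggests that a few large errors dominate. Together, they offer a more nuanced view of the model's performance.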
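
The difference between the two metrics can be sketched with a small, made-up example. The data and the names `pred_even` and `pred_spiky` below are illustrative assumptions, not part of the question: both prediction sets have the same total absolute error, but one concentrates it in a single point.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |error|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the average squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical targets and two hypothetical sets of predictions.
y_true = [10.0, 12.0, 14.0, 16.0]
pred_even = [12.0, 14.0, 16.0, 18.0]   # every prediction is off by 2
pred_spiky = [10.0, 12.0, 14.0, 24.0]  # three exact, one off by 8

print("even :", mae(y_true, pred_even), rmse(y_true, pred_even))
print("spiky:", mae(y_true, pred_spiky), rmse(y_true, pred_spiky))
```

Both sets have the same MAE, while the RMSE of the second set is twice as large, which is the kind of signal that Adjusted R-Squared alone would not surface.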

In PCA, if an Eigenvalue is close to zero, it indicates that the corresponding Eigenvector may ________.

  • be a principal component
  • be discarded
  • be of high magnitude
  • explain high variance
If an Eigenvalue in PCA is close to zero, it means that the corresponding Eigenvector (principal direction) may "be discarded" as it explains very little variance within the data. This can help in reducing dimensionality while retaining essential information.
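
As a rough illustration, the sketch below computes the eigenvalues of a 2x2 covariance matrix in closed form for made-up, nearly collinear data. The data set and the helper names are assumptions chosen for the example; real PCA code would normally use a linear algebra library.

```python
import math

def covariance_2d(xs, ys):
    """Sample covariance entries (var_x, cov_xy, var_y) for two features."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def eigenvalues_sym_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean + radius, mean - radius

# Hypothetical data: the second feature is roughly twice the first.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.0, 8.1, 9.9]

large, small = eigenvalues_sym_2x2(*covariance_2d(xs, ys))
explained = [large / (large + small), small / (large + small)]
print("eigenvalues:", large, small)
print("explained variance ratio:", explained)
```

The second eigenvalue is close to zero, so its direction accounts for well under 1% of the variance here and could be dropped with little loss.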

How does reinforcement learning differ from supervised learning in terms of feedback and learning process?

  • Both learn from environment
  • Feedback through rewards/penalties
  • No feedback involved
  • Uses labeled data
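In reinforcement learning, an agent learns by interacting with an environment and receiving feedback in the form of rewards and penalties for its actions. Supervised learning, by contrast, learns from labeled data in which the correct output is provided for each example.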

You have been asked to develop a model that can classify images of handwritten digits. Describe how you would approach this problem using classification algorithms.

  • Analyze images without preprocessing
  • Convert images into numerical data; use algorithms like CNN
  • Use regression algorithms
  • Use time-series analysis
Converting images into numerical data (pixel values) and using deep learning algorithms like Convolutional Neural Networks (CNNs) can be an effective approach for classifying handwritten digits, as CNNs are well-suited for image data.

The _________ method in Cross-Validation ensures that each fold maintains the same proportion of different classes as the complete dataset.

  • k-fold
  • proportional
  • repeated
  • stratified
The stratified method in Cross-Validation ensures that each fold maintains the same proportion of different classes as the complete dataset. It aims to preserve the percentage of samples for each class, especially important when working with imbalanced class distributions.
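
A minimal sketch of the idea, assuming a simplified scheme in which the samples of each class are dealt round-robin across folds without shuffling. The function name `stratified_folds` and the toy labels are hypothetical; libraries such as scikit-learn provide a full implementation (`StratifiedKFold`).

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Return k lists of sample indices, spreading each class evenly across folds."""
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in indices_by_class.values():
        # Deal this class's samples to the folds one at a time.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Hypothetical imbalanced labels: 18 samples of class 0, 6 of class 1 (75% / 25%).
labels = [0] * 18 + [1] * 6
folds = stratified_folds(labels, k=3)

for number, fold in enumerate(folds):
    print(number, Counter(labels[i] for i in fold))
```

Each of the three folds ends up with six samples of class 0 and two of class 1, matching the 75/25 split of the full data set.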

If a model performs well on the training data but poorly on the validation data, it may be suffering from _________.

  • Generalization
  • Optimization
  • Overfitting
  • Underfitting
Overfitting occurs when a model fits too closely to the training data and captures noise, leading to poor performance on unseen data such as the validation set.

The ________ algorithm creates hyperplanes to classify data points into different classes.

  • Decision Trees
  • Naive Bayes
  • Support Vector Machines
  • k-NN
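Support Vector Machines (SVMs) are designed to create hyperplanes that optimally separate data into different classes. The chosen hyperplane is the one that maximizes the margin, that is, the distance to the nearest training points of each class, which tends to improve classification of unseen data.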
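
Once a hyperplane is known, classification reduces to checking which side of it a point falls on. The sketch below assumes hand-picked values for `weights` and `bias` purely for illustration; in a real SVM these would be learned from training data.

```python
def classify(point, weights, bias):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score >= 0 else -1

# Hypothetical hyperplane x1 - x2 = 0 (not learned from data).
weights = (1.0, -1.0)
bias = 0.0

print(classify((2.0, 1.0), weights, bias))  # point below the line x2 = x1
print(classify((1.0, 3.0), weights, bias))  # point above the line x2 = x1
```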

In hierarchical clustering, the linkage criteria, such as _______, ________, and ________, define how the distance between clusters is measured.

  • Single
  • Complete
  • Average
  • All of the Above
In hierarchical clustering, linkage criteria such as single, complete, and average linkage define how distances between clusters are measured, thus all the options are correct.
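
The three criteria differ only in how they summarize the pairwise distances between two clusters. A small sketch, using two made-up clusters of 2-D points and Euclidean distance (both are assumptions for the example):

```python
import math

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distance between every point in one cluster and every point in the other."""
    return [math.dist(p, q) for p in cluster_a for q in cluster_b]

def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Distance between the farthest pair of points."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Mean distance over all cross-cluster pairs."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

# Hypothetical clusters.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (3.0, 4.0)]

print("single  :", single_linkage(cluster_a, cluster_b))
print("complete:", complete_linkage(cluster_a, cluster_b))
print("average :", average_linkage(cluster_a, cluster_b))
```

For any pair of clusters the average-linkage distance lies between the single-linkage and complete-linkage distances.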

You are working on a clustering problem where you need to identify very distinct and well-separated clusters. Which linkage method might be suitable and why?

  • Average Linkage
  • Complete Linkage
  • Single Linkage
  • Ward's Method
Complete Linkage would be suitable when you need very distinct and well-separated clusters. This method defines the distance between two clusters as the maximum distance between any pair of points drawn from them, so two clusters are merged only when all of their points are relatively close together. As a result it tends to produce compact clusters and is less likely than Single Linkage to form elongated, chain-like clusters.

In what scenario would the AUC be a more informative metric than simply using Accuracy?

  • When the class distribution is balanced
  • When the class distribution is imbalanced
  • When the model has only one class
The AUC (Area Under the Curve) of the ROC Curve can be more informative than Accuracy when the class distribution is imbalanced. AUC measures how well the model ranks positive examples above negative ones across all decision thresholds, whereas Accuracy can look high simply because the model predicts the majority class most of the time.
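
A small sketch of this effect, assuming a made-up data set with 95 negatives and 5 positives and a trivial model that gives every sample the same score. AUC is computed here from its ranking interpretation (the probability that a random positive is scored above a random negative, with ties counted as one half); the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical imbalanced labels and a model that always predicts the majority class.
y_true = [0] * 95 + [1] * 5
constant_scores = [0.0] * 100
constant_pred = [0] * 100

print("accuracy:", accuracy(y_true, constant_pred))
print("AUC     :", roc_auc(y_true, constant_scores))
```

The trivial model reaches 95% accuracy while its AUC is 0.5, the value expected from random ranking.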

Autonomous vehicles rely on Machine Learning algorithms for tasks like ____________ and ____________.

  • Disease Prediction, Weather Forecasting
  • Object Detection, Path Planning
  • Risk Management, Drug Development
  • Text Classification, Fraud Detection
Autonomous vehicles use Machine Learning for Object Detection and Path Planning, recognizing obstacles and determining optimal routes.

How does Polynomial Regression differ from Simple Linear Regression?

  • It fits a polynomial curve
  • It fits a straight line
  • It is used only for classification
  • It uses more variables
While Simple Linear Regression fits a straight line to the data, Polynomial Regression fits a polynomial curve, allowing for more flexibility in modeling non-linear relationships.
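
The contrast can be sketched by fitting a degree-1 and a degree-2 polynomial to the same curved data with ordinary least squares. The helper `fit_polynomial` below solves the normal equations with Gauss-Jordan elimination; it and the toy data are assumptions for illustration, and a numerical library would normally be used instead.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] of c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Augmented normal equations: (X^T X | X^T y).
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def sum_squared_error(xs, ys, coeffs):
    """Sum of squared residuals of the fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )

# Hypothetical data following y = x^2 exactly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

linear = fit_polynomial(xs, ys, degree=1)
quadratic = fit_polynomial(xs, ys, degree=2)

print("linear fit   :", linear, sum_squared_error(xs, ys, linear))
print("quadratic fit:", quadratic, sum_squared_error(xs, ys, quadratic))
```

On this data the best straight line is flat and leaves a large residual, while the quadratic fit recovers the curve almost exactly. Note that polynomial regression is still linear in its coefficients, which is why the same least-squares machinery applies.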