What's the difference between simple linear regression and multiple linear regression?

  • One is for classification, the other for regression
  • One is linear, the other is nonlinear
  • One uses one predictor, the other uses more than one
  • One uses time series data, the other doesn't
Simple linear regression uses a single independent variable to predict the dependent variable, while multiple linear regression uses two or more independent variables for prediction.
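As a minimal sketch (pure Python, hypothetical data), the one-predictor case has a simple closed-form least-squares solution; multiple linear regression generalizes it to several predictors via the normal equations:

```python
# Simple linear regression: fit y = slope * x + intercept by
# ordinary least squares using the closed-form solution.
def simple_linear_regression(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Example: points lying exactly on y = 2x + 1
slope, intercept = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

With multiple predictors the same idea becomes a matrix problem, which is why libraries solve it with linear algebra rather than the scalar formula above.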

What is the main goal of Artificial Intelligence as a scientific discipline?

  • To create complex algorithms
  • To improve data storage
  • To make money through technology
  • To mimic human intelligence and decision-making
The primary goal of Artificial Intelligence is to develop systems that can mimic human intelligence and decision-making processes.

How can Cross-Validation help in hyperparameter tuning?

  • By allowing repeated testing on the same validation set
  • By improving model accuracy directly
  • By providing robust performance estimates to select the best hyperparameters
  • By reducing computation time
Cross-Validation enables hyperparameter tuning by providing a robust estimate of the model's performance across different data splits. This helps identify hyperparameters that generalize well to unseen data, minimizing the risk of overfitting to any single validation split and allowing a more informed selection of optimal hyperparameters.
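A minimal pure-Python sketch of the mechanics: each candidate hyperparameter is scored on every fold, and the average validation score drives the selection. Here `evaluate` is a hypothetical stand-in for training a model and scoring it on the held-out fold:

```python
# Yield (train_idx, val_idx) index pairs for k folds over n samples.
def k_fold_indices(n, k):
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Pick the candidate with the best mean validation score across folds.
def select_hyperparameter(candidates, evaluate, n, k=5):
    def mean_score(c):
        scores = [evaluate(c, tr, va) for tr, va in k_fold_indices(n, k)]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_score)

splits = list(k_fold_indices(10, 5))
# Toy scoring function whose best candidate is 3
best = select_hyperparameter([1, 2, 3, 4], lambda c, tr, va: -abs(c - 3), n=10)
```

Averaging over folds, rather than reusing one validation set, is what makes the resulting selection robust.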

How do Ridge and Lasso regularization techniques interact with Polynomial Regression to mitigate overfitting?

  • By adding a penalty term to constrain coefficients
  • By fitting low-degree polynomials
  • By ignoring interaction terms
  • By increasing the model's complexity
Ridge and Lasso regularization techniques mitigate overfitting in Polynomial Regression by adding a penalty term to the loss function. This constrains the coefficients, reducing model complexity and helping to avoid overfitting. Ridge penalizes the squared magnitude of the coefficients, while Lasso penalizes their absolute value and can drive some coefficients exactly to zero.
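A sketch of how the penalty shrinks a coefficient: for a single centered predictor with no intercept, the ridge solution has the closed form `slope = sum(x*y) / (sum(x**2) + lam)`. Setting `lam = 0` recovers ordinary least squares, and a larger `lam` pulls the slope toward zero:

```python
# Closed-form ridge slope for one centered predictor, no intercept.
def ridge_slope(xs, ys, lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [-2.0, -1.0, 1.0, 2.0]   # centered predictor
ys = [-4.0, -2.0, 2.0, 4.0]   # exactly y = 2x

ols = ridge_slope(xs, ys, lam=0.0)      # unpenalized fit: slope 2.0
shrunk = ridge_slope(xs, ys, lam=10.0)  # penalty shrinks the slope
```

The same shrinkage applies to every coefficient of a high-degree polynomial fit, which is what tames its wiggly, overfit shape.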

You need to build a recommendation system with a clear explanation of why certain recommendations are made. How would you approach this considering AI, Machine Learning, and Deep Learning?

  • AI, for its rule-based logic
  • Deep Learning, for its data-driven insights
  • Machine Learning, for its predictive analysis
Traditional AI models, often based on rule-based logic, may provide clear explanations for recommendations, whereas Machine Learning and Deep Learning models may be less interpretable.

Can you explain how the choice of kernel in SVM affects the decision boundary?

  • Changes the color
  • Increases computation time
  • Reduces data size
  • Transforms the feature space
Different kernels transform the feature space in different ways, affecting the shape and position of the decision boundary in SVM.
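As a sketch, here are two common kernels (values chosen for illustration): the kernel replaces the plain dot product, implicitly mapping points into a different feature space, so a linear kernel keeps a flat decision boundary while an RBF kernel permits curved ones:

```python
import math

# Linear kernel: the ordinary dot product, giving a flat boundary.
def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))

# RBF kernel: similarity decays with squared distance, allowing
# curved, even disconnected, decision boundaries.
def rbf_kernel(x, z, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points: 1.0
far = rbf_kernel([1.0, 2.0], [10.0, 20.0])  # distant points: near 0
```

Because the RBF similarity is local, the resulting boundary bends around dense regions of each class instead of staying a single hyperplane.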

Clustering is a common task in __________ learning, where data is grouped based on inherent similarities without the use of labels.

  • reinforcement
  • semi-supervised
  • supervised
  • unsupervised
Unsupervised learning commonly involves clustering, where data is grouped based on similarities without using labels.

A company wants to classify its products into different categories based on various features. How could LDA be applied here, considering both within-class and between-class variances?

  • Apply LDA to balance within-class and between-class variances for effective classification
  • Focus on within-class variance and ignore between-class variance
  • Ignore within-class variance and focus on between-class variance
  • Use another method
LDA could be applied by balancing within-class and between-class variances: maximizing the separation between category means while minimizing the spread within each category. This ensures that products in the same category are similar, while products in different categories are distinct.
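A minimal sketch of the quantity LDA optimizes, for two classes along one projected dimension (hypothetical feature values): the Fisher criterion, i.e. the squared difference of class means divided by the summed within-class variance. A larger ratio means categories are well separated relative to their internal spread:

```python
# Fisher criterion: between-class separation over within-class spread.
def fisher_ratio(class_a, class_b):
    def mean(v):
        return sum(v) / len(v)
    def var(v):
        m = mean(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    between = (mean(class_a) - mean(class_b)) ** 2
    within = var(class_a) + var(class_b)
    return between / within

# Tight, well-separated classes score far higher than overlapping ones
separated = fisher_ratio([1.0, 1.1, 0.9], [5.0, 5.1, 4.9])
overlapping = fisher_ratio([1.0, 3.0, 5.0], [2.0, 4.0, 6.0])
```

Full LDA searches for the projection direction that maximizes this ratio, which is exactly the "balance" between the two variances.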

How does the K-Means clustering algorithm determine the centroids?

  • Based on Density
  • By Class Labels
  • Minimizing Euclidean Distance
  • Random Selection
The K-Means algorithm determines centroids by iteratively minimizing the sum of squared Euclidean distances between data points and the centroids of their respective clusters: each point is assigned to its nearest centroid, and each centroid is then moved to the mean of its assigned points.
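One iteration of that loop can be sketched on 1-D data (toy values): assign each point to the nearest centroid by squared Euclidean distance, then update each centroid to the mean of its cluster. Repeating until assignments stop changing is the full algorithm:

```python
# One K-Means iteration: assignment step followed by update step.
def kmeans_step(points, centroids):
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: (p - centroids[i]) ** 2)
        clusters[nearest].append(p)
    # Empty clusters keep their old centroid.
    return [sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)]

points = [1.0, 2.0, 9.0, 10.0]
centroids = [0.0, 5.0]
centroids = kmeans_step(points, centroids)  # centroids move to cluster means
centroids = kmeans_step(points, centroids)  # assignments stable: converged
```

Note the initial centroids are chosen before iteration begins (often randomly), which is why K-Means is typically restarted several times.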

Why is DBSCAN often preferred for data with clusters of varying shapes and sizes?

  • It depends on density rather than distance
  • It relies on statistical modeling
  • It requires manual setting for each cluster shape
  • It uses fixed-size clusters
DBSCAN is preferred for data with clusters of varying shapes and sizes because it groups points by local density rather than distance to a centroid. It can therefore identify clusters of arbitrary shape and size, determined by dense regions of data points, without imposing a fixed size or a roughly spherical form. This makes it versatile for complex clustering tasks.
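The density test at DBSCAN's core can be sketched in a few lines (1-D toy data, illustrative `eps` and `min_pts`): a point is a "core point" when at least `min_pts` points, itself included, lie within radius `eps`. Clusters grow by chaining core points, so their shape follows wherever the data is dense:

```python
# DBSCAN's density criterion: count neighbours within eps and
# compare against min_pts to decide if a point is a core point.
def is_core_point(point, points, eps=1.5, min_pts=3):
    neighbours = sum(1 for q in points if abs(q - point) <= eps)
    return neighbours >= min_pts

data = [1.0, 1.5, 2.0, 8.0]        # a dense run plus one outlier
dense = is_core_point(1.5, data)   # three points within eps: core
noise = is_core_point(8.0, data)   # only itself nearby: not core
```

Points reachable from no core point are labelled noise, which is how DBSCAN handles outliers without forcing them into a cluster.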