Your task is to detect fraudulent activities in financial transactions. What would be the considerations in choosing between AI, Machine Learning, or Deep Learning for this task?

AI, for its expert systems
Deep Learning, for its complex pattern recognition
Machine Learning, for its ability to learn from historical data

Machine Learning can be trained on historical data to detect patterns indicative of fraudulent activities, making it a suitable choice for this task.

Your Logistic Regression model is suffering from separation, causing some estimated Odds Ratios to be extremely large. How could you handle this issue?

By adding more variables
By applying regularization techniques
By increasing the size of the dataset
By removing all predictors

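Separation occurs when a predictor (or a combination of predictors) perfectly splits the outcome classes. The maximum likelihood estimates then fail to converge, so the coefficients, and with them the Odds Ratios, grow without bound. Applying regularization techniques, such as Ridge (L2) or Lasso (L1), keeps the coefficient estimates finite and mitigates the issue.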
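
The effect can be illustrated with a minimal sketch, assuming a single feature, a perfectly separated toy dataset, and plain gradient descent. The function name `fit_logistic` and the penalty strength `l2` are hypothetical choices for this example, not part of any library.

```python
import math

def fit_logistic(xs, ys, l2=0.0, lr=0.1, steps=5000):
    """Fit a one-feature logistic regression by gradient descent.

    l2 is the ridge penalty strength; l2=0.0 is plain maximum likelihood.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += (p - y)
        # The ridge term pulls the slope back toward zero (intercept unpenalized).
        w -= lr * (grad_w / n + l2 * w)
        b -= lr * (grad_b / n)
    return w, b

# Toy data: every negative x is class 0, every positive x is class 1.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

w_plain, _ = fit_logistic(xs, ys, l2=0.0)   # keeps growing with more steps
w_ridge, _ = fit_logistic(xs, ys, l2=0.1)   # settles at a finite value
print(w_plain, w_ridge)
```

Without the penalty the slope keeps increasing the longer the optimizer runs; with it, the slope stabilizes at a much smaller value.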

In which type of Machine Learning does the model learn from labeled data?

Reinforcement Learning
Semi-Supervised Learning
Supervised Learning
Unsupervised Learning

Supervised Learning involves training a model on a labeled dataset, meaning the desired output for each input is known. The model learns to predict the output from the input data.

Bagging stands for Bootstrap __________, which involves creating subsets of the original dataset and training individual models on them.

Adjustment
Aggregation
Algorithm
Alignment

Bagging, or Bootstrap Aggregation, draws bootstrap samples (random samples with replacement) from the original dataset, trains an individual model on each sample, and combines their outputs, typically by averaging for regression or majority vote for classification, to make the final prediction.
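
As a rough sketch of the idea, the example below bags simple one-dimensional threshold classifiers ("stumps") and combines them by majority vote. The helper names (`fit_stump`, `bagged_stumps`, `predict`) and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter

def fit_stump(points):
    """Pick the threshold t (from the sample) that minimizes errors for 'label 1 if x > t'."""
    best_t, best_err = None, None
    for t, _ in points:
        err = sum((x > t) != bool(y) for x, y in points)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_stumps(points, n_models=25, seed=0):
    """Train one stump per bootstrap sample and return the list of thresholds."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        thresholds.append(fit_stump(sample))
    return thresholds

def predict(thresholds, x):
    """Aggregate the individual stump predictions by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
model = bagged_stumps(data)
print(predict(model, 0.0), predict(model, 5.0))
```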

How can you test for multicollinearity in Multiple Linear Regression, and why is it important?

By Checking Accuracy, Improves Prediction
By Checking Residuals, Reduces Overfitting
By Checking Variance Inflation Factor (VIF), Prevents Unstable Estimates
By Examining Correlations between Variables, Prevents Confounding Effects

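Multicollinearity can be detected by checking the Variance Inflation Factor (VIF) of each predictor; a common rule of thumb treats values above 5 or 10 as a warning sign. It is important because multicollinearity inflates the variance of the coefficient estimates, making them unstable and making individual coefficients difficult to interpret.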
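
For a predictor j, VIF is 1 / (1 - R²), where R² comes from regressing predictor j on the other predictors. The sketch below assumes the special case of exactly two predictors, where that R² equals the squared Pearson correlation between them. The function names and data are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_collinear = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]  # roughly 2 * x1
x2_unrelated = [3.0, 1.0, 2.0, 2.0, 1.0, 3.0]   # uncorrelated with x1

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(vif_high, vif_low)
```

A VIF of 1 means the predictor is uncorrelated with the others; the nearly collinear pair yields a very large value.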

If there is a concern about the importance of closer neighbors versus further neighbors in KNN, implementing __________ can be a solution to give varying importance based on distance.

distance weighting
equal weighting
random weighting
uniform weighting

Implementing distance weighting in KNN gives varying importance based on distance, assigning more weight to closer neighbors.
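
A minimal sketch of the difference, assuming scalar features and inverse-distance weights (one common weighting scheme among several). The function `knn_predict` and the toy points are illustrative only.

```python
from collections import defaultdict

def knn_predict(train, query, k=3, weighted=True):
    """Classify `query` from (x, label) pairs using the k nearest neighbors.

    weighted=True  -> each neighbor votes with weight 1 / distance
    weighted=False -> each neighbor gets one equal vote (uniform weighting)
    """
    neighbors = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = defaultdict(float)
    for x, label in neighbors:
        d = abs(x - query)
        # Small constant avoids division by zero when the query equals a training point.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

train = [(1.0, "A"), (4.0, "B"), (4.5, "B")]
query = 2.0

uniform_label = knn_predict(train, query, weighted=False)
weighted_label = knn_predict(train, query, weighted=True)
print(uniform_label, weighted_label)
```

With uniform weighting the two distant "B" points outvote the single close "A" point; with distance weighting the close neighbor wins.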

How does Machine Learning play a role in enhancing personalized education systems?

Autonomous Driving
Fraud Detection
Personalized Learning Paths
Weather Prediction

Machine Learning enhances personalized education systems by adapting learning paths to individual student needs, analyzing performance, and providing targeted resources or exercises.

How do AI, Machine Learning, and Deep Learning differ in terms of data volume and data quality requirements?

AI requires least data, followed by Machine Learning and Deep Learning
AI requires most data, followed by Machine Learning and Deep Learning
Deep Learning requires most data, Machine Learning less, and AI least
Machine Learning requires most data, followed by AI and Deep Learning

Deep Learning usually requires the most data to train effectively, Machine Learning needs less, and AI in the broad sense generally requires the least, since it also covers non-learning methods such as rule-based systems that need no training data.

In what situations would ElasticNet be preferred over Ridge or Lasso?

When all features are equally important
When features are uncorrelated
When model complexity is not a concern
When multicollinearity is high

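ElasticNet is preferred when multicollinearity is high. It combines the L1 penalty of Lasso with the L2 penalty of Ridge, so it can still perform feature selection while tending to keep or shrink groups of correlated features together, rather than arbitrarily picking one of them as Lasso often does.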
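
One common way to write the ElasticNet objective is shown below. The symbols are assumptions for this sketch: $X$ is the feature matrix, $y$ the target, $\beta$ the coefficients, $n$ the number of samples, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mixing weight. Other texts and libraries may scale the terms differently.

```latex
\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2
  + \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right)
```

Setting $\alpha = 1$ recovers Lasso and $\alpha = 0$ recovers Ridge; values in between blend the two penalties.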

The ________ component in PCA explains the highest amount of variance within the data.

first
last
median
random

The first principal component in PCA explains the highest amount of variance within the data. It points along the direction of maximum spread of the data, and each subsequent component explains the largest share of the remaining variance while staying orthogonal to the earlier ones.
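
To make this concrete, here is a small sketch restricted to two-dimensional data, where the eigenvalues of the 2x2 covariance matrix have a closed form. The function name `explained_variance_2d` and the sample points are assumptions for illustration.

```python
import math

def explained_variance_2d(points):
    """Return the variance share of the first and second principal components."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    center = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = center + radius, center - radius
    total = lam1 + lam2
    return lam1 / total, lam2 / total

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
ratios = explained_variance_2d(points)
print(ratios)
```

Because the points lie close to a straight line, almost all of the variance falls on the first component.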
