A linear regression model's R-Squared value significantly improves after polynomial features are added. What could be the reason, and what should you be cautious about?
- Reason: Improved fit to nonlinear patterns; Caution: Risk of overfitting
- Reason: Increased bias; Caution: Risk of complexity
- Reason: Increased complexity; Caution: Risk of bias
- Reason: Reduced error; Caution: Risk of underfitting
The significant improvement in R-squared after adding polynomial features indicates that the model now captures nonlinear patterns in the data. Be cautious, though: training R-squared never decreases when features are added, so a higher value alone does not prove better generalization. Too many polynomial features can lead to overfitting, where the model fits noise in the training data rather than the underlying trend. Regularization and cross-validation help mitigate this risk.
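A minimal sketch (scikit-learn on synthetic data; the quadratic signal is illustrative) comparing cross-validated R-squared across polynomial degrees. Training R-squared alone would keep rising with degree; the held-out scores are what expose the overfit high-degree model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.5, size=100)  # quadratic signal + noise

for degree in (1, 2, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    # Held-out R^2 reveals the overfitting that training R^2 hides.
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"degree={degree:2d}  mean CV R^2 = {scores.mean():.3f}")
```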
Explain the role of Machine Learning in optimizing supply chain and inventory management.
- Customer Segmentation
- Image Recognition
- Sentiment Analysis
- Supply Chain Optimization
Machine Learning plays a vital role in supply chain optimization by analyzing and predicting demand, improving inventory management, optimizing logistics, and enhancing decision-making through predictive analytics.
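As a concrete illustration, demand forecasting for inventory planning can be framed as supervised regression on lagged sales. This is a hypothetical sketch: the synthetic data, lag features, and model choice are illustrative, not a production pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
weeks = 200
# Synthetic weekly demand with a seasonal cycle plus noise.
demand = 100 + 10 * np.sin(np.arange(weeks) * 2 * np.pi / 52) + rng.normal(0, 5, weeks)

# Lag features: the previous three weeks of demand predict this week's demand.
X = np.column_stack([demand[i:weeks - 3 + i] for i in range(3)])
y = demand[3:]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[:-20], y[:-20])
forecast = model.predict(X[-20:])  # out-of-sample estimates to drive reorder decisions
print(forecast[:5].round(1))
```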
Explain how the learning agent interacts with the environment in Reinforcement Learning.
- Through clustering
- Through combining labeled and unlabeled data
- Through labeled data
- Through rewards and penalties
The agent in Reinforcement Learning interacts with the environment through rewards and penalties: it observes the current state, takes an action, receives a reward (or penalty) along with the next state, and over many such interactions learns a policy that maximizes cumulative reward.
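A minimal tabular Q-learning sketch makes this loop concrete; the 1-D chain environment is a toy stand-in and the reward values are illustrative:

```python
import numpy as np

n_states, n_actions = 5, 2           # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))  # action-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:     # rightmost state is the goal
        # Epsilon-greedy: mostly exploit current estimates, occasionally explore.
        action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        # Reward at the goal, a small penalty per step: the only training signal.
        reward = 1.0 if next_state == n_states - 1 else -0.01
        # Q-update: learn from the reward and the estimated value of the next state.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.round(2))  # moving right ends up with the higher value in every state
```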
What is the difference between Gradient Boosting and AdaBoost?
- Both are the same
- Both focus on increasing bias
- Gradient Boosting is for regression, AdaBoost is for classification
- Gradient Boosting uses gradients to correct errors, while AdaBoost focuses on weighting misclassified instances
Gradient Boosting builds models sequentially, fitting each new model to the gradient of a differentiable loss function (the pseudo-residuals) to correct the ensemble's errors. AdaBoost, on the other hand, increases the weights of misclassified instances so that the next model focuses on them. Both aim to improve performance but use different mechanisms.
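A small comparison sketch (scikit-learn, synthetic data; the scores are illustrative, not a benchmark):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Gradient Boosting: each tree fits the loss gradient (pseudo-residuals).
gb = GradientBoostingClassifier(n_estimators=100, random_state=0)
# AdaBoost: each learner trains on instances reweighted toward past mistakes.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in (("GradientBoosting", gb), ("AdaBoost", ada)):
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))
```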
What role does the regularization parameter 'C' play in an SVM?
- Controls balance between margin maximization and error minimization
- Controls visualization
- Increases computation time
- Reduces data size
The regularization parameter 'C' in SVM controls the trade-off between maximizing the margin and minimizing classification errors: a small C tolerates more margin violations, giving a wider and smoother decision boundary, while a large C penalizes errors heavily, producing a narrower margin that can overfit noisy data.
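A minimal sketch (scikit-learn, synthetic data with label noise) showing how sweeping C moves this trade-off:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# flip_y adds label noise, which is exactly what a large C tends to overfit.
X, y = make_classification(n_samples=300, n_features=5, flip_y=0.1, random_state=0)

for C in (0.01, 1.0, 100.0):
    # Small C: wide margin, more violations tolerated (higher bias).
    # Large C: errors penalized heavily, narrower margin (higher variance).
    scores = cross_val_score(SVC(kernel="rbf", C=C), X, y, cv=5)
    print(f"C={C:<6}  mean CV accuracy = {scores.mean():.3f}")
```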
You are using Bootstrapping to estimate the confidence interval for a model parameter. Explain how the process works.
- By calculating the mean and standard deviation without resampling
- By randomly selecting without replacement from the dataset
- By resampling with replacement and calculating empirical quantiles of the distribution
- By splitting the data into training and validation sets
Bootstrapping estimates a confidence interval for a model parameter by resampling with replacement from the original data, computing the parameter on each resampled dataset, and then taking empirical quantiles of the resulting distribution (e.g., the 2.5th and 97.5th percentiles for a 95% interval). This allows confidence intervals to be estimated even when the parameter's sampling distribution is unknown.
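A minimal NumPy sketch estimating a 95% percentile interval for the mean; the same resampling loop works for any statistic, such as a regression coefficient:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)  # skewed, so normal-theory CIs are shaky

boot_stats = np.array([
    rng.choice(data, size=data.size, replace=True).mean()  # resample WITH replacement
    for _ in range(10_000)
])

# Empirical quantiles of the bootstrap distribution give the percentile CI.
lo, hi = np.quantile(boot_stats, [0.025, 0.975])
print(f"mean = {data.mean():.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```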
What are the key differences between Hierarchical Clustering and K-Means Clustering?
- Algorithm Complexity
- Cluster Number & Structure
- Data Type
- Learning Type
Hierarchical Clustering builds a tree-like structure (a dendrogram) and does not require the number of clusters in advance, since the tree can be cut at any level afterwards; K-Means requires the number of clusters k up front and produces a single flat, non-hierarchical partition.
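A side-by-side sketch (scikit-learn and SciPy, synthetic blobs): K-Means needs k up front, while the hierarchical linkage encodes the full merge tree and the cluster count can be chosen afterwards:

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# K-Means: the number of clusters is a required input.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical: fit once, then cut the tree at any desired cluster count.
Z = linkage(X, method="ward")
hier_labels = fcluster(Z, t=3, criterion="maxclust")

print(kmeans_labels[:10], hier_labels[:10])
```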
A colleague has built a Polynomial Regression model and suspects overfitting. What diagnostic tools and techniques would you recommend to confirm or deny this suspicion?
- Cross-validation and visual inspection of residuals
- Ignore the suspicion
- Increase polynomial degree
- Look at training data only
Cross-validation and visual inspection of residuals are the standard techniques for detecting overfitting. A model that scores much better on its training data than on cross-validated folds is overfitting, and its residuals on held-out data will look large or structured even when the training residuals are near zero.
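A minimal diagnostic sketch (scikit-learn, synthetic data; the degree-15 fit is deliberately over-complex): compare training R-squared against cross-validated R-squared, then check held-out residuals:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(80, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, size=80)

model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model.fit(X_tr, y_tr)

train_r2 = model.score(X_tr, y_tr)
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
print(f"train R^2 = {train_r2:.3f}  vs  CV R^2 = {cv_r2:.3f}")  # large gap => overfitting

residuals = y_te - model.predict(X_te)  # in practice, plot these against predictions
print(f"held-out residual std = {residuals.std():.3f}")
```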
In LDA, what is meant by the term "between-class variance"?
- Variance among different classes
- Variance among similar classes
- Variance between individual data points
- Variance within individual classes
"Between-class variance" in LDA refers to the "variance among different classes." It quantifies how separated the means of different classes are from each other. Maximizing this variance enhances class separation.
Explain the role of eigenvalues and eigenvectors in PCA.
- Eigenvalues represent direction, eigenvectors variance
- Eigenvalues represent variance, eigenvectors direction
- Neither plays a role in PCA
- They are used in LDA, not PCA
In PCA, eigenvectors represent the directions in which the data varies the most, while the corresponding eigenvalues give the amount of variance in those directions. These are obtained from the covariance matrix of the original data, and the eigenvectors with the largest eigenvalues become the principal components that capture the most significant patterns in the data.
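A from-scratch PCA sketch (NumPy, synthetic 2-D data) making the roles explicit: eigenvectors give the directions, eigenvalues the variance along each one:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3, 1], [1, 1]], size=500)

X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)           # covariance matrix of the data

eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: covariance is symmetric
order = np.argsort(eigenvalues)[::-1]            # sort by variance, descending
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

print("variance per component:", eigenvalues.round(3))
print("explained variance ratio:", (eigenvalues / eigenvalues.sum()).round(3))

# Project onto the top eigenvector: the highest-variance direction (PC1).
pc1_scores = X_centered @ eigenvectors[:, 0]
```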