You are working with a small dataset, and your model is prone to overfitting. What techniques could you employ to mitigate this issue?
- Add complexity
- Reduce complexity
- Use L1 regularization
- Use cross-validation and data augmentation
Using techniques like cross-validation and data augmentation can mitigate overfitting when working with a small dataset. Cross-validation evaluates the model on data it was not trained on, giving a more reliable picture of how well it generalizes, while data augmentation artificially increases the effective size of the dataset, reducing the risk of overfitting.
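A minimal sketch of both techniques, assuming scikit-learn and NumPy; the Iris dataset, the noise scale, and the five folds are illustrative choices rather than part of the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: every sample is held out for validation exactly once,
# so the reported score reflects performance on data the model did not see.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

# Simple augmentation for tabular data: append noisy copies of the samples
# to artificially enlarge the training set.
rng = np.random.default_rng(0)
X_aug = np.vstack([X, X + rng.normal(scale=0.05, size=X.shape)])
y_aug = np.concatenate([y, y])
print("Original:", X.shape, "Augmented:", X_aug.shape)
```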
What is the main difference between supervised and unsupervised learning?
- The algorithms used
- The complexity of the models
- The data size
- The use of labeled data
Supervised learning uses labeled data where the output is known, while unsupervised learning deals with unlabeled data and finds hidden patterns without guidance on the expected outcome.
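A small contrast in code, assuming scikit-learn: the supervised model is fit on features plus labels, while the unsupervised model is fit on the features alone.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Supervised: the labels y are known, so the model learns a mapping X -> y.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Unsupervised: no labels are given; the model looks for structure in X alone.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(clf.predict(X[:3]))   # predicted labels (supervised)
print(km.labels_[:3])       # discovered cluster assignments (unsupervised)
```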
While AI aims to mimic human intelligence, Machine Learning focuses on learning from data, and Deep Learning emphasizes learning from data using __________.
- clustering
- neural networks
- regression
- statistical methods
Deep Learning emphasizes learning from data using neural networks, particularly multi-layered structures known as deep neural networks.
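A minimal sketch of a multi-layered network, assuming scikit-learn's MLPClassifier; the two hidden layers stand in for the much deeper stacks used in practice.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of 64 and 32 units: a small multi-layered ("deep") network.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)
print("Test accuracy:", mlp.score(X_test, y_test))
```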
Clustering can be used in _________ analysis to find patterns and similarities in large datasets, facilitating targeted marketing strategies.
- Customer Segmentation
- Decision Tree
- Linear Regression
- Principal Component
Clustering is used in customer segmentation analysis to group customers based on patterns and similarities, allowing for more targeted marketing strategies.
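A sketch of segmentation with k-means, assuming scikit-learn; the customer features and the choice of four segments are hypothetical, for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Hypothetical customer features: annual spend, visits per month, basket size.
customers = rng.normal(size=(500, 3)) * [400.0, 3.0, 2.0] + [1000.0, 6.0, 5.0]

X = StandardScaler().fit_transform(customers)              # put features on one scale
segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(segments))                               # customers per segment
```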
Which term refers to a subset of AI that deals with algorithms designed to identify patterns and make decisions with minimal human intervention?
- Data Mining
- Machine Learning
- Neural Networks
- Robotics
Machine Learning is a subset of AI that focuses on creating algorithms to identify patterns and make decisions with little or no human intervention.
Can dimensionality reduction be used to improve the performance of machine learning models? If so, how?
- All of the above
- By improving computational efficiency
- By reducing overfitting
- By simplifying the model
Dimensionality reduction can improve the performance of machine learning models by reducing overfitting (as the model becomes less complex), simplifying the model (making it easier to interpret), and improving computational efficiency (reducing training time and resource requirements).
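A minimal example with PCA, assuming scikit-learn; keeping the components that explain roughly 95% of the variance is an illustrative threshold, not a rule.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)          # 64 pixel features per sample
X_reduced = PCA(n_components=0.95).fit_transform(X)
print(X.shape, "->", X_reduced.shape)        # far fewer columns, most variance retained
```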
The _________ linkage method in Hierarchical Clustering considers the average distance between all pairs of points in two clusters.
- Average Linkage
- Complete Linkage
- Single Linkage
- Ward's Method
Average Linkage considers the average distance between all pairs of points in two clusters. It falls between the Single and Complete Linkage methods, often providing a balance that avoids some of the extremes of either method. It can be a good choice when clusters are relatively compact but not necessarily spherical.
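A sketch of average linkage with scikit-learn's AgglomerativeClustering; the toy blobs and the cluster count are illustrative.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# linkage="average" merges the two clusters with the smallest average
# pairwise distance between their points.
labels = AgglomerativeClustering(n_clusters=3, linkage="average").fit_predict(X)
print(labels[:10])
```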
What are the strengths and weaknesses of using Ward's method in Hierarchical Clustering?
- Maximizes mean distance but sensitive to initial configuration
- Maximizes variance but creates well-separated clusters
- Minimizes mean distance but less compact clusters
- Minimizes variance but sensitive to outliers
Ward's method in Hierarchical Clustering aims to minimize the variance within clusters, leading to tightly packed clusters. Strength: It often results in compact and balanced clusters. Weakness: It can be sensitive to outliers, as it minimizes the total within-cluster variance, which can be disproportionately influenced by extreme values.
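A sketch with SciPy's hierarchical clustering, assuming scipy.cluster.hierarchy; the single extreme point is added only to illustrate how Ward's variance criterion reacts to an outlier.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(5, 0.5, (50, 2)),
               [[20.0, 20.0]]])               # one extreme outlier

# Ward linkage: each merge is chosen to minimize the increase in
# total within-cluster variance.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels))                    # the outlier distorts the cluster structure
```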
Centering variables in Multiple Linear Regression helps to reduce the ___________ and ease the interpretation of interaction effects.
- complexity
- mean
- multicollinearity
- variance
Centering variables (subtracting the mean from each predictor) reduces multicollinearity between the predictors and their interaction terms, which matters most when interaction effects are included in the model. This makes the coefficients, including the interaction effects, easier to interpret.
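A quick numeric check with NumPy alone; the predictor means and sample size are arbitrary, chosen only to make the correlation shift visible.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(50, 10, 1000)   # hypothetical predictor with a mean far from zero
x2 = rng.normal(30, 5, 1000)

raw_interaction = x1 * x2
centered_interaction = (x1 - x1.mean()) * (x2 - x2.mean())

# Correlation between a predictor and its interaction term, before and after centering.
print(np.corrcoef(x1, raw_interaction)[0, 1])        # high -> multicollinearity
print(np.corrcoef(x1, centered_interaction)[0, 1])   # near zero after centering
```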
What role does model complexity play in overfitting?
- Has no effect on overfitting
- Increases the risk of overfitting
- Increases the risk of underfitting
- Reduces the risk of overfitting
Model complexity increases the risk of overfitting. A more complex model can capture the noise in the training data, leading to poor generalization on unseen data.
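A quick sketch of the effect, assuming scikit-learn; the sine target, the noise level, and the two polynomial degrees are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)   # noisy training data
X_test = np.linspace(0, 1, 100).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test).ravel()                  # noise-free test target

# Compare a simple and a highly complex polynomial fit on the same data.
for degree in (3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X, y)
    print(degree, "train R^2:", round(model.score(X, y), 3),
          "test R^2:", round(model.score(X_test, y_test), 3))
```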