Which regularization method would you likely use if you suspect some of the features are entirely irrelevant?
- Elastic Net
- Lasso
- Ridge
Lasso regularization is useful when some features are suspected to be irrelevant, as it can set the coefficients for those features to zero, effectively removing them.
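A minimal sketch of why the L1 penalty can produce exact zeros, assuming the simplified case of an orthonormal design matrix, where the lasso and ridge solutions have closed forms in terms of the ordinary least-squares coefficients. The function names and the sample coefficients are illustrative assumptions, not part of any library.

```python
def soft_threshold(beta_ols, lam):
    """Lasso solution for one coefficient under an orthonormal design
    (objective: 0.5 * ||y - Xb||^2 + lam * ||b||_1)."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0  # coefficients smaller than lam in magnitude become exactly zero


def ridge_shrink(beta_ols, lam):
    """Ridge solution for one coefficient under an orthonormal design
    (objective: 0.5 * ||y - Xb||^2 + 0.5 * lam * ||b||^2)."""
    return beta_ols / (1.0 + lam)


# Hypothetical least-squares coefficients: two strong features, two weak ones.
ols_coefficients = [2.0, -1.5, 0.3, -0.1]
lam = 0.5

lasso_coefficients = [soft_threshold(b, lam) for b in ols_coefficients]
ridge_coefficients = [ridge_shrink(b, lam) for b in ols_coefficients]

print("lasso:", lasso_coefficients)  # weak features are zeroed out
print("ridge:", ridge_coefficients)  # every coefficient shrinks, none reaches zero
```

Under these assumptions the two weak coefficients are removed by the lasso rule, while the ridge rule only scales them down.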
While AI aims to mimic human intelligence, Machine Learning focuses on learning from data, and Deep Learning emphasizes learning from data using __________.
- clustering
- neural networks
- regression
- statistical methods
Deep Learning emphasizes learning from data using neural networks, particularly multi-layered structures known as deep neural networks.
What is the main difference between supervised and unsupervised learning?
- The algorithms used
- The complexity of the models
- The data size
- The use of labeled data
Supervised learning uses labeled data where the output is known, while unsupervised learning deals with unlabeled data and finds hidden patterns without guidance on the expected outcome.
You are working with a small dataset, and your model is prone to overfitting. What techniques could you employ to mitigate this issue?
- Add complexity
- Reduce complexity
- Use L1 regularization
- Use cross-validation and data augmentation
Cross-validation and data augmentation can both help with overfitting on a small dataset. Cross-validation evaluates the model on held-out folds, which gives a more reliable estimate of performance on unseen data and makes overfitting easier to detect. Data augmentation artificially increases the size and variety of the training set, reducing the risk of overfitting.
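As a rough illustration of the cross-validation part, the sketch below splits sample indices into k folds so that every sample is held out exactly once. The helper name `k_fold_indices` is hypothetical, and the folds are contiguous for simplicity; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be fitted on each `train` split and scored on the matching `test` split; averaging the scores gives the cross-validated estimate.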
What are some common methods of initializing centroids in K-Means clustering?
- Data Transformation
- Normalization
- Principal Component Analysis
- Random Selection, K-Means++
Common methods for initializing centroids in K-Means include Random Selection and K-Means++. These methods can affect the convergence speed and quality of the final clusters.
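The two initialization strategies can be sketched in plain Python as follows. This is a simplified illustration on toy data, not a library implementation; the function names and the sample points are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def random_init(points, k, rng):
    """Random Selection: pick k distinct data points uniformly at random."""
    return rng.sample(points, k)


def kmeans_plus_plus_init(points, k, rng):
    """K-Means++: pick each new centroid with probability proportional to its
    squared distance from the nearest centroid chosen so far."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


# Toy 2-D data with two visible groups (illustrative values).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
rng = random.Random(0)  # fixed seed for repeatability

print("random:   ", random_init(points, 2, rng))
print("k-means++:", kmeans_plus_plus_init(points, 2, rng))
```

Because K-Means++ weights candidates by squared distance, points far from the existing centroids are much more likely to be picked, which tends to spread the starting centroids across the data.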
What are the strengths and weaknesses of using Ward's method in Hierarchical Clustering?
- Maximizes mean distance but sensitive to initial configuration
- Maximizes variance but creates well-separated clusters
- Minimizes mean distance but less compact clusters
- Minimizes variance but sensitive to outliers
Ward's method in Hierarchical Clustering aims to minimize the variance within clusters, leading to tightly packed clusters. Strength: It often results in compact and balanced clusters. Weakness: It can be sensitive to outliers, as it minimizes the total within-cluster variance, which can be disproportionately influenced by extreme values.
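To make the outlier sensitivity concrete, the sketch below computes the quantity Ward's method minimizes at each step: the increase in total within-cluster sum of squares caused by a merge. The helper names and the toy clusters are illustrative assumptions.

```python
def centroid(cluster):
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def within_cluster_ss(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(cluster)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in cluster)


def ward_merge_cost(a, b):
    """Increase in total within-cluster sum of squares caused by merging a and b."""
    return within_cluster_ss(a + b) - within_cluster_ss(a) - within_cluster_ss(b)


cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(4.0, 0.0), (4.0, 2.0)]
outlier = [(40.0, 1.0)]  # a single extreme point

print(ward_merge_cost(cluster_a, cluster_b))  # cost of merging two nearby clusters
print(ward_merge_cost(cluster_a, outlier))    # cost of absorbing the outlier
```

Because the cost grows with the squared distance between centroids, a single extreme point produces a merge cost far larger than that of the two nearby clusters, which is one way outliers can distort the resulting hierarchy.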
The _________ linkage method in Hierarchical Clustering considers the average distance between all pairs of points in two clusters.
- Average Linkage
- Complete Linkage
- Single Linkage
- Ward's Method
Average Linkage considers the average distance between all pairs of points in two clusters. It falls between the Single and Complete Linkage methods, often providing a balance that avoids some of the extremes of either method. It can be a good choice when clusters are relatively compact but not necessarily spherical.
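A small sketch comparing the three pairwise linkage criteria on the same pair of clusters, assuming Euclidean distance. The function names and the toy points are illustrative.

```python
import math
from itertools import product


def pairwise_distances(a, b):
    """Euclidean distance for every pair (one point from a, one from b)."""
    return [math.dist(p, q) for p, q in product(a, b)]


def single_linkage(a, b):
    return min(pairwise_distances(a, b))   # closest pair


def complete_linkage(a, b):
    return max(pairwise_distances(a, b))   # farthest pair


def average_linkage(a, b):
    d = pairwise_distances(a, b)
    return sum(d) / len(d)                 # mean over all pairs


cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
```

For any two clusters the average-linkage distance lies between the single-linkage and complete-linkage distances, since a mean cannot fall outside the minimum and maximum of the same values.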
Can dimensionality reduction be used to improve the performance of machine learning models? If so, how?
- By improving computational efficiency
- By reducing overfitting
- By simplifying the model
- All of the above
Dimensionality reduction can improve the performance of machine learning models by reducing overfitting (as the model becomes less complex), simplifying the model (making it easier to interpret), and improving computational efficiency (reducing training time and resource requirements).
Which term refers to a subset of AI that deals with algorithms designed to identify patterns and make decisions with minimal human intervention?
- Data Mining
- Machine Learning
- Neural Networks
- Robotics
Machine Learning is a subset of AI that focuses on creating algorithms to identify patterns and make decisions with little or no human intervention.
Clustering can be used in _________ analysis to find patterns and similarities in large datasets, facilitating targeted marketing strategies.
- Customer Segmentation
- Decision Tree
- Linear Regression
- Principal Component
Clustering is used in customer segmentation analysis to group customers based on patterns and similarities, allowing for more targeted marketing strategies.
Logistic Regression is commonly used for __________ problems where the outcome has two categories.
- Binary classification
- Clustering
- Multiclass classification
- Regression
Logistic Regression is primarily used for binary classification problems where the outcome has only two categories.
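The prediction step of a binary logistic regression model can be sketched as follows. The weights and bias are hypothetical, already-fitted values chosen for illustration, and the 0.5 threshold is a common default rather than a requirement.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict(weights, bias, features, threshold=0.5):
    """Map the probability to one of two classes (0 or 1)."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical, already-fitted parameters for a two-feature model.
weights = [1.2, -0.7]
bias = -0.5

print(predict_proba(weights, bias, [2.0, 1.0]))
print(predict(weights, bias, [2.0, 1.0]))
print(predict(weights, bias, [0.0, 2.0]))
```

The sigmoid squeezes the linear score into the range (0, 1), so the output can be read as a probability and then thresholded into one of the two categories.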
Can you name a popular clustering algorithm used in Machine Learning?
- Decision Trees
- K-Means
- K-Nearest Neighbors
- Linear Regression
K-Means is a widely used clustering algorithm that partitions data into K distinct, non-overlapping clusters, assigning each point to the cluster with the nearest centroid.
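A minimal sketch of the standard K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points), written in plain Python on toy data. The function names, the fixed starting centroids, and the sample points are assumptions made for the example.

```python
import math


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:  # keep an empty cluster's centroid where it was
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def k_means(points, centroids, max_iter=100):
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
initial = [points[0], points[3]]  # fixed starting centroids for repeatability

labels, centroids = k_means(points, initial)
print(labels)
print(centroids)
```

Each point ends up with exactly one label, which is what makes the resulting clusters non-overlapping.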