You have applied PCA to your dataset and the first three principal components explain 95% of the variance. What does this signify, and how would you proceed?
- This indicates an error in the PCA process
- This means that 5% of the data is missing, so you should reapply PCA
- This means that 95% of the variance is captured, so you may choose to proceed with these components
- This means that the data is uniformly distributed and PCA is not needed
If the first three principal components explain 95% of the variance, most of the original information is retained. You may proceed with just these three components if losing the remaining 5% of the variance is acceptable for your task.
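A minimal sketch of this check in practice, assuming scikit-learn (the Iris dataset and the 95% threshold are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize first: PCA is sensitive to feature scales.
X = load_iris().data
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

# Cumulative share of variance captured by the first 1, 2, 3 components.
print(np.cumsum(pca.explained_variance_ratio_))
# Proceed with the reduced data if the final value meets your threshold (e.g., ~0.95).
```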
In DBSCAN, Epsilon is the maximum radius of the neighborhood from a data point, and MinPts is the minimum number of points required to form a ________.
- border point
- cluster
- core point
- noise point
In DBSCAN, Epsilon defines the neighborhood radius, and MinPts defines the minimum number of points required to form a cluster. If a point has at least MinPts points within its Epsilon neighborhood, it becomes a core point and a cluster grows around it; points reachable from no core point are labeled noise.
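A minimal sketch with scikit-learn's DBSCAN (the dataset and the eps/min_samples values are illustrative):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Toy data: two interleaved half-moons, a shape centroid-based methods handle poorly.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# eps is the neighborhood radius (Epsilon); min_samples is MinPts.
db = DBSCAN(eps=0.2, min_samples=5).fit(X)

# Label -1 marks noise; the remaining labels are cluster ids.
print(np.unique(db.labels_))
```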
What is multicollinearity in the context of Multiple Linear Regression?
- Adding interaction effects
- High correlation among variables
- Lowering the bias of the model
- Reducing overfitting
Multicollinearity refers to a situation where two or more independent variables in a Multiple Linear Regression model are highly correlated with each other. It makes the individual effect of each variable hard to isolate and inflates the variance of the coefficient estimates.
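A common diagnostic is the variance inflation factor (VIF). A minimal sketch with statsmodels on synthetic, deliberately collinear data (the rule-of-thumb threshold is a convention, not a hard rule):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic predictors: x2 is nearly collinear with x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# VIF above roughly 5-10 is often read as problematic multicollinearity.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))
```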
What are the main differences between PCA and Linear Discriminant Analysis (LDA) as techniques for dimensionality reduction?
- Both techniques work the same way
- PCA is a type of LDA
- PCA is unsupervised, LDA is supervised
- PCA maximizes within-class variance, LDA between
The main difference between PCA and LDA is that PCA is an unsupervised technique that maximizes the total variance in the data, while LDA is a supervised technique that maximizes the between-class variance and minimizes the within-class variance. This makes LDA more suitable when class labels are available, while PCA can be used without them.
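A minimal side-by-side sketch with scikit-learn (the Wine dataset is illustrative; note that LDA's fit_transform requires the labels while PCA's does not):

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_wine(return_X_y=True)

# PCA: unsupervised; ignores y and keeps directions of maximal total variance.
X_pca = PCA(n_components=2).fit_transform(X)

# LDA: supervised; uses y to find directions that best separate the classes
# (at most n_classes - 1 components).
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)
```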
How are financial institutions using Machine Learning to detect fraudulent activities?
- Fraud Detection
- Personalized Education
- Recommending Media
- Weather Prediction
Financial institutions use Machine Learning algorithms to detect fraudulent activities by analyzing transaction patterns and identifying anomalies or suspicious behavior.
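As an illustrative sketch only (not a production fraud system), an unsupervised anomaly detector such as Isolation Forest can flag transactions that deviate from normal patterns; the features and values here are hypothetical:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical transaction features: [amount, hour_of_day].
rng = np.random.default_rng(42)
normal = np.column_stack([rng.normal(50, 15, 500), rng.normal(14, 3, 500)])
suspicious = np.array([[900.0, 3.0], [1200.0, 4.0]])  # large amounts at odd hours
X = np.vstack([normal, suspicious])

# contamination is the assumed share of anomalies in the data.
model = IsolationForest(contamination=0.01, random_state=0).fit(X)
print(model.predict(suspicious))  # -1 flags a transaction as anomalous
```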
In what scenarios might DBSCAN be a less appropriate clustering algorithm compared to others?
- When clusters have different densities
- When clusters have similar densities
- When data distribution is highly skewed
- When data is uniformly distributed
DBSCAN can be less suitable when clusters have widely different densities, because a single Epsilon and MinPts setting applies to every cluster: values tight enough to resolve a dense cluster may label much of a sparse cluster as noise, while looser values may merge nearby clusters. Methods that adapt to varying density, such as OPTICS or HDBSCAN, can be more appropriate in such scenarios.
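An illustrative sketch of the failure mode (the blob parameters and the eps value are chosen to make the effect visible):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two Gaussian blobs with very different densities.
rng = np.random.default_rng(0)
dense = rng.normal(loc=[0, 0], scale=0.1, size=(200, 2))
sparse = rng.normal(loc=[5, 5], scale=1.5, size=(200, 2))
X = np.vstack([dense, sparse])

# An eps tuned for the dense blob labels much of the sparse blob as noise (-1).
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
values, counts = np.unique(labels, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```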
Your Decision Tree is suffering from high bias. How could adjusting the parameters related to entropy or the Gini Index help in this scenario?
- Add more training data
- Increase tree complexity by fine-tuning split criteria
- Reduce tree complexity by fine-tuning split criteria
- Remove features
High bias usually means the model is too simple (underfitting). Relaxing the split criteria, for example allowing splits that yield smaller entropy or Gini impurity gains, or permitting greater depth, produces a more complex tree that can capture underlying patterns in the data, thereby reducing bias and potentially improving predictive accuracy.
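A minimal sketch with scikit-learn (the dataset and parameter values are illustrative; in practice you would tune them with cross-validation):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A heavily constrained tree (shallow, strict impurity threshold) underfits.
simple = DecisionTreeClassifier(
    criterion="gini", max_depth=1, min_impurity_decrease=0.05, random_state=0
)

# Relaxing those constraints lets impurity-based splits (Gini or entropy) go deeper.
richer = DecisionTreeClassifier(
    criterion="entropy", max_depth=None, min_impurity_decrease=0.0, random_state=0
)

print(cross_val_score(simple, X, y, cv=5).mean())
print(cross_val_score(richer, X, y, cv=5).mean())
```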
In a situation with mixed types of features, a __________ distance metric might be preferable in KNN.
- Cosine
- Euclidean
- Gower
- Manhattan
The Gower distance handles mixed feature types by computing a per-feature dissimilarity (range-normalized differences for numerical features, simple matching for categorical ones) and averaging them, which makes it often preferable when a dataset mixes numerical and categorical features.
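scikit-learn's KNN does not ship a Gower metric, so here is a minimal, illustrative implementation for a single pair of records (a dedicated package or a custom metric callable would be used in practice; the records are hypothetical):

```python
import numpy as np

def gower_distance(a, b, is_categorical, ranges):
    """Average per-feature dissimilarity between two mixed-type records."""
    d = np.empty(len(a))
    for j in range(len(a)):
        if is_categorical[j]:
            d[j] = 0.0 if a[j] == b[j] else 1.0  # simple matching
        else:
            d[j] = abs(float(a[j]) - float(b[j])) / ranges[j]  # range-normalized
    return d.mean()

# Hypothetical records: [age, income, city]; ranges come from the full dataset.
x = np.array([30, 50_000, "NY"], dtype=object)
y = np.array([40, 60_000, "LA"], dtype=object)
print(gower_distance(x, y, is_categorical=[False, False, True],
                     ranges=[50, 100_000, None]))  # -> ~0.433
```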
The _________ linkage method in Hierarchical Clustering merges clusters so as to minimize the increase in total within-cluster variance.
- Average Linkage
- Complete Linkage
- Single Linkage
- Ward's Method
Ward's Method merges, at each step, the pair of clusters whose combination produces the smallest increase in total within-cluster variance, measured as the sum of squared deviations from the cluster means. It tends to create compact, similarly sized clusters, which makes it a good choice when roughly spherical clusters and low within-cluster variance are the priority.
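A minimal sketch with SciPy (the synthetic blobs and the three-cluster cut are illustrative):

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Ward linkage: each merge minimizes the increase in total
# within-cluster sum of squares.
Z = linkage(X, method="ward")

# Cut the dendrogram into three flat clusters.
labels = fcluster(Z, t=3, criterion="maxclust")
print(sorted(set(labels)))
```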
You have built an SVM for a binary classification problem but the model is overfitting. What changes can you make to the kernel or hyperparameters to improve the model?
- Change the kernel's color
- Change to a simpler kernel or adjust the regularization parameter 'C'
- Ignore overfitting
- Increase the kernel's complexity
Overfitting can be mitigated by switching to a simpler kernel (for example, from RBF to linear) or by decreasing the regularization parameter 'C', which strengthens regularization and smooths the decision boundary, allowing a better balance between bias and variance.
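A minimal sketch of both remedies with scikit-learn (the synthetic dataset and the C values are illustrative, not tuned):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, flip_y=0.1, random_state=0)

# A high-C RBF kernel with large gamma fits the training noise (high variance).
overfit = SVC(kernel="rbf", C=1000, gamma=1.0)

# A simpler kernel with lower C (stronger regularization) smooths the boundary.
regularized = SVC(kernel="linear", C=0.1)

print(cross_val_score(overfit, X, y, cv=5).mean())
print(cross_val_score(regularized, X, y, cv=5).mean())
```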