How does 'DBSCAN' clustering differ from 'K-means' and 'hierarchical' clustering?
- DBSCAN can find arbitrarily shaped clusters and is less affected by outliers
- DBSCAN creates a hierarchy of clusters
- DBSCAN requires the number of clusters to be specified
- DBSCAN uses centroids to form clusters
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) differs from K-means and hierarchical clustering in that it can find arbitrarily shaped clusters and is less affected by outliers. It does not require the user to set the number of clusters a priori; instead, it infers the number of clusters from the data, based on density parameters such as a neighborhood radius (eps) and a minimum number of neighboring points.
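A minimal sketch of this difference using scikit-learn's two-moons dataset, a non-convex shape that K-means cannot separate but DBSCAN can; the eps and min_samples values here are illustrative assumptions:

```python
# DBSCAN on two interlocking half-moons: density-based clustering
# recovers the two arbitrarily shaped clusters without being told k.
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN

X, _ = make_moons(n_samples=200, noise=0.05, random_state=42)

db = DBSCAN(eps=0.25, min_samples=5).fit(X)
# A label of -1 marks noise points; the rest are cluster IDs.
n_clusters = len(set(db.labels_)) - (1 if -1 in db.labels_ else 0)
```

Note that the number of clusters (`n_clusters`) is an output of DBSCAN, not an input as in K-means.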
In Bayes' theorem, what is the posterior probability?
- The likelihood of the evidence
- The probability of an event before evidence is observed
- The probability of the evidence given the event
- The updated probability of an event after evidence is observed
In Bayes' theorem, the posterior probability is the updated probability of an event after new evidence has been observed. It is calculated by multiplying the likelihood and the prior probability and then dividing by the probability of the evidence.
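A worked example of this calculation in plain Python, using made-up numbers for a diagnostic test (1% prevalence, 95% sensitivity, 5% false-positive rate):

```python
# Bayes' theorem: posterior = likelihood * prior / evidence.
prior = 0.01            # P(disease) before seeing the test result
likelihood = 0.95       # P(positive | disease)
false_positive = 0.05   # P(positive | no disease)

# P(positive) = P(pos|disease)P(disease) + P(pos|no disease)P(no disease)
evidence = likelihood * prior + false_positive * (1 - prior)

# Updated probability of disease after observing a positive test.
posterior = likelihood * prior / evidence
```

Here the positive test raises the probability of disease from 1% (the prior) to roughly 16% (the posterior).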
The range of values around the point estimate that captures the true population parameter at some predetermined confidence level is called a ________ interval.
- Confidence
- Correlation
- Deviation
- Variable
The range of values around the point estimate that captures the true population parameter at some predetermined confidence level is called a confidence interval. Confidence intervals are used in statistics to indicate the reliability of an estimate.
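A short sketch of computing such an interval with SciPy; the sample values are arbitrary, and a t-based interval is used since the population standard deviation is unknown:

```python
# 95% t-based confidence interval for a sample mean.
import numpy as np
from scipy import stats

sample = np.array([4.1, 5.2, 6.3, 4.8, 5.5, 5.0, 4.7, 5.9])
mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# Interval around the point estimate at the 95% confidence level.
lower, upper = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
```

The interval is centered on the point estimate (the sample mean) and widens as the data get noisier or the sample gets smaller.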
To prevent overfitting, we can apply a technique called ________ in polynomial regression.
- Aggregation
- Factorization
- Normalization
- Regularization
To prevent overfitting, we can apply a technique called regularization in polynomial regression. Regularization involves adding a penalty term to the loss function during the process of training a model. This penalty term discourages the coefficients of the model from reaching large values, leading to a simpler model that's less likely to overfit.
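A minimal sketch of this effect with scikit-learn, comparing an unregularized degree-9 polynomial fit against a ridge (L2-penalized) fit; the degree, noise level, and penalty strength `alpha=1.0` are illustrative assumptions:

```python
# Ridge regularization shrinks polynomial regression coefficients.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.2, 30)

# Degree-9 polynomial features: flexible enough to overfit 30 points.
X_poly = PolynomialFeatures(degree=9, include_bias=False).fit_transform(x)

ols = LinearRegression().fit(X_poly, y)       # no penalty
ridge = Ridge(alpha=1.0).fit(X_poly, y)       # L2 penalty on coefficients

# The penalty keeps the ridge coefficients much smaller than OLS.
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
```

The penalty term discourages large coefficients, so the ridge fit is smoother and generalizes better than the unregularized polynomial.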
What is a uniform distribution?
- A bell-shaped distribution
- A distribution with different probabilities for different outcomes
- A distribution with the same probability for all outcomes
- A skewed distribution
A uniform distribution, also called a rectangular distribution, is a type of probability distribution in which all outcomes are equally likely. Each interval of equal length on the distribution's support has the same probability.
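A quick sketch with SciPy of a continuous uniform distribution on the (arbitrarily chosen) interval [2, 5], showing both properties: a flat density and equal probability for equal-length intervals:

```python
# Continuous uniform distribution on [2, 5].
from scipy import stats

u = stats.uniform(loc=2, scale=3)  # support is [loc, loc + scale] = [2, 5]

# The density is constant (1 / 3) everywhere on the support.
density_a = u.pdf(2.5)
density_b = u.pdf(4.5)

# Equal-length intervals carry equal probability mass.
p_first = u.cdf(3) - u.cdf(2)  # P(2 <= X <= 3)
p_last = u.cdf(5) - u.cdf(4)   # P(4 <= X <= 5)
```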
The geometric mean is particularly useful when comparing different items with very different ________.
- Mean values
- Median values
- Mode values
- Ranges
The geometric mean is particularly useful when comparing different items with very different ranges. It is commonly used for averaging growth rates, such as population or financial growth, where each year's value is relative to the previous year's.
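A small example with SciPy of why the geometric mean is the right average for multiplicative data; the growth figures (+10%, +50%, -20%) are made up for illustration:

```python
# Geometric mean of year-over-year growth factors.
from scipy.stats import gmean

factors = [1.10, 1.50, 0.80]  # +10%, +50%, -20% growth

avg_growth = gmean(factors)  # average multiplicative growth per year

# Compounding the geometric mean reproduces the total growth exactly,
# which the arithmetic mean would not.
total_via_gmean = avg_growth ** len(factors)
total_actual = 1.10 * 1.50 * 0.80
```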
How would you interpret the result of a Kruskal-Wallis Test?
- As a measure of correlation
- As a measure of dependence
- As a measure of difference between groups
- As a measure of variance
The result of a Kruskal-Wallis Test is interpreted as a measure of difference between groups. If the test is statistically significant, it suggests that at least one of the groups differs from the others in its distribution.
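A sketch of running the test with SciPy on three small made-up samples, where the third group is shifted well above the other two:

```python
# Kruskal-Wallis H-test across three independent groups.
from scipy.stats import kruskal

group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
group_c = [10, 11, 12, 13, 14]  # clearly shifted upward

stat, p_value = kruskal(group_a, group_b, group_c)
# A small p-value suggests at least one group differs from the others.
```

Note the test indicates *that* a difference exists, not *which* group differs; a post-hoc pairwise test would be needed for that.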
When two or more predictors in a multiple linear regression model are highly correlated, it is known as __________.
- Autocorrelation
- Homoscedasticity
- Multicollinearity
- Overfitting
Multicollinearity is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy. This can lead to unstable estimates of the coefficients.
In the presence of multicollinearity, the estimated regression coefficients are _______.
- biased
- equal to zero
- negative
- unbiased
Even in the presence of multicollinearity, the least squares estimates of the regression coefficients are still unbiased. However, they are less precise: multicollinearity inflates their variances and standard errors.
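This unbiased-but-imprecise behavior can be checked with a small simulation; the true coefficients, sample size, and noise scales below are made-up illustration values:

```python
# Simulation: OLS estimates stay unbiased under multicollinearity
# but become much more variable across repeated samples.
import numpy as np

rng = np.random.default_rng(7)
true_b1, true_b2, n, reps = 2.0, -1.0, 50, 2000

def estimate_b1(corr_noise):
    """Fit OLS on `reps` datasets; return mean and spread of b1-hat."""
    estimates = []
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = x1 + rng.normal(scale=corr_noise, size=n)  # small scale => collinear
        y = true_b1 * x1 + true_b2 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(beta[1])
    return np.mean(estimates), np.std(estimates)

mean_collinear, sd_collinear = estimate_b1(corr_noise=0.1)   # x2 ~ x1
mean_weak, sd_weak = estimate_b1(corr_noise=10.0)            # x2 nearly unrelated
```

Both means land near the true coefficient of 2.0 (unbiasedness), but the spread of the estimates is roughly an order of magnitude larger in the collinear case (inflated standard errors).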
How does standard deviation differ from the mean absolute deviation?
- Mean absolute deviation is always greater
- Standard deviation is always greater
- Standard deviation squares the deviations while mean absolute deviation takes absolute values
- They are the same
The standard deviation and mean absolute deviation both measure the dispersion in a dataset. The key difference lies in how they treat deviations from the mean: standard deviation squares the deviations before averaging them, while mean absolute deviation takes the absolute value of deviations before averaging. As a result, standard deviation is more sensitive to extreme values than the mean absolute deviation.
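A small NumPy sketch of this difference, computing both measures on a dataset before and after adding a single extreme value:

```python
# Standard deviation vs. mean absolute deviation under an outlier.
import numpy as np

def mad(x):
    # Mean absolute deviation: average absolute distance from the mean.
    return np.mean(np.abs(x - np.mean(x)))

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
with_outlier = np.append(data, 100.0)

sd_before, sd_after = np.std(data), np.std(with_outlier)
mad_before, mad_after = mad(data), mad(with_outlier)
# Squaring makes the standard deviation react more strongly to the outlier.
```

For this data the standard deviation is already the larger of the two, and adding the outlier inflates it by a proportionally greater factor than the mean absolute deviation.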