What does a residual plot tell us about the fit of the model?

It indicates how well the model's predictions match the actual data
It indicates the variance of the residuals
It shows the correlation between the dependent and independent variables
It shows the relationship between the dependent and independent variables

A residual plot shows the residuals on the y-axis and the independent variable (or the fitted values) on the x-axis. If the points are randomly scattered around the horizontal zero line with no visible pattern, a linear regression model is appropriate for the data. A systematic pattern, such as a curve, suggests that a non-linear model is more appropriate.
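
As a rough illustration of this idea, the sketch below fits a straight line to two small made-up datasets and computes the residuals that a residual plot would display. The data and the helper names `fit_line` and `residuals` are assumptions made for this example only.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Observed value minus the value predicted by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

# Hypothetical data: one truly linear relationship and one curved (quadratic) one.
x = [float(i) for i in range(-5, 6)]
y_linear = [2.0 * xi + 1.0 for xi in x]
y_curved = [xi ** 2 for xi in x]

res_linear = residuals(x, y_linear)  # essentially zero everywhere: no pattern
res_curved = residuals(x, y_curved)  # positive at both ends, negative in the middle
```

Plotting `res_curved` against `x` would give a U shape, which is the kind of systematic pattern that signals a straight line is the wrong model.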

Can PCA be used for both supervised and unsupervised learning?

No
Only for supervised learning
Only for unsupervised learning
Yes

No. PCA is an unsupervised learning technique: its algorithm does not use any class label information. However, the dataset transformed by PCA can be used as input to subsequent supervised learning tasks.
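
A minimal sketch of this point, assuming two-dimensional data so that the first principal axis can be found in closed form from the covariance matrix. The function `first_principal_axis` and the sample data are hypothetical. Note that the labels never enter the PCA step; they are only attached afterwards.

```python
import math

def first_principal_axis(points):
    """Return (mean, unit direction) of the first principal component of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Angle of the direction of maximum variance for a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

# Hypothetical features and class labels.
features = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8)]
labels = [0, 0, 1, 1]

# Unsupervised step: only the features are used.
center, axis = first_principal_axis(features)
scores = [(p[0] - center[0]) * axis[0] + (p[1] - center[1]) * axis[1]
          for p in features]

# Supervised step (downstream): the 1-D scores are paired with the labels.
training_pairs = list(zip(scores, labels))
```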

What is the effect of outliers on PCA?

It depends on the distribution of the data
They can distort the principal components
They enhance the performance of PCA
They have no effect on PCA

Outliers can significantly distort the principal components identified by PCA, as they can artificially inflate the variance along their direction. It's generally a good practice to address outliers before applying PCA.
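
The effect can be sketched on a toy example. The code below assumes two-dimensional data and uses a hypothetical helper, `first_principal_direction`, to compare the first principal direction with and without a single extreme point. The data are invented for illustration.

```python
import math

def first_principal_direction(points):
    """Unit vector along the direction of maximum variance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

# Hypothetical data spread almost entirely along the x-axis.
clean = [(-2.0, 0.1), (-1.0, -0.1), (0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
# The same data plus one extreme point far away in the y direction.
with_outlier = clean + [(0.0, 20.0)]

dir_clean = first_principal_direction(clean)           # close to the x-axis
dir_outlier = first_principal_direction(with_outlier)  # close to the y-axis
```

A single point is enough to rotate the first component by roughly 90 degrees here, because it dominates the variance in its own direction.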

What is the concept of "Type I" error in the context of hypothesis testing?

Failing to reject a false null hypothesis
Failing to reject a true alternative hypothesis
Rejecting a false alternative hypothesis
Rejecting a true null hypothesis

A Type I error in hypothesis testing is the incorrect rejection of a true null hypothesis. The probability of making a Type I error is denoted by the Greek letter alpha (α). In other words, a Type I error happens when the researcher concludes that the null hypothesis is false when, in fact, it is true.
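
A small simulation can make this concrete. The sketch below assumes a two-sided z-test with known standard deviation 1, which is a simplification chosen for this example, and repeatedly tests a null hypothesis that is actually true. The trial count, sample size, and seed are arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05        # chosen significance level
N_TRIALS = 5000     # number of simulated experiments (arbitrary)
SAMPLE_SIZE = 30    # observations per experiment (arbitrary)

rng = random.Random(0)  # fixed seed so the run is reproducible
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

false_rejections = 0
for _ in range(N_TRIALS):
    # The null hypothesis (mean = 0) is TRUE for every simulated sample.
    sample = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    z = sample_mean * math.sqrt(SAMPLE_SIZE)  # z statistic with known sigma = 1
    if abs(z) > critical_z:
        false_rejections += 1  # rejecting a true null: a Type I error

type_i_rate = false_rejections / N_TRIALS
```

Under these assumptions `type_i_rate` should land close to `ALPHA`, since every rejection in the loop is a Type I error by construction.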

When can we apply the Chi-square test for goodness of fit?

When the data are continuously distributed
When the data are normally distributed
When we have categorical data and want to see if it follows a specific distribution
When we want to compare means

The Chi-square test for goodness of fit is used when we have categorical data and want to see whether it follows a specific distribution. It compares the observed count in each category with the count expected under that distribution.
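
As an illustrative sketch, the code below computes the goodness-of-fit statistic for invented counts from 120 rolls of a die, testing against a uniform (fair die) distribution. The observed counts are made up, and the critical value is the standard chi-square table entry for 5 degrees of freedom at the 0.05 level.

```python
# Hypothetical observed counts for faces 1..6 over 120 rolls.
observed = [18, 22, 16, 25, 19, 20]
total = sum(observed)

# Expected counts if the die is fair: equal probability for each face.
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE_DF5_ALPHA_05 = 11.070  # chi-square table value, df = 5, alpha = 0.05

reject_fair_die = chi_square > CRITICAL_VALUE_DF5_ALPHA_05
```

Here the statistic is 2.5, well below the critical value, so these counts give no reason to doubt that the die is fair.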

How does Spearman's Rank Correlation react to outliers as compared to Pearson's correlation?

Both are equally sensitive to outliers
Less sensitive to outliers
More sensitive to outliers
Neither is sensitive to outliers

Spearman's Rank Correlation is less sensitive to outliers than Pearson's correlation. This is because Spearman's correlation is based on rank orders rather than raw data values, making it more robust against outliers.
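
The difference can be sketched with a small made-up dataset in which a single extreme value is added to otherwise perfectly increasing data. The helper functions below are written for this example and assume there are no tied values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y equals x, except that the last point is an extreme outlier.
x = [float(i) for i in range(1, 11)]
y = x[:-1] + [-50.0]

r_pearson = pearson(x, y)    # pulled below zero by the single outlier
r_spearman = spearman(x, y)  # stays positive, because only the rank of the outlier matters
```

With ranks, the outlier can move at most from one end of the ordering to the other, so its influence is bounded. With raw values, its distance from the rest of the data is what drives the result.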

What does a null hypothesis represent in statistical testing?

A condition of no effect or no difference
A specific outcome of the experiment
An effect or difference exists
The sample size is large enough for the test

The null hypothesis is a statement in statistical inference asserting that there is no effect or no difference, for example that any gap between the observed and expected data is due to chance alone.

What is the difference between a one-sample t-test and a two-sample t-test?

The number of hypotheses being tested
The number of samples being compared
The type of data being used
All of the above

The key difference between a one-sample t-test and a two-sample t-test lies in the number of samples being compared. A one-sample t-test compares the mean of a single sample to a known value, while a two-sample t-test compares the means of two different samples.
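
The two statistics can be sketched side by side. The example below uses invented measurements and, for the two-sample case, assumes Welch's form of the statistic (unequal variances), which is one common choice. The function names are hypothetical.

```python
import math
from statistics import mean, variance

def one_sample_t(sample, mu0):
    """t statistic comparing one sample's mean to a known value mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / math.sqrt(variance(sample) / n)

def two_sample_t(sample_a, sample_b):
    """Welch's t statistic comparing the means of two independent samples."""
    se = math.sqrt(variance(sample_a) / len(sample_a)
                   + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Hypothetical measurements.
group_a = [5.1, 4.9, 5.0, 5.2, 4.8]
group_b = [6.1, 5.9, 6.0, 6.2, 5.8]

t_one = one_sample_t(group_a, mu0=5.0)   # one sample against a reference value
t_two = two_sample_t(group_a, group_b)   # two samples against each other
```

The first call takes a single sample and a reference value, while the second takes two samples and no reference value, which mirrors the distinction described above.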

What is the concept of significance level in hypothesis testing?

The amount of data needed to support the alternative hypothesis
The difference between the null and alternative hypotheses
The probability of rejecting a true null hypothesis
The proportion of the sample that supports the null hypothesis

The significance level, denoted alpha or α, is the probability of rejecting the null hypothesis when it is true. It is chosen before the test is run; common choices are 0.05 and 0.01.
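
To show how the choice of α changes the decision, the sketch below assumes a two-sided z-test and a hypothetical observed statistic of 2.2. The function names are made up for this example.

```python
from statistics import NormalDist

def two_sided_critical_z(alpha):
    """Cutoff that |z| must exceed to reject the null at significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_null(z, alpha):
    """Decision rule for a two-sided z-test."""
    return abs(z) > two_sided_critical_z(alpha)

z_observed = 2.2  # hypothetical test statistic

# The same evidence leads to different decisions at different significance levels.
decisions = {alpha: reject_null(z_observed, alpha) for alpha in (0.10, 0.05, 0.01)}
```

A smaller α demands stronger evidence: this statistic leads to rejection at 0.10 and 0.05 but not at 0.01.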

What is the Multiplication Rule of Probability primarily used for?

To calculate the joint probability of two independent events
To calculate the probability of either of two events occurring
To divide one probability by another
To subtract one probability from another

The Multiplication Rule in probability is used to calculate the joint probability of two events. For independent events A and B, it states that the probability of both occurring is the product of their individual probabilities: P(A and B) = P(A) × P(B).
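
The rule can be checked by brute-force enumeration. The sketch below assumes two fair six-sided dice and two events chosen for illustration: the first die shows a six, and the second die shows an even number.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate over outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def first_is_six(outcome):
    return outcome[0] == 6

def second_is_even(outcome):
    return outcome[1] % 2 == 0

p_a = prob(first_is_six)      # 1/6
p_b = prob(second_is_even)    # 1/2
p_both = prob(lambda o: first_is_six(o) and second_is_even(o))  # 1/12

# For independent events, the joint probability equals the product.
rule_holds = (p_both == p_a * p_b)
```

The two events concern different dice, so they are independent, and counting outcomes directly gives the same answer as multiplying the two probabilities.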
