What is multicollinearity in the context of multiple regression?

It refers to the high correlation between at least two independent variables.
It refers to the linear relationship between each independent variable and the dependent variable.
It refers to the presence of a linear relationship between the dependent variables.
It refers to the relationship between the residuals of the regression model.

Multicollinearity refers to the situation in which two or more predictor variables in a regression model are highly correlated.
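
A minimal sketch of how this can be checked numerically, assuming just two predictors. The data are hypothetical (the same heights recorded in centimetres and, with a little noise, in inches), and `pearson` is a helper written for this illustration. With exactly two predictors, the variance inflation factor (VIF) of each one is 1 / (1 - r^2), where r is their correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictors: heights in cm and the same heights, noisily, in inches.
height_cm = [150, 160, 165, 172, 180, 191]
noise = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3]
height_in = [h / 2.54 + e for h, e in zip(height_cm, noise)]

r = pearson(height_cm, height_in)
# With exactly two predictors, each predictor's VIF is 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)
print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

A VIF above roughly 5 to 10 is a commonly used rule of thumb for problematic multicollinearity; the exact cut-off is a convention, not a hard rule.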

How can you interpret interaction terms in a multiple linear regression model?

They represent the change in the slope of one variable for different values of another variable
They represent the combined effect of two variables on the response
They represent the effect of a variable at different levels of another variable
All of the above

Interaction terms represent the combined effect of two predictors on the response variable. They can also be interpreted as the effect of a predictor at different levels of another predictor or the change in the slope of one predictor for different values of another predictor.
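
The "change in slope" reading can be made concrete with a small sketch. The model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 and its coefficient values below are hypothetical, chosen only for illustration; the point is that the effect of a one-unit increase in x1 equals b1 + b3*x2, so it depends on the level of x2.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Hypothetical fitted model with an interaction term b3 * x1 * x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_of_x1(x2):
    """Effect on the prediction of raising x1 by one unit, holding x2 fixed."""
    return predict(1, x2) - predict(0, x2)

# The slope of x1 is b1 + b3 * x2, so it changes with the level of x2.
slopes = {x2: slope_of_x1(x2) for x2 in (0, 1, 2)}
print(slopes)
```

Without the interaction term (b3 = 0) the three slopes would all equal b1.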

What type of data represents measurements or counts?

Categorical data
Nominal data
Qualitative data
Quantitative data

Quantitative data represents measurements or counts. It can be mathematically quantified and is usually collected in numerical form. For example, data such as age, weight, height, and number of items are all quantitative data.

How does the Mann-Whitney U test handle ties?

Ties are given the average rank
Ties are given the highest rank
Ties are given the lowest rank
Ties are removed from the data

In the Mann-Whitney U test, ties (equal values) are handled by giving them the average of the ranks they would have received if they were not tied.
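
As a rough sketch of the average-rank rule, the helper below (`average_ranks`, a name made up for this example) ranks a pooled sample and then derives U for the first group from its rank sum. The two tiny groups are hypothetical data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

group_a = [3, 5]  # hypothetical sample 1
group_b = [5, 8]  # hypothetical sample 2
pooled_ranks = average_ranks(group_a + group_b)

n1 = len(group_a)
rank_sum_a = sum(pooled_ranks[:n1])
u_a = rank_sum_a - n1 * (n1 + 1) / 2  # U statistic for group_a
print(pooled_ranks, u_a)
```

The two 5s would have occupied positions 2 and 3, so each receives rank 2.5.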

In the presence of ties or zeros in the differences, it is usually better to apply the _________ test.

Mann-Whitney U
Sign
Wilcoxon Signed Rank

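When the paired differences contain ties or zeros, the Sign Test is usually the safer choice. It uses only the direction of each nonzero difference, so tied magnitudes do not affect it, whereas the Wilcoxon Signed Rank Test has to rank the magnitudes, and ties complicate its null distribution. In their standard forms both tests drop zero differences, so many zeros reduce the effective sample size of either test.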
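
A minimal sketch of an exact two-sided Sign Test, assuming the common convention that zero differences are dropped before counting signs. The function name and the sample differences are hypothetical.

```python
from math import comb

def sign_test_p_value(differences):
    """Exact two-sided Sign Test p-value; zero differences are dropped."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

diffs = [2, -1, 3, 0, 4, 1, 0, 5]  # hypothetical paired differences
p = sign_test_p_value(diffs)
print(p)

# Only the signs matter: inflating the magnitudes leaves the p-value unchanged.
p_scaled = sign_test_p_value([20, -1, 300, 0, 4, 1, 0, 5])
```

Here six nonzero differences remain, five positive and one negative, so the p-value is 2 * (1 + 6) / 64.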

When is it more appropriate to use the Wilcoxon Signed Rank Test rather than the Sign Test?

When data is nominal
When data is normally distributed
When data is ordinal or interval
When sample size is large

The Wilcoxon Signed Rank Test is more appropriate when the data are ordinal or interval and the paired differences can be meaningfully ranked. It takes into account the magnitude of the differences between paired observations, unlike the Sign Test, which considers only their sign.
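
The difference can be seen in a small sketch. The two hypothetical sets of paired differences below have identical signs (three positive, one negative), so the Sign Test treats them the same, but their signed-rank sums differ. The helper assumes no zeros and no tied magnitudes, which keeps the ranking trivial.

```python
def signed_rank_sums(differences):
    """Return (W+, W-); assumes no zero differences and no tied magnitudes."""
    ordered = sorted(differences, key=abs)  # rank 1 = smallest magnitude
    w_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    w_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    return w_plus, w_minus

set_a = [1, 2, 3, -4]  # the negative difference is the largest in magnitude
set_b = [4, 3, 2, -1]  # the negative difference is the smallest in magnitude

sums_a = signed_rank_sums(set_a)
sums_b = signed_rank_sums(set_b)
print(sums_a, sums_b)
```

In `set_b` the single negative difference is small, so the evidence for a positive shift is stronger, and only the signed-rank statistic reflects that.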

The Mann-Whitney U test can be used when the assumptions of the ________ test are not met.

ANOVA
F
chi-square
t

The Mann-Whitney U test can be used as an alternative when the assumptions of the t-test (e.g., normality, homogeneity of variance) are not met.

What does probability measure?

Degree of difference between groups
Likelihood of an event occurring
Speed of an event
Strength of a relationship

Probability measures the likelihood or chance of an event occurring. It is a mathematical concept that is fundamental to statistics and data science, helping to predict outcomes and guide decision-making.

If a dataset has positive skewness, where would you find the majority of the data values?

At the mean
It is equally distributed
To the left of the mean
To the right of the mean

If a dataset has positive skewness, the majority of the data values lie to the left of the mean. The long right tail pulls the mean above the median, so more than half of the observations typically fall below the mean.
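
A quick check on a small, made-up right-skewed sample illustrates the point: one large value drags the mean upward, leaving most observations below it.

```python
import statistics

# Hypothetical right-skewed sample: one large value forms the right tail.
data = [1, 2, 2, 3, 3, 3, 4, 20]

mean = statistics.mean(data)
median = statistics.median(data)
share_below_mean = sum(1 for v in data if v < mean) / len(data)

print(mean, median, share_below_mean)
```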

What is the impact of a large sample size on the confidence interval of the mean?

Larger sample size has no impact on the confidence interval
Larger sample size leads to a narrower interval
Larger sample size leads to a wider interval
Larger sample size makes the interval skewed

Larger sample sizes lead to narrower confidence intervals. The standard error of the mean shrinks in proportion to 1/sqrt(n), so with more data the population mean is estimated more precisely and the range of plausible values (the confidence interval) gets smaller.
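
A small sketch of the effect, under the simplifying assumption of a known population standard deviation (a z-interval). The value sigma = 10 and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for the mean (sigma known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / sqrt(n)

# Assumed population standard deviation of 10; only n varies.
half_widths = {n: ci_half_width(10.0, n) for n in (25, 100, 400)}
for n, hw in half_widths.items():
    print(f"n = {n:>3}: mean +/- {hw:.2f}")
```

Because the half-width scales with 1/sqrt(n), quadrupling the sample size halves the interval width.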

Spearman's Rank Correlation is used when the variables are measured on a ________ scale.

Interval
Nominal
Ordinal
Ratio

Spearman's Rank Correlation is used when the variables are measured on an ordinal scale, as it compares the ranks of data. It can also be used with interval and ratio scales, particularly when a non-parametric measure of association is desired.
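
As an illustration, Spearman's coefficient can be computed as the Pearson correlation of the ranks. The sketch below uses hypothetical data with a monotonic but nonlinear relationship (y = x^3) and assumes no ties, so a simple ranking helper is enough.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; assumes all values are distinct (no ties)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

pearson_r = pearson(x, y)
spearman_rho = pearson(ranks(x), ranks(y))
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```

Spearman's coefficient is 1 because the ordering of the two variables agrees perfectly, while Pearson's is lower because the relationship is not linear.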

What are the assumptions of the Mann-Whitney U test?

Equal variances and independent observations
Independent observations and normally distributed residuals
Independent observations and similarly shaped distributions
Normal distribution, equal variances, and independent observations

The Mann-Whitney U test assumes that the observations are independent and, when it is read as a comparison of medians, that the two distributions have a similar shape. It does not require the data to be normally distributed.
