What is multicollinearity in the context of multiple regression?

  • It refers to the high correlation between at least two independent variables.
  • It refers to the linear relationship between each independent variable and the dependent variable.
  • It refers to the presence of a linear relationship between the dependent variables.
  • It refers to the relationship between the residuals of the regression model.
Multicollinearity refers to the situation in which two or more predictor variables in a regression model are highly correlated.
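One common way to quantify this is the variance inflation factor (VIF). The sketch below is only an illustration: it assumes a model with exactly two predictors, in which case the VIF of either predictor reduces to 1 / (1 - r^2), where r is their Pearson correlation. The data and the function names are made up for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors: x2 is almost exactly 2 * x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
x3 = [5, 1, 4, 2, 6, 3]

vif_collinear = vif_two_predictors(x1, x2)    # very large
vif_independent = vif_two_predictors(x1, x3)  # close to 1
print(vif_collinear, vif_independent)
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity, although the exact cut-off is a matter of convention.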

The Kruskal-Wallis Test ranks all the data from all groups together; it then tests whether the ________ ranks differ significantly between the groups.

  • average
  • mean
  • median
  • mode
The Kruskal-Wallis Test ranks all the data from all groups together; it then tests whether the mean ranks differ significantly between the groups.
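The pooling-and-ranking step can be sketched as follows. This is a minimal illustration, not a full implementation: it computes the mean rank per group and the H statistic without the tie correction, and the three groups are invented data.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return (mean rank per group, H statistic without tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    mean_ranks, h_sum, start = [], 0.0, 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        start += len(g)
        rank_sum = sum(group_ranks)
        mean_ranks.append(rank_sum / len(g))
        h_sum += rank_sum ** 2 / len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h_sum - 3 * (n_total + 1)
    return mean_ranks, h

# Hypothetical, fully separated groups.
mean_ranks, h_stat = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(mean_ranks, h_stat)
```

The further the group mean ranks are from the overall mean rank (N + 1) / 2, the larger H becomes.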

When two variables increase and decrease together, they are said to have a ________ correlation.

  • negative
  • positive
  • strong
  • zero
When two variables increase and decrease together, they are said to have a positive correlation. This is indicated by a positive Pearson correlation coefficient (r > 0).

If the null hypothesis is false, but we fail to reject it, what type of error have we made?

  • Both Type I and Type II error
  • Neither Type I nor Type II error
  • Type I error
  • Type II error
If the null hypothesis is false, but we fail to reject it, we have made a Type II error. This is also known as a "false negative" result.
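The four possible outcomes of a test can be written as a small lookup. The function name `classify_decision` is hypothetical and exists only to make the definitions concrete.

```python
def classify_decision(null_is_true: bool, rejected_null: bool) -> str:
    """Label the outcome of a hypothesis test given the (unknown) truth."""
    if null_is_true and rejected_null:
        return "Type I error (false positive)"
    if not null_is_true and not rejected_null:
        return "Type II error (false negative)"
    return "correct decision"

# The case from the question: the null is false, but we fail to reject it.
print(classify_decision(null_is_true=False, rejected_null=False))
```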

What are the assumptions of the Mann-Whitney U test?

  • Equal variances and independent observations
  • Independent observations and normally distributed residuals
  • Independent observations and similarly shaped distributions
  • Normal distribution, equal variances, and independent observations
The Mann-Whitney U test assumes that observations are independent and that the two groups have similarly shaped distributions. The second assumption matters when the result is read as a difference in medians. The test does not require the data to be normally distributed.
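Because the test depends only on how the observations are ordered, the U statistic can be computed by direct pairwise counting. The sketch below assumes small samples (it is quadratic in sample size) and uses made-up data; it returns the statistics only, not a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): U_a counts pairs where a > b, ties count as 0.5."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical samples in which every control value is below every treated value.
control = [12, 15, 11, 18]
treated = [22, 25, 19, 30]
u_control, u_treated = mann_whitney_u(control, treated)
print(u_control, u_treated)
```

The smaller of the two values is usually the one compared against a critical value or converted to a p-value.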

Spearman's Rank Correlation is used when the variables are measured on a ________ scale.

  • Interval
  • Nominal
  • Ordinal
  • Ratio
Spearman's Rank Correlation is used when the variables are measured on an ordinal scale, as it compares the ranks of data. It can also be used with interval and ratio scales, particularly when a non-parametric measure of association is desired.
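To see why ranks make the measure robust to non-linear but monotonic relationships, here is a minimal sketch. It assumes there are no tied values, so the shortcut formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) applies; the data are illustrative.

```python
def simple_ranks(values):
    """Ranks 1..n by sorted position; assumes all values are distinct."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman_rho(x, y):
    """Spearman's rho via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2
                    for rx, ry in zip(simple_ranks(x), simple_ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: y grows monotonically with x, but far from linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

rho_monotonic = spearman_rho(x, y)       # perfect agreement of ranks
rho_reversed = spearman_rho(x, y[::-1])  # perfectly opposite ranks
print(rho_monotonic, rho_reversed)
```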

What is the impact of a large sample size on the confidence interval of the mean?

  • Larger sample size has no impact on the confidence interval
  • Larger sample size leads to a narrower interval
  • Larger sample size leads to a wider interval
  • Larger sample size makes the interval skewed
Larger sample sizes lead to narrower confidence intervals. The standard error of the mean shrinks in proportion to 1/√n, so with more data we can estimate the population parameter more precisely, and the range of values within which we believe the parameter lies (the confidence interval) gets smaller.
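The effect is easy to see numerically. The sketch below assumes a normal-approximation interval of the form mean ± z · s / √n, with an illustrative standard deviation of 10; with a small sample, a t critical value would normally replace z.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sd / sqrt(n)

# Full interval width for three hypothetical sample sizes, same spread.
widths = {n: 2 * ci_half_width(sd=10.0, n=n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(n, round(width, 2))
```

Quadrupling the sample size halves the width, which is why large gains in precision become progressively more expensive.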

If a dataset has positive skewness, where would you find the majority of the data values?

  • At the mean
  • It is equally distributed
  • To the left of the mean
  • To the right of the mean
If a dataset has positive skewness, the majority of the data values are found to the left of the mean. The distribution has a longer or fatter tail on the right side, and the few large values in that tail pull the mean above the median.
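A small made-up dataset illustrates the pattern. The skewness here is the simple moment-based coefficient m3 / m2^1.5, which is one of several definitions in use.

```python
from statistics import mean, median

def moment_skewness(data):
    """Moment-based skewness: third central moment over variance^1.5."""
    m = mean(data)
    n = len(data)
    m2 = sum((v - m) ** 2 for v in data) / n
    m3 = sum((v - m) ** 3 for v in data) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data: one large value stretches the right tail.
data = [1, 2, 2, 3, 3, 3, 4, 4, 50]

data_mean = mean(data)
data_median = median(data)
share_below_mean = sum(v < data_mean for v in data) / len(data)
skew = moment_skewness(data)
print(data_mean, data_median, share_below_mean, skew)
```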

What does probability measure?

  • Degree of difference between groups
  • Likelihood of an event occurring
  • Speed of an event
  • Strength of a relationship
Probability measures the likelihood or chance of an event occurring, expressed as a number between 0 (impossible) and 1 (certain). It is fundamental to statistics and data science, where it is used to predict outcomes and guide decision-making.

The Mann-Whitney U test can be used when the assumptions of the ________ test are not met.

  • ANOVA
  • F
  • chi-square
  • t
The Mann-Whitney U test can be used as an alternative when the assumptions of the t-test (e.g., normality, homogeneity of variance) are not met.