In a multiple linear regression model, the assumption that the variance of the residuals is the same for all levels of the predictors is known as __________.

  • Autocorrelation
  • Homoscedasticity
  • Linearity
  • Multicollinearity
Homoscedasticity refers to the assumption in regression analysis that the variance of the residuals (or "errors") is constant across all levels of the independent variables.

Simple linear regression is a method used to predict a ________ variable using a ________ variable.

  • continuous, discrete
  • dependent, independent
  • discrete, continuous
  • independent, dependent
Simple linear regression is a statistical method for summarizing and studying the relationship between two quantitative variables. One variable, denoted x, is the predictor, explanatory, or independent variable. The other, denoted y, is the response, outcome, or dependent variable. The model uses x to predict y, so the blanks are "dependent" and "independent", in that order.
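
As a minimal sketch of the idea, the least-squares line can be computed directly from the two variables. The function name and the study-hours data below are made up for illustration and are not part of the question.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: x is the independent variable, y the dependent variable.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 54, 56, 58, 60]

slope, intercept = fit_simple_linear_regression(hours_studied, exam_score)
predicted_score = intercept + slope * 6  # predict y for a new x value
print(slope, intercept, predicted_score)  # 2.0 50.0 62.0
```

Here the independent variable (hours studied) is the input and the dependent variable (exam score) is what the fitted line predicts.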

Can the probability of an event be a negative number?

  • It depends on the event
  • No
  • Only if the event is impossible
  • Yes
The probability of an event cannot be a negative number. By definition, the probability of an event is a number between 0 and 1, inclusive.

If the p-value from a Mann-Whitney U test is less than the significance level, you would ________ the null hypothesis.

  • accept
  • either accept or reject
  • fail to reject
  • reject
If the p-value from a Mann-Whitney U test is less than the significance level (often 0.05), you would reject the null hypothesis and conclude that there is a statistically significant difference between the distributions of the two groups.
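
The decision rule can be sketched as follows. This is an illustrative implementation that assumes the large-sample normal approximation for U with no tie or continuity correction, so its p-values are only rough for small samples; statistical packages usually offer exact or corrected versions. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u_p_value(group_a, group_b):
    """Two-sided p-value from the normal approximation to the U statistic."""
    n_a, n_b = len(group_a), len(group_b)
    # U counts the pairs in which a value from group_a exceeds a value
    # from group_b; tied pairs count as one half.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05  # significance level
control = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
treated = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

p_value = mann_whitney_u_p_value(control, treated)
decision = "reject" if p_value < alpha else "fail to reject"
print(decision)  # reject
```

When the p-value is at or above the significance level, the same rule yields "fail to reject" rather than "accept".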

What does the Law of Large Numbers state?

  • It states that as the size of a sample is increased, the mean value of the sample will get closer to the mean or expected value of the population.
  • It states that if an event is repeated under identical conditions, the probability of the event remains the same.
  • It's a rule which states that the sum of the probabilities of all possible events is 1.
  • It's the law that states the probability of an event is always constant.
The Law of Large Numbers states that as a sample size grows, its mean gets closer to the average of the whole population. In other words, as the number of experiments increases, the actual ratio of outcomes will converge on the theoretical, or expected, ratio of outcomes.
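
A small simulation can illustrate this. The sketch below assumes a fair six-sided die, whose expected value is 3.5, and uses a fixed seed only so that runs are repeatable; the function name is made up for the example.

```python
import random


def mean_of_die_rolls(num_rolls, seed=0):
    """Average of num_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_rolls):
        total += rng.randint(1, 6)
    return total / num_rolls


# The sample mean tends to settle near 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```

The small samples can land noticeably away from 3.5, while the largest one should be very close to it.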

The graphical representation of residuals versus predicted values is known as a ________ plot.

  • Box
  • Histogram
  • Residual
  • Scatter
A residual plot shows the residuals on the vertical axis and the predicted (fitted) values on the horizontal axis; in simple regression the independent variable is sometimes used on the horizontal axis instead. If the points are randomly scattered around the horizontal line at zero, a linear regression model is appropriate for the data; a systematic pattern, such as a curve, suggests that a non-linear model may fit better.
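
To make this concrete, the sketch below computes the two quantities a residual plot displays, using made-up data in which y grows quadratically with x. The function name is hypothetical, and the plotting itself is left out.

```python
def residuals_vs_fitted(x, y):
    """Fit a least-squares line and return (fitted_values, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    # Residual = observed value minus predicted value.
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # illustrative data with a curved relationship

fitted, residuals = residuals_vs_fitted(x, y)
print(fitted)     # [-1.0, 5.0, 11.0, 17.0, 23.0]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. That U-shaped pattern, rather than a random scatter around zero, is the kind of signal that a straight line is missing curvature in the data.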

What can the Mann-Whitney U test tell you about the shape of your distributions?

  • It can confirm if your distributions are normal
  • It can confirm if your distributions are skewed
  • It can confirm if your distributions have equal variances
  • It cannot tell you anything about the shape of your distributions
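The Mann-Whitney U test does not provide information about the shape of the distributions. It is a non-parametric, rank-based test that does not assume the data follow a particular distribution such as the normal distribution, and it does not test for normality, skewness, or equal variances.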

What is the purpose of multiple linear regression analysis?

  • To classify data into different categories
  • To cluster data into different groups
  • To examine the relationship between several independent variables and a dependent variable
  • To predict the outcome of a binary dependent variable
Multiple linear regression analysis is used to understand the relationship between several independent (explanatory) variables and a dependent (response) variable. It can also be used for predicting the mean value of the dependent variable given the values of the independent variables.

What are the implications of the Central Limit Theorem on statistical testing?

  • It asserts that all statistical tests must involve the normal distribution.
  • It eliminates the need for statistical testing.
  • It guarantees that all results of statistical tests will be accurate.
  • It states that sample means will be normally distributed regardless of the shape of the population distribution.
The Central Limit Theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables will be approximately normally distributed, regardless of the shape of the original distribution. This underpins many statistical methods, including hypothesis tests and confidence intervals, which may assume normality of the sampling distribution.
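
The theorem can be illustrated with a simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1, which is strongly skewed, and draws many samples of size 50; the function name, sample sizes, and seed are arbitrary choices for the example.

```python
import random
from statistics import mean, stdev


def simulate_sample_means(sample_size, num_samples, seed=0):
    """Means of num_samples samples drawn from an exponential(1) population."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]


sample_size = 50
means = simulate_sample_means(sample_size, num_samples=2000)

centre = mean(means)   # should be near the population mean, 1.0
spread = stdev(means)  # should be near 1 / sqrt(50), about 0.141

# For a roughly normal distribution, about 68% of values fall
# within one standard deviation of the centre.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(centre, spread, within_one_sd)
```

Even though individual draws are skewed, the sample means cluster symmetrically around the population mean with a spread close to the population standard deviation divided by the square root of the sample size.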

Which type of plot is particularly useful for identifying outliers in a dataset?

  • Bar plot
  • Box plot
  • Histogram
  • Scatter plot
Box plots are particularly useful for identifying outliers in a dataset. A box plot summarizes the distribution of the data with the first quartile, median, and third quartile (the box) and whiskers that extend toward the minimum and maximum. Outliers are typically drawn as individual points beyond the whiskers, which by a common convention reach no further than 1.5 times the interquartile range from the box.
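
The outlier rule that many box plots apply can be sketched as below. It assumes the common 1.5 × IQR convention and the default quartile method of Python's `statistics.quantiles`; other tools may compute quartiles slightly differently. The function name and data are made up for illustration.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [value for value in data if value < lower_fence or value > upper_fence]


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
outliers = iqr_outliers(values)
print(outliers)  # [100]
```

For this data the quartiles are 2.75 and 8.25, so the fences sit at -5.5 and 16.5, and only the value 100 falls outside them.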