What is the key characteristic of a symmetric distribution?

It has a mean of zero
It has a mode at the peak
It has no outliers
It has the same shape on the left and right when split vertically at the center

The key characteristic of a symmetric distribution is that its left and right halves are mirror images when it is split vertically at the center. The frequencies of values at equal distances on either side of the center are equal, so the mean (when it exists) and the median coincide at that center.

Can the probability of an event be a negative number?

It depends on the event
No
Only if the event is impossible
Yes

The probability of an event cannot be a negative number. By definition, the probability of an event is a number between 0 and 1, inclusive.

Simple linear regression is a method used to predict a ________ variable using a ________ variable.

continuous, discrete
dependent, independent
discrete, continuous
independent, dependent

Simple linear regression is a statistical method that allows us to summarize and study the relationship between two continuous (quantitative) variables. One variable, denoted x, is regarded as the predictor, explanatory, or independent variable. The other variable, denoted y, is regarded as the response, outcome, or dependent variable. The method therefore predicts a dependent variable using an independent variable.
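To make the roles of x and y concrete, here is a minimal sketch of a least-squares fit in plain Python. The function name `fit_simple_linear_regression` and the hours/scores data are hypothetical, chosen only for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations divided by sum of squared x deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (independent, x) and exam score (dependent, y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

slope, intercept = fit_simple_linear_regression(hours, scores)
predicted_score = intercept + slope * 6.0  # predict y for a new x value
print(slope, intercept, predicted_score)
```

The independent variable is the input to the fitted line, and the dependent variable is what the line predicts.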

In a multiple linear regression model, the assumption that the variance of the residuals is the same for all levels of the predictors is known as __________.

Autocorrelation
Homoscedasticity
Linearity
Multicollinearity

Homoscedasticity refers to the assumption in regression analysis that the variance of the residuals (or "errors") is constant across all levels of the independent variables.
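As an informal illustration (not a formal statistical test), one could compare the residual variance at low and high values of a predictor. The helper `variance_ratio` and the simulated residuals below are assumptions made for this sketch.

```python
import random
import statistics


def variance_ratio(x, residuals):
    """Ratio of residual variance in the upper half of x to that in the lower half."""
    # Order the residuals by their predictor value, then split in the middle.
    ordered = [r for _, r in sorted(zip(x, residuals))]
    half = len(ordered) // 2
    return statistics.variance(ordered[half:]) / statistics.variance(ordered[:half])


rng = random.Random(0)
x = [i / 10 for i in range(1, 401)]

# Simulated residuals: constant spread (homoscedastic) ...
constant_spread = [rng.gauss(0, 1.0) for _ in x]
# ... versus spread that grows with x (heteroscedastic).
growing_spread = [rng.gauss(0, 0.1 * xi) for xi in x]

ratio_constant = variance_ratio(x, constant_spread)
ratio_growing = variance_ratio(x, growing_spread)
print(ratio_constant, ratio_growing)
```

A ratio near 1 is consistent with homoscedasticity, while a ratio far from 1 suggests the residual variance changes with the predictor.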

What types of scales of measurement are suitable for non-parametric tests?

Nominal, ordinal, interval, and ratio
Only interval and ratio
Only nominal and ordinal
Only ratio

Non-parametric tests can be used with nominal, ordinal, interval, and ratio scales of measurement. This is one of the reasons why non-parametric tests are sometimes chosen over parametric ones, as they can handle data that are not interval or ratio (which are required for many parametric tests).

The ________ score is a measure of how close each point in one cluster is to the points in the neighboring clusters.

boundary
distance
proximity
silhouette

The silhouette score measures how similar each point is to its own cluster compared with the nearest neighboring cluster. It ranges from -1 (points are likely assigned to the wrong cluster) to +1 (dense, well-separated clusters). Values near 0 indicate overlapping clusters.
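The score can be sketched in plain Python under the usual definition: for each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to any other cluster, and the coefficient is (b - a) / max(a, b). The function below and its toy data are illustrative assumptions, using Euclidean distance.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for c in clusters:
            dists = [math.dist(p, q) for j, q in enumerate(points)
                     if labels[j] == c and j != i]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)
        own = labels[i]
        if own not in mean_dist:
            # Singleton cluster: the coefficient is taken to be 0.
            scores.append(0.0)
            continue
        a = mean_dist[own]
        b = min(d for c, d in mean_dist.items() if c != own)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: two tight, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_score = silhouette_score(points, good_labels)
bad_score = silhouette_score(points, bad_labels)
print(good_score, bad_score)
```

The labeling that matches the two groups scores close to +1, while the mixed labeling scores much lower.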

What does the Law of Large Numbers state?

It states that as the size of a sample is increased, the mean value of the sample will get closer to the mean or expected value of the population.
It states that if an event is repeated under identical conditions, the probability of the event remains the same.
It's a rule which states that the sum of the probabilities of all possible events is 1.
It's the law that states the probability of an event is always constant.

The Law of Large Numbers states that as a sample size grows, its mean gets closer to the average of the whole population. In other words, as the number of experiments increases, the actual ratio of outcomes will converge on the theoretical, or expected, ratio of outcomes.
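A small simulation can illustrate this. The sketch below rolls a fair six-sided die, whose expected value is 3.5, and reports the sample mean for increasing numbers of rolls. The function name and the seed are arbitrary choices for the example.

```python
import random


def sample_mean_of_die_rolls(n_rolls, seed=0):
    """Mean of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)  # each face 1..6 is equally likely
    return total / n_rolls


# The expected value of a fair die is (1 + 2 + ... + 6) / 6 = 3.5.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die_rolls(n))
```

With more rolls, the printed mean should tend to sit closer to 3.5, although any single run can deviate.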

If the p-value from a Mann-Whitney U test is less than the significance level, you would ________ the null hypothesis.

accept
either accept or reject
fail to reject
reject

If the p-value from a Mann-Whitney U test is less than the significance level (often 0.05), you would reject the null hypothesis, suggesting there is a significant difference between the groups.
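The decision rule can be sketched as follows. This is a simplified illustration, not a replacement for a statistics library: it computes U by counting pairs and uses the normal approximation for the two-sided p-value, with no tie correction and no continuity correction. The function name and the two samples are hypothetical.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U, two-sided p-value) using the normal approximation.

    Sketch only: no tie correction in the variance, no continuity correction.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # U for sample_a: number of (a, b) pairs with a > b, counting ties as 0.5.
    u_a = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in sample_a for b in sample_b)
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, p_value


# Hypothetical measurements from two independent groups.
group_a = [1.1, 2.3, 2.9, 3.4, 3.8, 4.0, 4.4, 5.1]
group_b = [6.2, 6.8, 7.1, 7.5, 8.0, 8.3, 9.1, 9.6]

alpha = 0.05  # significance level
u, p_value = mann_whitney_u(group_a, group_b)
decision = "reject" if p_value < alpha else "fail to reject"
print(u, p_value, decision)
```

Because every value in the first group is smaller than every value in the second, U is 0 and the p-value falls below 0.05, so the null hypothesis is rejected.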

What is the purpose of multiple linear regression analysis?

To classify data into different categories
To cluster data into different groups
To examine the relationship between several independent variables and a dependent variable
To predict the outcome of a binary dependent variable

Multiple linear regression analysis is used to understand the relationship between several independent (explanatory) variables and a dependent (response) variable. It can also be used for predicting the mean value of the dependent variable given the values of the independent variables.

What can the Mann-Whitney U test tell you about the shape of your distributions?

It can confirm if your distributions are normal
It can confirm if your distributions are skewed
It can confirm if your distributions have equal variances
It cannot tell you anything about the shape of your distributions

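The Mann-Whitney U test does not provide information about the shape of the distributions. It is a non-parametric, rank-based test that does not assume a particular distributional form such as normality, so it cannot confirm whether the distributions are normal, skewed, or have equal variances.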

The graphical representation of residuals versus predicted values is known as a ________ plot.

Box
Histogram
Residual
Scatter

A residual plot is a graph that shows the residuals on the vertical axis and the predicted (fitted) values, or an independent variable, on the horizontal axis. If the points in a residual plot are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data. A systematic pattern, such as a curve, suggests that a non-linear model may be more appropriate.
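The values behind such a plot can be computed as in the sketch below, which fits a straight line to deliberately curved data and lists each fitted value with its residual. The helper `fit_line` and the data are assumptions for illustration. No plotting library is used, so the pairs are printed instead of drawn.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data with a curved (quadratic) relationship.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]       # horizontal axis of the plot
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # vertical axis of the plot

for fi, ri in zip(fitted, residuals):
    print(f"fitted={fi:6.2f}  residual={ri:6.2f}")
```

Here the residuals are positive at both ends and negative in the middle, a systematic U-shaped pattern rather than random scatter, which signals that a straight line does not fit these data well.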

What does a larger sample size do to the sampling distribution of the mean?

It decreases the spread of the distribution
It does not affect the distribution
It increases the spread of the distribution
It skews the distribution

A larger sample size decreases the spread of the sampling distribution of the mean. This is because as the sample size increases, the standard error (a measure of the spread of the distribution of sample means) decreases, which means that the sampling distribution becomes more concentrated around the true population mean.
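This effect can be checked with a short simulation. The sketch below draws many samples from a normal population (mean 50 and standard deviation 10, both arbitrary choices) and measures the standard deviation of the resulting sample means, which should be close to the theoretical standard error of 10 divided by the square root of the sample size.

```python
import random
import statistics


def spread_of_sample_means(sample_size, n_samples=2000, seed=0):
    """Standard deviation of the means of n_samples simulated samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(50, 10) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# Theoretical standard error: 10 / sqrt(4) = 5 and 10 / sqrt(100) = 1.
spread_small = spread_of_sample_means(4)
spread_large = spread_of_sample_means(100)
print(spread_small, spread_large)
```

The spread for samples of size 100 comes out much smaller than for samples of size 4, in line with the standard error shrinking as the sample size grows.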
