What are the ways to check the assumptions of an ANOVA test?

By calculating the F-statistic
By calculating the mean and variance of each group
By checking normality of residuals, homogeneity of variance, and independence of observations
By conducting post-hoc tests

The assumptions of an ANOVA test can be checked by: 1. checking the normality of residuals using a normal probability (Q-Q) plot or a statistical test such as the Shapiro-Wilk test; 2. checking the homogeneity of variance using Levene's test or Bartlett's test; 3. checking the independence of observations, which is usually established by the study design (random sampling, random assignment) rather than by a statistical test.
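As a rough illustration of the first two checks, the sketch below runs a Shapiro-Wilk test on the residuals and Levene's test on the groups. It assumes SciPy is installed; the three groups and their values are hypothetical, generated only for the example.

```python
import random
from statistics import mean

from scipy import stats

random.seed(0)  # reproducible illustrative data

# Hypothetical data: three groups of 30 observations each
groups = {
    "A": [random.gauss(10.0, 2.0) for _ in range(30)],
    "B": [random.gauss(11.0, 2.0) for _ in range(30)],
    "C": [random.gauss(12.0, 2.0) for _ in range(30)],
}

# Residuals of a one-way ANOVA: each observation minus its own group mean
residuals = [x - mean(values) for values in groups.values() for x in values]

# 1. Normality of residuals (Shapiro-Wilk)
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# 2. Homogeneity of variance (Levene's test, centered on the median by default)
levene_stat, levene_p = stats.levene(*groups.values())

print(f"Shapiro-Wilk p = {shapiro_p:.3f}")
print(f"Levene p = {levene_p:.3f}")
# 3. Independence is not tested here; it depends on how the data were collected.
```

A small p-value in either test suggests the corresponding assumption may be violated; a large p-value does not prove that the assumption holds.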


How does the Spearman’s Rank Correlation test handle ties in data ranks?

Assigns the maximum rank to ties
Assigns the minimum rank to ties
Averages the tied ranks
Ignores the tied ranks

The Spearman’s Rank Correlation test handles ties by averaging the tied ranks. For example, if two values tie for second and third place, each is assigned a rank of 2.5, the average of those places.
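The averaging rule can be sketched in a few lines of plain Python. The helper `average_ranks` and the sample scores are hypothetical names and values chosen for illustration, not part of any library.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the places they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of places start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


scores = [10, 20, 20, 30]
print(average_ranks(scores))  # [1.0, 2.5, 2.5, 4.0]
```

The two values of 20 occupy places 2 and 3, so each receives rank 2.5.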


When is it more appropriate to use the Mann-Whitney U test than a t-test?

When data is normally distributed
When data is not normally distributed
When sample sizes are equal
When the variances of the two groups are equal

The Mann-Whitney U test is more appropriate than a t-test when the data are not normally distributed, particularly with small samples or ordinal data. It is a non-parametric alternative to the independent-samples t-test: it compares the ranks of the observations and does not assume normality.
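One way to see why the test does not depend on normality is that the U statistic uses only the ordering of the observations. The sketch below computes U by direct pair counting; the function name `mann_whitney_u` and the data are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """U statistic for x: number of (x, y) pairs where x is larger, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


control = [1, 2, 4, 6]
treated = [3, 5, 8, 9]
treated_with_outlier = [3, 5, 8, 1000]  # same ordering, one extreme value

print(mann_whitney_u(treated, control))               # 13.0
print(mann_whitney_u(treated_with_outlier, control))  # 13.0
```

Replacing 9 with 1000 leaves U unchanged, whereas the same change would shift the group mean used by a t-test considerably.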


In the Kruskal-Wallis Test, if the p-value is less than the chosen significance level, we ________ the null hypothesis.

accept
consider
ignore
reject

If the p-value is less than the chosen significance level in the Kruskal-Wallis test, we reject the null hypothesis that all groups come from the same distribution. This means there is enough evidence to suggest that at least one of the groups differs from the others.
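The decision rule can be illustrated with a simplified sketch. It assumes exactly three groups with no tied values, and it uses the chi-square approximation with 2 degrees of freedom, for which the p-value has the closed form exp(-H/2). The function name and the data are hypothetical.

```python
import math


def kruskal_wallis_three_groups(g1, g2, g3):
    """H statistic and approximate p-value for three groups with no tied values."""
    groups = [g1, g2, g3]
    pooled = sorted(x for g in groups for x in g)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # valid only without ties
    n_total = len(pooled)
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        sum(rank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)
    p_value = math.exp(-h / 2)  # chi-square survival function, 2 degrees of freedom
    return h, p_value


alpha = 0.05
h, p_value = kruskal_wallis_three_groups(
    [1.1, 2.3, 2.9, 3.4, 4.0],
    [5.2, 6.1, 6.8, 7.5, 8.3],
    [9.0, 9.7, 10.4, 11.2, 12.5],
)
decision = "reject" if p_value < alpha else "fail to reject"
print(f"H = {h:.2f}, p = {p_value:.4f}: {decision} the null hypothesis")
```

For real analyses with ties or a different number of groups, a library routine such as `scipy.stats.kruskal` is the more appropriate tool.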


What is the main difference between a population and a sample?

A population can only consist of people
A population is always smaller than a sample
A sample is a subset of a population
A sample is always larger than a population

The main difference between a population and a sample is that a sample is a subset of a population. A population refers to the entire group of individuals or observations that we're interested in, while a sample is a smaller group that's been selected from that population.


What strategies can be employed to reduce both Type I and Type II errors?

Decrease sample size, use a more lenient significance level
Decrease sample size, use a more stringent significance level
Increase sample size, use a more lenient significance level
Increase sample size, use a more stringent significance level

Increasing the sample size raises the power of the test, which reduces the probability of a Type II error at a fixed significance level. A more stringent significance level (lower α) reduces the probability of a Type I error. At a fixed sample size, however, decreasing the Type I error probability increases the Type II error probability, and vice versa; this is known as the Type I/Type II tradeoff. Combining a larger sample with a lower α is therefore the way to reduce both.
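The tradeoff can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of H0: μ = 0 with known σ = 1 and a true mean of 0.5; these numbers and the function name `type_ii_error` are illustrative assumptions, not values from the question.

```python
from statistics import NormalDist


def type_ii_error(n, alpha, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 when the true mean is `effect`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    power = 1 - z.cdf(critical - effect * n ** 0.5 / sigma)
    return 1 - power


baseline = type_ii_error(n=20, alpha=0.05)
stricter_same_n = type_ii_error(n=20, alpha=0.01)   # lower alpha, same sample size
stricter_larger_n = type_ii_error(n=80, alpha=0.01)  # lower alpha, larger sample

print(f"n=20, alpha=0.05: beta = {baseline:.3f}")
print(f"n=20, alpha=0.01: beta = {stricter_same_n:.3f}")
print(f"n=80, alpha=0.01: beta = {stricter_larger_n:.3f}")
```

Lowering α alone raises β, while lowering α and enlarging the sample together brings both error probabilities below the baseline.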


What does the interquartile range in a box plot represent?

The middle 50% of the data
The range of the top 25% of the data
The range within one standard deviation from the mean
The total range of the dataset

The interquartile range (IQR) in a box plot represents the middle 50% of the data. It is the range within which the central half of the values fall and is calculated as the difference between the third quartile (Q3) and the first quartile (Q1).
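A minimal sketch of the calculation, using Python's standard `statistics.quantiles`. The data are hypothetical, and the `"inclusive"` method is only one of several quartile conventions; other conventions can give slightly different values.

```python
from statistics import quantiles

data = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # hypothetical observations

# quantiles() sorts the data internally; n=4 returns the three quartile cut points
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 3.0 7.0 4.0
```

In a box plot, Q1 and Q3 are the edges of the box, so the IQR is the length of the box.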


What is the purpose of non-parametric statistical methods?

To analyze data without making assumptions about the population distribution
To make the calculation process more complex
To provide less accurate results
To use less data in the analysis

Non-parametric statistical methods are used to analyze data without assuming that the population follows a particular distribution, such as the normal distribution. Many of these tests are based on ranks or medians rather than on means.


A value of 0 in Pearson's Correlation Coefficient means there's no ________ correlation between the two variables.

linear
negative
perfect
visible

A value of 0 in Pearson's Correlation Coefficient means there's no linear correlation between the two variables. This does not necessarily mean there is no relationship at all; the relationship could be nonlinear.
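A short sketch of a nonlinear relationship with zero Pearson correlation: here y is completely determined by x (y = x²), yet the coefficient is 0 because x is symmetric around zero. The helper `pearson_r` and the data are written for this example only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient computed directly from its definition."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]  # perfect quadratic dependence on x

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```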


How is a probability distribution defined?

It is the average value of a dataset
It is the highest and lowest value of a dataset
It is the likelihood of each possible outcome of a random variable
It is the spread of possible values in a dataset

A probability distribution is a mathematical function that gives the probabilities of the different possible outcomes of an experiment. For a discrete random variable, it assigns a probability to each value the variable can take; for a continuous random variable, it is described by a density that assigns probabilities to ranges of values.
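As an illustrative example of a discrete probability distribution, the sketch below enumerates the sum of two fair six-sided dice. The dice experiment is an assumption chosen for the example; exact fractions are used so the probabilities sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes produce each total
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Map each possible value of the random variable to its probability
distribution = {total: Fraction(c, 36) for total, c in sorted(counts.items())}

for total, prob in distribution.items():
    print(total, prob)

print(sum(distribution.values()))  # 1
```

Each possible total from 2 to 12 is paired with its likelihood, which matches the definition given above.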
