When is it more appropriate to use the Mann-Whitney U test than a t-test?

When data is normally distributed
When data is not normally distributed
When sample sizes are equal
When the variances of the two groups are equal

The Mann-Whitney U test is more appropriate than a t-test when the data are not normally distributed. It is a non-parametric alternative to the independent-samples t-test: it works on the relative ordering (ranks) of the observations rather than their raw values, so it does not assume normality.
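
As a rough illustration of why normality does not matter here, the sketch below computes the U statistic by direct pair counting. The function name `mann_whitney_u` and the two samples are made up for this example; in practice a library routine such as `scipy.stats.mannwhitneyu` would also supply the p-value.

```python
def mann_whitney_u(x, y):
    """U for sample x: number of (xi, yj) pairs with xi > yj; ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical data: group_a contains one extreme value (15.0).
group_a = [1.2, 1.9, 2.3, 2.8, 15.0]
group_b = [3.1, 3.4, 4.0, 4.4, 5.2]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)

# Making the extreme value far more extreme leaves U unchanged,
# because only the ordering of the observations is used.
u_a_extreme = mann_whitney_u([1.2, 1.9, 2.3, 2.8, 1500.0], group_b)

print(u_a, u_b, u_a_extreme)
```

The two U values always sum to the number of pairs (here 5 × 5), which is a quick sanity check on the calculation.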

In the Kruskal-Wallis Test, if the p-value is less than the chosen significance level, we ________ the null hypothesis.

accept
consider
ignore
reject

If the p-value is less than the chosen significance level in the Kruskal-Wallis Test, we reject the null hypothesis. This means there is enough evidence to conclude that at least one group differs from the others, although the test itself does not say which one.
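
The decision rule can be sketched with a hand-rolled H statistic. This is a minimal sketch under stated assumptions: the data are invented, there are no tied values (so no tie correction is applied), and with three groups the chi-square approximation has 2 degrees of freedom, for which the upper-tail probability is exp(-H/2). A library routine such as `scipy.stats.kruskal` handles the general case.

```python
import math

def kruskal_wallis_h(*groups):
    """H statistic, assuming no tied values (no tie correction applied)."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1 = smallest value
    n_total = len(pooled)
    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

# Hypothetical measurements for three independent groups.
group_1 = [2.1, 3.4, 1.9, 2.8]
group_2 = [4.5, 5.1, 4.9, 5.6]
group_3 = [7.2, 6.8, 7.9, 8.1]

h = kruskal_wallis_h(group_1, group_2, group_3)
p_value = math.exp(-h / 2)  # chi-square upper tail, valid for df = 2 only
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(round(h, 3), round(p_value, 4), decision)
```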

What is the main difference between a population and a sample?

A population can only consist of people
A population is always smaller than a sample
A sample is a subset of a population
A sample is always larger than a population

The main difference between a population and a sample is that a sample is a subset of a population. A population refers to the entire group of individuals or observations that we're interested in, while a sample is a smaller group that's been selected from that population.
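
A small sketch of the relationship, using an invented population of 10,000 numeric IDs and a simple random sample of 100 drawn without replacement:

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: every unit we are interested in.
population = list(range(1, 10001))

# A sample is a subset of the population, here chosen at random.
sample = random.sample(population, k=100)

is_subset = set(sample) <= set(population)
print(len(population), len(sample), is_subset)
print(mean(population), mean(sample))  # the sample mean estimates the population mean
```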

What strategies can be employed to reduce both Type I and Type II errors?

Decrease sample size, use a more lenient significance level
Decrease sample size, use a more stringent significance level
Increase sample size, use a more lenient significance level
Increase sample size, use a more stringent significance level

Increasing the sample size raises the power of the test, which lowers the Type II error probability (β) at any fixed significance level. A more stringent significance level (lower α) lowers the Type I error probability. At a fixed sample size the two trade off: decreasing the Type I error probability increases the Type II error probability, and vice versa. This is known as the Type I/Type II tradeoff. Combining a larger sample with a lower α is what allows both to be reduced together.
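
The tradeoff can be made concrete for one simple case. The sketch below assumes a one-sided z-test of H0: μ = 0 against a true mean of 0.5 with known σ = 1; these numbers and the helper name `type_ii_error` are illustrative choices, not part of the question.

```python
from statistics import NormalDist

def type_ii_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 when the true mean is `effect`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection threshold on the z scale
    return z.cdf(critical - effect * n ** 0.5 / sigma)

beta_baseline = type_ii_error(alpha=0.05, n=20)
beta_strict_same_n = type_ii_error(alpha=0.01, n=20)    # lower alpha, same n
beta_strict_larger_n = type_ii_error(alpha=0.01, n=80)  # lower alpha, larger n

print(beta_baseline, beta_strict_same_n, beta_strict_larger_n)
```

With the sample size held fixed, tightening α pushes β up; enlarging the sample as well brings β below its baseline value, so both error probabilities end up smaller.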

What does the interquartile range in a box plot represent?

The middle 50% of the data
The range of the top 25% of the data
The range within one standard deviation from the mean
The total range of the dataset

The interquartile range (IQR) in a box plot represents the middle 50% of the data. It is the range within which the central half of the values fall and is calculated as the difference between the third quartile (Q3) and the first quartile (Q1).
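
A brief sketch of the calculation using Python's standard `statistics.quantiles`. The data are invented, and quartile conventions differ slightly between tools, so other software may report marginally different cut points for the same data.

```python
from statistics import quantiles

# Hypothetical dataset of 12 observations.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

q1, q2, q3 = quantiles(data, n=4)  # default "exclusive" method
iqr = q3 - q1

# Values that fall inside the box of the box plot.
inside_box = [v for v in data if q1 <= v <= q3]

print(q1, q3, iqr, len(inside_box) / len(data))
```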

In what type of data distribution is the mean usually greater than the median?

Negatively skewed distribution
Normal distribution
Positively skewed distribution
Uniform distribution

In a positively skewed distribution, the mean is usually greater than the median. A positive skew means the right tail of the distribution is longer or fatter. The mean is sensitive to extreme values, so the large values in the right tail drag it in the direction of the tail, while the median depends only on the middle of the ordered data and barely moves. Hence the mean is typically greater than the median in a positively skewed distribution.
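
A quick numeric check with an invented right-skewed dataset (one very large value in the tail):

```python
from statistics import mean, median

# Hypothetical incomes in thousands; 250 forms the long right tail.
incomes = [30, 32, 35, 38, 40, 45, 250]

income_mean = mean(incomes)
income_median = median(incomes)

print(income_mean, income_median)
```

The single large value pulls the mean well above the median, which stays at the middle observation.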

A distribution that is symmetric and bell-shaped is known as a _______ distribution.

Bimodal
Normal
Skewed
Uniform

A normal distribution, also known as the Gaussian distribution, is symmetric and bell-shaped. It is fully characterized by its mean and standard deviation. The mean, median and mode are all equal and are located at the center of the distribution.
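
These properties can be checked with the standard library's `statistics.NormalDist`. The parameters (mean 100, standard deviation 15) are arbitrary values chosen for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)

# Mean, median and mode coincide at the center.
center_values = (dist.mean, dist.median, dist.mode)

# Symmetry: the density is the same at equal distances either side of the mean.
density_left = dist.pdf(100 - 20)
density_right = dist.pdf(100 + 20)

print(center_values, density_left, density_right)
```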

Data that can be divided into categories but has no order or priority is known as ________ data.

Continuous
Discrete
Nominal
Ordinal

Nominal data is data that can be divided into categories but has no order or priority. It is a type of categorical data that simply allows us to classify or categorize. Examples include types of cuisine (Italian, Chinese, Mexican, etc.), hair color, or city of residence.

When two events are mutually exclusive, what is the probability that both will occur?

0
0.5
1
The sum of the probabilities of the two events

When two events are mutually exclusive, it means they cannot occur at the same time. Therefore, the probability that both will occur is 0.
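
As a worked example, take one roll of a fair six-sided die with two invented events, "roll is 1 or 2" and "roll is 5 or 6". They share no outcomes, so their intersection is empty:

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

low = {1, 2}   # hypothetical event A
high = {5, 6}  # hypothetical event B

p_both = prob(low & high)    # both occur: empty intersection
p_either = prob(low | high)  # at least one occurs

print(p_both, p_either)
```

Because the events cannot happen together, the probability of either occurring is simply the sum of their individual probabilities.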

The ___________ correlation is a non-parametric measure of correlation based on data rank.

Kendall's
Pearson's
Point-biserial
Spearman's

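Spearman's correlation is a non-parametric measure of rank correlation: it is Pearson's correlation computed on the ranks of the data rather than on the raw values. It assesses how well the relationship between two variables can be described using a monotonic function. This makes it suitable for both continuous and discrete ordinal variables. Kendall's tau is also rank-based, but it counts concordant and discordant pairs instead of correlating the ranks directly.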
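
A minimal sketch of the idea, assuming no tied values (ties would need averaged ranks): rank each variable, then apply the ordinary Pearson formula to the ranks. The helper names and data are invented; `scipy.stats.spearmanr` is the usual library route.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Ranks starting at 1, assuming all values are distinct."""
    order = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [order[v] for v in values]

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman(x, y)
r = pearson(x, y)

print(rho, r)
```

Since the relationship is perfectly monotonic, the rank-based coefficient reaches its maximum, while the linear coefficient falls short of it.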
