What does a 95% confidence interval estimate?

  • The mean of the sample
  • The range within which 95% of the data points lie
  • The standard deviation of the population
  • The true population parameter with a 95% level of confidence
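A 95% confidence interval estimates the true population parameter. It is a range computed from the sample by a procedure that, over repeated sampling, captures the true parameter 95% of the time. It is not the range that contains 95% of the data points, and it does not estimate the sample mean, which is already known.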
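
As a minimal sketch of how such an interval might be computed for a population mean, the snippet below uses the normal approximation (sample mean plus or minus z times the standard error). The function name `mean_confidence_interval` and the sample values are made up for illustration; for small samples a t-based interval would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))  # z times the standard error
    center = mean(sample)
    return center - margin, center + margin

# Hypothetical measurements, made up for illustration.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The interval is centered on the sample mean, but what it estimates is the unknown population mean.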

In a Chi-square test for independence, small expected frequencies can lead to a ________ Chi-square value.

  • constant
  • larger
  • smaller
  • zero
In a Chi-square test for independence, small expected frequencies can lead to a larger Chi-square value. Each cell contributes (observed − expected)² / expected to the statistic, so dividing by a small expected frequency inflates the total. This can produce a significant result even when there is no substantial relationship between the variables, which is why a common guideline asks for expected frequencies of at least 5 in each cell.
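
The inflation can be seen by looking at a single cell's contribution to the statistic, (observed − expected)² / expected. The sketch below uses a hypothetical helper `cell_contribution` and made-up counts: the same absolute deviation of 2 contributes far more when the expected frequency is small.

```python
def cell_contribution(observed, expected):
    """Contribution of one cell to the Chi-square statistic."""
    return (observed - expected) ** 2 / expected

# Same deviation (observed - expected = 2), different expected frequencies.
small_e = cell_contribution(observed=3, expected=1)    # 4 / 1
large_e = cell_contribution(observed=22, expected=20)  # 4 / 20

print(small_e, large_e)
```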

What is the purpose of hypothesis testing in statistics?

  • To compare the sample mean to the population mean
  • To make inferences about a population based on sample data
  • To understand the distribution of the data
  • To visualize the data
The purpose of hypothesis testing is to make inferences about a population based on sample data. It is an inferential procedure for deciding whether the observed results can plausibly be explained by chance variation under the null hypothesis or whether they reflect a real effect.

The p-value in a hypothesis test is the probability of getting a sample statistic at least as extreme as the observed test statistic, given that the _______ hypothesis is true.

  • Alternative
  • Null
  • Original
  • Random
In the context of hypothesis testing, the p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming that the null hypothesis is true. It is not the probability that the null hypothesis itself is true.
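
For a concrete case, the sketch below computes a two-sided p-value for a z statistic, assuming the test statistic follows a standard normal distribution under the null hypothesis. The function name `two_sided_p_value` is illustrative only.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(1.96)
print(f"p-value for z = 1.96: {p:.4f}")  # close to 0.05
```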

What is the null hypothesis in the Mann-Whitney U test?

  • The groups have different variances
  • The groups have equal variances
  • There is a significant difference between the groups
  • There is no significant difference between the groups
In the Mann-Whitney U test, the null hypothesis is that there is no difference between the groups, that is, both samples come from the same distribution. More specifically, under the null hypothesis the probability that a randomly selected value from the first group is greater than a randomly selected value from the second group is 0.5 (ignoring ties).
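
The link between the U statistic and that probability can be sketched by counting pairs directly: U for the first group is the number of (first, second) pairs in which the first value is larger, with ties counted as one half, and U divided by the number of pairs estimates the probability. The function `mann_whitney_u` and the data are hypothetical; this brute-force count computes the statistic only, not the p-value.

```python
def mann_whitney_u(first, second):
    """U statistic for the first group, counting ties as 0.5."""
    u = 0.0
    for x in first:
        for y in second:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Made-up samples for illustration.
first = [3, 5, 7]
second = [2, 4, 6]

u = mann_whitney_u(first, second)
prob_first_greater = u / (len(first) * len(second))
print(u, prob_first_greater)
```

Under the null hypothesis the estimated probability should be near 0.5; values far from 0.5 are evidence against it.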

How does sample size affect the width of a confidence interval?

  • Increasing the sample size decreases the width of the confidence interval
  • Increasing the sample size has no effect on the width of the confidence interval
  • Increasing the sample size increases the width of the confidence interval
  • The relationship between sample size and the width of the confidence interval is unpredictable
Increasing the sample size decreases the width of the confidence interval. The larger the sample size, the more information you have and the less uncertainty remains: the standard error shrinks in proportion to 1/√n, so the interval narrows. Quadrupling the sample size, for example, halves the width.
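
The effect can be checked numerically. The sketch below assumes a known population standard deviation `sigma` and a normal-based interval, so the width is 2 × z × sigma / √n; the function name `ci_width` and the values are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence=0.95):
    """Width of a normal-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / sqrt(n)

# Width for increasing sample sizes, with sigma fixed at 10.
widths = {n: ci_width(sigma=10.0, n=n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(f"n = {n:3d}: width = {width:.2f}")
```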

If A and B are independent events, the probability of both occurring is ________.

  • P(A + B)
  • P(A / B)
  • P(A ∩ B)
  • P(A ∪ B)
The probability of both events occurring is P(A ∩ B). If A and B are independent, this equals P(A) × P(B). This product rule is the defining property of independent events in probability.
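
The product rule can be verified exactly on a small example. The sketch below enumerates all outcomes of two fair dice; the events chosen (first die even, second die greater than 4) are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(outcome):
    return outcome[0] % 2 == 0  # first die is even

def b(outcome):
    return outcome[1] > 4       # second die shows 5 or 6

p_a = probability(a)
p_b = probability(b)
p_both = probability(lambda o: a(o) and b(o))
print(p_a, p_b, p_both)
```

Because the two events depend on different dice, they are independent and the joint probability equals the product of the individual ones.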

Why is it important to consider the power of a test when designing a study?

  • To ensure the study can detect an effect if it exists
  • To ensure the study does not detect an effect if it does not exist
  • To maximize the chance of a Type I error
  • To minimize the chance of a Type I error
The power of a test is the ability of the test to detect an effect if it truly exists. It's the probability that the test correctly rejects a false null hypothesis. High power is desirable because it means the test is less likely to make a Type II error (false negative). When designing a study, it's important to choose a sample size and significance level that will provide enough power to detect an effect if one exists.
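
To illustrate how sample size drives power, the sketch below uses a one-sided, one-sample z-test with known standard deviation, where power = 1 − Φ(z_α − d√n) and d is the standardized effect size. The function name `z_test_power` and the chosen values are assumptions for illustration; other tests need different formulas.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test with known standard deviation."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # critical value under the null
    return 1 - std_normal.cdf(z_alpha - effect_size * sqrt(n))

for n in (10, 20, 40):
    print(f"n = {n}: power = {z_test_power(effect_size=0.5, n=n):.3f}")
```

With no true effect (effect size 0) the rejection probability reduces to the significance level, which is the Type I error rate.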

_______ is a measure of how spread out the numbers in a dataset are around the mean.

  • Median
  • Range
  • Standard Deviation
  • Variance
Standard deviation is a measure of how spread out the numbers in a dataset are around the mean. It is the square root of the variance, that is, the square root of the average squared distance between each data point and the mean, and it is expressed in the same units as the data. The higher the standard deviation, the more spread out the data is.
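
A short sketch with made-up data shows the calculation. It uses the population standard deviation (`statistics.pstdev`) and, for comparison, the mean absolute deviation, which is a different and generally smaller measure of spread.

```python
from statistics import mean, pstdev

# Made-up data for illustration; the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(data)
std_dev = pstdev(data)  # square root of the mean squared deviation
mean_abs_dev = sum(abs(x - center) for x in data) / len(data)

print(center, std_dev, mean_abs_dev)
```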

In the context of cluster analysis, what is the 'centroid'?

  • The average distance between clusters
  • The geometric center of a cluster
  • The largest point in a cluster
  • The smallest point in a cluster
The centroid is the geometric center of a cluster. In other words, it is the coordinate-wise mean of all the points in that cluster.
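
As a minimal sketch, the centroid can be computed by averaging each coordinate separately. The function name `centroid` and the points are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Made-up 2-D cluster for illustration.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
center = centroid(cluster)
print(center)
```

Note that the centroid need not coincide with any actual data point in the cluster.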