What are the potential issues with the K-means clustering method?

  • It cannot handle non-spherical clusters
  • It does not work well with non-numeric data
  • It is sensitive to outliers
  • All the options
K-means has several well-known limitations, so all of the listed options apply. It relies on means and distances, so it doesn't work well with non-numeric data. It's sensitive to outliers, since a single extreme point can pull a cluster centroid away from the bulk of its points. And it has difficulty with clusters that are non-spherical or that differ in size and density.
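
The outlier sensitivity comes from the centroid being a plain mean. The sketch below is not a K-means implementation; it only shows how one extreme point moves a centroid. The `centroid` helper and the sample points are hypothetical, chosen for illustration.

```python
def centroid(points):
    """Return the coordinate-wise mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Hypothetical cluster: four points packed around (1.5, 1.5).
cluster = [(1, 1), (2, 1), (1, 2), (2, 2)]
# The same cluster with one far-away point assigned to it.
cluster_with_outlier = cluster + [(20, 20)]

clean_centroid = centroid(cluster)
shifted_centroid = centroid(cluster_with_outlier)
print(clean_centroid, shifted_centroid)
```

A single outlier moves the centroid from (1.5, 1.5) to roughly (5.2, 5.2), which is outside the region where the original four points lie.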

In the context of a scatter plot, what does a positive slope indicate?

  • The correlation between the variables is weak
  • The variables are negatively correlated
  • The variables are positively correlated
  • The variables are unrelated
A positive slope in a scatter plot suggests that the two variables are positively correlated. This means as one variable increases, the other variable also tends to increase.
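
As a rough illustration, the slope of the least-squares line through a scatter plot always has the same sign as the Pearson correlation, because slope = r × (s_y / s_x). The helper name and the hours/scores data below are made up for this sketch.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

slope = least_squares_slope(hours, scores)
print(slope)  # positive: scores tend to rise as hours rise
```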

What is the impact of PCA on the interpretability of the original features?

  • It depends on the data
  • It doesn't affect interpretability
  • It enhances interpretability
  • It reduces interpretability
PCA typically reduces the interpretability of the original features. This is because each principal component is a linear combination of all the original features, making it difficult to understand how individual features affect the outcome.
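
To see why components are hard to interpret, here is a minimal sketch of PCA on two features using only the standard library. It relies on the closed-form eigen-decomposition of a 2×2 covariance matrix and assumes the two features have non-zero covariance; the function name and data are hypothetical. In practice a library routine would be used instead.

```python
import math

def first_principal_component(xs, ys):
    """Unit-length weights (w_x, w_y) of the first principal component of 2-D data.

    Assumes the covariance between xs and ys is non-zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # A matching eigenvector is (b, lam - a); normalise it to unit length.
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# Hypothetical, perfectly correlated features.
feature_1 = [1, 2, 3, 4, 5]
feature_2 = [2, 4, 6, 8, 10]

w1, w2 = first_principal_component(feature_1, feature_2)
print(w1, w2)
```

Both weights are non-zero, so the first component is a blend of the two features (PC1 = w1·feature_1 + w2·feature_2) rather than either feature on its own.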

What is the primary application of Bayes' Theorem in statistics?

  • To calculate the mean of a data set
  • To calculate the standard deviation
  • To determine if two events are independent
  • To update prior beliefs given new data
Bayes' Theorem is primarily used to update prior beliefs given new data. It's a way to go from a prior probability to a posterior probability, which incorporates the new evidence and is therefore a better-informed estimate.
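
A small worked example may help. The sketch below applies Bayes' Theorem to a diagnostic test; the function name and all three input numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are assumptions chosen for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | positive evidence) via Bayes' Theorem.

    prior               -- P(hypothesis) before seeing the evidence
    likelihood          -- P(evidence | hypothesis)
    false_positive_rate -- P(evidence | not hypothesis)
    """
    # Total probability of observing the evidence.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: rare condition, fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

With these inputs the belief moves from a 1% prior to a posterior of about 16% after one positive result.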

How do Type I and Type II errors relate to the power of a statistical test?

  • Both decrease the power of a test
  • Both increase the power of a test
  • Type I errors decrease the power, Type II errors increase it
  • Type I errors increase the power, Type II errors decrease it
The power of a test is the probability that it correctly rejects a false null hypothesis (a true positive). It is the complement of the Type II error probability: power = 1 − β. Raising the Type I error probability (the significance level α) also raises power, all else being equal, because the test becomes more willing to reject the null hypothesis. A higher Type II error probability means lower power, because each Type II error is a missed opportunity to reject a false null hypothesis.
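
The trade-off can be sketched for one specific case: a one-sided, one-sample z-test with known standard deviation. The function name and the effect size, sigma, and sample size below are assumptions for illustration; other tests have different power formulas.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha):
    """Power of a one-sided z-test of H0: mu = 0 against H1: mu = effect > 0."""
    std_normal = NormalDist()
    # Reject H0 when the z statistic exceeds this critical value.
    critical_z = std_normal.inv_cdf(1 - alpha)
    # Mean of the z statistic when the true mean equals `effect`.
    shift = effect * n ** 0.5 / sigma
    return 1 - std_normal.cdf(critical_z - shift)

# Hypothetical scenario: true effect 0.5, sigma 1, 20 observations.
for alpha in (0.01, 0.05, 0.10):
    power = z_test_power(effect=0.5, sigma=1.0, n=20, alpha=alpha)
    beta = 1 - power  # Type II error probability
    print(alpha, round(power, 3), round(beta, 3))
```

Each row shows power and β summing to 1, and power rising as α is allowed to rise.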

What happens to the correlation coefficient when you have outliers in your data?

  • It decreases
  • It increases
  • It may become misleading
  • It remains the same
Outliers can greatly affect the correlation coefficient, making it misleading. Outliers that fall along the overall trend can inflate the correlation. Outliers that fall against the trend can deflate it or even reverse its sign. Hence, it's important to check for outliers, and decide how to handle them, before interpreting a correlation coefficient.
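
The sign reversal is easy to reproduce. In this sketch, `pearson_r` is a hand-written helper and the data are invented: five points with a perfect negative relationship, plus one extreme point.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a perfect negative linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 3, 2, 1]
r_clean = pearson_r(xs, ys)

# The same data plus one extreme point at (20, 20).
r_with_outlier = pearson_r(xs + [20], ys + [20])
print(round(r_clean, 2), round(r_with_outlier, 2))
```

One added point turns a correlation of −1 into a strongly positive one, even though the other five points are unchanged.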

How does the standard error affect the confidence interval?

  • Larger standard error leads to a narrower interval
  • Larger standard error leads to a skewed interval
  • Larger standard error leads to a wider interval
  • Standard error does not affect the confidence interval
A larger standard error leads to a wider confidence interval. The standard error measures the variability of the estimate's sampling distribution, and the interval's margin of error is proportional to it. A larger standard error therefore means a less precise estimate and a wider interval.
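
For a normal-based interval, the relationship is estimate ± z × standard error. The sketch below assumes that form (a t-based interval would use a different critical value); the function name and inputs are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(estimate, standard_error, confidence=0.95):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Same hypothetical estimate, two different standard errors.
narrow = confidence_interval(estimate=10.0, standard_error=1.0)
wide = confidence_interval(estimate=10.0, standard_error=2.0)
print(narrow, wide)
```

Doubling the standard error doubles the width of the interval while leaving its centre where it was.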

How does the interquartile mean provide a measure of central tendency that is resistant to outliers?

  • By focusing on the data between the first and third quartiles
  • By focusing only on the highest values in the data
  • By focusing only on the lowest values in the data
  • By ignoring all outlier values
The interquartile mean is the mean of the data between the first quartile (25th percentile) and the third quartile (75th percentile); the lowest 25% and the highest 25% of data points are excluded. This makes it less influenced by outliers and extreme values, and hence a more robust measure of central tendency for skewed distributions.
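
A simplified sketch of the calculation follows. It assumes the number of observations is a multiple of 4, so that exactly one quarter can be dropped from each end; the general definition weights boundary observations fractionally, which this version does not attempt. The function name and data are illustrative.

```python
def interquartile_mean(values):
    """Mean of the middle 50% of the data (simplified: len must be a multiple of 4)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 4 != 0:
        raise ValueError("this sketch assumes len(values) is a multiple of 4")
    quarter = n // 4
    # Drop the lowest and highest quarter, keep the middle half.
    middle = ordered[quarter : n - quarter]
    return sum(middle) / len(middle)

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 100]

iqm = interquartile_mean(data)
plain_mean = sum(data) / len(data)
print(iqm, plain_mean)
```

Here the ordinary mean is 16.0, dragged up by the value 100, while the interquartile mean stays at 4.5.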

In hypothesis testing, a Type I error is committed when the null hypothesis is ______ but we ______ it.

  • False, fail to reject
  • False, reject
  • True, fail to reject
  • True, reject
A Type I error, also known as a false positive, occurs when we reject a true null hypothesis. This means we've found evidence of an effect or difference when there really isn't one.
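
A quick simulation can make this concrete. The sketch below repeatedly samples from a population where the null hypothesis is true (mean 0, known standard deviation 1) and counts how often a two-sided z-test rejects it anyway. The function name, sample size, trial count, and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of simulated samples in which a true H0 (mu = 0) is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value for the chosen significance level.
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # sigma is known to be 1
        if abs(z) > critical_z:
            rejections += 1  # a false positive, i.e. a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The observed rejection rate should land close to the chosen α, since α is by definition the probability of a Type I error when the null hypothesis is true.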

The ______ Rule of Probability is used when we want to find the probability that either of two events happens.

  • Addition
  • Division
  • Multiplication
  • Subtraction
The Addition Rule of Probability is used when we want to find the probability that either of two events happens. In general, P(A or B) = P(A) + P(B) − P(A and B). When the two events are mutually exclusive, P(A and B) = 0, so the probability that either occurs is simply the sum of their individual probabilities.
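
The rule can be checked on a fair six-sided die. This is only an illustrative sketch; the `prob` helper and the choice of events are assumptions, and the events are deliberately not mutually exclusive so that the overlap term matters.

```python
from fractions import Fraction

# Sample space of a fair six-sided die.
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event (a set of outcomes) under equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

even = {2, 4, 6}
greater_than_four = {5, 6}

# Direct count of outcomes in "even or greater than four".
direct = prob(even | greater_than_four)
# Addition rule: subtract the overlap so that 6 is not counted twice.
by_rule = prob(even) + prob(greater_than_four) - prob(even & greater_than_four)
print(direct, by_rule)
```

Both approaches give 2/3. Without subtracting the overlap, the sum would be 5/6, which double-counts the outcome 6.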