How does Spearman's Rank Correlation react to outliers as compared to Pearson's correlation?

  • Both are equally sensitive to outliers
  • Less sensitive to outliers
  • More sensitive to outliers
  • Neither is sensitive to outliers
Spearman's Rank Correlation is less sensitive to outliers than Pearson's correlation. This is because Spearman's correlation is based on rank orders rather than raw data values, making it more robust against outliers.
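A minimal sketch of the effect, using only the Python standard library. The data are made up for illustration: a perfectly increasing series whose last value is replaced by an extreme outlier. The helper names (`pearson`, `ranks`, `spearman`) are assumptions of this example, not a library API.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation computed from raw values."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = list(range(1, 11))
y_outlier = list(range(1, 10)) + [1000]   # hypothetical data, last point is extreme

r_pearson = pearson(x, y_outlier)
r_spearman = spearman(x, y_outlier)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

The outlier keeps the ordering of the points intact, so the ranks and therefore Spearman's coefficient are unchanged, while Pearson's coefficient drops well below 1.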

What does a null hypothesis represent in statistical testing?

  • A condition of no effect or no difference
  • A specific outcome of the experiment
  • An effect or difference exists
  • The sample size is large enough for the test
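The null hypothesis is the default statement in a statistical test: it asserts that there is no effect or no difference, for example that two group means are equal. The test then asks whether the observed data are inconsistent enough with this statement to reject it.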

How does the rate parameter affect the shape of a Poisson distribution?

  • All of the above
  • It determines the kurtosis of the distribution
  • It determines the skewness of the distribution
  • It does not affect the shape of the distribution
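The rate parameter (lambda) of a Poisson distribution is the average rate of occurrence per interval, and it equals both the mean and the variance, so it directly affects the shape of the distribution. The skewness is 1/sqrt(lambda) and the excess kurtosis is 1/lambda, so higher lambda values give distributions that are less skewed to the right and closer to a normal curve.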
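As a rough numerical check, the sketch below computes skewness and excess kurtosis directly from the Poisson probability mass function for a few values of lambda. For a Poisson distribution these should come out close to 1/sqrt(lambda) and 1/lambda. The function names and the truncation point of the sum are assumptions of this example.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson(lam) variable, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def shape_stats(lam):
    """Return (skewness, excess kurtosis) from a truncated sum over k."""
    upper = int(lam + 20 * math.sqrt(lam) + 50)  # far enough into the tail
    ks = range(upper + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mu = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mu) ** 2 * p for k, p in zip(ks, probs))
    m3 = sum((k - mu) ** 3 * p for k, p in zip(ks, probs))
    m4 = sum((k - mu) ** 4 * p for k, p in zip(ks, probs))
    return m3 / var ** 1.5, m4 / var ** 2 - 3

for lam in (1, 4, 25):
    skew, kurt = shape_stats(lam)
    print(f"lambda={lam:>2}  skewness={skew:.4f}  excess kurtosis={kurt:.4f}")
```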

How does the Central Limit Theorem relate to the use of Z-tests?

  • It allows for the assumption that the sample mean distribution is normally distributed
  • It enables the calculation of the sample standard deviation
  • It increases the power of the test
  • It reduces the impact of outliers in the sample
The Central Limit Theorem states that, with a large enough sample size, the distribution of the sample mean will be approximately normally distributed. This allows us to use Z-tests even when the population is not normally distributed.
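A small simulation can illustrate this. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (clearly not normal), draws many samples of size 50, and looks at how the sample means behave. The sample size, repeat count, and seed are arbitrary choices for this example.

```python
import random
import statistics

random.seed(0)    # fixed seed so the run is repeatable

n = 50            # observations per sample
repeats = 5000    # number of samples drawn

# Each entry is the mean of one sample from an exponential population.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(repeats)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
standard_error = 1.0 / n ** 0.5   # population sd / sqrt(n)

# Under a normal approximation, about 95% of sample means should fall
# within 1.96 standard errors of the population mean.
coverage = sum(abs(m - 1.0) < 1.96 * standard_error for m in sample_means) / repeats

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {standard_error:.3f})")
print(f"share within 1.96 SE: {coverage:.3f}")
```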

What is the difference between a one-sample t-test and a two-sample t-test?

  • All of the above
  • The number of hypotheses being tested
  • The number of samples being compared
  • The type of data being used
The key difference between a one-sample t-test and a two-sample t-test lies in the number of samples being compared. A one-sample t-test compares the mean of a single sample to a known or hypothesized value, while a two-sample t-test compares the means of two independent samples.
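The two test statistics can be sketched side by side. The data below are invented for illustration, the two-sample version shown is Welch's form (which does not assume equal variances), and only the t statistic is computed, not the p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

def two_sample_t(a, b):
    """Welch's t statistic for H0: the two population means are equal."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

group_a = [5.1, 4.9, 5.3, 5.0, 5.2]   # hypothetical measurements
group_b = [4.6, 4.8, 4.7, 4.9, 4.5]   # hypothetical measurements

t_one = one_sample_t(group_a, mu0=5.0)   # one sample against a fixed value
t_two = two_sample_t(group_a, group_b)   # two samples against each other

print(f"one-sample t: {t_one:.3f}")
print(f"two-sample t: {t_two:.3f}")
```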

What is the concept of significance level in hypothesis testing?

  • The amount of data needed to support the alternative hypothesis
  • The difference between the null and alternative hypotheses
  • The probability of rejecting a true null hypothesis
  • The proportion of the sample that supports the null hypothesis
The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true.
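One way to see what this means is to simulate many experiments in which the null hypothesis really is true and count how often a test rejects it. The sketch below assumes a two-sided z-test on standard normal data with known standard deviation 1. The sample size, number of trials, and seed are arbitrary choices for this example.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)    # fixed seed so the run is repeatable

alpha = 0.05
n = 30
trials = 5000

# Two-sided critical value for the chosen significance level.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

rejections = 0
for _ in range(trials):
    # The null hypothesis (mean = 0) is true for every simulated sample.
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    z = fmean(sample) * n ** 0.5   # (mean - 0) / (1 / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1

rejection_rate = rejections / trials
print(f"critical value: {z_crit:.3f}")
print(f"share of true nulls rejected: {rejection_rate:.3f}")
```

The rejection rate should land near alpha, since every rejection here is a false positive.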

What is the Multiplication Rule of Probability primarily used for?

  • To calculate the joint probability of two independent events
  • To calculate the probability of either of two events occurring
  • To divide one probability by another
  • To subtract one probability from another
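The Multiplication Rule in probability is used to calculate the joint probability of two independent events. It states that the probability of two independent events both occurring is the product of their individual probabilities: P(A and B) = P(A) × P(B). When the events are not independent, the general form P(A and B) = P(A) × P(B | A) applies.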
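The rule can be checked by brute force on a small sample space. The sketch below assumes two fair six-sided dice and enumerates all 36 equally likely outcomes. The events chosen (first die shows 6, second die is even) are just an illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_a = prob(lambda o: o[0] == 6)                        # first die shows 6
p_b = prob(lambda o: o[1] % 2 == 0)                    # second die is even
p_both = prob(lambda o: o[0] == 6 and o[1] % 2 == 0)   # both at once

print(f"P(A) = {p_a}, P(B) = {p_b}, P(A and B) = {p_both}")
print("product rule holds:", p_both == p_a * p_b)
```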

What is the primary purpose of the Mann-Whitney U test?

  • To calculate the correlation between two variables
  • To compare the means of two independent groups
  • To compare the medians of two independent groups
  • To compare the variances of two independent groups
The Mann-Whitney U test is a nonparametric significance test for determining whether two independent samples were drawn from populations with the same distribution. When the two distributions have the same shape, this amounts to testing the null hypothesis that the medians of the two groups are the same.
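The U statistic itself is simple to compute by counting pairs. The sketch below uses made-up data and only computes the statistic, not a p-value. The function name `mann_whitney_u` is an assumption of this example.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): U_a counts pairs where a value from `a` beats one from `b`.

    Ties count as half a win for each side.
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

group_a = [12, 15, 11, 18, 14]   # hypothetical scores
group_b = [10, 9, 13, 8, 7]      # hypothetical scores

u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U_a = {u_a}, U_b = {u_b}, pairs = {len(group_a) * len(group_b)}")
```

Because only the ordering of values matters, the statistic does not depend on how far apart the values are, which is what makes the test rank-based.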

What is the goal of 'hierarchical' clustering?

  • To create a hierarchy or a tree of clusters
  • To find the centroid of clusters
  • To find the most diverse instances in the dataset
  • To predict the outcome of a new instance
The goal of hierarchical clustering is to create a hierarchy or a tree of clusters. This hierarchy can be visually represented in a dendrogram.
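A minimal sketch of the agglomerative (bottom-up) variant on one-dimensional points is shown below. It assumes single linkage, where the distance between two clusters is the smallest distance between any pair of their members. The recorded merge list is the information a dendrogram would draw. The function name and data are illustrative.

```python
def single_linkage(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[p] for p in points]   # start with every point on its own
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

points = [1.0, 1.5, 5.0, 5.2, 9.0]   # hypothetical 1-D data
merges = single_linkage(points)
for left, right, distance in merges:
    print(f"merge {left} + {right} at distance {distance:.2f}")
```

Each merge is one level of the tree, so n points always produce n - 1 merges.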

________ is a popular method for cluster analysis that partitions the data into non-hierarchical clusters.

  • DBSCAN
  • Hierarchical
  • K-means
  • PCA
K-means is a popular method for cluster analysis that partitions the data into non-hierarchical clusters. The algorithm iteratively assigns each data point to the nearest of the K cluster centroids, based on feature similarity (distance), and then recomputes each centroid as the mean of the points assigned to it.
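The iteration can be sketched in a few lines for one-dimensional data. This is a bare-bones version, assuming the initial centroids are supplied by the caller and that an empty cluster simply keeps its previous centroid. Real implementations add smarter initialization and multiple restarts.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its points.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]   # hypothetical 1-D data
final_centroids, final_clusters = kmeans_1d(points, centroids=[0.0, 10.0])
print("centroids:", [round(c, 3) for c in final_centroids])
print("clusters: ", final_clusters)
```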