What does a null hypothesis represent in statistical testing?

  • A condition of no effect or no difference
  • A specific outcome of the experiment
  • An effect or difference exists
  • The sample size is large enough for the test
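The null hypothesis is a statement in statistical inference asserting that there is no effect or no difference, for example that two population means are equal or that observed data match what is expected. A test then asks whether the data provide enough evidence to reject that statement.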

How does the rate parameter affect the shape of a Poisson distribution?

  • All of the above
  • It determines the kurtosis of the distribution
  • It determines the skewness of the distribution
  • It does not affect the shape of the distribution
The rate parameter (lambda, λ) of a Poisson distribution is the average number of occurrences per interval, and it directly affects the shape of the distribution. The skewness equals 1/√λ, so higher lambda values give distributions that are less skewed to the right and more symmetric.
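
As a rough numerical check of how lambda changes the shape, the sketch below computes the skewness directly from the Poisson probabilities. The function names and the truncation point `k_max` are assumptions made for illustration; the sum is cut off at `k_max`, which is only reasonable for modest values of lambda.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_skewness(lam, k_max=200):
    """Skewness from the pmf truncated at k_max (illustrative cutoff)."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5

for lam in (1, 4, 25):
    # Each value should sit close to 1 / sqrt(lam).
    print(lam, poisson_skewness(lam), 1 / math.sqrt(lam))
```

The printed skewness shrinks as lambda grows, which is the "less skewed to the right" behaviour described above.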

How does the Central Limit Theorem relate to the use of Z-tests?

  • It allows for the assumption that the sample mean distribution is normally distributed
  • It enables the calculation of the sample standard deviation
  • It increases the power of the test
  • It reduces the impact of outliers in the sample
The Central Limit Theorem states that, for independent observations from a population with finite variance, the distribution of the sample mean is approximately normal once the sample size is large enough. This allows us to use Z-tests even when the population itself is not normally distributed.
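
A small simulation can make this concrete. The sketch below draws repeated samples from a clearly non-normal population (an exponential distribution with mean 1, chosen here purely as an assumption for illustration) and looks at how the sample means behave. The helper names, the sample size, and the seed are all hypothetical choices.

```python
import random
import statistics

def sample_means(n, trials, seed=0):
    """Means of `trials` samples, each of size n, from an Exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

def fraction_within(means, center, half_width):
    """Share of sample means lying within center +/- half_width."""
    return sum(abs(m - center) <= half_width for m in means) / len(means)

n = 50
means = sample_means(n, trials=2000)
standard_error = 1.0 / n ** 0.5  # population sd is 1 for Exponential(1)

print(statistics.fmean(means))   # should be close to the population mean, 1
print(statistics.stdev(means))   # should be close to standard_error
print(fraction_within(means, 1.0, 1.96 * standard_error))  # roughly 0.95 if near normal
```

If the sample means were far from normal, the last figure would drift noticeably away from 0.95; that normal approximation is what a Z-test relies on.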

In what kind of scenario is the Central Limit Theorem used?

  • It's used only when dealing with a uniform distribution.
  • It's used to determine whether an event will occur.
  • It's used to predict the future.
  • It's used when we want to make inferences about a population based on a sample.
The Central Limit Theorem (CLT) is often used in scenarios where we are interested in the average outcome of a large number of independent or nearly independent events. This is commonly the case when we are making inferences about a population based on a sample.

What is the concept of significance level in hypothesis testing?

  • The amount of data needed to support the alternative hypothesis
  • The difference between the null and alternative hypotheses
  • The probability of rejecting a true null hypothesis
  • The proportion of the sample that supports the null hypothesis
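The significance level, denoted alpha or α, is the probability of rejecting the null hypothesis when it is actually true, that is, the probability of a Type I error. It is chosen before the test is run; 0.05 is a common choice.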
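
One way to see what alpha means is to simulate many experiments in which the null hypothesis really is true and count how often a test rejects it. The sketch below does this for a two-sided Z-test with known standard deviation 1; the function name, sample size, number of trials, and seed are assumptions for illustration.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=1):
    """Share of true-null experiments in which a two-sided Z-test rejects H0: mu = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Data are generated with mu = 0, so the null hypothesis is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # sigma is known to be 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

print(type_i_error_rate())  # should land near alpha = 0.05
```

The rejection rate should hover around the chosen alpha, since every rejection here is a false positive.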

What is the Multiplication Rule of Probability primarily used for?

  • To calculate the joint probability of two independent events
  • To calculate the probability of either of two events occurring
  • To divide one probability by another
  • To subtract one probability from another
The Multiplication Rule in probability is used to calculate the joint probability of two events. For independent events, the probability that both occur is the product of their individual probabilities: P(A and B) = P(A) × P(B). For events that are not independent, the general form is P(A and B) = P(A) × P(B | A).
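
The rule can be checked by brute-force enumeration. The sketch below lists every outcome of rolling two fair dice (an example chosen here for illustration) and compares the joint probability with the product of the individual probabilities, using exact fractions. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_even(o):
    return o[1] % 2 == 0

p_a = prob(first_is_six)
p_b = prob(second_is_even)
p_joint = prob(lambda o: first_is_six(o) and second_is_even(o))

print(p_a, p_b, p_joint)     # 1/6 1/2 1/12
print(p_joint == p_a * p_b)  # True: the dice are independent

# Dependent case: two aces in a row from a 52-card deck, without replacement.
p_two_aces = Fraction(4, 52) * Fraction(3, 51)  # P(A) * P(B | A)
print(p_two_aces)            # 1/221
```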

What is the primary purpose of the Mann-Whitney U test?

  • To calculate the correlation between two variables
  • To compare the means of two independent groups
  • To compare the medians of two independent groups
  • To compare the variances of two independent groups
The Mann-Whitney U test is a nonparametric significance test for determining whether two independent samples were drawn from populations with the same distribution. When the two distributions are assumed to have the same shape and differ only in location, it is interpreted as a test of the null hypothesis that the medians of the two groups are equal.
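
The statistic behind the test is simple to compute by hand: count, over all pairs drawn one from each group, how often a value in the first group exceeds a value in the second. The sketch below does only that; it does not compute a p-value, and the function name and sample data are made up for illustration.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): pairwise 'wins' for each sample, ties counted as 0.5."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two statistics always sum to len(x) * len(y)
    return u_x, u_y

group_a = [3, 5, 8]     # hypothetical measurements
group_b = [1, 4, 5, 9]  # hypothetical measurements
print(mann_whitney_u(group_a, group_b))  # (6.5, 5.5)
```

A U value near half of `len(x) * len(y)` suggests neither group tends to rank above the other; a value near 0 or near the maximum suggests a shift between the groups.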

What is the goal of 'hierarchical' clustering?

  • To create a hierarchy or a tree of clusters
  • To find the centroid of clusters
  • To find the most diverse instances in the dataset
  • To predict the outcome of a new instance
The goal of hierarchical clustering is to create a hierarchy or a tree of clusters. This hierarchy can be visually represented in a dendrogram.
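
To show how such a tree is built, here is a minimal sketch of bottom-up (agglomerative) clustering on one-dimensional points. It assumes single linkage, meaning the distance between two clusters is the smallest gap between their members; other linkage rules exist. The function name and data are hypothetical. The list of merges, in order, is the information a dendrogram draws.

```python
def single_linkage_merges(points):
    """Repeatedly merge the two closest clusters; return (left, right, distance) per merge."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: closest pair of members across the two clusters.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

for left, right, distance in single_linkage_merges([1.0, 2.0, 6.0, 7.0, 20.0]):
    print(left, "+", right, "at distance", distance)
```

Cutting the resulting tree at a chosen distance yields a flat set of clusters, so the number of clusters does not have to be fixed in advance.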

________ is a popular method for cluster analysis that partitions the data into non-hierarchical clusters.

  • DBSCAN
  • Hierarchical
  • K-means
  • PCA
K-means is a popular method for cluster analysis that partitions the data into K non-hierarchical clusters. The algorithm iteratively assigns each data point to the cluster whose centroid is nearest (by feature distance), then recomputes each centroid as the mean of its assigned points, repeating until the assignments stop changing.
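
The assign-then-update loop can be sketched in a few lines for one-dimensional data. This is a simplified illustration, not a production implementation: the starting centroids are supplied by hand, the iteration count is fixed rather than checked for convergence, and the function name and data are hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Basic K-means on 1-D data; returns (final centroids, clusters)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

The result depends on the starting centroids, which is why practical implementations usually run several random initialisations and keep the best one.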

What is the impact of outliers on the skewness of a distribution?

  • Outliers can decrease skewness
  • Outliers can either increase or decrease skewness
  • Outliers can increase skewness
  • Outliers do not impact skewness
Outliers can have a significant impact on the skewness of a distribution. An outlier can increase skewness if it is further from the mean in the direction of the skew. Conversely, an outlier can decrease skewness if it is further from the mean in the direction opposite to the skew. The extent of the impact depends on the value and direction of the outlier relative to the rest of the data.
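
Both directions of this effect can be seen with a small made-up dataset. The sketch below computes the moment-based sample skewness (third central moment divided by the 1.5 power of the second) for a right-skewed sample, then again after adding one outlier on each side. The data and names are illustrative assumptions.

```python
def sample_skewness(data):
    """Moment-based skewness: m3 / m2**1.5, using population (divide-by-n) moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

base = [1, 2, 2, 3, 3, 4, 8]  # hypothetical right-skewed sample

skew_base = sample_skewness(base)
skew_high = sample_skewness(base + [20])   # outlier in the direction of the skew
skew_low = sample_skewness(base + [-10])   # outlier opposite to the skew

print(skew_base, skew_high, skew_low)
```

Here the high outlier makes the skewness more positive, while the low outlier pulls it down far enough to change its sign.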