Which dimensionality reduction technique can also be used as a feature extraction method, transforming the data into a set of linearly uncorrelated variables?
- Principal Component Analysis (PCA)
- Independent Component Analysis (ICA)
- t-SNE (t-distributed Stochastic Neighbor Embedding)
- Autoencoders
Principal Component Analysis (PCA) is the dimensionality reduction technique that transforms the data into a set of linearly uncorrelated variables, called principal components. Each component is a linear combination of the original features, which is why PCA also serves as a feature extraction method. ICA, by contrast, seeks statistically independent (not merely uncorrelated) components and is best suited to non-Gaussian data and blind source separation.
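As a brief sketch of why PCA's components are linearly uncorrelated (notation assumed, not from the source; X is the centered data matrix and W its covariance eigenvectors):

```latex
\Sigma = \tfrac{1}{n-1} X^\top X = W \Lambda W^\top, \qquad
Z = XW \;\implies\; \operatorname{Cov}(Z) = W^\top \Sigma W = \Lambda
```

Since \Lambda is diagonal, the columns of Z (the principal components) are pairwise uncorrelated by construction.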
A common problem in training deep neural networks, where the gradients tend to become extremely small, is known as the _______ problem.
- Overfitting
- Vanishing Gradient
- Exploding Gradient
- Underfitting
The vanishing gradient problem is a common issue in deep neural networks, especially recurrent neural networks. During backpropagation, the gradient is computed by multiplying derivatives through many layers; when those derivatives are small, the product shrinks toward zero, leaving early layers with almost no learning signal. This makes it hard to learn long-range dependencies and can stall training entirely.
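A one-line sketch of the mechanism for a recurrent network with hidden states h_t = \sigma(W h_{t-1} + \dots) (notation assumed for illustration):

```latex
\frac{\partial L}{\partial h_0}
= \frac{\partial L}{\partial h_T}
  \prod_{t=1}^{T} \operatorname{diag}\!\big(\sigma'(z_t)\big)\, W
```

When \lVert W \rVert \cdot \max_t |\sigma'(z_t)| < 1, each factor shrinks the gradient, so it decays geometrically with depth T.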
Which type of recommender system suggests items based on a user's past behavior and not on the context?
- Content-Based Recommender System
- Collaborative Filtering
- Hybrid Recommender System
- Context-Based Recommender System
Collaborative Filtering recommends items based on users' past behavior and preferences. It identifies patterns and similarities among users and suggests items that similar users have liked, without using any contextual information. Context-Based Recommender Systems, by contrast, do incorporate context such as time or location, which is exactly what this question rules out.
When productionizing a model, what aspect ensures that the model can handle varying loads and traffic spikes?
- Load balancing
- Data preprocessing
- Feature engineering
- Hyperparameter tuning
Load balancing distributes incoming traffic across multiple model instances, avoiding overload on any single instance and keeping responses fast during varying loads and traffic spikes. It is crucial for maintaining the model's performance in production.
A data scientist is working with a dataset in R but wants to retrieve data from a SQL database. Which R package allows for integration with SQL databases for seamless data retrieval?
- dplyr
- ggplot2
- knitr
- DBI
The R package 'DBI' (Database Interface) defines a common interface for connecting R to SQL databases. Data scientists use 'DBI' together with backend packages such as 'RMySQL', 'RPostgres', or 'odbc' to connect to databases, retrieve data, and run SQL queries from within R.
In data warehousing, _________ is a technique used to maintain the history of data changes.
- Data Extraction
- Data Transformation
- Data Loading
- Slowly Changing Dimensions (SCD)
Slowly Changing Dimensions (SCD) is a data warehousing technique for maintaining the history of changes to dimension data. Common variants include Type 1 (overwrite the old value), Type 2 (add a new row with validity dates), and Type 3 (keep the previous value in an extra column); Type 2 in particular preserves historical data, which is essential for tracking changes and trends over time.
Apache Spark's core data structure, used for distributed data processing, is called what?
- RDD (Resilient Distributed Dataset)
- Dataframe
- HDFS (Hadoop Distributed File System)
- NoSQL
Apache Spark uses the RDD (Resilient Distributed Dataset) as its core data structure for distributed data processing. RDDs are immutable, fault-tolerant, partitioned collections that can be processed in parallel; higher-level APIs such as DataFrames are built on top of them.
Explain how the go tool trace command can be utilized for performance analysis.
- The "go tool trace" command generates a trace of a Go program's execution, capturing events such as goroutine creation, blocking, and network activity. The trace data can be visualized using the "go tool trace" web interface.
- The "go tool trace" command profiles CPU usage and memory allocation, helping identify bottlenecks and resource-intensive parts of the code.
- The "go tool trace" command analyzes network latency and provides insights into HTTP requests and responses.
- The "go tool trace" command generates a call graph to visualize function calls within the program.
The "go tool trace" command is a powerful tool for performance analysis in Go programs. It captures detailed event information during program execution, allowing you to identify bottlenecks, understand goroutine behavior, and analyze latency. The trace data can be visualized using the "go tool trace" web interface, which provides a graphical representation of the program's execution, making it easier to pinpoint performance issues.
How would you use the errors package to create custom error types?
- Import the errors package and use its functions
- Use the errors.New() function to create a new error type
- Modify the built-in error type
- Use the custom_error package
In Go, the errors package's errors.New() function creates a new error with a fixed message. By convention these sentinel errors are stored in exported variables, for example var ErrNotFound = errors.New("record not found"), so callers can compare against them. Strictly speaking, errors.New creates an error value rather than a new type; a fully custom error type is defined by giving a struct an Error() string method, which satisfies the built-in error interface and can carry extra context. Either way, descriptive errors make it easier to identify the root cause of issues during debugging. The built-in error type is an interface that cannot be modified, and shadowing it would cause confusion throughout your program.
The _____ pattern is used to manage and insert mock objects in Go.
- "Mocking"
- "Testing"
- "Stubbing"
- "Spocking"
The "Mocking" pattern is used to manage and insert mock objects in Go. Mock objects simulate the behavior of real objects in controlled ways; in tests they isolate the system under test from external dependencies and let you verify interactions. In Go, mocking is typically done by defining an interface for the dependency and substituting a mock implementation in tests.
What is the significance of the rune data type in Go?
- It represents floating-point numbers.
- It's an alias for the int32 data type.
- It's used exclusively for error handling.
- It's used for text and character encoding.
The rune data type in Go is significant because it represents a Unicode code point, making it the right type for text and character processing. Technically, rune is an alias for int32, but using the name signals intent: where many other languages use char or byte for characters, Go's rune ensures multi-byte Unicode characters are handled correctly. This is essential for internationalization and multilingual applications, where different character sets and symbols need to be correctly processed and displayed.
What is the "comma ok" idiom in error handling?
- It is used to recover from panics.
- It checks for array bounds.
- It is used to handle multiple errors.
- It is used in channel operations.
The "comma ok" idiom appears in Go wherever an operation can report success alongside a value, including channel operations. The expression value, ok := <-ch receives from channel ch: ok is true if a value was received, and false only once the channel has been closed and drained (a receive from an empty but open channel simply blocks rather than returning ok == false). The same two-value form is used in map lookups and type assertions, such as e, ok := err.(*MyError) when inspecting error types. Checking ok prevents acting on zero values and allows for graceful handling when a channel is closed.