How does skewness affect the mean and median of a dataset?
- In a positively skewed distribution, the mean is greater than the median
- In a positively skewed distribution, the median is greater than the mean
- Skewness affects only the mean
- Skewness does not affect the mean and median
In a positively skewed distribution, the mean is greater than the median as the mean gets pulled in the direction of the skew (towards the right tail). In a negatively skewed distribution, the mean is less than the median as the mean gets pulled towards the left tail.
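The pull of the tail on the mean can be seen with a tiny example. This is a minimal sketch using made-up values (the lists `right_skewed` and `left_skewed` are illustrative, not taken from any real dataset):

```python
from statistics import mean, median

# Made-up values: most are small, one large value forms a right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 20]
# Mirroring the values produces a left tail instead.
left_skewed = [-x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(f"{name}: mean={mean(data):.2f}, median={median(data)}")
```

The single extreme value moves the mean noticeably, while the median stays at the middle observation.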
What are confidence intervals used for in statistics?
- To determine the median of a sample
- To determine the spread of data in a sample
- To estimate the population parameter
- To find the mean of a sample
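Confidence intervals are used to estimate a population parameter. They give a range of plausible values, constructed so that a stated proportion of such intervals (for example, 95%) would contain the true parameter under repeated sampling. They are not a tool for finding the mean, median, or spread of the sample itself.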
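As an illustration, an interval for a population mean might be computed roughly as follows. This sketch assumes a normal approximation (a z critical value); for small samples a t critical value is usually more appropriate. The function name `mean_confidence_interval` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(sample) / sqrt(n)
    centre = mean(sample)
    return centre - margin, centre + margin

# Hypothetical measurements.
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.4, 4.6]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Raising the confidence level widens the interval, because a larger critical value is needed to cover the parameter more often.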
What is the difference between frequentist and Bayesian statistics?
- Bayesians use Bayes' theorem, frequentists do not
- Frequentists believe in probability and Bayesians do not
- Frequentists interpret probability as a long-run frequency, Bayesians as a degree of belief
- There is no difference
Frequentist statistics interprets probability as the long-run frequency of events, whereas Bayesian statistics interprets probability as a degree of belief or as subjective probability. The Bayesian approach uses Bayes' theorem to update probabilities based on new data.
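The contrast can be sketched with a coin-flip example. The data and the uniform Beta(1, 1) prior below are assumptions chosen for illustration; the update rule itself is the standard conjugate result for a Beta prior with binomial data.

```python
heads, tails = 7, 3  # hypothetical coin-flip data

# Frequentist point estimate: the observed relative frequency.
freq_estimate = heads / (heads + tails)

# Bayesian update: a Beta(a, b) prior combined with binomial data gives a
# Beta(a + heads, b + tails) posterior.
prior_a, prior_b = 1, 1  # uniform prior, an assumption made for this sketch
post_a, post_b = prior_a + heads, prior_b + tails
posterior_mean = post_a / (post_a + post_b)

print(f"relative frequency: {freq_estimate:.3f}")
print(f"posterior mean:     {posterior_mean:.3f}")
```

The posterior mean sits between the prior mean (0.5) and the observed frequency, and moves toward the data as more flips are observed.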
________ is a measure of asymmetry of a probability distribution.
- Mean
- Median
- Mode
- Skewness
Skewness is a measure of the asymmetry of a probability distribution about its mean. It quantifies the direction and extent of skew (departure from horizontal symmetry) in the data.
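One common way to quantify this is the moment-based coefficient, the third central moment divided by the second central moment raised to the power 1.5. The sketch below uses that definition; other definitions with small-sample corrections exist, and the helper name `skewness` is only illustrative.

```python
from statistics import mean

def skewness(data):
    """Moment-based sample skewness: m3 / m2**1.5."""
    centre = mean(data)
    n = len(data)
    m2 = sum((x - centre) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

print(skewness([1, 2, 3, 4, 5]))       # symmetric data
print(skewness([1, 2, 2, 3, 10]))      # long right tail
print(skewness([-10, -3, -2, -2, -1])) # long left tail
```

A positive result indicates a right tail, a negative result a left tail, and a value near zero indicates approximate symmetry.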
The _________ states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, regardless of the shape of the population distribution.
- Central Limit Theorem
- Law of Large Numbers
- Probability Rule
- Sampling Distribution
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, regardless of the shape of the population distribution (provided the population has finite variance). This lets us apply normal probability calculations to sample means even when the underlying data are far from normal.
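A small simulation can make this concrete. The sketch below draws from an exponential population (strongly right-skewed) and compares the skewness of raw draws with the skewness of means of samples of size 50. The seed, sample size, and number of repetitions are arbitrary choices for illustration.

```python
import random

def skewness(data):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(data)
    centre = sum(data) / n
    m2 = sum((x - centre) ** 2 for x in data) / n
    m3 = sum((x - centre) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

random.seed(0)  # fixed seed so the run is repeatable

# Raw draws from a skewed (exponential) population.
population_draws = [random.expovariate(1.0) for _ in range(2000)]

# Means of 2000 independent samples, each of size 50.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(50)) / 50
    for _ in range(2000)
]

print(f"skewness of raw draws:    {skewness(population_draws):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The sample means should be far closer to symmetric than the raw draws, which is the behaviour the theorem describes.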
What does a scatter plot with points clustered tightly around a line indicate?
- A strong correlation
- A weak correlation
- An undefined correlation
- No correlation
When points in a scatter plot are clustered tightly around a line, it indicates a strong linear correlation between the two variables. The line is typically the line of best fit (regression line).
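The strength of that clustering is usually summarised by the Pearson correlation coefficient. The following sketch computes it by hand for two made-up datasets, one hugging a line and one scattered; the helper name `pearson_r` and all data values are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = list(range(10))
# Points lying very close to the line y = 2x.
tight = [2 * x + (0.1 if x % 2 == 0 else -0.1) for x in xs]
# Points with no clear linear pattern.
scattered = [5, 1, 8, 2, 9, 3, 7, 0, 6, 4]

print(f"tight cluster: r = {pearson_r(xs, tight):.4f}")
print(f"scattered:     r = {pearson_r(xs, scattered):.4f}")
```

Values of r near 1 or -1 correspond to tight clustering around a line; values near 0 correspond to little or no linear relationship.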
What assumptions are made when conducting an ANOVA test?
- Independent observations, no outliers, equal sample sizes
- Independent observations, normal distribution of variables, no outliers
- Independent observations, normally distributed residuals, homoscedasticity
- No missing data, normally distributed residuals, no outliers
ANOVA makes three key assumptions: 1) Observations are independent. 2) Residuals (the differences between the observed and predicted values) are normally distributed. 3) The variance of the residuals is the same for all groups (homoscedasticity).
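To make the last two assumptions concrete, the sketch below computes the residuals and the within-group variances for a one-way design, where the predicted value for each observation is its group mean. The group names and measurements are hypothetical, and the variance ratio shown is only an informal diagnostic, not a formal test.

```python
from statistics import mean, variance

groups = {  # hypothetical measurements for three treatment groups
    "A": [5.1, 4.9, 5.3, 5.0, 4.7],
    "B": [5.8, 6.1, 5.9, 6.3, 5.9],
    "C": [4.2, 4.5, 4.1, 4.4, 4.3],
}

# Residual = observation minus its own group's mean.
residuals = {
    name: [x - mean(values) for x in values]
    for name, values in groups.items()
}

# Sample variance within each group; similar values across groups are
# consistent with homoscedasticity.
group_variances = {name: variance(values) for name, values in groups.items()}
variance_ratio = max(group_variances.values()) / min(group_variances.values())

for name in groups:
    print(f"group {name}: variance = {group_variances[name]:.3f}")
print(f"largest / smallest variance = {variance_ratio:.2f}")
```

In practice the residuals would also be inspected for normality, for example with a histogram or a Q-Q plot.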
The ________ of a random variable is the sum of all its possible values, each weighted by its probability.
- Distribution
- Expected value
- Mean
- Variance
The "expected value" of a random variable is the sum of all possible values it can take, each multiplied by the probability of that outcome. It gives us the mean or average value of the random variable and is a fundamental concept in probability theory and statistics.
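For a discrete random variable this is E[X] = sum of x * P(X = x) over all values x. A minimal sketch for a fair six-sided die, using exact fractions to avoid rounding:

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities alone always sum to 1.
total_probability = sum(distribution.values())

# The expected value weights each outcome by its probability.
expected_value = sum(value * p for value, p in distribution.items())

print(total_probability)
print(expected_value)
```

The distinction matters: summing the probabilities alone gives 1 for any distribution, whereas the probability-weighted sum of the values gives the mean.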
Why might you perform a paired t-test?
- All of the above
- To compare the means of the same group at two different times
- To compare the means of two different populations
- To compare two independent groups
A paired t-test is used to compare the means of the same group at two different times or under two different conditions. It is not used to compare independent groups or different populations.
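The test works on the per-subject differences rather than on the two sets of scores separately. The sketch below computes the t statistic by hand for made-up before and after scores; the function name `paired_t_statistic` is illustrative, and the statistic would be compared against a t distribution with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for paired samples: mean difference / its standard error."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error

# Hypothetical scores for the same five subjects at two time points.
before = [72, 80, 65, 90, 77]
after = [75, 84, 66, 95, 80]

t = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {len(before) - 1} degrees of freedom")
```

Because each subject serves as their own control, variation between subjects does not inflate the standard error.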
In factor analysis, the _______ measures the amount of variance across all the variables that is accounted for by a given factor.
- communality
- eigenvalue
- factor variance
- total variance
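In factor analysis, the eigenvalue measures the amount of variance across all the variables that is accounted for by a given factor. For an unrotated solution it equals the sum of the squared loadings of all variables on that factor. Communality, by contrast, is the variance of a single variable that is accounted for by all the factors together.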
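The relationship between loadings, eigenvalues, and communalities can be sketched with a small loading matrix. The loadings below are hypothetical, and the calculation assumes an unrotated solution with orthogonal factors.

```python
# Hypothetical loadings: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.9],
]

def eigenvalue(loadings, factor):
    """Sum of squared loadings down one column (one factor, all variables)."""
    return sum(row[factor] ** 2 for row in loadings)

def communality(loadings, variable):
    """Sum of squared loadings across one row (one variable, all factors)."""
    return sum(value ** 2 for value in loadings[variable])

print(f"eigenvalue of factor 0:    {eigenvalue(loadings, 0):.2f}")
print(f"eigenvalue of factor 1:    {eigenvalue(loadings, 1):.2f}")
print(f"communality of variable 0: {communality(loadings, 0):.2f}")
```

Eigenvalues are read down the columns of the loading matrix and communalities across the rows.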
A ________ distribution assigns the same probability to every possible outcome.
- Binomial
- Normal
- Poisson
- Uniform
A uniform distribution is a type of probability distribution in which all outcomes are equally likely. This implies a constant probability for all outcomes.
What can cause the Chi-square test for goodness of fit to be biased?
- Having a large sample size
- Having a small sample size
- Having equal expected frequencies in all categories
- Having normally distributed data
A small sample size can lead to unreliable results in a Chi-square test for goodness of fit. The test statistic only approximately follows a chi-square distribution, and the approximation is poor when the expected frequencies are small. A common rule of thumb is that every category should have an expected count of at least 5.
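The statistic and a sample-size check can be sketched as follows. The function name `chi_square_gof`, the die-roll counts, and the threshold of 5 expected observations per category (a widely used rule of thumb, not a strict requirement) are assumptions for illustration.

```python
def chi_square_gof(observed, expected_probs, min_expected=5):
    """Return the chi-square statistic and whether all expected counts are large enough."""
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    counts_ok = all(e >= min_expected for e in expected)
    return statistic, counts_ok

fair_die = [1 / 6] * 6

# 120 hypothetical rolls: the expected count is 20 per face.
large_sample = [18, 22, 20, 25, 15, 20]
# 10 hypothetical rolls: the expected count is under 2 per face.
small_sample = [2, 1, 3, 0, 2, 2]

for label, counts in [("large", large_sample), ("small", small_sample)]:
    statistic, counts_ok = chi_square_gof(counts, fair_die)
    print(f"{label} sample: statistic = {statistic:.2f}, expected counts adequate: {counts_ok}")
```

When the check fails, options include collecting more data, merging sparse categories, or using an exact test instead.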