What is the relationship between Cramér's V and the Chi-square test?
- Cramér's V is the inverse of the Chi-square statistic
- Cramér's V is the square of the Chi-square statistic
- Cramér's V is the square root of the Chi-square statistic divided by the product of the sample size and (the smaller of the number of rows and columns, minus 1)
- There is no relationship between Cramér's V and the Chi-square test
Cramér's V is a measure of association between two nominal variables and is based on the Chi-square statistic. It is calculated as V = √(χ² / (n × (min(r, c) − 1))), where χ² is the Chi-square statistic, n is the sample size, and r and c are the numbers of rows and columns in the contingency table. V ranges from 0 (no association) to 1 (perfect association).
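As a minimal sketch of that calculation, the Python below computes Pearson's Chi-square statistic from a contingency table and then divides by n × (min(rows, columns) − 1) before taking the square root. The function name cramers_v and the list-of-rows input format are assumptions made for illustration, and no continuity correction is applied.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson's Chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Smaller of the number of rows and the number of columns.
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 2x2 tables: perfect association and no association.
perfect = cramers_v([[10, 0], [0, 10]])
none = cramers_v([[5, 5], [5, 5]])
```

The sketch assumes every row and column total is positive; a table with an empty row or column would need to be trimmed first.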
How is the confidence interval for a proportion calculated?
- p ± (z*√(p(1-p)/n))
- p ± z*(s/√n)
- p ± z*(σ/√n)
The confidence interval for a proportion is calculated using the formula: p ± (z*√(p(1-p)/n)), where p is the sample proportion, z is the z-score associated with the desired confidence level, and n is the sample size.
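The formula can be sketched in Python as follows. This is the normal-approximation interval only; the function name proportion_ci and its arguments are hypothetical, and the z-score is taken from the standard library's NormalDist.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation interval: p ± z * sqrt(p(1-p)/n)."""
    p = successes / n
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical sample: 40 successes out of 100 trials.
low, high = proportion_ci(40, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The approximation tends to be unreliable when n is small or p is close to 0 or 1.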
What is the primary purpose of conducting an ANOVA test?
- To calculate the standard deviation of a dataset
- To determine the mode of a set of data
- To find the correlation between two variables
- To test the equality of means among groups
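The primary purpose of an ANOVA test is to compare the means of different groups and determine whether any of those means differ significantly from the others. It tests the null hypothesis that all group means are equal by comparing the variation between groups with the variation within groups.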
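To illustrate how the group means are compared, here is a rough sketch of the one-way ANOVA F-statistic, the ratio of the between-group mean square to the within-group mean square. The function name one_way_anova_f and the sample data are made up for illustration; turning F into a p-value would need an F-distribution, which is not included here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data: three groups of three observations each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat)
```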
In multiple linear regression, ________ is used to test the overall significance of the model.
- the Chi-square statistic
- the F-statistic
- the Z-statistic
- the t-statistic
In multiple linear regression, the F-statistic is used to test the overall significance of the model. This test checks the null hypothesis that all regression coefficients (other than the intercept) are zero against the alternative that at least one of them is not zero. If the F-statistic is large and the corresponding p-value is small, we reject the null hypothesis and conclude that at least one predictor is linearly related to the outcome variable.
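For a model with an intercept and k predictors fitted to n observations, the overall F-statistic can be written in terms of R² as F = (R² / k) / ((1 − R²) / (n − k − 1)). The sketch below only evaluates that identity; the function name and the example numbers are hypothetical, and fitting the model itself is assumed to have been done elsewhere.

```python
def overall_f_statistic(r_squared, n, k):
    """Overall F for a regression with an intercept and k predictors.

    r_squared: coefficient of determination of the fitted model
    n:         number of observations
    k:         number of predictors (intercept not counted)
    """
    explained = r_squared / k                    # explained variation per predictor
    unexplained = (1 - r_squared) / (n - k - 1)  # residual variation per degree of freedom
    return explained / unexplained

# Hypothetical model: R^2 = 0.5 with 2 predictors and 23 observations.
print(overall_f_statistic(0.5, n=23, k=2))
```

The result is compared against an F-distribution with k and n − k − 1 degrees of freedom.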
The ________ is used to fit the regression line in a simple linear regression model.
- least squares method
- mean
- median
- mode
The least squares method is used to find the best-fitting line through the data points. This is done by minimizing the sum of the squares of the vertical distances of the points from the line.
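The minimization has a closed-form solution for a single predictor: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. A minimal sketch, with a hypothetical function name and made-up data:

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical points lying exactly on y = 1 + 2x.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The sketch assumes the x values are not all identical, since sxx would then be zero.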
When is it appropriate to use a binomial distribution?
- When each trial in an experiment has exactly two possible outcomes
- When the data is continuous
- When the outcomes are not independent
- When the probability of success changes with each trial
A binomial distribution is appropriate when an experiment consists of a fixed number of trials, each trial has exactly two possible outcomes (often termed success and failure), the trials are independent, and the probability of success is constant across trials.
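Under those conditions, the probability of exactly k successes in n trials is C(n, k) × p^k × (1 − p)^(n − k). A short sketch, with a hypothetical function name:

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))  # 0.375

# The probabilities over all possible counts sum to 1.
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))
```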
What are the dependent and independent variables in simple linear regression?
- Both variables are dependent
- Both variables are independent
- The dependent variable is the outcome we are trying to predict, and the independent variable is the predictor
- The dependent variable is the predictor, and the independent variable is the outcome we are trying to predict
In simple linear regression, the dependent variable is the outcome we are trying to predict, and the independent variable is the predictor. The dependent variable is also known as the response or target variable, and the independent variable is also known as the explanatory or feature variable.
What is the difference between a discrete and a continuous probability distribution?
- Discrete distributions are always normal; continuous distributions are always uniform
- Discrete distributions are for qualitative data; continuous distributions are for quantitative data
- Discrete distributions involve countable outcomes; continuous distributions involve uncountable outcomes
- There is no difference
Discrete probability distributions are used when the outcomes are countable or discrete. Examples include the number of heads when flipping coins or the number of defective items in a batch. Continuous probability distributions are used when outcomes are uncountably infinite, typically involving measurements. Examples include the height of individuals or the time it takes to run a mile.
The Mann-Whitney U test is primarily used for comparing ________ distributions.
- binomial
- dependent
- independent
- normal
The Mann-Whitney U test is used for comparing the distributions of two independent samples. It is a nonparametric, rank-based test of whether the two samples were drawn from populations with the same distribution.
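As a sketch of the mechanics, the U statistic can be computed by ranking the pooled observations (tied values share the average of their ranks) and subtracting n(n + 1)/2 from the rank sum of one sample. The function name and the data below are assumptions for illustration, and no p-value is computed.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples."""
    # Pool the observations, remembering which sample each came from.
    combined = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])

    # Assign 1-based ranks; tied values get the average of their ranks.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for t in range(i, j):
            ranks[t] = avg_rank
        i = j

    rank_sum_a = sum(r for r, (_, label) in zip(ranks, combined) if label == 0)
    n_a, n_b = len(sample_a), len(sample_b)

    # U_a counts the pairs in which the sample_a value is larger (ties count 1/2).
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Hypothetical samples with no overlap: every value in b exceeds every value in a.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
```

Conventions differ on which of the two U values is reported; many references use the smaller one.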
What is a random variable in probability theory?
- A factor that doesn't change
- A variable that can take on different values, each with an associated probability
- An unknown variable
- An unpredictable factor
A random variable in probability theory is a variable that can take on different values, each with an associated probability. It's not "random" in the everyday sense of the word, but its exact value is uncertain until it's observed.
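One way to picture this is a discrete random variable stored as a mapping from each possible value to its probability. The sketch below models a fair six-sided die; the names die and expected_value are chosen for illustration only.

```python
from fractions import Fraction

# Each face of a fair die has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(distribution):
    """Probability-weighted average of the possible values."""
    return sum(value * prob for value, prob in distribution.items())

print(sum(die.values()))    # 1
print(expected_value(die))  # 7/2
```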
A _______ t-test is used to compare two related samples or repeated measurements on a single sample.
- Independent
- One-sample
- Paired
- Two-sample
A paired t-test is used to compare two related samples or repeated measurements on a single sample. It is often used in before-and-after scenarios where the same individuals are measured twice.
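The test reduces to a one-sample t-test on the per-subject differences: t = mean(d) / (sd(d) / √n), with n − 1 degrees of freedom. A minimal sketch, assuming hypothetical before-and-after measurements on the same four subjects:

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired measurements."""
    # One difference per subject: after minus before.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical scores for four subjects measured twice.
t, df = paired_t_statistic([10, 12, 14, 16], [11, 14, 15, 18])
print(f"t = {t:.3f}, df = {df}")
```

The sketch assumes the differences are not all identical, since their standard deviation would then be zero.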
How does the standard deviation affect the shape of a normal distribution?
- Changes the kurtosis
- Changes the skewness
- Changes the spread or dispersion
- Does not affect the shape
The standard deviation, a measure of dispersion or spread, determines the width of a normal distribution. A larger standard deviation produces a wider, flatter curve, while a smaller standard deviation produces a narrower, taller curve. The curve remains symmetric and bell-shaped in either case, so its skewness and kurtosis do not change.
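The effect on the curve's height can be checked numerically: the peak of a normal density is 1 / (σ√(2π)), so doubling σ halves the peak. A small sketch using the standard library's NormalDist, with a hypothetical helper name:

```python
from statistics import NormalDist

def peak_height(sigma, mu=0.0):
    """Density of a normal distribution at its mean."""
    return NormalDist(mu, sigma).pdf(mu)

for sigma in (0.5, 1.0, 2.0):
    dist = NormalDist(0.0, sigma)
    # Share of the distribution within one unit of the mean.
    within_one = dist.cdf(1.0) - dist.cdf(-1.0)
    print(f"sigma={sigma}: peak={peak_height(sigma):.4f}, P(|X|<1)={within_one:.4f}")
```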