Conditional independence of A and B given C means that knowing that C has occurred does not change the ________ between A and B.
- Difference
- Intersection
- Ratio
- Relationship
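Conditional independence of A and B given C means that, once C is known to have occurred, A and B carry no further information about each other: P(A ∩ B | C) = P(A | C) P(B | C), or equivalently P(B | A ∩ C) = P(B | C). This is the sense in which the "relationship" between A and B is meant here. It is a statement about probabilities given C only, and says nothing about whether A and B are independent when C is not known.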
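As an illustration, the product rule P(A ∩ B | C) = P(A | C) P(B | C) can be checked exactly on a small joint distribution. The sketch below assumes a made-up setup that is not part of the question: C selects one of two coins, and A and B are "heads" on two separate flips of the selected coin. All names (`p_c`, `p_heads`, `prob`) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical setup: C selects a coin (0 or 1) with equal probability;
# A and B are "heads" on two separate flips of the selected coin.
p_c = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_heads = {0: Fraction(1, 10), 1: Fraction(9, 10)}

# Joint distribution over outcomes (a, b, c), each coordinate 0 or 1.
joint = {}
for a, b, c in product([0, 1], repeat=3):
    pa = p_heads[c] if a else 1 - p_heads[c]
    pb = p_heads[c] if b else 1 - p_heads[c]
    joint[(a, b, c)] = p_c[c] * pa * pb


def prob(event):
    """Sum the joint probability over outcomes (a, b, c) where event holds."""
    return sum(p for outcome, p in joint.items() if event(*outcome))


# Probabilities conditional on C = 1, computed from the joint table.
pc1 = prob(lambda a, b, c: c == 1)
pa_c = prob(lambda a, b, c: a == 1 and c == 1) / pc1
pb_c = prob(lambda a, b, c: b == 1 and c == 1) / pc1
pab_c = prob(lambda a, b, c: a == 1 and b == 1 and c == 1) / pc1
cond_indep = pab_c == pa_c * pb_c

# The same check without conditioning on C.
p_a = prob(lambda a, b, c: a == 1)
p_b = prob(lambda a, b, c: b == 1)
p_ab = prob(lambda a, b, c: a == 1 and b == 1)
marg_indep = p_ab == p_a * p_b

print(cond_indep, marg_indep)
```

In this particular setup the product rule holds given C but fails without conditioning, which shows that conditional independence and ordinary independence are separate properties.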
What is the assumption made when computing the Pearson correlation coefficient?
- The correlation is zero
- The variables are independent
- The variables are normally distributed
- There is a linear relationship between variables
When computing the Pearson correlation coefficient, it is assumed that there is a linear relationship between the variables. Furthermore, it's also assumed that the variables are continuous and that the data is homoscedastic (i.e., the variance of the errors is the same across all levels of the variables).
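To see why the linearity assumption matters, here is a minimal sketch that computes Pearson's r by hand on two made-up datasets: one exactly linear and one exactly quadratic. The helper `pearson_r` and the data are illustrative assumptions, not part of the question.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # perfectly linear in x
curved = [x ** 2 for x in xs]      # perfectly determined by x, but not linear

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(r_linear, r_curved)
```

The quadratic data are fully determined by x, yet r comes out at zero, because Pearson's coefficient only captures the linear part of a relationship.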
How is the variance related to the standard deviation in a data set?
- The variance is the average of the standard deviation
- The variance is the square of the standard deviation
- The variance is the square root of the standard deviation
- The variance is twice the standard deviation
The variance is the square of the standard deviation. Both measure the dispersion of a dataset, but in different units: the standard deviation is in the same units as the data, while the variance is in those units squared.
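A quick check with Python's standard `statistics` module, using a made-up sample whose population variance is 4 and whose population standard deviation is 2:

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample, mean 5

sd = statistics.pstdev(data)       # population standard deviation
var = statistics.pvariance(data)   # population variance

# The variance equals the squared standard deviation (up to float rounding).
same = math.isclose(var, sd ** 2)
print(sd, var, same)
```

The same identity holds for the sample versions, `statistics.stdev` and `statistics.variance`.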
What does kurtosis measure in a dataset?
- Central tendency
- Dispersion
- Skewness
- The "tailedness" of the distribution
Kurtosis is a statistical measure that defines how heavily the tails of a distribution differ from the tails of a normal distribution. In other words, kurtosis identifies whether the tails of a given distribution contain extreme values.
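As a rough sketch, excess kurtosis can be computed as the fourth standardized moment minus 3, so that a normal distribution scores 0. The function name and the two toy datasets below are assumptions chosen for illustration.

```python
import statistics


def excess_kurtosis(xs):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    mean = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mean) ** 2 for x in xs])
    m4 = statistics.fmean([(x - mean) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


light_tails = [-1.0, 1.0] * 50             # no extreme values at all
heavy_tails = [0.0] * 98 + [-10.0, 10.0]   # mostly zeros plus two extremes

k_light = excess_kurtosis(light_tails)
k_heavy = excess_kurtosis(heavy_tails)
print(k_light, k_heavy)
```

The dataset with two extreme values gets a large positive excess kurtosis, while the two-point dataset gets a negative one.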
How does Spearman's Rank Correlation react to outliers as compared to Pearson's correlation?
- Both are equally sensitive to outliers
- Less sensitive to outliers
- More sensitive to outliers
- Neither is sensitive to outliers
Spearman's Rank Correlation is less sensitive to outliers than Pearson's correlation. This is because Spearman's correlation is based on rank orders rather than raw data values, making it more robust against outliers.
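The difference can be sketched on a small made-up dataset in which the last y value is an outlier that keeps the ordering intact. The helpers `pearson`, `ranks`, and `spearman` are hand-written for illustration, and `ranks` assumes there are no tied values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation on raw values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # last point is an outlier

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

The outlier pulls Pearson's coefficient well below 1, while Spearman's stays at 1 because the ranks are unchanged.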
What does a null hypothesis represent in statistical testing?
- A condition of no effect or no difference
- A specific outcome of the experiment
- An effect or difference exists
- The sample size is large enough for the test
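The null hypothesis is the default claim in a statistical test: that there is no effect or no difference, for example that two group means are equal or that the observed data match the expected values. The test then asks whether the data provide enough evidence to reject it.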
How does the rate parameter affect the shape of a Poisson distribution?
- All of the above
- It determines the kurtosis of the distribution
- It determines the skewness of the distribution
- It does not affect the shape of the distribution
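The rate parameter (lambda) in a Poisson distribution is the average number of occurrences per interval. It is both the mean and the variance of the distribution, so it directly affects the shape. The skewness is 1/sqrt(lambda) and the excess kurtosis is 1/lambda, so higher lambda values give distributions that are less skewed to the right and closer to a normal shape.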
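The effect of lambda on skewness can be checked numerically. The sketch below computes the skewness from the Poisson probability mass function, truncated at an assumed cutoff `k_max` that is large enough for the lambda values used, and compares it with the known value 1/sqrt(lambda). Function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson(lam) variable, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_skewness(lam, k_max=200):
    """Skewness from the pmf, truncated at k_max (an assumption of this sketch)."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5


skews = {lam: poisson_skewness(lam) for lam in (1.0, 4.0, 16.0)}
for lam, skew in skews.items():
    print(lam, skew, 1.0 / math.sqrt(lam))
```

Each fourfold increase in lambda halves the skewness, which matches the right tail becoming less pronounced as lambda grows.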
How does the Central Limit Theorem relate to the use of Z-tests?
- It allows for the assumption that the sample mean distribution is normally distributed
- It enables the calculation of the sample standard deviation
- It increases the power of the test
- It reduces the impact of outliers in the sample
The Central Limit Theorem states that, with a large enough sample size, the distribution of the sample mean will be approximately normally distributed. This allows us to use Z-tests even when the population is not normally distributed.
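A small simulation can make this concrete. The sketch below assumes an Exponential(1) population, which is strongly right-skewed with mean 1 and standard deviation 1, and checks that means of samples of size `n` behave roughly like a normal variable with standard error 1/sqrt(n). The sample size, repetition count, and seed are arbitrary choices.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

n = 50           # observations per sample (assumed "large enough" here)
reps = 5000      # number of simulated samples

# Draw many samples from the skewed population and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.fmean(sample_means)
spread = statistics.pstdev(sample_means)
expected_spread = 1.0 / math.sqrt(n)   # population sd divided by sqrt(n)

# Share of sample means within 1.96 standard errors of the true mean;
# for a normal distribution this share is about 0.95.
within = sum(
    abs(m - 1.0) <= 1.96 * expected_spread for m in sample_means
) / reps

print(center, spread, expected_spread, within)
```

If the printed share is close to 0.95, the normal approximation behind the Z-test is working reasonably well for this sample size, even though the population itself is far from normal.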
In what kind of scenario is the Central Limit Theorem used?
- It's used only when dealing with a uniform distribution.
- It's used to determine whether an event will occur.
- It's used to predict the future.
- It's used when we want to make inferences about a population based on a sample.
The Central Limit Theorem (CLT) is often used in scenarios where we are interested in the average outcome of a large number of independent or nearly independent events. This is commonly the case when we are making inferences about a population based on a sample.
What does a residual plot tell us about the fit of the model?
- It indicates how well the model's predictions match the actual data
- It indicates the variance of the residuals
- It shows the correlation between the dependent and independent variables
- It shows the relationship between the dependent and independent variables
A residual plot shows the residuals on the y-axis and the independent variable (or the fitted values) on the x-axis. If the points are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data; a systematic pattern, such as a curve, suggests that a non-linear model may be more appropriate.
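To illustrate what a patterned residual plot looks like in numbers, the sketch below fits a straight line by ordinary least squares to made-up quadratic data and lists the residuals that would go on the y-axis. The helper `fit_line` and the data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return slope, intercept


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]   # curved data, deliberately not linear

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Residuals are positive at both ends and negative in the middle:
# a U-shaped pattern rather than random scatter around zero.
print(residuals)
```

Plotting `residuals` against `xs` would show the U shape directly, which is the kind of systematic pattern that points away from a straight-line model.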