What is the interpretation of a 95% confidence interval that contains zero?

  • The sample mean is significantly different from zero
  • The sample size was not large enough to determine a precise estimate of the population parameter
  • There is a 95% chance that the true population parameter is zero
  • There is no significant evidence to suggest that the true population parameter is different from zero
If a 95% confidence interval includes zero, zero is a plausible value for the parameter, so there is no significant evidence at the 5% level that the true population parameter differs from zero. In hypothesis-testing terms, a 95% confidence interval that includes zero corresponds to failing to reject the null hypothesis of no effect in a two-sided test at the 5% significance level. This does not show that the parameter is zero, only that the data cannot rule it out.
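As a rough illustration, the check can be sketched in a few lines of Python. The data below are made up, the helper name `mean_ci` is hypothetical, and the interval uses a normal approximation rather than the t distribution, so treat it as a sketch and not a recommended procedure for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(sample, level=0.95):
    """Normal-approximation confidence interval for the mean (an assumption;
    a t-based interval would be slightly wider for small samples)."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return m - z * se, m + z * se


# Hypothetical paired differences (after minus before)
diffs = [0.8, -1.2, 0.5, 1.9, -0.7, 0.3, -1.5, 1.1, 0.2, -0.4]
low, high = mean_ci(diffs)
contains_zero = low <= 0 <= high
print(f"95% CI: ({low:.2f}, {high:.2f}); contains zero: {contains_zero}")
```

Here the interval straddles zero, so a two-sided test at the 5% level would fail to reject a mean difference of zero.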

What is an interaction effect in regression analysis?

  • It's when one variable has a stronger effect than another
  • It's when the effect of one variable changes based on the level of another variable
  • It's when two variables have no effect on each other
  • It's when two variables have the same effect on the dependent variable
An interaction effect in regression analysis is when the effect of one independent variable on the dependent variable changes based on the level of another independent variable. This is captured by including an interaction term in the regression model.
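A minimal sketch of what the interaction term does, assuming a model of the form y = b0 + b1·x1 + b2·x2 + b3·x1·x2. The coefficient values and function names are invented for illustration only.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Prediction from a model with an interaction term b3 * x1 * x2.
    All coefficients are hypothetical."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2):
    """Change in the prediction when x1 goes from 0 to 1, holding x2 fixed."""
    return predict(1, x2) - predict(0, x2)


# The effect of x1 equals b1 + b3 * x2, so it depends on the level of x2.
for x2 in (0, 1, 2):
    print(f"x2 = {x2}: effect of a one-unit increase in x1 = {effect_of_x1(x2)}")
```

Without the interaction term (b3 = 0), the effect of x1 would be the same at every level of x2.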

How can you detect multicollinearity in multiple linear regression?

  • By checking the correlation among predictors
  • By checking the normality of residuals
  • By looking at the scatter plot of residuals
  • By using the F-test
Multicollinearity can be detected by examining the correlations among the predictors: a high pairwise correlation is a sign of multicollinearity. Pairwise correlations can miss cases where one predictor is nearly a linear combination of several others, so more formal diagnostics such as the Variance Inflation Factor (VIF) are also used.
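For the special case of exactly two predictors, the VIF reduces to 1 / (1 - r²), where r is their Pearson correlation. The sketch below uses that shortcut with made-up data; with more predictors, each VIF would instead come from regressing one predictor on all the others. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has only two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors: x2 nearly duplicates x1, x3 does not.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1.1, 2.3, 2.8, 4.2, 4.9, 6.1, 7.2, 7.8]
x3 = [5, 1, 4, 2, 8, 3, 7, 6]

print(f"VIF for (x1, x2): {vif_two_predictors(x1, x2):.1f}")
print(f"VIF for (x1, x3): {vif_two_predictors(x1, x3):.2f}")
```

A commonly quoted rule of thumb treats VIF values above roughly 5 or 10 as a warning sign, though the cutoff is a convention and not a strict test.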

A __________ is the difference between the observed value and the predicted value of the response variable in regression analysis.

  • Coefficient
  • Error term
  • Outlier
  • Residual
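In the context of regression analysis, the difference between the observed value and the predicted value of the response variable is called a "residual". The error term, by contrast, is the unobservable deviation of an observation from the true regression line; residuals are its sample estimates.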

The ________ distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space.

  • Exponential
  • Gaussian
  • Poisson
  • Uniform
The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
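The probability mass function is P(X = k) = λ^k · e^(−λ) / k!, where λ is the mean number of events per interval. A small sketch, with an arbitrary rate of 3 events per interval chosen for illustration:

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean rate per interval is lam."""
    return lam ** k * exp(-lam) / factorial(k)


lam = 3.0  # hypothetical mean rate, e.g. 3 calls per minute
for k in range(6):
    print(f"P(X = {k}) = {poisson_pmf(k, lam):.4f}")
```

Summing the probabilities over all k gives 1, and the mean and the variance of the distribution both equal λ.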

A 95% confidence interval means that if the same sampling method is repeated many times, then ________% of the confidence intervals will contain the true population parameter.

  • 50
  • 75
  • 90
  • 95
A 95% confidence interval means that if we were to draw a large number of samples and calculate a confidence interval from each one, we would expect about 95% of those intervals to contain the true population parameter. The parameter is fixed; it is the interval that varies from sample to sample.
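The repeated-sampling idea can be checked with a small simulation. This sketch assumes normally distributed data with a known standard deviation (so a z-interval is appropriate); the population values, sample size, and function name are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(true_mean=10.0, sigma=2.0, n=30, trials=2000, level=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)  # sigma is assumed known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = sum(sample) / n
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials


rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The observed share should land close to 0.95, with some run-to-run variation because the number of trials is finite.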

What is the significance of 'distance measures' in cluster analysis?

  • Distance measures determine the similarities or differences between data points
  • Distance measures help in determining the shape of the clusters
  • Distance measures help in visualizing the clusters
  • Distance measures indicate the number of clusters
Distance measures, like Euclidean distance or Manhattan distance, play a crucial role in cluster analysis. They determine the similarities or differences between data points. They influence how the clusters will be formed, as the most similar or closest data points get clustered together.
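A short sketch of the two distances named above, using made-up points. It also shows that the choice of measure can change which point counts as "closest", which is one way the measure influences how clusters form.

```python
from math import sqrt


def euclidean(p, q):
    """Straight-line distance between two points."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


origin = (0.0, 0.0)
p = (3.0, 3.0)   # hypothetical data points
q = (0.0, 4.5)

for name, dist in (("Euclidean", euclidean), ("Manhattan", manhattan)):
    nearest = "p" if dist(origin, p) < dist(origin, q) else "q"
    print(f"{name}: d(origin, p) = {dist(origin, p):.3f}, "
          f"d(origin, q) = {dist(origin, q):.3f}, nearest = {nearest}")
```

Under Euclidean distance p is nearer to the origin, while under Manhattan distance q is nearer, so the same data could be grouped differently depending on the measure.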

In what situations is Spearman's rank correlation preferred over Pearson's correlation?

  • All of the above
  • When the data contains outliers
  • When the relationship between variables is nonlinear
  • When the variables are not normally distributed
Spearman's rank correlation coefficient is a nonparametric measure of rank correlation. It's preferred over Pearson's correlation when the variables are not normally distributed, the relationship is monotonic but nonlinear, or the data contains outliers. It assesses how well a monotonic function describes the relationship between two variables, without making any assumptions about the frequency distribution of the variables. If the relationship is not monotonic (for example, U-shaped), Spearman's coefficient can be close to zero just as Pearson's can.
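One way to see the difference is to compute both coefficients on a monotonic but nonlinear relationship. The sketch below computes Spearman's coefficient as the Pearson correlation of the ranks; the simple `ranks` helper is hypothetical and assumes there are no tied values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman's coefficient as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]  # monotonic but strongly nonlinear

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Because y always increases with x, the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's coefficient falls below 1 because the relationship is not a straight line.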

What is a residual in the context of simple linear regression?

  • The difference between the observed and predicted values
  • The difference between the predicted and observed values of the independent variable
  • The error in the slope of the regression line
  • The observed value of the dependent variable
A residual is the difference between the observed value of the dependent variable (y) and the value predicted by the regression model (ŷ), that is, e = y − ŷ. It is the estimation error for that individual observation.
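A minimal sketch of computing residuals from a least-squares line, using made-up data and a hypothetical helper `fit_line`:

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]  # observed minus predicted

for xi, yi, e in zip(x, y, residuals):
    print(f"x = {xi}, y = {yi}, residual = {e:+.3f}")
```

When the model includes an intercept, least-squares residuals sum to zero (up to floating-point rounding), which is a quick sanity check on the fit.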

How do changes in the scale of measurement affect the correlation coefficient?

  • They decrease the correlation coefficient
  • They do not affect the correlation coefficient
  • They increase the correlation coefficient
  • They reverse the sign of the correlation coefficient
The correlation coefficient is not affected by changes in the center (adding a constant) or scale (multiplying by a positive constant) of the variables. This is because correlation measures the strength of the linear relationship between variables relative to their variability. It's a dimensionless quantity, so a change of measurement units, such as centimeters to inches or Celsius to Fahrenheit, does not change it.
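This invariance is easy to confirm numerically. The sketch below uses made-up temperature and sales figures and converts Celsius to Fahrenheit, which changes both the center and the scale of one variable.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: daily temperature and ice cream sales
celsius = [12.0, 15.5, 18.0, 21.0, 24.5, 27.0, 30.0]
sales = [110, 135, 160, 150, 210, 240, 255]

fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # shift and positive rescale

r_celsius = pearson(celsius, sales)
r_fahrenheit = pearson(fahrenheit, sales)
print(f"r with Celsius:    {r_celsius:.6f}")
print(f"r with Fahrenheit: {r_fahrenheit:.6f}")
```

The two values agree up to floating-point rounding. Note that the invariance holds for positive rescaling, which covers any change of units; multiplying one variable by a negative constant would flip the sign of the coefficient while leaving its magnitude unchanged.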