Which common statistical test is considered non-parametric?

  • ANOVA
  • Chi-Square Test
  • Linear Regression
  • t-test
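The Chi-Square Test is a common non-parametric test. It is used to analyze categorical (count) data and does not assume that the population follows a particular distribution, such as the normal distribution. ANOVA, the t-test, and linear regression are parametric methods because they rely on distributional assumptions.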
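
As an illustration of how the test works on categorical counts, the following minimal sketch computes the Pearson chi-square statistic for a small contingency table. The table values and the function name `chi_square_statistic` are hypothetical, chosen only for this example.

```python
# Hypothetical 2x2 contingency table of counts
# (rows: group A / group B, columns: yes / no).
observed = [[30, 10],
            [20, 40]]

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

stat = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared against a chi-square distribution with the given degrees of freedom to obtain a p-value.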

Which measure of dispersion considers all the data points in a dataset?

  • Interquartile range
  • Mode
  • Range
  • Variance
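Variance is a measure of dispersion that considers all data points in the dataset. It is the average of the squared differences from the mean (for a sample, the sum of squared differences is usually divided by n - 1 rather than n). The range uses only the two extreme values, the interquartile range uses only the quartiles, and the mode is a measure of central tendency rather than dispersion.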
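
The calculation can be sketched in a few lines. The data values are hypothetical; the sketch computes both the population form (divide by n) and the sample form (divide by n - 1), and checks them against Python's `statistics` module.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

mean = sum(data) / len(data)
squared_diffs = [(x - mean) ** 2 for x in data]  # every point contributes

population_variance = sum(squared_diffs) / len(data)     # divide by n
sample_variance = sum(squared_diffs) / (len(data) - 1)   # divide by n - 1

# The range, by contrast, depends only on the two extreme values.
value_range = max(data) - min(data)

print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
print(value_range)
```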

The residuals in a simple linear regression model should be randomly distributed. This is referred to as the assumption of ________.

  • autocorrelation
  • heteroscedasticity
  • independence
  • multicollinearity
The assumption of independence in simple linear regression means that the residuals (the differences between the observed and predicted values) are not correlated with one another. That is, the error for one observation does not depend on the error for any other observation. This is typically checked by plotting the residuals against the fitted values or the order of observation and looking for any visible pattern.
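
Besides a visual check, one common numerical check for first-order autocorrelation is the Durbin-Watson statistic, which is not mentioned in the text above and is shown here only as a supplementary sketch. Values near 2 suggest little autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. The residual sequences are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift steadily upward (positively autocorrelated).
trending = [-3, -2, -1, 0, 1, 2, 3]
# Hypothetical residuals that flip sign every step (negatively autocorrelated).
alternating = [1, -1, 1, -1, 1, -1]

print(f"trending:    {durbin_watson(trending):.3f}")
print(f"alternating: {durbin_watson(alternating):.3f}")
```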

What is the interpretation of a 95% confidence interval that contains zero?

  • The sample mean is significantly different from zero
  • The sample size was not large enough to determine a precise estimate of the population parameter
  • There is a 95% chance that the true population parameter is zero
  • There is no significant evidence to suggest that the true population parameter is different from zero
If a 95% confidence interval includes zero, there is no significant evidence, at the 5% significance level, that the true population parameter is different from zero. In hypothesis-testing terms, we fail to reject the null hypothesis that the parameter equals zero. This does not mean the parameter is zero, nor that there is a 95% chance it is zero.
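
A small worked example may help. The sketch below builds a 95% t-interval for the mean of some hypothetical paired differences and checks whether it contains zero. The data are made up, and the critical value 2.262 (t distribution, 9 degrees of freedom) is hard-coded from a standard t table.

```python
import math
import statistics

# Hypothetical paired differences (e.g. after minus before); illustrative only.
differences = [0.5, -1.2, 0.8, 1.5, -0.7, 0.3, -0.4, 1.1, -0.9, 0.2]

n = len(differences)
mean = statistics.mean(differences)
std_err = statistics.stdev(differences) / math.sqrt(n)

# Two-sided 95% critical value of the t distribution with n - 1 = 9
# degrees of freedom, taken from a standard t table (hard-coded assumption).
T_CRIT_95_DF9 = 2.262

lower = mean - T_CRIT_95_DF9 * std_err
upper = mean + T_CRIT_95_DF9 * std_err
contains_zero = lower <= 0 <= upper

print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("Interval contains zero:", contains_zero)
```

Here the interval straddles zero, so these data alone would not justify rejecting a null hypothesis of zero mean difference at the 5% level.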

What is an interaction effect in regression analysis?

  • It's when one variable has a stronger effect than another
  • It's when the effect of one variable changes based on the level of another variable
  • It's when two variables have no effect on each other
  • It's when two variables have the same effect on the dependent variable
An interaction effect in regression analysis is when the effect of one independent variable on the dependent variable changes based on the level of another independent variable. This is captured by including an interaction term in the regression model.
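
To make the idea concrete, the sketch below uses a model of the form y = b0 + b1*x1 + b2*x2 + b3*(x1*x2), where the last term is the interaction term. The coefficient values are hypothetical and not fitted to any data; the point is only that the effect of x1 on y equals b1 + b3*x2, so it changes with the level of x2.

```python
# Hypothetical coefficients for y = b0 + b1*x1 + b2*x2 + b3*(x1*x2).
B0, B1, B2, B3 = 1.0, 2.0, 0.5, 1.5

def predict(x1, x2):
    """Prediction from the model, including the interaction term x1*x2."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def effect_of_x1(x2):
    """Change in y for a one-unit increase in x1, holding x2 fixed.
    Algebraically this equals B1 + B3 * x2."""
    return predict(1.0, x2) - predict(0.0, x2)

for x2 in (0.0, 1.0, 2.0):
    print(f"x2 = {x2}: effect of x1 on y = {effect_of_x1(x2)}")
```

With B3 set to zero, the effect of x1 would be the same at every level of x2, which is the no-interaction case.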

A 95% confidence interval means that if the same sampling method is repeated many times, then ________% of the confidence intervals will contain the true population parameter.

  • 50
  • 75
  • 90
  • 95
A 95% confidence interval means that if we were to take a large number of samples and calculate a confidence interval from each one, we would expect about 95% of those intervals to contain the true population parameter. The parameter itself is fixed; it is the interval that varies from sample to sample.
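
This repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a normal population with a known standard deviation, so a z-interval applies; the population values, sample size, and number of repetitions are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 10.0    # hypothetical population mean
SIGMA = 2.0         # population standard deviation, assumed known
SAMPLE_SIZE = 30
REPETITIONS = 2000

z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
margin = z * SIGMA / math.sqrt(SAMPLE_SIZE)

hits = 0
for _ in range(REPETITIONS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    # Does this sample's interval contain the fixed true mean?
    if sample_mean - margin <= TRUE_MEAN <= sample_mean + margin:
        hits += 1

coverage = hits / REPETITIONS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, with some simulation noise.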

What is the significance of 'distance measures' in cluster analysis?

  • Distance measures determine the similarities or differences between data points
  • Distance measures help in determining the shape of the clusters
  • Distance measures help in visualizing the clusters
  • Distance measures indicate the number of clusters
Distance measures, like Euclidean distance or Manhattan distance, play a crucial role in cluster analysis. They determine the similarities or differences between data points. They influence how the clusters will be formed, as the most similar or closest data points get clustered together.
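
The two distances named above can be sketched as follows. The points and the labels "A" and "B" are hypothetical. The example is chosen so that the two measures disagree about which candidate is closest, which shows how the choice of distance can change cluster membership.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def nearest(point, centers, distance):
    """Label of the center closest to `point` under the given distance."""
    return min(centers, key=lambda label: distance(point, centers[label]))

point = (0.0, 0.0)
centers = {"A": (3.0, 3.0), "B": (0.0, 5.0)}  # hypothetical cluster centers

print("Euclidean nearest:", nearest(point, centers, euclidean))
print("Manhattan nearest:", nearest(point, centers, manhattan))
```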

In what situations is Spearman's rank correlation preferred over Pearson's correlation?

  • All of the above
  • When the data contains outliers
  • When the relationship between variables is nonlinear
  • When the variables are not normally distributed
Spearman's rank correlation coefficient is a nonparametric measure of rank correlation. It is preferred over Pearson's correlation when the variables are not normally distributed, when the relationship is monotonic but nonlinear, or when the data contain outliers. Because it is computed on ranks, it assesses how well a monotonic function describes the relationship between two variables, without making assumptions about the distribution of the variables.
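
The difference is easy to see on a monotonic but nonlinear relationship. The sketch below computes Spearman's coefficient as the Pearson correlation of the ranks (with average ranks for ties). The data, y = x cubed, are a hypothetical example, and the helper names are chosen for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

x = list(range(1, 11))
y = [v ** 3 for v in x]  # monotonic but clearly nonlinear

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Spearman's coefficient is 1 here because the ordering of the points is perfectly preserved, while Pearson's coefficient falls below 1 because the relationship is not a straight line.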

A __________ is the difference between the observed value and the predicted value of the response variable in regression analysis.

  • Coefficient
  • Error term
  • Outlier
  • Residual
In the context of regression analysis, the difference between the observed value and the predicted value of the response variable is called a "residual".
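
As a brief sketch, residuals can be computed by fitting a least-squares line and subtracting each predicted value from the corresponding observed value. The data points are hypothetical.

```python
# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Ordinary least-squares estimates for the line y = intercept + slope * x.
slope = (
    sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    / sum((xi - mean_x) ** 2 for xi in x)
)
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]
# Residual = observed value minus predicted value.
residuals = [yi - pi for yi, pi in zip(y, predicted)]

for xi, yi, pi, ri in zip(x, y, predicted, residuals):
    print(f"x={xi}  observed={yi:.2f}  predicted={pi:.2f}  residual={ri:+.2f}")
```

For a least-squares line with an intercept, the residuals sum to zero up to rounding error.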

The ________ distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space.

  • Exponential
  • Gaussian
  • Poisson
  • Uniform
The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
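
The Poisson probability of observing exactly k events when the mean rate per interval is lambda is lambda^k * e^(-lambda) / k!. The sketch below evaluates this formula for a hypothetical rate of 3 events per interval; the function name `poisson_pmf` is chosen for illustration.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events when the mean per interval is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

RATE = 3.0  # hypothetical mean number of events per interval

for k in range(6):
    print(f"P(X = {k}) = {poisson_pmf(k, RATE):.4f}")

# The probabilities over all possible counts sum to 1
# (truncated here at k = 50, where the remaining tail is negligible).
total = sum(poisson_pmf(k, RATE) for k in range(51))
print(f"Sum over k = 0..50: {total:.6f}")
```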