Pearson's Correlation Coefficient measures the ________ between two variables.

  • causal relationship
  • linear correlation
  • percentage similarity
  • rank
Pearson's Correlation Coefficient measures the linear correlation between two variables. It quantifies the strength and direction of their linear relationship on a scale from -1 to 1.
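
As a minimal sketch of what "linear" means here, the coefficient can be computed by hand and compared on two made-up data sets. The function name `pearson_r` and the data are illustrative assumptions, not part of the quiz.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y divided by the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [-2, -1, 0, 1, 2]
r_linear = pearson_r(x, [2 * v + 1 for v in x])  # points on a straight line: r is 1.0
r_curved = pearson_r(x, [v ** 2 for v in x])     # exact but non-linear relationship: r is 0.0
print(r_linear, r_curved)
```

The second case shows that a coefficient of 0 rules out a linear relationship only; the variables can still be strongly related in another way.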

What does it mean when we say a non-parametric test makes fewer assumptions about the data distribution?

  • The data distribution must be known
  • The data does not have to follow a specific distribution, such as normal
  • The data must be normally distributed
  • The data must be uniformly distributed
When we say a non-parametric test makes fewer assumptions about the data distribution, we mean that the data does not have to follow a specific distribution, such as the normal distribution. Non-parametric tests are often called distribution-free tests: they do not require the variables to follow a particular probability distribution, although they may still rely on other assumptions, such as independent observations.

What are the components of a confidence interval?

  • The population mean, the margin of error, and the level of confidence
  • The population mean, the sample size, and the standard error
  • The sample mean, the margin of error, and the level of confidence
  • The sample mean, the population size, and the standard deviation
A confidence interval is composed of three parts: a point estimate (the sample mean), a margin of error (which depends on the standard error and the Z-value or T-value), and the level of confidence (the long-run proportion of intervals, built this way from repeated samples, that contain the population parameter).
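
The three parts can be seen in a short sketch. It assumes a Z-based interval for simplicity (with a sample this small a T-value would normally be used and would give a slightly wider interval), and the function name and measurements are made up for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, margin_of_error, lower, upper) for the mean."""
    n = len(sample)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return point_estimate, margin, point_estimate - margin, point_estimate + margin

sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]  # made-up measurements
estimate, margin, lower, upper = z_confidence_interval(sample)
print(f"{estimate:.2f} +/- {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
```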

Bayes' theorem combines our prior knowledge about an event with evidence from data to provide a ________ probability.

  • joint
  • marginal
  • posterior
  • prior
Bayes' theorem combines our prior knowledge (the prior probability) and evidence (the likelihood) to provide a new, updated probability of an event (the posterior probability).
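
A small worked example may help. The screening-test numbers below are hypothetical, and `posterior` is an illustrative helper for a yes/no hypothesis, not a library function.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from Bayes' theorem for a binary hypothesis H and evidence E."""
    # Marginal probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false-positive rate
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # about 0.161
```

Under these assumed numbers a positive result raises the probability from 1% to roughly 16%, which is the update from prior to posterior that the theorem describes.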

An event that cannot possibly occur has a probability of ________.

  • -1
  • 0
  • 0.5
  • 1
An event that cannot possibly occur is said to be impossible and has a probability of 0. This is in line with the definition of probability as a measure that takes values between 0 and 1, inclusive.

What is the Central Limit Theorem and how does it relate to point and interval estimation?

  • It implies that every data set is symmetrically distributed, which affects the reliability of point and interval estimations
  • It suggests that all data has a central tendency and this affects the point and interval estimations
  • It suggests that as sample size increases, the distribution of sample means approaches a normal distribution, which affects how we estimate population parameters
  • It suggests that every large enough dataset is normally distributed, which is the foundation of point and interval estimations
The Central Limit Theorem states that when you have a sufficiently large sample, the distribution of the sample mean approximates a normal distribution, regardless of the shape of the population distribution (provided it has finite variance). This allows us to make inferences about the population parameters using the sample mean and the standard error, which form the basis of point and interval estimation.
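
A rough simulation can make this visible. The sketch below assumes an exponential population (strongly right-skewed) and tracks how the skewness of the sample means shrinks as the sample size grows; the function names, seed, and sample sizes are arbitrary choices for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, n_samples=2000):
    """Means of repeated samples from an exponential population with mean 1."""
    return [
        sum(random.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

def skewness(values):
    """Average cubed z-score: 0 for a symmetric shape, positive for a right tail."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

for n in (2, 30, 200):
    means = sample_means(n)
    # Theory: centre near 1.0, spread near 1 / sqrt(n), skewness shrinking toward 0
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(skewness(means), 2))
```

The spread column is the standard error, which is the quantity interval estimates are built from.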

What happens to the width of a confidence interval as the confidence level increases?

  • It decreases
  • It fluctuates unpredictably
  • It increases
  • It stays the same
The width of a confidence interval increases as the confidence level increases. A higher confidence level means that you want to be more sure that you are capturing the true population parameter, which requires a wider interval.
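
To see the effect in numbers, the sketch below holds the standard error fixed at a hypothetical value of 1.0 and changes only the confidence level. It assumes a Z-based interval; `critical_z` is an illustrative helper.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

standard_error = 1.0  # hypothetical, held fixed so only the level changes
widths = {
    level: 2 * critical_z(level) * standard_error
    for level in (0.90, 0.95, 0.99)
}
for level, width in widths.items():
    print(level, round(width, 2))  # roughly 3.29, 3.92, 5.15
```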

The presence of a pattern in the residuals of a multiple linear regression model can indicate violations of the ________ assumption.

  • homoscedasticity
  • independence
  • linearity
  • normality
The presence of a pattern in the residuals of a multiple linear regression model can indicate a violation of the independence assumption. This assumption requires that the residuals, which are the differences between the observed and predicted values of the dependent variable, are independent of each other. If a pattern is observed in the residuals, for example a trend when they are plotted in observation order, the residuals may not be independent and the model may not provide valid results. Depending on its shape, a pattern can also point to other problems, such as curvature (non-linearity) or a funnel shape (non-constant variance).
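
One common numeric check for dependence between consecutive residuals is the Durbin-Watson statistic. It is not named in the quiz, so treat the sketch below as one possible way to quantify a pattern; the residual values are made up.

```python
def durbin_watson(residuals):
    """Near 2: little lag-1 autocorrelation; toward 0: positive; toward 4: negative."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den

trending = [-3, -2, -1, 0, 1, 2, 3]     # made-up residuals that drift upward
alternating = [1, -1, 1, -1, 1, -1, 1]  # made-up residuals that flip sign each step

print(round(durbin_watson(trending), 3))     # well below 2
print(round(durbin_watson(alternating), 3))  # well above 2
```

The statistic only looks at neighbouring residuals, so it is meaningful when the observations have a natural order, such as time.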

What are the common techniques used for model selection in multiple regression?

  • Chi-square test
  • F-test
  • Forward selection, backward elimination, and stepwise regression
  • T-test
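Techniques like forward selection, backward elimination, and stepwise regression are commonly used for model selection in multiple regression. Forward selection adds predictors one at a time, backward elimination starts with all predictors and removes them one at a time, and stepwise regression combines the two by allowing predictors to be added or removed at each step.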

How is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy used in factor analysis?

  • It is used to assess the appropriateness of factor analysis
  • It is used to determine the number of factors to retain
  • It is used to test the assumption of homoscedasticity
  • It is used to test the assumption of normality
The Kaiser-Meyer-Olkin (KMO) measure indicates how suitable the data is for factor analysis. It can be computed for each observed variable and for the complete model. KMO values vary between 0 and 1. A value near 0 indicates that the sum of partial correlations is large relative to the sum of correlations, implying diffusion in the pattern of correlations (hence, factor analysis is likely to be inappropriate). A value near 1 indicates that the correlations are compact and factor analysis is likely to be appropriate.
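
For reference, the overall measure is usually written in terms of squared values. The symbols here are notation chosen for this note: r_ij is the correlation between variables i and j, and p_ij is their partial correlation after controlling for the remaining variables.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}
```

When the partial correlations dominate, the denominator grows and the ratio moves toward 0; when they are small, the ratio moves toward 1.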

How does the sample size affect the width of the confidence interval?

  • Larger sample size makes the interval narrower
  • Larger sample size makes the interval wider
  • Sample size has no effect on the interval
Larger sample sizes reduce the standard error, so the confidence interval becomes narrower. This means that with larger samples, our estimates are more precise.
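
Because the standard error is the standard deviation divided by the square root of the sample size, quadrupling the sample halves the width. The sketch assumes a known standard deviation of 10 and a fixed 95% Z-value, both chosen only for illustration.

```python
import math

def interval_width(sigma, n, z=1.96):
    """Full width of a Z-based interval: 2 * z * sigma / sqrt(n)."""
    return 2 * z * sigma / math.sqrt(n)

widths_by_n = {n: interval_width(sigma=10.0, n=n) for n in (25, 100, 400)}
for n, width in widths_by_n.items():
    print(n, round(width, 2))  # 7.84, 3.92, 1.96
```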

The Sign Test ignores the ________ of the differences between paired observations.

  • direction
  • distribution
  • magnitude
The Sign Test ignores the magnitude of the differences between paired observations, and only considers the sign of the differences.
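
A minimal sketch of an exact two-sided sign test shows this directly: two made-up data sets with very different sizes of change but the same signs give the same result. The function name and data are illustrative assumptions.

```python
from math import comb

def sign_test(before, after):
    """Two-sided exact sign test; returns (n_positive, n_negative, p_value)."""
    diffs = [a - b for b, a in zip(before, after)]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)  # zero differences (ties) are dropped
    n = pos + neg
    k = min(pos, neg)
    # Under the null hypothesis each sign is a fair coin flip: Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2 * tail)

before = [10, 12, 9, 11, 14, 13, 10, 12]
small_gains = [b + 1 for b in before[:7]] + [11]   # seven changes of +1, one of -1
large_gains = [b + 50 for b in before[:7]] + [11]  # seven changes of +50, one of -1

print(sign_test(before, small_gains))  # (7, 1, 0.0703125)
print(sign_test(before, large_gains))  # identical: only the signs matter
```

A test that also uses the size of the differences, such as the Wilcoxon signed-rank test, would treat these two data sets differently.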