How is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy used in factor analysis?
- It is used to assess the appropriateness of factor analysis
- It is used to determine the number of factors to retain
- It is used to test the assumption of homoscedasticity
- It is used to test the assumption of normality
The Kaiser-Meyer-Olkin (KMO) measure indicates how suitable the data are for factor analysis. It can be computed for each observed variable and for the complete model. KMO values range between 0 and 1. A value near 0 indicates that the sum of squared partial correlations is large relative to the sum of squared correlations, implying a diffuse pattern of correlations, so factor analysis is likely to be inappropriate. A value near 1 indicates the opposite: the correlations are relatively compact and factor analysis is likely to be appropriate.
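The overall KMO value can be sketched from a correlation matrix as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The example below is a minimal illustration using only the standard library; the function names `invert` and `kmo_overall` and the example matrix are assumptions made for this sketch, not part of any particular package.

```python
import math


def invert(matrix):
    """Invert a square matrix with Gauss-Jordan elimination (partial pivoting)."""
    n = len(matrix)
    # Augment the matrix with the identity: [matrix | I]
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo_overall(corr):
    """Overall KMO from a correlation matrix (list of lists)."""
    n = len(corr)
    inv = invert(corr)
    sum_r2 = 0.0  # squared correlations, off-diagonal
    sum_p2 = 0.0  # squared partial correlations, off-diagonal
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_p2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_p2)


# Hypothetical correlation matrix: three variables, all pairwise r = 0.5
example_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
kmo_value = kmo_overall(example_corr)
print(round(kmo_value, 4))
```

The sketch assumes the correlation matrix is invertible and has at least one non-zero off-diagonal entry; a real analysis would normally rely on an established statistics package.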
When a distribution has a long tail on the right, it is said to be ________ skewed.
- Negatively
- Normally
- Positively
- Uniformly
When a distribution has a long tail on the right, it is said to be positively skewed or right-skewed. In a positively skewed distribution, the mean is typically greater than the median, which is greater than the mode.
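As a rough check of the sign convention, the following sketch computes a moment-based skewness coefficient for a small made-up sample with one large value on the right. The helper name `skewness` and both data sets are assumptions chosen for illustration.

```python
import statistics


def skewness(values):
    """Moment-based skewness: mean cubed deviation over the cubed population SD."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sample with a long right tail (one large value)
right_tailed = [1, 2, 2, 3, 3, 4, 15]
symmetric = [1, 2, 3, 4, 5]

skew_right = skewness(right_tailed)
skew_sym = skewness(symmetric)

print(skew_right > 0)  # positive skew for the right-tailed sample
print(statistics.mean(right_tailed) > statistics.median(right_tailed))
```

In this sample the single large value pulls the mean above the median, which is the pattern described above.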
A random variable that takes a finite or countably infinite number of values is known as a ________ random variable.
- Continuous
- Dependent
- Discrete
- Normal
A discrete random variable is one that can take only a countable number of distinct values, so its possible values can be listed one by one. For example, you can count the change in your pocket, the money in your bank account, or the number of heads in 50 coin tosses. These are all examples of discrete random variables.
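The coin-toss example can be made concrete: the number of heads in 50 fair tosses takes only the values 0 through 50, and each value has its own probability. The sketch below assumes a fair coin and uses the binomial formula; the function name `binomial_pmf` is chosen for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Number of heads in 50 tosses of a fair coin (assumed p = 0.5)
pmf = [binomial_pmf(k, 50, 0.5) for k in range(51)]

print(len(pmf))            # 51 possible values: 0, 1, ..., 50
print(round(sum(pmf), 6))  # the probabilities sum to 1
```

Because the possible values can be listed, the distribution is fully described by this finite table of probabilities.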
A situation where two or more independent variables in a regression model are highly correlated is known as ________.
- autocorrelation
- heteroscedasticity
- homoscedasticity
- multicollinearity
Multicollinearity refers to a situation in which two or more independent variables in a regression model are highly linearly related. This can lead to unstable estimates of the regression coefficients and make it difficult to assess the effect of independent variables on the dependent variable.
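One common diagnostic is the variance inflation factor (VIF). With exactly two predictors it reduces to 1 / (1 - r^2), where r is their Pearson correlation. The sketch below covers only that two-predictor case; the data and the function names are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: x2 is roughly 2 * x1, x3 is nearly unrelated to x1
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
x3 = [1, -1, 1, -1, 1, -1]

vif_collinear = vif_two_predictors(x1, x2)
vif_unrelated = vif_two_predictors(x1, x3)
print(vif_collinear > vif_unrelated)
```

A large VIF means the variance of the coefficient estimate is inflated by the overlap between predictors, which is the instability described above. With more than two predictors, each VIF requires regressing one predictor on all the others.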
How does multiple linear regression differ from simple linear regression?
- Multiple linear regression cannot handle categorical variables, simple linear regression can
- Multiple linear regression is not suitable for prediction tasks
- Multiple linear regression requires a larger dataset
- Multiple linear regression uses multiple independent variables, simple linear regression only uses one
The main difference between simple and multiple linear regression is the number of independent variables. While simple linear regression uses only one independent variable to predict the dependent variable, multiple linear regression uses two or more independent variables to predict the dependent variable.
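The difference can be seen in code as the number of predictor columns passed to the same least-squares routine. The sketch below solves the normal equations with plain Gaussian elimination; the names `solve` and `fit_linear` and the noise-free data are assumptions for illustration, and a statistics library would normally be used instead.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear(columns, y):
    """Least-squares fit; returns [intercept, slope_1, slope_2, ...]."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    return solve(xtx, xty)


# Hypothetical noise-free data
x1 = [0, 1, 2, 3, 4]
x2 = [1, 0, 2, 1, 3]
y_simple = [1 + 2 * a for a in x1]                      # depends on x1 only
y_multiple = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

simple_coefs = fit_linear([x1], y_simple)          # one independent variable
multiple_coefs = fit_linear([x1, x2], y_multiple)  # two independent variables
print(simple_coefs, multiple_coefs)
```

The fitting procedure is identical in both cases; only the number of independent variables changes.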
What does the residual plot tell you in a simple linear regression analysis?
- It shows the distribution of residuals and can help identify non-linearity, unequal error variances, and outliers
- It shows the distribution of the independent variable
- It shows the relationship between the dependent and independent variables
- It tells you the strength of the correlation
A residual plot is a graph that shows the residuals on the vertical axis and the independent variable (or the fitted values) on the horizontal axis. It helps to identify non-linearity, unequal error variances (heteroscedasticity), and outliers. If the points in a residual plot are randomly dispersed around the horizontal axis with no clear pattern, a linear regression model is appropriate for the data. A curved pattern suggests that a non-linear model may be more appropriate, while a funnel shape suggests unequal error variances.
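To show what a non-random residual pattern looks like, the sketch below fits a straight line to made-up data that actually follow a curve (y = x^2) and lists the residuals that would be plotted. The function name `linear_residuals` and the data are assumptions for illustration.

```python
def linear_residuals(xs, ys):
    """Fit y = a + b*x by least squares and return the residuals y - fitted."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data that follow a curve, not a line
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

residuals = linear_residuals(xs, ys)
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. Plotted against x, they form a U shape rather than a random scatter, which signals that a straight line misses the curvature.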
Two events are said to be ________ if the occurrence of one does not affect the probability of the occurrence of the other.
- Dependent
- Exhaustive
- Independent
- Mutually exclusive
Two events are said to be "independent" if the occurrence of one does not affect the probability of the occurrence of the other. For example, if you toss a coin twice, the outcome of the first toss doesn't affect the outcome of the second toss, so the two events are independent.
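The coin example can be checked by enumerating the four equally likely outcomes of two tosses and confirming that P(A and B) = P(A) * P(B), which is the formal definition of independence. The event names below are chosen for illustration, and a fair coin is assumed.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin tosses
outcomes = list(product("HT", repeat=2))


def probability(event):
    """Probability of an event (a predicate on outcomes) under equal weights."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def first_is_heads(o):
    return o[0] == "H"


def second_is_heads(o):
    return o[1] == "H"


p_a = probability(first_is_heads)
p_b = probability(second_is_heads)
p_both = probability(lambda o: first_is_heads(o) and second_is_heads(o))

print(p_a, p_b, p_both)     # 1/2 1/2 1/4
print(p_both == p_a * p_b)  # True
```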
How does sample size impact the Mann-Whitney U test?
- Larger sample sizes make the test less reliable
- Larger sample sizes make the test more reliable
- Only equal sample sizes can be used in the test
- Sample size has no impact on the test
Larger sample sizes make the Mann-Whitney U test more reliable. As with most statistical tests, a larger sample size increases the power of the test, which is the probability that it will correctly reject a false null hypothesis.
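The effect of sample size can be illustrated, very roughly, by computing the U statistic and a normal-approximation p-value for two pairs of made-up groups that overlap to a similar degree but differ in size. This is a sketch under stated assumptions, not a power analysis: the normal approximation is crude for small samples, no tie correction is applied, and the helper names and data are hypothetical.

```python
import math


def mann_whitney_p(a, b):
    """Two-sided p-value for the U statistic via the normal approximation."""
    # U counts pairs where the value from a exceeds the value from b
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def shifted_groups(n):
    """Two groups of size n; the second is shifted up by a fixed share of the range."""
    a = [float(i) for i in range(n)]
    b = [x + 0.4 * n + 0.5 for x in a]
    return a, b


p_small = mann_whitney_p(*shifted_groups(5))
p_large = mann_whitney_p(*shifted_groups(20))
print(p_small > p_large)
```

With 5 observations per group the shift is borderline, while with 20 per group the same kind of shift produces a much smaller p-value, consistent with the higher power of larger samples.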
In which situations is it appropriate to use the Wilcoxon Signed Rank Test?
- When comparing the means of two independent groups
- When comparing the medians of two related groups
- When comparing the modes of two related groups
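The Wilcoxon Signed Rank Test is appropriate when comparing the medians of two related groups, such as paired or repeated measurements on the same subjects. It is a non-parametric alternative to the paired t-test and works on the ranks of the paired differences.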
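The core of the test is ranking the absolute paired differences and summing the ranks separately for positive and negative differences. The sketch below computes only those two rank sums, without a p-value; the function name `signed_rank_sums` and the before/after data are assumptions for illustration.

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Paired differences; zero differences are dropped, as is conventional
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired measurements on the same five subjects
before = [10, 12, 9, 14, 11]
after = [12, 15, 8, 18, 16]

w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # 14.0 1.0
```

A large imbalance between the two sums, as here, is evidence that the paired differences are not centred on zero. The test statistic is usually taken as the smaller of the two sums.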
A ________ is a graphical representation of the distribution of a dataset, typically used to visualize the frequency of data items in successive numerical intervals.
- Bar plot
- Histogram
- Line graph
- Pie chart
A histogram is a graphical representation of the distribution of a dataset, typically used to visualize the frequency of data items in successive numerical intervals. The data range is divided into a series of intervals or 'bins' and the number of data points falling within each bin is represented by the height of a bar.
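The binning step can be sketched without any plotting library: divide the range into equal-width intervals and count the values in each. The function name `histogram_counts`, the data, and the convention that the last bin includes its right edge are assumptions made for this example.

```python
def histogram_counts(values, low, high, n_bins):
    """Count values in n_bins equal-width bins covering [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # ignore values outside the range
        # Each bin is [left, right); the last bin also includes `high`
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical data, binned into four intervals of width 2.5 on [0, 10]
data = [1, 2, 2, 3, 5, 6, 7, 7, 8, 9.5]
counts = histogram_counts(data, 0, 10, 4)
print(counts)  # [3, 1, 4, 2]

# A crude text rendering: one '#' per data point in each bin
for i, c in enumerate(counts):
    print(f"{i * 2.5:>4} to {(i + 1) * 2.5:<4} {'#' * c}")
```

Each count is the height of the corresponding bar in the histogram.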