A positive Pearson's Correlation Coefficient indicates a ________ relationship between two variables.
- inverse
- linear
- perfect
- positive
A positive Pearson's Correlation Coefficient indicates a positive relationship between two variables. This means that as one variable increases, the other tends to increase as well, and as one decreases, the other tends to decrease.
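As a minimal sketch, the coefficient can be computed directly from its definition (covariance divided by the product of the standard deviations). The function name `pearson_r` and the sample values are illustrative assumptions, not part of the question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: study hours and test scores that rise together.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(r > 0)  # True: the coefficient is positive
```

Reversing one of the lists would flip the sign of `r`, which corresponds to the "inverse" option.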
What are the assumptions made in simple linear regression?
- Homogeneity, normality, and symmetry
- Independence, homogeneity, and linearity
- Linearity, homoscedasticity, and normality
- Symmetry, linearity, and independence
The assumptions made in simple linear regression include linearity (the relationship between the independent and dependent variables is linear), homoscedasticity (the variance of the residuals is constant across all levels of the independent variable), and normality (the residuals are normally distributed).
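These assumptions are usually checked on the residuals of the fitted line. The sketch below fits a least-squares line by hand and compares the residual spread in the lower and upper halves of the x range as a rough, informal homoscedasticity check. The data and the helper names `fit_line` and `spread` are assumptions made for illustration; a real analysis would typically add residual plots and a formal test.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def spread(values):
    """Population variance, used here only to compare residual spread."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Hypothetical, roughly linear data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Informal check: similar spread in both halves is consistent with constant variance.
half = len(xs) // 2
low_spread = spread(residuals[:half])
high_spread = spread(residuals[half:])
print(slope, intercept, low_spread, high_spread)
```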
Principal Component Analysis (PCA) is a dimensionality reduction technique that projects the data into a lower dimensional space called the _______.
- eigen space
- feature space
- subspace
- variance space
PCA is a technique that projects the data into a new, lower-dimensional subspace. This subspace is spanned by the principal components, which are orthogonal to each other and ordered so that each one captures as much of the remaining variance in the data as possible.
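To make the projection concrete, here is a sketch restricted to two-dimensional data, where the direction of the first principal component has a closed form based on the covariance entries. This special case is an assumption chosen to keep the example dependency-free; general PCA uses an eigendecomposition or SVD of the centered data. The function name and sample points are illustrative.

```python
import math

def first_principal_component(points):
    """Direction of maximum variance for 2-D points, plus the 1-D projection."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(angle), math.sin(angle))

    # Project each centered point onto that direction (2-D -> 1-D subspace).
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, scores

# Hypothetical points lying close to the line y = x.
points = [(2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (6.0, 6.0)]
pc_direction, pc_scores = first_principal_component(points)
print(pc_direction, pc_scores)
```

Each point is reduced from two coordinates to a single score along the first principal component.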
The range of a dataset is sensitive to _______.
- Mean
- Median
- Mode
- Outliers
The range of a dataset is sensitive to outliers. Because the range is calculated as the difference between the maximum and minimum values, an outlier (an extremely high or low value) can greatly increase the range.
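A short sketch with made-up numbers shows the effect: adding a single extreme value changes the range drastically while the median barely moves. The helper name `value_range` is an assumption for illustration.

```python
import statistics

def value_range(data):
    """Range = maximum value minus minimum value."""
    return max(data) - min(data)

values = [12, 15, 14, 13, 16]
with_outlier = values + [95]  # one extreme observation added

print(value_range(values), value_range(with_outlier))              # 4 83
print(statistics.median(values), statistics.median(with_outlier))  # 14 14.5
```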
In what situations is the coefficient of variation a better measure of dispersion than the standard deviation?
- When data sets have different units
- When data sets have the same units
- When the data set is normally distributed
- When the mean of the data set is zero
The coefficient of variation (CV) is a standardized measure of dispersion that is unitless. It's particularly useful when comparing the dispersion of two or more datasets that have different units or significantly different means. Standard deviation, on the other hand, has the same units as the data, which may not be helpful for comparisons across different datasets.
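As a sketch, the CV is the sample standard deviation divided by the mean. The example below uses hypothetical heights and weights to compare dispersion across different units, and shows that converting centimetres to metres leaves the CV unchanged. The function name is an assumption, and the guard reflects that the CV is undefined when the mean is zero.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean (unitless)."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(data) / mean

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 62, 70, 78, 85]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

# Changing units rescales both the mean and the standard deviation.
heights_m = [h / 100 for h in heights_cm]
cv_height_m = coefficient_of_variation(heights_m)

print(cv_height, cv_weight, cv_height_m)
```

Here the weights are relatively more dispersed than the heights, a comparison the raw standard deviations (in kg and cm) cannot support directly.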
Under what circumstances can the conditional probability of an event be equal to its marginal probability?
- When the event is certain
- When the event is dependent on all other events
- When the event is impossible
- When the event is independent of all other events
The conditional probability of an event A given an event B equals the marginal probability of A when A and B are independent. This is because the occurrence of B does not change the probability of A if they are independent.
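The equality P(A | B) = P(A) can be verified by enumeration. The sketch below uses a fair six-sided die as an assumed example: "even" is independent of "at most 4" but not of "at most 3". The helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # faces of a fair die

def prob(event):
    """Probability of an event (a set of faces) under equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_most_3 = {1, 2, 3}

print(prob(even))                  # 1/2
print(cond_prob(even, at_most_4))  # 1/2, equal to the marginal: independent
print(cond_prob(even, at_most_3))  # 1/3, differs from the marginal: dependent
```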
What type of data is the Mann-Whitney U test used for?
- Interval data
- Nominal data
- Ordinal data
- Ratio data
The Mann-Whitney U test is used for ordinal data, which can be ranked but have unknown or non-equivalent differences between values. It can also be used with interval and ratio data that do not meet the assumptions of parametric tests such as the t-test (for example, normality).
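Because the test works on ranks rather than raw values, the U statistic can be sketched in a few lines. The version below assigns average ranks to ties and returns the smaller of the two U values; it does not compute a p-value. The function name and the rating data are assumptions for illustration.

```python
def mann_whitney_u(sample_a, sample_b):
    """Smaller of the two U statistics, using average ranks for ties."""
    combined = sorted(sample_a + sample_b)

    # Map each distinct value to the mean of the 1-based ranks it occupies.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[v] for v in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Hypothetical ordinal ratings (1 = poor ... 5 = excellent) from two groups.
ratings_a = [1, 2, 2, 3, 3]
ratings_b = [3, 4, 4, 5, 5]
u = mann_whitney_u(ratings_a, ratings_b)
print(u)
```

A small U means the two groups' ranks barely overlap; judging significance still requires a table or a statistics library.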
What does the 'mode' refer to in a data set?
- The average value
- The middle value
- The most frequently occurring value
- The range of values
The mode in a data set is the most frequently occurring value, that is, the value that appears the greatest number of times. A dataset may have one mode (unimodal), two modes (bimodal), or more than two modes (multimodal).
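Python's standard library can illustrate this: `statistics.multimode` returns every value tied for the highest count. The sample lists are made up for the example.

```python
import statistics

unimodal = [3, 5, 5, 7, 9]
bimodal = [1, 2, 2, 3, 3, 4]

one_mode = statistics.multimode(unimodal)
two_modes = statistics.multimode(bimodal)

print(one_mode)   # [5]
print(two_modes)  # [2, 3]
```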
How is the Chi-square statistic calculated in a goodness of fit test?
- The differences between observed and expected frequencies are averaged
- The differences between observed and expected frequencies are divided by the expected frequencies
- The differences between observed and expected frequencies are squared, divided by the expected frequencies, and summed
- The differences between observed and expected frequencies are squared, summed, and then the square root is taken
In a Chi-square goodness of fit test, the Chi-square statistic is calculated by squaring the difference between each observed and expected frequency, dividing each squared difference by the corresponding expected frequency, and then summing the results: chi-square = sum of (O - E)^2 / E.
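A minimal sketch of the standard Pearson chi-square statistic, in which each squared difference is scaled by its expected frequency before summing. The die-roll counts are hypothetical and the function name is an assumption.

```python
def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 120 rolls of a die, tested against a fair die.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # approximately 2.9
```

The statistic is then compared with a chi-square distribution with (number of categories - 1) degrees of freedom, here 5.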
In a ________ distribution, the events occur with a known constant mean rate and independently of the time since the last event.
- Binomial
- Normal
- Poisson
- Uniform
The Poisson distribution models the number of events happening in a fixed interval of time or space, given a constant mean rate of occurrence and independence of the time since the last event.
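As a sketch, the probability of observing k events when the mean rate is lambda is lambda^k * e^(-lambda) / k!. The rate of 3 events per interval is an assumed example value, and `poisson_pmf` is an illustrative name.

```python
import math

def poisson_pmf(k, rate):
    """P(exactly k events) for a Poisson distribution with the given mean rate."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

rate = 3  # hypothetical: on average 3 events per interval

# Probabilities for 0..50 events; the tail beyond 50 is negligible for this rate.
probabilities = [poisson_pmf(k, rate) for k in range(51)]
total = sum(probabilities)
mean = sum(k * p for k, p in enumerate(probabilities))

print(poisson_pmf(0, rate), total, mean)
```

The probabilities sum to approximately 1 and their mean is approximately equal to the rate, as expected for a Poisson distribution.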