The probability of the intersection of Events A and B is represented by _______.
- P(A + B)
- P(A - B)
- P(A ∩ B)
- P(A ∪ B)
The probability of the intersection of Events A and B is represented by P(A ∩ B), which means the probability that both events A and B occur.
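As a small illustration of the notation, the sketch below computes P(A ∩ B) and P(A ∪ B) by counting outcomes. The fair six-sided die and the two events are assumptions chosen for the example, not part of the question.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event A: the roll is even
B = {4, 5, 6}  # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


p_intersection = prob(A & B)  # both A and B occur: {4, 6}
p_union = prob(A | B)         # at least one occurs: {2, 4, 5, 6}

print(p_intersection)  # 1/3
print(p_union)         # 2/3
```

The two results are linked by P(A ∪ B) = P(A) + P(B) - P(A ∩ B), which here gives 1/2 + 1/2 - 1/3 = 2/3.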
What is the F statistic in an ANOVA analysis, and what does it represent?
- The average of the group means
- The difference between the highest and lowest means
- The ratio of the between-group variance to the within-group variance
- The ratio of the within-group variance to the between-group variance
In an ANOVA, the F statistic is the ratio of the between-group variance to the within-group variance, that is, the mean square between groups divided by the mean square within groups. It measures how much the group means differ from each other relative to the variability within groups.
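The ratio can be worked out by hand for a one-way design. The sketch below uses three small made-up groups (an assumption for illustration) and computes the between-group and within-group mean squares before dividing them.

```python
# Hypothetical data: three groups of three observations each.
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 9.0, 10.0],
]

k = len(groups)                          # number of groups
n_total = sum(len(g) for g in groups)    # total number of observations
all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))
# Within-group sum of squares: spread of observations around their group mean.
ss_within = sum((x - m) ** 2
                for g, m in zip(groups, group_means) for x in g)

ms_between = ss_between / (k - 1)        # between-group variance estimate
ms_within = ss_within / (n_total - k)    # within-group variance estimate
f_stat = ms_between / ms_within

print(f_stat)  # 12.0
```

A large F means the group means are spread out far more than the within-group noise would suggest.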
How does multicollinearity affect the interpretation of regression coefficients?
- It has no effect on the interpretation of the coefficients.
- It increases the value of the coefficients.
- It makes the coefficients less interpretable and reliable.
- It makes the coefficients more interpretable and reliable.
Multicollinearity can cause large changes in the estimated regression coefficients for small changes in the data. Hence, it makes the coefficients less reliable and interpretable.
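One common diagnostic for this problem is the variance inflation factor (VIF). The sketch below assumes only two predictors, in which case the VIF reduces to 1 / (1 - r²), where r is their Pearson correlation. The data and the helper name `pearson_r` are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical predictors: x2 is almost exactly 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

r_squared = pearson_r(x1, x2) ** 2
# With only two predictors, the R^2 from regressing one on the other
# equals their squared correlation, so VIF = 1 / (1 - r^2).
vif = 1.0 / (1.0 - r_squared)

print(f"r^2 = {r_squared:.4f}, VIF = {vif:.1f}")
```

A VIF well above the usual rule-of-thumb thresholds (often quoted as 5 or 10) indicates that the coefficient's variance is heavily inflated, which is why small changes in the data can swing the estimates.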
The Wilcoxon Signed Rank Test uses the _______ of differences for ranking.
- distributions
- magnitudes
- signs
The Wilcoxon Signed Rank Test ranks the paired differences by their magnitudes (absolute values). The signs are attached to the ranks only after the ranking is done.
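The ranking step can be sketched as follows. The function name `signed_rank_sums` and the paired data are hypothetical. Dropping zero differences and averaging tied ranks are common conventions assumed here.

```python
def signed_rank_sums(x, y):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    # Paired differences; zero differences are dropped (assumed convention).
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Sort by magnitude only: the sign plays no part in the ranking.
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Group tied magnitudes and give each the average of their positions.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j + 1) / 2
        for idx in range(i, j + 1):
            ranks[idx] = avg_rank
        i = j + 1
    # Only now are the signs used, to split the ranks into two sums.
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


before = [10, 12, 9, 15, 11]
after = [8, 13, 5, 12, 17]
print(signed_rank_sums(before, after))  # (9.0, 6.0)
```

The differences are 2, -1, 4, 3, -6, so the magnitudes 1, 2, 3, 4, 6 receive ranks 1 to 5. The positive differences hold ranks 2, 3, 4 and the negative ones hold ranks 1 and 5.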
The probability of an event A, given that another event B has occurred, is called the ________ probability of A given B.
- Conditional
- Independent
- Joint
- Marginal
The probability of an event A, given that another event B has occurred, is called the conditional probability of A given B. It is denoted as P(A | B).
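A conditional probability can be computed from the standard identity P(A | B) = P(A ∩ B) / P(B), defined when P(B) > 0. The sketch below assumes a standard 52-card deck and two illustrative events.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: one card drawn from a standard 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

A = {card for card in deck if card[0] == "K"}              # the card is a king
B = {card for card in deck if card[0] in {"J", "Q", "K"}}  # the card is a face card


def prob(event):
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(deck))


p_a = prob(A)
p_a_given_b = prob(A & B) / prob(B)

print(p_a)          # 1/13
print(p_a_given_b)  # 1/3
```

Knowing that the card is a face card raises the probability of a king from 1/13 to 1/3, which is the sense in which B "has occurred".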
The sum of the squared loadings for a factor (i.e., the column in the factor matrix) which represents the variance in all the variables accounted for by the factor is known as _______ in factor analysis.
- communality
- eigenvalue
- factor variance
- total variance
In factor analysis, the sum of the squared loadings for a factor (i.e., down a column of the factor matrix) is known as the eigenvalue. It represents the variance in all the variables accounted for by that factor.
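The column-versus-row distinction is easy to see numerically. The sketch below uses a made-up loading matrix for four variables and two factors. Column sums of squared loadings give the eigenvalues, while row sums give the communalities (one of the distractor options).

```python
# Hypothetical factor loading matrix: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.1, 0.6],
]
n_factors = len(loadings[0])

# Eigenvalue of a factor: sum of squared loadings down its column.
eigenvalues = [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]

# Communality of a variable: sum of squared loadings across its row.
communalities = [sum(value ** 2 for value in row) for row in loadings]

print([round(e, 2) for e in eigenvalues])    # [1.18, 1.22]
print([round(c, 2) for c in communalities])  # [0.65, 0.53, 0.85, 0.37]
```

Both lists add up to the same total (2.40 here), because they sum the same squared loadings in a different order.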
When the residuals exhibit a pattern or trend rather than a random scatter, it is a sign of _________.
- Autocorrelation
- Model misspecification
- Overfitting
- Underfitting
When the residuals exhibit a pattern or trend rather than a random scatter, it can be a sign of model misspecification, i.e., the model doesn't properly capture the relationship between the predictors and the outcome variable.
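A patterned residual plot is easy to reproduce. The sketch below fits a straight line by ordinary least squares to data that are actually quadratic (an assumption made to force the misspecification) and prints the residuals.

```python
# Hypothetical data: the true relationship is y = x^2, but a line is fitted.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares estimates for the model y = intercept + slope * x.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Residual = observed value minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(residuals)  # [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals trace a clear U shape (positive at both ends, negative in the middle) instead of scattering randomly around zero, which signals that the linear form is missing the curvature.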
The branch of statistics that involves using a sample to draw conclusions about a population is called ________ statistics.
- descriptive
- inferential
- numerical
- qualitative
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. It takes data from a sample and makes inferences about the larger population from which the sample was drawn. For example, inferential statistics might use data from a sample of women to infer something about the mean weight of all women.
What is the primary purpose of factor analysis in data science?
- To categorize data
- To classify data
- To identify underlying variables (factors)
- To predict future outcomes
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. Its primary purpose is to identify the underlying structure and relationships within a set of variables.
What does it mean when a confidence interval includes the value zero?
- The population mean is likely to be zero
- The sample mean is zero
- There is no effect in the population
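If a confidence interval for a mean difference or an effect size includes zero, a zero effect is among the values consistent with the data at that confidence level, so the observed effect in the sample could be due to sampling error alone. "There is no effect in the population" is the intended answer, although strictly the interval fails to rule out a zero effect rather than proving that the effect is zero.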
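Checking whether an interval contains zero can be sketched as below. The paired differences are made up, and the helper `mean_ci` is hypothetical. It uses a normal (z) approximation from the standard library, which is an assumption. For a sample this small a t-based interval would be somewhat wider.

```python
from statistics import NormalDist, mean, stdev


def mean_ci(sample, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(sample) / n ** 0.5           # z times the standard error
    centre = mean(sample)
    return centre - margin, centre + margin


# Hypothetical paired differences (e.g. after minus before).
diffs = [1.2, -0.8, 0.5, 2.1, -1.5, 0.3, 0.9, -0.4, 1.1, -0.6]

lower, upper = mean_ci(diffs)
includes_zero = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f}), includes zero: {includes_zero}")
```

Here the sample mean is positive, yet the interval stretches below zero, so these data cannot distinguish a small positive effect from no effect at all.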