The probability of an event A, given that another event B has occurred, is called the ________ probability of A given B.
- Conditional
- Independent
- Joint
- Marginal
The probability of an event A, given that another event B has occurred, is called the conditional probability of A given B. It is denoted as P(A|B) and, when P(B) > 0, equals P(A and B) / P(B).
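As a small illustration, the definition can be checked by counting outcomes. The sketch below assumes one roll of a fair six-sided die; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
outcomes = set(range(1, 7))
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

# Conditional probability: P(A and B) / P(B), defined only when P(B) > 0.
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Knowing that the roll is greater than 3 raises the probability of an even roll from 1/2 to 2/3, because two of the three remaining outcomes are even.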
The sum of the squared loadings for a factor (i.e., the column in the factor matrix) which represents the variance in all the variables accounted for by the factor is known as _______ in factor analysis.
- communality
- eigenvalue
- factor variance
- total variance
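The sum of the squared loadings for a factor (i.e., the column in the factor matrix), which represents the variance in all the variables accounted for by that factor, is known as the eigenvalue in factor analysis. By contrast, communality is the sum of squared loadings across a row, i.e., the variance in one variable accounted for by all the factors.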
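The column-versus-row distinction can be sketched with a small loading matrix. The numbers below are hypothetical and are not taken from any real analysis; the column sums match the eigenvalues only in the sense used above (sum of squared loadings per factor in an unrotated solution).

```python
# Hypothetical loading matrix: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.9],
]
n_factors = len(loadings[0])

# Column sums of squared loadings: variance accounted for by each factor.
eigenvalues = [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]

# Row sums of squared loadings: communality of each variable.
communalities = [sum(value ** 2 for value in row) for row in loadings]

print([round(e, 2) for e in eigenvalues])
print([round(c, 2) for c in communalities])
```

For this matrix the two factors account for roughly 1.50 and 0.95 units of variance, and both sets of sums add up to the same total because they add the same squared loadings in a different order.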
When the residuals exhibit a pattern or trend rather than a random scatter, it is a sign of _________.
- Autocorrelation
- Model misspecification
- Overfitting
- Underfitting
When the residuals exhibit a pattern or trend rather than a random scatter, it can be a sign of model misspecification, i.e., the model doesn't properly capture the relationship between the predictors and the outcome variable.
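A minimal sketch of what such a pattern looks like: a straight line is fitted to data that actually follow a curve. The data are made up for illustration, and the least squares fit is written out by hand to keep the example self-contained.

```python
from statistics import mean

# Made-up data with a curved (quadratic) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

# Fit a straight line y = intercept + slope * x by ordinary least squares.
x_bar, y_bar = mean(x), mean(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

# Residuals: observed value minus fitted value.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(residuals)
```

The residuals are positive at both ends and negative in the middle. That U shape, rather than a random scatter around zero, signals that a straight line is the wrong functional form for these data.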
The branch of statistics that involves using a sample to draw conclusions about a population is called ________ statistics.
- descriptive
- inferential
- numerical
- qualitative
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. It takes data from a sample and makes inferences about the larger population from which the sample was drawn. For example, inferential statistics might use data from a sample of women to infer something about the mean weight of all women.
What is the primary purpose of factor analysis in data science?
- To categorize data
- To classify data
- To identify underlying variables (factors)
- To predict future outcomes
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. Its primary purpose is to identify the underlying structure and relationships within a set of variables.
What does it mean when a confidence interval includes the value zero?
- The population mean is likely to be zero
- The sample mean is zero
- There is no effect in the population
If a confidence interval for a mean difference or an effect size includes zero, it suggests that there may be no effect in the population: zero is a plausible value for the true effect, so the observed effect in the sample could be due to sampling error alone. It does not prove that the effect is exactly zero.
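The check itself can be sketched in a few lines. The paired differences below are hypothetical, and the interval uses a normal approximation to stay within the standard library; with a sample this small, a t critical value would give a somewhat wider interval.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical paired differences (for example, after minus before).
differences = [1.2, -0.8, 0.5, -1.1, 0.9, -0.3, 0.4, -0.6]

n = len(differences)
center = mean(differences)
standard_error = stdev(differences) / n ** 0.5

# 95% interval using a normal critical value (an approximation for small n).
z = NormalDist().inv_cdf(0.975)
lower = center - z * standard_error
upper = center + z * standard_error

includes_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.2f}, {upper:.2f}), includes zero: {includes_zero}")
```

Here the interval straddles zero, so these data cannot distinguish a small positive effect, a small negative effect, or no effect at all.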
Can you provide a practical example of where the Law of Large Numbers is applied?
- Insurance companies use the Law of Large Numbers to predict claim amounts.
- It's used to calculate the speed of light.
- The Law of Large Numbers is only theoretical and has no practical applications.
- The Law of Large Numbers is used to predict lottery numbers.
The Law of Large Numbers has many practical applications. For example, insurance companies use it to predict future claim amounts. The law allows them to predict losses and to set premiums in a way that ensures profitability, by basing predictions on large aggregations of independent or nearly independent losses.
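A small simulation makes the insurance example concrete. The claim probability and payout below are hypothetical values chosen for illustration, and policies are assumed to be independent.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

CLAIM_PROBABILITY = 0.05   # hypothetical chance that a policy files a claim
CLAIM_AMOUNT = 10_000      # hypothetical payout per claim
expected_cost = CLAIM_PROBABILITY * CLAIM_AMOUNT  # expected payout per policy

def average_claim(n_policies):
    """Average payout per policy across n independent simulated policies."""
    total = sum(
        CLAIM_AMOUNT if rng.random() < CLAIM_PROBABILITY else 0
        for _ in range(n_policies)
    )
    return total / n_policies

averages = {n: average_claim(n) for n in (100, 10_000, 200_000)}
for n, avg in averages.items():
    print(f"{n:>7} policies: average {avg:9.2f}, expected {expected_cost:.2f}")
```

With only 100 policies the average payout can land far from the expected cost per policy, while with hundreds of thousands it settles close to it. That stability is what lets an insurer price premiums from the expected cost.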
What is the impact of heteroscedasticity on a multiple linear regression model?
- It affects the linearity of the model
- It affects the normality of the residuals
- It causes multicollinearity
- It invalidates the statistical inferences that could be made from the model
Heteroscedasticity, or non-constant variance of the error term, can invalidate statistical inferences drawn from the model because it violates the constant-variance (homoscedasticity) assumption of multiple linear regression. The ordinary least squares coefficient estimates remain unbiased but are no longer efficient, and the usual standard errors are incorrect, which in turn distorts confidence intervals and hypothesis tests.
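One informal way to spot non-constant variance is to compare the spread of the residuals at low and high values of a predictor. The sketch below uses simulated data with a single predictor, where the noise is deliberately made to grow with x. It is a rough diagnostic under those assumptions, not a formal test.

```python
import random
from statistics import mean, variance

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated data: the noise standard deviation grows with x.
x = [float(i) for i in range(1, 201)]
y = [2.0 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

# Ordinary least squares fit of y = intercept + slope * x.
x_bar, y_bar = mean(x), mean(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Compare residual variance in the lower and upper halves of x.
half = len(x) // 2
var_low = variance(residuals[:half])
var_high = variance(residuals[half:])
print(f"residual variance, low x: {var_low:.1f}, high x: {var_high:.1f}")
```

A much larger residual variance in the upper half is the fan shape usually seen in a residual plot. When it appears, heteroscedasticity-robust standard errors or a variance-stabilizing transformation are common remedies.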
What is the impact of data transformation on the decision to use non-parametric tests?
- A suitable data transformation may make it possible to use a parametric test
- Data transformation always leads to non-parametric tests
- Data transformation always makes data normally distributed
- Data transformation does not affect the choice between parametric and non-parametric tests
A suitable data transformation may make it possible to use a parametric test instead of a non-parametric test. Transformations can help to stabilize variances, normalize the data, or linearize relationships between variables, allowing for the use of parametric tests that might have more statistical power.
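As a sketch of the normalizing effect, the example below draws right-skewed data from a log-normal distribution (an assumption made for illustration) and compares the sample skewness before and after a log transformation.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed so the run is repeatable

def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Simulated right-skewed data (log-normal by construction).
raw = [rng.lognormvariate(0, 1) for _ in range(2000)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log transform: {skew_log:.2f}")
```

The raw values are strongly right-skewed, while the logged values have skewness near zero, so a parametric test that assumes approximate normality becomes more defensible on the transformed scale. Real data are rarely exactly log-normal, so the transformed distribution should still be checked.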
If two events are independent, what is the conditional probability of one given the other?
- 0
- 1
- Equal to the probability of the given event
- Undefined
If two events A and B are independent, the conditional probability of A given B is simply the probability of A itself: P(A|B) = P(A), provided P(B) > 0. This is because, for independent events, the occurrence of one event does not change the probability of the other.
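This can be verified by enumeration. The sketch below assumes two fair six-sided dice rolled together; the events are chosen for illustration so that each depends on a different die.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs from rolling two fair dice.
outcomes = set(product(range(1, 7), repeat=2))

A = {o for o in outcomes if o[0] % 2 == 0}  # first die is even
B = {o for o in outcomes if o[1] > 4}       # second die shows 5 or 6

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

p_a = prob(A)
p_a_given_b = prob(A & B) / prob(B)
print(p_a, p_a_given_b)  # 1/2 1/2
```

Conditioning on the second die leaves the probability for the first die unchanged, and equivalently P(A and B) equals P(A) times P(B).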