What is the purpose of an interaction term in a regression model?

  • To increase the complexity of the model
  • To minimize the error of the model
  • To represent the combined effect of two variables
  • To represent the effect of one variable based on the level of another
An interaction term in a regression model represents the combined effect of two independent variables on the dependent variable, beyond the sum of their separate effects. It is typically entered as the product of the two variables, and it captures situations where the effect of one variable on the dependent variable differs across levels of the other variable.
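
To make this concrete, here is a minimal sketch of a model with an interaction term. The coefficient values (b0, b1, b2, b3) and the function names are made up for illustration; they are not estimated from any data.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Prediction from y = b0 + b1*x1 + b2*x2 + b3*(x1*x2).

    The coefficients are illustrative assumptions, not fitted values.
    """
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)


def effect_of_x1(x2, b1=2.0, b3=1.5):
    """Change in y per unit increase in x1; it depends on the level of x2."""
    return b1 + b3 * x2


# The same one-unit change in x1 has a different effect at different levels of x2.
effect_at_x2_0 = predict(1, 0) - predict(0, 0)  # 2.0
effect_at_x2_2 = predict(1, 2) - predict(0, 2)  # 5.0
```

Without the b3 term, both differences would equal b1 regardless of x2.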

In what type of problem scenarios is Bayes' Theorem most commonly used?

  • When new evidence is used to update the probability of an event
  • When the data is categorical
  • When the events are mutually exclusive
  • When the population is normally distributed
Bayes' Theorem is most commonly used when new evidence is used to update the probability of an event. It provides a way to revise existing predictions or theories (prior probabilities) in light of new data: the likelihood of the data under each hypothesis is combined with the prior to give the updated (posterior) probability.
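
As a rough sketch of such an update, the function below applies Bayes' Theorem to a single hypothesis H and its complement. The function name and the numbers (a 1% prior, a 95% chance of the evidence when H is true, a 5% chance when it is false) are hypothetical values chosen for illustration.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | evidence) from the prior P(H) and the two likelihoods."""
    numerator = p_evidence_given_h * prior
    # Total probability of the evidence, over H and not-H.
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical numbers: the evidence raises the probability from 1% to about 16%.
posterior = bayes_update(prior=0.01,
                         p_evidence_given_h=0.95,
                         p_evidence_given_not_h=0.05)
```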

Which type of data can be categorized into groups: qualitative or quantitative?

  • Both
  • None
  • Qualitative
  • Quantitative
Qualitative data can be categorized into groups. It represents characteristics or attributes rather than measured amounts, so each observation falls into a category. For example, hair color (blonde, brunette, etc.) and marital status (single, married, etc.) are qualitative data.

The ________ is the middle value in a data set when the data is arranged in ascending or descending order.

  • Mean
  • Median
  • Mode
The median is the value separating the higher half from the lower half of a data sample. If the data set has an odd number of observations, the number in the middle is the median. If there is an even number of observations, the median is defined as the arithmetic mean of the two middle values.
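
The odd and even cases described above can be sketched as follows. The helper name median is our own; Python's standard library offers statistics.median, which follows the same definition.

```python
import statistics


def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median([7, 1, 5])       # 5
even_case = median([4, 1, 3, 2])   # 2.5
```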

The probability of the intersection of Events A and B is represented by _______.

  • P(A + B)
  • P(A - B)
  • P(A ∩ B)
  • P(A ∪ B)
The probability of the intersection of Events A and B is represented by P(A ∩ B), which means the probability that both events A and B occur.
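
A small example, assuming one roll of a fair six-sided die, with A = "the roll is even" and B = "the roll is greater than 3" (both events are chosen here purely for illustration):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed)
A = {2, 4, 6}                      # the roll is even
B = {4, 5, 6}                      # the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


p_a_and_b = prob(A & B)  # P(A ∩ B) = 1/3, outcomes {4, 6}
p_a_or_b = prob(A | B)   # P(A ∪ B) = 2/3, outcomes {2, 4, 5, 6}
```

The two quantities are linked by P(A ∪ B) = P(A) + P(B) - P(A ∩ B).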

What is the F statistic in an ANOVA analysis, and what does it represent?

  • The average of the group means
  • The difference between the highest and lowest means
  • The ratio of the between-group variance to the within-group variance
  • The ratio of the within-group variance to the between-group variance
In an ANOVA, the F statistic is the ratio of the between-group variance to the within-group variance. It represents the extent to which group means differ from each other, compared to the variability within groups.
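
The ratio can be computed by hand for a one-way ANOVA, as in the sketch below. The function name and the three small groups are invented for illustration; each "variance" here is a mean square, i.e. a sum of squares divided by its degrees of freedom.

```python
def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up data: the group means (2, 3, 6) differ a lot relative to the spread within groups.
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # approximately 13
```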

What type of data is best suited for a Chi-square test?

  • Categorical data
  • Continuous data
  • Numerical data
  • Time series data
Categorical data is best suited for a Chi-square test. The Chi-square test is used to determine if there is a significant association between two categorical variables.
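
As an illustration, the Pearson chi-square statistic for a contingency table of counts can be computed as below. The function name and the 2x2 table are hypothetical, and the sketch applies no continuity correction and does not compute a p-value.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Made-up counts: rows are two groups, columns are two response categories.
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
```

Here every expected count is 15, so the statistic is 4 × (25 / 15), or about 6.67, with 1 degree of freedom.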

The sum of the squared loadings for a factor (i.e., the column in the factor matrix) which represents the variance in all the variables accounted for by the factor is known as _______ in factor analysis.

  • communality
  • eigenvalue
  • factor variance
  • total variance
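The eigenvalue of a factor is the sum of its squared loadings, i.e., the squared entries in that factor's column of the factor matrix, summed over all variables. It represents the variance in all the variables accounted for by that factor. Summing squared loadings across a row instead gives a variable's communality.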
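
The column-wise and row-wise sums can be sketched as follows. The loading matrix is invented for illustration (three variables, two factors), and the function names are our own.

```python
# Hypothetical factor matrix: rows are variables, columns are factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
]


def factor_eigenvalues(matrix):
    """Sum of squared loadings down each column (one value per factor)."""
    return [sum(v ** 2 for v in column) for column in zip(*matrix)]


def communalities(matrix):
    """Sum of squared loadings across each row (one value per variable)."""
    return [sum(v ** 2 for v in row) for row in matrix]


eigenvalues = factor_eigenvalues(loadings)  # about [1.14, 0.86]
variable_h2 = communalities(loadings)       # about [0.65, 0.53, 0.82]
```

Both lists add up to the same total, the variance explained by the two factors together.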

When the residuals exhibit a pattern or trend rather than a random scatter, it is a sign of _________.

  • Autocorrelation
  • Model misspecification
  • Overfitting
  • Underfitting
When the residuals exhibit a pattern or trend rather than a random scatter, it can be a sign of model misspecification, i.e., the model doesn't properly capture the relationship between the predictors and the outcome variable.
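
One way to see this is to fit a straight line to data that actually follow a curve. In the sketch below the data (y = x squared on seven points) and the helper name fit_line are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The true relationship is quadratic, but the model is a straight line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
# residuals: 5, 0, -3, -4, -3, 0, 5 (positive at both ends, negative in the middle)
```

The U-shaped residuals signal that the linear model is missing a squared term.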

The branch of statistics that involves using a sample to draw conclusions about a population is called ________ statistics.

  • descriptive
  • inferential
  • numerical
  • qualitative
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. It takes data from a sample and makes inferences about the larger population from which the sample was drawn. For example, inferential statistics might use data from a sample of women to infer something about the mean weight of all women.
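
A minimal sketch of such an inference estimates a population mean from a sample and attaches an approximate 95% interval. The weights are made-up numbers and the function name is our own. The interval uses the normal critical value 1.96, whereas for a sample this small a t-based interval would usually be preferred.

```python
import math
import statistics


def mean_with_interval(sample, z=1.96):
    """Sample mean and an approximate interval for the population mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, (mean - z * std_error, mean + z * std_error)


# Hypothetical sample of weights in kilograms.
weights = [60, 65, 70, 75, 80]
estimate, (low, high) = mean_with_interval(weights)
```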