A ________ plot uses rectangular bars to represent data. The length of the bar corresponds to the frequency of data.

  • Bar
  • Line
  • Pie
  • Scatter
A bar plot uses rectangular bars to represent data. The length (or height, if vertical) of each bar corresponds to the frequency or amount of data it represents. Bar plots are particularly useful for comparing categories of data.
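As a minimal text-only sketch of the idea, the snippet below counts how often each category occurs and draws one bar per category, with the bar length equal to the frequency. The `responses` list is hypothetical illustration data, and a plotting library would normally draw the bars graphically.

```python
from collections import Counter

# Hypothetical survey answers, used only for illustration.
responses = ["cat", "dog", "dog", "bird", "dog", "cat"]

# Frequency of each category.
counts = Counter(responses)

# One bar per category; the number of '#' characters equals the frequency.
bars = {category: "#" * count for category, count in counts.items()}

for category, bar in sorted(bars.items()):
    print(f"{category:<5} {bar}")
```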

The ________ is a statistic that provides an estimate of the center of a distribution.

  • mean
  • median
  • mode
  • range
The mean, often called the average, is a measure of central tendency that provides an estimate of the center of a distribution. It is calculated by adding all the values in the dataset and dividing by the number of values. Note that the mean is sensitive to outliers: a few extremely large or small values can pull it away from where most of the data lie.
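The effect of an extreme value can be sketched with Python's standard `statistics` module. The numbers are made up for illustration; the median is shown only as a point of comparison.

```python
from statistics import mean, median

values = [2, 3, 4, 5, 6]           # illustrative data
with_outlier = values + [100]      # same data plus one extreme value

mean_plain = mean(values)              # sum 20 over 5 values
mean_outlier = mean(with_outlier)      # sum 120 over 6 values
median_outlier = median(with_outlier)  # midpoint of the two middle values

print(mean_plain, mean_outlier, median_outlier)
```

A single extreme value moves the mean from 4 to 20, while the median of the extended data stays at 4.5.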

What does a correlation coefficient of 0 indicate?

  • A perfect negative correlation
  • A perfect positive correlation
  • A very strong correlation
  • No linear correlation
A correlation coefficient of 0 indicates no linear correlation between the two variables. This means that as one variable changes, there's no predictable pattern of change in the other variable. However, this doesn't rule out the possibility of a non-linear relationship.
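A small sketch of that last point: below, `y` is completely determined by `x` (it is `x` squared), yet the Pearson correlation coefficient comes out as zero because the relationship is not linear. The helper `pearson_r` is written here for illustration and is not taken from any library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]   # perfect, but non-linear, dependence

r = pearson_r(xs, ys)
print(r)
```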

What is conditional probability?

  • The probability of an event given the occurrence of another event
  • The probability of an event regardless of the occurrence of other events
  • The probability that both of two events occur
  • The ratio of the number of outcomes in an event to the number of outcomes in a sample space
Conditional probability is the probability of an event (A) given that another event (B) has already occurred. It's a fundamental concept in probability theory and is often denoted as P(A|B).
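A minimal worked sketch, assuming a single roll of a fair six-sided die and using the standard definition P(A|B) = P(A and B) / P(B). The events chosen here (A: the roll is even, B: the roll is greater than 3) are illustrative only.

```python
from fractions import Fraction

sample_space = range(1, 7)  # outcomes of one fair die roll

def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))

def is_even(outcome):        # event A
    return outcome % 2 == 0

def above_three(outcome):    # event B
    return outcome > 3

p_b = prob(above_three)
p_a_and_b = prob(lambda o: is_even(o) and above_three(o))
p_a_given_b = p_a_and_b / p_b

print(p_a_given_b)
```

Of the three outcomes in B (4, 5, 6), two are even, so the conditional probability is 2/3, compared with an unconditional P(A) of 1/2.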

How can you detect multicollinearity in multiple linear regression?

  • By checking the correlation among predictors
  • By checking the normality of residuals
  • By looking at the scatter plot of residuals
  • By using the F-test
Multicollinearity can be detected by examining the correlations among the predictors. A high correlation between two predictors indicates the presence of multicollinearity. More formal measures such as the Variance Inflation Factor (VIF) can also be used, and they can reveal collinearity involving several predictors at once, which pairwise correlations may miss.
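A minimal sketch for the special case of exactly two predictors, where the VIF reduces to 1 / (1 - r^2) with r the correlation between them. The data and the helper `pearson_r` are illustrative assumptions; with more predictors, each VIF requires regressing one predictor on all the others, which a statistics library would normally handle.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical predictors: x2 is roughly twice x1, so they are nearly collinear.
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x1, x2)
vif = 1 / (1 - r ** 2)   # two-predictor case only

print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

A common rule of thumb treats a VIF above roughly 5 or 10 as a sign of problematic multicollinearity, though the cutoff is a convention rather than a strict rule.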

How does kurtosis relate to the tails of a distribution?

  • Kurtosis does not relate to the tails of a distribution
  • Kurtosis is a measure of the weight in the tails
  • Kurtosis relates to the length of the tails
  • Kurtosis relates to the width of the tails
Kurtosis is a statistical measure that describes the shape of a distribution, specifically the heaviness of its tails relative to a normal distribution. A high kurtosis in a data set is a signal that the data have heavy tails or outliers.
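One way to sketch this is to compute kurtosis as the fourth central moment divided by the squared second central moment, a definition under which a normal distribution scores 3. The two small datasets below are made up to contrast light and heavy tails.

```python
def kurtosis(values):
    """Moment-based kurtosis: m4 / m2**2 (a normal distribution gives 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m4 / m2 ** 2

light_tails = [-2, -1, 0, 1, 2]          # evenly spread, no extreme values
heavy_tails = [0] * 8 + [-10, 10]        # mostly central, two far-out values

k_light = kurtosis(light_tails)
k_heavy = kurtosis(heavy_tails)

print(k_light, k_heavy)
```

The dataset with two far-out values has the larger kurtosis, reflecting the extra weight in its tails.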

What is the concept of post-hoc testing in ANOVA?

  • It is a test performed before ANOVA
  • It is a test performed to calculate the F-statistic
  • It is a test performed to check the assumptions of ANOVA
  • It is a test performed to determine which groups are significantly different from each other
Post-hoc testing in ANOVA is performed after the ANOVA test when the null hypothesis has been rejected. Its purpose is to determine which specific groups are significantly different from each other. Commonly used post-hoc procedures include Tukey's HSD, the Bonferroni correction, and Scheffé's method.
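As a rough sketch of the Bonferroni approach, the snippet below lists every pairwise comparison among a set of groups and divides the overall significance level by the number of comparisons. The group labels and the 0.05 level are assumptions chosen for illustration; the pairwise tests themselves are not shown.

```python
from itertools import combinations

groups = ["A", "B", "C", "D"]   # hypothetical group labels
alpha = 0.05                    # assumed overall significance level

# Every unordered pair of groups is one post-hoc comparison.
pairs = list(combinations(groups, 2))

# Bonferroni: each individual comparison is tested at alpha / (number of comparisons).
adjusted_alpha = alpha / len(pairs)

print(len(pairs), adjusted_alpha)
```

With four groups there are six pairwise comparisons, so each one would be judged against a threshold of 0.05 / 6 rather than 0.05.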

In the context of simple linear regression, the difference between the observed value and the predicted value is referred to as the ________.

  • correlation coefficient
  • dependent variable
  • error term
  • independent variable
The difference between an observed value and the value predicted by the fitted model is called the residual, which serves as the sample estimate of the model's error term. It represents the portion of the dependent variable that is not explained by the independent variable(s).
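A minimal sketch, assuming an ordinary least-squares fit of a straight line to four made-up points: each residual is the observed value minus the predicted value.

```python
# Illustrative data, not taken from any real study.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares estimates for y = intercept + slope * x.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(slope, intercept)
print(residuals)
```

For a least-squares line with an intercept, the residuals sum to zero (up to floating-point rounding), which is a quick sanity check on the fit.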

Which common statistical test is considered non-parametric?

  • ANOVA
  • Chi-Square Test
  • Linear Regression
  • t-test
The Chi-Square Test is a common statistical test that is considered non-parametric. It is often used to analyze categorical data and does not assume that the data come from a particular population distribution, such as the normal distribution.
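A minimal sketch of the chi-square test of independence on a 2x2 table of counts. The `observed` table is hypothetical; expected counts are computed as (row total x column total) / grand total, and the statistic sums (observed - expected)^2 / expected over all cells.

```python
# Hypothetical 2x2 contingency table of observed counts.
observed = [[10, 20],
            [20, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(chi2, df)
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom (about 3.841 for 1 degree of freedom at the 0.05 level).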

Which measure of dispersion considers all the data points in a dataset?

  • Interquartile range
  • Mode
  • Range
  • Variance
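Variance is a measure of dispersion that considers all data points in the dataset. It is calculated by taking the average of the squared differences from the mean (dividing by n for a population, or by n - 1 for a sample). By contrast, the range uses only the two extreme values, the interquartile range uses only the quartiles, and the mode is not a measure of dispersion.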
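The calculation can be sketched with Python's standard `statistics` module, which offers `pvariance` (divide by n) and `variance` (divide by n - 1). The data are made up for illustration.

```python
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data

# Manual calculation: average of squared differences from the mean.
center = mean(data)
manual_population_variance = sum((x - center) ** 2 for x in data) / len(data)

population_variance = pvariance(data)   # divides by n
sample_variance = variance(data)        # divides by n - 1

print(manual_population_variance, population_variance, sample_variance)
```

Every data point contributes a squared deviation to the sum, so changing any single value changes the variance.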