What is the relationship between variance and the square of the standard deviation?

  • Standard deviation is always larger
  • They are the same
  • Variance is always larger
  • Variance is the square root of the standard deviation
Variance and the square of the standard deviation are the same. The variance is calculated as the mean of the squared deviations from the mean, and the standard deviation is the square root of this variance. Hence, squaring the standard deviation gives us the variance.
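
The identity can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the data values are made up for illustration, and the population forms (`pvariance`, `pstdev`) are used to match the "mean of the squared deviations" definition above.

```python
import math
import statistics

# Hypothetical data values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: mean of the squared deviations from the mean.
variance = statistics.pvariance(data)

# Population standard deviation: square root of that variance.
std_dev = statistics.pstdev(data)

# Squaring the standard deviation recovers the variance (up to rounding).
squared_std_dev = std_dev ** 2
print(variance, std_dev, squared_std_dev)
```

The same relationship holds for the sample versions (`statistics.variance` and `statistics.stdev`), as long as both use the same divisor.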

What does polynomial regression allow you to model?

  • Correlations
  • Data distribution
  • Non-linear relationships
  • Relationships between variables
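Polynomial regression allows modeling of non-linear relationships. Unlike simple linear regression, which models the relationship between variables as a straight line, polynomial regression adds powers of the predictor (x squared, x cubed, and so on) so that the fitted relationship can be a curve. This better captures relationships whose direction changes at different levels of the independent variable. The model is still linear in its coefficients, so it can be fitted with ordinary least squares.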
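
As a rough illustration, the sketch below fits both a straight line and a quadratic to the same curved data using ordinary least squares. The helper names (`fit_polynomial`, `sum_squared_error`) and the data are hypothetical, and the small hand-written solver is only meant for tiny examples; in practice a numerical library would normally be used.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients where coeffs[k] multiplies x**k.
    """
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, built from powers of x.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - partial) / a[i][i]
    return coeffs


def sum_squared_error(xs, ys, coeffs):
    """Sum of squared residuals of the fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


# Hypothetical data that follows y = x^2 + 1: it falls, then rises.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x + 1 for x in xs]

line = fit_polynomial(xs, ys, degree=1)       # straight line
quadratic = fit_polynomial(xs, ys, degree=2)  # curve

line_error = sum_squared_error(xs, ys, line)
quadratic_error = sum_squared_error(xs, ys, quadratic)
print(line, line_error)
print(quadratic, quadratic_error)
```

On this data the straight line cannot follow the change in direction and leaves a large residual error, while the quadratic reproduces the curve.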

What is the maximum value that a probability can take?

  • 1
  • 10
  • 100
  • Infinity
The maximum value that a probability can take is 1. This signifies that an event is certain to occur. In probability theory, probabilities range from 0 (implying the event is impossible) to 1 (implying the event is certain).

What are the key properties of a Bernoulli distribution?

  • It can only take positive integer values
  • It has a bell-shaped curve
  • It has a single trial with two possible outcomes
  • It models a series of independent trials
A Bernoulli distribution is a discrete probability distribution of a random variable which takes the value 1 with probability p and the value 0 with probability q=1-p. It models a single trial with two possible outcomes, often labelled 'success' and 'failure'.
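
The definition above can be sketched in a few lines of Python. The function names and the value p = 0.3 are assumptions made for illustration only.

```python
import random


def bernoulli_pmf(k, p):
    """Probability that a Bernoulli(p) variable equals k (k is 0 or 1)."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1 - p


def bernoulli_trial(p, rng):
    """Simulate one trial: 1 ('success') with probability p, else 0 ('failure')."""
    return 1 if rng.random() < p else 0


p = 0.3                 # assumed success probability, for illustration
rng = random.Random(0)  # fixed seed so the run is repeatable

# Each draw is a single, independent Bernoulli trial.
draws = [bernoulli_trial(p, rng) for _ in range(10_000)]
success_rate = sum(draws) / len(draws)
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p), success_rate)
```

Over many repeated trials the observed share of successes should settle near p, while each individual trial still has only the two outcomes 0 and 1.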

In multiple regression, model selection aims to choose the most _______ model that best predicts the response variable.

  • complex
  • overfit
  • parsimonious
  • simple
In multiple regression, model selection aims to choose the most parsimonious model that best predicts the response variable. A parsimonious model is a model that accomplishes the desired level of explanation or prediction with as few predictor variables as possible.

What does it mean if the Chi-square test for goodness of fit is statistically significant?

  • The observed data and theoretical distribution are negatively correlated
  • The observed data and theoretical distribution are positively correlated
  • The observed data differs significantly from what we would expect if it followed the theoretical distribution
  • The observed data fits the theoretical distribution perfectly
If the Chi-square test for goodness of fit is statistically significant, this means that the observed data differs significantly from what we would expect if the data followed the theoretical distribution.
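
A small worked example may help. The sketch below computes the chi-square statistic by hand for hypothetical die-roll counts and compares it with the tabulated critical value for 5 degrees of freedom at a significance level of 0.05 (11.070). The counts are invented for illustration.

```python
# Hypothetical counts from 60 rolls of a six-sided die (faces 1 to 6).
observed = [5, 8, 9, 8, 10, 20]

# Theoretical distribution: a fair die predicts 10 rolls per face.
expected = [10] * 6

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus one.
degrees_of_freedom = len(observed) - 1

# Tabulated critical value for df = 5 at alpha = 0.05.
critical_value = 11.070

significant = chi_square > critical_value
print(chi_square, degrees_of_freedom, significant)
```

Here the statistic exceeds the critical value, so the observed counts differ significantly from what a fair die would be expected to produce.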

How does Pearson's Correlation Coefficient handle outliers?

  • Automatically removes outliers
  • Converts outliers to mean values
  • Ignores outliers
  • Is highly sensitive to outliers
Pearson's Correlation Coefficient is highly sensitive to outliers. This is because it involves a mean and standard deviation calculation, and these values can be greatly influenced by outliers. Even a single outlier can significantly skew the result of the correlation.
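
The effect of a single outlier can be seen in a minimal sketch. The `pearson_r` helper and the data are hypothetical; the coefficient is computed directly from deviations about the means.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(spread_x * spread_y)


# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean = pearson_r(xs, ys)

# The same data with one extreme point appended.
r_with_outlier = pearson_r(xs + [6], ys + [-20])

print(r_clean, r_with_outlier)
```

Five points show a perfect positive correlation, yet one added outlier is enough to make the coefficient negative.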

In statistics, the entire group of individuals or observations that we want to understand is called the _______.

  • distribution
  • parameter
  • population
  • sample
In statistics, a population is the entire group of individuals or observations that we want to understand or draw conclusions about. It's the total set of observations that can be made. For example, if you want to know the average height of an adult male in the US, the population would be all adult males in the US.

The first Principal Component is the direction in the dataset that captures the ______ variance in the data.

  • least
  • median
  • most
  • random
The first Principal Component is the direction (or vector) in the multidimensional space along which the data varies the most, so it captures the most variance in the data.
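
For two-dimensional data the first principal component can be found in closed form, which gives a compact sketch of the idea. The points below are hypothetical, and the angle formula used (half of atan2(2·sxy, sxx − syy)) applies only to the two-variable case; higher-dimensional data would need an eigendecomposition of the covariance matrix.

```python
import math

# Hypothetical 2-D points that lie roughly along the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
n = len(points)

mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

# Angle of the direction of greatest variance (2-D case only).
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
pc1 = (math.cos(theta), math.sin(theta))   # first principal component
pc2 = (-math.sin(theta), math.cos(theta))  # perpendicular direction


def variance_along(direction):
    """Variance of the data projected onto a unit direction vector."""
    ux, uy = direction
    return ux * ux * sxx + 2 * ux * uy * sxy + uy * uy * syy


var_pc1 = variance_along(pc1)
var_pc2 = variance_along(pc2)
print(pc1, var_pc1, var_pc2)
```

The variance along `pc1` is larger than along either original axis, and the two component variances add up to the total variance `sxx + syy`.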

The ________ of a box plot are used to indicate variability outside the upper and lower quartiles.

  • Bars
  • Outliers
  • Tails
  • Whiskers
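The whiskers of a box plot are used to indicate the variability of the data outside the upper and lower quartiles. Depending on the convention, they extend either to the minimum and maximum data values, or to the most extreme data points that lie within 1.5 times the interquartile range of the quartiles. In the second case, points beyond the whiskers are plotted individually as outliers.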
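
Under the common 1.5 × IQR convention, whisker positions can be computed as in the following sketch. The data is hypothetical, and quartile values depend on the interpolation method; `statistics.quantiles` with its default settings is only one of several conventions.

```python
import statistics

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Lower and upper quartiles (default 'exclusive' method).
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
lower_whisker = min(inside)
upper_whisker = max(inside)

# Points beyond the fences are drawn individually as outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_whisker, upper_whisker, outliers)
```

Note that the whiskers end at actual data values, not at the fences themselves.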

One common feature of non-parametric methods is the use of ________ rather than raw data points.

  • averages
  • frequencies
  • medians
  • ranks
One common feature of non-parametric methods is the use of ranks rather than raw data points. Working with ranks makes these methods more robust to outliers, and they do not require the data to follow a specific distribution.
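
The robustness of ranks can be illustrated with a short sketch. The `ranks` helper is hypothetical; it assigns rank 1 to the smallest value and gives tied values their average rank, which is one common convention.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


# Hypothetical measurements, then the same data with one extreme value.
original = [12, 15, 11, 14]
with_outlier = [12, 150, 11, 14]

ranks_original = ranks(original)
ranks_with_outlier = ranks(with_outlier)
print(ranks_original, ranks_with_outlier)
```

Replacing 15 with 150 changes the mean considerably but leaves every rank unchanged, so a rank-based statistic gives the same result for both data sets.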

How does a p-value relate to the significance level in a hypothesis test?

  • A higher p-value indicates a more significant result
  • A smaller p-value means the result is less likely to have occurred by chance
  • The p-value does not depend on the significance level
  • The p-value is the probability that the null hypothesis is true
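The p-value is the probability of obtaining a result at least as extreme as the one actually observed, assuming the null hypothesis is true. A smaller p-value therefore means the observed result would be less likely to arise by chance alone under the null hypothesis. If the p-value is smaller than the significance level (alpha), we reject the null hypothesis. Note that the p-value is not the probability that the null hypothesis is true.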
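
As a worked example, the sketch below computes an exact p-value for a hypothetical coin-flip experiment and compares it with alpha. The scenario (9 heads in 10 flips, null hypothesis of a fair coin) and the choice of alpha = 0.05 are assumptions for illustration; the two-sided value is obtained by doubling the one-sided tail, which is valid here because the null distribution is symmetric.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n_flips = 10
observed_heads = 9  # hypothetical observed result

# One-sided p-value: probability of 9 or more heads if the coin is fair.
p_one_sided = sum(
    binomial_pmf(k, n_flips) for k in range(observed_heads, n_flips + 1)
)

# Two-sided p-value: equally extreme results in either direction.
p_two_sided = min(1.0, 2 * p_one_sided)

alpha = 0.05  # significance level, chosen before looking at the data
reject_null = p_two_sided < alpha
print(p_one_sided, p_two_sided, reject_null)
```

Because the two-sided p-value is below alpha, the null hypothesis of a fair coin would be rejected at the 0.05 level in this example.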