How does factor analysis differ from principal component analysis (PCA)?

  • Factor analysis does not involve rotation of variables, while PCA does
  • Factor analysis looks for shared variance while PCA looks for total variance
  • PCA focuses on unobservable variables, while factor analysis focuses on observable variables
  • PCA is used for dimensionality reduction, while factor analysis is used for data cleaning
Factor analysis and PCA differ primarily in what they seek to model. Factor analysis models the shared variance among observed variables, attributing it to latent (unobservable) factors, while PCA models the total variance and is used chiefly for dimensionality reduction.
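For a concrete contrast, here is a minimal sketch on synthetic data, assuming scikit-learn is available: factor analysis estimates a per-variable unique (noise) variance on top of the shared factors, while PCA simply reports how much of the total variance each component captures.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)
# Two latent factors drive six observed variables, plus unique noise.
latent = rng.normal(size=(500, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + rng.normal(scale=0.5, size=(500, 6))

fa = FactorAnalysis(n_components=2).fit(X)
pca = PCA(n_components=2).fit(X)

# FA estimates per-variable unique (noise) variances; PCA has no such notion.
print("FA unique variances:", fa.noise_variance_.round(2))
# PCA reports how much of the *total* variance each component captures.
print("PCA explained variance ratio:", pca.explained_variance_ratio_.round(2))
```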

The term '________' refers to the sharpness of the peak of a frequency-distribution curve.

  • Kurtosis
  • Median
  • Mode
  • Skewness
Kurtosis refers to the sharpness of the peak of a frequency-distribution curve. It measures both the peakedness and the heaviness of the tails of a distribution; distributions with large kurtosis exhibit tail data exceeding the tails of the normal distribution.
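A quick numeric illustration with SciPy (assumed available): excess kurtosis is roughly 0 for a normal sample and clearly positive for a heavier-tailed Student's t sample.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
normal_sample = rng.normal(size=100_000)
heavy_tailed = rng.standard_t(df=5, size=100_000)  # Student's t: fatter tails

# SciPy's default is Fisher's definition, so a normal distribution scores ~0.
print("normal:  ", kurtosis(normal_sample))   # close to 0
print("t (df=5):", kurtosis(heavy_tailed))    # clearly positive
```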

What is the underlying assumption of linearity in a multiple linear regression model?

  • All independent variables must have a linear relationship with the dependent variable
  • All residuals must be equal
  • All variables must be continuous
  • All variables must be normally distributed
The assumption of linearity in a multiple linear regression model assumes that the relationship between each independent variable and the dependent variable is linear. This implies that the change in the dependent variable due to a one-unit change in the independent variable is constant, regardless of the value of the independent variable.
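One common way to check this assumption is to fit a straight line and look for systematic structure in the residuals. The sketch below, on deliberately curved synthetic data, is one illustrative version of that diagnostic.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 2 + 3 * x + 0.4 * x**2 + rng.normal(scale=1.0, size=x.size)  # truly curved

slope, intercept = np.polyfit(x, y, deg=1)   # misspecified linear fit
residuals = y - (intercept + slope * x)

# Correlation between residuals and x^2 reveals the unmodeled curvature;
# for a correctly specified linear model it would hover near zero.
print("corr(residuals, x^2):", np.corrcoef(residuals, x**2)[0, 1].round(2))
```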

How does a binomial distribution differ from a normal distribution?

  • Binomial distribution is continuous, while normal is discrete
  • Both are continuous distributions
  • Both are discrete distributions
  • Normal distribution is continuous, while binomial is discrete
A binomial distribution is discrete: it takes only integer values, representing the number of successes in a fixed number of independent Bernoulli trials with a given success probability. A normal distribution is continuous, and it is often used as an approximation to the binomial distribution when the number of trials is large.
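The approximation can be checked directly. This sketch, assuming SciPy is available, compares exact binomial probabilities with the normal approximation (using a continuity correction) at a few points.

```python
from scipy.stats import binom, norm

n, p = 100, 0.3
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5

for k in (20, 30, 40):
    exact = binom.pmf(k, n, p)
    # Continuity correction: P(X = k) ~ P(k - 0.5 < Y < k + 0.5)
    approx = norm.cdf(k + 0.5, mu, sigma) - norm.cdf(k - 0.5, mu, sigma)
    print(f"k={k}: exact={exact:.4f}, normal approx={approx:.4f}")
```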

How does effect size impact hypothesis testing?

  • Effect size has no impact on hypothesis testing
  • Larger effect sizes always lead to rejection of the null hypothesis
  • Larger effect sizes always lead to smaller p-values
  • Larger effect sizes increase the statistical power of the test
Effect size measures the magnitude of the difference or the strength of the relationship in the population. A larger effect size means a larger difference or stronger relationship, which in turn increases the statistical power of the test. Power is the probability that the test correctly rejects the null hypothesis when the alternative is true.
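To see the effect numerically, here is a minimal sketch with statsmodels (assumed available): holding sample size and significance level fixed, power rises with Cohen's d.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.2, 0.5, 0.8):  # small, medium, large effect sizes
    power = analysis.power(effect_size=d, nobs1=50, alpha=0.05)
    print(f"Cohen's d = {d}: power = {power:.3f}")
```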

Under what conditions does the Central Limit Theorem hold true?

  • When the data is skewed
  • When the population is normal
  • When the sample size is sufficiently large
  • When the standard deviation is zero
The Central Limit Theorem holds true when the sample size is sufficiently large (usually n > 30), regardless of the shape of the population distribution. This theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed.
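A simulation makes this concrete. The sketch below draws repeated samples of size 30 from a heavily skewed exponential population; the resulting sample means behave as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 10_000
samples = rng.exponential(scale=1.0, size=(reps, n))
means = samples.mean(axis=1)

# The CLT predicts mean ~ mu = 1 and std ~ sigma / sqrt(n) = 1 / sqrt(30).
print("mean of sample means:", means.mean().round(3))
print("std of sample means: ", means.std().round(3))
print("predicted std:       ", (1 / np.sqrt(n)).round(3))
```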

How does skewness affect the relationship between the mean, median, and mode of a distribution?

  • Changes the relationship
  • Increases the standard deviation
  • No effect
  • Reduces the kurtosis
Skewness affects the relationship between the mean, median, and mode. In a positively skewed distribution, the mean is usually greater than the median, which is greater than the mode. In a negatively skewed distribution, the mode is usually greater than the median, which is greater than the mean.
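A quick numeric illustration with NumPy: in a right-skewed lognormal sample, the mean is pulled toward the long tail and ends up above the median, which in turn sits above the mode.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # positive skew

print("mean:  ", np.mean(sample).round(3))    # largest, pulled by the tail
print("median:", np.median(sample).round(3))  # in between
# The lognormal mode is exp(mu - sigma^2) = exp(-1) ~ 0.368, the smallest.
```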

How can you test the assumption of independence in a Chi-square test for goodness of fit?

  • By calculating the standard deviation of the observations
  • By conducting a separate Chi-square test of independence
  • By conducting a t-test
  • By examining the correlation between observations
To test the assumption of independence in a Chi-square test for goodness of fit, you can conduct a separate Chi-square test of independence. That test compares the observed joint frequencies of two categorical variables with the frequencies that would be expected if the variables were independent.
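As a minimal sketch with SciPy (the contingency-table counts below are made up for illustration):

```python
from scipy.stats import chi2_contingency

# Rows: group A / group B; columns: outcome 1 / outcome 2 / outcome 3
observed = [[30, 20, 10],
            [25, 25, 15]]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, dof = {dof}")
# 'expected' holds the frequencies implied by independence of rows/columns.
```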

Non-parametric tests are also known as ________ tests because they make fewer assumptions about the data.

  • assumption-free
  • distribution-free
  • free-assumption
  • free-distribution
Non-parametric tests are also known as distribution-free tests because they make fewer assumptions about the data; in particular, they do not require the data to follow a specific distribution.
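The sketch below, assuming SciPy is available, runs a parametric t-test and the distribution-free Mann-Whitney U test on the same skewed samples; the latter ranks the data instead of assuming normality.

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)
a = rng.exponential(scale=1.0, size=40)   # skewed, non-normal samples
b = rng.exponential(scale=1.5, size=40)

print(f"t-test p:       {ttest_ind(a, b).pvalue:.4f}")
print(f"Mann-Whitney p: {mannwhitneyu(a, b).pvalue:.4f}")
```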

What is the difference between a parameter and a statistic in the field of statistics?

  • A parameter and a statistic are the same thing
  • A parameter is based on a sample; a statistic is based on the population
  • A statistic is a numerical measure; a parameter is a graphical representation
  • A statistic is based on a sample; a parameter is based on the population
In the field of statistics, a parameter is a numerical characteristic of a population, whereas a statistic is a numerical characteristic of a sample. Parameters are often unknown because we cannot examine the entire population. We use statistics, which we compute from sample data, to estimate parameters.
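A small simulation illustrates the distinction: the population mean is a fixed parameter, while the sample mean is a statistic that varies from sample to sample.

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=50, scale=10, size=1_000_000)
mu = population.mean()  # the parameter (normally unknown in practice)

for _ in range(3):
    sample = rng.choice(population, size=100, replace=False)
    print(f"sample mean = {sample.mean():.2f}  (estimating mu = {mu:.2f})")
```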

___________ refers to the condition where the variance of the errors or residuals is constant across all levels of the explanatory variables.

  • Autocorrelation
  • Heteroscedasticity
  • Homoscedasticity
  • Multicollinearity
Homoscedasticity is the condition in which the variance of the errors or residuals is constant across all levels of the explanatory variables. It is one of the key assumptions of linear regression.
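One way to check this assumption is the Breusch-Pagan test. The sketch below, assuming statsmodels is available, applies it to deliberately heteroscedastic synthetic data.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
# Error variance grows with x -> deliberately heteroscedastic data.
y = 1 + 2 * x + rng.normal(scale=0.5 * x, size=200)

X = sm.add_constant(x)
model = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(model.resid, X)
print(f"Breusch-Pagan p = {lm_pvalue:.4f}")  # small p -> reject homoscedasticity
```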

Which mathematical concept is at the core of PCA?

  • Differentiation
  • Eigenvalues and Eigenvectors
  • Integration
  • Matrix Multiplication
PCA relies heavily on the concepts of eigenvalues and eigenvectors. These allow it to determine the axes along which the data has the most variance, which are used to form the new variables (principal components).
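As an illustrative sketch, PCA can be implemented from scratch as an eigendecomposition of the sample covariance matrix using only NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[3, 1], [1, 1]], size=500)

X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# Eigenvectors give the directions of maximal variance; eigenvalues give
# the variance captured along each direction.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # sort by descending variance
components = eigenvectors[:, order]

scores = X_centered @ components               # project onto principal axes
# Variance of the projected data matches the (sorted) eigenvalues.
print("eigenvalues:       ", eigenvalues[order].round(2))
print("variance of scores:", scores.var(axis=0, ddof=1).round(2))
```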