What happens to the correlation coefficient when you have outliers in your data?

It decreases
It increases
It may become misleading
It remains the same

Outliers can greatly affect the correlation coefficient, making it misleading. An outlier that lies along the overall trend of the data can inflate the correlation, while one that lies against the trend can deflate it or even reverse its sign. Hence, it's important to examine and handle outliers before conducting correlation analysis.
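
The effect can be checked numerically. The sketch below is only an illustration: the data values and the helper `pearson_r` are made up for this example, and it uses plain Python rather than any particular statistics library.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data with a perfect positive linear relationship (y = 2x).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 10, 12, 14, 16]
r_clean = pearson_r(x, y)

# One made-up outlier that runs against the trend flips the sign.
r_against = pearson_r(x + [20], y + [-30])

# Hypothetical weakly related data, then one extreme point along the trend.
x_weak = [1, 2, 3, 4, 5, 6]
y_weak = [3, 1, 4, 2, 5, 3]
r_weak = pearson_r(x_weak, y_weak)
r_inflated = pearson_r(x_weak + [30], y_weak + [30])

print(f"clean: {r_clean:.3f}, with opposing outlier: {r_against:.3f}")
print(f"weak: {r_weak:.3f}, with aligned outlier: {r_inflated:.3f}")
```

In both cases a single added point changes the coefficient substantially, which is why the coefficient alone can mislead.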

How does the standard error affect the confidence interval?

Larger standard error leads to a narrower interval
Larger standard error leads to a skewed interval
Larger standard error leads to a wider interval
Standard error does not affect the confidence interval

Larger standard error leads to a wider confidence interval. The standard error measures the variability of the sampling distribution, and the margin of error of the interval is a critical value multiplied by the standard error. A larger standard error therefore means more variability, a less precise estimate, and a wider interval.
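
As a rough illustration, assume a normal-approximation interval of the form estimate ± z × standard error, with z = 1.96 for roughly 95% confidence. The function name and the numbers below are hypothetical.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Normal-approximation interval: estimate +/- z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Same point estimate, two different (made-up) standard errors.
low_small, high_small = confidence_interval(50.0, standard_error=1.0)
low_large, high_large = confidence_interval(50.0, standard_error=3.0)

width_small = high_small - low_small
width_large = high_large - low_large

print(f"SE = 1.0 -> width {width_small:.2f}")
print(f"SE = 3.0 -> width {width_large:.2f}")
```

Tripling the standard error triples the width of the interval, since the width is simply 2 × z × standard error under this assumption.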

How does the interquartile mean provide a measure of central tendency that is resistant to outliers?

By focusing on the data between the first and third quartiles
By focusing only on the highest values in the data
By focusing only on the lowest values in the data
By ignoring all outlier values

The interquartile mean focuses on the data between the first quartile (25th percentile) and the third quartile (75th percentile), excluding the lowest 25% and the highest 25% of data points. This makes it less influenced by outliers and extreme values, hence a more robust measure of central tendency for skewed or asymmetrical distributions.
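
A minimal sketch of the idea, under a simplifying assumption: it drops `len(values) // 4` points from each tail, which matches the definition exactly only when the sample size is a multiple of 4 (the full definition weights the boundary observations fractionally). The data are made up.

```python
def interquartile_mean(values):
    """Mean of the middle half of the data.

    Simplified: drops len(values) // 4 points from each tail.
    """
    ordered = sorted(values)
    cut = len(ordered) // 4
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)

# Hypothetical data with one extreme value.
data = [1, 3, 4, 5, 6, 7, 8, 500]

plain_mean = sum(data) / len(data)
iqm = interquartile_mean(data)

print(f"mean: {plain_mean}, interquartile mean: {iqm}")
```

The ordinary mean is pulled far toward the extreme value, while the interquartile mean stays near the bulk of the data.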

In hypothesis testing, a Type I error is committed when the null hypothesis is ______ but we ______ it.

False, fail to reject
False, reject
True, fail to reject
True, reject

A Type I error, also known as a false positive, occurs when we reject a true null hypothesis. This means we've found evidence of an effect or difference when there really isn't one.
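
A small simulation can make this concrete. The sketch below assumes a two-sided z-test with known standard deviation 1 and a 5% significance level; the function name, sample size, and number of trials are arbitrary choices for illustration.

```python
import math
import random

def type_i_error_rate(trials=2000, n=30, z_crit=1.96, seed=0):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw from N(0, 1), so the null hypothesis "mean = 0" really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # z-statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null: a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

Because every simulated null hypothesis is true, each rejection is a false positive, and the observed rate should land close to the chosen significance level.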

The Breusch-Pagan test and the White test are common methods to detect __________ in the residuals.

Autocorrelation
Heteroscedasticity
Multicollinearity
Outliers

The Breusch-Pagan test and the White test are common methods used to detect heteroscedasticity in the residuals. Heteroscedasticity means that the variance of the residuals is not constant: the spread of the errors changes across the values of the predictor variables.
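
To show the mechanics, here is a rough sketch of the studentized (Koenker) form of the Breusch-Pagan statistic for a single predictor: fit the regression, regress the squared residuals on the predictor, and compute n × R² from that auxiliary regression. The helper names and the deterministic toy data are assumptions for this example; in practice a statistics library would normally be used.

```python
def simple_ols(xs, ys):
    """Intercept and slope of a one-predictor least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def breusch_pagan_lm(xs, ys):
    """LM statistic n * R^2 from regressing squared residuals on x."""
    a, b = simple_ols(xs, ys)
    sq_resid = [(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)]
    a2, b2 = simple_ols(xs, sq_resid)
    mean_sq = sum(sq_resid) / len(sq_resid)
    ss_tot = sum((e - mean_sq) ** 2 for e in sq_resid)
    ss_res = sum((e - (a2 + b2 * x)) ** 2 for x, e in zip(xs, sq_resid))
    return len(xs) * (1 - ss_res / ss_tot)

# Toy data: errors alternate in sign; their size is constant or grows with x.
x = list(range(1, 21))
signs = [(-1) ** i for i in x]
y_constant_spread = [2 * xi + 0.5 * s for xi, s in zip(x, signs)]
y_growing_spread = [2 * xi + 0.5 * s * xi for xi, s in zip(x, signs)]

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom

lm_constant = breusch_pagan_lm(x, y_constant_spread)
lm_growing = breusch_pagan_lm(x, y_growing_spread)
print(f"constant spread: LM = {lm_constant:.2f}")
print(f"growing spread:  LM = {lm_growing:.2f}")
```

An LM value above the critical value suggests heteroscedasticity; with one predictor the statistic is compared against a chi-square distribution with 1 degree of freedom.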

How does the Akaike Information Criterion (AIC) handle the trade-off between goodness of fit and model complexity in model selection?

It always prefers a more complex model.
It always prefers a simpler model.
It does not consider model complexity.
It penalizes models with more parameters to avoid overfitting.

The AIC handles the trade-off by introducing a penalty term for the number of parameters in the model. This discourages overfitting and leads to a balance between model fit and complexity.
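
The standard formula is AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; lower values are preferred. The log-likelihoods below are made-up numbers chosen only to illustrate the penalty.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical fits: the complex model fits slightly better (higher
# log-likelihood) but uses twice as many parameters.
aic_simple = aic(log_likelihood=-120.0, num_params=3)
aic_complex = aic(log_likelihood=-119.5, num_params=6)

print(f"simple model AIC:  {aic_simple}")
print(f"complex model AIC: {aic_complex}")
```

Here the small gain in fit does not offset the penalty for three extra parameters, so the simpler model has the lower AIC.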

What information does a box plot provide about a dataset?

The correlation between variables
The exact values of all data points
The mean and standard deviation
The minimum, first quartile, median, third quartile, and maximum

A box plot (also known as a box-and-whisker plot) displays a five-number summary of the data: the minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The 'box' spans the interquartile range (from Q1 to Q3), and the 'whiskers' extend toward the minimum and maximum. When outliers are plotted as individual points, the whiskers commonly stop at the most extreme values within 1.5 times the interquartile range of the box.
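
The quantities behind a box plot can be computed directly. This sketch uses `statistics.quantiles` from the Python standard library with the `inclusive` method and assumes the common 1.5 × IQR rule for flagging outliers; quartile conventions differ between tools, so other software may report slightly different values. The data are made up.

```python
import statistics

# Hypothetical data with one unusually large value.
data = sorted([2, 4, 5, 7, 8, 9, 11, 40])

q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences under the 1.5 * IQR convention (an assumption, not the only rule).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in data if v < lower_fence or v > upper_fence]
inside = [v for v in data if lower_fence <= v <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(f"min={data[0]}, Q1={q1}, median={median}, Q3={q3}, max={data[-1]}")
print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```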

Why is sampling without replacement often used in practice?

It allows for the inclusion of every individual in the population
It ensures that each selection is independent
It guarantees that each sample is unique
It is easier than sampling with replacement

Sampling without replacement is often used in practice because it guarantees that no individual appears more than once in a sample: once an individual is selected, it cannot be chosen again for the same sample. Every draw therefore adds new information, which tends to give a more diverse and representative sample than drawing the same number of individuals with replacement.
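
In Python's standard library, `random.sample` draws without replacement and `random.choices` draws with replacement. The population and seed below are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
population = list(range(1, 11))  # ten hypothetical individuals

# Without replacement: no individual can appear twice.
without_replacement = rng.sample(population, k=5)

# With replacement: 20 draws from 10 individuals must contain repeats.
with_replacement = rng.choices(population, k=20)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```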

Why is the Spearman rank correlation considered a non-parametric test?

It assumes a normal distribution
It can't handle ordinal data
It does not assume a normal distribution
It tests for a linear relationship

The Spearman rank correlation is considered a non-parametric test because it does not assume a normal distribution of data. It only assumes that the variables are ordinal or continuous and that the relationship between them is monotonic.
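
One way to see this is that the Spearman coefficient is the Pearson correlation computed on the ranks of the data, so only the ordering matters and not the distribution of the raw values. The helpers and data below are a hypothetical sketch in plain Python.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Monotonic but non-linear made-up relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman: {rho:.3f}, Pearson: {r:.3f}")
```

Because the relationship is perfectly monotonic, the rank-based coefficient is at its maximum, while the Pearson coefficient is lower because the relationship is not linear.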

What are the degrees of freedom in a Chi-square test for a 2x3 contingency table?

2
3
4
6

In a Chi-square test of independence, the degrees of freedom for a contingency table with r rows and c columns are (r-1) * (c-1). For a 2x3 table, this gives (2-1) * (3-1) = 2.
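
The calculation can be sketched in a few lines. The function name and the observed counts below are invented for illustration; the statistic is the usual sum of (observed - expected)² / expected over all cells.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    rows, cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(cols)]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(rows):
        for j in range(cols):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected

    dof = (rows - 1) * (cols - 1)
    return statistic, dof

# Hypothetical 2x3 table of observed counts.
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

statistic, dof = chi_square_independence(observed)
print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
```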
