How does the 'elbow method' help in determining the optimal number of clusters in K-means clustering?
- By calculating the average distance between all pairs of clusters
- By comparing the silhouette scores for different numbers of clusters
- By creating a dendrogram of clusters
- By finding the point in the plot of within-cluster sum of squares where the decrease rate sharply shifts
The elbow method involves plotting the within-cluster sum of squares (WCSS) as a function of the number of clusters and picking the 'elbow' of the curve as the number of clusters to use. This 'elbow' is the point beyond which adding another cluster no longer reduces the WCSS substantially, and it represents the optimal number of clusters.
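As a minimal sketch, the elbow curve can be computed with scikit-learn, where the WCSS is exposed as the `inertia_` attribute of a fitted `KMeans` model (the data below is synthetic and illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: three well-separated 2-D blobs (illustrative, not from the question).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (0, 5, 10)])

# WCSS (sklearn calls it inertia_) for k = 1..6; the "elbow" is where
# adding another cluster stops reducing the WCSS sharply.
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in range(1, 7)]

# For three true blobs, the drop from k=3 onward is much smaller
# than the drops before it.
drops = -np.diff(wcss)
```

Plotting `wcss` against `k` and looking for the bend visually is the usual workflow; here the sharp shrinkage of `drops` after `k = 3` plays the same role numerically.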
The type of factor analysis in which the researcher assumes that all variance in the observed variables is common variance is known as _______ factor analysis.
- common factor
- confirmatory
- exploratory
- principal component
The type of factor analysis in which the researcher assumes that all variance in the observed variables is common variance is known as common factor analysis.
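As a hedged sketch of the underlying idea, scikit-learn's `FactorAnalysis` fits a maximum-likelihood latent-factor model in which observed variables share common latent factors plus noise; classical common factor analysis as used in psychometrics is estimated somewhat differently, so treat this only as an illustration. All data below is synthetic:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic data: 4 observed variables all driven by 1 common latent
# factor plus independent noise (illustrative assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
loadings = np.array([[0.9, 0.8, 0.7, 0.6]])
X = latent @ loadings + 0.3 * rng.normal(size=(200, 4))

# Fit a one-factor model; components_ holds the estimated loadings
# of the single factor on each of the 4 observed variables.
fa = FactorAnalysis(n_components=1, random_state=0).fit(X)
```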
The Kruskal-Wallis Test is used to compare ________ independent samples.
- four
- three
- three or more
- two
The Kruskal-Wallis Test is used to compare three or more independent samples. It's an extension of the Mann-Whitney U Test for more than two groups.
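A minimal example with SciPy's `kruskal`, which takes any number of independent samples and compares their ranks (the group values below are made up for illustration):

```python
from scipy import stats

# Three independent samples (made-up illustrative values).
g1 = [6.2, 7.1, 6.8, 7.5, 6.9]
g2 = [5.1, 5.8, 5.5, 6.0, 5.3]
g3 = [7.9, 8.2, 7.6, 8.5, 8.0]

# The H statistic tests whether all groups come from the same
# distribution; it works on ranks, so normality is not assumed.
h, p = stats.kruskal(g1, g2, g3)
```

Because these three groups barely overlap, the p-value comes out small, consistent with the groups differing.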
In which situation is Spearman's Rank Correlation preferable to Pearson's correlation?
- When the data is normally distributed
- When the relationship between variables is non-linear and monotonic
- When the relationship is linear
- When there are no ties in the ranks
Spearman's Rank Correlation is preferable to Pearson's correlation when the relationship between variables is non-linear but monotonic. Pearson's correlation measures only linear relationships, whereas Spearman's operates on ranks and can therefore capture any monotonic relationship, linear or not.
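The difference is easy to see on a monotonic but non-linear relationship such as `y = x**3` (an illustrative choice): the ranks agree perfectly, so Spearman's coefficient is exactly 1, while Pearson's falls short of 1 because a straight line does not fit a cubic.

```python
import numpy as np
from scipy import stats

# y = x**3 is monotonic but non-linear (illustrative example).
x = np.arange(1, 11, dtype=float)
y = x ** 3

rho, _ = stats.spearmanr(x, y)   # ranks match perfectly -> rho = 1.0
r, _ = stats.pearsonr(x, y)      # a line fits imperfectly -> r < 1
```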
What is the z-value associated with a 95% confidence interval in a standard normal distribution?
- 1.64
- 1.96
- 2
- 2.33
The z-value associated with a 95% confidence interval in a standard normal distribution is approximately 1.96, because 95% of the area under the standard normal curve lies within 1.96 standard deviations of the mean. In practice, this means a 95% confidence interval extends about 1.96 standard errors on either side of the sample estimate.
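The value follows from inverting the standard normal CDF: a two-sided 95% interval leaves 2.5% in each tail, so the critical value is the 97.5th percentile. With SciPy:

```python
from scipy import stats

# Two-sided 95% interval leaves 2.5% in each tail, so invert
# the standard normal CDF at 0.975.
z = stats.norm.ppf(0.975)
print(round(z, 2))  # 1.96
```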
How is the interquartile range different from the range in handling outliers?
- Both exclude outliers
- Both include outliers
- The interquartile range does not include outliers, the range does
- The interquartile range includes outliers, the range does not
The interquartile range, which is the difference between the upper quartile (Q3) and the lower quartile (Q1), represents the middle 50% of the data and is not affected by outliers. The range, on the other hand, is the difference between the maximum and minimum data values and is significantly affected by outliers.
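A quick numerical illustration with NumPy (the data values are made up): adding one extreme point blows up the range but barely moves the IQR.

```python
import numpy as np

# The same data with and without one extreme outlier (illustrative values).
data = np.array([10, 12, 13, 14, 15, 16, 18])
with_outlier = np.append(data, 100)

def iqr(a):
    # IQR = Q3 - Q1, the spread of the middle 50% of the data.
    q1, q3 = np.percentile(a, [25, 75])
    return q3 - q1

# The range jumps from 8 to 90; the IQR changes by less than 1.
range_shift = np.ptp(with_outlier) - np.ptp(data)
iqr_shift = iqr(with_outlier) - iqr(data)
```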
How can 'outliers' impact the result of K-means clustering?
- Outliers can distort the shape and size of the clusters
- Outliers can lead to fewer clusters
- Outliers can lead to more clusters
- Outliers don't impact K-means clustering
Outliers can have a significant impact on the result of K-means clustering. Because each centroid is the mean of its assigned points, an outlier pulls its centroid toward itself, distorting the shape and size of the clusters and producing less accurate, less meaningful groupings.
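A small demonstration of the effect, using scikit-learn on synthetic data: with one tight blob plus a single far-away outlier and `k = 2`, the outlier typically captures a centroid entirely for itself.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# One tight blob near the origin plus a single extreme outlier (illustrative).
blob = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
X = np.vstack([blob, [[50.0, 50.0]]])

# With k = 2, the outlier is so far away that it ends up as its own
# "cluster", wasting one of the two centroids.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
sizes = np.bincount(km.labels_)  # cluster sizes
```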
A positive Pearson's Correlation Coefficient indicates a ________ relationship between two variables.
- inverse
- linear
- perfect
- positive
A positive Pearson's Correlation Coefficient indicates a positive relationship between two variables. This means the two variables tend to move in the same direction: as one increases, the other tends to increase as well.
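For instance, with made-up values for hours studied and test scores that rise together, the coefficient's sign comes out positive:

```python
import numpy as np

# Hours studied vs. test score: made-up values that rise together.
hours = np.array([1, 2, 3, 4, 5, 6], dtype=float)
score = np.array([52, 60, 63, 70, 78, 85], dtype=float)

# Off-diagonal entry of the correlation matrix is Pearson's r.
r = np.corrcoef(hours, score)[0, 1]
```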
What are the assumptions made in simple linear regression?
- Homogeneity, normality, and symmetry
- Independence, homogeneity, and linearity
- Linearity, homoscedasticity, and normality
- Symmetry, linearity, and independence
The assumptions made in simple linear regression include linearity (the relationship between the independent and dependent variables is linear), homoscedasticity (the variance of the residuals is constant across all levels of the independent variable), and normality (the residuals are normally distributed).
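These assumptions are usually checked by fitting the line and inspecting the residuals. A sketch with NumPy on synthetic data (the true line `y = 2 + 1.5x` and the noise level are illustrative assumptions):

```python
import numpy as np

# Fit y = a + b*x by least squares and inspect the residuals, which is
# how linearity, homoscedasticity, and normality are checked in practice.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=x.size)  # constant-variance noise

b, a = np.polyfit(x, y, deg=1)   # slope, intercept
residuals = y - (a + b * x)

# Residuals should center on zero (linearity), show roughly constant
# spread across x (homoscedasticity), and look approximately normal.
```

With an intercept in the model, least-squares residuals sum to zero by construction; a residual-vs-x plot and a normal Q-Q plot of `residuals` are the standard visual diagnostics for the other two assumptions.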
Principal Component Analysis (PCA) is a dimensionality reduction technique that projects the data into a lower dimensional space called the _______.
- eigen space
- feature space
- subspace
- variance space
PCA is a technique that projects the data into a new, lower-dimensional subspace. This subspace consists of principal components which are orthogonal to each other and capture the maximum variance in the data.
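A short sketch with scikit-learn on synthetic 3-D data that mostly varies along one direction (an illustrative construction): the fitted components are orthogonal unit vectors, and projecting onto the top two yields the lower-dimensional subspace.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 3-D data that varies mostly along one direction, plus small noise.
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.1 * rng.normal(size=(200, 3))

# Project onto the 2-D subspace spanned by the top two principal components.
pca = PCA(n_components=2).fit(X)
Z = pca.transform(X)  # shape (200, 2)

# The components are orthogonal, and the first captures most of the variance.
dot = np.dot(pca.components_[0], pca.components_[1])
```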