If we want to reduce both Type I and Type II errors, we could increase the ______.

Confidence level
Power of the test
Sample size
Significance level

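Increasing the sample size is the only listed option that can reduce both error rates. At a fixed significance level, a larger sample makes the test more sensitive and lowers the Type II error rate. The extra power can also be traded for a stricter significance level, which lowers the Type I error rate while still keeping the Type II error rate below its original value. Raising the confidence level (equivalently, lowering the significance level) without adding data reduces Type I errors only at the cost of more Type II errors. In real-world studies, resources, time, and other constraints often limit the sample size.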
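As a rough illustration of this trade-off, the sketch below computes the Type II error rate (beta) of a one-sided, one-sample z-test. The effect size of 0.5, the unit variance, and the helper name `z_test_power` are assumptions chosen for the example, not part of the question.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(n, alpha, effect_size):
    """Power of a one-sided one-sample z-test with known unit variance."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold under H0
    # Under the alternative, the test statistic is shifted by effect_size * sqrt(n).
    return 1 - std_normal.cdf(z_crit - effect_size * sqrt(n))


# Compare a small sample with a larger one, then tighten alpha as well.
for n, alpha in [(20, 0.05), (80, 0.05), (80, 0.01)]:
    beta = 1 - z_test_power(n, alpha, effect_size=0.5)
    print(f"n={n:3d}  alpha={alpha:.2f}  beta={beta:.4f}")
```

Under these assumptions, the last configuration has both a smaller alpha and a smaller beta than the first, which is only possible because the sample is larger.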

How does the Central Limit Theorem influence the shape of the distribution of sample means?

It states that all distributions will be skewed to the right.
It states that as the sample size increases, the distribution of sample means will more closely approximate a normal distribution, regardless of the shape of the population distribution.
The Central Limit Theorem does not influence the shape of the distribution.
The Central Limit Theorem turns all distributions into uniform distributions.

The Central Limit Theorem (CLT) states that the distribution of sample means tends toward a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the population has finite variance. The CLT therefore describes the sampling distribution of the mean, not the population itself: the sampling distribution becomes more nearly normal as the sample size grows.
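A small simulation can make this concrete. The sketch below draws samples from a strongly right-skewed exponential population and measures the skewness of the resulting sample means. The population, the number of repetitions, the seed, and the function names are all assumptions made for illustration.

```python
import random
from statistics import fmean


def skewness(values):
    """Sample skewness: third central moment over variance to the power 1.5."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def sample_mean_skewness(n, reps=5000, seed=0):
    """Skewness of the means of `reps` samples of size n from Exponential(1)."""
    rng = random.Random(seed)
    means = [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]
    return skewness(means)


for n in (1, 5, 50):
    print(f"n={n:2d}  skewness of sample means = {sample_mean_skewness(n):.3f}")
```

The skewness should shrink toward zero as n grows, which is what a distribution approaching normality looks like.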

Each subsequent Principal Component must be ______ to all the previous Principal Components.

equal
orthogonal
parallel
proportional

Each subsequent Principal Component in PCA must be orthogonal (perpendicular) to all previous Principal Components. Because the component directions are orthogonal eigenvectors of the covariance matrix, the data projected onto them (the component scores) are uncorrelated.
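For two variables, the principal axes can be found in closed form, which allows a quick check of both properties. This is a minimal sketch on synthetic data; the data-generating numbers, the seed, and the helper `cov` are assumptions for illustration, and real PCA on more than two variables would use an eigendecomposition or SVD instead.

```python
import math
import random

rng = random.Random(1)
# Synthetic correlated 2-D data (assumed purely for illustration).
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [0.8 * x + rng.gauss(0, 0.5) for x in xs]


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


sxx, syy, sxy = cov(xs, xs), cov(ys, ys), cov(xs, ys)

# Angle of the first principal axis of a 2x2 covariance matrix.
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
pc1 = (math.cos(theta), math.sin(theta))
pc2 = (-math.sin(theta), math.cos(theta))  # pc1 rotated by 90 degrees

# Project the data onto each component to get the scores.
scores1 = [x * pc1[0] + y * pc1[1] for x, y in zip(xs, ys)]
scores2 = [x * pc2[0] + y * pc2[1] for x, y in zip(xs, ys)]

dot = pc1[0] * pc2[0] + pc1[1] * pc2[1]
print(f"pc1 . pc2             = {dot:.2e}")
print(f"cov(scores1, scores2) = {cov(scores1, scores2):.2e}")
```

Both printed values should be zero up to floating-point error: the directions are orthogonal and the scores are uncorrelated.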

What are the assumptions made when using factor analysis?

Homoscedasticity, autocorrelation, and stationarity
Independence, normality, and equal variance
Normality, linearity, and homoscedasticity
Normality, linearity, and multicollinearity

The assumptions of factor analysis include normality (the variables used in the analysis should be normally distributed), linearity (the relationship between the factors and the variables should be linear), and homoscedasticity (the variances of the errors should be constant).

What is the significance of a Gaussian or normal distribution?

It describes the spread of evenly distributed data
It is the distribution that maximizes entropy
It is used only for discrete random variables
It is used when events occur at a constant rate

The Gaussian or normal distribution has several important properties and is widely used in statistics and the natural sciences. It is significant because it maximizes entropy among all continuous distributions with a given mean and variance, making it the most "uninformative" choice under those constraints and a common default in many scenarios. In addition, according to the central limit theorem, the suitably standardized sum of many independent and identically distributed (i.i.d.) random variables with finite variance tends toward a normal distribution.
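The maximum-entropy property can be checked against two other familiar distributions using their standard closed-form differential entropies (in nats). The choice of the Laplace and uniform distributions as comparisons, and of unit variance, is an assumption for this sketch.

```python
import math

sigma = 1.0  # common standard deviation for all three distributions

entropy = {
    # Normal: 0.5 * ln(2 * pi * e * sigma^2)
    "normal": 0.5 * math.log(2 * math.pi * math.e * sigma**2),
    # Laplace with scale b has variance 2 * b^2, so b = sigma / sqrt(2);
    # its entropy is 1 + ln(2 * b).
    "laplace": 1 + math.log(2 * sigma / math.sqrt(2)),
    # Uniform of width w has variance w^2 / 12, so w = sigma * sqrt(12);
    # its entropy is ln(w).
    "uniform": math.log(sigma * math.sqrt(12)),
}

for name, h in entropy.items():
    print(f"{name:8s} {h:.4f} nats")
```

With equal variance, the normal distribution has the largest entropy of the three.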

How does Pearson's Correlation Coefficient differ from Spearman's Rank Correlation?

Pearson's correlation coefficient cannot be negative, Spearman's can
Pearson's correlation coefficient is non-parametric, Spearman's is parametric
Pearson's correlation coefficient is used for ranked data, Spearman's is not
Pearson's correlation coefficient measures linear relationships, Spearman's measures monotonic relationships

Pearson's correlation coefficient measures the strength of a linear relationship, while Spearman's Rank Correlation measures the strength of a monotonic relationship. A monotonic relationship is one where the variables consistently move in the same (or opposite) direction, but not necessarily at a constant rate. Pearson's correlation is a parametric measure computed on the raw values, and inference based on it typically assumes approximately normal data. Spearman's correlation is non-parametric: it is computed on ranks, makes no normality assumption, and is less sensitive to outliers.
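The difference shows up clearly on data that is perfectly monotonic but far from linear. The sketch below uses y = exp(x) as an assumed example and hand-rolled helpers (`pearson`, `ranks`, `spearman`) so that it depends only on the standard library; the simple ranking function assumes there are no tied values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n of the values; assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = list(range(1, 11))
y = [math.exp(v) for v in x]  # monotonic but strongly non-linear

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Spearman's coefficient reaches 1 because the ordering is preserved exactly, while Pearson's coefficient stays well below 1 because the points do not lie on a straight line.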

What does it mean if the Chi-square statistic is significantly larger than the critical value?

The alternative hypothesis is true
The null hypothesis is true
The test result is insignificant
There is not enough evidence to reject the null hypothesis

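If the Chi-square statistic exceeds the critical value, we reject the null hypothesis in favor of the alternative hypothesis. In a test of independence, this suggests that there is a statistically significant association between the variables. Strictly, rejection means the observed data would be unlikely if the null hypothesis were true; it is evidence for the alternative rather than proof of it.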
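As a worked example, the sketch below computes the Chi-square statistic for a 2x2 contingency table and compares it with the critical value for one degree of freedom at the 0.05 level (3.841). The observed counts are made up for illustration.

```python
# Hypothetical 2x2 contingency table (rows: groups, columns: outcomes).
observed = [[30, 10],
            [10, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

CRITICAL_005_DF1 = 3.841  # chi-square critical value, alpha = 0.05, df = 1
reject_null = chi2 > CRITICAL_005_DF1

print(f"chi2 = {chi2:.3f}, reject H0: {reject_null}")
```

Here every expected count is 20, so the statistic is 4 * (10^2 / 20) = 20, far above the critical value, and the null hypothesis of independence is rejected.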

What are the limitations of using qualitative data in data analysis?

It cannot be easily quantified for statistical analysis
It may be influenced by researcher bias
It requires substantial resources and time for data collection
It's always better than quantitative data

Qualitative data has several limitations in data analysis. Firstly, it cannot be easily quantified for statistical analysis which limits its utility in certain research settings. Secondly, collecting and analyzing qualitative data often requires substantial resources and time, which can be a challenge for large-scale studies. Lastly, qualitative data may be influenced by researcher bias, particularly during data collection and interpretation.

What is the assumption of normality in residual analysis?

The coefficients of the regression line are normally distributed
The dependent variable is normally distributed
The independent variables are normally distributed
The residuals are normally distributed

The assumption of normality in residual analysis states that the residuals (the differences between the observed and predicted values) are normally distributed with a mean of zero. It does not require the dependent variable or the independent variables themselves to be normally distributed. This assumption is needed to make exact inferences about the regression coefficients, especially in small samples, and to calculate prediction intervals.
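One informal way to examine this assumption is to fit the regression and look at the shape of the residuals. The sketch below fits a simple least-squares line to synthetic data with normal noise and reports the skewness and excess kurtosis of the residuals, both of which should be near zero for normal residuals. The data, coefficients, and seed are assumptions for illustration; in practice a Q-Q plot or a formal test is often used as well.

```python
import random
from statistics import fmean

rng = random.Random(42)
# Synthetic data: y = 2 + 3x + normal noise (assumed for illustration).
x = [i / 10 for i in range(500)]
y = [2.0 + 3.0 * xi + rng.gauss(0, 1) for xi in x]

# Ordinary least squares for a single predictor.
mx, my = fmean(x), fmean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx

residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# With an intercept in the model the residual mean is zero up to rounding,
# so raw moments of the residuals serve as central moments here.
m2 = fmean([r ** 2 for r in residuals])
skew = fmean([r ** 3 for r in residuals]) / m2 ** 1.5
excess_kurtosis = fmean([r ** 4 for r in residuals]) / m2 ** 2 - 3

print(f"mean residual   = {fmean(residuals):.2e}")
print(f"skewness        = {skew:.3f}")
print(f"excess kurtosis = {excess_kurtosis:.3f}")
```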

How does the Wilcoxon Signed Rank Test deal with zeros in the difference of paired observations?

Zeros are averaged
Zeros are counted as half a sign
Zeros are discarded
Zeros are included

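In the Wilcoxon Signed Rank Test, zeros in the difference of paired observations are typically discarded. The sample size is reduced accordingly, and only the remaining non-zero absolute differences are ranked.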
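The discard step can be sketched as follows. The function `signed_rank` and the paired measurements are hypothetical, written only to show where zeros drop out; tied absolute differences receive the average of the ranks they span.

```python
def signed_rank(before, after):
    """Return (n_used, W_plus, W_minus) after discarding zero differences."""
    diffs = [a - b for a, b in zip(after, before)]
    nonzero = [d for d in diffs if d != 0]  # zero differences are discarded
    ordered = sorted(nonzero, key=abs)

    # Assign ranks to |d|, averaging the ranks within each group of ties.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return len(nonzero), w_plus, w_minus


# Hypothetical paired measurements; two pairs are unchanged.
before = [10, 12, 9, 15, 11, 14, 8, 13]
after = [12, 12, 8, 19, 11, 17, 13, 7]

n_used, w_plus, w_minus = signed_rank(before, after)
print(f"pairs used: {n_used} of {len(before)}, W+ = {w_plus}, W- = {w_minus}")
```

Of the eight pairs, two have a zero difference and are dropped, so the ranks run from 1 to 6 and the two rank sums add up to 6 * 7 / 2 = 21.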
