Non-parametric statistical methods do not require the data to follow a specific ________.
- distribution
- pattern
- sequence
- trend
Non-parametric statistical methods do not require the data to follow a specific distribution, which is why they are often used when the assumptions of parametric tests are violated.
What does the peak of a distribution represent?
- The mean of the data
- The median of the data
- The mode of the data
- The range of the data
The peak of a distribution represents the mode of the data, that is, the value(s) that appear most frequently in the data set. In a symmetric, unimodal distribution, the mode, median, and mean all coincide at the peak.
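As a small illustration of the peak being the mode, the sketch below uses Python's standard `statistics` module on a hypothetical symmetric, single-peaked sample. The data values are invented for this example.

```python
import statistics

# Hypothetical symmetric, single-peaked sample (values chosen for illustration).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mode = statistics.mode(data)      # most frequent value, i.e. the peak
median = statistics.median(data)  # middle value of the sorted data
mean = statistics.mean(data)      # arithmetic average

print(mode, median, mean)
```

For this particular sample all three measures land on the same value. With a skewed sample they would generally differ.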
What can we conclude if we fail to reject the null hypothesis?
- The null hypothesis is definitely true
- The sample size was too small
- The significance level was too high
- There is not enough evidence in the data to support the alternative hypothesis
Failing to reject the null hypothesis means that there is not enough evidence in the data to support the alternative hypothesis. It does not show that the null hypothesis is true, because a type II error (false negative) may have occurred.
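The simulation below is a rough sketch of how a type II error can arise. It assumes a two-sided z-test with known standard deviation 1, a null hypothesis of mean 0, and a hypothetical true mean of 0.2. The function name, sample sizes, and seed are illustrative choices, not part of the quiz.

```python
import math
import random


def fail_to_reject_rate(true_mean, n, trials=2000, z_crit=1.96, seed=42):
    """Share of simulated two-sided z-tests (sigma = 1) that fail to reject H0: mean = 0."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # z = (sample mean - 0) / (sigma / sqrt(n)) with sigma assumed to be 1
        z = (sum(sample) / n) * math.sqrt(n)
        if abs(z) < z_crit:
            failures += 1
    return failures / trials


# The null hypothesis is false in both cases (the true mean is 0.2, not 0).
small_n_rate = fail_to_reject_rate(true_mean=0.2, n=10)
large_n_rate = fail_to_reject_rate(true_mean=0.2, n=400)
print(small_n_rate, large_n_rate)
```

With the small sample, most simulated tests fail to reject even though the null hypothesis is false, which is why a non-significant result is not read as proof of the null.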
In _________ sampling, the population is divided into subgroups, and a simple random sample is drawn from each subgroup.
- Cluster
- Simple Random
- Stratified
- Systematic
In stratified sampling, the population is divided into non-overlapping groups, or strata, such as age groups, income levels, or gender. Then, a simple random sample is taken from each stratum. Stratified random sampling can provide more precise estimates if the strata are relevant to the characteristic of interest.
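A minimal sketch of the procedure, using only the standard library: group the records by stratum, then draw a simple random sample from each group. The function `stratified_sample`, the record layout, and the age-group labels are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of up to `per_stratum` records from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical population: (person_id, age_group) pairs.
population = [(i, "18-34") for i in range(50)]
population += [(i, "35-64") for i in range(50, 80)]
population += [(i, "65+") for i in range(80, 100)]

sample = stratified_sample(population, stratum_of=lambda r: r[1], per_stratum=5)
for group, members in sample.items():
    print(group, members)
```

This sketch takes the same number of records from each stratum. Sampling in proportion to stratum size is a common alternative.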
A low p-value (less than 0.05) in a t-test suggests that you can reject the _______ hypothesis.
- alternative
- both a and b
- null
A low p-value in a t-test suggests that you can reject the null hypothesis. The p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. A lower p-value therefore means the observed data would be less likely if the null hypothesis held.
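To make the meaning of a p-value concrete, the sketch below computes one by exact permutation rather than with a t-test. It enumerates every way of splitting the pooled data into two groups and reports the share of splits whose difference in means is at least as extreme as the observed one. The function name and the two small samples are hypothetical.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means, by enumerating all regroupings."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(sum_a)) >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical measurements for two groups.
treatment = [12, 14, 15, 16, 18]
control = [8, 9, 10, 11, 13]

p_value = exact_permutation_p_value(treatment, control)
print(p_value, p_value < 0.05)
```

A t-test reaches its p-value through the t distribution instead of enumeration, but the interpretation is the same: how often a result this extreme would appear if the null hypothesis were true.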
How is the concept of independence used in probability theory?
- To calculate the probability of an event without any prior information
- To describe events that always occur together
- To describe events that are mutually exclusive
- To describe events that have no influence on each other
Independence in probability theory refers to situations where the occurrence of one event does not change the probability of another. Events A and B are independent if knowing that A occurred does not affect the probability of B, or equivalently if P(A and B) = P(A) × P(B).
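The product rule for independent events can be checked by brute force on a small sample space. The sketch below enumerates the 36 outcomes of two fair dice and uses exact fractions. The events chosen are hypothetical examples.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)


def first_even(o):
    return o[0] % 2 == 0


def sum_is_7(o):
    return o[0] + o[1] == 7


def sum_at_least_10(o):
    return o[0] + o[1] >= 10


print(independent(first_even, sum_is_7))         # product rule holds
print(independent(first_even, sum_at_least_10))  # product rule fails
```

Knowing the first die is even says nothing about whether the sum is 7, but it does make a sum of 10 or more more likely, so the second pair of events is dependent.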
What does it mean when we say that a distribution is skewed?
- All data points are identical
- It has outliers
- It is not symmetric about its mean
- Its mean and median are not equal
When we say that a distribution is skewed, we mean that the distribution is not symmetric about its mean. In a skewed distribution, one tail is longer or heavier than the other, so the data are not spread evenly around the mean.
What does it mean if the p-value in a Chi-square test is smaller than the significance level?
- The null hypothesis is rejected in favor of the alternative hypothesis
- The null hypothesis is true
- The test result is insignificant
- There is not enough evidence to reject the null hypothesis
If the p-value in a Chi-square test is smaller than the significance level, we reject the null hypothesis in favor of the alternative hypothesis. In a test of independence, this suggests a statistically significant association between the variables.
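The decision rule can be sketched for a 2x2 contingency table using only the standard library. The sketch assumes a test of independence without continuity correction. With one degree of freedom, the p-value equals erfc(sqrt(statistic / 2)). The function name and the counts in the table are hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are two groups, columns are two outcomes.
observed_counts = [[30, 10], [15, 25]]
alpha = 0.05

statistic, p_value = chi_square_2x2(observed_counts)
reject_null = p_value < alpha
print(statistic, p_value, reject_null)
```

Larger tables need the general chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom, which is usually taken from a statistics library.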
How does multicollinearity affect the coefficients in multiple linear regression?
- It doesn't affect the coefficients
- It makes the coefficients less interpretable
- It makes the coefficients more precise
- It makes the coefficients negative
Multicollinearity refers to a situation where two or more predictor variables in a multiple regression model are highly correlated. This high correlation inflates the variance of the coefficient estimates, so they become unstable, less reliable, and harder to interpret.
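One common diagnostic for this is the variance inflation factor (VIF). The sketch below covers only the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them. The function names and data are hypothetical, and the function does not handle perfectly correlated predictors.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: x2 is almost a multiple of x1, x3 is nearly unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
x3 = [5, 1, 4, 2, 6, 3]

vif_high = vif_two_predictors(x1, x2)
vif_low = vif_two_predictors(x1, x3)
print(vif_high, vif_low)
```

A VIF near 1 indicates little inflation. A frequently quoted rule of thumb treats values above roughly 5 or 10 as a warning sign.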
When data points are concentrated on the left and the tail is on the right, the distribution is said to be _______.
- Negatively skewed
- Normal
- Positively skewed
- Uniform
When data points are concentrated on the left and the tail is on the right, the distribution is said to be positively skewed or right-skewed. This is because the tail of the distribution points towards the positive end of the axis.
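The direction of skew can be checked numerically. The sketch below computes the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second) for a hypothetical sample with a long right tail and for its mirror image. The function name and data are illustrative.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population (1/n) moments."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sample: most values on the left, one long tail to the right.
right_tailed = [1, 1, 2, 2, 2, 3, 3, 4, 10]
left_tailed = [-v for v in right_tailed]  # mirror image

skew_right = sample_skewness(right_tailed)
skew_left = sample_skewness(left_tailed)
print(skew_right, skew_left)
print(statistics.mean(right_tailed), statistics.median(right_tailed))
```

A positive coefficient corresponds to a right tail, and in this sample the mean is pulled above the median by the large value in the tail.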