If the null hypothesis is true, but we reject it, what type of error have we made?
- Both Type I and Type II error
- Neither Type I nor Type II error
- Type I error
- Type II error
If the null hypothesis is true, but we reject it, we have made a Type I error. This is also known as a "false positive" result.
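A small simulation can make the "false positive" idea concrete. The sketch below assumes a two-sided z-test with a known standard deviation of 1 and a significance level of 0.05. The function name `type_i_error_rate` and all of its parameters are illustrative and not taken from the question. Because every sample is drawn with the null hypothesis true, each rejection is a Type I error, and the rejection rate should land near the chosen significance level.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Estimate how often a two-sided z-test rejects a true null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (mean 0, sd 1) is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)  # assumes the sd is known to be 1
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null: a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```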
What happens when you reject the null hypothesis in a Mann-Whitney U test?
- You conclude that the distributions are normal
- You conclude that the groups have equal variances
- You conclude that there is a difference between the groups
- You conclude that there is no difference between the groups
When you reject the null hypothesis in a Mann-Whitney U test, you conclude that there is a difference between the groups. More specifically, the null hypothesis states that a randomly selected value from the first group is greater than a randomly selected value from the second group with a probability of 0.5. Rejecting it means the data are unlikely under that assumption, so values in one group tend to be larger than values in the other.
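The probability mentioned above can be estimated directly from the U statistic. The following is a minimal sketch that counts, over all pairs, how often a value from the first group exceeds a value from the second, with ties counted as half. The function name `mann_whitney_u` and the two small groups are made up for illustration. The sketch computes only the statistic and does not produce a p-value.

```python
def mann_whitney_u(x, y):
    """Return U for x and the estimated P(X > Y), counting ties as half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u, u / (len(x) * len(y))

# Hypothetical data, chosen only to illustrate the calculation.
group_a = [12, 15, 14, 18, 20]
group_b = [8, 9, 11, 13, 10]

u, p_superiority = mann_whitney_u(group_a, group_b)
print(u, p_superiority)
```

An estimate far from 0.5 suggests that one group tends to produce larger values. Whether that departure is statistically significant still has to be judged against the null distribution of U.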
What steps would you take to validate the assumptions of a multiple linear regression model?
- Check the R-squared value and the F-statistic
- Check the correlation between the dependent and independent variables
- Check the residuals plot, conduct a normality test on the residuals, and check for homoscedasticity
- Increase the sample size
The assumptions of a multiple linear regression model can be validated by checking the residuals plot for randomness (i.e., no systematic patterns), conducting a normality test on the residuals to check whether they are normally distributed, and checking for homoscedasticity (i.e., constant variance of the residuals across the range of fitted values).
If all the values in a dataset are identical, what would be the variance and standard deviation?
- The variance and standard deviation would be 0
- The variance and standard deviation would be 1
- The variance would be 0 but the standard deviation would be 1
- The variance would be 1 but the standard deviation would be 0
If all the values in a dataset are identical, there is no variation or dispersion in the data. Hence, both the variance and the standard deviation would be zero.
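This can be checked with Python's standard `statistics` module. The dataset below is an arbitrary example. Both the population and the sample versions of the two measures are shown, since each should come out as zero when every value is the same.

```python
from statistics import pstdev, pvariance, stdev, variance

values = [7, 7, 7, 7, 7]  # every observation is identical

pop_var = pvariance(values)    # population variance (divides by n)
pop_sd = pstdev(values)        # population standard deviation
sample_var = variance(values)  # sample variance (divides by n - 1)
sample_sd = stdev(values)      # sample standard deviation

print(pop_var, pop_sd, sample_var, sample_sd)
```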
The ________ mean is a type of average, which is calculated by taking the reciprocal of the arithmetic mean of the reciprocals.
- Arithmetic
- Geometric
- Harmonic
The harmonic mean is a measure of central tendency that is much less well known than, for example, the arithmetic mean or the median. It is appropriate for situations when the average of rates is desired. The harmonic mean is calculated by taking the reciprocal of the arithmetic mean of the reciprocals.
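As a short illustration of the definition, the sketch below computes the harmonic mean step by step and compares it with `statistics.harmonic_mean` from the standard library. The helper name `harmonic_by_definition` and the two speeds are illustrative assumptions. The scenario is a trip covering two equal distances at 40 and 60 km/h, where the harmonic mean, not the arithmetic mean, gives the average speed.

```python
from statistics import harmonic_mean, mean

def harmonic_by_definition(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    reciprocals = [1.0 / v for v in values]
    return 1.0 / mean(reciprocals)

speeds = [40, 60]  # km/h over two equal distances (hypothetical trip)

by_hand = harmonic_by_definition(speeds)
from_library = harmonic_mean(speeds)
arithmetic = mean(speeds)

print(by_hand, from_library, arithmetic)
```

The harmonic mean here works out to 48 km/h (up to floating-point rounding), which is lower than the arithmetic mean of 50 km/h because more time is spent travelling at the slower speed.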
What are the assumptions made while applying ANOVA?
- Independence, Homogeneity of variance, Non-linearity
- Linearity, Independence, Equal Variance
- Normality, Homogeneity of variance, Independence
- Normality, Linearity, Independence
While applying ANOVA, the following assumptions are made: Normality (the values within each group, or equivalently the residuals, are approximately normally distributed), Homogeneity of variance (the variance within each group is approximately equal), and Independence (the observations are independent of each other).
What is interval estimation in inferential statistics?
- The process of calculating the standard deviation of a population
- The process of determining the mode of a population
- The process of estimating the mean of a population
- The process of providing a range of values for an unknown population parameter
Interval estimation in inferential statistics is a method that provides a range of values likely to contain the unknown population parameter, usually at a stated confidence level such as 95%. Instead of a single value, it gives an interval of plausible estimates, so it conveys the uncertainty of the estimate in a way that point estimation does not.
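One common form of interval estimate is a confidence interval for a population mean. The sketch below is a simplified version that assumes a normal approximation (a z critical value) instead of the t-distribution, which would usually be preferred for a sample this small. The function name `mean_confidence_interval` and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)                      # point estimate
    standard_error = stdev(sample) / n ** 0.5  # estimated sd of the sample mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin

# Hypothetical measurements, used only to show the calculation.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

low, high = mean_confidence_interval(sample)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

Raising the confidence level widens the interval: more confidence that the range contains the parameter comes at the cost of a less precise range.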
How do non-parametric tests treat data points?
- They analyze only the maximum and minimum data values
- They analyze ranks rather than actual data values
- They analyze the median of the data set only
- They ignore outliers in the data set
Non-parametric tests treat data points by analyzing their ranks rather than their actual values. This makes non-parametric tests less sensitive to extreme values and makes them a good choice when dealing with skewed data or data with many outliers.
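The effect of working with ranks can be seen in a few lines. The sketch below assigns average ranks, with tied values sharing the mean of their positions, and compares a small dataset with a copy in which the largest value is replaced by an extreme outlier. The function name `average_ranks` and the data are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

data = [3, 5, 4, 9]
with_outlier = [3, 5, 4, 9000]  # same ordering, one extreme value

print(average_ranks(data))
print(average_ranks(with_outlier))
```

Both lists receive identical ranks, so any statistic built on the ranks is unaffected by how extreme the largest value is.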
The Central Limit Theorem states that the sampling distribution of the sample means approaches a ________ distribution as the sample size gets larger, regardless of the shape of the population distribution.
- Poisson
- binomial
- normal
- uniform
The Central Limit Theorem is a fundamental theorem in statistics that states that the sampling distribution of the sample means approaches a normal distribution as the sample size gets larger, no matter what the shape of the population distribution, provided the observations are independent and the population has a finite variance. This result is significant because it enables us to make statistical inferences about the population mean based on the distribution of sample means.
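A quick simulation can illustrate the theorem. The sketch below draws repeated samples from an exponential distribution with mean 1, which is strongly right-skewed, and looks at the distribution of the sample means. The function name `sample_means`, the sample size of 50, the number of replicates, and the seed are all arbitrary choices for illustration. Theory predicts that the sample means centre on 1 with a standard deviation near 1 / sqrt(50), and that roughly 68% of them fall within one standard deviation of the centre, as they would for a normal distribution.

```python
import random
from statistics import mean, stdev

def sample_means(n, replicates=2000, seed=1):
    """Means of repeated size-n samples from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]

means = sample_means(n=50)
center = mean(means)
spread = stdev(means)

# Share of sample means within one standard deviation of the centre.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"centre={center:.3f} spread={spread:.3f} within 1 sd={within_one_sd:.3f}")
```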
Why is it important to check the assumptions of a multiple linear regression model?
- To ensure the validity of the model
- To increase the complexity of the model
- To increase the number of observations
- To reduce the R-squared value
Checking the assumptions of a multiple linear regression model (like linearity, independence, normality, and homoscedasticity) is crucial to ensure the validity of the model and its estimates. Violations of these assumptions can lead to biased or inefficient estimates, and inferences made from such models could be misleading.