What is the error term in a simple linear regression model?
- It is the dependent variable
- It is the difference between the observed and predicted values
- It is the independent variable
- It is the slope of the regression line
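Of the options listed, the error term in a simple linear regression model is best described as the difference between the observed and predicted values. Strictly speaking, the error term is the deviation of an observed value from the true (unknown) regression line, and the residual, the observed value minus the fitted value, is its sample estimate. Either way, it captures the variability in the dependent variable that is not explained by the independent variable in the model.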
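As a concrete illustration of "observed minus predicted", the following minimal sketch fits a least-squares line to a small made-up dataset and computes the residuals. The data values and variable names are assumptions chosen only for the example.

```python
# Hypothetical toy data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares estimates of the slope and intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual = observed value minus the value predicted by the fitted line.
predicted = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
print("residuals:", [round(r, 4) for r in residuals])
```

For a least-squares line with an intercept, the residuals sum to zero up to floating-point rounding, which makes a quick sanity check on the fit.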
What can be inferred if the residuals are not randomly distributed in the residual plot?
- The data has no outliers
- The data is perfectly linear
- The linear regression model is a perfect fit for the data
- The linear regression model is not a good fit for the data
If the residuals are not randomly distributed (e.g., if they form a pattern), it suggests that the linear regression model is not a good fit for the data. This could be because the relationship between the variables is not linear, or because the data exhibits heteroscedasticity (unequal variances of errors), among other reasons.
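To see what a patterned residual plot looks like numerically, the sketch below fits a straight line to data that actually follows a curve (y = x squared). The dataset is an assumption made up for the example; the point is only that the residuals come out in a systematic U shape rather than scattered randomly around zero.

```python
# Hypothetical data generated from a curved relationship, y = x**2.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Fit a straight line by ordinary least squares anyway.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residuals are positive at both ends and negative in the middle,
# a systematic pattern that signals the linear model is a poor fit.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print([round(r, 4) for r in residuals])
```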
What type of data is used in the Chi-square test for goodness of fit?
- Categorical data
- Continuous data
- Interval data
- Ordinal data
The Chi-square test for goodness of fit is used with categorical data. It compares the observed frequencies in each category with the frequencies we would expect to see if the data followed the theoretical distribution.
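The comparison of observed and expected frequencies can be sketched as follows. The counts are hypothetical results of 120 rolls of a die, and the theoretical distribution is assumed to be a fair die, so each face is expected 20 times.

```python
# Hypothetical observed counts for faces 1..6 over 120 rolls.
observed = [18, 22, 20, 25, 15, 20]

# Under the assumed theoretical distribution (a fair die),
# every category has the same expected frequency.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for goodness of fit: number of categories minus 1.
degrees_of_freedom = len(observed) - 1

print("chi-square:", round(chi_square, 4), "df:", degrees_of_freedom)
```

The statistic is then compared with a chi-square distribution with the stated degrees of freedom. With 5 degrees of freedom the 5% critical value is about 11.07, so a statistic this small would not lead to rejecting the fair-die hypothesis.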
Which measure of central tendency is most affected by extreme values or outliers?
- All are equally affected
- Mean
- Median
- Mode
The mean is most affected by extreme values or outliers. The mean is the sum of all data points divided by their count, so every value contributes to it directly. When an extreme value (either very high or very low compared to the rest) is introduced, it can therefore shift the mean substantially. The median depends only on the middle of the ordered data and the mode only on the most frequent value, so both are much less affected by outliers.
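A small sketch using Python's standard `statistics` module shows the effect. The dataset and the outlier value of 100 are assumptions invented for the example.

```python
import statistics

# Hypothetical dataset, then the same data with one extreme value added.
data = [10, 12, 12, 13, 14]
data_with_outlier = data + [100]

mean_before = statistics.mean(data)
mean_after = statistics.mean(data_with_outlier)

median_before = statistics.median(data)
median_after = statistics.median(data_with_outlier)

mode_before = statistics.mode(data)
mode_after = statistics.mode(data_with_outlier)

print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
print("mode:  ", mode_before, "->", mode_after)
```

In this example the mean more than doubles, the median moves by half a unit, and the mode does not change at all.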
How does the Sign Test handle ties between paired samples?
- Ties are averaged
- Ties are counted as half a sign
- Ties are discarded
- Ties are included in the test
In the Sign Test, ties between paired samples (pairs whose difference is zero) are generally discarded, and the sample size used in the calculation is reduced accordingly.
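The handling of ties can be sketched as below. The paired measurements are hypothetical, and the two-sided p-value is computed from the exact binomial distribution with success probability 0.5, which is one common way to carry out the test rather than the only one.

```python
from math import comb

# Hypothetical paired measurements (e.g. before and after a treatment).
before = [72, 65, 80, 70, 68, 75, 77, 66, 71, 69]
after = [75, 65, 84, 74, 67, 79, 80, 70, 71, 73]

diffs = [a - b for a, b in zip(after, before)]

# Ties (zero differences) are discarded before counting signs.
nonzero = [d for d in diffs if d != 0]
n_ties = len(diffs) - len(nonzero)
n_used = len(nonzero)

n_pos = sum(1 for d in nonzero if d > 0)
n_neg = n_used - n_pos

# Under the null hypothesis each sign is equally likely, so the number of
# positive signs follows a Binomial(n_used, 0.5) distribution.
k = min(n_pos, n_neg)
tail = sum(comb(n_used, i) for i in range(k + 1)) / 2 ** n_used
p_value = min(1.0, 2 * tail)

print("ties dropped:", n_ties, "pairs used:", n_used)
print("positive:", n_pos, "negative:", n_neg, "p-value:", p_value)
```

Here two of the ten pairs are ties, so the test is based on the remaining eight pairs.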
Which type of test is the Sign Test: parametric or non-parametric?
- Both
- Non-parametric
- Parametric
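The Sign Test is a non-parametric test: it makes no assumption about the shape of the underlying distribution. It is used to test a hypothesized median of a single distribution or, with paired samples, to test whether the median of the paired differences is zero.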
What does the term 'population' mean in the context of statistics?
- All animals in a zoo
- All people in a country
- The entire group that you are interested in studying
- The group of people who respond to a survey
In the context of statistics, a 'population' refers to the entire set of individuals or items that we are interested in studying. This could range from all the people living in a country, to all the particles in a chemical solution, to all the sales transactions in a store, and so forth. When we collect data from every individual or item in the population, it's called a census. Because a census covers everyone or everything we want to study, its results are free of sampling error, although they can still be affected by measurement or recording errors.
What is an event in the context of probability?
- A hypothesis
- A physical happening
- A possible outcome or combination of outcomes
- An experiment
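An event in the context of probability refers to a possible outcome or combination of outcomes from a random experiment. For instance, if we toss a coin, the possible outcomes are 'Heads' and 'Tails'. Each of these outcomes is an event. An event can also combine several outcomes: when rolling a die, 'rolling an even number' is the event made up of the outcomes 2, 4 and 6.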
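One way to make this concrete is to treat an event as a subset of the sample space. The sketch below assumes an experiment of tossing a fair coin twice, so all four outcomes are taken to be equally likely; the experiment and names are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses: every ordered pair of H and T.
sample_space = set(product("HT", repeat=2))

# An event is a set of outcomes, here "at least one head".
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}

# With equally likely outcomes, P(event) = favourable / total.
p_event = Fraction(len(at_least_one_head), len(sample_space))

print(sorted(at_least_one_head), p_event)
```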
What's the probability of the union of two mutually exclusive events A and B?
- P(A) * P(B)
- P(A) + P(B)
- P(A) - P(B)
- P(A) / P(B)
The probability of the union of two mutually exclusive events A and B is given by P(A ∪ B) = P(A) + P(B). This is because mutually exclusive events cannot occur at the same time, so the probability of either event A occurring or event B occurring is the sum of their individual probabilities.
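The addition rule can be checked on a small example. The sketch assumes a single roll of a fair die; the events A, B and C and the helper `prob` are hypothetical names introduced for the illustration. It also shows that the simple sum overcounts when two events overlap.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (assumed equally likely outcomes).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a set of outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}  # roll is 1 or 2
B = {5, 6}  # roll is 5 or 6; shares no outcome with A
C = {2, 3}  # overlaps with A on the outcome 2

# Mutually exclusive events: P(A or B) equals P(A) + P(B).
p_union_ab = prob(A | B)
p_sum_ab = prob(A) + prob(B)

# Overlapping events: the plain sum counts the shared outcome twice.
p_union_ac = prob(A | C)
p_sum_ac = prob(A) + prob(C)

print(p_union_ab, p_sum_ab)
print(p_union_ac, p_sum_ac)
```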
The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that _______ observations are influenced by some non-random cause.
- Population
- Random
- Sample
- Systematic
The missing word is 'Sample'. The alternative hypothesis proposes that there is a real difference or effect in the population, so that the sample observations are influenced by some non-random cause rather than by chance alone.