Why might you use a non-parametric test over a parametric one?

The data does not meet the assumptions for a parametric test
The data follows a normal distribution
The data has no outliers
The data set is very large

A non-parametric test is the better choice when the data does not meet the assumptions of a parametric test: for example, when the data does not follow a normal distribution, when the variances are not equal across groups, or when the data is ordinal or nominal rather than interval or ratio.
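
As a rough illustration of why rank-based tests cope with data that breaks parametric assumptions, the sketch below computes the Mann-Whitney U statistic by hand. The function name and the sample values are made up for this example; a real analysis would normally use a statistics library and also report a p-value.

```python
def mann_whitney_u(xs, ys):
    """Count the (x, y) pairs where x beats y; a tie counts as half a pair."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical measurements for two groups.
group_a = [12, 15, 14, 19, 22]
group_b = [10, 11, 13, 9, 16]
u_original = mann_whitney_u(group_a, group_b)

# Replace the largest value in group_a with an extreme outlier.
group_a_outlier = [12, 15, 14, 19, 2200]
u_outlier = mann_whitney_u(group_a_outlier, group_b)

# U depends only on the ordering of the values, so both results are 21.0.
print(u_original, u_outlier)
```

Because the outlier does not change the ordering of the values, the statistic is unchanged, whereas a mean-based statistic such as the t statistic would shift noticeably.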

Bayes' theorem is a fundamental principle underlying ________ learning.

active
machine
passive
rote

Bayesian methods, which are grounded in Bayes' theorem, play an integral role in many areas of machine learning. They allow the model to update its predictions as it receives more data, making them particularly useful for tasks involving prediction and recommendation.

What is the purpose of an F-test in multiple linear regression?

To check for multicollinearity
To check the linearity of the model
To check the normality of residuals
To check the overall significance of the model

The F-test in multiple linear regression is used to test the overall significance of the model, essentially testing whether at least one of the predictors' coefficients is non-zero and hence contributes to explaining the variability in the response variable.
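
For a model fitted with an intercept, the overall F statistic can be written in terms of R-squared as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), where k is the number of predictors and n the number of observations. The sketch below is a minimal helper under that assumption; the function name and the example numbers are illustrative only.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic for a linear model with an intercept (assumed)."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    df_model = n_predictors              # numerator degrees of freedom
    df_resid = n_obs - n_predictors - 1  # denominator degrees of freedom
    if df_model <= 0 or df_resid <= 0:
        raise ValueError("need at least one predictor and n_obs > n_predictors + 1")
    return (r_squared / df_model) / ((1 - r_squared) / df_resid)


# Hypothetical fit: R^2 = 0.5 with 2 predictors and 23 observations.
f_value = f_statistic(0.5, 23, 2)  # approximately 10
print(f_value)
```

A large F value relative to the F distribution with (k, n - k - 1) degrees of freedom is evidence against the null hypothesis that all slope coefficients are zero.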

In the context of multiple linear regression, __________ refers to the phenomenon where the coefficient estimates become highly sensitive to small changes in the model.

Autocorrelation
Heteroscedasticity
Multicollinearity
Overfitting

Multicollinearity refers to the situation in multiple linear regression where the predictor variables are highly correlated with one another. This can lead to unstable coefficient estimates, which can change erratically in response to small changes in the model or the data.
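
One common way to quantify this is the variance inflation factor (VIF). With exactly two predictors it reduces to 1 / (1 - r^2), where r is their Pearson correlation. The sketch below assumes that two-predictor case; the function names and data are invented for illustration, and the often-quoted cutoffs of 5 or 10 are rules of thumb rather than strict limits.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r ** 2)


x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]  # nearly 2 * x1, so highly collinear
x3 = [1, -2, 2, -2, 1]           # uncorrelated with x1

vif_collinear = vif_two_predictors(x1, x2)  # very large
vif_unrelated = vif_two_predictors(x1, x3)  # exactly 1.0
print(vif_collinear, vif_unrelated)
```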

How can multicollinearity be addressed in multiple regression analysis?

By adding more variables to the model.
By increasing the sample size.
By removing one or more of the correlated variables.
Multicollinearity cannot be addressed.

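Multicollinearity can be addressed by removing one or more of the highly correlated independent variables. Other common remedies include combining the correlated variables into a single predictor or using a regularized method such as ridge regression.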
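
A simple version of the removal approach can be sketched as a greedy filter that keeps a predictor only if it is not too strongly correlated with any predictor already kept. The helper names, the 0.9 threshold, and the data are assumptions for this example; in practice, which variable to drop is a modeling decision.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(predictors, threshold=0.9):
    """Keep each predictor whose |r| with every kept predictor is <= threshold.

    predictors maps a name to a list of values; earlier entries win ties.
    """
    kept = {}
    for name, values in predictors.items():
        if all(abs(pearson_r(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept


predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1],  # nearly 2 * x1
    "x3": [1, -2, 2, -2, 1],           # uncorrelated with x1
}
kept = drop_correlated(predictors)
print(list(kept))  # x2 is dropped because it nearly duplicates x1
```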

What is the purpose of an interaction term in a regression model?

To increase the complexity of the model
To minimize the error of the model
To represent the combined effect of two variables
To represent the effect of one variable based on the level of another

An interaction term in a regression model, typically the product of two independent variables, represents their combined effect on the dependent variable. It captures situations where the effect of one variable on the dependent variable differs across levels of the other variable.
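
To make this concrete, consider a model of the form y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The sketch below uses made-up coefficients to show that the effect of a one-unit increase in x1 equals b1 + b3*x2, so it depends on the level of x2.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Prediction from a model with an interaction term b3 * x1 * x2.

    The coefficient values are arbitrary and chosen only for illustration.
    """
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2, b1=2.0, b3=1.5):
    """Change in the prediction per unit increase in x1, holding x2 fixed."""
    return b1 + b3 * x2


# Effect of raising x1 from 0 to 1 at two different levels of x2.
effect_at_x2_0 = predict(1, 0) - predict(0, 0)  # 2.0
effect_at_x2_2 = predict(1, 2) - predict(0, 2)  # 5.0
print(effect_at_x2_0, effect_at_x2_2)
```

Without the interaction term (b3 = 0), both differences would equal b1.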

In what type of problem scenarios is Bayes' Theorem most commonly used?

When new evidence is used to update the probability of an event
When the data is categorical
When the events are mutually exclusive
When the population is normally distributed

Bayes' Theorem is most commonly used when new evidence is used to update the probability of an event. It provides a way to revise an existing belief (the prior probability) in light of new data, weighted by how likely that data is under each hypothesis (the likelihood), to obtain an updated belief (the posterior probability).
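
The update can be sketched for a single yes/no hypothesis H as P(H | E) = P(E | H) P(H) / P(E). The function name and the numbers below (a 1% prior, a 95% true positive rate, and a 5% false positive rate) are hypothetical values chosen only to illustrate the calculation.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H)."""
    # Total probability of observing the evidence.
    evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / evidence


# One positive test result moves the belief from 1% to roughly 16%.
after_one_test = posterior(0.01, 0.95, 0.05)

# A second independent positive result uses the first posterior as its prior.
after_two_tests = posterior(after_one_test, 0.95, 0.05)
print(after_one_test, after_two_tests)
```

Feeding each posterior back in as the next prior is what "updating with new evidence" means in practice.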

Which type of data can be categorized into groups: qualitative or quantitative?

Both
None
Qualitative
Quantitative

Qualitative data represents characteristics or attributes rather than measured quantities, so it is naturally sorted into categories. For example, hair color (blonde, brunette, etc.) and marital status (single, married, etc.) are qualitative data.

The ________ is the middle value in a data set when the data is arranged in ascending or descending order.

Mean
Median
Mode
nan

The median is the value separating the higher half from the lower half of a data sample. If the data set has an odd number of observations, the number in the middle is the median. If there is an even number of observations, the median is defined as the arithmetic mean of the two middle values.
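
The odd and even cases described above can be sketched as a short function. This is a minimal illustration; Python's standard library already offers `statistics.median` for real use.

```python
def median(values):
    """Return the middle value of the data, averaging the two middle values
    when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two


print(median([7, 1, 3]))     # 3
print(median([4, 1, 3, 2]))  # 2.5
```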

The probability of the intersection of Events A and B is represented by _______.

P(A + B)
P(A - B)
P(A ∩ B)
P(A ∪ B)

The probability of the intersection of Events A and B is represented by P(A ∩ B), which means the probability that both events A and B occur.
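
For equally likely outcomes, the intersection can be computed directly with sets. The sketch below assumes one roll of a fair six-sided die; the two events are chosen only for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of one fair die roll
event_a = {2, 4, 6}              # A: the roll is even
event_b = {4, 5, 6}              # B: the roll is greater than 3


def probability(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


p_a_and_b = probability(event_a & event_b)  # outcomes {4, 6}: 1/3
p_a_or_b = probability(event_a | event_b)   # outcomes {2, 4, 5, 6}: 2/3
print(p_a_and_b, p_a_or_b)
```

The two results are linked by P(A ∪ B) = P(A) + P(B) - P(A ∩ B).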
