How is the strength of correlation between two variables determined?

By the correlation coefficient
By the number of data points
By the slope of the line of best fit
By the y-intercept of the line of best fit

The strength of the correlation between two variables is measured by the correlation coefficient, which ranges from -1 to +1. A value close to +1 or -1 indicates a strong linear relationship (positive or negative, respectively), while a value close to 0 indicates a weak correlation or none at all.
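
As a rough illustration, the coefficient can be computed by hand. The sketch below assumes the Pearson coefficient is meant; the function name `pearson_r` and both data sets are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]
strong = [2.1, 3.9, 6.2, 8.0, 9.9]  # hypothetical data, nearly a straight line
weak = [5.0, 1.0, 4.0, 2.0, 5.5]    # hypothetical data, no clear trend

r_strong = pearson_r(hours, strong)  # close to +1
r_weak = pearson_r(hours, weak)      # close to 0
```

The first series rises almost perfectly with `hours`, so its coefficient lands near +1; the second shows no consistent trend, so its coefficient stays near 0.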

What type of data represents characteristics or attributes?

Categorical data
Ordinal data
Qualitative data
Quantitative data

Qualitative data represents characteristics or attributes. It is often non-numerical and may include qualities such as textures, colors, smells, tastes, appearance, beauty, etc. This data type is commonly used in fields such as sociology, marketing, and psychology.

How does a probability mass function differ from a probability density function?

A probability mass function is used for continuous random variables, while a probability density function is used for discrete random variables
A probability mass function is used for discrete random variables, while a probability density function is used for continuous random variables
The two terms are interchangeable
There is no difference between a probability mass function and a probability density function

A probability mass function is used for discrete random variables and gives the probability that the variable is exactly equal to some value. A probability density function, on the other hand, is used for continuous random variables and gives the density of the variable at a particular value. A density is not itself a probability: probabilities for a continuous variable are obtained by integrating the density over an interval.
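
The difference can be sketched with one distribution of each kind. The choice of a binomial and a normal distribution, and all function names and parameter values below, are assumptions made for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) variable: an actual probability."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Normal(mu, sigma) variable at x: not a probability."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a Normal(mu, sigma) variable, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# PMF: the probabilities of all possible values sum to 1.
pmf_total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
p_exactly_5 = binomial_pmf(5, 10, 0.5)  # 252/1024

# PDF: a density can exceed 1 when the distribution is narrow...
density_at_mean = normal_pdf(0.0, sigma=0.1)
# ...but the probability of an interval (the area under the density) cannot.
p_interval = normal_cdf(0.1, sigma=0.1) - normal_cdf(-0.1, sigma=0.1)
```

Here `density_at_mean` is greater than 1, which would be impossible for a probability, while `p_interval` (one standard deviation either side of the mean) is about 0.68.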

What are the two subtypes of quantitative data?

Categorical and Ordinal
Discrete and Continuous
Interval and Ratio
Nominal and Categorical

Quantitative data can be classified into two subtypes: discrete and continuous. Discrete data can take only certain values (such as the number of children in a family: 1, 2, 3, and so on), while continuous data can take any value within a given range (such as a person's height or weight).

In the context of probability distributions, what is a random variable?

A variable that always takes a constant value
A variable that does not have a specific value
A variable that is not influenced by other variables
A variable whose outcome is based on the result of a random event

A random variable is a variable whose value is determined by the outcome of a random event. It can be either discrete (taking values from a countable set) or continuous (taking any value within a certain range).
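
A small simulation can make the two kinds concrete. The die and the waiting time below are hypothetical examples, and the seed is fixed only so the run is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Discrete random variable: the face shown by a fair six-sided die.
die_rolls = [rng.randint(1, 6) for _ in range(1000)]

# Continuous random variable: a waiting time between 0 and 10 minutes.
wait_times = [rng.uniform(0, 10) for _ in range(1000)]

# The die only ever lands on a handful of values...
distinct_die_values = sorted(set(die_rolls))
# ...while the waiting times almost never repeat.
distinct_wait_values = len(set(wait_times))
```

Each roll and each waiting time is the value the random variable takes for one run of the underlying random event.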

The __________ plot is used to check the linearity and equal variance assumptions of a multiple linear regression.

Cook's Distance
Leverage
Quantile-Quantile
Residuals vs fitted values

The residuals vs fitted values plot is commonly used in regression diagnostics to check the assumptions of linearity and equal variance (homoscedasticity). The residuals should be scattered randomly around zero, and the spread of the residuals should not change with the fitted values.
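
The quantities behind the plot can be computed without any plotting library. This is a minimal sketch using a single predictor for brevity (the same idea carries over to multiple regression); the data are hypothetical and deliberately built so the noise grows with x, which violates the equal variance assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = list(range(1, 21))
# Hypothetical data: a linear trend plus alternating noise that grows with x
noise = [(-1) ** i * 0.2 * x for i, x in enumerate(xs)]
ys = [3 + 2 * x + e for x, e in zip(xs, noise)]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Compare the residual spread for small and large fitted values.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
spread_low = max(abs(r) for _, r in pairs[:half])
spread_high = max(abs(r) for _, r in pairs[half:])
```

Plotting `residuals` against `fitted` for this data would show a funnel shape: `spread_high` is clearly larger than `spread_low`, which is the pattern that signals unequal variance.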

In the presence of ties or zeros in the differences, it's usually better to apply the _________ test.

Mann-Whitney U
Sign
Wilcoxon Signed Rank

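In the presence of ties or zeros in the differences, it's usually better to apply the Sign test. The Wilcoxon Signed Rank test ranks the absolute differences, so tied magnitudes have to share ranks and zero differences are discarded, both of which complicate its null distribution. The Sign test uses only the direction of each difference, so ties among the magnitudes do not affect it.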
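
A minimal sketch of a two-sided Sign test follows. It assumes the common convention of dropping zero differences before counting signs; the function name `sign_test` and the paired scores are hypothetical.

```python
import math

def sign_test(before, after):
    """Two-sided sign test on paired samples.

    Returns (number of positive differences, number of nonzero
    differences, exact two-sided p-value under Binomial(n, 0.5)).
    """
    diffs = [a - b for b, a in zip(before, after)]
    # Assumption: zero differences carry no direction and are dropped.
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    n_pos = sum(1 for d in nonzero if d > 0)
    k = min(n_pos, n - n_pos)
    # Probability of a split at least this lopsided in one direction
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    p_value = min(1.0, 2 * tail)
    return n_pos, n, p_value

# Hypothetical paired scores: every nonzero difference has magnitude 1,
# so a rank-based test would see nothing but ties.
before = [5, 6, 7, 5, 6, 7, 5, 6, 7, 6]
after = [6, 7, 8, 6, 7, 8, 5, 6, 8, 5]

n_pos, n, p_value = sign_test(before, after)
```

Only the signs matter here: 7 of the 8 nonzero differences are positive, and the fact that all magnitudes are tied plays no role in the p-value.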

How does the Mann-Whitney U test handle ties?

Ties are given the average rank
Ties are given the highest rank
Ties are given the lowest rank
Ties are removed from the data

In the Mann-Whitney U test, ties (equal values) are handled by giving them the average of the ranks they would have received if they were not tied.
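
The average-rank rule can be sketched in a few lines. The helper `average_ranks` and the two small groups are made up for the example; the U statistic is computed from the rank sum of the first group in the usual way.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1
        shared = (start + 1 + end + 1) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

group_a = [3, 5, 5, 8]  # hypothetical sample A
group_b = [5, 6, 9]     # hypothetical sample B

ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[: len(group_a)])
# U for group A: rank sum minus the smallest possible rank sum
u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
```

The three 5s would have occupied ranks 2, 3, and 4, so each receives rank 3. The resulting ranks are `[1, 3, 3, 6, 3, 5, 7]`, giving a rank sum of 13 for group A and `u_a` equal to 3.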

What type of data represents measurements or counts?

Categorical data
Nominal data
Qualitative data
Quantitative data

Quantitative data represents measurements or counts. It can be mathematically quantified and is usually collected in numerical form. For example, data such as age, weight, height, and number of items are all quantitative data.

How can you interpret interaction terms in a multiple linear regression model?

They represent the change in the slope of one variable for different values of another variable
They represent the combined effect of two variables on the response
They represent the effect of a variable at different levels of another variable
All of the above

Interaction terms represent the combined effect of two predictors on the response variable. They can also be interpreted as the effect of a predictor at different levels of another predictor or the change in the slope of one predictor for different values of another predictor.
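
The "change in slope" reading can be checked numerically. The sketch below assumes a model of the form y = b0 + b1*x1 + b2*x2 + b3*x1*x2; the coefficient values are hypothetical, not estimates from any data.

```python
# Hypothetical coefficients for y = b0 + b1*x1 + b2*x2 + b3*x1*x2
b0, b1, b2, b3 = 1.0, 2.0, 0.5, 1.5

def predict(x1, x2):
    """Predicted response, including the x1*x2 interaction term."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_of_x1(x2):
    """Change in the prediction per unit increase in x1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)

slope_when_x2_is_0 = slope_of_x1(0.0)  # equals b1
slope_when_x2_is_2 = slope_of_x1(2.0)  # equals b1 + 2 * b3
```

With these values the slope of x1 is 2.0 when x2 is 0 and 5.0 when x2 is 2: the interaction coefficient b3 is how much the slope of x1 changes for each unit increase in x2.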
