How is the strength of correlation between two variables determined?

  • By the correlation coefficient
  • By the number of data points
  • By the slope of the line of best fit
  • By the y-intercept of the line of best fit
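The strength of correlation between two variables is measured by the correlation coefficient. A value close to +1 or -1 indicates a strong linear relationship (positive or negative, respectively), while a value close to 0 indicates a weak or no linear relationship. The slope and y-intercept of the line of best fit depend on the units of the variables and do not measure strength.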
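As a rough illustration, the Pearson correlation coefficient can be computed directly from its definition. The sketch below assumes the Pearson coefficient is the one intended; the `hours` and `scores` values are made-up data, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: study hours and test scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)

# A perfectly decreasing relationship gives a coefficient of -1
r_negative = pearson_r([1, 2, 3], [3, 2, 1])
```

Here `r` comes out close to +1 because the scores rise almost linearly with the hours.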

What type of data represents characteristics or attributes?

  • Categorical data
  • Ordinal data
  • Qualitative data
  • Quantitative data
Qualitative data represents characteristics or attributes. It is typically non-numerical and describes qualities such as texture, color, smell, taste, or appearance. This data type is commonly used in fields such as sociology, marketing, and psychology.

What are the two subtypes of quantitative data?

  • Categorical and Ordinal
  • Discrete and Continuous
  • Interval and Ratio
  • Nominal and Categorical
Quantitative data can be classified into two subtypes: discrete and continuous. Discrete data can take only certain separate values (for example, the number of children in a family: 0, 1, 2, 3, and so on), whereas continuous data can take any value within a given range (for example, a person's height or weight).

In the context of probability distributions, what is a random variable?

  • A variable that always takes a constant value
  • A variable that does not have a specific value
  • A variable that is not influenced by other variables
  • A variable whose outcome is based on the result of a random event
A random variable is a variable whose value is determined by the outcome of a random event. It can be either discrete (taking specific, countable values) or continuous (taking any value within a certain range).

The __________ plot is used to check the linearity and equal variance assumptions of a multiple linear regression.

  • Cook's Distance
  • Leverage
  • Quantile-Quantile
  • Residuals vs fitted values
The residuals vs fitted values plot is commonly used in regression diagnostics to check the assumptions of linearity and equal variance (homoscedasticity). The residuals should be scattered randomly around zero, and the spread of the residuals should not change with the fitted values.
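A minimal sketch of how the quantities behind this plot might be computed, assuming NumPy is available. The data are simulated, the coefficients are arbitrary, and the comparison of residual spread between the lower and upper halves of the fitted values is only a crude numeric stand-in for inspecting the plot by eye.

```python
import numpy as np

# Simulated data with a linear mean and constant error variance (assumed setup)
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(0, 1, n)

# Ordinary least squares fit with an intercept column
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Crude equal-variance check: residual spread for low vs high fitted values
order = np.argsort(fitted)
low = residuals[order[: n // 2]]
high = residuals[order[n // 2:]]
spread_ratio = high.std() / low.std()

# To draw the plot itself, one could pass `fitted` and `residuals` to a
# scatter-plot routine (for example matplotlib's plt.scatter) and add a
# horizontal line at zero.
```

A `spread_ratio` near 1 is consistent with equal variance; a funnel shape in the plot would push it well away from 1.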

In the presence of ties or zeros in the differences, it's usually better to apply the _________ test.

  • Mann-Whitney U
  • Sign
  • Wilcoxon Signed Rank
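In the presence of ties or zeros in the differences, it's usually better to apply the Sign Test. The Sign Test uses only the direction of each nonzero difference, so ties among the magnitudes do not affect it. The Wilcoxon Signed Rank Test ranks the absolute differences and conventionally discards zeros, and its exact p-values assume there are no ties, so it is more affected by both.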
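A minimal sketch of a two-sided Sign Test on paired data, assuming the common convention that zero differences are dropped before the signs are counted. The `before` and `after` values are made up, and `sign_test` is a hypothetical helper name.

```python
from math import comb

def sign_test(before, after):
    """Two-sided exact sign test; returns (positives, nonzero pairs, p-value)."""
    diffs = [a - b for b, a in zip(before, after)]
    nonzero = [d for d in diffs if d != 0]   # zero differences carry no sign
    n = len(nonzero)
    n_pos = sum(1 for d in nonzero if d > 0)
    k = min(n_pos, n - n_pos)
    # Under the null hypothesis the number of positive signs is Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return n_pos, n, p_value

# Made-up paired measurements; note the tied magnitudes and one zero difference
before = [10, 12, 9, 11, 14, 8, 10, 13]
after = [12, 12, 11, 13, 15, 10, 9, 15]
n_pos, n, p_value = sign_test(before, after)
```

Only the direction of each difference enters the calculation, so the repeated difference of 2 in this data has no special effect on the result.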

How does the Mann-Whitney U test handle ties?

  • Ties are given the average rank
  • Ties are given the highest rank
  • Ties are given the lowest rank
  • Ties are removed from the data
In the Mann-Whitney U test, ties (equal values) are handled by giving them the average of the ranks they would have received if they were not tied.
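The average-rank rule can be sketched in a few lines. The helper names `average_ranks` and `mann_whitney_u` are hypothetical, the sample values are made up, and the U statistic is computed from the rank sum of the first sample.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2   # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """U statistics for samples a and b, using average ranks for ties."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

ranks = average_ranks([3, 5, 5, 8])      # the two 5s share ranks 2 and 3
u_a, u_b = mann_whitney_u([3, 5], [5, 8])
```

In this example the two tied values would have received ranks 2 and 3, so each is assigned 2.5.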

What type of data represents measurements or counts?

  • Categorical data
  • Nominal data
  • Qualitative data
  • Quantitative data
Quantitative data represents measurements or counts. It can be mathematically quantified and is usually collected in numerical form. For example, data such as age, weight, height, and number of items are all quantitative data.

How can you interpret interaction terms in a multiple linear regression model?

  • They represent the change in the slope of one variable for different values of another variable
  • They represent the combined effect of two variables on the response
  • They represent the effect of a variable at different levels of another variable
  • All of the above
All of the listed interpretations are correct. Interaction terms represent the combined effect of two predictors on the response variable. Equivalently, they can be interpreted as the effect of one predictor at different levels of another predictor, or as the change in the slope of one predictor for different values of another predictor.
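To make the "change in slope" reading concrete, consider a model of the form y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The sketch below uses arbitrary, made-up coefficient values; `predict` and `slope_of_x1` are hypothetical helper names.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Response for a model with an x1*x2 interaction term (made-up coefficients)."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_of_x1(x2, b1=2.0, b3=1.5):
    """Change in the response per unit increase in x1, at a given value of x2."""
    return b1 + b3 * x2

slope_at_0 = slope_of_x1(0)   # slope of x1 when x2 = 0
slope_at_2 = slope_of_x1(2)   # slope of x1 when x2 = 2

# The same slope recovered from predictions one unit of x1 apart at x2 = 2
difference = predict(1, 2) - predict(0, 2)
```

The slope of x1 is b1 only when x2 is zero; for every other value of x2 it shifts by b3 times x2, which is the sense in which the effect of one predictor depends on the level of the other.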

What is multicollinearity in the context of multiple regression?

  • It refers to the high correlation between at least two independent variables.
  • It refers to the linear relationship between each independent variable and the dependent variable.
  • It refers to the presence of a linear relationship between the dependent variables.
  • It refers to the relationship between the residuals of the regression model.
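Multicollinearity refers to the situation in which two or more predictor (independent) variables in a regression model are highly correlated with one another, which makes their individual coefficient estimates unstable and hard to interpret.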
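One simple way to look for this is to inspect the pairwise correlations among the predictors. The sketch below assumes NumPy is available and uses simulated predictors; the 0.9 cutoff is an arbitrary illustrative threshold, not a standard value.

```python
import numpy as np

# Simulated predictors: x2 is nearly a copy of x1, x3 is unrelated (assumed setup)
rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)
x3 = rng.normal(size=300)

names = ["x1", "x2", "x3"]
X = np.column_stack([x1, x2, x3])

# Correlation matrix of the predictors (columns are variables)
corr = np.corrcoef(X, rowvar=False)

# Flag predictor pairs whose absolute correlation exceeds the chosen cutoff
threshold = 0.9   # arbitrary illustrative cutoff
flagged = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
```

Pairwise correlations catch only two-variable relationships; a predictor can also be close to a linear combination of several others, which this check would miss.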