The normal distribution is also known as the ________ distribution.

Exponential
Gaussian
Poisson
Uniform

The normal distribution is also known as the Gaussian distribution. It is a continuous probability distribution for a real-valued random variable, and its probability density function is a symmetric, bell-shaped curve centered on the mean.
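
As a small illustration of the bell shape, the density can be evaluated directly from the standard formula. This is only a sketch: the function name `normal_pdf` and the evaluation points are chosen here for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal (Gaussian) distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The curve peaks at the mean and is symmetric around it.
peak = normal_pdf(0.0)
left = normal_pdf(-1.0)
right = normal_pdf(1.0)
print(peak, left, right)
```

The density is highest at the mean and takes the same value one standard deviation to either side, which is what gives the curve its bell shape.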

How do you calculate the probability of the intersection of two independent events?

P(A ∩ B) = P(A) * P(B)
P(A ∩ B) = P(A) + P(B)
P(A ∩ B) = P(A) - P(B)
P(A ∩ B) = P(A) / P(B)

The probability of the intersection of two independent events is the product of their individual probabilities: if A and B are independent, P(A ∩ B) = P(A) * P(B). This is the multiplication rule for independent events, and it holds only when the occurrence of one event does not change the probability of the other.
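
The rule can be checked by enumerating a small sample space. The sketch below assumes two fair six-sided dice; the events A ("first die is even") and B ("second die shows more than 4") are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

def event_a(outcome):
    return outcome[0] % 2 == 0   # first die is even

def event_b(outcome):
    return outcome[1] > 4        # second die shows 5 or 6

p_a = prob(event_a)
p_b = prob(event_b)
p_a_and_b = prob(lambda o: event_a(o) and event_b(o))
print(p_a, p_b, p_a_and_b)
```

Because the two dice do not influence each other, the probability of the intersection equals the product of the two individual probabilities.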

What type of data represents characteristics or attributes?

Categorical data
Ordinal data
Qualitative data
Quantitative data

Qualitative data represents characteristics or attributes. It is often non-numerical and may include qualities such as textures, colors, smells, tastes, appearance, beauty, etc. This data type is commonly used in fields such as sociology, marketing, and psychology.

How is the strength of correlation between two variables determined?

By the correlation coefficient
By the number of data points
By the slope of the line of best fit
By the y-intercept of the line of best fit

The strength of correlation between two variables is determined by the correlation coefficient, which ranges from -1 to +1. A value close to +1 or -1 indicates a strong linear relationship, while a value close to 0 indicates a weak or no linear relationship.
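
A minimal sketch of how the coefficient is computed, assuming the Pearson correlation coefficient is the one meant here. The function name `pearson_r` and the sample data are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
strong_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # points lie on a rising line
strong_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # points lie on a falling line
no_linear = pearson_r(xs, [2, -1, 0, -1, 2])        # symmetric U shape
print(strong_positive, strong_negative, no_linear)
```

The U-shaped data shows that a coefficient near 0 rules out a linear relationship only; the variables can still be related in a non-linear way.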

How does the sample size affect the power of the Kruskal-Wallis Test?

It depends on the data
Larger sample sizes decrease power
Larger sample sizes increase power
Sample size has no effect on power

Larger sample sizes increase the power of the Kruskal-Wallis Test. Power is the probability that a test detects a true effect, that is, rejects the null hypothesis when it is false.
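
The effect of sample size on power can be estimated by simulation. The sketch below rests on several assumptions made only for illustration: exactly three groups drawn from normal distributions with shifted means, no tied values, and the usual chi-square approximation for the H statistic (with 2 degrees of freedom its upper tail probability is exp(-H/2)). The helper names are hypothetical.

```python
import math
import random

def kruskal_wallis_p(groups):
    """Approximate p-value of the Kruskal-Wallis H test for exactly three groups.

    Assumes no ties, which holds for draws from a continuous distribution.
    """
    if len(groups) != 3:
        raise ValueError("this sketch supports exactly three groups")
    # Rank all observations jointly, remembering which group each came from.
    pooled = sorted((value, idx) for idx, group in enumerate(groups) for value in group)
    rank_sums = [0.0, 0.0, 0.0]
    for rank, (_, idx) in enumerate(pooled, start=1):
        rank_sums[idx] += rank
    n_total = len(pooled)
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    # Chi-square survival function with 2 degrees of freedom.
    return math.exp(-h / 2.0)

def estimate_power(n_per_group, shifts=(0.0, 0.5, 1.0), trials=400, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        groups = [[rng.gauss(shift, 1.0) for _ in range(n_per_group)] for shift in shifts]
        if kruskal_wallis_p(groups) < alpha:
            rejections += 1
    return rejections / trials

power_small = estimate_power(5)
power_large = estimate_power(40)
print(power_small, power_large)
```

With the same true differences between groups, the larger samples lead to rejection far more often, which is the increase in power described above.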

Polynomial regression allows us to model a relationship between the dependent variable and independent variables as a _________.

High
Linear equation
Non-linear equation
Straight line

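Polynomial regression allows us to model the relationship between the dependent variable and independent variables as a non-linear equation. This is achieved by raising the independent variables to powers (x², x³, and so on), which lets the model fit curved patterns that a straight line cannot. The model remains linear in its coefficients, so it can still be fitted by ordinary least squares.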
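
A minimal sketch of a polynomial fit with one predictor, solving the least-squares normal equations by Gaussian elimination. The function name `fit_polynomial` and the data (generated from y = 1 + 2x + 3x²) are illustrative assumptions, and in practice a numerical library would normally be used instead.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., c_degree] of y = c0 + c1*x + ... ."""
    m = degree + 1
    # Normal equations A c = b built from the powers of x.
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(m)] for i in range(m)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(m)]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs

xs = [-2, -1, 0, 1, 2]
ys = [1 + 2 * x + 3 * x * x for x in xs]   # exact quadratic, no noise
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

The fitted curve is non-linear in x, yet the coefficients are found by solving a linear system, which is why polynomial regression is often treated as a special case of linear regression.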

When a data distribution is skewed, which measure of central tendency is typically the most reliable?

Mean
Median
Mode

The median is usually the most reliable measure of central tendency when a data distribution is skewed. Unlike the mean, the median is barely affected by extreme values. Therefore, in a skewed distribution, the median generally gives a better idea of the typical value than the mean.
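
A short illustration with made-up numbers: six similar incomes (in thousands) plus one extreme value that skews the distribution to the right.

```python
import statistics

# Hypothetical incomes in thousands; the last value is an extreme outlier.
incomes = [30, 32, 35, 38, 40, 42, 500]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
print(mean_income, median_income)
```

The single outlier pulls the mean above 100, well beyond what six of the seven people earn, while the median stays at 38.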

What is 'dendrogram' in hierarchical clustering?

A diagram showing the change in the number of clusters
A graph showing the distribution of clusters
A tree-like diagram that represents the hierarchy of clusters
The center point of a cluster

A dendrogram is a tree-like diagram that is used in hierarchical clustering to represent the hierarchy of clusters. Each join in the dendrogram represents the merging of two clusters, and the height of the join is the distance between those clusters.
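
The information a dendrogram draws (which clusters merge, and at what height) can be produced by a small agglomerative procedure. The sketch below assumes one-dimensional points and single linkage (distance between the closest members), both chosen for illustration; the function name is hypothetical.

```python
def single_linkage_merges(points):
    """Return the merge history as (cluster_a, cluster_b, height) tuples."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]   # j > i, so index i is unaffected
    return merges

merges = single_linkage_merges([1.0, 1.5, 5.0, 5.2, 9.0])
heights = [height for _, _, height in merges]
for left, right, height in merges:
    print(left, right, round(height, 2))
```

Each tuple corresponds to one join in the dendrogram: the two nearby pairs merge at small heights, and the well-separated groups merge only near the top of the tree.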

The __________ plot is used to check the linearity and equal variance assumptions of a multiple linear regression.

Cook's Distance
Leverage
Quantile-Quantile
Residuals vs fitted values

The residuals vs fitted values plot is commonly used in regression diagnostics to check the assumptions of linearity and equal variance (homoscedasticity). The residuals should be scattered randomly around zero, and the spread of the residuals should not change with the fitted values.
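
The numbers behind such a plot can be computed without any plotting library. The sketch below uses a single predictor for brevity (the same idea carries over to multiple predictors) and deliberately fits a straight line to curved, made-up data so that the violation is visible.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]
ys = [x * x for x in xs]   # the true relationship is curved, not linear

slope, intercept = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# These (fitted, residual) pairs are the points of the diagnostic plot.
for f, r in zip(fitted, residuals):
    print(f, r)
```

Here the residuals are positive at both ends and negative in the middle. A systematic U shape like this, rather than a random scatter around zero, signals that the linearity assumption does not hold.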

In the context of probability distributions, what is a random variable?

A variable that always takes a constant value
A variable that does not have a specific value
A variable that is not influenced by other variables
A variable whose outcome is based on the result of a random event

A random variable is a variable whose value is determined by the outcome of a random event. It can be either discrete (taking values from a countable set, such as 0, 1, 2, ...) or continuous (taking any value within a range).
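
A brief sketch of the two kinds of random variable, using a fair die and a uniform draw as illustrative examples. The seed is arbitrary and only makes the run repeatable.

```python
import random

rng = random.Random(42)   # fixed seed so the run is repeatable

# Discrete random variable: the face shown by a fair six-sided die.
die_rolls = [rng.randint(1, 6) for _ in range(10)]

# Continuous random variable: a point chosen uniformly from [0, 1).
uniform_draws = [rng.random() for _ in range(10)]

print(die_rolls)
print(uniform_draws)
```

The die can only land on one of six separate values, whereas the uniform draw can fall anywhere in its interval.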

What are the two subtypes of quantitative data?

Categorical and Ordinal
Discrete and Continuous
Interval and Ratio
Nominal and Categorical

Quantitative data can be classified into two subtypes: discrete and continuous. Discrete data can only take certain values (like the number of children in a family – 1, 2, 3, etc.), while continuous data can take any value within a given range or continuum (like the height or weight of a person).

How does a probability mass function differ from a probability density function?

A probability mass function is used for continuous random variables, while a probability density function is used for discrete random variables
A probability mass function is used for discrete random variables, while a probability density function is used for continuous random variables
The two terms are interchangeable
There is no difference between a probability mass function and a probability density function

A probability mass function is used for discrete random variables and gives the probability that a discrete random variable is exactly equal to some value. A probability density function, on the other hand, is used for continuous random variables and gives the density of the variable at a particular value. A density is not itself a probability: probabilities for a continuous variable come from integrating the density over an interval.
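
The difference shows up clearly in numbers. The sketch below uses a binomial distribution (10 fair coin flips) as the discrete example and a narrow normal distribution (standard deviation 0.1) as the continuous one; both are illustrative choices.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# PMF: each value is a probability, and together they sum to 1.
pmf_values = [binomial_pmf(k, 10, 0.5) for k in range(11)]
total_mass = sum(pmf_values)

# PDF: a density can exceed 1, so it cannot be a probability by itself.
narrow = NormalDist(mu=0.0, sigma=0.1)
density_at_mean = narrow.pdf(0.0)   # about 3.99

# A probability is the area under the density over an interval.
prob_within_one_sd = narrow.cdf(0.1) - narrow.cdf(-0.1)

print(total_mass, density_at_mean, prob_within_one_sd)
```

The mass function values are probabilities in their own right, while the density only yields a probability (here roughly 0.68 for one standard deviation either side of the mean) once it is accumulated over a range.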
