What assumptions must be met for Pearson's Correlation Coefficient to be valid?
- Both variables are independent
- Both variables are measured on a nominal scale
- Both variables are normally distributed, and there is a linear relationship between them
- Both variables have no outliers
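For Pearson's correlation coefficient to be a valid and reliable measure, both variables should be continuous (measured on an interval or ratio scale) and linearly related, and the observations should be independent of one another. Approximate normality of both variables is additionally needed for the usual significance tests and confidence intervals. Because the coefficient is sensitive to extreme values, the data should also be checked for outliers.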
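The linearity assumption can be illustrated with a minimal sketch. The helper `pearson_r` and the data below are hypothetical, written only for this example: an exact linear relationship should give a coefficient of 1, while a perfectly determined but symmetric quadratic relationship should give 0, which is why a non-linear relationship can be missed entirely.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [2.0 * v + 1.0 for v in x]

# Hypothetical data: y depends on x exactly, but not linearly.
x_sym = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_quadratic = [v ** 2 for v in x_sym]

r_linear = pearson_r(x, y_linear)
r_quadratic = pearson_r(x_sym, y_quadratic)
print(r_linear, r_quadratic)
```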
What is the probability of an impossible event?
- 0
- 1
- Infinity
- Undefined
The probability of an impossible event is 0. On the probability scale, 0 denotes impossibility and 1 denotes certainty, so an event that cannot happen is assigned a probability of 0.
What is a cumulative distribution function?
- It is the function that maps values to their percentile rank in a distribution
- It is the function that shows the cumulative probability associated with a function
- It is the maximum value a random variable can take
- It is the minimum value a random variable can take
The cumulative distribution function (CDF) of a random variable X gives, for each value x, the probability that X takes a value less than or equal to x: F(x) = P(X ≤ x). The CDF is non-decreasing, approaches 0 as x goes to negative infinity, and approaches 1 as x goes to positive infinity.
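As a rough sketch of the definition, the snippet below evaluates two CDFs: the standard normal CDF, written in terms of the error function, and an empirical CDF of a small sample. The function names and the sample values are assumptions made for illustration only.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def empirical_cdf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

sample = [2, 4, 4, 5, 7, 9]  # hypothetical observations

print(empirical_cdf(sample, 4))  # 3 of the 6 values are <= 4
print(normal_cdf(0.0))           # half the probability lies below the mean

# The CDF never decreases as x grows.
grid = [-3.0, -1.0, 0.0, 1.0, 3.0]
cdf_values = [normal_cdf(x) for x in grid]
print(cdf_values == sorted(cdf_values))
```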
In a skewed distribution, the ________ tends to get pulled in the direction of the skew.
- Mean
- Median
- Mode
In a skewed distribution, the mean tends to get pulled in the direction of the skew. Because the mean uses every value in the distribution, extreme values in the long tail have a large influence on it, whereas the median and mode are far less affected. As a result, the mean typically lies further toward the tail than the median, for example above the median in a right-skewed distribution.
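A small numeric sketch makes the pull on the mean visible. The `incomes` list is made-up, right-skewed data with one large value in the tail.

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is very large.
incomes = [20, 22, 25, 27, 30, 32, 35, 250]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# The single large value drags the mean well above the median.
print(mean_income, median_income)
```

Here the mean works out to 55.125 while the median stays at 28.5, so the mean sits far closer to the tail than the median does.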
What is the significance of the 68-95-99.7 rule in a normal distribution?
- It refers to the kurtosis of the distribution
- It refers to the outliers in the distribution
- It refers to the percentage of data within 1, 2, and 3 standard deviations of the mean
- It refers to the skewness of the distribution
The 68-95-99.7 rule, also known as the empirical rule, states that for a normal distribution about 68% of the data fall within one standard deviation of the mean, about 95% fall within two standard deviations, and about 99.7% fall within three standard deviations. The rule gives a quick way to estimate how likely a value is to fall within a given distance of the mean.
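The three percentages can be checked directly from the standard normal distribution. This is a minimal sketch using the identity P(|Z| ≤ k) = erf(k / √2); the helper name `prob_within` is an assumption for this example.

```python
import math

def prob_within(k):
    """P(|Z| <= k) for a standard normal Z: mass within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The printed values are approximately 0.6827, 0.9545, and 0.9973, which round to the 68, 95, and 99.7 in the rule's name.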
What does ANOVA stand for?
- Analysis Of Variance
- Analysis Of Vitality
- Average Of Variance
ANOVA stands for Analysis of Variance. It is a statistical technique for testing whether the means of two or more groups differ significantly, which it does by comparing the variation between groups with the variation within groups.
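To show where the name comes from, here is a rough sketch of the one-way ANOVA F statistic computed by hand as the ratio of between-group to within-group variance. The function `one_way_f` and the three groups are hypothetical; in practice a statistics library would also supply the p-value.

```python
def one_way_f(groups):
    """F statistic for one-way ANOVA: between-group MS / within-group MS."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far group means are from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical measurements from three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_f(groups)
print(f_stat)
```

A large F value indicates that the group means differ by more than the within-group spread alone would suggest.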
What is the Central Limit Theorem and how does it relate to the normal distribution?
- It states that all distributions are ultimately normal distributions
- It states that the mean of a large sample is always equal to the population mean
- It states that the sum of a large number of independent and identically distributed random variables tends to be normally distributed
- It states that the sum of a small number of random variables has an exponential distribution
The Central Limit Theorem states that, under certain conditions, the sum (or equivalently the arithmetic mean) of a sufficiently large number of independent and identically distributed random variables, each with a finite expected value and finite variance, is approximately normally distributed, regardless of the shape of the original distribution. This is why the normal distribution appears so often as the distribution of sample means.
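The theorem can be illustrated with a small simulation. This sketch draws repeated samples from an exponential distribution with rate 1, which is strongly right-skewed, and looks at the distribution of the sample means. The sample size, the number of samples, and the seed are arbitrary choices for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 50     # observations per sample (assumed value)
NUM_SAMPLES = 5000   # number of sample means to collect (assumed value)

def sample_mean(n):
    """Mean of n independent draws from an Exponential(rate=1) distribution."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# For a roughly normal distribution, about 68% of values lie within one
# standard deviation of the centre.
within_one_sd = sum(
    1 for m in means if abs(m - mean_of_means) <= sd_of_means
) / NUM_SAMPLES

print(mean_of_means, sd_of_means, within_one_sd)
```

The exponential distribution here has mean 1 and standard deviation 1, so the sample means should centre near 1 with a standard deviation near 1/√50 ≈ 0.141, and roughly 68% of them should fall within one standard deviation of the centre.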
How do you calculate the probability of the intersection of two independent events?
- P(A ∩ B) = P(A) * P(B)
- P(A ∩ B) = P(A) + P(B)
- P(A ∩ B) = P(A) - P(B)
- P(A ∩ B) = P(A) / P(B)
The probability of the intersection of two independent events is calculated as the product of their individual probabilities. So if A and B are independent, P(A ∩ B) = P(A) * P(B). This is a direct result of the Multiplication Rule for independent events.
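The multiplication rule can be checked by exhaustive enumeration. The sketch below uses two fair dice and two events chosen for illustration (A: the first die is even, B: the second die shows more than 4); because each event depends on a different die, they are independent.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def event_a(o):
    return o[0] % 2 == 0   # first die is even

def event_b(o):
    return o[1] > 4        # second die shows 5 or 6

p_a = prob(event_a)
p_b = prob(event_b)
p_both = prob(lambda o: event_a(o) and event_b(o))

print(p_a, p_b, p_both, p_both == p_a * p_b)
```

Here P(A) = 1/2, P(B) = 1/3, and P(A ∩ B) = 1/6, which equals their product.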
The normal distribution is also known as the ________ distribution.
- Exponential
- Gaussian
- Poisson
- Uniform
The normal distribution is also known as the Gaussian distribution. It is a continuous probability distribution for a real-valued random variable, and its probability density function is a symmetric, bell-shaped curve centered on the mean, with its width set by the standard deviation.
How does the presence of outliers affect measures of dispersion like range, variance, and standard deviation?
- Decreases them
- Depends on the values of the outliers
- Increases them
- No effect
Outliers can greatly affect measures of dispersion like the range, variance, and standard deviation by making them larger. The range depends only on the smallest and largest values, so a single outlier changes it directly. The variance and standard deviation are based on squared distances from the mean, so an outlier (a value that is much higher or lower than the other values) contributes disproportionately and can produce a much larger measure of dispersion.
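As a minimal sketch of this effect, the snippet below computes the three measures for a small made-up data set, then again after appending a single outlier. The helper `dispersion` and the values are assumptions for illustration.

```python
import statistics

def dispersion(values):
    """Range, sample variance, and sample standard deviation of `values`."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "stdev": statistics.stdev(values),
    }

clean = [10, 12, 11, 13, 12, 11]   # hypothetical measurements
with_outlier = clean + [40]        # same data plus one extreme value

before = dispersion(clean)
after = dispersion(with_outlier)

for name in ("range", "variance", "stdev"):
    print(name, before[name], after[name])
```

All three measures grow once the outlier is added; the range, for instance, jumps from 3 to 30.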