The measure of how much individual sample means will vary is called the __________ error.

Absolute
Margin of
Sampling
Standard

The standard error of a statistic is the standard deviation of its sampling distribution, that is, of the values the statistic would take across many repeated samples of the same size. It measures how precisely the statistic estimates the population quantity and is used in hypothesis tests and confidence intervals. For sample means, the standard error tells us how much the mean varies from one sample to another; it is usually estimated as s/√n, where s is the sample standard deviation and n is the sample size.
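
As a rough illustration, the simulation below draws many samples from a made-up population and compares the spread of the sample means with the textbook value σ/√n. The population parameters, sample size, and seed are arbitrary assumptions chosen for the sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 values from a normal distribution.
population = [random.gauss(50, 10) for _ in range(20_000)]
n = 25  # size of each sample

# Draw many samples and record each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(2_000)
]

# Spread of the sample means vs. the formula sigma / sqrt(n).
empirical_se = statistics.pstdev(sample_means)
theoretical_se = statistics.pstdev(population) / n ** 0.5

print(f"empirical SE:   {empirical_se:.3f}")
print(f"theoretical SE: {theoretical_se:.3f}")
```

The two numbers should land close to each other, and both shrink as n grows.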


How does changing the units of measurement affect the standard deviation and variance of a dataset?

It decreases them
It depends on the new units
It doesn't affect them
It increases them

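Changing the units of measurement rescales the data, so the effect depends on the conversion. If every value is multiplied by a constant c, the standard deviation is multiplied by |c| and the variance by c²: both increase when the data is scaled up and decrease when it is scaled down. Adding a constant (a shift of origin) changes neither measure. Under a pure rescaling, the relative dispersion, as measured by the coefficient of variation, remains the same; a conversion that also shifts the origin, such as Celsius to Fahrenheit, changes the mean and therefore the coefficient of variation.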
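
A small check of the scaling behaviour, using made-up heights converted from metres to centimetres (a factor of 100). The data values are assumptions for illustration only.

```python
import statistics

heights_m = [1.52, 1.60, 1.68, 1.75, 1.83]   # hypothetical heights in metres
heights_cm = [h * 100 for h in heights_m]    # the same data in centimetres

sd_m = statistics.stdev(heights_m)
sd_cm = statistics.stdev(heights_cm)
var_m = statistics.variance(heights_m)
var_cm = statistics.variance(heights_cm)

# Coefficient of variation = standard deviation / mean (unit-free).
cv_m = sd_m / statistics.mean(heights_m)
cv_cm = sd_cm / statistics.mean(heights_cm)

print(f"SD ratio:       {sd_cm / sd_m:.1f}")    # scales by the factor
print(f"variance ratio: {var_cm / var_m:.1f}")  # scales by the factor squared
print(f"CV (m, cm):     {cv_m:.4f}, {cv_cm:.4f}")
```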


What is the principle of equally likely outcomes?

All outcomes are equally probable
All outcomes are identical
All outcomes are independent
All outcomes are mutually exclusive

The principle of equally likely outcomes is a basic assumption in the classical definition of probability. It states that if an experiment has n outcomes, and there's no reason to believe that any one outcome is more likely than any other, then each outcome is assumed to have an equal probability of 1/n. For example, in tossing a fair coin, heads and tails are equally likely.
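
Under this assumption, the probability of an event is simply the number of favourable outcomes divided by n. A minimal sketch, where `classical_probability` is a hypothetical helper name introduced for illustration:

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favourable outcomes / total outcomes, assuming equally likely outcomes."""
    favourable = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(favourable, len(sample_space))

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

p_heads = classical_probability({"H"}, coin)     # one outcome out of two
p_even = classical_probability({2, 4, 6}, die)   # three outcomes out of six
p_six = classical_probability({6}, die)          # one outcome out of six

print(p_heads, p_even, p_six)
```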


________ clustering is a density-based clustering method that can find arbitrarily shaped clusters and is less affected by outliers.

DBSCAN
Hierarchical
K-means
Spectral

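DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering method that can find arbitrarily shaped clusters and is less affected by outliers. It works from the density of points in a region: a point with enough neighbours within a given radius starts a cluster, the cluster grows through neighbouring dense points, and points in sparse regions are labelled as noise rather than forced into a cluster.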
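
The idea can be sketched in plain Python. This is a simplified teaching version, not a library implementation: the parameter names `eps` and `min_pts`, the brute-force neighbour search, and the toy points are all assumptions of the sketch.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE              # may later become a border point
            continue
        cluster += 1                       # i is a core point: start a cluster
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point: reachable, not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep growing
    return labels

# Two dense groups and one isolated point (toy data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.2),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
          (9.0, 0.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)
```

The isolated point receives the noise label instead of being attached to the nearest cluster, which is why the method is less sensitive to outliers than centroid-based clustering.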


What does the standard deviation measure in a dataset?

Central tendency
Dispersion
Kurtosis
Skewness

The standard deviation measures the amount of variation or dispersion in a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.
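
To make the contrast concrete, the two made-up datasets below share the same mean but differ sharply in spread. The population standard deviation (`statistics.pstdev`) is used here as an arbitrary choice for the sketch.

```python
import statistics

tight = [48, 49, 50, 51, 52]    # values close to the mean of 50
spread = [10, 30, 50, 70, 90]   # same mean, values far from it

sd_tight = statistics.pstdev(tight)
sd_spread = statistics.pstdev(spread)

print(f"tight:  mean={statistics.mean(tight)}, SD={sd_tight:.2f}")
print(f"spread: mean={statistics.mean(spread)}, SD={sd_spread:.2f}")
```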


How can you determine skewness of a distribution using a box plot?

By the height of the box
By the position of the median in the box
By the width of the box
It cannot be determined from a box plot

The skewness of a distribution can be judged from a box plot by looking at the position of the median in the box. If the median is not in the center of the box (i.e., the quartiles are not equidistant from the median), the data is likely skewed. In a vertical box plot, if the median is closer to the bottom of the box (the first quartile), the data is positively skewed, and if it is closer to the top (the third quartile), the data is negatively skewed. Unequal whisker lengths point the same way: a longer upper whisker suggests positive skew.
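
The same comparison can be made numerically from the quartiles that define the box. In this sketch, `box_skew` is a hypothetical helper, the datasets are made up, and the quartiles come from `statistics.quantiles` with its default method; other quartile conventions can give slightly different values.

```python
import statistics

def box_skew(data):
    """Compare the two halves of the box: (Q3 - median) vs (median - Q1)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    lower, upper = median - q1, q3 - median
    if upper > lower:
        return "positive (right) skew"   # median sits nearer Q1
    if upper < lower:
        return "negative (left) skew"    # median sits nearer Q3
    return "roughly symmetric"

right_tailed = [1, 2, 2, 3, 3, 4, 5, 8, 13, 21, 34]
left_tailed = [-x for x in right_tailed]
balanced = list(range(1, 12))

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("balanced", balanced)]:
    print(name, "->", box_skew(data))
```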


If the p-value from a Mann-Whitney U test is less than the significance level, you would ________ the null hypothesis.

accept
either accept or reject
fail to reject
reject

If the p-value from a Mann-Whitney U test is less than the significance level (often 0.05), you would reject the null hypothesis that the two groups come from the same distribution, suggesting there is a statistically significant difference between the groups.
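
The decision rule can be sketched with a hand-rolled U statistic. This is a simplified version under stated assumptions: it uses the normal approximation with no tie or continuity correction, `mann_whitney_u` is a hypothetical helper, and the two groups are made-up data. A statistics library would normally be used in practice.

```python
import math

def mann_whitney_u(x, y):
    """U statistic for x and a two-sided p-value (normal approximation).

    Assumptions of this sketch: no tie correction, no continuity
    correction, samples large enough for the approximation to be fair.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value (ties count 1/2).
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area
    return u, p_value

group_a = [12, 15, 14, 10, 13, 16, 11, 17, 18, 19]
group_b = [22, 25, 21, 20, 24, 23, 26, 27, 28, 29]
alpha = 0.05  # significance level

u, p = mann_whitney_u(group_a, group_b)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"U={u}, p={p:.5f} -> {decision}")
```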


What does the Law of Large Numbers state?

It states that as the size of a sample is increased, the mean value of the sample will get closer to the mean or expected value of the population.
It states that if an event is repeated under identical conditions, the probability of the event remains the same.
It's a rule which states that the sum of the probabilities of all possible events is 1.
It's the law that states the probability of an event is always constant.

The Law of Large Numbers states that as a sample size grows, its mean gets closer to the average of the whole population. In other words, as the number of experiments increases, the actual ratio of outcomes will converge on the theoretical, or expected, ratio of outcomes.
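
A quick simulation shows the convergence for a fair six-sided die, whose expected value is 3.5. The seed and the sample sizes are arbitrary choices for the sketch, and `sample_mean` is a hypothetical helper.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

EXPECTED = 3.5   # (1 + 2 + 3 + 4 + 5 + 6) / 6

def sample_mean(n):
    """Mean of n simulated rolls of a fair six-sided die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

for n in (10, 100, 1_000, 10_000, 100_000):
    mean = sample_mean(n)
    print(f"n={n:>6}  sample mean={mean:.3f}  |error|={abs(mean - EXPECTED):.3f}")
```

The error does not fall on every step, but it tends toward zero as n grows.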


The graphical representation of residuals versus predicted values is known as a ________ plot.

Box
Histogram
Residual
Scatter

A residual plot is a graph that shows the residuals on the vertical axis and the predicted (fitted) values, or an independent variable, on the horizontal axis. If the points in a residual plot are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data; a systematic pattern, such as a curve or a funnel shape, suggests that a non-linear model or a transformation may be more appropriate.
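
The residuals themselves are easy to compute by hand. The sketch below fits a least-squares line to deliberately curved, made-up data and prints the sign of each residual against the fitted value; the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def residuals(xs, ys):
    """Observed value minus fitted value for each point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

xs = list(range(-5, 6))
curved = [x ** 2 for x in xs]          # a straight line is the wrong model here
straight = [2 * x + 1 for x in xs]     # a straight line fits exactly

res_curved = residuals(xs, curved)
res_straight = residuals(xs, straight)

slope, intercept = fit_line(xs, curved)
for x, r in zip(xs, res_curved):
    fitted = slope * x + intercept
    print(f"fitted={fitted:6.2f}  residual={r:7.2f}  {'+' if r > 0 else '-'}")
```

For the curved data the signs run positive, negative, positive: a U-shaped pattern rather than random scatter, which is the signal that the linear model is inadequate.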


What can the Mann-Whitney U test tell you about the shape of your distributions?

It can confirm if your distributions are normal
It can confirm if your distributions are skewed
It can confirm if your distributions have equal variances
It cannot tell you anything about the shape of your distributions

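The Mann-Whitney U test does not provide information about the shape of the distributions. It is a non-parametric, rank-based test that does not assume normality, so it cannot confirm whether the data are normal, skewed, or of equal variance. Shape still matters for interpretation: the result can be read as a difference in medians only if the two distributions have a similar shape.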
