How does negative kurtosis affect the tails of a data distribution?

It has no effect on the tails of the distribution.
It makes the distribution perfectly symmetrical.
It makes the tails of the distribution heavier.
It makes the tails of the distribution lighter.

Negative excess kurtosis (a platykurtic distribution) means the tails are lighter than those of a normal distribution, so extreme outliers are less likely. Such distributions are often described as flatter than a normal distribution, but kurtosis primarily measures tail weight rather than peak shape.
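
A minimal sketch of the idea, using only the Python standard library. The helper name excess_kurtosis and both sample lists are made up for illustration; the formula is the population fourth standardized moment minus 3, which is one common definition of excess kurtosis.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    data = list(values)
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)  # second central moment
    m4 = fmean((x - mean) ** 4 for x in data)  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Hypothetical sample: evenly spread values, no extreme observations.
flat = list(range(1, 101))
# Same values plus two extreme observations far from the rest.
heavy = flat + [-1000, 1000]

flat_kurtosis = excess_kurtosis(flat)
heavy_kurtosis = excess_kurtosis(heavy)
print(flat_kurtosis, heavy_kurtosis)
```

The evenly spread sample should come out negative (light tails), while adding the two extreme values pushes the result strongly positive (heavy tails).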

Which type of analysis is most commonly used for hypothesis testing?

CDA
Data Visualization
EDA
Predictive Modeling

CDA (Confirmatory Data Analysis) is most commonly used for hypothesis testing. While EDA is used to formulate hypotheses, CDA uses statistical techniques to confirm or reject these hypotheses.

The _________ library in Python allows for the creation of complex animated plots and provides widgets to allow for interactive plots.

Bokeh
Matplotlib
Plotly
Seaborn

Bokeh is designed for interactive, browser-based plots. It supports complex animated plots and includes widgets that let users control a plot, which makes it well suited to dynamic visualizations.

What is the full form of NMAR in the context of missing data?

Never Missing At Random
No Missing At Random
Not Measured At Random
Not Missing At Random

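In the context of missing data, NMAR stands for Not Missing At Random (also written MNAR, Missing Not At Random). It means the probability that a value is missing depends on the unobserved value itself.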

To create multiple plots in one figure in Matplotlib, you would use the ___________ function.

heatmap
pairplot
subplot
violinplot

The 'subplot' function in Matplotlib is used to create multiple plots in a single figure. It allows you to arrange plots in a grid structure.

You find that both Z-score and modified Z-score methods give different sets of outliers for the same dataset. How will you reconcile this?

Assume the Z-score method is correct
Assume the modified Z-score method is correct
Consider the intersection of both methods
Further inspect the data and the assumptions of each method

When two methods give different sets of outliers, it's best to further inspect the data and the assumptions of each method before drawing conclusions.
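
One way such a disagreement can arise is sketched below with the standard library. The dataset is hypothetical, and the cutoffs (3 for the Z-score, 3.5 for the modified Z-score with the 0.6745 scaling constant) are commonly used conventions, not values taken from the question.

```python
from statistics import fmean, median, pstdev


def z_scores(data):
    """Classic Z-score: distance from the mean in standard deviations."""
    mean, sd = fmean(data), pstdev(data)
    return [(x - mean) / sd for x in data]


def modified_z_scores(data):
    """Modified Z-score: distance from the median, scaled by the MAD."""
    med = median(data)
    mad = median(abs(x - med) for x in data)  # median absolute deviation
    return [0.6745 * (x - med) / mad for x in data]


# Hypothetical data: eight similar values and two much larger ones.
data = [10, 11, 12, 12, 13, 13, 14, 15, 40, 45]

z_flagged = [x for x, z in zip(data, z_scores(data)) if abs(z) > 3]
mz_flagged = [x for x, m in zip(data, modified_z_scores(data)) if abs(m) > 3.5]
print(z_flagged, mz_flagged)
```

Here the two large values inflate the mean and standard deviation enough that the Z-score flags nothing, while the median-based method flags both. Inspecting the data shows why the methods disagree, which is more informative than picking one result or taking the intersection.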

Which method for dealing with missing data might introduce bias if the data is not missing completely at random?

Listwise Deletion
Mean/Median/Mode Imputation
Pairwise Deletion
Regression Imputation

Mean/Median/Mode Imputation might introduce bias if the data is not missing completely at random. When missingness follows a systematic pattern, the mean, median, or mode of the observed values is not representative of the missing ones, so filling them in biases the estimates. Imputing a single constant also understates the variability of the data.
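
The effect can be illustrated with a small standard-library sketch. The values are hypothetical, and the missingness rule (anything above 60 is never recorded) is an assumption chosen to make the pattern systematic.

```python
from statistics import fmean, pstdev

# Hypothetical complete data (normally unknown to the analyst).
true_values = [20, 25, 30, 35, 40, 45, 50, 80, 90, 100]

# Assumption: values above 60 are never recorded, so missingness
# depends on the value itself rather than being completely random.
observed = [x if x <= 60 else None for x in true_values]

# Mean imputation: fill every gap with the mean of the observed values.
observed_mean = fmean(x for x in observed if x is not None)
imputed = [observed_mean if x is None else x for x in observed]

print("true mean:", fmean(true_values), "imputed mean:", fmean(imputed))
print("true sd:", pstdev(true_values), "imputed sd:", pstdev(imputed))
```

Because only the large values are missing, the imputed mean is pulled downward, and filling the gaps with one constant shrinks the standard deviation.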

Readability in data visualization refers to how easily the audience can __________.

Analyze the underlying code
Download the graph
Interact with the graph
Understand the represented data

Readability in data visualization refers to how easily the audience can understand the represented data. This includes the clarity of text elements (labels, title, caption), color scheme, and whether the choice of plot type makes sense for the represented data.

What is the underlying JavaScript library that Plotly uses to render its graphics?

D3.js
Node.js
React.js
jQuery

Plotly uses D3.js (Data-Driven Documents) under the hood to render its graphics. D3.js is a JavaScript library for producing dynamic and interactive data visualizations in web browsers.

You're working with a data set where a few observations are vastly different from the rest. Which method, Z-score or IQR, would be more robust to use for outlier detection?

Either would work equally well
IQR
Neither would be effective
Z-score

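The IQR method is more robust than Z-score for outlier detection in this scenario. It relies on quartiles, which a few extreme values barely move, whereas Z-scores depend on the mean and standard deviation, both of which are pulled toward extreme values.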
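
A short standard-library sketch of this comparison follows. The dataset is hypothetical, the 1.5 x IQR fences and the Z-score cutoff of 3 are common conventions, and statistics.quantiles is used with its default method, so quartile values may differ slightly from other tools.

```python
from statistics import fmean, pstdev, quantiles

# Hypothetical data: nine similar values and two vastly different ones.
data = [10, 12, 12, 13, 13, 14, 14, 15, 16, 120, 130]

# Z-score rule: flag points more than 3 standard deviations from the mean.
mean, sd = fmean(data), pstdev(data)
z_flagged = [x for x in data if abs(x - mean) / sd > 3]

# IQR rule: flag points beyond 1.5 * IQR outside the quartiles.
q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_flagged = [x for x in data if x < lower_fence or x > upper_fence]

print("Z-score flags:", z_flagged)
print("IQR flags:", iqr_flagged)
```

In this sample the extreme values inflate the standard deviation so much that no point exceeds the Z-score cutoff, while the quartile-based fences still isolate both of them.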

What is an outlier in the context of Exploratory Data Analysis?

A data point that falls outside of the normal range
A data point that is a duplicate
A data point that is missing
A frequently occurring data point

In statistics, an outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In simple terms, an outlier is a value that is significantly different from other similar values.

What is the key difference between 'removal' and 'transformation' of outliers?

Removal changes the data distribution, while transformation does not
Removal deals with extreme values, while transformation does not
Removal discards outliers, while transformation modifies their values
Removal is a type of data cleaning, while transformation is not

The key difference between 'removal' and 'transformation' of outliers is that removal discards outliers from the dataset, while transformation modifies the values of outliers to reduce their impact.
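
The distinction can be sketched with the standard library. The dataset is hypothetical, and capping at the 1.5 x IQR fences and taking logarithms are just two example transformations chosen for illustration; other choices are possible.

```python
import math
from statistics import quantiles

# Hypothetical data with two extreme values.
data = [10, 12, 12, 13, 13, 14, 14, 15, 16, 120, 130]

q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Removal: outliers are discarded, so the dataset gets smaller.
removed = [x for x in data if low <= x <= high]

# Transformation (capping): outliers stay but are pulled to the fences.
capped = [min(max(x, low), high) for x in data]

# Transformation (log): every value is rescaled, compressing large ones.
logged = [math.log(x) for x in data]

print(len(data), len(removed), len(capped), len(logged))
```

Removal leaves nine observations, while both transformations keep all eleven and only change the values, which reduces the influence of the extreme points without losing rows.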