What functionality does the Seaborn library add over Matplotlib?

  • 3D plotting
  • Interactive plotting
  • Real-time plotting
  • Statistical plotting
While Matplotlib is a powerful library for creating a wide range of plots, Seaborn builds on it with high-level statistical plotting functions, letting users create more informative and attractive visualizations with fewer lines of code.

Which measure of central tendency can be used for both quantitative and qualitative data?

  • Mean
  • Median
  • Mode
  • None
The "Mode" is the measure of central tendency that can be used for both quantitative and qualitative data. It is the value that appears most frequently in a data set, and it is the only measure of central tendency that can be used with nominal data.
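
As a minimal sketch of the point above, the standard-library `statistics` module can compute a mode for both numeric and categorical values. The sample lists are hypothetical and chosen only for illustration.

```python
from statistics import mode, multimode

# Hypothetical sample data, for illustration only.
shoe_sizes = [38, 40, 40, 41, 42, 40, 39]                  # quantitative
eye_colors = ["brown", "blue", "brown", "green", "brown"]  # qualitative (nominal)

# mode() works on any hashable values, numeric or not.
size_mode = mode(shoe_sizes)
color_mode = mode(eye_colors)

# multimode() returns every value tied for most frequent,
# in the order each value first appears.
tied = multimode(["a", "b", "a", "b", "c"])

print(size_mode, color_mode, tied)
```

A mean or median of `eye_colors` would be meaningless, because nominal categories have no arithmetic or ordering.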

Which method for dealing with missing data might introduce bias if the data is not missing completely at random?

  • Listwise Deletion
  • Mean/Median/Mode Imputation
  • Pairwise Deletion
  • Regression Imputation
Mean/Median/Mode Imputation might introduce bias if the data is not missing completely at random. If the missing values follow a systematic pattern, replacing them with the mean, median, or mode of the observed values typically understates variability and can bias the results.
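
The effect can be illustrated with a small sketch. The data and the missingness rule (every value above 50 goes unreported) are assumptions made up for this example, not taken from a real dataset.

```python
from statistics import mean, pstdev

# Hypothetical complete data (normally we would never see all of it).
true_values = [20, 25, 30, 35, 40, 80, 90, 100]

# Assumed missingness mechanism: values above 50 are never reported,
# so the data is NOT missing completely at random.
observed = [v if v <= 50 else None for v in true_values]

# Mean imputation: fill each gap with the mean of the observed values.
present = [v for v in observed if v is not None]
fill = mean(present)
imputed = [fill if v is None else v for v in observed]

true_mean, imputed_mean = mean(true_values), mean(imputed)
true_sd, imputed_sd = pstdev(true_values), pstdev(imputed)

print(f"mean: true={true_mean:.2f}, after imputation={imputed_mean:.2f}")
print(f"std:  true={true_sd:.2f}, after imputation={imputed_sd:.2f}")
```

Because only low values are observed, the fill value is too low, and because every gap receives the same value, the spread of the imputed data shrinks.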

You find that both Z-score and modified Z-score methods give different sets of outliers for the same dataset. How will you reconcile this?

  • Assume the Z-score method is correct
  • Assume the modified Z-score method is correct
  • Consider the intersection of both methods
  • Further inspect the data and the assumptions of each method
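When two methods give different sets of outliers, it's best to further inspect the data and the assumptions of each method before drawing conclusions. The Z-score relies on the mean and standard deviation, which extreme values can distort, while the modified Z-score relies on the median and the median absolute deviation, which are more robust.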
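
A short sketch shows how the two methods can disagree. The data is hypothetical, and the cutoffs of 3 for the Z-score and 3.5 for the modified Z-score are common conventions assumed here rather than fixed rules. The sketch also assumes the median absolute deviation is nonzero.

```python
from statistics import mean, median, pstdev

def z_scores(data):
    """Classic Z-score: distance from the mean in standard deviations."""
    mu, sigma = mean(data), pstdev(data)
    return [(x - mu) / sigma for x in data]

def modified_z_scores(data):
    """Modified Z-score: distance from the median, scaled by the MAD.
    Assumes the MAD is nonzero."""
    med = median(data)
    mad = median([abs(x - med) for x in data])
    return [0.6745 * (x - med) / mad for x in data]

# Hypothetical measurements with two suspicious values at the end.
data = [10, 11, 12, 11, 10, 12, 11, 30, 35]

z_outliers = [x for x, z in zip(data, z_scores(data)) if abs(z) > 3]
mod_outliers = [x for x, m in zip(data, modified_z_scores(data)) if abs(m) > 3.5]

print("Z-score outliers:         ", z_outliers)
print("Modified Z-score outliers:", mod_outliers)
```

Here the two large values inflate the mean and standard deviation enough to hide themselves from the Z-score, which is the kind of assumption worth checking before trusting either result.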

To create multiple plots in one figure in Matplotlib, you would use the ___________ function.

  • heatmap
  • pairplot
  • subplot
  • violinplot
The 'subplot' function in Matplotlib is used to create multiple plots in a single figure. It allows you to arrange plots in a grid structure.
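
A minimal sketch of the grid arrangement, assuming Matplotlib is installed. The data and titles are placeholders, and the `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen use
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]  # placeholder data

fig = plt.figure(figsize=(8, 3))

plt.subplot(1, 2, 1)               # grid of 1 row x 2 columns, first panel
plt.plot(x, [v ** 2 for v in x])
plt.title("Squares")

plt.subplot(1, 2, 2)               # same grid, second panel
plt.bar(x, [v + 1 for v in x])
plt.title("Bars")

plt.tight_layout()                 # avoid overlapping titles and tick labels
# plt.show() would display the figure in an interactive session
```

The related `plt.subplots(nrows, ncols)` function creates the figure and all axes in one call, which is often more convenient for larger grids.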

What is the full form of NMAR in the context of missing data?

  • Never Missing At Random
  • No Missing At Random
  • Not Measured At Random
  • Not Missing At Random
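In the context of missing data, NMAR stands for Not Missing At Random. It describes the case where the probability that a value is missing depends on the unobserved value itself.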

The _________ library in Python allows for the creation of complex animated plots and provides widgets to allow for interactive plots.

  • Bokeh
  • Matplotlib
  • Plotly
  • Seaborn
Bokeh is a powerful library for creating interactive plots, including complex animated plots, and it includes support for widgets, making it a great tool for creating dynamic, interactive visualizations.

Readability in data visualization refers to how easily the audience can __________.

  • Analyze the underlying code
  • Download the graph
  • Interact with the graph
  • Understand the represented data
Readability in data visualization refers to how easily the audience can understand the represented data. This includes the clarity of text elements (labels, title, caption), color scheme, and whether the choice of plot type makes sense for the represented data.

In the context of handling missing data, what does 'imputation' mean?

  • Adding artificial data
  • Deleting data points
  • Filling in missing data with substituted values
  • Transforming data
In the context of handling missing data, 'imputation' refers to the process of filling in missing data with substituted values. These values can be determined in a variety of ways such as using measures of central tendency (mean, median, mode), predictive models, or other techniques.
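
As an illustration, here is a small sketch of imputation with a measure of central tendency. The `impute` helper and the sample lists are hypothetical names and data introduced only for this example.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Return a copy of `values` with each None replaced by a summary
    statistic of the observed (non-missing) entries."""
    stat = {"mean": mean, "median": median, "mode": mode}[strategy]
    present = [v for v in values if v is not None]
    fill = stat(present)
    return [fill if v is None else v for v in values]

# Hypothetical data with gaps marked as None.
ages = [22, None, 25, 25, None, 40]
cities = ["Paris", None, "Rome", "Paris"]

print(impute(ages, "mean"))
print(impute(ages, "median"))
print(impute(cities, "mode"))   # mode also works for categorical data
```

Predictive approaches follow the same pattern but replace the single fill value with a per-row estimate from a model.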

Imagine you are examining a correlation matrix and find that two variables have a correlation coefficient close to -1. What does this imply about the relationship between these two variables?

  • Their relationship is random
  • They are unrelated
  • They have a strong negative relationship
  • They have a weak positive relationship
A correlation coefficient close to -1 implies that the two variables have a strong negative linear relationship. This means that as one variable increases, the other tends to decrease, and vice versa.
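
To make this concrete, the sketch below computes the Pearson correlation coefficient by hand. The `pearson_r` helper and both variables are hypothetical and serve only to illustrate a value close to -1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (the 1/n factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: the second variable falls as the first rises.
hours_of_tv = [1, 2, 3, 4, 5]
exam_score = [90, 85, 70, 65, 50]

r = pearson_r(hours_of_tv, exam_score)
print(f"r = {r:.3f}")
```

A value this close to -1 says the points lie near a downward-sloping line. It does not by itself show that one variable causes the other.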