What is the full form of NMAR in the context of missing data?

  • Never Missing At Random
  • No Missing At Random
  • Not Measured At Random
  • Not Missing At Random
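In the context of missing data, NMAR stands for Not Missing At Random (also written MNAR, Missing Not At Random). It means the probability that a value is missing depends on the unobserved value itself, so the missingness cannot be explained by the observed data alone.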
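
As a rough illustration of why NMAR matters, the sketch below uses made-up income values and assumes, purely for demonstration, that every value above a hypothetical threshold goes unreported. Because the missingness depends on the value itself, the mean of the observed values understates the true mean.

```python
from statistics import mean

# Hypothetical incomes (in thousands); illustrative values, not real data.
incomes = [20, 25, 30, 35, 40, 60, 90, 120]

# Assumed NMAR mechanism: any value above the threshold is never reported.
THRESHOLD = 80
observed = [x if x <= THRESHOLD else None for x in incomes]

true_mean = mean(incomes)
observed_mean = mean([x for x in observed if x is not None])

print(f"true mean: {true_mean}, mean of observed values: {observed_mean}")
```

Real NMAR mechanisms are rarely this clean-cut; the hard threshold is only there to make the bias easy to see.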

The _________ library in Python supports complex animated plots and provides widgets for building interactive plots.

  • Bokeh
  • Matplotlib
  • Plotly
  • Seaborn
Bokeh is a powerful library for creating interactive plots, including complex animated plots, and it includes support for widgets, making it a great tool for creating dynamic, interactive visualizations.

Which type of analysis is most commonly used for hypothesis testing?

  • CDA
  • Data Visualization
  • EDA
  • Predictive Modeling
CDA (Confirmatory Data Analysis) is most commonly used for hypothesis testing. While EDA is used to formulate hypotheses, CDA uses statistical techniques to confirm or reject these hypotheses.

How does negative kurtosis affect the tails of a data distribution?

  • It has no effect on the tails of the distribution.
  • It makes the distribution perfectly symmetrical.
  • It makes the tails of the distribution heavier.
  • It makes the tails of the distribution lighter.
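Negative kurtosis (more precisely, negative excess kurtosis) describes a platykurtic distribution: its tails are lighter than those of a normal distribution, so extreme outliers are less likely. Such distributions are often described as flatter than the normal distribution, although kurtosis is mainly a measure of tail weight rather than peak shape.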
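
The sign of excess kurtosis can be checked with a small calculation. The sketch below assumes the population definition (fourth standardized moment minus 3); the function name `excess_kurtosis` and both sample lists are made up for illustration.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    m = fmean(values)
    variance = fmean([(x - m) ** 2 for x in values])
    fourth_moment = fmean([(x - m) ** 4 for x in values])
    return fourth_moment / variance ** 2 - 3


# Evenly spread values with no extremes: light tails.
flat = list(range(1, 11))

# Mostly identical values plus two extremes: heavy tails.
heavy = [0] * 18 + [-10, 10]

print(excess_kurtosis(flat))   # negative (platykurtic)
print(excess_kurtosis(heavy))  # positive (leptokurtic)
```

Statistical libraries may apply a small-sample bias correction, so their results can differ slightly from this uncorrected formula.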

What type of plot is often used for visualizing the relationship between two continuous variables?

  • Bar plot
  • Box plot
  • Histogram
  • Scatter plot
Scatter plots are ideal for visualizing the relationship between two continuous variables. Each point in the scatter plot corresponds to the values of two variables.

What is the process of removing an entire row when any single data point within it is missing called?

  • Listwise Deletion
  • Mean Imputation
  • Pairwise Deletion
  • Regression Imputation
The process of removing an entire row when any single data point within it is missing is called 'Listwise Deletion'. Also known as 'Complete Case Analysis', this technique is straightforward and fast, but it can potentially discard valuable data and introduce bias if the missingness is not completely at random.
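
Listwise deletion is simple enough to sketch in plain Python. The records below are invented, `None` is assumed to mark a missing value, and `listwise_delete` is a hypothetical helper name.

```python
# Made-up records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "city": "Pune"},
    {"age": None, "income": 48000, "city": "Delhi"},
    {"age": 29, "income": None, "city": "Mumbai"},
    {"age": 41, "income": 61000, "city": "Chennai"},
]


def listwise_delete(records):
    """Keep only records in which every field is present (not None)."""
    return [r for r in records if all(v is not None for v in r.values())]


complete = listwise_delete(rows)
print(f"kept {len(complete)} of {len(rows)} rows")
```

Half of the rows are lost here even though only two individual values are missing, which is the cost the explanation above warns about. With pandas, `DataFrame.dropna()` with its default arguments performs the same row-wise removal.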

What functionality does the Seaborn library add over Matplotlib?

  • 3D plotting
  • Interactive plotting
  • Real-time plotting
  • Statistical plotting
Matplotlib is a powerful library for creating a wide range of plots. Seaborn builds on it with high-level statistical plotting functions, letting users create more informative and attractive visualizations with fewer lines of code.

While analyzing a dataset using a box plot, you notice that there are several data points plotted as circles. What might these circles represent?

  • Data within the interquartile range
  • Data within the whiskers
  • Median values
  • Outliers
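In a box plot, data points plotted individually as circles usually represent outliers: values that fall beyond the whiskers. By a common convention, the whiskers reach to the most extreme data points within 1.5 times the interquartile range (IQR) of the first and third quartiles.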
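
Many plotting tools decide which points to draw as circles using the 1.5 × IQR rule. The sketch below assumes that rule and uses linear-interpolation quartiles (`method="inclusive"`); other tools may compute quartiles slightly differently. The helper name `iqr_outliers` and the data are illustrative.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
flagged = iqr_outliers(data)
print(flagged)
```

Here the quartiles are 3.25 and 7.75, so the fences sit at -3.5 and 14.5 and only the value 100 is flagged.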

What is the key difference between 'removal' and 'transformation' of outliers?

  • Removal changes the data distribution, while transformation does not
  • Removal deals with extreme values, while transformation does not
  • Removal discards outliers, while transformation modifies their values
  • Removal is a type of data cleaning, while transformation is not
The key difference between 'removal' and 'transformation' of outliers is that removal discards outliers from the dataset, while transformation modifies the values of outliers to reduce their impact.
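
The contrast can be seen in a few lines. This sketch assumes the acceptable range is already known (the bounds `LOW` and `HIGH` are hypothetical) and uses capping as one example of a transformation; a log transform would be another.

```python
def remove_outliers(values, low, high):
    """Removal: discard any value outside [low, high]."""
    return [x for x in values if low <= x <= high]


def cap_outliers(values, low, high):
    """Transformation by capping: pull values outside [low, high] to the nearest bound."""
    return [min(max(x, low), high) for x in values]


readings = [10, 12, 11, 13, 12, 95]  # made-up measurements
LOW, HIGH = 0, 20                    # assumed acceptable range

removed = remove_outliers(readings, LOW, HIGH)
capped = cap_outliers(readings, LOW, HIGH)

print(removed)  # one observation fewer
print(capped)   # same length, extreme value pulled in to 20
```

Removal shrinks the dataset, while capping keeps every observation and only limits how far an extreme value can pull summary statistics.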

What is an outlier in the context of Exploratory Data Analysis?

  • A data point that falls outside of the normal range
  • A data point that is a duplicate
  • A data point that is missing
  • A frequently occurring data point
In statistics, an outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In simple terms, an outlier is a value that is significantly different from other similar values.