You have a data set with a large number of outliers. Which measure of dispersion should you use to best describe the data set, and why?
- Interquartile range (IQR) because it is robust to outliers
- Range because it covers all values
- Standard deviation because it gives the average spread
- Variance because it squares the differences
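When a data set contains a large number of outliers, the interquartile range (IQR) is the most suitable measure of dispersion. The IQR is the difference between the 75th and 25th percentiles (Q3 - Q1), so it describes the spread of the middle 50% of the data and is largely unaffected by how extreme the values in the tails are. The other options are all sensitive to outliers: the range depends only on the minimum and maximum, and the variance and standard deviation are built from squared deviations from the mean, so a few extreme values inflate them.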
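As a rough illustration of this point, the sketch below compares the four measures on a small sample before and after two extreme values are appended. The data values and the helper name `dispersion_summary` are made up for this example, and the quartiles use the "inclusive" method from Python's `statistics` module; other quartile conventions give slightly different IQR values.

```python
import statistics


def dispersion_summary(values):
    """Return range, variance, standard deviation, and IQR for a list of numbers.

    Hypothetical helper for illustration only. Variance and standard
    deviation are the population versions (dividing by n).
    """
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
        "iqr": q3 - q1,
    }


# Made-up sample, then the same sample with two extreme values appended.
clean = [10, 12, 13, 14, 15, 16, 17, 18, 20]
with_outliers = clean + [95, 120]

clean_summary = dispersion_summary(clean)
outlier_summary = dispersion_summary(with_outliers)

for name in ("range", "variance", "std_dev", "iqr"):
    print(f"{name:>8}: {clean_summary[name]:10.2f} -> {outlier_summary[name]:10.2f}")
```

In this sample the range grows from 10 to 110 and the standard deviation grows by more than a factor of ten, while the IQR only moves from 4.0 to 5.5.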