During your EDA process, you identify several outliers in your dataset. How does this finding impact your subsequent steps in data analysis?
- You may need to collect more data
- You may need to ignore these outliers as they are anomalies
- You might consider robust methods or outlier treatment methods for your analysis
- You might decide to use a different dataset
Identifying outliers during EDA shapes the steps that follow. An outlier may be a measurement or data-entry error, but it may also be a genuine observation. Depending on the context, you would first investigate why the outliers are present, then either treat them (for example, with robust statistical methods, data transformations, or outlier removal) or revise your analysis techniques to accommodate them.
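As a rough illustration of the options mentioned above, the sketch below flags outliers with the common 1.5 × IQR rule and compares the mean with the median, a more robust summary. The sample values, the function names `iqr_bounds` and `flag_outliers`, and the multiplier `k=1.5` are assumptions chosen for this example, not part of the quiz. The IQR rule is only one of several possible detection methods.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" treats the data as the full population of interest.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: one value sits far from the rest.
data = [10, 12, 11, 13, 12, 11, 10, 95]

outliers = flag_outliers(data)
mean_value = statistics.mean(data)      # pulled upward by the extreme value
median_value = statistics.median(data)  # barely affected by it

print(outliers)      # [95]
print(mean_value)    # 21.75
print(median_value)  # 11.5
```

The flagged value still needs investigation before any treatment: if it turns out to be a genuine observation, reporting the median (or another robust statistic) may be preferable to dropping it.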
Related Quiz
- What are some factors to consider when choosing between a scatter plot, pairplot, correlation matrix, and heatmap?
- Principal Component Analysis (PCA) is a technique that reduces dimensionality by creating new uncorrelated variables called _______. These new variables retain most of the variability in the original dataset.
- The process of converting an actual range of values in a numeric feature column into a standard range of values is known as _____.
- When the data is skewed to the right, the _____ will usually be greater than the median.
- What are the key statistical tools used in Confirmatory Data Analysis (CDA)?