You are working on a dataset with income values, and you notice that most incomes are clustered around $50,000, but a few are as high as $1,000,000. Which transformation would be best suited to reduce the impact of these high incomes on your analysis?
- Min-Max Scaling
- Log Transformation
- Z-score Standardization
- Removing Outliers
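To reduce the impact of extreme values in income data, a log transformation is often used. It compresses the upper end of the range (on a base-10 log scale, $50,000 and $1,000,000 are only about 1.3 units apart) and typically makes a right-skewed distribution more symmetrical. Min-Max scaling and z-score standardization are linear rescalings: they change the units but preserve the shape of the distribution, so the extreme values stay just as extreme relative to the rest. Removing outliers does reduce their influence, but it may lead to loss of important information, since very high incomes can be genuine observations.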
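The difference can be seen in a minimal sketch. The income values and the helper names `min_max` and `z_score` below are made up for illustration and are not part of the quiz; only NumPy is assumed.

```python
import numpy as np

# Hypothetical incomes: most near $50,000, plus one extreme value.
incomes = np.array([42_000, 48_000, 50_000, 52_000, 55_000, 1_000_000], dtype=float)


def min_max(x):
    """Linearly rescale values to the [0, 1] interval."""
    return (x - x.min()) / (x.max() - x.min())


def z_score(x):
    """Linearly rescale values to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()


# Log transformation (base 10 here; the natural log behaves the same way).
# It requires strictly positive values, which holds for this sample.
log_incomes = np.log10(incomes)

# Where the typical (median) income falls within the [0, 1] range,
# before and after taking logs.
raw_median_pos = np.median(min_max(incomes))
log_median_pos = np.median(min_max(log_incomes))

print("median position, raw scale:", raw_median_pos)
print("median position, log scale:", log_median_pos)

# Z-scoring is linear, so it leaves the relative spacing unchanged:
# rescaling the z-scores to [0, 1] reproduces the raw min-max values.
print(np.allclose(min_max(z_score(incomes)), min_max(incomes)))
```

On the raw scale the single $1,000,000 value pushes the typical incomes into a narrow band near 0, while on the log scale they occupy a noticeably larger share of the range. If a dataset contains zero incomes, `np.log1p` is a common alternative to a plain log.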