In a dataset with a categorical variable missing for some rows, why might mode imputation not be the best strategy?
- It can introduce bias if the data is not missing at random
- It could distort the original data distribution
- It may not capture the underlying data pattern
- All of the above
The correct answer is "All of the above". Mode imputation is simple to implement, but it has three drawbacks. It may fail to capture the underlying data pattern, because every missing row receives the same value regardless of that row's other features. It can introduce bias if the data is not missing at random, since the missing values may differ systematically from the observed ones. And it distorts the original data distribution by overrepresenting the mode.
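The distortion can be illustrated with a minimal sketch in plain Python. The `colors` list and the helper names `mode_impute` and `proportions` are hypothetical, made up for this example; they do not come from any particular library.

```python
from collections import Counter

# Hypothetical toy data: a categorical column where None marks a missing value.
colors = ["red", "red", "red", "blue", "blue", "green", None, None, None, None]


def mode_impute(values):
    """Replace every None with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def proportions(values):
    """Return the share of each category among the non-missing values."""
    observed = [v for v in values if v is not None]
    counts = Counter(observed)
    return {category: n / len(observed) for category, n in counts.items()}


before = proportions(colors)   # shares among the 6 observed rows
imputed = mode_impute(colors)  # all 4 missing rows become "red"
after = proportions(imputed)   # shares among all 10 rows

print("before:", before)
print("after: ", after)
```

In this toy data, "red" accounts for half of the observed values but 70% of the column after imputation, while "blue" and "green" both shrink. If the missing rows were in fact mostly "green" (data not missing at random), the imputed column would be biased as well as distorted.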
Related Quiz
- What kind of effect can an outlier have on a linear regression model?
- Imagine you are analyzing a dataset with continuous variables with outliers. The focus is on understanding the linear relationship between these variables. What type of correlation coefficient should you consider?
- How does the presence of outliers affect the range and interquartile range?
- Which type of missing data relies on information that is not included in the dataset?
- What is the key characteristic of a Uniform Distribution?