Suppose you are working with time series data that has some missing values and you decide to use regression imputation. What potential issues might arise, and how could you address them?
- May lead to overfitting; Address by adding more data
- May violate independence assumption; Address by considering time dependence
- May violate uniform distribution; Address by transforming data
- No issues might arise
In time series data, observations usually depend on earlier observations, so the independence assumption behind ordinary regression imputation may be violated. This can be addressed by building time dependence into the regression model used for imputation, for example by including lagged values of the series as predictors.
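As a rough illustration of the lagged-variable idea, the sketch below fits a simple regression of each value on the previous one and uses it to fill gaps. It is a minimal example under stated assumptions, not a recommended production method: the function names `fit_lag1` and `impute_with_lag`, the choice of a single lag, and the sample series are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def fit_lag1(series: List[Optional[float]]) -> Tuple[float, float]:
    """Least-squares fit of y[t] = a + b * y[t-1].

    Only pairs where both the previous and the current value are
    observed are used, so missing points never enter the fit.
    """
    pairs = [
        (prev, cur)
        for prev, cur in zip(series, series[1:])
        if prev is not None and cur is not None
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two complete (lag, value) pairs")

    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    b = sxy / sxx              # slope on the lagged value
    a = mean_y - b * mean_x    # intercept
    return a, b


def impute_with_lag(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill missing values with predictions from the lag-1 regression.

    Gaps are filled left to right, so a run of consecutive missing
    values is predicted from the previously imputed value. A missing
    first value has no lag and is left as None.
    """
    a, b = fit_lag1(series)
    filled = list(series)
    for t in range(1, len(filled)):
        if filled[t] is None and filled[t - 1] is not None:
            filled[t] = a + b * filled[t - 1]
    return filled


# Hypothetical series with two gaps (None marks a missing value).
series = [1.0, 2.0, 3.0, None, 5.0, 6.0, None, 8.0]
filled = impute_with_lag(series)
print(filled)
```

In this sample series every observed value is exactly one more than the value before it, so the fitted relationship is approximately y[t] = y[t-1] + 1 and the two gaps are filled with values close to 4 and 7. Real data would typically call for more lags, other predictors, or a dedicated time series model.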