While preparing data for a machine learning model, you realize that the 'Height' column has some missing values. Upon closer inspection, you find that these missing values often correspond to records where the 'Age' column has values less than 1 year. What might be a reasonable way to handle these missing values?

Impute missing values with the mean height
Impute missing values with 0
Leave missing values as they are
Impute missing values based on 'Age'

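In this case, leaving the missing values as they are is a reasonable choice. The gaps are not random: they are concentrated in records where 'Age' is less than 1 year. Imputing the mean height would assign infants a value dominated by older individuals, and imputing 0 would assign a physically impossible height, so both may introduce bias. Imputing based on 'Age' is possible but should be done carefully, because infants have very different height characteristics than adults. Depending on the context and dataset size, leaving the missing values untouched might be the best choice.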
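
The bias from mean imputation can be illustrated with a minimal pandas sketch. The data below is a hypothetical toy example (heights in cm, invented for illustration), and the 'Height_missing' indicator column is an assumption about one way to keep the gaps while still recording them, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: heights in cm, missing for the two infants (Age < 1).
df = pd.DataFrame({
    "Age": [0.3, 0.8, 25.0, 40.0, 31.0],
    "Height": [np.nan, np.nan, 170.0, 180.0, 160.0],
})

infant_mask = df["Age"] < 1

# Mean imputation: the mean is computed from the observed (adult) heights,
# so every infant is assigned an adult-sized value.
mean_imputed = df["Height"].fillna(df["Height"].mean())
print(mean_imputed[infant_mask].tolist())

# Alternative: leave the NaN values in place and record the missingness
# in a separate boolean column (assumed name: 'Height_missing').
df["Height_missing"] = df["Height"].isna()
print(df)
```

In this toy data the infants would receive the adult average under mean imputation, which is the bias the explanation warns about. Whether NaN values can stay in the final feature matrix depends on the model being trained, since not every estimator accepts missing inputs.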

