You are working with a dataset where city names have been entered in various formats (e.g., "NYC," "New York City," "New York"). To standardize these entries, which data cleaning technique would be most appropriate?

  • Data Imputation
  • Data Normalization
  • One-Hot Encoding
  • String Matching
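String matching is the most suitable technique here. It compares each entry against a set of known names, either exactly or approximately (fuzzy matching), and maps every variant to a single canonical form, so "NYC," "New York," and "New York City" all end up as one value. The other options address different problems: data imputation fills in missing values, data normalization rescales numeric values, and one-hot encoding converts categories into binary columns. Once the names are consistent, grouping and aggregating by city gives correct results.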
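As a rough illustration, the sketch below combines an exact alias lookup with fuzzy matching from Python's standard `difflib` module. The canonical list, the alias table, the `standardize_city` helper, and the 0.8 similarity cutoff are all assumptions made for this example, not part of the original question; a real dataset would need its own reference list and a tuned threshold.

```python
from difflib import get_close_matches

# Hypothetical reference data for illustration only.
CANONICAL = ["New York City", "Los Angeles", "San Francisco"]
ALIASES = {
    "nyc": "New York City",
    "new york": "New York City",
    "la": "Los Angeles",
    "sf": "San Francisco",
}


def standardize_city(raw, cutoff=0.8):
    """Map a raw city string to a canonical name, or return it unchanged."""
    # Normalize case, collapse whitespace, and drop stray trailing punctuation.
    key = " ".join(raw.lower().split()).strip(" .,")

    # Step 1: exact match against known abbreviations and short forms.
    if key in ALIASES:
        return ALIASES[key]

    # Step 2: fuzzy match against the canonical names to catch typos.
    lookup = {name.lower(): name for name in CANONICAL}
    matches = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if matches:
        return lookup[matches[0]]

    # No confident match: leave the value as-is for manual review.
    return raw


cities = ["NYC,", "New York", "new york cty", "Boston"]
cleaned = [standardize_city(c) for c in cities]
```

Abbreviations such as "NYC" share too few characters with the full name for similarity scoring to catch them reliably, which is why the sketch checks the alias table first and uses fuzzy matching only for near-misses such as typos.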