You are working with a dataset where city names have been entered in various formats (e.g., "NYC," "New York City," "New York"). To standardize these entries, which data cleaning technique would be most appropriate?

Data Imputation
Data Normalization
One-Hot Encoding
String Matching

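When city names appear in several formats, string matching is the most suitable data cleaning technique. Each entry is compared against a list of canonical names, either by exact lookup or by approximate (fuzzy) similarity, and replaced with the canonical form it matches. Once "NYC," "New York City," and "New York" all map to a single value, grouping and aggregation treat them as the same city. The other options address different problems: data imputation fills in missing values, data normalization rescales numeric values, and one-hot encoding converts categories into binary columns.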
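
As a rough illustration, string matching for city names could be sketched in Python as below. This is a minimal sketch under stated assumptions, not a definitive implementation: the canonical list, the alias table, the function name `standardize_city`, and the 0.8 similarity cutoff are all hypothetical choices made for this example. It uses `difflib.get_close_matches` from the standard library for the fuzzy step.

```python
from difflib import get_close_matches

# Hypothetical reference data for this example: the canonical spellings
# we want to end up with, plus known abbreviations that fuzzy matching
# alone would not catch (e.g. "NYC").
CANONICAL = ["New York City", "Los Angeles", "Chicago"]
ALIASES = {
    "nyc": "New York City",
    "new york": "New York City",
    "la": "Los Angeles",
}


def standardize_city(raw, cutoff=0.8):
    """Map a raw city string to its canonical name, if one matches."""
    key = raw.strip().lower()

    # Step 1: exact lookup of known aliases and abbreviations.
    if key in ALIASES:
        return ALIASES[key]

    # Step 2: fuzzy match against the canonical names (case-insensitive).
    lookup = {name.lower(): name for name in CANONICAL}
    matches = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if matches:
        return lookup[matches[0]]

    # Step 3: no confident match, so leave the value (trimmed) for review.
    return raw.strip()


if __name__ == "__main__":
    for entry in ["NYC", "New York City", " new york ", "Chicgo", "Springfield"]:
        print(repr(entry), "->", standardize_city(entry))
```

The alias table handles abbreviations, while the fuzzy step catches typos and casing differences. The cutoff is a trade-off: a lower value catches more misspellings but risks merging distinct cities, so unmatched values are returned unchanged rather than guessed.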
