When handling time-series data in Hadoop, which combination of file format and compression would optimize performance?

Avro with Bzip2
ORC with LZO
Parquet with Snappy
SequenceFile with Gzip

For time-series data in Hadoop, Parquet with Snappy is generally the best-performing of the four combinations. Parquet stores data by column, so an analytical query reads only the columns it needs rather than every full record. Snappy compresses and decompresses quickly, trading a somewhat lower compression ratio than Gzip or Bzip2 for speed, which suits workloads that scan large time ranges repeatedly.
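The benefit of columnar storage for analytical queries can be illustrated with a small standard-library sketch. It does not use Parquet or Snappy; it only mimics row-oriented and column-oriented layouts in memory and counts the bytes a query has to read to average a single field. The schema (timestamp, sensor_id, value) and all function names are hypothetical, chosen for illustration.

```python
import struct

# Hypothetical schema: (timestamp: int64, sensor_id: int32, value: float64).
# "<" means little-endian with no padding, so each record is 8 + 4 + 8 = 20 bytes.
ROW_FORMAT = "<qid"


def make_records(n):
    """Generate n synthetic readings, one per second."""
    return [(1_700_000_000 + i, i % 4, 20.0 + (i % 10) * 0.5) for i in range(n)]


def to_row_layout(records):
    """Row-oriented: the fields of each record are stored together."""
    return b"".join(struct.pack(ROW_FORMAT, *r) for r in records)


def to_column_layout(records):
    """Column-oriented: each field is stored contiguously in its own buffer."""
    n = len(records)
    return {
        "timestamp": struct.pack(f"<{n}q", *(r[0] for r in records)),
        "sensor_id": struct.pack(f"<{n}i", *(r[1] for r in records)),
        "value": struct.pack(f"<{n}d", *(r[2] for r in records)),
    }


def mean_value_from_rows(buf):
    """Averaging 'value' requires walking every full record."""
    values = [rec[2] for rec in struct.iter_unpack(ROW_FORMAT, buf)]
    return sum(values) / len(values), len(buf)


def mean_value_from_columns(cols):
    """Averaging 'value' touches only the 'value' buffer."""
    buf = cols["value"]
    values = struct.unpack(f"<{len(buf) // 8}d", buf)
    return sum(values) / len(values), len(buf)


records = make_records(1000)
row_mean, row_bytes = mean_value_from_rows(to_row_layout(records))
col_mean, col_bytes = mean_value_from_columns(to_column_layout(records))
print(f"row layout:    mean={row_mean}, bytes read={row_bytes}")
print(f"column layout: mean={col_mean}, bytes read={col_bytes}")
```

Both layouts give the same mean, but the column layout reads 8 bytes per record instead of 20. Real Parquet files add encoding, compression, and per-column statistics on top of this idea, none of which the sketch models.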
