When handling time-series data in Hadoop, which combination of file format and compression would optimize performance?

  • Avro with Bzip2
  • ORC with LZO
  • Parquet with Snappy
  • SequenceFile with Gzip
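Of the options listed, Parquet with Snappy compression is the best fit for time-series data in Hadoop. Parquet stores data by column, so an analytical query reads only the columns it needs (for example, a timestamp and one metric) instead of whole records. Snappy favors compression and decompression speed over compression ratio, which keeps CPU overhead low during scans. Bzip2 and Gzip compress more tightly but are slower, and Avro and SequenceFile are row-oriented formats.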
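The column-versus-row argument can be illustrated with a small standard-library sketch. It is not Parquet or Snappy: the record layout, field names, and sample data are hypothetical, and `zlib` at level 1 stands in for a fast codec only because Snappy is not in the Python standard library. It shows that averaging one field touches far fewer bytes when each field is stored contiguously.

```python
import struct
import zlib

# Hypothetical time-series records: (epoch_seconds, sensor_id, reading).
records = [(1_700_000_000 + 60 * i, i % 4, 20.0 + (i % 10) * 0.5)
           for i in range(1000)]

# 8-byte timestamp, 4-byte sensor id, 8-byte reading, no padding.
ROW = struct.Struct("<qid")


def row_layout(rows):
    """Store whole records back to back, as a row-oriented file would."""
    return b"".join(ROW.pack(*r) for r in rows)


def column_layout(rows):
    """Store each field contiguously, as a columnar file would."""
    n = len(rows)
    ts, sensors, readings = zip(*rows)
    return {
        "ts": struct.pack(f"<{n}q", *ts),
        "sensor": struct.pack(f"<{n}i", *sensors),
        "reading": struct.pack(f"<{n}d", *readings),
    }


def mean_reading_from_rows(blob):
    """Decode every record to reach the reading field."""
    values = [rec[2] for rec in ROW.iter_unpack(blob)]
    return sum(values) / len(values), len(blob)


def mean_reading_from_columns(cols):
    """Decode only the 'reading' column."""
    blob = cols["reading"]
    values = struct.unpack(f"<{len(blob) // 8}d", blob)
    return sum(values) / len(values), len(blob)


row_blob = row_layout(records)
cols = column_layout(records)

row_mean, row_scanned = mean_reading_from_rows(row_blob)
col_mean, col_scanned = mean_reading_from_columns(cols)

# zlib level 1 is only a stand-in for a speed-oriented codec such as Snappy.
packed = {name: zlib.compress(data, 1) for name, data in cols.items()}

print("row layout bytes scanned:   ", row_scanned)
print("column layout bytes scanned:", col_scanned)
print("compressed column sizes:    ", {k: len(v) for k, v in packed.items()})
```

In an actual Hadoop job the format and codec are chosen through the writer's settings rather than hand-packed bytes; the sketch only makes the access pattern visible.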