In a case study where Hive is used to analyze web log data, which storage format would give the best query performance?

  • Avro
  • ORC (Optimized Row Columnar)
  • Parquet
  • SequenceFile
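For analyzing web log data in Hive, ORC (Optimized Row Columnar) is the best choice among these options. ORC stores data column by column, so a query reads only the columns it references, and similar values within a column compress well. Each file also carries lightweight indexes with per-column minimum and maximum values, which lets Hive push filter predicates down to the reader and skip blocks that cannot match. Avro and SequenceFile are row-oriented, so an analytical query has to read entire rows even when it needs only a few fields. Parquet is also columnar and performs well, but ORC was originally developed for Hive and is closely integrated with its optimizations, such as vectorized execution.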
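The effect of columnar layout and predicate pushdown can be illustrated with a small toy model. This is only a sketch of the idea, not ORC's actual file layout or reader: the `Stripe` class, the `build_stripes` and `count_urls_with_status` helpers, and the sample log rows are all hypothetical names and data made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Stripe:
    """A block of rows stored column by column, plus min/max stats for `status`."""
    columns: dict      # column name -> list of values for this block
    min_status: int
    max_status: int


def build_stripes(rows, stripe_size):
    """Split row-oriented log records into columnar stripes with statistics."""
    stripes = []
    for start in range(0, len(rows), stripe_size):
        chunk = rows[start:start + stripe_size]
        # Pivot the chunk from a list of rows into one list per column.
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        statuses = columns["status"]
        stripes.append(Stripe(columns, min(statuses), max(statuses)))
    return stripes


def count_urls_with_status(stripes, status):
    """Count hits per URL for one HTTP status.

    Returns (counts, stripes_read). A stripe whose min/max range cannot
    contain `status` is skipped without touching its data, and only the
    `url` and `status` columns are read from the stripes that remain.
    """
    counts = {}
    stripes_read = 0
    for stripe in stripes:
        if status < stripe.min_status or status > stripe.max_status:
            continue  # predicate pushdown: this stripe cannot match
        stripes_read += 1
        for url, row_status in zip(stripe.columns["url"], stripe.columns["status"]):
            if row_status == status:
                counts[url] = counts.get(url, 0) + 1
    return counts, stripes_read


# Hypothetical web log records, invented for this example.
rows = [
    {"ip": "10.0.0.1", "url": "/home", "status": 200, "bytes": 512},
    {"ip": "10.0.0.2", "url": "/cart", "status": 200, "bytes": 734},
    {"ip": "10.0.0.3", "url": "/home", "status": 304, "bytes": 0},
    {"ip": "10.0.0.4", "url": "/old", "status": 404, "bytes": 120},
    {"ip": "10.0.0.5", "url": "/old", "status": 404, "bytes": 120},
    {"ip": "10.0.0.6", "url": "/api", "status": 500, "bytes": 64},
]
stripes = build_stripes(rows, stripe_size=2)

counts, stripes_read = count_urls_with_status(stripes, 404)
print(counts, f"read {stripes_read} of {len(stripes)} stripes")
```

In this toy model the query for status 404 skips the first stripe, whose statuses are all 200, and never reads the `ip` or `bytes` columns. A row-oriented format would have to scan every field of every record to answer the same question.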