In a scenario involving large-scale data aggregation in a Hadoop pipeline, which tool would be most effective?

Apache HBase
Apache Hive
Apache Kafka
Apache Spark

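Of the options listed, Apache HBase is the answer given here, but its role is narrower than the aggregation itself. HBase is a NoSQL database built on HDFS that provides low-latency random read and write access to large datasets, which makes it well suited to storing aggregated results and retrieving them quickly by row key. It is not a computation engine, so the aggregation work in a Hadoop pipeline is typically run by a processing tool such as Apache Spark or Apache Hive, with HBase serving as the store for the results.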
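
To make the access pattern in this answer concrete, here is a minimal Python sketch of aggregating values per key and then reading one result back by key. It is only an illustration under stated assumptions: a plain dictionary stands in for an HBase table, no real HBase API is called, and the function name `aggregate_by_key` and the sample records are hypothetical.

```python
from collections import defaultdict


def aggregate_by_key(records):
    """Sum the values for each row key, as a batch job might before loading a table."""
    totals = defaultdict(int)
    for row_key, value in records:
        totals[row_key] += value
    return dict(totals)


# Hypothetical sample data: (row_key, value) pairs.
records = [("user1", 3), ("user2", 5), ("user1", 4)]

# A dict stands in for a key-value table here; it is not an HBase client.
table = aggregate_by_key(records)

# Point lookup by row key, the kind of read a key-value store serves quickly.
print(table["user1"])
```

In a real pipeline the summing step would run on a distributed engine and the results would be written to the store; the sketch only shows the split between computing an aggregate and retrieving it by key.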
