For a Hadoop pipeline processing log data from multiple sources, what would be the best approach for data ingestion and analysis?

  • Apache Flink
  • Apache Flume
  • Apache Sqoop
  • Apache Storm
Among these options, the best fit for ingesting log data from multiple sources into a Hadoop pipeline is Apache Flume. Flume is purpose-built for reliable, distributed collection and aggregation of streaming log data: each agent moves events through a source-channel-sink pipeline into HDFS, where they can be analyzed with tools such as Hive, Spark, or MapReduce. By contrast, Apache Sqoop handles bulk transfer between Hadoop and relational databases, while Apache Flink and Apache Storm are stream-processing engines rather than ingestion tools, so they address the analysis stage rather than the collection stage. A minimal Flume agent configuration is sketched below.
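
As a rough sketch (the agent name, log path, and HDFS URL are placeholders, and a file channel or Taildir source would be more robust in production), a single Flume agent tailing an application log into HDFS could be configured like this:

  # Name the components on this agent
  agent1.sources  = src1
  agent1.channels = ch1
  agent1.sinks    = sink1

  # Source: tail an application log file
  agent1.sources.src1.type = exec
  agent1.sources.src1.command = tail -F /var/log/app/app.log
  agent1.sources.src1.channels = ch1

  # Channel: buffer events in memory (use a file channel for durability)
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 10000
  agent1.channels.ch1.transactionCapacity = 1000

  # Sink: write events into HDFS, rolling files every 5 minutes
  agent1.sinks.sink1.type = hdfs
  agent1.sinks.sink1.channel = ch1
  agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/logs/%Y-%m-%d
  agent1.sinks.sink1.hdfs.fileType = DataStream
  agent1.sinks.sink1.hdfs.rollInterval = 300
  agent1.sinks.sink1.hdfs.useLocalTimeStamp = true

The agent would then be started with something like: flume-ng agent --name agent1 --conf conf --conf-file log-agent.conf. With multiple log sources, you would typically run one such agent per source host (or a tier of agents chained via Avro sources and sinks) feeding a common HDFS landing directory, with the analysis layer running on top of that data.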