Flume

Apache Flume data transfer


What is Flume?

Flume is a data ingestion tool for collecting, aggregating, and transporting large amounts of streaming data, such as log files and events, from various sources to a centralized data store. It is highly reliable, distributed, and configurable. It is principally designed to copy streaming data (log data) from web servers to HDFS.
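To make the flow from a web server log to HDFS concrete, here is a minimal sketch of a single-agent configuration. A Flume agent wires a source to a sink through a channel, and the file below follows that pattern. The agent name (a1), the component names (r1, c1, k1), the log file path, and the HDFS URI are all hypothetical placeholders chosen for illustration; the capacity values are arbitrary and would need tuning for a real deployment.

```properties
# Hypothetical agent "a1" with one source, one channel, and one sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: follow a web server log file (the path is an assumed example).
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log
a1.sources.r1.channels = c1

# Channel: buffer events in memory between the source and the sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: write events to HDFS (the namenode address and directory are assumed).
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/weblogs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's local time to resolve the date escapes in the path.
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```

Assuming the file is saved as example.conf (a hypothetical name), an agent like this would typically be started with something along the lines of `flume-ng agent --conf conf --conf-file example.conf --name a1`. A memory channel is simple but loses buffered events if the agent process dies; a file channel is the usual choice when durability matters more than throughput.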




  Pros:
  • Provides contextual routing.
  • Reliable, fault tolerant, scalable, manageable, and customizable.
  • Supports multi-hop flows and fan-in and fan-out flows.
  • Flume can be scaled horizontally.
  Cons:
  • Weak ordering guarantee.
  • Does not guarantee that delivered messages are unique; in many scenarios, duplicate messages may occasionally appear.
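
The contextual routing and fan-out listed under Pros can be sketched with a channel selector on the source. The fragment below is an illustrative assumption rather than a complete agent: it presumes a source r1 on agent a1 is already defined, that channels c1 and c2 each feed their own sink, and that every event carries a header named logtype (a hypothetical header that something upstream, such as an interceptor or the sending client, would have to set).

```properties
# Hypothetical names: agent a1, source r1, channels c1 and c2.
a1.channels = c1 c2
a1.sources.r1.channels = c1 c2

# Route each event according to the value of its "logtype" header.
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = logtype
a1.sources.r1.selector.mapping.error = c1
a1.sources.r1.selector.mapping.access = c2
# Events with any other value, or without the header, go to the default channel.
a1.sources.r1.selector.default = c2
```

If no selector is configured, the source uses the replicating selector, which copies every event to all of its channels; that is the plain fan-out case.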