Flame-MR is a new MapReduce framework that improves the performance of Hadoop. It is based on an event-driven architecture that makes better use of memory and computing resources, and it applies several optimizations to the overall MapReduce process. One of these is reducing memory copies and object creation, which lowers the overhead of the Java garbage collector. Flame-MR also pipelines data movements through network and disk, interleaving computation, network utilization and I/O operations. Finally, it applies specific optimization techniques to each MapReduce phase, including in-memory object sorting, a k-way merge algorithm and binary comparisons of data fields.
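To make the last two techniques more concrete, the following is a minimal, self-contained sketch of a k-way merge over sorted runs whose keys are compared directly in their serialized (binary) form. It is not taken from the Flame-MR source code: the class `KWayMergeSketch`, the helper `Cursor` and the method names are hypothetical and only illustrate the general idea under the assumption that keys are byte arrays ordered lexicographically.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch only; names are hypothetical and not part of Flame-MR.
public class KWayMergeSketch {

    // Compares two serialized keys byte by byte (bytes treated as unsigned),
    // so no key object has to be deserialized for the comparison.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // the shorter key sorts first on a shared prefix
    }

    // Tracks the current position inside one sorted run.
    private static final class Cursor {
        final List<byte[]> run;
        int pos;

        Cursor(List<byte[]> run) {
            this.run = run;
        }

        byte[] head() {
            return run.get(pos);
        }
    }

    // Merges k sorted runs into one sorted list using a min-heap of cursors.
    static List<byte[]> merge(List<List<byte[]>> runs) {
        PriorityQueue<Cursor> heap =
                new PriorityQueue<>((c1, c2) -> compareBytes(c1.head(), c2.head()));
        for (List<byte[]> run : runs) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run));
            }
        }
        List<byte[]> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();      // run with the smallest current key
            out.add(c.head());
            c.pos++;
            if (c.pos < c.run.size()) {
                heap.add(c);             // re-insert with its next key
            }
        }
        return out;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<List<byte[]>> runs = Arrays.asList(
                Arrays.asList(key("apple"), key("melon")),
                Arrays.asList(key("banana"), key("zebra")),
                Arrays.asList(key("cherry")));
        for (byte[] k : merge(runs)) {
            System.out.println(new String(k, StandardCharsets.UTF_8));
        }
    }
}
```

With k runs and n keys in total, the heap keeps the merge at O(n log k) comparisons, and comparing raw bytes avoids creating one object per key, which is in line with the garbage-collection concern mentioned above.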
Figure: Flame-MR workflow overview
Flame-MR is implemented entirely in Java, which makes it portable across systems. Moreover, it is fully integrated with the Hadoop ecosystem, running on Hadoop YARN and using HDFS for data storage. Flame-MR keeps full compatibility with the Hadoop API, which means that existing MapReduce applications can be executed with Flame-MR without source code modifications.