Presenting akka-mapreduce

89 views
Skip to first unread message

Nicolau Werneck

unread,
Apr 27, 2015, 1:01:26 AM4/27/15
to akka...@googlegroups.com
Greetings. I wanted to share with you our open source project we called 'akka-mapreduce' for the lack of a better name. We are developing a small Map/Reduce framework using Akka in Scala.
    https://github.com/projetoeureka/akka-mapreduce

The idea is to run map-reduce jobs keeping the aggregated output in memory all the way to the end. It is a more lightweight alternative to Spark, Storm or Hadoop Streaming directed to ad-hoc data processing problems.

When your data is too big you have to read it iteratively, and that makes the problem unsuitable for things like Scala parallel collections. Akka-mapreduce attempts to be a better fit for those situations when, while you can't load the input to the memory all at once, it is small enough to be processed by a monolithic application in a single multi-core machine.

We have developed a reducer component with a "decimator" actor that allows us to know that all the data has been processed. We are also creating a builder class that lets you define a processing pipeline like this

val pipeline = pipe_mapkv {
  row: String => row split raw"\s+"
} times 4 map {
  word => Some(KeyVal(word.trim.toLowerCase, 1))
} times 4 reduce (_ + _) times 8 output self

We started the project for our own purposes, but we thought it might be of interest to someone else, and decided to publish it. It is still a very young project, so there are many rough edges and things are changing very fast, but we would love to hear some feedback from the community.

There appears to be a lot of Akka map-reduce examples out there on the Internet, but few of them are in Scala, and even fewer offer a really good solution to properly reading the data and to figure out that processing has ended. We are not sure the framework will be adopted by anyone, but it might have a couple of ideas that can be borrowed by your individual projects.

Cheers,
    ++nic

Akka Team

unread,
May 4, 2015, 5:17:12 AM5/4/15
to Akka User List
Thank you for sharing Nicolau!

-Endre

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.



--
Akka Team
Typesafe - Reactive apps on the JVM
Blog: letitcrash.com
Twitter: @akkateam
Reply all
Reply to author
Forward
0 new messages