Akka Streams as an ETL tool?


Beno

Jul 14, 2016, 12:28:20 AM
to Akka User List
I've been using Akka Streams for a use case I don't see much written about: a small-to-moderate-scale ETL or simple processing pipeline. I'm relatively new to it all, so I wanted to check whether I'm missing something that would change my opinion, which is that Akka Streams is among the best tools for data cleaning. The graph DSL is so easy to code with and reason about.

The details: batch processing to clean and curate data, with external RESTful requests as part of the flow.

Source[A] (read from file or DB) ~> Flow[A, B] (some transformation function) ~> Flow[B, C] (by way of a RESTful request/response) ~> Flow[C, D] (graph query) ~> Sink[D] (to DB)

Where Source might be 50,000 lines in a file or rows in a table. 
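
For readers following along, the pipeline shape above can be sketched in Scala roughly as follows. This is only an illustration under stated assumptions, not the poster's code: the record types, `restLookup`, and `graphQuery` are hypothetical stand-ins for the real REST and graph-query calls, an in-memory list stands in for the file or table, and `Sink.seq` stands in for the database sink. It assumes Akka 2.6 or later, where an implicit `ActorSystem` is enough to run a stream; on the 2.4 line current at the time of this thread, an implicit `ActorMaterializer` would also be needed.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.{ExecutionContext, Future}

object EtlSketch {
  // Hypothetical record types standing in for A, B, C and D.
  final case class Raw(line: String)                           // A
  final case class Parsed(id: Int, name: String)               // B
  final case class Enriched(id: Int, name: String, score: Int) // C
  final case class Curated(id: Int, label: String)             // D

  // Flow[A, B]: a pure transformation, here parsing "id,name" lines.
  val parse: Flow[Raw, Parsed, NotUsed] = Flow[Raw].map { raw =>
    val parts = raw.line.split(",", 2)
    Parsed(parts(0).trim.toInt, parts(1).trim)
  }

  // Hypothetical stand-in for the RESTful request/response step.
  def restLookup(p: Parsed)(implicit ec: ExecutionContext): Future[Enriched] =
    Future(Enriched(p.id, p.name, p.name.length))

  // Hypothetical stand-in for the graph query step.
  def graphQuery(e: Enriched)(implicit ec: ExecutionContext): Future[Curated] =
    Future(Curated(e.id, s"${e.name}:${e.score}"))

  // Source ~> parse ~> REST ~> graph query ~> Sink.
  // mapAsync keeps at most `parallelism` calls in flight and preserves order.
  def run(lines: List[String])(implicit system: ActorSystem): Future[Seq[Curated]] = {
    import system.dispatcher
    Source(lines)
      .map(Raw(_))
      .via(parse)
      .mapAsync(parallelism = 4)(p => restLookup(p))
      .mapAsync(parallelism = 4)(e => graphQuery(e))
      .runWith(Sink.seq) // assumption: collects in memory instead of writing to a DB
  }
}
```

Because each asynchronous stage is a `mapAsync` with a fixed parallelism, the source is only pulled as fast as the slowest downstream call allows, which is the property that makes this shape attractive for a 50,000-row batch.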

Thanks for any feedback

James Matlik

Jul 14, 2016, 7:22:37 AM
to akka...@googlegroups.com

Using Akka Streams for ETL is our primary use case. The back pressure support has been extremely useful in helping us maximize throughput while avoiding overwhelming the multiple external REST services we query. By maintaining dedicated, fixed-size dispatcher pools, we can easily use our legacy blocking client SDKs over a fixed maximum number of concurrent connections/requests. The ETL can then process as fast as it possibly can within those constraints.
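
A minimal sketch of that pattern, under stated assumptions, might look like the following. The dispatcher name `rest-client-dispatcher`, the pool size, and `blockingLookup` are all hypothetical; the latter stands in for a legacy blocking SDK call. This is not the poster's actual code, only one common way to combine a fixed-size thread pool with `mapAsync`.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.Flow
import com.typesafe.config.{Config, ConfigFactory}
import scala.concurrent.{ExecutionContext, Future}

object BoundedBlockingCalls {
  // Assumed cap on concurrent connections/requests to the external service.
  val maxConcurrent = 4

  // A dedicated fixed-size pool, so blocking calls never starve the
  // default dispatcher. The dispatcher name is a hypothetical choice.
  val config: Config = ConfigFactory.parseString(
    s"""
    rest-client-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor { fixed-pool-size = $maxConcurrent }
      throughput = 1
    }
    """).withFallback(ConfigFactory.load())

  // Hypothetical stand-in for a legacy blocking client SDK call.
  def blockingLookup(id: Int): String = {
    Thread.sleep(10) // simulates blocking network I/O
    s"result-$id"
  }

  // At most maxConcurrent lookups are in flight at any time; when all are
  // busy, mapAsync stops pulling and back pressure reaches the source.
  def lookupFlow(implicit system: ActorSystem): Flow[Int, String, NotUsed] = {
    val blockingEc: ExecutionContext =
      system.dispatchers.lookup("rest-client-dispatcher")
    Flow[Int].mapAsync(maxConcurrent)(id => Future(blockingLookup(id))(blockingEc))
  }
}
```

The actor system has to be created with this configuration, for example `ActorSystem("etl", BoundedBlockingCalls.config)`, so that the dispatcher lookup succeeds. Using `mapAsyncUnordered` instead would trade output ordering for somewhat better throughput when call latencies vary.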

We found the learning curve to be on the steep side, but once it clicks, the power and ease of use Streams provides is... impressive... refreshing... exhilarating... addicting.. take your pick.



Patrik Nordwall

Jul 16, 2016, 10:31:03 AM
to akka...@googlegroups.com
Akka Streams is an excellent choice for this use case. I would even say that it is the canonical use case for Akka Streams. We have probably not highlighted that enough, yet.

Cheers,
Patrik

Akka Team

Jul 22, 2016, 5:46:21 AM
to Akka User List
Hi James

On Thu, Jul 14, 2016 at 1:22 PM, James Matlik <james....@gmail.com> wrote:

Using Akka Streams for ETL is our primary use case. The back pressure support has been extremely useful in helping us maximize throughput while avoiding overwhelming the multiple external REST services we query. By maintaining dedicated, fixed-size dispatcher pools, we can easily use our legacy blocking client SDKs over a fixed maximum number of concurrent connections/requests. The ETL can then process as fast as it possibly can within those constraints.

We found the learning curve to be on the steep side, but once it clicks, the power and ease of use Streams provides is... impressive... refreshing... exhilarating... addicting.. take your pick.

Thank you for the kind words. Are you maybe interested in contributing some material around ETL? Maybe a blog post, or even as simple as adding some patterns into our cookbook: http://doc.akka.io/docs/akka/2.4/scala/stream/stream-cookbook.html ?

-Endre




James Matlik

Jul 25, 2016, 11:01:27 AM
to akka...@googlegroups.com

Hello Endre,

I would be happy to contribute some documentation when I have some time. Unfortunately, my current code base is not OSS-friendly, so it will take some effort to provide meaningful examples of the more complex patterns I've been using. Do you have a preference for how cookbook examples are submitted?

- James

Akka Team

Jul 27, 2016, 10:08:33 AM
to Akka User List
Hi James,

If you want to contribute to the cookbook, just follow the style of the samples already in there: reasonably short recipes for a pattern, each with a bit of sample code and a bit of explanation.

Let us know if you start out but get stuck for some reason!

--
Johan