Cascading - strategic direction


Srikanth Adibhatla

Nov 6, 2018, 5:42:15 AM
to cascading-user
We are concluding a POC using Cascading for complex ETL workflows. We pivoted away from Spark, given the complexity and business rules involved, and are back on MapReduce (MR).

My questions are about the strategic direction of Cascading:
1. Will it still be actively supported?
2. Does it make sense to invest in Cascading in 2018, considering that most commits to the core API (sources/sinks) date back 2-3 years?


Regards
Srikanth

Chris K Wensel

Nov 6, 2018, 10:50:25 AM
to cascadi...@googlegroups.com
Hi, great questions.

1. Will it still be actively supported?

If you mean free support from the community, yes, we are still here. Cascading is also Apache licensed, so it can be modified.

If you mean paid support, no companies offer official support, but individuals can be contracted to help with issues.

2. Does it make sense to invest in Cascading in 2018, considering that most commits to the core API (sources/sinks) date back 2-3 years?

Cascading 3.x is very stable and continues to run in production at many companies. There was a minor release earlier this year.

Some are migrating off Cascading because they are moving away from MapReduce.

I am working on Cascading 4, and am building new local mode extensions.

ckw


minkymorgan

Nov 8, 2018, 12:32:15 PM
to cascading-user
Srikanth

I’d love to know your reasons for pivoting away from Spark.
Any use cases?

Andrew

Srikanth

Nov 8, 2018, 8:59:57 PM
to cascadi...@googlegroups.com
1. For complex ETL pipelines where we had to apply multiple business rules to incoming data, we found that Spark required more memory than we could throw at it.
2. We were ingesting files of up to 5 GB in some cases, and this again hit memory constraints.
3. Partly, this is because we run an internal cluster rather than something like AWS/GCP.

This also helped us calibrate the solution to our use case, since a large part of our workload is ETL rather than streaming.

Hope this helps 
Srikanth
