Latest wip 3.0

Chris K Wensel

Mar 19, 2015, 12:42:30 PM
to cascadi...@googlegroups.com
hey all

Still making good progress on 3.0 and Apache Tez support.

Do note the next few WIPs will have a dependency on JGraphT 0.9.1 SNAPSHOT, so you may need to add yet another repo to your build:

We will release 3.0 without the SNAPSHOT dependency (I rather dislike the snapshot model), but for now it's a necessary evil.

FWIW, we have updated JGraphT to correct two things long-time users may have hit more than once.

The ‘no such vertex’ messages have been updated to name the offending vertex.

Graphs can now be backed by an IdentityHashMap. This will both improve planning performance and tolerate things like Taps and Pipes mistakenly changing their #hashCode() value, which was the root of nearly all ‘no such vertex’ errors.
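
To illustrate the #hashCode() point, here is a minimal sketch in plain Java. It is not Cascading or JGraphT code: the Vertex class is a hypothetical stand-in for a Tap or Pipe whose hash depends on mutable state, and the two maps stand in for whatever backs the graph.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

public class IdentityVertexDemo {
    // Hypothetical stand-in for a Tap or Pipe: equals()/hashCode() depend on a mutable field.
    static final class Vertex {
        String name;

        Vertex(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Vertex && Objects.equals(name, ((Vertex) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Registers a vertex in the given map, mutates it, then reports whether it can still be found.
    static boolean foundAfterMutation(Map<Vertex, String> vertices) {
        Vertex v = new Vertex("source");
        vertices.put(v, "registered");
        v.name = "renamed"; // hashCode() changes after the vertex was added
        return vertices.containsKey(v);
    }

    public static void main(String[] args) {
        // A HashMap looks the key up by its new hash and misses the stored entry.
        System.out.println("HashMap: " + foundAfterMutation(new HashMap<>()));
        // An IdentityHashMap uses reference identity, so the mutation does not matter.
        System.out.println("IdentityHashMap: " + foundAfterMutation(new IdentityHashMap<>()));
    }
}
```

The HashMap lookup fails after the mutation, which is the same failure mode as a ‘no such vertex’ error, while the IdentityHashMap lookup still succeeds because it never calls hashCode() or equals() on the key.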

I will also add that one user is having some success with Scalding on Cascading 3.0 and Tez, with some crazy complex DAGs. Hope to report complete success soon.

ckw

Chris K Wensel

Apr 3, 2015, 5:12:30 PM
to cascadi...@googlegroups.com
The JGraphT SNAPSHOT dependency has been removed. 

We can’t get a firm commitment from that community on when they will make a 0.9.1 release, so we have rolled back the version to 0.9.0 but have implemented a workaround for injecting the IdentityHashMap into the backing graphs.

This is a little wasteful on the client side (the original map is gc’d), but it won’t affect anything cluster side.

On another note, we are working on one additional refactoring around the ProcessFlow class, decoupling it from Hadoop to allow general-purpose use. This is the last planned feature addition to 3.0.

That said, a less-than-optimal plan is being generated on Tez in some cases where HashJoins are used. These plans might be aggravating a previously unknown issue with Tez, so we are attempting to update the rule set to compensate (and produce better plans).

So if you haven’t tested with Cascading 3 WIP on MR or Tez, please do give it a shot; we are very close to getting a 3.0.0 release out. I would love to resolve as many issues as possible before we do.

ckw


--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/A867EAD2-4023-47E3-BA3E-A09CB0A3628C%40wensel.net.
For more options, visit https://groups.google.com/d/optout.

Chris K Wensel

Apr 3, 2015, 6:17:07 PM
to cascadi...@googlegroups.com
oh, and I forgot the bad news…

we are attempting to update the Gradle build to Gradle 2.x, along with the other elements of the SDK. 

most everyone won’t notice, but if you are porting Cascading, be warned.

ckw


Chris K Wensel

Apr 6, 2015, 11:31:23 PM
to cascadi...@googlegroups.com
Ok, last message on this.

Seems JGraphT 0.9.1 was released over the weekend, so I've updated the latest WIP to include it.

If you are on Maven, note that explicitly declaring 0.9.0 will pin the build to that release. On Ivy the reverse will happen: JGraphT will be updated to 0.9.1.

The former situation will cause class loading errors and halt the JVM.

The best solution is to let the transitive dependencies naturally resolve.
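
If you suspect the wrong JGraphT jar is being picked up, one way to check is to ask the JVM where a class was loaded from. This is only a diagnostic sketch using standard JDK calls; the WhichJar class and its locate helper are hypothetical names, and org.jgrapht.Graph is used as the probe class on the assumption that JGraphT is on your classpath.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

public class WhichJar {
    // Returns where the named class was loaded from, or empty if the class
    // cannot be loaded or has no code source (e.g. JDK bootstrap classes).
    static Optional<URL> locate(String className) {
        try {
            Class<?> type = Class.forName(className);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Default probe: the core JGraphT interface; pass another class name to check it instead.
        String name = args.length > 0 ? args[0] : "org.jgrapht.Graph";
        String where = locate(name).map(URL::toString).orElse("not found on classpath");
        System.out.println(name + " -> " + where);
    }
}
```

Run it with the same classpath as your application; the jar file name in the printed location normally includes the version that was actually resolved.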

ckw

