[3.0.0-wip-115] small source-level breakage for scala consumers


Cyrille Chépélov

May 15, 2015, 6:23:16 AM
to cascadi...@googlegroups.com
Hi,

While looking for ways to alter properties at the Cascade level from a scalding app, I ran into a fun little thing with CascadingStats that would prevent rebuilding scalding against Cascading 3.0.0. (I didn't catch it earlier because we're running a modified scalding-0.13.1 built against Cascading 2.6.3, which is evicted by Cascading-3.0.0-wip-X at application build time.)

To recap, in Cascading 2.6.x we used to have:
package cascading.stats;

public abstract class CascadingStats { }
public class CascadeStats extends CascadingStats {}
public class Cascade {
  public CascadeStats getCascadeStats() { /* ... */ }
}
and then we have in scalding's CascadeJob:
package com.twitter.scalding;

abstract class Job(...) {
  protected def handleStats(statsData: CascadingStats /* bang! */) { /* ... */ }
}
abstract class CascadeJob(...) extends Job(...) {
  override def run = {
    val flows = jobs.map { _.buildFlow }
    val cascade = new CascadeConnector().connect(flows: _*)
    preProcessCascade(cascade)
    cascade.complete()
    postProcessCascade(cascade)
    val statsData = cascade.getCascadeStats /* returns a CascadeStats (inherited) */
    handleStats(statsData)
    statsData.isSuccessful
  }
}
Which is fine. Now in Cascading 3.0.0-wip-115, the definition has slightly changed:
package cascading.stats;

public abstract class CascadingStats<Child> { }
public class FlowStats { }
public class CascadeStats extends CascadingStats<FlowStats> {}
public class Cascade {
  public CascadeStats getCascadeStats() { /* ... */ } /* same */
}
When one tries to build scalding against Cascading 3.0.0 [*], the Scala compiler complains about c.t.scalding.Job#handleStats: the Cascading jar now provides CascadingStats[Child], so writing just CascadingStats is illegal (it must at least be CascadingStats[_]). Equivalent Java code would be tolerated, possibly with a warning.
I didn't see it before because, when one runs Cascading 3.0.0 with a scalding built against Cascading 2.6.3, the JVM is in charge, and it is nowhere near as strict about type correctness as the Scala compiler.
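
The Java side of this can be reproduced with a minimal sketch. It uses hypothetical stand-in classes (Stats, FlowInfo, CascadeInfo and the two handlers are invented for illustration, they are not the real Cascading types). It shows that javac accepts the raw type, and that after erasure both handlers take the same parameter class, which is presumably why an already-compiled scalding jar keeps working at runtime:

```java
import java.lang.reflect.Method;

public class RawTypeDemo {
    // Hypothetical stand-ins for the 3.0.0 shape; not the real Cascading classes.
    static abstract class Stats<Child> { abstract boolean isSuccessful(); }
    static class FlowInfo { }
    static class CascadeInfo extends Stats<FlowInfo> {
        @Override boolean isSuccessful() { return true; }
    }

    // Raw type: javac accepts this, at most with a rawtypes lint warning.
    @SuppressWarnings("rawtypes")
    static boolean handleRaw(Stats stats) { return stats.isSuccessful(); }

    // Wildcard type: the Java spelling of Scala's Stats[_].
    static boolean handleWildcard(Stats<?> stats) { return stats.isSuccessful(); }

    // Looks up a handler by name and returns its parameter class as the JVM sees it.
    static Class<?> erasedParameterType(String methodName) {
        try {
            Method m = RawTypeDemo.class.getDeclaredMethod(methodName, Stats.class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CascadeInfo stats = new CascadeInfo();
        System.out.println(handleRaw(stats));
        System.out.println(handleWildcard(stats));
        // Both handlers erase to the same parameter class.
        System.out.println(erasedParameterType("handleRaw") == erasedParameterType("handleWildcard"));
    }
}
```

The Scala compiler has no raw types, so only the wildcard form has a Scala counterpart (CascadingStats[_]).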

As I understand it, there are two ways of getting out of this:
  1. Modify Cascading to restore source compatibility for scala consumers, with something along the lines of:

    public abstract class GenericCascadingStats<Child> { /* everything which is in CascadingStats<Child> as of wip-115 */ }
    @Deprecated /* please use GenericCascadingStats<FlowStats> instead */
    public abstract class CascadingStats extends GenericCascadingStats<FlowStats> {}
    @SuppressWarnings("deprecation")
    public class CascadeStats extends CascadingStats { }

    /* this should preserve source-level compatibility with just a few warnings on the scala side */

  2. Modify scalding (and possibly application code, if it overrides the affected methods) once cascading-3.0.0 becomes a target there
… Perhaps there might be something more clever, such as a compatibility jar destined to shadow CascadingStats<?> with a non-generic CascadingStats?
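
To check that option 1 at least hangs together, here is a self-contained sketch of it. The classes are nested stand-ins that mirror the proposal, not actual Cascading code; handleStats is a hypothetical consumer written against the 2.6.x shape, and isSuccessful stands in for the real members. Whether this fits the full Cascading class hierarchy is not checked here:

```java
public class CompatShimDemo {
    // Stand-ins mirroring the proposal; not actual Cascading code.
    static class FlowStats { }

    // Would carry everything CascadingStats<Child> holds as of wip-115.
    static abstract class GenericCascadingStats<Child> {
        abstract boolean isSuccessful();
    }

    // Non-generic name kept so that existing consumer source still resolves.
    @Deprecated
    static abstract class CascadingStats extends GenericCascadingStats<FlowStats> { }

    @SuppressWarnings("deprecation")
    static class CascadeStats extends CascadingStats {
        @Override boolean isSuccessful() { return true; }
    }

    // Hypothetical consumer written against the 2.6.x shape (non-generic parameter).
    @SuppressWarnings("deprecation")
    static boolean handleStats(CascadingStats stats) { return stats.isSuccessful(); }

    public static void main(String[] args) {
        // The old-style signature still accepts a CascadeStats.
        System.out.println(handleStats(new CascadeStats()));
    }
}
```

A Scala consumer would then see a plain, non-generic CascadingStats again, at the cost of deprecation warnings.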

At least thought it'd be worth reporting.

    -- Cyrille

[*] I haven't actually done it yet, but overriding CascadeJob#run in my cascade definition was close enough.

Chris K Wensel

May 18, 2015, 1:24:10 PM
to cascadi...@googlegroups.com
I suspect you’ve started to notice many of the other API/generics changes introduced in 3.0.

It really isn’t a goal for Scalding to swap between Cascading 2.x and 3.x without code changes. 

That said, we have been very judicious with those changes, only applied where required (move from JobConf to Configuration, as one big example). 

We also introduced these changes, along with the removal of deprecated classes and methods, 5+ months ago to get feedback. (2.7.0 adds all the deprecations we could apply.)

At this point, without any major issues surfacing, we are API complete and stable for a 3.0 release after 1.5 years of development.

ckw

--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/5555C90A.70703%40transparencyrights.com.
For more options, visit https://groups.google.com/d/optout.

Oscar Boykin

May 18, 2015, 1:44:04 PM
to cascadi...@googlegroups.com
Cyrille,

Is this the only example of something that is not source compatible? I thought there were the Configuration/JobConf changes as well.

Can it still run when you put the 3.0 jar in at runtime?

We have talked about creating a scalding branch to target cascading 3.0. We do want to move forward, but we have a very large number of jobs in production, so we have to do a lot of validation before we make it the default.

--
Oscar Boykin :: @posco :: http://twitter.com/posco

Cyrille Chépélov

May 19, 2015, 7:59:37 AM
to cascadi...@googlegroups.com
Chris,

Lack of source-level compatibility makes plenty of sense, especially since binary-level compatibility does work, as we know. Honestly, I didn't look at 2.7.0, but great!

And yes, congratulations!!

Oscar,

Yes, the 3.0 jar at runtime is exactly how I work today. I use 0.13.1+PR1220, built against whatever 0.13.1 was (ISTR it was Cascading 2.6.1), and then my application's sbt evicts cascading 2.6.1 with 3.0.0-wip-115. The JVM's type

Perhaps it would make sense, for a transition period, to add a task that runs the freshly built binary against cascading-3.0.0, until it's time to start and then merge the "build against 3.0.0" branch?

I believe there are indeed other source-incompatible changes, probably of the same class. I'm not sure at this point whether any such changes would escape through a made-compatible scalding and ripple into consumer applications as further source-incompatible changes. This may be fun to look at.

    -- Cyrille

--

Cyrille CHÉPÉLOV
Chief Innovation Officer

Transparency Rights Management
15 rue Jean-Baptiste Berlier - Hall B, 75013 Paris
T : +33 1 84 16 52 74 / F : +33 1 84 17 83 34
