mrjob v0.6.8 is out, big news for Spark users

Dave Marin

Apr 26, 2019, 2:40:45 PM
to mr...@googlegroups.com
mrjob v0.6.8 provides full support for Spark. You can now launch Spark code with any runner (except Google Cloud Dataproc; still working on that). mrjob not only integrates with existing features, but also makes mrjob-specific features (e.g. setup scripts) work seamlessly inside Spark:


As if that weren’t enough, this release also adds a Spark runner that can run regular old MRJobs (originally designed to run on Hadoop Streaming) on any Spark cluster. So if your team is moving from Hadoop to a non-Hadoop Spark cluster (e.g. one running on Mesos or Kubernetes), you can take your old MRJobs with you, without rewriting a line of code. For more info, see:


As with all releases, there are also a number of bugfixes and small improvements; for the details, see:


-Dave
