mrjob v0.6.8 provides full support for Spark. You can now launch Spark code with any runner (except Google Cloud Dataproc; we're still working on that). mrjob not only integrates with existing features, but also makes mrjob-specific features (e.g. setup scripts) work seamlessly inside Spark:
As if that weren’t enough, this release also adds a Spark runner that can run regular old MRJobs (originally designed to run on Hadoop Streaming) on any Spark cluster. So if your team is moving from Hadoop to a non-Hadoop Spark cluster (e.g. one running on Mesos or Kubernetes), you can take your old MRJobs with you without rewriting a line of code. For more info, see:
As with all releases, there are also a number of bugfixes and small improvements; for the details, see:
-Dave