There is some example code:
but none of it actually provides the prototype you are looking for. Basically, you can launch an mrjob job programmatically by doing something like this (using mr_wc.py as a template):
#### CODE BEGINS ####
from mrjob.job import MRJob


class MRWordCountUtility(MRJob):

    def __init__(self, *args, **kwargs):
        super(MRWordCountUtility, self).__init__(*args, **kwargs)
        self.chars = 0
        self.words = 0
        self.lines = 0

    def mapper(self, _, line):
        # Don't actually yield anything for each line. Instead, collect them
        # and yield the sums when all lines have been processed. The results
        # will be collected by the reducer.
        self.chars += len(line) + 1  # +1 for newline
        self.words += sum(1 for word in line.split() if word.strip())
        self.lines += 1

    def mapper_final(self):
        yield('chars', self.chars)
        yield('words', self.words)
        yield('lines', self.lines)

    def reducer(self, key, values):
        yield(key, sum(values))


if __name__ == '__main__':
    job = MRWordCountUtility(args=['-r', 'emr'])
    with job.make_runner() as runner:
        runner.run()
#### CODE ENDS ####
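Note the lines after "if __name__ == '__main__':". Instead of the usual MRWordCountUtility.run(), they construct the job with explicit arguments ('-r', 'emr' selects the EMR runner), create a runner with make_runner(), and call run() on it. If this doesn't make sense to you, you should probably read up on the basics of object-oriented programming in Python before you copy & paste stuff and then find that it doesn't magically work.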
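If you want to see what the mapper / mapper_final / reducer flow above computes before involving EMR at all, here is a rough sketch in plain Python that does not use mrjob. WordCountSketch and run_locally are hypothetical names made up for this illustration, and the "one mapper instance per input split" behaviour is a simplifying assumption about how the framework schedules tasks, not a description of mrjob's internals.

```python
from collections import defaultdict


class WordCountSketch:
    """Plain-Python stand-in for MRWordCountUtility (hypothetical, no mrjob)."""

    def __init__(self):
        self.chars = 0
        self.words = 0
        self.lines = 0

    def mapper(self, _, line):
        # Accumulate per-task totals; nothing is emitted per line.
        self.chars += len(line) + 1  # +1 for newline
        self.words += len(line.split())
        self.lines += 1
        return iter(())

    def mapper_final(self):
        # Emit the totals once the task has seen all of its lines.
        yield 'chars', self.chars
        yield 'words', self.words
        yield 'lines', self.lines

    def reducer(self, key, values):
        yield key, sum(values)


def run_locally(splits):
    """Simulate the job: one mapper task per split, then one reduce pass."""
    grouped = defaultdict(list)
    for split in splits:
        task = WordCountSketch()  # assumption: one instance per split
        for line in split:
            for key, value in task.mapper(None, line):
                grouped[key].append(value)
        for key, value in task.mapper_final():
            grouped[key].append(value)

    reducer_task = WordCountSketch()
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer_task.reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


counts = run_locally([["hello world", "foo"], ["a b c"]])
print(counts)  # {'chars': 22, 'lines': 3, 'words': 6}
```

Each simulated mapper task emits its three partial sums once, and the reducer adds the partial sums per key. That is why the real job keeps its counters on self and only yields from mapper_final.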