Running a different mapper on different input files

Dane

Apr 9, 2018, 11:58:43 AM
to mrjob
Good day,

I have an assignment in which I have to do matrix multiplication. I am given two files in the matrix-market format, for example:

matrixA.list

2 3      <- dimensions
0 0 1   <- entries
0 1 2
0 2 3
1 0 4
1 1 5
1 2 6

I am struggling to develop the (key, value) pairs.
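
For reference, one common one-pass scheme keys every partial product by the output cell (i, k). The sketch below simulates that scheme in plain Python rather than in mrjob. It assumes the mapper already knows which matrix an entry came from, as well as the output dimensions (m rows of A, p columns of B). The function names and the sample matrix B are hypothetical.

```python
from collections import defaultdict


def map_entry(tag, row, col, value, m, p):
    """Emit ((i, k), (tag, j, value)) pairs for one matrix entry.

    tag is 'A' or 'B'; m is the row count of A and p the column count of B.
    """
    if tag == 'A':
        # A[i][j] contributes to every output cell in row i.
        for k in range(p):
            yield (row, k), ('A', col, value)
    else:
        # B[j][k] contributes to every output cell in column k.
        for i in range(m):
            yield (i, col), ('B', row, value)


def reduce_cell(values):
    """Multiply A and B values that share the inner index j, then sum.

    values must be a list, because it is iterated twice.
    """
    a = {j: v for tag, j, v in values if tag == 'A'}
    b = {j: v for tag, j, v in values if tag == 'B'}
    return sum(a[j] * b[j] for j in a if j in b)


def multiply(a_entries, b_entries, m, p):
    """Simulate the shuffle: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for tag, entries in (('A', a_entries), ('B', b_entries)):
        for row, col, value in entries:
            for key, val in map_entry(tag, row, col, value, m, p):
                groups[key].append(val)
    return {key: reduce_cell(vals) for key, vals in groups.items()}


# Matrix A from the post (2 x 3) and a made-up 3 x 2 matrix B.
A = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
B = [(0, 0, 7), (0, 1, 8), (1, 0, 9), (1, 1, 10), (2, 0, 11), (2, 1, 12)]
product = multiply(A, B, m=2, p=2)
```

The point of the design is that every value needed for output cell (i, k) arrives under the same key, so the reducer only has to pair values on the inner index j.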

I currently run the program like this: python MatrixMulti.py matrixA.list matrixB.list

I would like to run a different mapper for each file so that I can identify the origin matrix in the (key, value) pair, but I don't know how to do this. Is this the best approach?
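
A possible alternative to running two mappers is to let a single mapper look up which file it is reading. Hadoop Streaming exposes the current input path to each map task as the environment variable mapreduce_map_input_file (map_input_file on older Hadoop versions), and mrjob provides mrjob.compat.jobconf_from_env('mapreduce.map.input.file') as a wrapper around it. The sketch below reads the variable directly so that it runs without mrjob. The helper name origin_tag is hypothetical, and the code assumes the file names contain "matrixA" or "matrixB".

```python
import os


def origin_tag(default=None):
    """Return 'A' or 'B' based on the input file of the current map task.

    Assumes file names like matrixA.list / matrixB.list (as in the command
    line in the post). Returns `default` when the path is unavailable or
    does not match either name.
    """
    path = (os.environ.get('mapreduce_map_input_file')
            or os.environ.get('map_input_file')  # older Hadoop name
            or '')
    name = os.path.basename(path)
    if 'matrixA' in name:
        return 'A'
    if 'matrixB' in name:
        return 'B'
    return default
```

Inside the mapper, a call such as origin_tag() could then take the place of the hard-coded 'A' in the emitted value.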

Thanks.

Below is my Python code.

from mrjob.job import MRJob


class MrMatrixMultiplier(MRJob):

    def mapper_init(self):
        # Dimensions taken from the header line of the input being read.
        self.ROWS = 0
        self.COLS = 0
        self.kvpairs = []

    def mapper(self, _, line):
        line_list = line.split()
        if len(line_list) == 2:
            # Header line: "<rows> <cols>"
            self.ROWS, self.COLS = line_list
        else:
            # Entry line: "<row> <col> <value>"
            i = line_list[0]
            j = line_list[1]
            v = line_list[2]
            for k in range(1, int(self.ROWS)):
                # mrjob mappers must yield (key, value) pairs, not print them.
                # The origin matrix 'A' is hard-coded here.
                yield [i, k], ['A', j, v]

    def reducer(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    MrMatrixMultiplier.run()