Re: Trying to get started and failing.


j.barrett Strausser

May 14, 2013, 9:37:57 PM
to Alex SV, disc...@googlegroups.com
That worked. I'm still having trouble reading from and writing to DDFS.
Does anyone have an example that demonstrates the syntax for reading from DDFS and writing to DDFS?


-b


On Tue, May 14, 2013 at 3:53 AM, Alex SV <alexs.v...@gmail.com> wrote:
Hi,

I suppose the problem is here:
if __name__ == '__main__':
    job = WordCount().run(input=["http://discoproject.org/media/text/chekhov.txt"])
    for word, count in result_iterator(job.wait(show=True)):
        print(word, count)


You should import the class from the module itself, as in https://github.com/discoproject/disco/blob/master/examples/util/simple_innerjoin.py
Something like:
if __name__ == '__main__':
    from this_module_name import WordCount
    job = WordCount().run(input=["http://discoproject.org/media/text/chekhov.txt"])
    for word, count in result_iterator(job.wait(show=True)):
        print(word, count)
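
As far as I can tell, the import helps because pickle stores a class by reference (module name plus class name), not by value. When the class is defined in the script you run directly, that module name is __main__, and on the worker __main__ is a different script with no WordCount in it, which would match the AttributeError in the traceback. Below is a minimal standard-library sketch of that mechanism with no Disco involved. The names fake_main, JOB_SOURCE and make_module are made up for this illustration.

```python
import pickle
import sys
import types

# Source of a stand-in job class (hypothetical, not the real Disco job).
JOB_SOURCE = "class WordCount(object):\n    pass\n"


def make_module(name, source=""):
    """Create a module object called `name` and run `source` inside it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# Stand-in for "__main__", so the demo does not disturb the real __main__.
MAIN = "fake_main"

# Client side: the class is defined in the script that is being run.
sys.modules[MAIN] = make_module(MAIN, JOB_SOURCE)
payload = pickle.dumps(sys.modules[MAIN].WordCount)
# The payload records only the module and class names, not the class body.

# Worker side: the same module name now refers to a script without WordCount.
sys.modules[MAIN] = make_module(MAIN)
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

# When the module found under that name does define the class, loading works.
sys.modules[MAIN] = make_module(MAIN, JOB_SOURCE)
restored = pickle.loads(payload)

del sys.modules[MAIN]  # clean up the stand-in module
```

Importing WordCount from its own module gives the class a module name other than __main__, one that the worker can also import, so the lookup no longer depends on what __main__ happens to be.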

Regards,
Alex

On Tuesday, May 14, 2013 5:26:45 AM UTC+3, j.barrett...@gmail.com wrote:
To all,

I just started with Disco today. I haven't found a coherent example showing how to read from and write to DDFS.

I'm trying to put the pieces together from the examples. When I run the example found here: http://discoproject.org/doc/disco/howto/discodb.html

from disco.core import Job, result_iterator
from disco.util import kvgroup
from disco.worker.classic.func import discodb_stream

class WordCount(Job):
    reduce_output_stream = discodb_stream

    @staticmethod
    def map(line, params):
        for word in line.split():
            yield word, 1

    @staticmethod
    def reduce(iter, params):
        for word, counts in kvgroup(sorted(iter)):
            yield word, str(sum(counts))


if __name__ == '__main__':
    job = WordCount().run(input=["http://discoproject.org/media/text/chekhov.txt"])
    for word, count in result_iterator(job.wait(show=True)):
        print(word, count)


I get the following error:

Status: [map] 0 waiting, 1 running, 0 done, 0 failed
2013/05/13 22:24:54 master New job initialized!
2013/05/13 22:24:54 master Starting job
2013/05/13 22:24:54 master Starting map phase
2013/05/13 22:24:54 master map:0 assigned to localhost
2013/05/13 22:24:54 master ERROR: Job failed: Worker at 'localhost' died: Traceback (most recent call last):
  File "/srv/disco/data/localhost/22/WordCount@558:79a76:cfaac/usr/lib/python2.7/dist-packages/disco/worker/__init__.py", line 335, in main
    task = cls.get_task()
  File "/srv/disco/data/localhost/22/WordCount@558:79a76:cfaac/usr/lib/python2.7/dist-packages/disco/worker/__init__.py", line 385, in get_task
    return Task(**dict((str(k), v) for k, v in cls.send('TASK').items()))
  File "/srv/disco/data/localhost/22/WordCount@558:79a76:cfaac/usr/lib/python2.7/dist-packages/disco/task.py", line 67, in __init__
    self.jobobjs = dPickle.loads(self.jobpack.jobdata)
AttributeError: 'module' object has no attribute 'WordCount'

2013/05/13 22:24:54 master WARN: Job killed
Status: [map] 1 waiting, 0 running, 0 done, 1 failed
Traceback (most recent call last):
  File "/home/bearrito/Pythontools/pycharm-2.7.2/helpers/pydev/pydevd.py", line 1481, in <module>
    debugger.run(setup['file'], None, None)
  File "/home/bearrito/Pythontools/pycharm-2.7.2/helpers/pydev/pydevd.py", line 1124, in run
    pydev_imports.execfile(file, globals, locals) #execute the script
  File "/home/bearrito/Git/PythonComputation/pythoncomputation/WordCount.py", line 21, in <module>
    for word, count in result_iterator(job.wait(show=True)):
  File "/usr/lib/python2.7/dist-packages/disco/core.py", line 365, in wait
    timeout, poll_interval * 1000)
  File "/usr/lib/python2.7/dist-packages/disco/core.py", line 325, in check_results
    raise JobError(Job(name=jobname, master=self), "Status {0}".format(status))
disco.error.JobError: Job WordCount@558:79a76:cfaac failed: Status dead

What could I be doing incorrectly?



--


https://github.com/bearrito
@barrettsmash

Prashanth Mundkur

May 16, 2013, 1:10:30 AM
to disc...@googlegroups.com
On 21:37 Tue 14 May, j.barrett Strausser wrote:
> That worked. Still having troubled reading and writing from ddfs
> Does anyone have a example that demonstrates the syntax to read from ddfs
> and writes ddfs.

When one needs examples of API usage, it's a good idea to look under
the tests directory. Perhaps this might help:

https://github.com/discoproject/disco/blob/master/tests/test_ddfs.py
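
For orientation, here is a rough sketch of the round trip. It assumes that the installed version exposes disco.ddfs.DDFS with a push(tag, files) method and that jobs accept tag:// URLs as input. The tag data:chekhov, the local path chekhov.txt, and the helper names push_file and ddfs_input_url are made up for illustration, and the import of WordCount assumes the job class lives in WordCount.py as in the traceback earlier in the thread. Treat test_ddfs.py as the authority for the exact calls in your version.

```python
def ddfs_input_url(tag):
    """Return the URL form of a DDFS tag, for use as job input."""
    return "tag://" + tag


def push_file(ddfs, tag, path):
    """Write: store the local file at `path` in DDFS under `tag`.

    `ddfs` is expected to be a disco.ddfs.DDFS instance (assumption).
    Returns the tag:// URL that can be passed to a job as input.
    """
    ddfs.push(tag, [path])
    return ddfs_input_url(tag)


def main():
    # Imported here so the helpers above load on a machine without Disco.
    from disco.core import result_iterator
    from disco.ddfs import DDFS
    from WordCount import WordCount  # assumed module name, see lead-in

    ddfs = DDFS()  # connects to the master named in the Disco settings
    # Hypothetical tag and local path.
    url = push_file(ddfs, "data:chekhov", "chekhov.txt")

    # Read: the job takes its input from DDFS through the tag URL.
    job = WordCount().run(input=[url])
    for word, count in result_iterator(job.wait(show=True)):
        print(word, count)
```

Calling main() needs a running Disco master and DDFS. The two helpers take the DDFS client as an argument so they can be tried out with a stand-in object first.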

--prashanth