The code in examples/wordcount.py works fine, but whenever I run this
job (https://gist.github.com/1006140), or pretty much any variation
on it, and follow the example, I get overflow errors:
root@li317-243:~# disco run poc.ParseDownloads dump:downloads
ParseDownloads@51b:17a09:994dd
root@li317-243:~# disco wait @ | xargs ddfs xcat
long int too large to convert to int
So I poked around a little bit.
root@li317-243:~# disco results @
dir://li317-243/disco/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/map-index.txt.gz
root@li317-243:~# cp /usr/local/var/disco/data/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/map-index.txt.gz ./
root@li317-243:~/results# gunzip ./map-index.txt.gz
root@li317-243:~# cat map-index.txt
0 disco://li317-243/disco/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/partitions-51b-17e8e-da1b2/part-0
root@li317-243:~# ddfs xcat disco://li317-243/disco/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/partitions-51b-17e8e-da1b2/part-0
long int too large to convert to int
root@li317-243:~# ddfs cat disco://li317-243/disco/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/partitions-51b-17e8e-da1b2/part-0
Killed
root@li317-243:~# cp /usr/local/var/disco/data/li317-243/a5/ParseDownloads@51b:17a09:994dd/.disco/partitions-51b-17e8e-da1b2/part-0 ./
root@li317-243:~# head part-0 -n 1
[binary output omitted]
^Croot@li317-243:~# ls -lh part-0
-rw-r--r-- 1 root root 2.8G 2011-06-03 06:53 part-0
I'm guessing that something is trying to index into that file using a
signed machine int, which is why this job fails but the example jobs
work fine. I can move to a 64-bit machine for now, but I think this
should definitely be considered a bug.
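
To make the guess above concrete, here is a minimal sketch of the arithmetic. It is not Disco's code, and where the overflow actually happens is not established in this thread. The helper name fits_in_int32 is made up for illustration, and the size is only approximated from the 2.8G shown by ls -lh.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold


def fits_in_int32(n):
    """Return True if n can be packed as a signed 32-bit integer."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False


# Approximate size of part-0 (2.8G according to ls -lh); an assumption,
# not the exact byte count.
size = int(2.8 * 1024**3)

print(size > INT32_MAX)
print(fits_in_int32(size))
```

Any offset past 2**31 - 1 bytes (just under 2 GiB) cannot be held in a signed 32-bit int, which would be consistent with the small example jobs working and this one failing.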
Cheers
Jamie
I tried moving this to a 64-bit server but now I can't even run the tutorial:
root@li317-243:~# wget http://discoproject.org/media/text/bigfile.txt
...
root@li317-243:~# ddfs chunk data:bigtxt ./bigfile.txt
'ascii' codec can't decode byte 0x81 in position 0: ordinal not in range(128)
root@li317-243:~# echo 'foo' > foo
root@li317-243:~# ddfs chunk data:bigtxt ./foo
'ascii' codec can't decode byte 0x81 in position 0: ordinal not in range(128)
This is on Ubuntu 11.04 on a 64-bit Linode VPS. Disco is installed
from source. Setup notes are here: https://gist.github.com/1007630
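
For reference, the message above is the standard Python error for decoding a non-ASCII byte with the ascii codec. The sketch below reproduces only the message; it does not show where in Disco or ddfs the decode happens, which is unknown here. The helper name ascii_decode_error is made up for illustration.

```python
def ascii_decode_error(data):
    """Return the error message from decoding data as ASCII, or None on success."""
    try:
        data.decode("ascii")
        return None
    except UnicodeDecodeError as exc:
        return str(exc)


# The contents written by `echo 'foo' > foo` are plain ASCII and decode fine.
print(ascii_decode_error(b"foo\n"))

# A lone 0x81 byte triggers the error text seen in the transcript.
print(ascii_decode_error(b"\x81"))
```

Since the file created with echo contains only ASCII bytes, the 0x81 byte presumably comes from something other than the input file's contents.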
Jamii,
Was your 32-bit platform Ubuntu 11.04 as well?
>
> On Jun 4, 2:34 pm, Jamie Brandon <jamii...@googlemail.com> wrote:
>> I tried moving this to a 64 bit server but now I can't even run the tutorial:
>>
>> root@li317-243:~# wget http://discoproject.org/media/text/bigfile.txt
>> ...
>> root@li317-243:~# ddfs chunk data:bigtxt ./bigfile.txt
>> 'ascii' codec can't decode byte 0x81 in position 0: ordinal not in range(128)
>>
>> root@li317-243:~# echo 'foo' > foo
>> root@li317-243:~# ddfs chunk data:bigtxt ./foo
>> 'ascii' codec can't decode byte 0x81 in position 0: ordinal not in range(128)
>>
>> This is on Ubuntu 11.04 on a 64 bit linode VPS. Disco is installed
>> from source. Setup notes are here: https://gist.github.com/1007630
Yes, and both were Linode VPS images running the same setup script. There should be no differences between them except the word size. The 32-bit machine worked fine until the output file exceeded 2 GB.
disco@trillian:~/disco/examples/util$ python wordcount.py
Traceback (most recent call last):
File "wordcount.py", line 10, in <module>
from disco.core import Job
ImportError: No module named disco.core
------------------------
Most recently I updated my versions of Python on both my Linux and OS X
machines. I seemed to have an older R13 version of Erlang and updated
that to the latest R14. Both machines have Python 2.6 or better.
Also, my ssh setup seems to be working correctly.
------------------------
disco@trillian:~$ cd disco
disco@trillian:~/disco$ ssh localhost erl
Eshell V5.7.4 (abort with ^G)
1>
------------------------
So I still don't get what my installation is lacking.
Any help appreciated.
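
One generic way to narrow down an ImportError like the one above is to ask the interpreter whether it can find the package at all and which directories it searches. This is a minimal sketch, not something from the Disco docs; the helper name is_importable is made up, and it uses importlib.util, so it assumes Python 3 rather than the 2.6 mentioned above.

```python
import importlib.util
import sys


def is_importable(name):
    """Return True if a top-level module or package can be found on sys.path."""
    return importlib.util.find_spec(name) is not None


if is_importable("disco"):
    print("disco can be found by this interpreter")
else:
    # List the directories searched, to compare against the install location.
    print("disco cannot be found; directories searched:")
    for entry in sys.path:
        print("  " + repr(entry))
```

Running it with the same python binary that runs wordcount.py shows whether the directory where Disco's Python library was installed is on that interpreter's search path.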
Thanks
- Sudhindra
PS: Thanks for all the recommendations earlier, but I seem to get only
weekends to try out the things that you suggest, and hence my feedback
cycle may be slower.