I wanted to test echoprint against its public database dump: http://echoprint-data.s3.amazonaws.com/echoprint-dump-1.json
I tried this:
cat echoprint-dump-1.json|jq -r '.[].code' | echoprint-inverted-index index.bin
and it fails with this error:
Traceback (most recent call last):
  File "/usr/local/bin/echoprint-inverted-index", line 19, in <module>
    create_inverted_index(streamer(sys.stdin), args.indexfile)
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 57, in create_inverted_index
    for batch_index, batch in enumerate(split_seq(songs, 65535)):
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 30, in split_seq
    item = list(itertools.islice(it, size))
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 78, in parsing_code_streamer
    yield decode_echoprint(line.strip())[1]
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 42, in decode_echoprint
    unzipped = zlib.decompress(zipped)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream
I think it only happens when the input file is large; I tested with small JSON files and it works.
Has anyone else run into this error?
Is this a bug, or is the problem on my side?
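One way to narrow this down would be to check each code string on its own and see which lines fail. Below is a minimal sketch, not part of echoprint-server: `find_undecodable_codes` is a hypothetical helper, and it assumes that each code is a URL-safe base64 string wrapping zlib-compressed data, which is what the `zlib.decompress(zipped)` call in the traceback suggests.

```python
import base64
import zlib


def find_undecodable_codes(lines):
    """Return (line_number, error_message) for every line that fails to decode.

    Assumption: each non-empty line is URL-safe base64 wrapping zlib data.
    Line numbers are 1-based; blank lines are counted but not checked.
    """
    failures = []
    for line_number, line in enumerate(lines, 1):
        code = line.strip()
        if not code:
            continue
        try:
            zlib.decompress(base64.urlsafe_b64decode(code))
        except (TypeError, ValueError, zlib.error) as exc:
            # TypeError / ValueError: malformed base64 (Python 2 / Python 3)
            # zlib.error: corrupt or truncated compressed stream
            failures.append((line_number, str(exc)))
    return failures
```

Saving the jq output to a file and passing the open file object to this function would show whether a handful of specific entries are truncated, or whether every line fails.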
I also tried doing it through the API, in a script like this:
def makeAndLoadInvertedIndex():
    client = MongoClient('localhost', 27017)
    collection = client.test.songs
    docs = collection.find({})
    codesStr = ""
    app.gids = []
    for doc in docs:
        codesStr += str(doc['code']) + "\n"
        app.gids.append({"id": str(doc['_id'])})
    f = io.BytesIO(codesStr)
    print "submitting ...."
    create_inverted_index(parsing_code_streamer(f), args.indexfile)
    app.inverted_index = load_inverted_index(['./index.bin'])
    print "all songs submitted"
and it gives the same error.
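If only some of the stored codes turn out to be truncated, the script could skip them while keeping the id list aligned with the codes that actually go into the index. Here is a rough sketch under the same assumption as before (codes are URL-safe base64 over zlib data). `is_decodable` and `build_code_buffer` are hypothetical helpers, not echoprint-server functions, and `docs` can be any iterable of dicts with `code` and `_id` keys, such as a pymongo cursor.

```python
import base64
import io
import zlib


def is_decodable(code):
    """True if the string is valid URL-safe base64 wrapping a complete zlib stream."""
    try:
        zlib.decompress(base64.urlsafe_b64decode(code))
        return True
    except (TypeError, ValueError, zlib.error):
        return False


def build_code_buffer(docs):
    """Return (buffer, gids, skipped_ids) for the documents in `docs`.

    `buffer` holds one decodable code per line; `gids` has one entry per
    line in the same order; `skipped_ids` lists the documents left out.
    """
    lines = []
    gids = []
    skipped_ids = []
    for doc in docs:
        code = str(doc['code']).strip()
        doc_id = str(doc['_id'])
        if is_decodable(code):
            lines.append(code)
            gids.append({"id": doc_id})
        else:
            skipped_ids.append(doc_id)
    # Join once at the end instead of growing a string inside the loop.
    data = "".join(code + "\n" for code in lines)
    return io.BytesIO(data.encode("ascii")), gids, skipped_ids
```

The returned buffer would take the place of `f` in the script above, and `skipped_ids` shows which MongoDB documents need to be re-imported or regenerated.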