New echoprint server with echoprint jsons


Kevin

Apr 19, 2016, 12:04:52 AM
to echoprint
Hi,
I recently installed all the echoprint server and codegen tools and confirmed test.py runs fine (except that you have to move it up one directory out of /test/).

But now I'm trying to use the new Github instructions to set up the echoprint server to use the echoprint-dump-x.json files found here:
http://echoprint.me/data_download

I assume I want to create an inverted index off of this data so I can query it, but I'm not sure how to go about doing that.

Is there a complete set of instructions on how to get this data loaded up using the new code that was recently released?

Thanks!

Nurlan Ibikeev

Jul 11, 2016, 8:12:59 AM
to echoprint

Adding to the index:

1) Generate the fingerprint data for the song:
 echoprint-codegen song.ogg > codegen_output.json
2) Build the index from it:
 cat codegen_output.json | jq -r '.[0].code' | echoprint-inverted-index index.bin
This creates the index.bin file.

Querying:
1) echoprint-codegen query_song.ogg > codegen_output.json
2) cat codegen_output.json | jq -r '.[0].code' | echoprint-inverted-query index.bin

You should get something like:

{"results": [{"index": 0,"score": 0.69340412080287933} ]}

The higher the score, the more likely it is your song.
* I noticed that even if the song was not added to index.bin, you still get a result, of course with a lower score.
I hope this helps.
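
Since a query returns a result even for a song that is not in the index, it may help to apply a score cutoff before trusting a match. The following is only a sketch: the function name best_match and the MIN_SCORE value are hypothetical, and the only thing taken from this thread is the shape of the JSON output shown above.

```python
import json

# Hypothetical cutoff, not a value from echoprint. Tune it by querying
# songs that are known to be absent from the index.
MIN_SCORE = 0.5


def best_match(query_output, min_score=MIN_SCORE):
    """Return the highest-scoring result at or above min_score, else None."""
    results = json.loads(query_output).get("results", [])
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    return best if best["score"] >= min_score else None


# The sample output quoted in this post.
sample = '{"results": [{"index": 0,"score": 0.69340412080287933} ]}'
print(best_match(sample))
print(best_match(sample, min_score=0.8))
```

The "index" in the result is the position of the song in the order it was fed to echoprint-inverted-index, so a separate list of song ids in the same order is needed to map it back to a track.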

Masha Belyi

Oct 10, 2016, 12:21:46 AM
to echoprint
I am trying to index data from http://echoprint.me/data_download and running into an error with zlib:

cat echoprint-dump-1.json | jq -r '.[].code' | echoprint-inverted-index inverted_index_1.bin 

Traceback (most recent call last):
  File "./bin/echoprint-inverted-index", line 19, in <module>
    create_inverted_index(streamer(sys.stdin), args.indexfile)
  File "/Library/Python/2.7/site-packages/echoprint_server/lib.py", line 57, in create_inverted_index
    for batch_index, batch in enumerate(split_seq(songs, 65535)):
  File "/Library/Python/2.7/site-packages/echoprint_server/lib.py", line 30, in split_seq
    item = list(itertools.islice(it, size))
  File "/Library/Python/2.7/site-packages/echoprint_server/lib.py", line 78, in parsing_code_streamer
    yield decode_echoprint(line.strip())[1]
  File "/Library/Python/2.7/site-packages/echoprint_server/lib.py", line 42, in decode_echoprint
    unzipped = zlib.decompress(zipped)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream


But it works when I try to index only a small part of the data like this:

cat echoprint-dump-1.json | jq -r '.[0:100] | .[].code' | echoprint-inverted-index inverted_index_1.bin 


Did anyone else run into this issue?
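
One way to narrow this down is to check each entry on its own and list the positions that fail to decode. The sketch below assumes the dump is a JSON array of objects with a "code" key (which is what the jq expression above implies) and that a code is URL-safe base64 wrapped around zlib data. Only the zlib step is visible in the traceback; the base64 step is an assumption, and the function names are made up for illustration.

```python
import base64
import zlib


def code_is_decodable(code):
    """True if code is a non-empty string that base64-decodes and inflates."""
    if not isinstance(code, str) or not code.strip():
        return False
    try:
        # Assumption: URL-safe base64 around a complete zlib stream.
        zlib.decompress(base64.urlsafe_b64decode(code.strip()))
    except (ValueError, TypeError, zlib.error):
        return False
    return True


def find_bad_entries(entries):
    """Return the positions of entries whose code cannot be decoded."""
    return [i for i, entry in enumerate(entries)
            if not code_is_decodable(entry.get("code"))]


# Synthetic entries: one valid code, a null, an empty string, a truncated code.
good = base64.urlsafe_b64encode(zlib.compress(b"0000100002")).decode("ascii")
entries = [{"code": good}, {"code": None}, {"code": ""}, {"code": good[:-8]}]
print(find_bad_entries(entries))
```

For a real dump, entries would come from json.load on the file. Empty input is worth checking for in particular, because zlib.decompress on an empty byte string raises the same "incomplete or truncated stream" error.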

abbas hoseini

Feb 28, 2017, 2:44:54 AM
to echoprint
I have this error too. I imported the data into MongoDB, and now I want to build and load the inverted index from the whole database and query it:

import io
from pymongo import MongoClient

# app, args, and the echoprint_server functions come from the rest of ./echoprint-rest-service
def makeAndLoadInvertedIndex():
    client = MongoClient('localhost', 27017)
    collection = client.test.songs
    docs = collection.find({})
    codesStr = ""
    app.gids = []
    for doc in docs:
        # one code per line; gids keeps the matching document ids in the same order
        codesStr += str(doc['code']) + "\n"
        app.gids.append({"id": str(doc['_id'])})
    f = io.BytesIO(codesStr)
    print "submitting ...."
    create_inverted_index(parsing_code_streamer(f), args.indexfile)
    app.inverted_index = load_inverted_index(['./index.bin'])
    print "all songs submitted"

With a small number of songs it works, but when I import the whole echoprint database it gives this error:

Traceback (most recent call last):
  File "./echoprint-rest-service", line 117, in <module>
    makeAndLoadInvertedIndex();
  File "./echoprint-rest-service", line 84, in makeAndLoadInvertedIndex
    create_inverted_index(parsing_code_streamer(f), args.indexfile)
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 57, in create_inverted_index
    for batch_index, batch in enumerate(split_seq(songs, 65535)):
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 30, in split_seq
    item = list(itertools.islice(it, size))
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 78, in parsing_code_streamer
    yield decode_echoprint(line.strip())[1]
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 42, in decode_echoprint
    unzipped = zlib.decompress(zipped)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream

Does anyone have a solution for this?
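
If some documents have a missing or empty code, one possible approach is to skip them while building the id list in the same pass, so that the list stays aligned with the position-based "index" that queries return. This is a rough sketch only: stream_codes and its is_valid parameter are hypothetical names, the documents are plain dicts standing in for MongoDB results, and whether the generator can be passed to parsing_code_streamer in place of a file object is an untested assumption.

```python
def stream_codes(docs, ids_out, is_valid=None):
    """Yield one code per usable document, recording its id in ids_out.

    ids_out[i] ends up naming the i-th yielded code, so it lines up with
    the position of that song in the index.
    """
    if is_valid is None:
        # Default check: the code must be a non-empty, non-blank string.
        is_valid = lambda code: isinstance(code, str) and bool(code.strip())
    for doc in docs:
        code = doc.get("code")
        if not is_valid(code):
            continue  # skip rather than feed an undecodable line downstream
        ids_out.append({"id": str(doc["_id"])})
        yield code.strip()


# Plain dicts standing in for documents returned by collection.find({}).
docs = [
    {"_id": 1, "code": "abc"},
    {"_id": 2, "code": None},
    {"_id": 3, "code": "  "},
    {"_id": 4, "code": "def"},
]
gids = []
codes = list(stream_codes(docs, gids))
print(codes, gids)
```

Because this is a generator, gids is only filled as the codes are consumed, and no single large string has to be built in memory first.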

xulen

Apr 4, 2020, 12:07:57 PM
to echoprint
Have you tried:
cat fingerprint.json | jq -r '.[].code' | egrep -v null$ | echoprint-inverted-index index.bin
