Neurolex as a simple list


Sine Zambach

Aug 26, 2013, 9:30:52 AM
to neur...@googlegroups.com

Dear Neurolex-group,

 

I am trying to get a list of all terms in NeuroLex as a simple list or tab-delimited file. However, I have only managed to find CSV lists of the neurons and brain parts.

Is there a way to access the rest of the 25,000 terms as a simple list?

 

Best regards,

 

Sine Zambach


--
Sine Zambach
Tel: 21235533

Maryann Martone

Aug 26, 2013, 9:45:33 AM
to neur...@googlegroups.com, willy wong, Amarnath Gupta
Dear Sine:

That's a good question.  I'm cc'ing Willy and Amarnath, who are in the best position to answer it.  I know that the native wiki only supports 1,000-item downloads, which is why we only have individual parts available as downloads.  But there may be a workaround.

Regards,

Maryann


--
You received this message because you are subscribed to the Google Groups "neurolex" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neurolex+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

anita bandrowski

Aug 26, 2013, 9:45:56 AM
to neur...@googlegroups.com
Dear Sine,
NeuroLex is set up for interacting with data, not for producing large outputs, as these slow down the servers.

However, you can get a list of all terms by using a SPARQL query at the NeuroLex endpoint, which is accessible here along with several examples:
http://neurolex.org/wiki/NeuroLex_SPARQL_endpoint
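[Archive note: for readers following along, a query in this spirit might look as follows. This is only a sketch; the rdfs:label pattern is a generic RDF convention and an assumption here, not taken from the NeuroLex schema or the examples on that page.]

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?term ?label
    WHERE {
      ?term rdfs:label ?label .
    }
    LIMIT 100

Dropping the LIMIT clause would return every labelled resource, which is closer to the full term list being asked for.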

The other option is for you to define which properties you would like, and I would be happy to send you the list from a database dump.
Hope this helps,
anita








--
Anita Bandrowski, Ph.D.
NIF Project Lead
UCSD 858-822-3629
http://neuinfo.org
9500 Gilman Dr. #0446
La Jolla, CA 92093-0446

Stephen Larson

Aug 26, 2013, 11:04:31 AM
to neur...@googlegroups.com
I've gone ahead and updated the SPARQL page to include the query you'd want to run.  Note that the endpoint itself can return content in CSV or TSV, as you like.  I've included direct links at the bottom of that section that will run the query in text or CSV.


I've gone ahead and run this to output the current list of terms in CSV (attached).

Interestingly, our CSV file only has 23k rows in it, which may indicate a bug in our update process that is losing some terms.  We'll have to figure out where the deficit is coming from.
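[Archive note: a quick, self-contained way to re-check such a row count with the standard library. The sample data here is hypothetical; the real dump's columns may differ.]

```python
import csv
import io

# Tiny in-memory stand-in for the attached dump (hypothetical contents)
sample = "Label,Id\nNeuron,birnlex_409\nBrain,birnlex_796\n"

with io.StringIO(sample) as f:
    reader = csv.reader(f)
    header = next(reader)            # skip the header row
    n_terms = sum(1 for _ in reader) # count the remaining term rows

print(n_terms)
```

On the real file you would open it with `open('NeuroLexAllTerms-8-26-13.csv')` instead of the StringIO stand-in.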

Thanks,
  Stephen

NeuroLexAllTerms-8-26-13.csv

Finn Årup Nielsen

Aug 28, 2013, 7:39:30 AM
to neur...@googlegroups.com
Dear Sine,


It is usually possible to query a MediaWiki via its API, but you need to
iterate over the pages.

The following program will download the titles of all main-namespace pages
on the wiki, not just term pages:


import requests

# Example API call:
# http://neurolex.org/w/api.php?action=query&list=allpages&apnamespace=0&aplimit=500&format=json

urlbase = "http://neurolex.org/w/api.php"
urlparams = {'action': 'query',
             'list': 'allpages',
             'apnamespace': 0,
             'aplimit': 500,
             'format': 'json'}

# Fetch the first batch of pages
response = requests.get(urlbase, params=urlparams).json()
pages = response['query']['allpages']

# Follow the API's continuation tokens until all pages are retrieved
while 'query-continue' in response:
    urlparams.update(response['query-continue']['allpages'])
    response = requests.get(urlbase, params=urlparams).json()
    pages.extend(response['query']['allpages'])

for page in pages:
    print(page['title'])



http://neuro.imm.dtu.dk/w/index.php?title=Talk:NeuroLex&oldid=32953

best
Finn Årup Nielsen