Is someone working on / interested in pyBIDS in the browser?


Sebastian Urchs

Jun 2, 2022, 5:57:15 PM
to bids-discussion
Hi everyone,

we are working on tools to annotate and harmonize BIDS datasets that include clinical demographics data, so that we can search across them and define subject-level cohorts. To make life easier for users (and us), the harmonization of demographic data (e.g. participants.tsv) happens in the browser, in a Vue app.

One step that still requires users to install and run things locally is parsing a BIDS dataset on their local filesystem with pyBIDS. We don't need a lot of detail here, just the subject, session, and run names, and what modalities are available for these - all of this can be learned from the path name without direct access to the file content (as far as I know).
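
A minimal sketch of the path-only idea above, assuming file names follow the usual BIDS `key-value` pattern joined by underscores. The helper `parse_bids_path` is hypothetical and not part of pyBIDS or ancpBIDS; it only looks at the path string and never opens a file.

```python
import re
from pathlib import PurePosixPath

# Hypothetical helper, not a pyBIDS API: one "key-value" chunk such as "sub-02".
ENTITY_RE = re.compile(r"([a-zA-Z0-9]+)-([a-zA-Z0-9]+)")

def parse_bids_path(rel_path):
    """Extract entities, suffix and datatype folder from a relative path."""
    path = PurePosixPath(rel_path)
    # Drop every extension (e.g. ".nii.gz") from the file name.
    stem = path.name.split(".", 1)[0]
    parts = stem.split("_")
    entities = {}
    for part in parts:
        match = ENTITY_RE.fullmatch(part)
        if match:
            entities[match.group(1)] = match.group(2)
    # A trailing chunk without a dash is treated as the suffix (e.g. "bold").
    suffix = parts[-1] if "-" not in parts[-1] else None
    # Treat the containing folder as the datatype only if it is not itself
    # a key-value folder such as "sub-02" or "ses-01".
    parent = path.parent.name
    datatype = parent if parent and "-" not in parent else None
    return {"entities": entities, "suffix": suffix, "datatype": datatype}

info = parse_bids_path("sub-02/func/sub-02_task-mixedgamblestask_run-01_bold.nii.gz")
print(info)
```

This is deliberately naive (it knows nothing about the BIDS schema), but it shows that subject, run and modality information is recoverable from the path alone.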

The web version of the BIDS-validator already allows a user to upload a BIDS dataset to a React app and validate it there. Is there anything similar that parses a BIDS dataset in order to answer some simple, pyBIDS-like queries such as those above? Or are there folks who would be interested in working on something like this?

Many thanks for any pointers!
Best,
Seb

yarikoptic

Jun 2, 2022, 11:05:25 PM
to bids-discussion
I wonder how/if pyodide has a chance? docs at https://pyodide.org/en/stable/usage/quickstart.html say that numpy import works... tried but didn't see anything in console as was promised, postponed figuring it out

Sebastian Urchs

Jun 3, 2022, 3:28:21 PM
to bids-discussion
Yes, thanks for the link Yarik! What started this discussion was the announcement of pyscript (https://pyscript.net/), but I am not sure whether we'd actually need to run python directly or if we couldn't get most of the way with what the BIDS-validator project has already done.

Erdal Karaca

Jun 4, 2022, 8:36:09 AM
to bids-di...@googlegroups.com
Hi,
I just tried pyodide with ancpBIDS:

# download and unzip test dataset
from pyodide.http import pyfetch
ds005_zip = await pyfetch("https://raw.githubusercontent.com/ANCPLabOldenburg/ancp-bids-dataset/main/ds005-testdata.zip")
if ds005_zip.status == 200:
    # write the downloaded bytes into pyodide's virtual file system
    with open("ds005-testdata.zip", "wb") as f:
        f.write(await ds005_zip.bytes())
# unzip dataset archive
import zipfile
with zipfile.ZipFile('ds005-testdata.zip', 'r') as zip_ref:
    zip_ref.extractall('./ds005')
# install and import ancpBIDS
import micropip
await micropip.install('ancpbids')
import ancpbids
# load layout of dataset
ds005_layout = ancpbids.BIDSLayout('ds005')
#run queries
ds005_layout.get_subjects()

Running queries using BIDSLayout:

>>> ds005_layout = ancpbids.BIDSLayout('ds005')
>>> ds005_layout.get_subjects()
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16']
>>> ds005_layout.get_runs()
['1', '2', '3']
>>> ds005_layout.get_tasks()
['mixedgamblestask']
>>> ds005_layout.get_entities()
OrderedDict([('task', {'mixedgamblestask'}), ('sub', {'16', '06', '07', '04', '05', '02', '03', '10', '01', '11', '12', '13', '14', '15', '08', '09'}),
 ('run', {'3', '2', '1'}), ('desc', {'mypipeline', 'extra'}), ('ds', {'005'}), ('type', {'test', 'mfx'})])
>>> ds005_layout.get(suffix='bold', subject='02', return_type='filename')
['ds005/ds005/sub-02/func/sub-02_task-mixedgamblestask_run-01_bold.nii.gz', 'ds005/ds005/sub-02/func/sub-02_task-mixedgamblestask_run-02_bold.nii.gz', 
'ds005/ds005/sub-02/func/sub-02_task-mixedgamblestask_run-03_bold.nii.gz']


--
We are all colleagues working together to shape brain imaging for tomorrow, please be respectful, gracious, and patient with your fellow group members.
---
You received this message because you are subscribed to the Google Groups "bids-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bids-discussi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bids-discussion/eb264a88-21dd-44ab-b9e0-c665a2b32479n%40googlegroups.com.

Sebastian Urchs

Jun 4, 2022, 12:29:52 PM
to bids-discussion
Hi, this looks super cool and very promising! Would you mind sharing the JS you ran outside of the Python code to get this to run? I haven't played around with pyodide yet, and I'm having a hard time getting micropip to play nice.

Best,
Seb

Sebastian Urchs

Jun 4, 2022, 11:36:43 PM
to bids-discussion
Hey Erdal,

thanks for the code snippet. I played around a little more with pyodide and set it up in a Vue app. I put the code here: https://github.com/surchs/vuebids. I wasn't aware of ancp_bids, but it looks like a very cool project, and I'm very happy to see that it is going to merge back into the pybids effort! I didn't manage to install pybids directly via pyodide/micropip because of the sqlalchemy dependency (but I didn't debug much, so it might be possible), so ancp_bids works very nicely for this. And overall, loading pyodide and installing ancp_bids is still reasonably fast.

It feels like this already puts in-browser-bids-parsing within reach. The thing that's missing is how to pass along the directory tree of a BIDS dataset to the bids API. webkitdirectory is probably the way to go / what bids-validator is using, but I don't know how ancp_bids would get this. If simply parsing file paths is easier than handling full file-system access, that would still let us do some interesting queries to begin with.

Would be curious to hear what others think about this.

Best,
Seb

Erdal Karaca

Jun 5, 2022, 4:07:35 AM
to bids-di...@googlegroups.com
Hi Seb,
Great to hear that it worked! For security reasons, the browser cannot access the local file system directly, but pyodide has its own virtual file system. You would need to "download" the dataset into that virtual file system, which is not practical in most cases because BIDS datasets can be quite large.
In theory, if there were a wrapper on the Python side to interact with the webkitdirectory API, ancpBIDS could be refactored to use that instead of OS file system access (in the browser). That sounds like a lot of overhead, though, as it would require making file system access switchable.

Best regards,
Erdal


Sebastian Urchs

Jun 5, 2022, 3:25:15 PM
to bids-discussion
Hi Erdal,

pyodide has the JsProxy class that lets you directly expose the webkitdirectory file objects to Python. This gets you access to the webkitRelativePath attribute, i.e. the relative file path. That's just the path though. I put a toy example in the app: https://voluble-kleicha-ccad41.netlify.app/. Agreed that downloading to the virtual FS isn't a good idea. I don't have any experience with this, so I don't know whether there would be a way for the user to "mount" an isolated filesystem into the browser FS (there is a proposed File System Access API, but it's not supported in Firefox, and I would trust their judgement that it's not a good idea).
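
A rough sketch of the Python side of this, assuming the selected files arrive as an iterable of objects that expose a `webkitRelativePath` attribute (as a proxied FileList from an `<input type="file" webkitdirectory>` element would). The function name `relative_paths` is made up for illustration, and `SimpleNamespace` objects stand in for browser File objects so the snippet also runs outside pyodide.

```python
from types import SimpleNamespace

def relative_paths(file_list):
    """Collect the relative path of every selected file.

    `file_list` is assumed to be iterable and to yield objects with a
    `webkitRelativePath` attribute; no file content is read.
    """
    return sorted(f.webkitRelativePath for f in file_list)

# Stand-ins for browser File objects (assumption for testing outside the browser).
fake_files = [
    SimpleNamespace(webkitRelativePath="ds005/sub-01/anat/sub-01_T1w.nii.gz"),
    SimpleNamespace(webkitRelativePath="ds005/dataset_description.json"),
]
paths = relative_paths(fake_files)
print(paths)
```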

On the JS side, you would use a FileReader to try and get the contents of this path - not sure if that could be done / made available on the python side too. But there should be a sizable number of queries that could be answered purely on the relative file path, without having to open/read any file content. Not sure if this would be in the scope of ancp_bids / pybids to support. It would be very nice to have a single reference implementation of the BIDS schema that can be reused also in such a more limited way.

If the pyodide path turns out to be too tricky (and pyscript doesn't make things easier either), then maybe there is some other way to reuse something like https://github.com/bids-standard/bids-schema so we don't have to do another hard-coded implementation. The reason we are interested in getting these very basic queries done in the browser is to give users a way to summarize their local BIDS dataset metadata into a jsonld (or similar) file without installing anything locally. We would then use these jsonld files to run cross-dataset queries.

If someone else is working on this at the moment or would be interested in having something like this, I'd be happy to chat about it!

Best,
Seb

Sebastian Urchs

Jun 6, 2022, 1:37:38 PM
to bids-di...@googlegroups.com
Quick update: I had a chat with Stephan Heunis today from datalad who is working on something conceptually similar for datalad: extract BIDS metadata on a cloned but not "gotten" datalad dataset. In that state, you don't have access to the file contents either, just a bunch of symlinks. And he successfully ran pybids on those data to get some dataset level metadata (I think some files actually have to be created, like dataset_description.json).

So I tried this out with ancp_bids, and indeed it works with a purely (dead) symlink BIDS dataset (i.e. no file contents at all). Because we can easily get a list of file paths via the webkitdirectory API, I thought I'd just go and touch all of these paths in the browser FS and see what happens.

Happy to report that this actually gives me BIDS querying capability via ancp_bids, running inside pyodide, inside the browser. Check it out here: https://voluble-kleicha-ccad41.netlify.app/ (it just runs BIDSLayout.get_entities()).
Very excited. Thanks Erdal and Yarik for the pointers!
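
The "touch every path" trick can be sketched as follows. This is an illustration only, not the code from the app: `touch_tree` is a hypothetical helper, and a temporary directory stands in for a folder in pyodide's virtual file system.

```python
import tempfile
from pathlib import Path

def touch_tree(rel_paths, root):
    """Create an empty placeholder file for every relative path under `root`."""
    root = Path(root)
    for rel in rel_paths:
        target = root / rel
        # Make sure the sub-/ses-/datatype folders exist before touching the file.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    return root

rel_paths = [
    "ds005/dataset_description.json",
    "ds005/sub-01/func/sub-01_task-mixedgamblestask_run-01_bold.nii.gz",
]
root = touch_tree(rel_paths, tempfile.mkdtemp())
created = sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
print(created)
```

With the empty tree in place, the layout could then be built on `root / "ds005"` in the same way as in the ancpBIDS snippet earlier in the thread.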

I think in this state we can probably already get most of the information that we'd like to have from a BIDS dataset, bundle it all into a json file and give it back for download to the user. Would be interested to hear if there are other things this could be used for / expanded to. Maybe this is more of a conversation for neurostars / the hackathon though.
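
As a sketch of what such a downloadable summary might look like, the snippet below groups paths by subject and serializes the result with the standard `json` module. The structure (keys `sessions` and `datatypes`) is an assumption made for illustration; it is plain JSON, not the JSON-LD model discussed above.

```python
import json
from collections import defaultdict

def summarize(rel_paths):
    """Map each subject folder to the session and datatype folders seen for it."""
    summary = defaultdict(lambda: {"sessions": set(), "datatypes": set()})
    for rel in rel_paths:
        folders = rel.split("/")[:-1]  # drop the file name
        subject = next((f for f in folders if f.startswith("sub-")), None)
        if subject is None:
            continue  # dataset-level file such as participants.tsv
        entry = summary[subject]
        entry["sessions"].update(f for f in folders if f.startswith("ses-"))
        last = folders[-1]
        if not last.startswith(("sub-", "ses-")):
            entry["datatypes"].add(last)
    # Sets are not JSON serializable, so convert them to sorted lists.
    return {
        sub: {key: sorted(values) for key, values in entry.items()}
        for sub, entry in sorted(summary.items())
    }

paths = [
    "participants.tsv",
    "sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz",
    "sub-01/ses-01/func/sub-01_ses-01_task-rest_bold.nii.gz",
    "sub-02/ses-01/anat/sub-02_ses-01_T1w.nii.gz",
]
report = json.dumps(summarize(paths), indent=2)
print(report)
```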

Best,
Seb


Melissa Kline Struhl

Jun 7, 2022, 11:00:49 AM
to bids-di...@googlegroups.com
I'm working on a data standard for behavioral studies that uses Schema.org/Dataset JSON-LD for the metadata object, and one of the use cases is inserting (or extracting) task/response/participant information into a BIDS directory from a separate, standalone directory from a behavioral task. Since this sounds like it would create metadata similar to what this tool produces, I'd definitely be interested in making sure Psych-DS generates JSON-LD that's useful for the same purposes you're describing. Is there somewhere I can learn more about the kinds of cross-dataset queries you're interested in?

Remi Gau

Jun 9, 2022, 4:15:29 AM
to bids-discussion
Fairly low hanging fruit that could go in there.

- run pybids reports to give the user a human-readable "methods section like" version of the dataset

I have had it on the back burner for ages to improve this pybids functionality. 

- generate a "diagnostic figure" to see how many files there are per subject

Recently added this to bids matlab and it could also be a pybids functionality. 
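
A text-only sketch of the "files per subject" idea, assuming only a list of relative paths is available. It is not the bids-matlab or pybids implementation; `files_per_subject` and `text_chart` are hypothetical names, and a real figure would presumably use a plotting library instead of `#` bars.

```python
from collections import Counter

def files_per_subject(rel_paths):
    """Count files under each top-most sub-* folder found in the path."""
    counts = Counter()
    for rel in rel_paths:
        for folder in rel.split("/")[:-1]:  # ignore the file name itself
            if folder.startswith("sub-"):
                counts[folder] += 1
                break
    return counts

def text_chart(counts):
    """Render one bar of '#' characters per subject."""
    return "\n".join(f"{sub} {'#' * n} ({n})" for sub, n in sorted(counts.items()))

paths = [
    "sub-01/anat/sub-01_T1w.nii.gz",
    "sub-01/func/sub-01_task-rest_bold.nii.gz",
    "sub-02/anat/sub-02_T1w.nii.gz",
    "dataset_description.json",
]
counts = files_per_subject(paths)
chart = text_chart(counts)
print(chart)
```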

Happy to discuss and hack on some of this. 



Alejandro De La Vega

Jun 9, 2022, 5:36:42 PM
to bids-discussion
Hi all,
Sorry I'm late to this discussion. Ross Blair from Poldrack Lab did something similar to what Sebastian did.

He used pyodide to run ancp_bids in the browser, and if I recall correctly he was able to index a local dataset... and perhaps even read event files (which was our end goal-- in order to support BIDS StatsModel generation).

Unfortunately, Ross is out on vacation for a few weeks, and it was just a quick prototype he didn't push anywhere, but I thought I'd mention his name, and that it would be good to loop him into this discussion.

Seems like it would be great to come up with a generalized solution to using ancp_bids in the browser (e.g. maybe it can be written as an npm package or vue/react component?)

Obviously, given unlimited resources a bids-lite.js library would be ideal and have better performance, but this should serve us well in the absence.

Best,
Alejandro

Sebastian Urchs

Jun 9, 2022, 7:25:34 PM
to bids-di...@googlegroups.com
Hi Melissa,

thanks for the link, would love to hear more about your psych-ds project. We're working on a project to represent a very limited set of imaging and demographics metadata in an RDF data model and then query across these in a graph store. Our project is called https://github.com/neurobagel, we're pretty early with development but you're very welcome to check it out and even more so to have a chat with us. Our main focus is making the process of annotating, harmonizing and searching the data as easy as possible, so we're currently focusing a lot on frontend tools. That's why it'd be nice to also be able to parse the BIDS metadata we need directly from inside the browser.

Would be happy to hear more about what your project goals are and if there is some overlap!

Best,
Seb

Sebastian Urchs

Jun 10, 2022, 7:32:41 AM
to bids-di...@googlegroups.com
Hey everyone,

Sorry as well for the late reply, and thanks for all the pointers!

> Sorry I'm late to this discussion. Ross Blair from Poldrack Lab did something similar to what Sebastian did.
>
> He used pyodide to run ancp_bids in the browser, and if I recall correctly he was able to index a local dataset... and perhaps even read event files (which was our end goal-- in order to support BIDS StatsModel generation).
> Unfortunately, Ross is out on vacation for a few weeks, and it was just a quick prototype he didn't push anywhere, but I thought I'd mention his name, and that it would be good to loop him into this discussion.

Absolutely, thanks for mentioning him.

> Seems like it would be great to come up with a generalized solution to using ancp_bids in the browser (e.g. maybe it can be written as an npm package or vue/react component?)

Fully agreed. Something simple and reusable would be great. We can use the hackathon to figure out which makes the most sense.

On Thursday, June 9, 2022 at 3:15:29 AM UTC-5 remi...@gmail.com wrote:
> Fairly low hanging fruit that could go in there.
>
> - run pybids reports to give the user a human-readable "methods section like" version of the dataset

I like that. I think that's very close to what we need for our project, only "machine readable". It would be good to see which pybids queries need to run for this to be possible.

> I have had it on the back burner for ages to improve this pybids functionality.
>
> - generate a "diagnostic figure" to see how many files there are per subject

That's a cool idea. Maybe this could be one of the output goals for a quick prototype: run the browser-bids thing, hand over the results to a figure and show them.

> Happy to discuss and hack on some of this.

Yay, me too! I'll pitch this as a hackathon project!

Best,
Seb

Sebastian Urchs

Jun 10, 2022, 5:45:05 PM
to bids-di...@googlegroups.com
Hey everyone,

I have put a project up here: https://github.com/ohbm/hackathon2022/issues/65 Please let me know if you'd like to have your name attached to the project, particularly if you're not coming to Glasgow in person (and/or could therefore handle another hub during the hackathon).

Looking forward to chatting soon!
Best,
Seb

Erdal Karaca

Jun 11, 2022, 7:10:55 AM
to bids-di...@googlegroups.com
It seems Python has a virtual file system layer [1] that could be used to mount a webkitdirectory (using JsProxy) into a virtual space for use within a Python program.
That sounds doable without touching the low-level layer of ancpBIDS.
It would mean providing a directory path like

BIDSLayout('bids:/dataset-dir-123')

instead of

BIDSLayout('/mnt/xyz/dataset-dir-123')  

