Implementing Croissant for The Cancer Imaging Archive


Justin Kirby

Aug 1, 2025, 9:16:46 AM
to croissant-users
Hi all,

I am still getting my feet wet with Croissant, but I have already run into some problems trying to use the Croissant Editor on Hugging Face (various error messages when trying to load different dataset examples from our system), and I also have some general questions about how to represent our datasets. Is there anyone I can meet with to discuss our general use cases and get some preliminary guidance on best practices? If it helps incentivize anyone, I will add that getting this in place would probably help spur adoption among several additional biomedical databases at the National Cancer Institute that are managed by my colleagues :-D

We've already got schema.org metadata in place (https://www.cancerimagingarchive.net/) to ensure discoverability via Google Datasets, etc., and are excited to take the next step with Croissant to better support the AI community. I am particularly interested in trying to merge what we've already done on our Cancer Imaging Checklist for Data Sharing (CICADAS, https://www.cancerimagingarchive.net/cancer-imaging-checklist-for-data-sharing-cicadas/) with the ResponsibleAI extension in Croissant.

Best,
Justin

Omar Benjelloun

Aug 4, 2025, 9:46:09 AM
to Justin Kirby, croissant-users
Hi Justin,

I'd be happy to meet to help you resolve issues with Croissant. Could you compile a list of the issues you have run into and share it via a GitHub issue or a Google Doc?

Best,
Omar


Justin Kirby

Aug 6, 2025, 9:12:40 AM
to croissant-users, benj...@google.com, Justin Kirby
Hi Omar,

Thanks so much for the offer. I've started a Google Doc here. It starts with the issues I had trying to use the Croissant Editor Streamlit app. When I couldn't figure that out, I looked for other people's Croissant files of a similar type to use as examples for my use case (e.g. describing the relationship between images and their segmentation files), but I had trouble finding any functional ones on HF/Kaggle/OpenML. The doc concludes with a couple of other high-level questions.

Justin