questions from potential adopter

gja...@ucsb.edu

Apr 24, 2017, 11:55:02 AM
to Dataverse Users Community
Hello, I'm the director of the Data Curation Program at UC Santa Barbara.  I've done some preliminary investigation into using Dataverse as a platform for our faculty research data repository, and have a few questions.

1. Geospatial data.  I see support for spatial metadata, e.g., FGDC (presumably that means ISO 19115 now?).  Is spatial (i.e., map-based) search supported?  I'm seeing a connector option to Harvard's Worldmap.  I'm not seeing spatial search in Worldmap (perhaps I missed it), but in any case, this area appears to be in the middle of some redevelopment.  What's the current status and roadmap here?  Will Dataverse support spatial search natively?

2. Administrative controls.  Are there options that would allow library administrators/curators to exercise any administrative control over deposited files and datasets?  A kind of superuser or sudo ability?  Are there workflow options that would allow curators to vet datasets and metadata before being published?

3. Limits.  Is there a practical upper bound on the number of files in a dataset?  Practical upper bound on individual size?  (Note the word "practical" here.)

4. Community.  Is the code still being developed exclusively by Harvard?  Or are other institutions contributing?

Thanks much!
-Greg

Sebastian Karcher

Apr 24, 2017, 12:07:34 PM
to dataverse...@googlegroups.com
Hi Greg,

someone else will have to answer 1.). For the rest:

2.) Absolutely -- you can assign various levels of control to different users. You can even do so for subsets of the repository (i.e. in different "dataverses" -- think collections). A good way to test this is to set up your own dataverse on demo.dataverse.org and then play with roles and controls.

3.) I talked to Gustavo, the lead dev, about this. The largest number of files in a dataset is currently ~2000. This isn't ideal in how it's displayed -- e.g. you only see something like 20 previews and then you need to scroll down and wait -- but with the ability to search files within a dataset and to download all files in a dataset, even that is feasible. There aren't any noticeable performance issues with that number of files in a single dataset. Sonia, the lead curator at IQSS, does advise against this for usability, though.

4.) The bulk of the software is still developed by the team at IQSS, but yes, there are definitely outside contributions (and given the rapid adoption of DV, some of it by larger institutions worldwide, they are likely to increase, imo). The core developers are certainly extremely welcoming of outside contributions. We just had our first pull request accepted and that went very smoothly.

Best,
Sebastian

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/b115e00f-c173-41a1-b06b-83ec72b9dc56%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Sebastian Karcher, PhD
www.sebastiankarcher.com

julian...@g.harvard.edu

Apr 24, 2017, 6:39:14 PM
to Dataverse Users Community
Hi Greg,

Thanks for your questions. I'd like to take a stab at answering the one about geospatial search:

1. Geospatial data.  I see support for spatial metadata, e.g., FGDC (presumably that means ISO 19115 now?).  Is spatial (i.e., map-based) search supported?  I'm seeing a connector option to Harvard's Worldmap.  I'm not seeing spatial search in Worldmap (perhaps I missed it), but in any case, this area appears to be in the middle of some redevelopment.  What's the current status and roadmap here?  Will Dataverse support spatial search natively?

The Dataverse community has plans to develop better support for spatial search in Dataverse, and the WorldMap integration is helping a lot of that work progress. More support for it on the Dataverse side hasn't been prioritized and isn't on the roadmap yet, but feedback is always welcome. Below I've included what is currently supported, along with some related development tickets where a lot of this work is tracked and discussed.

Right now, you can search for datasets spatially by entering Dataverse's geospatial metadata fields (listed in the geospatial metadata spreadsheet) directly in the search box, e.g. westLongitude:83.4 AND eastLongitude:84.38... or state:California. These fields aren't in the advanced search form (raised in GitHub issue #2353), and their names aren't really published outside of that spreadsheet. A separate issue covers changing the way Dataverse currently stores metadata values so that datasets can be searched by coordinate range (see #370 and the related issue about sky coverage metadata, #3526).
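
For anyone who wants to generate such fielded queries rather than type them by hand, here is a minimal Python sketch. The helper name fielded_query is hypothetical, and wrapping multi-word values in double quotes assumes the search box accepts standard Lucene/Solr phrase syntax, which this thread does not confirm. The field names westLongitude and state are the ones quoted above; "New York" is only an illustrative value.

```python
def fielded_query(terms):
    """Join field/value pairs into one search-box query string.

    `terms` maps a metadata field name to the value to match.
    Values containing whitespace are wrapped in double quotes so they
    are read as a phrase (assumed Lucene/Solr-style syntax).
    """
    parts = []
    for field, value in terms.items():
        text = str(value)
        if any(ch.isspace() for ch in text):
            # Escape embedded quotes, then quote the whole phrase.
            text = '"' + text.replace('"', '\\"') + '"'
        parts.append(f"{field}:{text}")
    # Terms are combined with AND, as in the example in this post.
    return " AND ".join(parts)


# Field names and the 83.4 / California values come from this thread.
print(fielded_query({"westLongitude": "83.4", "state": "California"}))
# Hypothetical multi-word value, to show the quoting rule.
print(fielded_query({"state": "New York"}))
```

The resulting string can be pasted into the search box as-is. Note that this only matches stored values exactly; it does not perform the coordinate-range search discussed in #370.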

Also, geospatial metadata is automatically extracted from geospatial files that are mapped on Harvard's WorldMap, and there are plans to expose this metadata on the Dataverse side (see #3251) to enhance spatial searching within Dataverse.

Regarding spatial search in Harvard WorldMap, a colleague working with the Harvard WorldMap team let me know that it's possible by creating a map and searching for layers to add to that map. That interface lets you search for maps to use as a layer by keyword and by zooming and panning on the results (they have a robust guide and ways to contact the WorldMap team here: http://worldmap.harvard.edu/static/docs/WorldMap_Help_en.pdf).

I hope this helps! Feel free to follow up for clarification, and we'd love to hear your suggestions.

Best,
Julian Gautier
Product Research Specialist, IQSS

Philip Durbin

Apr 24, 2017, 10:09:54 PM
to dataverse...@googlegroups.com
Thanks, Julian, great summary. Basically, Dataverse has fairly decent support for entering geospatial metadata but it falls a bit short on the search side. Almost all metadata fields are indexed (except email addresses, for privacy) and geospatial metadata fields are no exception. So those geospatial fields mentioned in that spreadsheet are indexed but there's no GUI dedicated to searching geospatial stuff. Greg, if you want to create a GitHub issue that describes some of your use cases, please feel free!

Thanks,

Phil

p.s. Here's another related issue with regard to search fields: "What are the allowed search fields for the Search API q parameter?" - https://github.com/IQSS/dataverse/issues/2558. Also, this one is closed but the majority of our text fields are indexed as "text_en" rather than numbers and such. It would be nice to spend time on issues like this some day.
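
To make the q parameter concrete, here is a hedged Python sketch that builds a Search API request URL without sending it. The /api/search path and the type parameter are assumptions based on the public Dataverse API guide, not something stated in this thread, and search_url is a hypothetical helper name. demo.dataverse.org is the demo site mentioned earlier in the thread.

```python
from urllib.parse import urlencode


def search_url(base_url, query, item_type="dataset"):
    """Build a Search API URL for a fielded query.

    Assumptions: the endpoint lives at <base_url>/api/search and accepts
    `q` (the query string) and `type` (what kind of item to return).
    """
    params = {"q": query, "type": item_type}
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return base_url.rstrip("/") + "/api/search?" + urlencode(params)


# state:California is the example field/value pair from this thread.
print(search_url("https://demo.dataverse.org", "state:California"))
```

Only the URL is constructed here; fetching it and reading the response is left out because the response format is not described in this thread.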

gja...@ucsb.edu

Apr 26, 2017, 12:27:41 PM
to Dataverse Users Community, philip...@harvard.edu
Thanks all for the responses.  Sounds like a basic spatial search capability is within reach.  -Greg