What determines "Metadata Source"?


fa...@kb.dk

Jul 21, 2021, 7:46:28 AM
to Dataverse Users Community

Hi everyone,

On the left-hand side of the main page, there is a browse facet called "Metadata Source", which cannot be edited or removed from the admin interface. I'm wondering how it works and what it is used for.

What I have observed so far and what confuses me is:
By default, "Metadata Source" equals the name of the root Dataverse of the installation.
When changing the name of the root Dataverse:
- The "Metadata Source" changes for all datasets published in the root Dataverse
- It does not change for any dataset published in a sub-Dataverse
- It does not change for any sub-Dataverse published in the root Dataverse

This seems inconsistent to me, since I would expect
- either changes to appear in the same way for all datasets and sub-Dataverses
- or that "Metadata Source" would equal the name of the sub-Dataverse that the dataset is published in

For me, the second option would be most useful, since it would facilitate an easy way to browse datasets by sub-Dataverse, but maybe I just misunderstand the concept?
I guess that its main use is related to importing metadata from other installations (and that one should not change the name of the root Dataverse anyway)?

Best regards,
Falco

Sherry Lake

Jul 21, 2021, 8:32:38 AM
to dataverse...@googlegroups.com
Hello Falco,

This is not an official answer, but at the University of Virginia, our Dataverse repository shows two sources "University of Virginia Dataverse" (files uploaded by users of our repository) and "Harvested" (dataset metadata that we have harvested from Harvard's repository via OAI-PMH).

So I think "Source" is where the metadata comes from (repository-based, not sub-dataverse). Harvard's repository also shows two "sources": https://dataverse.harvard.edu/dataverse/harvard
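
For anyone who wants to check the two sources programmatically rather than through the facet, here is a minimal sketch that builds a Search API URL filtered by Metadata Source. The Solr field name `metadataSource` is my assumption (it is what appears in the `fq` parameter of the browser URL when clicking the facet, as far as I can tell), and the host name is hypothetical. The optional `subtree` parameter narrows the search to one collection by its alias, which may help with browsing by sub-Dataverse.

```python
from urllib.parse import urlencode


def metadata_source_search_url(base_url, source, subtree=None):
    """Build a Search API URL that filters datasets by Metadata Source.

    `source` is the facet value as displayed, e.g. "Harvested" or the
    name of the root collection. The Solr field name `metadataSource`
    is an assumption; verify it against your own installation.
    """
    params = {
        "q": "*",
        "type": "dataset",
        "fq": f'metadataSource:"{source}"',
    }
    if subtree is not None:
        # Optional: narrow the search to one collection by its alias.
        params["subtree"] = subtree
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Hypothetical host name, for illustration only.
url = metadata_source_search_url("https://dataverse.example.edu", "Harvested")
print(url)
```

The function only builds the URL; fetching it with a browser or any HTTP client should return the matching datasets as JSON, provided the field name holds for your version.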

--
Sherry Lake


James Myers

Jul 21, 2021, 8:32:42 AM
to dataverse...@googlegroups.com

Falco,

Others may know more but with a quick look in the code:

- It looks like this facet only displays if you have more than one source, i.e. you are harvesting.
- It should always be either the root Dataverse collection name or ‘Harvested’. However, the root Dataverse collection name is cached in Solr, so if you change the root name, you’ll need to reindex to get the update for sub-collections. (This could be automated, but changing the root name is hopefully rare. You could add an issue if you think this is needed.)

I didn’t try to find info on the original intent or whether using sub-Dataverse names would be consistent with the reason this was created.

- Jim


Philip Durbin

Jul 21, 2021, 10:01:12 AM
to dataverse...@googlegroups.com
You can find more details about the original thinking at https://github.com/IQSS/dataverse/issues/3203, but just to repeat what Sherry and Jim have said:

- The intention of "Metadata Source" is to have a way to facet on harvested (or non-harvested).
- "Metadata Source" should not appear if you haven't harvested anything.
- "Metadata Source" always reflects the root dataverse collection, never any sub-dataverse collection.
- It sounds like if you rename your root dataverse collection, you should reindex. (Maybe this should be documented, at least.)
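
Until this is documented, here is a rough sketch of how the reindex could be started after a rename. It assumes a default installation where the admin API is reachable only from the server itself at `http://localhost:8080`, and it uses the `/api/admin/index` endpoint as I understand it from the "Solr Search Index" section of the Admin Guide; please check the guide for your version before relying on it.

```python
from urllib.request import urlopen


def reindex_url(base_url="http://localhost:8080"):
    """Return the admin endpoint that starts a full asynchronous reindex."""
    return f"{base_url.rstrip('/')}/api/admin/index"


def start_reindex(base_url="http://localhost:8080"):
    """Send the request and return the raw response body as text."""
    with urlopen(reindex_url(base_url)) as response:
        return response.read().decode("utf-8")


# Run this on the server itself after renaming the root collection:
# print(start_reindex())
```

The reindex runs in the background, so on a large installation the facet may keep showing the old name for a while after the call returns.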

I hope this helps,

Phil



