ExcludeEmailFromExport Not Working V4.12 for Metadata export


Sherry Lake

Jun 13, 2019, 4:45:10 PM
to Dataverse Users Community
According to the titles of GitHub issues #5191 and #5185, this sounds like it is (was) fixed.

However, on exports of (raw) JSON and DDI, the datasetContact email is still exported.

On my test server (v4.12), I applied this setting:

curl -X PUT -d true http://localhost:8080/api/admin/settings/:ExcludeEmailFromExport

{"status":"OK","data":{":ExcludeEmailFromExport":"true"}}



When I click "Export Metadata - JSON" while not logged in, it generates this:


{"id":233420,"identifier":"OOI78A",
"persistentUrl":"https://doi.org/10.5072/FK2/OOI78A",
"protocol":"doi","authority":"10.5072/FK2","publisher":"University of Virginia Dataverse",
"publicationDate":"2019-01-10","datasetVersion":{"id":22808,"versionNumber":1,
"versionMinorNumber":0,"versionState":"RELEASED","productionDate":"Production Date",
"lastUpdateTime":"2019-01-10T14:57:54Z","releaseTime":"2019-01-10T14:57:54Z","createTime":
"2019-01-10T14:56:52Z","license":"CC0","termsOfUse":"CC0 Waiver","metadataBlocks":
{"citation":{"displayName":"Citation Metadata","fields":[{"typeName":"title","multiple":false,
"typeClass":"primitive","value":"Added by Admin for Cultural Heritage"},{"typeName":"author","multiple":true,
"typeClass":"compound","value":[{"authorName":{"typeName":"authorName","multiple":false,
"typeClass":"primitive","value":"Admin, Dataverse"},
"authorAffiliation":{"typeName":"authorAffiliation","multiple":false,"typeClass":"primitive","value":"University of Virginia Library"}}]},
{"typeName":"datasetContact","multiple":true,"typeClass":"compound",
"value":[{"datasetContactName":{"typeName":"datasetContactName","multiple":false,"typeClass":"primitive","value":"Admin, Dataverse"},
"datasetContactAffiliation":{"typeName":"datasetContactAffiliation","multiple":false,"typeClass":"primitive","value":"University of Virginia Library"},
"datasetContactEmail":{"typeName":"datasetContactEmail","multiple":false,"typeClass":"primitive","value":"shl...@virginia.edu"}}]},
{"typeName":"dsDescription","multiple":true,"typeClass":"compound",
"value":[{"dsDescriptionValue":{"typeName":"dsDescriptionValue","multiple":false,
"typeClass":"primitive","value":"Added by Admin for Cultural Heritage"}}]},
{"typeName":"subject","multiple":true,"typeClass":"controlledVocabulary",
"value":["Arts and Humanities (Ex: English, History, Foreign, Language)"]},
{"typeName":"productionDate","multiple":false,"typeClass":"primitive","value":"2018"},
{"typeName":"depositor","multiple":false,"typeClass":"primitive","value":"Admin, Dataverse"},
{"typeName":"dateOfDeposit","multiple":false,"typeClass":"primitive","value":"2019-01-10"}]}},"files":[],
"citation":"Admin, Dataverse, 2019, \"Added by Admin for Cultural Heritage\", https://doi.org/10.5072/FK2/OOI78A, University of Virginia Dataverse, V1"}}



And the DDI Export (from the Export Metadata button), has this line:

<contact affiliation="University of Virginia Library" email="shl...@virginia.edu">Admin, Dataverse</contact>
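A quick way to check saved export files for leftover contact emails is sketched below. It assumes the email appears either as a "datasetContactEmail" field (JSON export) or as an email="..." attribute (DDI export), as in the output quoted above. The function name `exports_with_email` is made up for this sketch.

```shell
#!/usr/bin/env bash

# Print the names of the given export files that still contain a contact email.
# Assumption: emails appear as a "datasetContactEmail" field (JSON export)
# or as a non-empty email="..." attribute (DDI export).
# Returns non-zero when no file matches.
exports_with_email() {
  grep -l -E 'datasetContactEmail|email="[^"]+"' "$@"
}
```

For example, `exports_with_email export.json export.xml` would list whichever of the two saved files still carries an email.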


Philip Durbin

Jun 13, 2019, 5:00:06 PM
to dataverse...@googlegroups.com
Hi Sherry,

Those "export" files are cached on disk. Is it possible that you're looking at an export from a dataset you created before you upgraded to Dataverse 4.10[1] or newer? If you create a new dataset right now, is the datasetContact email included or excluded? Is the new-ish ":ExcludeEmailFromExport" setting being respected?

Thanks,

Phil

1. The release notes for Dataverse 4.10 at https://github.com/IQSS/dataverse/releases mention "Run ReExportall to generate JSON-LD exports in the new format added in 4.10", but they should probably also mention that the other reason to run "reExportAll" is to pick up the ":ExcludeEmailFromExport" change you're talking about. Internally, we talked about these release notes a bit at https://help.hmdc.harvard.edu/Ticket/Display.html?id=271155

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/2af64933-a7a3-4103-a734-f4142ff2b8a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Sherry Lake

Jun 14, 2019, 11:39:55 AM
to dataverse...@googlegroups.com
Thanks, Phil.

Confirmed: Metadata exports on NEW datasets do not contain the contact email.

So, how do I remove the cached versions of exports (assuming you mean "cached on disk" on the Dataverse server)?

If I have "ExcludeEmailFromExport" set, I would assume that the email would not be exported on ANY datasets (old ones and newly created ones).

--
Sherry

Philip Durbin

Jun 14, 2019, 12:03:23 PM
to dataverse...@googlegroups.com
Oh good, I'm glad :ExcludeEmailFromExport is working after all. Thanks for confirming!

The "export" files are stored per dataset in a directory that might look something like /usr/local/glassfish4/glassfish/domains/domain1/files/10.5072/FK2/U33D6V

They have names like this:

- export_Datacite.cached
- export_dataverse_json.cached
- export_dcterms.cached
- export_ddi.cached
- export_oai_datacite.cached
- export_oai_dc.cached
- export_oai_ddi.cached
- export_OAI_ORE.cached
- export_schema.org.cached


You *could* delete these files by hand (they get regenerated on demand) but there are also APIs and https://github.com/IQSS/dataverse/releases/tag/v4.10 mentions "reExportAll" which is documented at http://guides.dataverse.org/en/4.14/admin/metadataexport.html#batch-exports-through-the-api
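A minimal sketch of the by-hand route follows, assuming the cached exports live under a local files directory like the example path above (the location may differ on your installation). `FILES_DIR`, `list_cached_exports`, and `delete_cached_exports` are names made up for this sketch. Only files matching export_*.cached are touched.

```shell
#!/usr/bin/env bash

# Assumption: this is the Dataverse files directory. The default is the
# example location from this thread and may not match your installation.
FILES_DIR="${FILES_DIR:-/usr/local/glassfish4/glassfish/domains/domain1/files}"

# List every cached export file under a directory (default: FILES_DIR).
list_cached_exports() {
  find "${1:-$FILES_DIR}" -type f -name 'export_*.cached'
}

# Delete the cached export files under a directory (default: FILES_DIR).
# Other files in the dataset directories are left alone.
delete_cached_exports() {
  find "${1:-$FILES_DIR}" -type f -name 'export_*.cached' -exec rm -f {} +
}
```

Running `list_cached_exports` first and reviewing the output before calling `delete_cached_exports` is the safer order.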

On a test server I just modified one of the files above manually on a dataset and ran the following command:


The output is a little weird ({"status":"WORKFLOW_IN_PROGRESS"}) but the files were successfully re-exported, overwriting the manual edit I made (I had changed an author name to "foobar").
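For reference, here is a minimal sketch of calling the batch re-export endpoint described in the admin guide linked above. It assumes the admin API listens on localhost:8080, as in the first message of this thread. `SERVER`, `DRY_RUN`, and `reexport_all` are names made up for this sketch, and it is not necessarily the exact command that was run here.

```shell
#!/usr/bin/env bash

# Assumption: base URL of the Dataverse installation, with the admin API
# reachable from this machine (as in the curl example earlier in the thread).
SERVER="${SERVER:-http://localhost:8080}"

# Ask Dataverse to re-export the metadata of all published datasets,
# overwriting the existing cached export files.
# Set DRY_RUN=1 to print the request instead of sending it.
reexport_all() {
  local url="$SERVER/api/admin/metadata/reExportAll"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "curl $url"
  else
    curl "$url"
  fi
}
```

Calling `DRY_RUN=1 reexport_all` shows the request without touching the server.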

We don't really have triggers when you update a database setting, but I can understand why you'd expect that setting :ExcludeEmailFromExport to true would do some of the work above for you. Please feel free to open a GitHub issue about this.

I hope this helps,

Phil




Sherry Lake

Jun 17, 2019, 12:49:29 PM
to Dataverse Users Community
Hi Phil,

Thanks for the reExportAll command. I ran that on our test system, and now none of our exported metadata contains emails.

Thanks. See you on Wednesday.
Sherry

Philip Durbin

Jun 18, 2019, 7:40:20 AM
to dataverse...@googlegroups.com
Great! And thanks for opening https://github.com/IQSS/dataverse/issues/5952 about the :ExcludeEmailFromExport setting.
