virus checking and encryption for Dataverse upload/ingest

Janet McDougall - Australian Data Archive

Mar 18, 2019, 11:42:00 PM
to Dataverse Users Community
Hi all,

I have been unable to ascertain whether Dataverse has any built-in virus checking, or built-in encryption in transit, as part of the upload/ingest process.

Does anyone have any solutions, or has anyone had issues with viruses being uploaded to Dataverse?

thanks
Janet

Pete Meyer

Mar 19, 2019, 10:25:15 AM
to Dataverse Users Community
Hi Janet,

I'm relatively sure there's no virus checking built into the Dataverse application, although there shouldn't be a problem with pointing a scanner at the files directory (or the object store, if there are scanners that can handle those).

Encryption in transit is also not built in to the application, but the usual production configuration has the application server only accessible over https, in which case uploads and downloads are encrypted in transit.  This is installation dependent though: it could depend on the web server and application server configuration, and on whether "transit" covers both internal and external network connections or just external ones.
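
One way to confirm the "only accessible over https" setup from the outside is to check that a plain http request is answered with a redirect to https. The sketch below only classifies a response (status code plus Location header). The function name and the example hostname are hypothetical, and fetching the response is left to whatever HTTP client the installation already uses.

```python
from urllib.parse import urlparse

# HTTP status codes that indicate a redirect.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def forces_https(status, location):
    """Return True if an http:// request was answered with a redirect to https://."""
    if status not in REDIRECT_STATUSES or not location:
        return False
    return urlparse(location).scheme == "https"


if __name__ == "__main__":
    # Hypothetical responses to a request for http://dataverse.example.edu/
    print(forces_https(301, "https://dataverse.example.edu/"))
    print(forces_https(200, None))
```

This only covers the external connection. Whether traffic between the web server and the application server is also encrypted has to be checked separately.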

Best,
Pete

Philip Durbin

Mar 19, 2019, 8:20:09 PM
to dataverse...@googlegroups.com
Pete's right that there's no virus checking built in to Dataverse. In my day, in a previous life, we used ClamAV to check for viruses on our storage for our research cluster, but I'm not sure what the cool kids use these days. https://en.wikipedia.org/wiki/Clam_AntiVirus
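
For anyone who wants to try the ClamAV route, here is a minimal sketch of pointing `clamscan` at a files directory from Python. It assumes ClamAV is installed and `clamscan` is on the PATH. The function names and the directory path are hypothetical, and nothing here is part of Dataverse itself.

```python
import subprocess


def build_clamscan_command(files_dir):
    """Build a recursive clamscan invocation that reports only infected files."""
    return ["clamscan", "--recursive", "--infected", "--no-summary", files_dir]


def parse_infected(output):
    """Extract paths from clamscan lines of the form '<path>: <signature> FOUND'."""
    infected = []
    for line in output.splitlines():
        line = line.strip()
        if line.endswith(" FOUND"):
            # Split on the last ': ' so the path is kept whole.
            path, _, _ = line.rpartition(": ")
            infected.append(path)
    return infected


def scan(files_dir):
    """Run clamscan; exit status 0 means clean, 1 means a virus was found."""
    result = subprocess.run(
        build_clamscan_command(files_dir), capture_output=True, text=True
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {result.stderr.strip()}")
    return parse_infected(result.stdout)
```

Calling `scan("/path/to/dataverse/files")` (a placeholder path) from a cron job and alerting on a non-empty result would be one way to wire this into monitoring.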

And yes, always HTTPS in production, please, for encryption in transit. Don't forget Firesheep: https://en.wikipedia.org/wiki/Firesheep

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/5279f1c6-1662-47b0-a155-8d70758e875a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Janet McDougall - Australian Data Archive

Mar 19, 2019, 8:47:47 PM
to Dataverse Users Community
Hi Pete and Phil,
Great, and thanks for the details.  I will discuss with our DevOps and computer centre how best to ensure we are covered, based on your responses.

I was unsure as to how else to discover this information - has there been any interest previously in these questions?  I am wondering if it might be useful to add these details to doco/FAQs for new Dataverse members/installations?

Anyway, thanks again.
Janet

Philip Durbin

Mar 19, 2019, 9:01:37 PM
to dataverse...@googlegroups.com
I think virus scanning falls under "monitoring", which we document at http://guides.dataverse.org/en/4.11/admin/monitoring.html

If you want to make a pull request, the source for that page can be found at https://github.com/IQSS/dataverse/blob/v4.11/doc/sphinx-guides/source/admin/monitoring.rst

I tried to emphasize the need for HTTPS at http://guides.dataverse.org/en/4.11/installation/config.html#forcing-https but you're welcome to add extra emphasis. :)

To be clear, we're open to any changes to the guide. Please just open an issue and make a pull request and we'll be glad to take a look!

I'm sort of making it sound like making a pull request is easy but it can be daunting at first. We're happy to coach anyone through it. :)

Thanks,

Phil

Janet McDougall - Australian Data Archive

Mar 19, 2019, 9:47:27 PM
to Dataverse Users Community
Hi Phil

From a policy/procedural perspective there is possibly a disconnect between what sys admin/devops are aware of, and what archivists/managers understand of technical measures implemented relating to data security.   

From my position I would probably like to see some reference in the User Guide that these issues are covered under the Admin Guide - installation configs. Unless I've missed it?  I looked for questions relating to virus checking and encryption in transit on the community forum, but didn't think to look at the installation config guides, which wouldn't be clear to me anyway without further questions.

I'm happy to make a pull request, but it seems there is sufficient info for administrators in both of your responses.  Let me know what you think.

thanks
Janet

Philip Durbin

Mar 19, 2019, 10:16:49 PM
to dataverse...@googlegroups.com
I don't think you've missed anything. It's up to each of the 40 installations of Dataverse around the world to decide whether or not to enforce encryption in transit (please do!) or virus scanning (why not!), but the generic User Guide makes no assumptions about what an installation of Dataverse has implemented. One option you have is to create your own version of the Dataverse guides, as https://dataverse.lib.virginia.edu and https://dataverse.no have. These installations of Dataverse have built their own versions of the guides, which may contain specifics about their implementation of encryption in transit or virus scanning (I haven't looked). If you want to go down this path of building your own guides for ADA, the config option is called :GuidesBaseUrl and is documented at http://guides.dataverse.org/en/4.11/installation/config.html#guidesbaseurl
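
As a rough illustration of setting :GuidesBaseUrl, the sketch below builds the kind of PUT request the admin settings API expects. The localhost port, the endpoint path, and the example guides URL are assumptions for illustration, so check the linked config page for the exact form in your version. The helper name is hypothetical.

```python
import urllib.request

# Assumption: the admin API is reachable from the server itself on port 8080.
ADMIN_SETTINGS_URL = "http://localhost:8080/api/admin/settings"


def build_setting_request(name, value):
    """Build (but do not send) a PUT request that sets one database setting."""
    return urllib.request.Request(
        f"{ADMIN_SETTINGS_URL}/{name}",
        data=value.encode("utf-8"),
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical location of locally built guides.
    req = build_setting_request(":GuidesBaseUrl", "https://guides.example.edu")
    print(req.get_method(), req.full_url)
    # To apply it on the server: urllib.request.urlopen(req)
```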

Janet McDougall - Australian Data Archive

Mar 19, 2019, 10:35:40 PM
to Dataverse Users Community
Hi Phil

I think I will leave it as it is, as we will maintain data security details in our wiki and Core Trust Seal.  After this discussion I realise I should have just asked our DevOps, and he could have assured me of the standards applied through the web configs etc.  Your and Pete's responses have also been useful, though.

We are looking at self-deposit through Dataverse (coinciding with release of Archivematica) which brought these questions to the surface for me in relation to Dataverse - I had not needed to consider them previously.

thanks again!
Janet 
 

Pete Meyer

Mar 20, 2019, 11:02:57 AM
to Dataverse Users Community
Hi Janet,

Glad to hear you've got the information you need.  

One thing that may be worth mentioning (mainly for the benefit of people with the same questions finding this thread in the future) is that any installation using "real" credentials also really should be running over https, even if that installation doesn't care about having uploads and downloads encrypted.  Dataverse will work over unencrypted http, but it means that user API tokens (and potentially usernames and passwords, depending on authentication provider) are going over the network in readable/interceptable form.  Folks coming from a system administration perspective might assume this to be the case; but it may not be something that archivists usually have to worry about.
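
To make the API token point concrete, here is a small client-side sketch that refuses to attach a token to anything other than an https URL. It assumes the token is sent in the `X-Dataverse-key` header. The function name and the example URLs are hypothetical.

```python
import urllib.request
from urllib.parse import urlparse


def build_api_request(url, api_token):
    """Attach the API token only when the target URL uses https."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"refusing to send an API token over {url!r}: not https")
    return urllib.request.Request(url, headers={"X-Dataverse-key": api_token})


if __name__ == "__main__":
    # Hypothetical installation URL and placeholder token.
    req = build_api_request("https://dataverse.example.edu/api/info/version", "xxxx")
    print(req.full_url)
    try:
        build_api_request("http://dataverse.example.edu/api/info/version", "xxxx")
    except ValueError as err:
        print(err)
```

A guard like this protects only the scripts that use it. The server-side fix is still to make the installation reachable over https only.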

Best,
Pete

Janet McDougall - Australian Data Archive

Mar 21, 2019, 8:50:08 AM
to Dataverse Users Community
Hi Pete

We disseminate a lot of sensitive data, so moving data is always a concern. We are changing to data owners using Dataverse to self-deposit, and it occurred to me that I had not read anything specific in the user guides relating to security for data uploads. As we are changing deposit processes I need to verify security etc.

So yes, once you both pointed to https, monitoring, etc., I read the tech doco. I originally have a systems background, but even so, it never occurred to me to look to the installation guides for the security options. I have become too reliant on user doco!

I've also just started reading some of the sensitive data topics in the recent community meetings. We are certainly interested in following where this is going.

Thanks again,
Janet

Philip Durbin

Mar 21, 2019, 11:30:13 AM
to dataverse...@googlegroups.com
Sure, just to emphasize your point, there is an entire "Securing Your Installation" section in the Installation Guide that I highly recommend to anyone running Dataverse: http://guides.dataverse.org/en/4.11/installation/config.html#securing-your-installation

Feedback and pull requests are certainly welcome to improve that section and any part of the guides.

Thanks!

Phil

Janet McDougall - Australian Data Archive

Mar 21, 2019, 10:21:44 PM
to Dataverse Users Community
Thanks Phil, I will write up our doco based on this.
Janet