AWS & GDPR?


Thomas Pronk

Aug 4, 2020, 2:12:30 PM
to Online Experiments
Hi there!

To what degree would I comply with the GDPR when using AWS?

Background: I'm Dutch (and I love biking). Since the GDPR and the CLOUD Act, Dutch universities have been restricted in the degree to which they can use US servers for their research. Some forbade it altogether; some put restrictions on what kind of data may be collected.

More concretely, I've seen this happen a lot, with restrictions being put on Qualtrics. Privacy Shield or servers in EU countries (Ireland) don't seem to have helped in lifting those restrictions. Hence my question.

Best, Thomas

jkhart...@gmail.com

Aug 4, 2020, 3:38:45 PM
to Online Experiments
Fair enough. I expect AWS has a way of dealing with that. The thing I'd look into is AWS regions (the phrase you'll run into is 'availability zone', but location is set per region): specifically, you can choose where your servers are located.

I will say that the CLI right now automatically assigns you to us-east-1, which is located in Virginia. There is one particular feature that may not work if you aren't in that region. I'm not sure, and it would require testing.
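
For illustration (this isn't specific to how Pushkin does it, and the bucket name is made up), pinning storage to an EU region with the AWS SDK for JavaScript would look roughly like this:

import { S3Client, CreateBucketCommand } from "@aws-sdk/client-s3";

// Sketch only: create a bucket in eu-west-1 (Ireland) rather than relying on
// the default us-east-1. The bucket name is hypothetical.
async function createEuBucket(): Promise<void> {
  const s3 = new S3Client({ region: "eu-west-1" });
  await s3.send(
    new CreateBucketCommand({
      Bucket: "my-experiment-data",
      CreateBucketConfiguration: { LocationConstraint: "eu-west-1" },
    })
  );
}

The same idea applies to whichever services a deployment actually uses; the point is just that the region is something you choose per client/resource rather than something fixed.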

Can you please post an issue on this repo? And is it something you might be willing to look into? I doubt I can address it in time for Thursday, but I expect it's relatively straightforward once we know what needs to be done.

Thomas Pronk

Aug 4, 2020, 4:42:42 PM
to Online Experiments
Thanks for your reply!

No, I'm afraid I wouldn't be willing to look into this, but I've got some input that could be useful, and I would like to collaborate on another level.

GDPR versus AWS
From what I know of the EU situation, you really need a server maintained by an EU company, especially when it concerns personally identifiable data (such as e-mail addresses). Researchers may fly under the radar, but that would get shot down by the Data Stewards. Before examining the technical side, I'd recommend getting one of those involved in order to clear up the requirements. As Pushkin penetrates the EU market, this will likely happen as a matter of course.

How to collaborate?
In my question I represented my Dutch network, but my actual job is programming for PsychoJS. My job is explicitly Open Source, and I'd love to collaborate with you and other OS devs (jsPsych, lab.js, OSWeb, JATOS). Our PsychoJS + Pavlovia stack is rather vertically integrated, but I'm sure there are more than enough opportunities to collaborate. For instance, I'm setting up an automated end-to-end testing workflow. Have you got any use for that?

jkhart...@gmail.com

Aug 13, 2020, 12:08:25 PM
to Online Experiments
For obvious reasons, GDPR is just not a priority at the moment, and in any case, a European will have a better sense of what exactly is needed to keep local folks happy. 

In terms of end-to-end testing, what are you thinking? 

I think figuring out where to collaborate may take some hard thinking. PsychoJS and jsPsych are essentially alternatives to one another, and Pavlovia is aimed at a very different problem from Pushkin. It's more analogous to PsiTurk or Ibex Farm -- it's just not a use-case we are focused on. 

It might make sense to work on an integration so that people could use PsychoJS with Pushkin. Again, I'm not sure much functionality would be gained, but it would give users of PsychoJS the opportunity to take advantage of Pushkin. I forget whether Pavlovia can already run jsPsych experiments, but if not, that might be something to consider.

Thomas Pronk

Aug 14, 2020, 7:49:59 AM
to Online Experiments
AWS & GDPR
I had a quick chat with a Dutch data steward. The situation seems less dire than I expected: using AWS would require a data processing agreement with Amazon and a mention of the Amazon servers in the informed consent form. Quite similar to what Becky thought, actually.

Below is an extensive reflection; for a TL;DR, skip to the heading on end-to-end testing :)

PsychoJS, jsPsych, lab.js, OSWeb, Labvanced on the client side | Pavlovia, Pushkin, JATOS, Gorilla on the server side
There are quite a lot of systems out there with similar functionality: client-side web apps for cognitive tasks plus some questionnaire functionality. I call them cognitive task libraries. Those listed above are all open source, so they could, in principle, be plugged into a server-side app (such as Pavlovia, Pushkin, JATOS, Gorilla, etc.). Most are affiliated with a particular server-side app, but jsPsych is unique in that it's not affiliated with any one system. A unique thing about Pavlovia is that it supports many different cognitive task libraries (PsychoJS, jsPsych, lab.js, and OSWeb). Its revenue model is also different from Pushkin's: while Pushkin is financed via sub-contracting, Pavlovia is Software-as-a-Service (clients pay per participant), and part of the profit is donated to whichever task library was used to administer the task. This way, Pavlovia can give back to the open source community. The same revenue model finances part of the PsychoPy and PsychoJS development.

Challenges to collaboration
I see three challenges:
  • Developers don't want to give up their revenue models. For example, if PsychoJS were made compatible with Pushkin, it would lose revenue via Pavlovia (presently the only system that supports PsychoJS).
  • Developers don't want to give up their systems. Even if someone else builds something superior and you don't have a revenue model, it's still hard to let go of your baby, so to speak.
  • Researchers don't want to switch unless they have to. We like to adopt new technologies, but most researchers aren't that tech-savvy, nor do they have the incentive. My own experiences underline this: I used to maintain my own cognitive task library, but I actively disbanded it, because other systems had become so much better (and I wanted to support those). It took quite a long time (and me reiterating that support had ended) before researchers stopped using my system.

Where could we collaborate?
I'd like to sidestep the challenges above by exploring ways of collaborating that are external to the software itself; for instance, on output formats, and perhaps in time on particular modules that we all need. In the short term, e2e testing could be a nice one, because I'm already working on that.

End-to-end testing
I'm financed by a grant from the Zuckerberg foundation to develop open source software and improve the stability of PsychoJS and Pavlovia. To this end, I'm setting up a system for automated end-to-end testing. It's based on the WebDriver/Appium standards, written with WebdriverIO, and deployed via BrowserStack. Software for online experiments has quite a few idiosyncrasies compared to a standard website, which are reflected in the software I've built so far. It's quite nicely documented and free to use by any OS group that's involved in online experiments. I've shared it with Josh (jsPsych) and Felix (lab.js), and I'll share it with you in a moment.
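
To give a flavour, a minimal WebdriverIO-style test could look like the sketch below (the URL, selectors, and key press are all hypothetical, not taken from the actual test suite):

// Sketch: open an experiment page, wait for a stimulus to appear,
// and simulate a participant's key-press response.
describe("example trial", () => {
  it("shows a stimulus and accepts a response", async () => {
    await browser.url("https://example.org/my-experiment/"); // hypothetical URL
    const stimulus = await $("#stimulus");                   // hypothetical selector
    await stimulus.waitForDisplayed({ timeout: 10000 });
    await browser.keys("f");                                 // simulated response
    await expect($("#feedback")).toBeDisplayed();            // hypothetical element
  });
});

The real value comes from running such scripts across many browser/OS combinations via BrowserStack.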

jkhart...@gmail.com

Aug 16, 2020, 8:35:14 PM
to Online Experiments
Hi Thomas,

Thank you for sharing your thoughts on this forum. I think that not wanting to give up a project you've invested in and not wanting to learn new technology both play a role. Revenue models are probably only an issue for projects that have revenue, like Pavlovia or Testable. Pushkin has never had direct funding. I mentioned subcontracts in my talk only as a way of supporting people who have specific technical needs but limited programming chops. It's not something I particularly want to do for its own sake; the motivation is a) service to the community, and b) promoting the kind of research I'd like to see more of.

I suspect that last issue is more critical. I can't use Gorilla or Pavlovia because they don't support the kind of research I do. Conversely, the folks behind Empirica can't use Pushkin because Pushkin's "citizen science" method just isn't compatible with the tightly-controlled manipulations of groups that Empirica is designed for. Similarly, you could probably try to run a behavioral experiment using Zooniverse, but why on Earth would you? The same underlying structure that makes Zooniverse excel at annotation projects means it would be extremely hard to implement a behavioral study.

Ultimately, though, I think we have ended up in the same place: collaboration is going to require finding stand-alone pieces of functionality that can be shared across projects. Basically, it'll require the open-source model.

As far as end-to-end testing goes, I understand how it applies to experiment engines like PsychoJS or jsPsych. It's not immediately clear how it would apply to Pushkin independent of jsPsych. jsPsych runs in the client's browser, so presumably it doesn't much matter what website that code was downloaded from, right? Conversely, most of what Pushkin does is unique to Pushkin and not part of any other project, so I'm not sure what kind of testing could be shared. But this may be my lack of imagination, or my not knowing your long-term plans for Pavlovia.

Joshua K Hartshorne
Assistant Professor
Boston College