Event sourcing with GDPR

Vikas Pandey

Sep 22, 2018, 6:00:30 PM
to DDD/CQRS
Hello 

As many of you already know, GDPR came into effect at the end of May 2018. Under GDPR, a customer has the right to erasure (the "right to be forgotten"). My event-sourced application stores sensitive customer data.
To preserve immutability, I do not want to delete or update events. I am familiar with one approach, which is to encrypt each customer's data with its own key. But this approach has a couple of disadvantages:
1. I cannot search within events (I am using MongoDB to store events)
2. The keys have to be maintained

I wanted to know if anyone else has tried any other approach?

Rickard Öberg

Sep 22, 2018, 11:20:53 PM
to ddd...@googlegroups.com
It’s definitely a hassle. The only mitigation for 1 that I can think of is to hash the plaintext on insert (before it is encrypted) and store that hash. When searching, hash the keywords the same way and you’ll be able to find matches, but only exact ones.
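
The hashing idea could look roughly like the sketch below. It is not code from the thread: it assumes a keyed hash (HMAC-SHA256) instead of a bare hash, because low-entropy values such as email addresses can otherwise be guessed by brute force, and it assumes values are normalised by trimming and lower-casing. `INDEX_KEY`, `blind_index`, `find_by_email` and the event layout are hypothetical names.

```python
import hashlib
import hmac
from typing import List

# Hypothetical secret for the search hashes; assumed to be injected at runtime.
INDEX_KEY = b"replace-with-a-random-secret"

def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    """Keyed hash of a normalised value, usable only for exact-match lookups."""
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()

# Hypothetical stored events: the plaintext email is encrypted elsewhere,
# only its hash is kept in searchable form.
events = [
    {"customer_id": "c-1", "email_hash": blind_index("Alice@example.com")},
    {"customer_id": "c-2", "email_hash": blind_index("bob@example.com")},
]

def find_by_email(email: str) -> List[str]:
    """Hash the search term the same way and compare hashes."""
    wanted = blind_index(email)
    return [e["customer_id"] for e in events if e["email_hash"] == wanted]
```

Prefix or substring search is not possible this way. The hash is also still derived from personal data, so it probably belongs somewhere that can be deleted rather than in the immutable events.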

Robert Reppel

Sep 23, 2018, 10:33:22 AM
to DDD/CQRS
Hi Vikas,

Maybe replace the event streams that contain GDPR-relevant info with new ones that no longer have it? Greg's book on event versioning (https://leanpub.com/esversioning) has some really good info about various patterns one might use for that kind of thing.

R.

Rickard Öberg

Sep 24, 2018, 1:58:28 AM
to ddd...@googlegroups.com
Another important point for 2: you can actually store the key in the
database if you want; you just have to store it encrypted with another
key that is only available at runtime. So you load the user key,
decrypt it using the runtime key, then use it to decrypt the user
fields. The point is that if someone stole a copy of the database, it
would still be useless without the runtime key.

/Rickard

Philip Jander

Sep 24, 2018, 2:19:50 AM
to ddd...@googlegroups.com
Hi Vikas,

As an alternative, you could reconsider your search strategy. Instead of searching through events, our projects usually provide raw search data (searchable tokens, source entity type, IDs, etc.) as a projection. While this is a bit more overhead compared to searching inside the events, it is quite flexible and does not couple you to a specific kind of event store. It also allows integrating searchable data from sources with many different implementation patterns, beyond event-sourced modules.
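
A search projection of this kind might be sketched as follows. This is only an illustration under assumptions: the event shape (`entity_type`, `entity_id`, already-decrypted `text`), the class name `SearchProjection` and the word-based tokenisation are all hypothetical, and a real system would more likely feed a search engine or a database table.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

EntityRef = Tuple[str, str]  # (entity type, entity ID)

class SearchProjection:
    """Inverted index from lower-cased tokens to the entities they came from."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[EntityRef]] = defaultdict(set)

    def apply(self, event: dict) -> None:
        # Hypothetical event shape: decrypted text plus entity type and ID.
        ref = (event["entity_type"], event["entity_id"])
        for token in re.findall(r"\w+", event["text"].lower()):
            self._index[token].add(ref)

    def search(self, query: str) -> List[EntityRef]:
        """Return entities that contain every token of the query."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self._index.get(t, set()) for t in tokens))
        return sorted(hits)
```

Because the index is a projection, it can be dropped and rebuilt from the events at any time, and it works the same way for data that does not come from an event store.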

If you do so, then as a starting point, the following pattern has (so far) worked like a charm for us with respect to event sourcing and GDPR:
- All potentially relevant data is encrypted in the events using symmetric, salted encryption. This includes practically any kind of text field, but also practically any reference connecting information about a person with other entities*. The key is stored in a CRUD-accessible database.
- The index to that key (the key to the key) is the ID of the entity that governs the lifetime of the data. That might be something connected to a real person; then the lifetime depends on the GDPR-limited need to store the data, and the data is available for information requests and deletion requests by a human being. It might also be the ID of a contract or account; then the lifetime is tied to the requirements of orderly bookkeeping and record keeping.
- There is a specific generic data type for protected information, which has an accessor returning an option type.
- The index itself is also stored with the encrypted data, so that the reading part of the system does not need to know the details of the above scheme.
- When reading into memory (i.e. during projection or transaction preparation), the data is automatically decrypted; if the key is no longer known, the accessor returns None. Subsequent projections usually replace that None with token strings like 'removed due to GDPR request or end of record keeping period'.
- Writing into a data cell with no key registered leads to the generation of a new key.
- Deletion is effected by removing the key.
- The key database itself is encrypted using a run-time configurable key, so the database is automatically anonymized without the secret.
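
The pattern above could be sketched roughly like this. It is not Phil's code. All names (`KeyStore`, `Protected`, `protect`, `seal`, `unseal`) are hypothetical, the key table is an in-memory dict, and the cipher is a stdlib-only stand-in (HMAC-SHA256 keystream plus an HMAC tag) that is assumed here only to keep the example runnable; a real system would use a vetted authenticated cipher such as AES-GCM from a crypto library.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode; a stand-in for a real cipher."""
    out, counter = b"", 0
    while len(out) < length:
        block = b"enc" + nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt with a fresh random nonce (the salt). Layout: nonce | tag | body."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    return nonce + tag + body

def unseal(key: bytes, blob: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the key is wrong or the blob was altered."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(c ^ k for c, k in zip(body, _keystream(key, nonce, len(body))))

class KeyStore:
    """CRUD-style key table; entries are wrapped with a runtime-only key."""

    def __init__(self, runtime_key: bytes) -> None:
        self._runtime_key = runtime_key
        self._wrapped: Dict[str, bytes] = {}  # owner ID -> wrapped data key

    def key_for_read(self, owner_id: str) -> Optional[bytes]:
        blob = self._wrapped.get(owner_id)
        return None if blob is None else unseal(self._runtime_key, blob)

    def key_for_write(self, owner_id: str) -> bytes:
        key = self.key_for_read(owner_id)
        if key is None:  # no key registered: generate a new one
            key = secrets.token_bytes(32)
            self._wrapped[owner_id] = seal(self._runtime_key, key)
        return key

    def forget(self, owner_id: str) -> None:
        self._wrapped.pop(owner_id, None)  # deletion = removing the key

@dataclass(frozen=True)
class Protected:
    owner_id: str      # key index, stored next to the ciphertext in the event
    ciphertext: bytes

    def reveal(self, keys: KeyStore) -> Optional[str]:
        """Option-style accessor: None once the key is gone."""
        key = keys.key_for_read(self.owner_id)
        plain = None if key is None else unseal(key, self.ciphertext)
        return None if plain is None else plain.decode("utf-8")

def protect(keys: KeyStore, owner_id: str, text: str) -> Protected:
    return Protected(owner_id, seal(keys.key_for_write(owner_id), text.encode("utf-8")))
```

A projection would then write something like `value.reveal(keys) or "removed due to GDPR request"`. The authentication tag matters in this sketch: if a new key is later generated for the same owner ID, values sealed under the old key still come back as None instead of garbage.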

* Two caveats:
- The entity reference structure must be designed with GDPR in mind. It took some thinking to get the right references connected to the right keys so that deletion requests and record-keeping requirements were both respected. Amazingly, no conflict between them occurred.
- Since the key index is stored in immutable data, tracing it throughout the database forms a network that could still allow recovery of information identifying a person. This takes some thinking to get right.

Also, I have not had this design audited yet, but from previous audit experience, I assume (= hope) that it would pass.

Cheers
Phil


--
Jander IT Beratung
Philip Jander

Hammer Steindamm 113
D 20535 Hamburg
Germany

T +49 (0)40 228 656 10

http://Jander.IT

Ust-ID DE 224798452

Alexander Langer

Sep 24, 2018, 2:23:39 AM
to ddd...@googlegroups.com
There are solutions for this:

http://cryptowiki.net/index.php?title=Searching_on_encrypted_data

An even easier solution: Project the event stream (with encrypted data)
into a searchable unencrypted read model. Search on the read model.

If the user requests deletion or the retention time kicks in, delete
data in the read model and the encryption key.
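
A deletion and retention job along these lines might look like the sketch below. Everything in it is a hypothetical stand-in: the three dicts play the role of the read model, the key table and a retention table, and the names `erase` and `sweep` are made up for illustration.

```python
from datetime import date
from typing import Dict, List

# Hypothetical in-memory stand-ins for the read model, key table and retention table.
read_model: Dict[str, dict] = {
    "c-1": {"name": "Alice Smith"},
    "c-2": {"name": "Bob Jones"},
}
encryption_keys: Dict[str, bytes] = {"c-1": b"key-1", "c-2": b"key-2"}
retain_until: Dict[str, date] = {"c-1": date(2020, 1, 1), "c-2": date(2030, 1, 1)}

def erase(subject_id: str) -> None:
    """Drop the readable copy, then the key that unlocks the encrypted events."""
    read_model.pop(subject_id, None)
    encryption_keys.pop(subject_id, None)
    retain_until.pop(subject_id, None)

def sweep(today: date) -> List[str]:
    """Erase every subject whose retention period ended before `today`."""
    expired = sorted(s for s, end in retain_until.items() if end < today)
    for subject_id in expired:
        erase(subject_id)
    return expired
```

An explicit deletion request calls `erase` directly, while `sweep` would run on a schedule. The encrypted events themselves stay untouched in both cases.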

Vikas Pandey

Sep 30, 2018, 7:46:06 PM
to DDD/CQRS
Thank you all for replying.
I am exploring, from a legal standpoint, whether losing the encryption key is a GDPR-compliant solution. There are other points to consider, such as guaranteeing that the key has not been replicated anywhere else, and the strength of the encryption algorithm.
I also do not see any other way to keep data immutable and comply with GDPR.

Rickard Öberg

Sep 30, 2018, 8:28:21 PM
to ddd...@googlegroups.com
Note that you will want to store the user keys encrypted, with a password that is only available in memory and is injected into the service. With this, my understanding is that it is GDPR compliant.

Vikas Pandey

Oct 7, 2018, 3:08:02 PM
to DDD/CQRS
Yes. That was also something I was thinking about. It would be more secure if we also encrypted the individual keys with a master key held in a secure vault.

Greg Young

Oct 7, 2018, 3:11:06 PM
to ddd...@googlegroups.com
I have been considering building such functionality into EventStore directly. How do people see it working overall, so I can compare notes? What would an ideal API look like?

Greg
--
Studying for the Turing test