Encryption of ndjson files goes against FHIR Bulk Data specification


Justin Stauffer

May 28, 2019, 2:10:36 PM
to Developer Group - Beneficiary Claims Data API
The FHIR Bulk Data Specification (https://github.com/HL7/bulk-data/blob/master/spec/export/index.md) does not call for the returned ndjson files to be encrypted (beyond being exchanged over TLS). Adding this requirement makes it harder for clients to implement a standards-compliant solution, because they now have to build custom processing just for the CMS Bulk FHIR server.

It should be the client's responsibility to keep the file contents secure (just as they would be required to secure PHI obtained from any other web service), and that can be handled outside the scope of this API. That is, it should be an implementation decision: for example, if the files need to be passed on to third parties behind the FHIR Bulk Data Client, then encryption may be appropriate and can be handled out of band. However, if the FHIR Bulk Data Client already stores the retrieved data securely, then additional encryption and decryption is just a burden with no security gain.
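
For context, here is a minimal Python sketch of what a standards-based client does once an export completes: it reads the completion manifest and collects the ndjson file URLs by resource type. The manifest field names follow my reading of the Bulk Data export spec; the URLs, the timestamp, and the `files_by_type` helper are made up for illustration and are not BCDA's.

```python
import json
from collections import defaultdict

# Hypothetical completion manifest; hosts and file names are placeholders.
manifest_json = """
{
  "transactionTime": "2019-05-28T18:10:36Z",
  "request": "https://example.org/fhir/Patient/$export",
  "requiresAccessToken": true,
  "output": [
    {"type": "Patient", "url": "https://example.org/files/patient_1.ndjson"},
    {"type": "ExplanationOfBenefit", "url": "https://example.org/files/eob_1.ndjson"},
    {"type": "ExplanationOfBenefit", "url": "https://example.org/files/eob_2.ndjson"}
  ],
  "error": []
}
"""


def files_by_type(manifest: dict) -> dict:
    """Group the manifest's output file URLs by FHIR resource type."""
    grouped = defaultdict(list)
    for entry in manifest.get("output", []):
        grouped[entry["type"]].append(entry["url"])
    return dict(grouped)


manifest = json.loads(manifest_json)
urls = files_by_type(manifest)
```

With plain ndjson, each URL can be handed to any generic Bulk Data client. A server-specific decryption step would have to be added between this point and the import.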

Dave DeCaprio

May 28, 2019, 2:35:45 PM
to Justin Stauffer, Developer Group - Beneficiary Claims Data API

I’d also prefer a more standards-compliant solution that did not require the extra custom process of decrypting the data that is already sent over an encrypted channel.

--
You received this message because you are subscribed to the Google Groups "Developer Group - Beneficiary Claims Data API" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bc-api+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bc-api/0632d36f-75e1-4546-beae-b6573330f198%40googlegroups.com.

Jack Shoemaker

May 28, 2019, 5:03:05 PM
to Dave DeCaprio, Justin Stauffer, Developer Group - Beneficiary Claims Data API

I second Dave’s preference.

 

--

Jack N Shoemaker / (336) 202-2165


This incoming email was seamlessly encrypted by Paubox


Ryan Nagle

May 29, 2019, 3:44:56 PM
to bc-...@googlegroups.com
Hi Justin, Dave, Jack,

Sincere thanks for taking the time to share your experience using BCDA's sandbox environment. This is the sort of community input we're looking for to inform the future shape of the product and to better serve BCDA's consumers and, ultimately, Medicare beneficiaries.

Wanted to drop a note to let you know that your questions about BCDA's encryption decisions are well-taken, and we're working with our team to compile a more complete response to address your concerns. More soon!

All the best,
Ryan
BCDA Engineering Lead


Ryan Nagle

Jun 3, 2019, 11:10:53 AM
to Developer Group - Beneficiary Claims Data API
Hi again, Justin, Dave, Jack,

I wanted to thank you again for calling attention to the ways in which BCDA's encryption decisions cause us to deviate from the FHIR spec. At this early stage, BCDA is a pilot meant to test the efficacy of the approach to distributing this data and, again, we're relying on feedback from members of this group to help make it better. We wanted to share with you the reasons why these decisions were made, and would love to hear your thoughts.

There are primarily two threat vectors BCDA's use of file encryption aims to mitigate:

1. A client loses control of their API credentials and/or access token. A malicious actor possessing the authenticator(s) may make requests for data, but will be unable to read the contents of files without possession of the client's private key. It's worth noting that the implementation described in the Authorization Guide provides mitigations for this threat; however, the shape that auth/n/z will take in BCDA long-term is still to be determined. We're in the process of evaluating strategies to determine which approach is sufficient, both in terms of practical security posture and the security compliance requirements we must adhere to. Stay tuned for more on this!

2. A client retrieves data from BCDA and the data (likely unknowingly) lands in an improperly secured storage location. For example, the data is placed in an S3 bucket or FTP share with permissions too liberal (or plainly public), or an adversary otherwise gains admin access to said storage location. Incidents like this are a real threat rooted in human error. From our perspective, encrypting files encourages safe handling of the data and goes a step beyond secure transport to give the best possible chance for successful, secure receipt and storage of this highly sensitive data once it leaves our security boundary.

We acknowledge that this is a deviation from the letter of the FHIR specification. We hope to propose an addendum to the spec that would, at least, allow for this as an optional part of server implementations. We think this speaks to the security considerations section of the bulk export implementation guide:

"This specification does not address protection of the servers themselves from potential compromise. An adversary who successfully captures administrative rights to a server will have full control over that server and can use those rights to undermine the server's security protections."

Ultimately, our goal is to protect this data with the same diligence we would a living, breathing human placed in our care.

Please keep sharing your thoughts as you work in the sandbox – we welcome your questions, and are grateful for the conversations they start on our team and within the community.

Warm regards,
Ryan
BCDA Engineering Lead



Justin Stauffer

Jun 3, 2019, 12:19:09 PM
to Developer Group - Beneficiary Claims Data API
Those concerns make sense at a high level, but for #1, why would this web service exchange be different from any other? If a malicious actor obtains an access token, they can re-use it within the expiration window. Are you going to start requiring separate payload encryption on every web service response? I'm guessing not, so why does this API need to be different? Also, if a malicious actor is able to obtain a valid access token, why wouldn't they also be able to obtain the private key that could be used to decrypt the file contents? It seems to me that if you assume a malicious actor has access to secret data, then they have access to everything and it's all out the door.

For #2, that seems like it should be entirely the responsibility of the client and outside the scope of the API specification. My client will not be saving data to an improperly secured storage location so the additional requirement doesn't make sense and is an additional development and implementation burden for my client's users.

Ultimately, I think in order for this to be an acceptable API for the whole community of stakeholders, it needs to be in line with the HL7 FHIR Bulk Data specification. There's a reason that HL7 working group exists and follows the processes they do to obtain a community consensus on what's required and what's not -- that is, so one party doesn't unilaterally decide what's correct.


Dave DeCaprio

Jun 3, 2019, 2:02:32 PM
to Justin Stauffer, Developer Group - Beneficiary Claims Data API

I’m wondering if there are ways to address these underlying concerns in different ways which might be more standards compliant.

 

For #1, it seems like the core goal is to make it so that an attacker needs to compromise multiple credentials in order to be able to compromise the account.  If the attacker gets a hold of the client’s API key they would still need the private key in order to access the data, or vice versa.  This of course makes sense and is the core principle behind the various different approaches to multi-factor authentication. 

 

There are several ways to support multifactor authentication in a standards compliant way.  I’ll briefly describe one approach.  Instead of clients using long lived credentials to access the data APIs, you could use short lived tokens that are created by a separate authentication call.  The authentication call could use one of several approaches to multifactor authentication.  For example, it could return a token that is encrypted using the client’s public key.  The client would then use their private key to decrypt the token and use it in the resulting data API call.  The advantage of an approach like this is that it separates the implementation of authentication from the actual payload of the API. It also allows for more flexibility in how the authentication is handled over time without requiring changes to the data implementation.

 

For #2, I think there is a counterbalancing factor, which is that the likelihood of the data ending up in an unexpected location goes down if the client can use standard tools to process the data. If this endpoint is custom and works differently than other FHIR endpoints, then custom code needs to be written to support this case. This custom code likely won’t have the same level of rigorous testing that a standards-compliant implementation would go through. The more special exceptions and corner cases an implementation has, the bigger its attack surface. Having a smaller number of cases to test means that each one can be tested more fully.

 

As a concrete example, imagine that a client gets the encrypted file, saves it somewhere, runs the Python script to convert it to a decrypted file, and then imports that into a standard FHIR database. Then imagine the alternative case where the payload is not encrypted. In this case the client gets the data and can immediately pipe it over to their FHIR database without any need for intermediate storage. Note that in these scenarios, which I think are very realistic use cases, the encrypted payload actually ends up with MORE potential for exploitation than the standards-compliant version, as the decryption process puts a copy on the temporary local file system, which might not be adequately secured. Of course, you can create use cases where the opposite is true, but the point here is to lean towards the standards-based implementation unless there is a clear advantage to a different approach.
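
To illustrate the unencrypted case, here is a minimal Python sketch that parses ndjson line by line from a response stream and hands each resource straight to a sink, so nothing is written to local disk. The `iter_ndjson` and `stream_into_store` helpers, the in-memory `body`, and the `stored` list standing in for a FHIR database are all hypothetical.

```python
import io
import json
from typing import Callable, Iterable, Iterator


def iter_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed FHIR resource per non-empty ndjson line."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)


def stream_into_store(lines: Iterable[bytes], save: Callable[[dict], None]) -> int:
    """Pass each resource straight to `save`; no intermediate file is created."""
    count = 0
    for resource in iter_ndjson(lines):
        save(resource)
        count += 1
    return count


# Stand-in for an HTTP response body. A real client would iterate over the
# response stream of the exported file URL instead (not shown here).
body = io.BytesIO(
    b'{"resourceType": "Patient", "id": "a1"}\n'
    b"\n"
    b'{"resourceType": "ExplanationOfBenefit", "id": "b2"}\n'
)
stored = []  # hypothetical sink standing in for a FHIR database
total = stream_into_store(body, stored.append)
```

With a file-level encrypted payload this line-by-line handoff is not available as written; the client first has to decrypt, which in the scripted workflow described above means a decrypted copy on the local file system.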

 

Overall I’m very excited about this project and happy to see the team be so responsive.

 

Thanks,

Dave


Isaac Vetter

Jun 6, 2019, 11:04:36 PM
to Developer Group - Beneficiary Claims Data API
Hey Dave,

I just wanted to point out (to your credit!) that your recommendations regarding #1 are extremely similar to the existing HL7 FHIR Bulk Data Access standard:

"Instead of clients using long lived credentials to access the data APIs, you could use short lived tokens that are created by a separate authentication call."

For example, the widely implemented OAuth2 client_credentials flow, which SMART Backend Services profiles, does mandate a separate authentication call in which the client authenticates using the private key of a pre-registered key pair.

The result of the authentication call is a “short-lived” access_token. The FHIR spec recommends an expiration time of no more than 5 minutes.
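
As a rough Python sketch of that separate authentication call, based on my reading of the SMART Backend Services profile: the client builds a short-lived JWT, signs it with its private key, and posts it to the token endpoint as a client assertion. Signing is deliberately left out (it needs a JWT library), so the signed JWT below is a placeholder string. The client ID, token URL, scope, and helper names are assumptions for illustration, not BCDA values.

```python
import uuid
from urllib.parse import urlencode

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"
MAX_LIFETIME_SECONDS = 300  # five minutes


def build_assertion_claims(client_id: str, token_url: str, now: int,
                           lifetime: int = MAX_LIFETIME_SECONDS) -> dict:
    """Claims for the authentication JWT that the client signs with its private key."""
    if lifetime > MAX_LIFETIME_SECONDS:
        raise ValueError("assertion lifetime should not exceed five minutes")
    return {
        "iss": client_id,
        "sub": client_id,
        "aud": token_url,
        "exp": now + lifetime,
        "jti": str(uuid.uuid4()),  # unique ID so the assertion cannot be replayed
    }


def build_token_request(signed_jwt: str, scope: str) -> str:
    """Form-encoded body for the client_credentials token request."""
    return urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
    })


issued_at = 1559900000  # fixed timestamp so the example is reproducible
claims = build_assertion_claims("example-client", "https://auth.example.org/token", issued_at)
form = build_token_request("header.payload.signature", "system/*.read")
```

The private key never leaves the client and is only used to sign the assertion, so a leaked access_token on its own is usable only until it expires.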

I took Ryan’s comment:

"It's worth noting that the implementation described in the Authorization Guide provides mitigations for this threat, however, the shape that auth/n/z will take in BCDA long-term is still to be determined."

to be an acknowledgement that this ANSI-track standard already addresses the concerns, but that BCDA may choose to not adhere to the standard in favor of a different solution.

Isaac

Jack Shoemaker

Jun 7, 2019, 10:41:57 AM
to Isaac Vetter, Developer Group - Beneficiary Claims Data API

Dear Isaac,

 

But surely, we will want BCDA to adhere to the standard and will disfavor different solutions because, well, they aren’t standard.

 

- Jack

 

--

Jack N Shoemaker / (336) 202-2165

 


Liani Lye

Jun 7, 2019, 1:54:42 PM
to Jack Shoemaker, Isaac Vetter, Developer Group - Beneficiary Claims Data API

Hello Isaac, Jack, Dave, and Justin:

I want to reiterate Ryan's note: Thank you again for calling attention to the ways that BCDA's encryption decisions deviate from the FHIR spec.

This type of discussion is precisely why we created the group -- it acts as a community-based forcing function. Based on the great points you all have made, we are going to be working with our CMS partners to evaluate further. We look forward to bringing the fruits of these discussions to you, the Google Group community, as well as other FHIR communities, and iterate further.

Thank you again.


All my best,
Liani

-- 

Liani Lye
Community Manager
Beneficiary Claims Data API
