On 04.04.2024 16:52, shyma...@gmail.com wrote:
>The Mitre project seems to be randomly generated (but reasonable) data.
> Not what I seek.
>
>Maybe I should explain more about what I'm thinking. I would like to have
>a database of medical histories to look for real life patterns. There are
>numerous possible applications that I could imagine.
HIPAA's requirements for deidentification under the safe harbor method (i.e.
the standard you can satisfy to stay in the clear legally) would mean that
all dates are reduced to the year only, and on top of that, for anyone over 89
even the birth year has to go, leaving only a generic "90 years old and over"
indication. Admission dates, discharge dates, etc. (at my previous employer we
took that to mean ANY date of service, whether inpatient or outpatient) also
have to be reduced to 1-year resolution (drop the month and day).
That alone would get in the way of any kind of fine-grained pattern analysis
unless you're looking for very broad strokes ("someone born sometime in 1950
had a diagnosis of X in 1978 and procedure Y sometime in 1983"... but if 8
procedures, diagnoses, etc. were recorded throughout 1985, you wouldn't know
when in that year they happened or in what order, just that they all happened
in that one year).
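To make that loss of resolution concrete, here is a rough Python sketch of the date generalization described above. It is only an illustration under my reading of the rule, not a compliant de-identification tool: the function names are made up for this example, and safe harbor covers many identifiers besides dates.

```python
from datetime import date


def generalize_service_date(d: date) -> str:
    """Reduce a date of service to year-only resolution (hypothetical helper)."""
    return str(d.year)


def generalize_birth_date(birth: date, as_of: date) -> str:
    """Keep only the birth year, or a generic '90+' bucket for ages over 89.

    'as_of' is an assumed reference date used to compute the age.
    """
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    age = as_of.year - birth.year - (0 if had_birthday else 1)
    return "90+" if age >= 90 else str(birth.year)


# Three events spread across 1985 become indistinguishable after generalization.
events_1985 = [date(1985, 1, 3), date(1985, 6, 17), date(1985, 12, 30)]
generalized = [generalize_service_date(d) for d in events_1985]
```

After this step every entry in `generalized` is the same string, so both the spacing and the order of the events within the year are gone.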
So even if the VA has some process to release real de-identified data (I highly
doubt they'd agree to that, though; I agree with the assertion that the idea of
truly infallible de-identified data is a myth), they would at a minimum meet
the HIPAA safe harbor standard, and I'm not sure that gives you the granularity
of data you'd be looking for.
>Do people do that? Then don't they get a deidentified database?
I'm guessing a lot of statistical analysis is done on the *real* (or
deidentified) data by the covered entity themselves, or another organization
working under a BAA with the source of the data. We deidentified data
internally to pass to the analysis folks, but we would NEVER release that data
externally because we didn't feel comfortable with that liability. Within a
covered entity or a business associate, if one of our people analyzing the data
DID "see through" the deidentification and accidentally come across a link
between PHI and PII, it wasn't necessarily a breach. Everyone getting anywhere
close to
that data, even de-identified, was always up-to-date on mandatory HIPAA
training, working on computer systems/networks that were managed under our
HIPAA security policies, etc. But if that happened after we passed the data
along outside the scope of the covered entity or outside the permissible uses
of a BAA that we received the data under, it'd be a much bigger deal. So no way
we'd let out that data even if we thought we had de-identified it sufficiently.
-Matthew (been out of the industry for a few years, so take what I say with
an appropriate quantity of salt grains)