What would it take to create an openAnalytics platform where
all care.data is available for anyone to query?
The identifiable data would stay behind a wall, but would remain open for querying (SQL).
It needs a check on access levels to comply with NHS IG standards (Caldicott2),
e.g.
- a GP can see everything related to their practice, even identifiable data.
- a service re-designer at a CCG or hospital (non-clinician) can query and see
pseudonymised data for their own limited area.
- any member of the public can submit a query, but the query will be (automatically)
screened before and after execution to avoid the potential re-identification of
individuals.
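
To make the "screened before and after execution" idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: the column names, the threshold of 5, and the two function names are made up, and a real screen would need a proper SQL parser and a disclosure-control policy agreed with IG rather than a token match.

```python
import re

# Hypothetical list of identifiable columns; the real list would come from IG policy.
IDENTIFIABLE_COLUMNS = {"nhs_number", "name", "date_of_birth", "postcode"}

# Assumed small-number threshold; the actual value is a policy decision.
MIN_CELL_COUNT = 5


def screen_before(sql: str) -> bool:
    """Pre-execution check: True if the query text mentions no identifiable column.

    This is a crude token match, not a SQL parser, so it errs on the side of refusing.
    """
    tokens = set(re.findall(r"[a-z_]+", sql.lower()))
    return tokens.isdisjoint(IDENTIFIABLE_COLUMNS)


def screen_after(rows):
    """Post-execution check: blank out counts below the threshold.

    Expects (label, count) pairs from an aggregate query; small counts become None.
    """
    return [(label, n if n >= MIN_CELL_COUNT else None) for label, n in rows]


public_query = "SELECT practice, COUNT(*) FROM episodes GROUP BY practice"
if screen_before(public_query):
    # Made-up result rows standing in for what the database would return.
    print(screen_after([("Practice A", 12), ("Practice B", 3)]))
```

The point of doing both checks is that a query can look harmless and still return a group so small that an individual stands out.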
Behind the wall:
It needs proper cleansing: SUS (Secondary Uses Service) is a bit tame on
enforcing agreed standards at point of submission, and even HES data needs a
lot of additional cleansing.
It needs activity pricing attached to it for current year and actual year.
This can easily be achieved; about a year ago I was involved in processing 4
years of SUS data in a few weeks’ time. Updates can be faster, as you don't
have to redo previous years.
It needs some really advanced querying, like calculating the queue lengths for
A&E and the bed occupancy queues from admissions; relating these two would
already help explain A&E bottlenecks (on Monday morning they cannot
find a bed to admit to, because some clinicians were a bit complacent about
discharging before they left for the weekend on Friday).
I can go on ...
Just one more: linking with other datasets, like deprivation, pollution, public
health, or any set submitted by a user.
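
A minimal sketch of such a link, assuming both datasets share an area code (the tables, codes and deciles below are invented, and a real deprivation file would have its own column names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: admissions keyed by area, and a deprivation lookup per area.
conn.executescript(
    """
    CREATE TABLE admissions (episode_id INTEGER, area_code TEXT);
    CREATE TABLE deprivation (area_code TEXT PRIMARY KEY, decile INTEGER);
    INSERT INTO admissions VALUES (1, 'A01'), (2, 'A01'), (3, 'B02'), (4, 'C03');
    INSERT INTO deprivation VALUES ('A01', 1), ('B02', 5), ('C03', 1);
    """
)

# Join on the shared area code and count admissions per deprivation decile.
by_decile = conn.execute(
    """
    SELECT d.decile, COUNT(*) AS admissions
    FROM admissions a
    JOIN deprivation d ON d.area_code = a.area_code
    GROUP BY d.decile
    ORDER BY d.decile
    """
).fetchall()

print(by_decile)
```

A user-submitted set would work the same way, as long as it carries a key that the platform can match on.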
Now we have the silly situation that a private company processes the data and
makes the results available for GP practices to view for a few £1000s per
practice.
And we don't know what clever algorithms they use, let alone whether these are
actually reliable. I don't have any problem with someone providing a useful
service and asking payment for it; that's all fine, private or public. Just that
if you ask for a business case with £s attached, there it is.
This is not Big Data; this is working with very structured datasets, some of
which just happen to be a bit big, so well-established technology will do: a SQL
server (open or proprietary), maybe also R (statistics).
Any more suggestions or critical appraisals?
After all, anything that doesn't kill the idea can just make it stronger.
Cheers
Harry
Anyone from Atchai on this list? You might want to share recent experiences...
Anyone involved with the CfH Open Data Platform of a few years back may also want to chime in.