ISO Clinical Data Warehouse draft Technical Specification


Whitaker, Patrick

Nov 25, 2008, 10:34:19 AM

to emr-data-u...@googlegroups.com, Andrew Grant

I have posted this document to the EMR Data Use Workshop Google Group. Comments are being solicited by ISO TC215 Working Group 8. If you would like to contribute to this Technical Specification, please forward comments directly to Andrew Grant by December 15th.

Best regards,

Patrick

Patrick Whitaker
Technical Officer
Health Care Informatics Unit
Health Statistics and Informatics Department
World Health Organization

Tel. direct: +41 22 791 1372
E-mail: whit...@who.int

From: Andrew Grant [mailto:Andrew...@usherbrooke.ca]
Sent: 16 October 2008 11:12
To: Whitaker, Patrick
Cc: Bailey, Christopher; Boucher, Phillipe; jeremy...@nhs.net; Spohr, Mark; Mark Fuller
Subject: RE: Clinical Data Warehouse

(See attached file: DW - Process and outcomes.ppt)

Thanks Patrick,

On behalf also of Mark Fuller (Canada) and Jeremy Thorp (UK), who are leading their respective national data warehouse programs, and myself, a medical informatics practitioner and researcher in this area who is responsible for coordinating the content of this technical specification, we would be very grateful for comments on all parts of the proposal. Following our discussions in Istanbul, I am particularly interested in including points of view that could be especially relevant to developing countries.

Because this is a technical specification, the generalisable component is of most interest; it can be supported by examples of particular scenarios.

As regards part 2, on data abstraction and modelling, which will continue to evolve in its specificity before publication, I am interested in your remarks on how already abstracted data, rather than primary data, might become the input to a data repository/warehouse, as well as your points of view on indicators. Data warehouses are a basic future source of performance measures of different sorts. From the point of view of data quality, the relation to primary data is a very important component, and it is normally recommended that the primary data be held in the data warehouse. However, we should also be able to describe contexts where this may not be possible. In formulating indicators, we should likewise be descriptive rather than prescriptive. I would also suggest you take as a reference point ISO TS 21667, on the Health Indicator Conceptual Framework, which is being promoted to become an International Standard.

I am including an additional slide from my presentation, as it illustrates how a data warehouse can act across different levels of care and take into account not only cross-sectional but also longitudinal (pathway / process) data views. I would like to suggest that this integrative perspective be taken into account in your comments.

This is a great and important contribution on your part, and I am delighted that we can address these issues.

With very best wishes

Andrew Grant MD PhD
Full Professor, Université de Sherbrooke, Québec

Ngeno, Titus

Nov 25, 2008, 2:42:33 PM

to emr-data-u...@googlegroups.com, whit...@who.int, Andrew Grant

Hi Patrick,

Not sure how helpful this will be, but about a year ago I had the privilege of working with University of Utah Hospitals & Clinics. The objective of the project was to set up a testing environment to determine the amount of back-end hardware resources required to execute data warehouse extracts (in my case, PowerInsight by Cerner) against a client-sized database, and the length of time associated with each type of extract.

I Test Configuration:

The test comprises a three-node configuration. Extracts are pulled from the clinical environment node (the clinical database). You also need two other nodes: one for loading and another for the Informatica database. The two nodes used for running the load will have a health management system (HMS) footprint running, and one of them will have the Informatica database executing on it. The load nodes also connect across to the clinical environment to pull the extract once it has completed there, and then populate the data warehouse.

II Hardware:

Back-end clinical environment (used for extracts): I would suggest an AIX P570 with 4 CPUs and 26 GB of memory. The load environment should also be an AIX P570, with 2 CPUs and 6.5 GB of memory, and the second node (Informatica database) should be an AIX P570 with 2 CPUs and 6.5 GB of memory.

III Software:

Back-end software: AIX 5.3 with MQSeries (for message transfers), or Linux, plus the health management software of your choice (in my case, Cerner Millennium). The load environment should have the HMS software, Informatica version 7.12, and the data warehouse (Cerner PowerInsight) EDW alpha code.

IV Workflows:

Workflow 1: Test Data Extracts from Clinical Database source

Identify a date range in which the required number of encounters was updated. Configure the extract scripts to pull data from the ***table_name*** for that date range, then run the extract scripts. The data extracted from the tables is written to flat files.
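As an illustration only, the extract step could be sketched in Python roughly as follows. This is not the Cerner extract script: it assumes a generic DB-API connection (SQLite is used here as a stand-in for the clinical database), and the function name `extract_to_flat_file`, the pipe delimiter, and the `date_column` parameter are all hypothetical. The table name is left as a parameter because the real one is not given above.

```python
import csv
import sqlite3
import time


def extract_to_flat_file(conn: sqlite3.Connection, table_name, date_column,
                         start, end, out_path):
    """Write rows whose date_column lies in [start, end] to a pipe-delimited file.

    Hypothetical helper; returns (row_count, elapsed_seconds) so that the
    duration of each extract can be recorded.
    """
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for ident in (table_name, date_column):
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")

    started = time.perf_counter()
    cursor = conn.execute(
        f"SELECT * FROM {table_name} WHERE {date_column} BETWEEN ? AND ?",
        (start, end),
    )
    header = [col[0] for col in cursor.description]

    row_count = 0
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="|")
        writer.writerow(header)          # first line names the columns
        for row in cursor:               # stream rows instead of fetching all
            writer.writerow(row)
            row_count += 1
    return row_count, time.perf_counter() - started
```

Dates are assumed to be stored as ISO-8601 text so that `BETWEEN` compares them correctly; a real clinical schema may well need a different predicate.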

Workflow 2: Test data loads from flat files to EDW tables

Move the generated data files to the Informatica server and load the data into the data warehouse. Gather performance statistics: time measurements and other data useful in evaluating performance improvement (CPU, memory, etc.).
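A minimal sketch of the load-and-measure step, again in Python and again only as an assumption-laden stand-in: Informatica is not involved, SQLite plays the role of the warehouse, and `load_flat_file` and the keys of the returned dictionary are hypothetical names. It records wall-clock time, CPU time, and peak Python memory for the loading process only; server-side CPU and memory on the database nodes would have to be collected with operating-system tools.

```python
import csv
import sqlite3
import time
import tracemalloc


def load_flat_file(conn: sqlite3.Connection, table_name, in_path):
    """Load a pipe-delimited file (header row first) into an existing table.

    Hypothetical helper; returns simple performance statistics for the load.
    """
    if not table_name.isidentifier():
        raise ValueError(f"unsafe identifier: {table_name!r}")

    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        with open(in_path, newline="") as fh:
            reader = csv.reader(fh, delimiter="|")
            header = next(reader)
            if not all(col.isidentifier() for col in header):
                raise ValueError("unsafe column name in header")
            sql = "INSERT INTO {} ({}) VALUES ({})".format(
                table_name,
                ", ".join(header),
                ", ".join("?" for _ in header),
            )
            # The reader is consumed lazily, one row per INSERT.
            cursor = conn.executemany(sql, reader)
        conn.commit()
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    rows = cursor.rowcount
    return {
        "rows": rows,
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        "peak_python_bytes": peak,
        "rows_per_second": rows / wall if wall > 0 else None,
    }
```

Running the same load for each extract type and comparing the returned figures gives a rough per-extract timing table of the kind the test was meant to produce.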

V Business Objects:

You would need Business Objects for the end users.

I strongly believe that, with a good database, this could be emulated in a third world country!

I hope this helps

Titus Ngeno

(+ 1 913 207 8107)
