Re: Translating VistA (was: [Hardhats] Re: How many routines are there in VISTA?)

Ben Mehling

Nov 17, 2009, 2:30:56 PM

to hard...@googlegroups.com

On Tue, Nov 17, 2009 at 10:43 AM, ChinaVistA <mjcha...@gmail.com> wrote:

> first we have to get VistA up and running - convert all the menus into
> simplified Chinese (hopefully adding translation dictionaries so
> others can develop other translations)

That's a large undertaking. The translation tools are available in
Medsphere FileMan, although to date I believe only the core of FileMan
has translations in the dictionary (done by George Timson).

Of course, doing this work will completely fork you from the VA patch
stream. Is that an issue for your team or do you have some strategies
in mind for keeping the patch stream maintained?

- Ben

Nancy Anthracite

Nov 17, 2009, 2:43:05 PM

to hard...@googlegroups.com, Ben Mehling

I am wondering if the work that has been done by Kevin (on CPRS however) and
Chris Richardson can be used to do the translation in a way that is not a
fork.

--
Nancy Anthracite

r...@rcresearch.us

Nov 17, 2009, 3:34:09 PM

to hard...@googlegroups.com

Nancy and Ben;

In a word, "YES". This is why I built the system the way I did. The big
problem now is to get GT.M to accept ISO-10646 so that we can honor all of
the international character sets. MUMPS itself is agnostic in that it has
only one data type, characters, but the standard never said how big a
character is: 8 bits, yes; 32 bits (as called out by ISO-10646), also yes.
There are means of finagling the 32 bits to be more space-aware, with
very little overhead. Either 8-, 16-, or 32-bit character systems could be
manipulated as "characters".

The real beauty of my approach is that with ISO-10646, we could host any
or all written languages that have a published character set. The
instrumentation of VistA is a 45-minute run that extracts all of the
literals, puts them into the DIALOG file, and generates routines that
use the DIALOG file in the language specified by the user. The
original code can be modified or added to and then run through my system
again, which inserts the extrinsic function calls that fetch the proper
string from the DIALOG file. The instrumented code can be run in
production, and the end user determines the language (or languages) they
can work with. I even had a method set up to allow for preferences in a
cascade determined by availability. Unfortunately, this will employ an
awful lot of translators to render the text written in one language
into its parallels in the other languages.
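
The preference cascade mentioned above can be pictured with a small Python sketch. This is only an illustration of the idea, not Chris's code: the in-memory `DIALOG` table, the entry numbers, the sample strings, and the `dialog_text` helper are all hypothetical stand-ins for whatever the real DIALOG file lookup does.

```python
# Hypothetical stand-in for a DIALOG-style table:
# entry number -> {language code: text}. Not the real FileMan structure.
DIALOG = {
    1001: {"en": "Select PATIENT NAME: ", "zh-Hans": "选择患者姓名: "},
    1002: {"en": "Enter appointment date: "},  # not yet translated
}


def dialog_text(entry, preferences, default="en"):
    """Return (language, text) for the first preferred language available."""
    translations = DIALOG[entry]
    # Walk the user's preferences in order, then fall back to the default.
    for lang in list(preferences) + [default]:
        if lang in translations:
            return lang, translations[lang]
    raise KeyError(f"no text for dialog entry {entry}")


print(dialog_text(1001, ["zh-Hans", "en"]))  # translated entry
print(dialog_text(1002, ["zh-Hans", "fr"]))  # falls back to the default
```

The point of the cascade is that an untranslated entry degrades to a language the user can still read instead of failing.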

Best wishes; Chris Richardson

K.S. Bhaskar

Nov 17, 2009, 7:12:49 PM

to hard...@googlegroups.com

GT.M - Rock solid. Lightning fast. Secure. No compromises.

On 11/17/2009 03:34 PM, r...@rcresearch.us wrote:
> Nancy and Ben;
>
> In a word, "YES". This is why I built the system the way I did. The big
> problem now is to get GT.M to accept ISO-10646 so that we can honor all of
> the international character sets. MUMPS itself is agnostic in that it has
> only one data type, characters, but the standard never said how big a
> character is: 8 bits, yes; 32 bits (as called out by ISO-10646), also yes.
> There are means of finagling the 32 bits to be more space-aware, with
> very little overhead. Either 8-, 16-, or 32-bit character systems could be
> manipulated as "characters".

[KSB] For about two years now (since V5.2-000), GT.M has had support for
Unicode (which for several years has been the same as ISO/IEC-1046 - the
standards track each other). For the best part of a year, there have
been packages of VistA on GT.M in UTF-8 mode. I have recently put out a
Toaster with the same GT.M database accessible in both M and UTF-8 mode
(for example, read through the thread at http://tinyurl.com/yau7mqc).

Chris, maybe you should set your e-mail filter to not file my hardhats
posts in the Junk mail folder? 8-)

Regards
-- Bhaskar

K.S. Bhaskar

Nov 17, 2009, 7:17:52 PM

to hard...@googlegroups.com

On 11/17/2009 03:34 PM, r...@rcresearch.us wrote:

> Nancy and Ben;
>
> In a word, "YES". This is why I built the system the way I did. The big
> problem now is to get GT.M to accept ISO-10646 so that we can honor all of
> the international character sets. MUMPS itself is agnostic in that it has
> only one data type, characters, but the standard never said how big a
> character is: 8 bits, yes; 32 bits (as called out by ISO-10646), also yes.
> There are means of finagling the 32 bits to be more space-aware, with
> very little overhead. Either 8-, 16-, or 32-bit character systems could be
> manipulated as "characters".

[KSB] Also, I wouldn't necessarily characterize it as "very little
overhead". Even simple operations such as $L() require a string to be
parsed. I suspect that in this day and age, the overhead may not be
material, and also, there is no impact on database performance, only on
executing application code.
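
To see why a length function has to parse the string in a variable-width encoding, here is a minimal Python sketch. It is not how GT.M implements $L(); it only shows that with UTF-8 the character count cannot be read off the byte count and needs a scan. The sample string and the `utf8_char_count` helper are made up for the illustration.

```python
def utf8_char_count(data: bytes) -> int:
    """Count code points by scanning every byte.

    UTF-8 continuation bytes have the bit pattern 10xxxxxx, so every byte
    that is NOT a continuation byte starts a new character.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


note = "患者 Chang, 我"            # mixed Chinese and ASCII text
encoded = note.encode("utf-8")

print(len(encoded))               # number of bytes stored
print(utf8_char_count(encoded))   # number of characters, found by scanning
```

With a fixed-width 8-bit character set the two numbers are always equal and no scan is needed, which is the overhead difference being discussed.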

Also, many languages may require collation sequences to be developed.
Use of collation modules will have an impact on database performance.
Again, whether the impact is material remains unknown, and will remain so
until someone actually measures performance.
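
A small Python sketch of why collation matters: sorting by raw code point and sorting by a language-aware key give different orders. The `fold_key` function below is a deliberately crude, hypothetical key (strip accents, ignore case). A real collation for Chinese would need a language-specific table, and in GT.M it would live in a collation module, which this sketch does not attempt to model.

```python
import unicodedata


def fold_key(name: str) -> str:
    """Illustrative collation key: decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# Normalize to NFC so the example does not depend on how this file was typed.
names = [unicodedata.normalize("NFC", n)
         for n in ["Zhang", "Álvarez", "chang", "Öztürk"]]

print(sorted(names))                # raw code point order
print(sorted(names, key=fold_key))  # order under the illustrative key
```

Computing a key for every comparison is extra work on each subscript, which is where the database performance question comes from.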

Ben Mehling

Nov 18, 2009, 11:09:47 AM

to hard...@googlegroups.com

On Tue, Nov 17, 2009 at 11:43 AM, Nancy Anthracite
<nanth...@earthlink.net> wrote:
> I am wondering if the work that has been done by Kevin (on CPRS however) and
> Chris Richardson can be used to do the translation in a way that is not a
> fork.

I have heard Chris speak of his research before, but have never seen
the approach. Statically translated CPRS or dynamically translated
CIS only go so far to solve the problem -- the departmentals and other
applications are a huge part (the bulk) of the system. Another
approach would be middleware to translate on the fly, but I think that
would be very difficult to do accurately given the lack of context the
translation software would have about the strings being passed.

Chris- Is this code in WV EHR or elsewhere? Is your thought on the
process flow as follows (I'm inferring from your response, so please
correct me where I miss the point):

1) take a VistA distribution
2) run routines through your system to build "prep'd" dialog file (a
dialog file that has the English strings ready for translation)
3) run routines through your system to replace literals with extrinsic
function calls to fetch the translation from the dialog file
4) when new patches come out, run the same process (on the KIDS
builds? routines? what?)
5) translate strings in dialog file
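
Steps 2 and 3 of the flow above might look roughly like the following Python sketch. It is an assumption-laden illustration, not Chris's tool: the `instrument` function and the `$$TXT^ZZDLG` extrinsic name are hypothetical, and the sketch only treats literals containing letters as translatable (so delimiters such as "^" are left alone) and stops at the first `;` outside a string, which it takes to start a comment. It does not address step 4, where entry numbers would have to stay stable across patches.

```python
import re

# A MUMPS string literal ("" is an embedded quote), or a ';' outside a string.
TOKEN = re.compile(r'"(?:[^"]|"")*"|;')


def instrument(lines, start_id=1):
    """Replace translatable literals with $$TXT^ZZDLG(id); return (lines, dialog)."""
    dialog = {}   # id -> English text, ready for translators
    ids = {}      # English text -> id, so repeated literals share one entry
    out = []
    next_id = start_id
    for line in lines:
        pieces, pos = [], 0
        for m in TOKEN.finditer(line):
            if m.group() == ";":
                break  # rest of the line is a comment; leave it untouched
            text = m.group()[1:-1].replace('""', '"')
            if not any(c.isalpha() for c in text):
                continue  # "", "^" and similar are data, not display text
            if text not in ids:
                ids[text] = next_id
                dialog[next_id] = text
                next_id += 1
            pieces.append(line[pos:m.start()])
            pieces.append(f"$$TXT^ZZDLG({ids[text]})")
            pos = m.end()
        pieces.append(line[pos:])
        out.append("".join(pieces))
    return out, dialog


routine = [
    ' W !,"Select PATIENT NAME: " R X:DTIME ; prompt "user"',
    ' I X="" W !,"No patient selected" Q',
    ' S Y=$P(X,"^",2) W !,"Select PATIENT NAME: "',
]
new_lines, dialog = instrument(routine)
for line in new_lines:
    print(line)
print(dialog)
```

Even this toy version shows why the approach avoids a hand-edited fork: the English routines stay the source of truth and the instrumented copies are regenerated.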

Also, what are your thoughts on dealing with strings that are built in
code? Subtler issues exist around plurals, etc.
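
One common way to handle strings built in code, sketched here in Python under stated assumptions, is to translate whole templates with placeholders instead of fragments, and to pick the template by plural category. The `TEMPLATES` table, the message key, and the two-language rule in `plural_category` are hypothetical; nothing here is taken from VistA or from Chris's system.

```python
# Hypothetical templates keyed by (message, language, plural category).
TEMPLATES = {
    ("appts", "en", "one"): "{n} appointment found",
    ("appts", "en", "other"): "{n} appointments found",
    ("appts", "zh-Hans", "other"): "找到 {n} 个预约",
}


def plural_category(lang, n):
    """Tiny rule table: English has one/other; Chinese uses a single form."""
    if lang == "en":
        return "one" if n == 1 else "other"
    return "other"


def render(message, lang, n):
    """Fill the whole-sentence template for this language and count."""
    template = TEMPLATES[(message, lang, plural_category(lang, n))]
    return template.format(n=n)


print(render("appts", "en", 1))
print(render("appts", "en", 3))
print(render("appts", "zh-Hans", 3))
```

Code that concatenates a number, a noun, and an "s" cannot be translated literal by literal, which is why such spots would likely need manual attention whatever extraction tool is used.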

The big issues (maintaining patches, dynamically generated strings and
QA of translations (although not a technical issue, still a big one))
in my mind aren't simple to solve, but Chris has spent a lot more time
thinking about this one.

- Ben

Nancy Anthracite

Nov 18, 2009, 3:22:01 PM

to hard...@googlegroups.com, Ben Mehling

Chris' code is on the Trac.opensourcevista.net server and documentation about
how to use it is to come. He also has a white paper that needs to go up
there. Hopefully both will be there shortly. I don't think there is a
complete solution yet, but just brute force translation of VistA and CPRS
certainly does not seem to be a reasonable solution.

--
Nancy Anthracite

Steven McPhelan

Nov 19, 2009, 8:30:23 AM

to hard...@googlegroups.com

For the given circumstance, you are not thinking large enough. In a
previous response, ChinaVistA clearly indicated that he (she?) is in
mainland China. China is a communist country. China has 1.5 BILLION
people. VistA has about 25,000 routines. How long would it take
25,000 Chinese M programmers to brute force translate 25,000 routines?
Of course there are still the globals, but I think you get my point.

On Wed, Nov 18, 2009 at 3:22 PM, Nancy Anthracite
<nanth...@earthlink.net> wrote:
> Chris' code is on the Trac.opensourcevista.net server and documentation about
> how to use it is to come. He also has a white paper that needs to go up
> there. Hopefully both will be there shortly. I don't think there is a
> complete solution yet, but just brute force translation of VistA and CPRS
> certainly does not seem to be a reasonable solution.

--
Steve
The end of democracy will be when the electorate learns that it can vote
itself largess from the public coffers.

JohnLeo Zimmer

Nov 19, 2009, 10:27:41 AM

to hard...@googlegroups.com

So teach MUMPS to those drummers at the last Olympics and let them bang away.

Nancy Anthracite

Nov 19, 2009, 10:43:12 AM

to hard...@googlegroups.com, JohnLeo Zimmer

Chris' solution, when complete, would allow those doctors in China to write
notes in Chinese and the doctors in other countries to work in their own
language. Using something other than the brute force solution shows foresight
in addition to making things easier.

--
Nancy Anthracite

Nancy Anthracite

Nov 19, 2009, 10:49:53 AM

to hard...@googlegroups.com

I should have also said, ...on the same instance of VistA, i.e., using a
single database when viewing the same patient, healthcare workers in two
countries could collaborate on the care of a patient.

K.S. Bhaskar

Nov 19, 2009, 5:50:56 PM

to hard...@googlegroups.com

Perhaps we should get levels of support for Chinese (or any other
language) clear in our analysis. In increasing order of work:

Enter notes and other data (names, addresses) in Chinese. *This should
work today when VistA is run on GT.M in UTF-8 mode.* It will require a
terminal emulator and keyboard that can input and display Chinese.
However, there likely will be some interesting hiccups when displaying
right to left languages.

Prompts in Chinese (or other languages). This would likely benefit from
Chris' VistA change.

Using culturally correct ordering, e.g., so that names appear in the
correct order. This likely would require a custom collation sequence to
be written.

Good name matching (e.g., on searches, Soundex equivalent, etc.). This
is a non-trivial problem.

Note that supporting Chinese with pinyin may be much easier than support
using either the simplified or traditional character sets.

Regards
-- Bhaskar

On 11/19/2009 10:43 AM, Nancy Anthracite wrote:
> Chris' solution, when complete, would allow those doctors in China to write
> notes in Chinese and the doctors in other countries to work in their own
> language. Using something other than the brute force solution shows foresight
> in addition to making things easier.

r...@rcresearch.us

Nov 20, 2009, 12:41:54 AM

to hard...@googlegroups.com

So, Bhaskar, what would it take to have GT.M support ISO-10646?

K.S. Bhaskar

Nov 20, 2009, 11:28:53 AM

to hard...@googlegroups.com

What it would really take, Chris, is for you to start reading my Hardhats
posts! In fact, on this very thread, this very week, on November 17,
2009, I wrote:

"For about two years now (since V5.2-000), GT.M has had support for
Unicode (which for several years has been the same as ISO/IEC-1046 - the
standards track each other). For the best part of a year, there have
been packages of VistA on GT.M in UTF-8 mode. I have recently put out a
Toaster with the same GT.M database accessible in both M and UTF-8 mode
(for example, read through the thread at http://tinyurl.com/yau7mqc)."

(I realize just now that I wrote 1046 when I meant 10646.)

It doesn't help when you complain about something for the second time in
the same week, especially after your first complaint was answered less
than four hours after you posted it.

I do not plan to expend any time in responding to any further complaints
about the lack of support in GT.M for Unicode (ISO/IEC-10646).

Regards
-- Bhaskar

ChinaVistA

Nov 20, 2009, 2:58:51 PM

to Hardhats

All - thank you for your thoughts, comments, & suggestions. I'll try
to address all of the comments in this single humongous email digest.
Apologies in advance for verbosity.

PROJECT DESCRIPTION
The HOSPITAL, like many state and local hospitals, primarily serves
the poorer populations of China - the so-called rural socioeconomic
class. This sector is most akin to the American unskilled labor class.
The mission is to improve the quality of care to all - but most
especially to this working-class segment. WVA is only part of the
solution strategy. Properly and carefully implemented, it will
eventually enable many hospital advances - to include reducing contact
duration (currently a patient visit lasts 3 hours from the moment the
patient begins queuing for registration - which they must do daily as
the reg data is not stored) to the introduction of tracking
(charting?) patients so the HOSPITAL can begin to implement EBM
(evidence-based medicine) and other higher health care solutions to
create a more beneficial and harmonious society. Most hospitals in
China are PRE-PAY - even if you have insurance - it's still PRE-PAY -
the Patient or their survivors must individually submit the paperwork
to the insurance company and wait wait wait for the refund. All
procedures are pre-paid which causes a tremendous amount of patient
bouncing from the POC (Point of Care) to the Cashier's Window...and
then back (China invented ping pong right?). We will try to reduce
the bounce effect with payment terminals at each POC front desk -
which we hope to integrate into WVA or whatever CCHIT derivative helps
us complete the mission with minimum pain and maximum reliability.

WHO IS CHINAVISTA
This is Michael Chang (gender should be obvious now right?) - in
addition to many other things - I worked for SAIC developing various
sw packages for CHCS I and also taught VMS & CHCS systems
administration and worked several CHCS sites in Asia in the late 90s
(Japan, Korea, Guam). In 1999 we shrink-wrapped the sw onto a single
(I think) CD ROM, installable on an NT cluster server with a PHD (push
here dummy - no offense to any PhD lurkers) for MASH, shipboard, and
any other emergency use (Katrina?). The package would install the NT
OS and then layer in M and CHCS....kind of like Astronaut perhaps. I
used the ChinaVistA moniker to establish the obvious (obvious=see
project description) - not to cause gender confusion.

SIMPLIFIED VS PINYIN (AKA LOCALIZATION OR L10N <- if you don't
understand the abbr, wikipedia.org it).
Pinyin is essentially romanized Chinese (Latin letters) and requires the
tones to even vaguely guess the meaning. Example <me/I> = <wo3> or 我
<-- assuming your browser can see the Chinese font. If not - no matter -
not important. As
Chinese is tonal - in addition to sounds - a single word could have
multiple meanings - which is unacceptable for clinical work. So it's
mandatory to use the simplified chinese characters throughout. We
must train 450 doctors and about 600 support staff - not just on how
to use WVA - but probably also how to use the computer and god forbid
- how to type. So, implementation needs to be as painless as possible
- we want the staff to cheer as opposed to the vitriolic OCONUS AHLTA
users. My experience with the military in Asia was fantastic - when
CHCS was down during prime time (a rarity) - physicians would
immediately light up the help desk with "I can't see patients or
practice medicine until you fix the system"...if you want me to
elaborate on this comment - private mail me - ramifications are
complicated.

FORKING
We do NOT want to fork if possible as this is a certified clinical
system. Any changes we make have to be tested - which means we have
to write the infinitude of test cases for the 10k menus and 25k
routines to ensure we didn't break anything...SAIC got paid a billion
for this - we don't have that budget. Our preference for now is to
probably build a shell around WVA to accomplish the core mission(s).
Despite the population of China - we have neither the budget nor the
space to house all that volunteer staff - and all that cheering would
only distract. So to repeat the way above statement - we would
prefer to encapsulate the app if that's possible, and build interfaces/
tools to accomplish our requirements. For that kind of work - there
truly are olympic proportions of SI firms.

CHINESE ON A TERMINAL EMULATOR
I'm using an Apple Mac - tested different font sets in the main shell
and worked beautifully - no mods, no configuration - although we're
still not sure how WVA and Linux flavors will react (RH, Ubuntu,
etc). In case some of you are interested - we used this LOREM IPSUM
generator to quickly test different multi-byte font sets (if you don't
know what LOREM IPSUM is - wikipedia.org): http://www.lorem-ipsum.info/generator3

It generates random text in a variety of languages - works for Arabic,
Cyrillic, Hebrew etc - but we can't verify the accuracy for Arabic and
Hebrew (they're right-to-left languages - but I think the numbers
display left to right).

PERFORMANCE TESTING
We're preparing a demo - will bring up DENTAL first - get the patient
reg, appts & scheduling working - try to activate provider notes - and
time permitting - interface with the existing billing system. Train
the DENTAL department (17 dentists, 3 nurses, 2 techs) on appts,
scheduling, registration, and hopefully procedures. Would be nice to
migrate/integrate everything from legacy billing to WVA - but one
feature at a time - as everything must be tested and I haven't figured
out how to populate the Provider Procedures yet. We also need to
build some kind of tool/script to automate patient registration so we
can build a test database to test system performance, global growth,
etc so we can size the Full-scale-deployment (FSD) system for a 5 year
capacity. In case anyone is interested - we're using an AGILE type
project management methodology (aka feature-based implementation) to
migrate the hospital. I'm in the process of scoping the entire
project and trying to build a timeline to complete the core mission
within a reasonable lifetime (aka 1-2 years). We are trying to cap
the TCO (Total Cost of Ownership) - which includes the systems/db/
network administration, training, infrastructure etc - and build this
in as a fixed cost.
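
As a starting point for the registration-automation script mentioned above, here is a hedged Python sketch that generates dummy patient records and measures their average size. Everything in it is hypothetical: the caret-delimited field layout, the `TEST` identifiers, and the name lists are placeholders, and the output would still have to be fed through VistA's own registration process. The byte figure covers only the raw text, not global or cross-reference overhead.

```python
import random


def synthetic_patients(count, seed=0):
    """Yield caret-delimited dummy registration records (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed keeps test runs repeatable
    surnames = ["王", "李", "张", "刘", "陈"]
    given = ["伟", "芳", "娜", "敏", "静", "强"]
    for i in range(1, count + 1):
        name = rng.choice(surnames) + rng.choice(given)
        year = rng.randint(1930, 2009)
        month = rng.randint(1, 12)
        day = rng.randint(1, 28)  # stay at 28 so every date is valid
        sex = rng.choice("MF")
        yield f"TEST{i:07d}^{name}^{year:04d}-{month:02d}-{day:02d}^{sex}"


records = list(synthetic_patients(1000))
total_bytes = sum(len(r.encode("utf-8")) for r in records)

print(records[0])
print(total_bytes / len(records))  # average UTF-8 bytes per record
```

Multiplying a measured per-patient size (taken from the real database after loading, not from this script) by expected registrations per year is one simple way to approach the 5-year sizing question.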

This part should be fun - in a geeky sense.

TRAINING
Training is difficult to scope as we have to first understand, then
develop training manuals. SAIC used to send teams of two (2) or more
on training runs around the world for CHCS. CHCS had fantastic
training manuals - but they would have to be vetted against the
various VistA flavors for accuracy (remember - CHCS FORKED). The CHCS
documentation was on a nice compact CDROM and please don't ask. 1. I
don't have a copy and 2. if you want a copy - ask the DOD/DOH or
whatever they're called this week.

As the site will activate a second 1k bed hospital of similar or
larger size - we (aka I) are contemplating how to keep these two
sites sync'd to avoid re-registration/duplication issues, and other
complications such as billing.

CERTIFICATION
Maintaining the CCHIT certification is critical as it provides a
differentiator from the other 400 odd solutions floating around
China.

SECURITY
We want to implement a biometric (fingerprint) login for everyone so
we can skip this username/pwd & ACCESS/VERIFY junk as most users (even
in the military) just scribble their pwd on a stickum and paste it
somewhere around the terminal - which defeats security at the physical
level. Instead of replacing the login - we may opt for an automated
script based on the initial biometric login (it would trigger a WVA
login and then hopefully gracefully exit to the user). Sounds
complicated for now - but that's the plan. Apple has a voice-
activated login - but the room's gotta be quiet - so we nixed that
(not to mention Apple workstations, although stunningly beautiful,
increase the terminal cost by 3-4x) - although for the demo - it might
look cool.

RFP (Request for Proposal)
We're sending the RFP to our two sole source vendors next week - so
hopefully - we'll have the L10N (localization) version of the sw ready
thanks to Chris' trick and Bhaskar's GT.M mods.

CLIENT
We're NOT planning on using the CPRS in this incarnation of the demo -
as we're using Linux-based clients and the WVA people haven't released
a multi-lingual Linux-based CPRS client for operational use yet.
So...we're planning on using JUST the old, boring, but ever reliable
TEXT version via term emulation. If they want color - they can change
the background of the TERM...for now.

We DO want the ability to provide basic teleradiology abilities to all
the docs - so BIG computer monitors are required - but we also want
all-in-one devices - makes it easier to keep things clean (desks,
monitors, keyboards, mice) - this IS a hospital after all and device
maintenance is the IT DEPT responsibility (my orders).

OPERATING MANUALS
When I supervised the Asia District for SAIC - the staff helped us
write a few manuals for VMS and CHCS systems and database
administration (not enough time or expertise back then to do the
network manuals) - so we'll also be writing these things (Standard
Operating Procedures/SOP, Standard Operating Manual/SOM, and Disaster
Recovery Manual/DRM) - lots of standards for those things - but I
prefer milspec or ISO/ITU when possible.

Once again - thanks all for your amazing help - I hope our feedback/
FAQs as we progress prove helpful in return.

On behalf of our team, again many thanks for your community support!
Michael.

On Nov 21, 12:28 am, "K.S. Bhaskar" <ks.bhas...@fnis.com> wrote:
> What it would really take, Chris, is for you to start reading my Hardhats
> posts! In fact, on this very thread, this very week, on November 17,
> 2009, I wrote:
>
> "For about two years now (since V5.2-000), GT.M has had support for
> Unicode (which for several years has been the same as ISO/IEC-1046 - the
> standards track each other). For the best part of a year, there have
> been packages of VistA on GT.M in UTF-8 mode. I have recently put out a
> Toaster with the same GT.M database accessible in both M and UTF-8 mode
> (for example, read through the thread at http://tinyurl.com/yau7mqc)."
