So:
GTM>w 2.0
2
GTM>w 2.1
2.1
GTM>w "2.0"
2.0
GTM>w +"2.0"
2
"2.1" would sort after "2.0", I think.
And both of those after 2, and 2.1
Thus:
GTM>s x(2.0)="a",x(3.1)="b",x("2.0")="c",x("2.1")="d",x("2.10")="e"

GTM>zwr x
x(2)="a"
x(2.1)="d" <--------------------------------
x(3.1)="b"
x("2.0")="c"
x("2.10")="e"
But "2.1" gets converted to numeric while "2.10" does not.
So, it turns out I can reproduce the phenomenon but not explain it.
{Well, now I can, because Maury just did.} :-)
jlz
Per Maury's suggestion:
(Adding just a space " " will force string indexes.)
(Adding "+" forces numeric: x(+y).)
Beating a dead horse a little, perhaps:
GTM>s x("2.0 ")=1,x("2.1 ")=2,x("2.0")=3,x(2.1)=4,x("2.1")=5,x(+"2.0")=6,x(+"some text")=7,x("")=8,x(+"")=9
GTM>zwr
x(0)=9 <--- entries 7 and 9 coerce to the same subscript, 0
x(2)=6
x(2.1)=5 <--- entries 4 and 5 are the same subscript
x("")=8
x("2.0")=3
x("2.0 ")=1
x("2.1 ")=2
regards,
johnleo
Kevin:
--
---------------------------------------
Jim Self
Systems Architect, Lead Developer
VMTH Information Technology Services, UC Davis
(http://www.vmth.ucdavis.edu/us/jaself)
---------------------------------------
M2Web Demonstration with VistA
(http://vista.vmth.ucdavis.edu/)
---------------------------------------
MUMPS is a strongly-typed language. MUMPS always knows exactly which
data type it is using. It is however a dynamically typed language. This
creates a far more flexible environment, but puts the burden on the
programmer to understand the rules for data-type transformation.
Yours truly,
Rick
You've hit the nail on the head here. Except... except that we can't really say that MUMPS is strongly typed if the typing rules aren't spelled out in the standard.
I agree with you that for practical purposes MUMPS behaves like a dynamically typed language, but the implementation can't be the standard.
The difference between 0002.000 and "0002.000" shows up in code, where
the presence or absence of the quotes explicitly denotes your intention
for the value to be interpreted as a string or as a number. The main
places where values like "0002.000" come up are when importing them from
a foreign system via HL7 interfaces, or reading them directly from
machines that encode their results in fixed numbers of decimal places,
or lifting them from blocks of text. In all such cases, any numeric
coercion of "0002.000" causes it to drop the quotes and then proceed
through numeric coercion, which yields the canonic value of 2.
That is, usually something like 0002.000 will be an intermediate
calculation or interface product rather than a result. If you care about
zeroes, for example if you are aligning figures in a table or recording
measurement precision, then you have to do the work to keep the value
interpreted as a string to prevent numeric interpretation from rendering
it in its canonic numeric form. Quotation marks, for example, will
preserve those zeroes when you save it as a variable value or as a
subscript value.
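The coercion described above can be sketched roughly like this. This is my own approximation in Python of MUMPS's unary + rule, not GT.M's actual implementation, and the helper name `to_canonic` is mine:

```python
import re

def to_canonic(value):
    """Rough sketch of MUMPS unary +: take the longest numeric prefix
    of the string, then render it in canonic form ("0002.000" -> 2)."""
    m = re.match(r'[+-]?\d*\.?\d*', str(value))
    text = m.group(0)
    try:
        num = float(text)
    except ValueError:
        return 0                      # no numeric prefix at all -> 0
    return int(num) if num == int(num) else num

print(to_canonic("0002.000"))   # -> 2   (leading and trailing zeroes gone)
print(to_canonic("2.1"))        # -> 2.1
print(to_canonic("some text"))  # -> 0
```

As the prose above says, once the quotes are stripped and coercion runs, nothing of "0002.000" survives but the canonic 2.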
Interestingly, although with a value like 0002.000 you have the option
of collating it either as a number (2) or as a string ("0002.000"), with
a canonic number you do not have that option. There is no way to get a
number like 12 to collate with the strings, not even if you enclose it
in quotation marks (like this: "12"). The act of storing it in a
subscript forces numeric interpretation of any value, string or not,
whose form precisely matches a canonic number. Since the quotes are not
actually part of the value, "12" is rendered 12 and sorts as a number.
If you want to mix canonic number forms and noncanonic numbers and
strings together in a subscript and get them all to sort as strings, you
have to append something to them all (like a space) to ensure that none
of them can match the canonic form of a number.
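The sorting rule just described can be sketched as follows. This is my own approximation, not vendor code; among other simplifications it omits the rule that the empty string collates before everything else:

```python
import re

# Canonic number form: optional minus, no leading zeroes, no trailing
# fractional zeroes (so "2.1" and "12" qualify; "2.0" and "2.10" do not).
CANONIC = re.compile(r'-?(0|[1-9]\d*)(\.\d*[1-9])?$')

def subscript_key(s):
    """Values matching the canonic form sort numerically, before all
    strings; everything else sorts as a string, byte by byte."""
    s = str(s)
    if CANONIC.match(s):
        return (0, float(s), '')
    return (1, 0.0, s)

subs = ["2.0", "2.1", "2.10", "12", "3.1"]
print(sorted(subs, key=subscript_key))
# -> ['2.1', '3.1', '12', '2.0', '2.10']
```

Note how "2.1" lands among the numbers while "2.0" and "2.10" trail behind as strings, matching the ZWRITE output earlier in the thread.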
This follows clearly from MUMPS's data-type coercion rules, but makes no
sense if all values in MUMPS are actually strings.
Yours truly,
Rick
-skip
-skip
Before the 1984 standard, MUMPS accepted only positive numeric
subscripts, so the numeric portion of the modern MUMPS subscript
collation actually predates the string portion, which is why (for
backward compatibility) the numeric interpretation takes precedence
over the string when deciding how to interpret each piece of data going
into a subscript.
Usually it's easiest to resolve these conflicts just by changing our
algorithm.
Task Manager had a problem for years because $HOROLOG's value is a
string, but when we look at it we see two numbers separated by a comma
and expect it to collate numerically. Since most of the day $H is five
numeric digits, a comma, and five more digits, the differences between
numeric and string collation didn't come up, but early in the morning
when the number of digits for seconds since midnight was less than five
digits, collation "errors" began cropping up, with tasks running "out of
order". Wally's solution, which was the right one, was to abandon using
$H as a subscript because we do not want string collation for time.
Instead, he converts it to seconds, a big number, which collates
strictly numerically and hence keeps tasks in chronological order all
the time.
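The fix as I read it can be sketched like this: $HOROLOG is "days,seconds", a string, so it collates byte by byte; converting it to a single big number of seconds gives strictly numeric collation. The helper name `horolog_to_seconds` is mine, not Task Manager's:

```python
def horolog_to_seconds(h):
    """Convert a $HOROLOG-style "days,seconds" string to total seconds."""
    days, seconds = h.split(",")
    return int(days) * 86400 + int(seconds)   # 86400 seconds per day

early = "60500,9000"     # ~2:30 am: only four digits of seconds
later = "60500,86000"    # ~11:53 pm on the same day

print(early > later)     # -> True: as strings, the earlier task sorts LATER
print(horolog_to_seconds(early) < horolog_to_seconds(later))
                         # -> True: as numbers, chronological order holds
```

This is exactly the early-morning failure mode described above: "9000" sorts after "86000" byte-wise, so string-collated tasks run "out of order".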
If we want KIDS version numbers, which are a mix of numbers (2.1) and
strings (2.0), to collate together and sequentially, we need to get rid
of the mix by imposing a uniform type, and either will do. If we want
strings, we need to concatenate something onto them all to make none of
them a canonic number form, and then we will have to do a transform when
comparing lookup values with subscript values to make them match again.
If we switch to strictly numeric, by plussing each value to force
numeric interpretation, we also escape the collation problem, but now we
still have to do transforms to compare a subscript value like 2 with the
original value like "2.0". That's probably the right solution, since
then only some of the values have to be transformed, whereas in the
string approach they all have to be transformed to make the effect of
the transforms on the collation zero out.
The industry believes it has a definition of data types because it is a
big herd in which everyone has the same family scent, doing what seems
like the same thing, which therefore becomes right in the way crowds
tend to find their own crowd behavior self-validating. They have also
constructed a lot of wonderful rationalizations that pass for theories
to prove they are doing the right thing. However, the truth is none of
us is doing the right thing, and the way we know is that we have not
solved the software crisis. Whoever actually gets to do so will get the
bragging rights of being proven by reality to be right, but until then
the rest of us are constructing plausible theories like toolkits we use
to help us solve limited problems. Since the majority prefers static
languages, their data type models are also static and tend to work best
when used in static ways, but the more dynamism is involved in the
algorithm or the data, the more clunky these static theories of data
types tend to behave, and the more they fall apart in practice.
I am in no way saying that static data typing is wrong, only that it
works better for some things than others, just as dynamic data typing
works better for some things than others. The best toolkit includes both
kinds of data typing, which is what we wanted to do with the
object-oriented extensions to MUMPS, to give Mumpsters the option of
whether to use dynamic, operation-driven typing or to use more static,
object-oriented typing case by case, problem by problem (the Omega guys
will remember our principle of Fire and Ice, which was about this).
As a side note, different parts of the industry also use the term data
type to mean different things, even though they tend to agree about the
static part. Sometimes it has a lot to do with the kinds of interfaces
and contracts the data will honor, sometimes more with the internal
structure of the data, and sometimes about what real-world entity is
being modeled regardless of the implementation. The term is most useful
when we understand the variety of ways it can be used and don't try to
apply too much rigor to it, because if we do we find it falls apart into
several related but quite distinct ideas for which we currently have but
a muddle of language to try to grasp.
Yours truly,
Rick
-skip
-skip
One of the first exercises I do when I encounter a new language is the first program I ever wrote: write out the squares from 1 to 100 using only addition (you can do 2+2, or 3+3, etc.). Oh, and by the way, you can't use a For loop.
Interestingly, as Thomas points out in his article, you can go pretty
far with MUMPS believing that the string is the only type and that the
others are all mappings or interpretations, and as long as you can
compartmentalize in your brain effectively, you can properly implement a
MUMPS system or code in MUMPS without ever fully acknowledging the
multiplicity of data types in MUMPS. So, technically, both readings are
correct.
The problem is that (healthy) human beings do not compartmentalize as
well as machines, so having it be both true and not true that the string
is the only data type causes no end of confusion among MUMPS students,
which is why so many of them end up getting to be capable but not fully
competent with MUMPS. MUMPS's data type behavior makes far more
intuitive sense, i.e., is more humanly comprehensible, learnable, and
predictable, only when interpreted as multiple data types, even though a
MUMPS implementor can get the correct behavior out of his implementation
without that interpretation just by strictly following the rules of the
standard.
Certainly, as far as possible readings go, the 1995 MUMPS standard is
not perfect. It has far more rigor than most language standards, a state
of affairs that is surprising to people who assume MUMPS is backward
(the MDC has, after all, been standardizing MUMPS for longer than most
other language-standards bodies, and has made and learned from a lot
more mistakes accordingly), yet appalling to those of us who know that
fully half the proposals for the millennium MUMPS standard were error
corrections or removals of ambiguity. Error
handling in particular is very easy to misunderstand as currently
written in the standard, which is why it took me a year to come up with
a clear, straightforward explanation of how it works for my Paideia
class (who might not entirely agree with my characterization of my
explanation as clear and straightforward), and why one of the more
important error-correction proposals for the millennium standard was
written by David Marcus as a result of trying to correctly implement
error processing as written in the standard.
Still, the errors in the standard are often very subtle ($REFERENCE
excepted) and rarely if ever touch on the data-type issue, which was
laid down long ago and has had a lot of time for vetting and refinement.
It tends to be the newer stuff that is more raw.
If you're interested, when we get into the process of cutting the next
MUMPS standard, but after we have a chance to apply all the existing
proposals that repair defects in the standard, I would love to have your
mathematically precise eyes go over the results and help us clear out
any additional remaining ambiguities. To do so now would just be a waste
of your time, since we already have a great pile of fixes to apply (and
some of them might even introduce new mistakes for you to find).
Yours truly,
Rick
-skip
Hint
1 = 1
2 = 4
3 = 9
4 = 16
5 = 25
Play with those numbers.
-skip
Skip Ormsby wrote: The answer is not 42, and no, Greg, you are using exponentiation (a form of multiplication). You can only use addition and subtraction; remember, the IBM 1400s did not have the internal code to do multiplication and division.
-skip
Ruben Safir wrote:
> On Tue, 2008-10-28 at 14:42 -0700, Greg Woodhouse wrote:
> > Can you use math? (n+1)^2 - n^2 = 2n + 1
> Yes, the answer is 42
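The identity Greg cites is exactly why the hint works: each square is the previous one plus the next odd number, so the whole table can be built with addition alone. A quick sketch (Python is my choice of language here, obviously not period hardware; a while loop is used since the exercise forbade For):

```python
def squares(limit):
    """Squares of 1..limit using only addition: n^2 = (n-1)^2 + (2n-1)."""
    result = []
    square, odd, n = 0, 1, 1
    while n <= limit:
        square = square + odd     # add the next odd number
        odd = odd + 2
        result.append(square)
        n = n + 1
    return result

print(squares(5))   # -> [1, 4, 9, 16, 25]
```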
Although making them all strings is an option, I think you actually want
the collation of numbers here, so it's better to make them all numbers.
Just plus them when you store them and you will force numeric coercion
even on values like "2.0". Now that means the index will store 2 instead
of "2.0", but it will collate before 12 instead of after "12.0", which I
suspect is what you want here.
You will still need to do conversions to use this index, to plus values
(+X) before you check for them in the index subscript, but the resulting
code should look just about as elegant as Pascal would be, with just the
addition of the +. As a side benefit, by plussing the values as you
stick them in the index, you will be back in the know about what their
data type is.
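A hedged sketch of the "plus them when you store them" advice, with a Python dict standing in for the MUMPS index; `float()` stands in for unary +, which is enough for simple values like "2.0":

```python
def plus(v):
    """Stand-in for MUMPS unary +: coerce to canonic numeric form."""
    n = float(v)
    return int(n) if n == int(n) else n

index = {}
for ver in ["2.0", "12.0", "1.0", "2.1"]:
    index[plus(ver)] = ver            # store under the canonic number

# Lookups must plus the search value the same way:
print(plus("2.0") in index)           # -> True (stored as 2, not "2.0")
print(sorted(index))                  # -> [1, 2, 2.1, 12]: 2 before 12
```

As the note above says, the payoff is numeric collation (2 before 12) at the cost of plussing each value on the way in and on lookup.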
Yours truly,
Rick
I agree that we're approaching the same point, and our directions may
not be as different as they first seem. Getting a firmer mathematical
foundation for the standard is a very, very interesting proposition to
me. The earliest MUMPS standard included state-transition
diagrams to help make MUMPS's behavior even more rigorously specified,
and I would love to see something introduced to replace them. When we
were working on Omega, I experimented with an object-oriented
specification of MUMPS from the perspective of methods and properties of
the MUMPS virtual machine, and the initial experiments produced much
clearer results than the current specification.
Clearly then there is room for improvement. What do you have in mind?
Yours truly,
Rick
There are two standards, because the Patch module predates the KIDS
module and each was developed by a different programmer. The patch
module allows version numbers like 2, but KIDS requires at least one
decimal position like 2.0. So there are two version numbering patterns,
which means you get to choose.
The reason you should go with numeric is so that 1 sorts before 2 which
sorts before 12, instead of "1.0" sorting before "12.0" which sorts
before "2.0". You want numeric collation so that versions sort in the
order you expect. Small custom storage arrays collate exactly the same
as indexes on Fileman files, in this regard and all others.
When you're ready to match the version numbers to versions that are
already in the Package file, just append the ".0" back onto the end of
any version number that lacks a decimal point, like this:
I VERSION'["." S VERSION=VERSION_".0"
before you check the Package file.
Rick,
What I mean is that in pascal, the conversion is explicit, not
dynamic as in mumps. For example, I appreciate your prior comments
that 2.1 is different from 2.0. But if I have a function like below
DoSet(i)
set x(i)=""
quit
I can't be sure what I am getting. What is "i"? Is it a number? or a
string? Sure mumps knows, but I don't.
Again, we need to learn the chess game for what it is and play it. I
was just complaining a bit. Trust me, when I am working with pascal,
I wish for mumps globals!
But my immediate problem is that when I have the statement:
S ^TMG("KIDS",VER)="" etc, I need a way to force the different
versions of Ver to behave the same.
I tried adding quotes around the variable, and that doesn't work:
GTM>set ver="1.0" set x(ver)=""
GTM>set ver=2.0 set x(""""_ver_"""")=""
GTM>zwr x
x("""2""")=""
x("1.0")=""
And in the example below, how could I have forced 1.1 to remain "1.1"
and so sort correctly?
GTM>set a="abc*1.0*123",b="abc*1.1*123"
GTM>set ver1=$p(a,"*",2),ver2=$p(b,"*",2)
GTM>w ver1,!,ver2
1.0
1.1
GTM>set x(ver1)="",x(ver2)=""
GTM>zwr x
x(1.1)=""
x("1.0")=""
Thank you,
Thom H., another HHGTTG fan
He didn't have time to waste, yet when my 7-year-old son showed him his
pet parakeet (Little), Tom insisted on showing him how to train him to
do tricks, and how to play with him properly. On countless occasions
when I said I was stumped on a problem, he said, well, let's look at it
together
right now. When we discussed talking to a third person about work, he
would instantly get that other person on the phone with us, (sometimes
regardless of how late it was).
He was missing the part that said work was supposed to be boring; for
him it was always exciting and fun. He was missing the part of people
that keeps them from telling jokes you know no one will laugh at, but he
told them anyway, and we always laughed.
He wasn't missing a couple of things. He wasn't missing friends, he wasn't
missing a wonderful family that it was clear he loved very much, he
wasn't missing a keen mind. He certainly wasn't missing respect which he
had in abundance.
Now however I find that I miss him and his friendship.
Tom's service was very moving, I know all on this list that knew him
would have been there if they could.
Rest in Peace Tom
Thank you. That's the Tom I knew and loved. For days I have been looking
for the words to express my love for Tom, and here they are. Thank you.
Yours truly,
Rick