A detail, type of foreign key to autoincrement key

49 views

Skip to first unread message

Dawn Wolthuis

unread,

Jan 22, 2012, 8:02:30 PM1/22/12

to intersystems-mv

On my "think of this at some point" list is an inconsistency in our data model. I would at least like to be consistent from here on out.

When we have an autoincrement key, sometimes we create foreign keys as reference types, but sometimes, such as when they are in multivalues, we specify them as either %String or %Integer. Most of the time these are %Integer in our class definitions. Although the counter adds 1 to these @ID's, so they act like numbers, it struck me that %Integers have a max value (maybe something like 2147483647 or ~ 2 billion) where we might someday want these keys to grow beyond the billions. I do not look forward to changing the type of existing fields because whether we quote them in SQL and such aligns with the type, so I might be lazy and wait until we get to billions ;-)

In any case, if I want to be consistent from here on out and/or change existing foreign keys (to autoincrement @ID's), other than the max issue, are there reasons to choose %String or %Integer or vice versa as the type for the foreign key?

And while I'm thinking of it -- if we were to experiment again with "list of ReferenceType" instead of "list of %Integer" or "list of %String" will we still encounter complexities (bugs) going that route? I do not think we can specify foreign keys separately from the property definition yet (can we?) so it is always helpful for easy SQL statements if we specify foreign keys as reference types.

These questions and more. Thanks. --dawn

--
Dawn M. Wolthuis

Take and give some delight today

Bill Farrell

unread,

Jan 22, 2012, 10:10:11 PM1/22/12

to InterSy...@googlegroups.com

If a key is a key is a key then you can use something like
Key = SYSTEM(99):SYSTEM(12):RND(Rnd(65535)) ; * use *nix time : milliseconds since midnight : some random value as a seed string
BinKey = $SYSTEM.Encryption->MD5Hash( Key )
Key = OConv( BinKey, "MX0C" ) ; * convert the binary to a hex string

They keys are always exactly the same length and are (almost) 100% guaranteed to be unique. The more unique stuff you throw in up front, the more certain you are to get a 100% unique key every time. It's easy to check with one more READ, and easy to fix (for example, throw the key you found into the mix as the "random thing" above and rerun the hash). The advantage of using an MD5 hash is that one bit's worth of difference produces a large variation in the resulting string. If you're selecting by column criteria alone and the key is just a way to store a record, it really doesn't matter what it is.

MD5 hashes are commonly used to check data integrity (I use them a lot in communications protocols) and to verify integrity of a deploy package.

Ymmv for taste in key-shape, but this will absolutely ensure you never run out of numbers.

B

--
You received this message because you are subscribed to the Google Groups "InterSystems: MV Community" group.
To post to this group, send email to Cac...@googlegroups.com
To unsubscribe from this group, send email to CacheMV-u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/CacheMV?hl=en

Dawn Wolthuis

unread,

Jan 22, 2012, 10:27:42 PM1/22/12

to intersy...@googlegroups.com

Hi Bill -- We use Cache' built-in autoincrement keys, which work great. This should never run out of numbers either. My tiny little detail was about the "type" of a foreign key that is not designated as a reference type. In other words, if we have an autoincrement key, should we write a foreign key property in another file as

Property PartyId As %String

Property PartyId As %Integer

if we are not using a reference type like

Property PartyId As P.Party ...

This is a much simpler question. Cheers! --dawn

Bill Farrell

unread,

Jan 23, 2012, 12:21:58 AM1/23/12

to intersy...@googlegroups.com

Hmm... that wants an answer to a question I've had for a good long while yet: How strong is Cache' typing? It seems to be nearly as lenient as Pick. (And aren't things always strings in MUMPS? What about COS?) There's more legacy here than I have learned about yet. Every day I'm reminded "It ain't Pick", which I usually take as an Extremely Good Thing(tm). Cache is unbelievably flexible and willing, much moreso than, say, PHP, or Dot Net. The down side is that endless flexibility lends endless room for lots of bad practice (to wit, the R81 code I'm wrestling with at a customer's site).

It has been said often of Pick(alike) code that it's so forgiving that even if a beginner writes it badly, it will still work.

I'm really retentive about consistent typing/shaping/enforcement. I like to keep all types exactly the same (I deal with MySQL and MS-SQL a lot and they're a LOT stricter about object typing). That is, if a field in File A is a %String, I wouldn't have a property in File B as %Integer. Cache', like Pick, does a lot of converting on the fly between numeric types and string types. (I'm guessing that's handed down from the MUMPS days.) What it does with other types, I haven't a clue (I haven't looked into it, but I haven't stumped my toe but once in passing object handles from MVBasic to COS... then it cared A LOT heh).

When is object typing enforced and when does automatic conversion occur? It could be that it might not matter if the system takes care of it most circumstances. OTOH it may matter a lot when you're expecting a certain "shape" of thing always; consistency is key. Otherwise, you might encounter an automatic conversion that has unpleasant results in the current computation, or worse encounter a place where an automatic conversion cannot occur and you get an exception. (Communications phantoms get especially PO'd about that and they're devilishly difficult to debug; another reason I have extensive tracing and debugging facilities in MVBooster.)

I tend to be very pessimistic about any system's automatic typing. Cache gives you the opportunity to enforce data integrity on the way in ( propertySet() ) and to enforce data shaping on the way out ( propertyGet() ). I make extensive use of these, so no matter what some fool (I shall remain nameless) may do in a file with EDIT, if I stick to using file object properties in my routines rather than the classic Pick "record< attribute >", I'm near 100% sure my data will at least be reasonable if not entirely accurate. (Accuracy I lay on human factors; "reasonable" means my conversions, scaling, DICT items, "English" retrieval, routines, etc will work 100% of the time.)

My version of protoclass always gives ...Set() and ...Get() methods for each attribute/property, along with interfaces to standard MVFileAbstraction utilities. I've been sentenced both to read my own code and to fix "How the heck could a user do..." so enforcing integrity on Set and shaping on Get are habit. I made my MVBProtoclass and MVFileAbstraction classes super-beefy to settle typing/integrity/shaping questions up-front. The hard work of writing a file class is done for me; all I need to do is tot-up any shaping or integrity enforcement and I'm done with it, ready to put it to work. It also means my typing will be consistent without my having to think too much about it. If I have to schlep code from one system to another, nine times out of nine and a quarter I've covered eccentricities in languages that are more picky (no pun!) about typing and shaping.

If it were me, I'd make any reference to a given field object match the original type, regardless of how much "help" the system may offer. If you ever have to port code from one system or language to another, consistency will be your BFF.

Maybe some nice ISC person can chime in and give a clue as to how (and how much) the system manages typing and where it's incumbent on the programmer to ensure consistent results.

B

Dawn Wolthuis

unread,

Jan 23, 2012, 7:53:32 AM1/23/12

to intersystems-mv

Good deal. Thanks, Scott. That makes it clearer to me. I will rest easy that we can treat %Integer as if it had no upper bound that would get in our way.

I previously opened the class doc for %Integer and saw that the xDBC projection had the "usual signed 32-bit integer limits" and wasn't sure if I should care or not (not using xDBC at this point). I didn't understand "the values of which is not exact outside of a certain range" in the blurb below.

"The logical value of the %Integer data type is an integer. The %Integer datatype does not have built-in limits. It is a hybrid datatype the values of which is not exact outside of a certain range (currently it is the same range as that of the 64-bit integer value). The current range for xDBC projections can be seen under Cache configuration/advanced. Its default corresponds to the usual signed 32-bit integer limits. "

It sounds like the difference in making foreign keys %Integer over %String would be in quoting in sql, so I'll probably go with %Integer then.

Thanks! --dawn

On Sun, Jan 22, 2012 at 10:36 PM, Scott Jones <Scott...@intersystems.com> wrote:

Just in case my reply to googlegroups doesn't get through, I'm replying also just to you...

Cache can handle integer values up to 2^63-1, (about 19 digits, not the 2^31-1 that you mentioned below), and I think that is what %Integer maps to...

That will last you until after the heat death of the universe, or so I've been told!

Scott

--
You received this message because you are subscribed to the Google Groups "InterSystems: MV Community" group.
To post to this group, send email to Cac...@googlegroups.com
To unsubscribe from this group, send email to CacheMV-u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/CacheMV?hl=en

Dawn Wolthuis

unread,

Jan 23, 2012, 9:01:29 AM1/23/12

to intersy...@googlegroups.com

On Sun, Jan 22, 2012 at 11:21 PM, Bill Farrell <james.west...@gmail.com> wrote:

Hmm... that wants an answer to a question I've had for a good long while yet: How strong is Cache' typing? It seems to be nearly as lenient as Pick.

The typing or lack thereof is in the languages used. One reason that we have a mixture of %Integer and %String for autoincrement foreign keys is that in most places here, the languages are forgiving. In SQL, for example, if you have a property declared as an %Integer and you have an SQL statement

select * from Some.File where MyInteger = '1'

It will do as expected (or as a Pickie would expect) and bring up the row with MyInteger = 1

(And aren't things always strings in MUMPS? What about COS?) There's more legacy here than I have learned about yet. Every day I'm reminded "It ain't Pick", which I usually take as an Extremely Good Thing(tm). Cache is unbelievably flexible and willing, much moreso than, say, PHP, or Dot Net. The down side is that endless flexibility lends endless room for lots of bad practice (to wit, the R81 code I'm wrestling with at a customer's site).

It has been said often of Pick(alike) code that it's so forgiving that even if a beginner writes it badly, it will still work.

Those preferring strong-typing suggest that even an expert cannot write anything that works correctly and is maintainable. It is way too easy to get things to compile and harder to debug with loose typing. There are pros and cons to each. When asked once whether I prefer to work with strong or loosely typed languages, I replied that loose typing was my comfort food.

I'm really retentive about consistent typing/shaping/enforcement. I like to keep all types exactly the same (I deal with MySQL and MS-SQL a lot and they're a LOT stricter about object typing). That is, if a field in File A is a %String, I wouldn't have a property in File B as %Integer.

In the case of an autoincrement key, we do not specify it to the class, so there is no spec of ours that has a type with it, which is why it seemed up in the air whether we spec it as an %Integer or %String. The key value is generated by Cache'. I think there are other PICK-alikes with an autoincrement feature but I have never used the DBMS for this and I really like it.

Cache', like Pick, does a lot of converting on the fly between numeric types and string types.

MVBASIC works as expected and COS is similar, it seems. By the way, when working with persistent classes, there are some hidden methods for DisplayToLogical and LogicalToDisplay that are employed.

(I'm guessing that's handed down from the MUMPS days.) What it does with other types, I haven't a clue (I haven't looked into it, but I haven't stumped my toe but once in passing object handles from MVBasic to COS... then it cared A LOT heh).

In a persistent class, you can declare the type of a property to be the name of another class. So, we have a persistent class named P.Party with an autoincrement key. Then in another class, P.Another we can define the foreign key as

Property PartyId As P.Party (MVATTRIBUTE = 5, MVWIDTH = 13);

instead of either %Integer or %String. Then if the P.Party file has FirstName in it and we want to do a Cache' SQL statement without specifying a join, we can write

select PartyId, PartyId->LastName from P.Another;

Because PartyId IS-A P.Party so it has a LastName. This is awesome! Instead of coding a flood of I-desc TRANS types we can reach from our current record to others that it can see by specifying types like this.

When is object typing enforced and when does automatic conversion occur?

I would like to be clearer on this too. I've been caught by it a few times. I think Zen cares in some places, perhaps when setting the value of a model field from JavaScript? I'm forgetting. I don't make mistakes often ;-)

It could be that it might not matter if the system takes care of it most circumstances. OTOH it may matter a lot when you're expecting a certain "shape" of thing always; consistency is key. Otherwise, you might encounter an automatic conversion that has unpleasant results in the current computation, or worse encounter a place where an automatic conversion cannot occur and you get an exception. (Communications phantoms get especially PO'd about that and they're devilishly difficult to debug; another reason I have extensive tracing and debugging facilities in MVBooster.)

I tend to be very pessimistic about any system's automatic typing. Cache gives you the opportunity to enforce data integrity on the way in ( propertySet() ) and to enforce data shaping on the way out ( propertyGet() ). I make extensive use of these, so no matter what some fool (I shall remain nameless) may do in a file with EDIT, if I stick to using file object properties in my routines rather than the classic Pick "record< attribute >", I'm near 100% sure my data will at least be reasonable if not entirely accurate. (Accuracy I lay on human factors; "reasonable" means my conversions, scaling, DICT items, "English" retrieval, routines, etc will work 100% of the time.)

We do not use setters and getters on the files. We had tried adding ...IsValid() methods to model classes, but backed off from that for some reason, so our server-side validation logic is not coded consistently enough. Application developers have enough rope to hang ourselves. Specifying foreign keys as reference types helps with referential integrity, which is also a nice feature of Cache'.

My version of protoclass always gives ...Set() and ...Get() methods for each attribute/property, along with interfaces to standard MVFileAbstraction utilities. I've been sentenced both to read my own code and to fix "How the heck could a user do..." so enforcing integrity on Set and shaping on Get are habit. I made my MVBProtoclass and MVFileAbstraction classes super-beefy to settle typing/integrity/shaping questions up-front. The hard work of writing a file class is done for me; all I need to do is tot-up any shaping or integrity enforcement and I'm done with it, ready to put it to work. It also means my typing will be consistent without my having to think too much about it. If I have to schlep code from one system to another, nine times out of nine and a quarter I've covered eccentricities in languages that are more picky (no pun!) about typing and shaping.

If it were me, I'd make any reference to a given field object match the original type, regardless of how much "help" the system may offer.

Agreed. Again, the issue here is that I do not see any spec for this type as the value is autogenerated. From an MV standpoint, we know the @ID is just a string. Because it is incremented, I know that the system is treating it in some way like an int.

If you ever have to port code from one system or language to another, consistency will be your BFF.

Maybe some nice ISC person can chime in and give a clue as to how (and how much) the system manages typing and where it's incumbent on the programmer to ensure consistent results.