Relationships vs using as a property or arrays

Neerav Verma

Nov 6, 2009, 5:55:46 PM
to Ensemble-in...@googlegroups.com
A little birdie told me that in Caché 5.0, if a class had relationships with another class and the number of records went over 10k or so, the performance problems were tremendous.
In my experience with other databases, even millions of records don't make a difference.

I would like opinions on how true that is, as it will help me design the application better in terms of performance.


Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Ogden Nash  - "The trouble with a kitten is that when it grows up, it's always a cat."

Eric

Nov 9, 2009, 8:30:17 AM
to InterSystems: Ensemble in Healthcare

My 2 cents answer applies to the current release:

a) arrays are always swizzled with the object while relationships
aren't. So arrays bring a lot more overhead and should never be used
with high cardinality or if you expect lazy-load behavior.

b) relationships are fast, however there are some cases where you
should bypass them. e.g. it is faster to insert directly into a related
table than through the relationship. And if you are doing a LOT of IO,
using SQL access will be faster than object access because you don't
swizzle anything and don't run constructors and destructors.

Relationships are fast but you do need to use them appropriately AND
index them. You also need to remember to unswizzle if you loop over a
relationship. They bring power and flexibility but you still need to
understand how they are implemented internally to use them
efficiently. e.g. you had better understand the difference between
MyObject.%Save(0) and MyObject.%Save(). The second can be 10x slower
in some circumstances if you work with relationships...and it can also
save you a lot of code, it's a trade-off.
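
To make the points above concrete, here is a minimal sketch, not taken from anyone's real application. It assumes a hypothetical "many" class MyApp.PatientPhone whose inverse side, MyApp.Patient, declares a matching Phones relationship; the class, property and index names are all assumptions for illustration.

```objectscript
/// Hypothetical "many" side of a one-to-many relationship.
Class MyApp.PatientPhone Extends %Persistent
{

Property PhoneNumber As %String;

Relationship Patient As MyApp.Patient [ Cardinality = one, Inverse = Phones ];

/// Index the relationship so lookups by patient do not scan every phone.
Index PatientIdx On Patient;

/// Object insert with a shallow save: %Save(0) saves only this object,
/// whereas %Save() also walks the related objects that are in memory.
ClassMethod AddViaObject(patientId As %String, number As %String) As %Status
{
    Set phone = ..%New()
    Set phone.PhoneNumber = number
    // Set the reference by ID so the patient object is never opened
    Do phone.PatientSetObjectId(patientId)
    Quit phone.%Save(0)
}

/// Direct SQL insert into the related table: nothing is swizzled and
/// no constructors or destructors run. Returns SQLCODE (0 on success).
ClassMethod AddViaSQL(patientId As %String, number As %String) As %Integer
{
    &sql(INSERT INTO MyApp.PatientPhone (Patient, PhoneNumber)
         VALUES (:patientId, :number))
    Quit SQLCODE
}

}
```

Which of the two insert paths is faster in a given application is worth measuring rather than assuming; the sketch only shows what each one looks like.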

hope that helps

Eric

Nov 9, 2009, 8:34:47 AM
to InterSystems: Ensemble in Healthcare

Addendum ;)

> the number of records went over 10k

You can have millions of records in a relationship with no problem at
all...if you take care with what you are doing (e.g. the previous comment on
%Save()). If you don't realize that some code will swizzle a lot of
objects automatically behind your back, you will have poor performance.

Neerav Verma

Nov 9, 2009, 10:20:56 AM
to ensemble-in...@googlegroups.com
So let's say I have one parent class Patient
and have 10 child objects. Address, Phone, Name etc... (just an example)

I create a Patient
Then I go and create / update its child objects one by one.

So my code would be

  // MyApp is a placeholder package name
  Set patient = ##class(MyApp.Patient).%New()
  // set patient properties
  Do patient.%Save()

  Set phone = ##class(MyApp.PatientPhone).%New()
  // set phone properties
  Set phone.Patient = patient
  Do phone.%Save()

If PatientPhone were a serial object instead of persistent,
I would still call %New(),
but instead of saving the phone I would insert it into the patient:
  // Phones is an assumed array property; SetAt takes (element, key)
  Do patient.Phones.SetAt(phone, 1)

To maintain indexes etc. I feel making the class persistent is the better approach.
Is there any specific code here which I can change to make it faster?

How do I unswizzle objects?


Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Charles de Gaulle  - "The better I get to know men, the more I find myself loving dogs."

--
You received this message because you are subscribed to the Google Groups "InterSystems: Ensemble in Healthcare Community" group.
To post to this group, send email to Ensemble-in...@googlegroups.com
To unsubscribe from this group, send email to Ensemble-in-Healt...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/Ensemble-in-Healthcare?hl=en

Eric

Nov 9, 2009, 4:45:10 PM
to InterSystems: Ensemble in Healthcare

A serial object and a persistent object are both ok but have very
different designs. A serial object is an embeddable object. You cannot
save it by itself because it is saved in its container (i.e. the
class(es) that use it).

> How do I unswizzle objects?

Look at the %UnSwizzleAt method of the %Library.RelationshipObject class. You
will also find it in the documentation, which has a good section
on relationships. There is also an example in the SAMPLES namespace, in
Sample.Company, method PrintPayroll().
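
The looping pattern looks roughly like the sketch below. It is modeled on the PrintPayroll() idea but uses hypothetical names: a MyApp.Patient class with a Phones relationship and a PhoneNumber property on the related class are assumptions for illustration only.

```objectscript
/// Hypothetical method on MyApp.Patient, which is assumed to declare:
///   Relationship Phones As MyApp.PatientPhone [ Cardinality = many, Inverse = Patient ];
Method PrintPhones()
{
    Set key = ""
    Do {
        // GetNext advances key and swizzles the related object into memory
        Set phone = ..Phones.GetNext(.key)
        If (phone '= "") {
            Write phone.PhoneNumber, !
            // Release the object just visited so memory use does not
            // grow with the number of related objects
            Do ..Phones.%UnSwizzleAt(key)
        }
    } While (key '= "")
}
```

Without the %UnSwizzleAt call, every object visited stays in memory for as long as the parent object does, which is what hurts when the relationship is large.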

Neerav Verma

Nov 9, 2009, 4:47:31 PM
to ensemble-in...@googlegroups.com
If I make a serial object and use it as an array,

how is that different from having a persistent class as children,
besides the referential integrity part?

Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Marie von Ebner-Eschenbach  - "Even a stopped clock is right twice a day."

Eric

Nov 9, 2009, 5:16:30 PM
to InterSystems: Ensemble in Healthcare

A serial object is actually stored as a List, not an array. The list
is stored in-row as a single field of the container. So it is stored
and loaded by and with the container. You see the serial object as a
property of its container class. You are also limited to 1 instance of
your serial object unless you modify the class definition to add
another property of the serial type. In short, the serial object is
handled as a single property/field of its container class.

A relationship can have a large cardinality. If a Customer has a child
relationship on Phones, each customer record can have an "unlimited"
number of phones. If Phone is implemented with a serial object the
number of phone numbers a customer can have is finite (usually 1 but
your serial could have Phone1, Phone2, Phone3 properties). If the
relationship is 1-many instead you could also have common (i.e.
shared) phone numbers between customers.
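
As a rough side-by-side sketch of the two designs described above (the MyApp package and all class and property names are hypothetical, chosen only for illustration):

```objectscript
/// Hypothetical serial class: it has no ID of its own and is stored
/// inside its container.
Class MyApp.Phone Extends %SerialObject
{
Property PhoneNumber As %String;
}

/// Hypothetical container showing both designs.
Class MyApp.Customer Extends %Persistent
{
/// Embedded serial object: saved and loaded with the Customer row.
Property HomePhone As MyApp.Phone;

/// Child relationship: each phone is its own persistent object, so the
/// number of phones per customer is not fixed by the class definition.
Relationship Phones As MyApp.CustomerPhone [ Cardinality = children, Inverse = Customer ];
}

/// Hypothetical child class for the relationship above.
Class MyApp.CustomerPhone Extends %Persistent
{
Property PhoneNumber As %String;

Relationship Customer As MyApp.Customer [ Cardinality = parent, Inverse = Phones ];
}
```

Sharing a phone between customers would need a one-to-many relationship or a separate linking class instead of the parent/children pair shown here.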

Caché is quite flexible but you need to know what your specific
needs are and make design decisions based on that.

wolfkoelling

Nov 13, 2009, 7:18:33 AM
to InterSystems: Ensemble in Healthcare
I can't fully agree with your response. The question related
specifically to Caché v5.0 and it is indeed true that in v4 and v5.0
relationships suffered from a near catastrophic performance drop once
you got into the thousands of child / many elements, in particular
when inserting objects. This was fixed in v5.1 (see chapter 5.2.1 in
release notes document at http://docs.intersystems.com/documentation/cache/cache51/PDFS/GCRN.pdf).

With regards to arrays of objects these are not swizzled into memory
when you open the "one" object, that would be ludicrous, but the
pointers to the "many" objects in the "one" class are indeed all
loaded which is a performance overhead.

On that version of Caché, if we talk about a very large collection and
we want Caché to take care of referential integrity and want to
maintain performance then I would suggest adopting a more relational
design, i.e. have nothing in the "one" class (neither relationship nor
array of objects) but simply add a foreign key constraint in the
"many" class.

Wolf

Neerav Verma

Nov 13, 2009, 9:12:21 AM
to ensemble-in...@googlegroups.com
Can you please explain more about the constraint?

And does having parent/child also load all the child objects into memory?

We use arrays a lot, but the problem with arrays is that they are much harder to deal with on Zen pages.
They don't maintain referential integrity and I can't have indexes on classes with arrays
(I generally define a serial class and use it as an array).
But if I define children, I still have data associated with the parent and have referential integrity and also indexes.

What are your thoughts on that?

Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Samuel Goldwyn  - "I'm willing to admit that I may not always be right, but I am never wrong."

wolfkoelling

Nov 13, 2009, 12:01:26 PM
to InterSystems: Ensemble in Healthcare
As for foreign keys:
http://docs.intersystems.com/cache50/csp/docbook/DocBook.UI.Page.cls?KEY=GSQL_foreignkeys#GSQL_C10492
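
For orientation, a foreign key declared in the "many" class looks roughly like the sketch below. The class names MyApp.Matter and MyApp.Client, the property names and the choice of OnDelete value are assumptions for illustration, and keyword support may differ between Caché versions, so check the documentation for the release in use.

```objectscript
/// Hypothetical "many" class; the "one" class (MyApp.Client) needs no changes.
Class MyApp.Matter Extends %Persistent
{

/// Scalar field holding the ID of the referenced MyApp.Client row.
Property Client As %Integer;

/// With no index named inside the parentheses, the constraint refers to
/// the IDKEY of MyApp.Client. OnDelete is optional; noaction is the default.
ForeignKey ClientFK(Client) References MyApp.Client() [ OnDelete = cascade ];

/// Index the referencing field so lookups by client stay fast.
Index ClientIdx On Client;

}
```

Nothing is added to the "one" class in this design, so opening a client never loads any pointers to the related rows.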

No, as Eric explained, opening a parent object doesn't swizzle the
child objects into memory. The performance problems are related to
amending elements in your child collection when Caché runs a query on
all child/many elements.

Btw, in later versions of Caché (certainly as of 2007.1) you can
define indices on array and list properties (even though some of these
were being ignored by the sql compiler when generating code for
queries - but all that is fixed as of 2009.1 I believe).

As you are using ZEN and ZEN wasn't launched until 2007.1 I'm assuming
that you have application servers with a fairly recent version of
Caché and v5.0 data servers? You should be able to compare
relationship behaviour between these two versions. Quite a difference,
believe me.

Wolf

Eric

Nov 13, 2009, 12:38:49 PM
to InterSystems: Ensemble in Healthcare

> I can't fully agree with your response. The question related
> specifically to Caché v5.0

Yes, but I initially stated: "My 2 cents answer applies to the current
release: "


> With regards to arrays of objects these are not swizzled into memory
> when you open the "one" object, that would be ludicrous, but the
> pointers to the "many" objects in the "one" class are indeed all
> loaded which is a performance overhead.

Yes, and when talking about "millions of records", loading the entire
array each time is too expensive. I don't see any advantages to arrays
for persistent objects, do you? Child relationships seem much more
efficient.

Neerav Verma

Nov 13, 2009, 3:21:46 PM
to ensemble-in...@googlegroups.com
So if I have an array of %String, how do I index it?
I tried

Property CrossReferenceCodes As array Of %String;
CrossReferenceCodesIdx on  CrossReferenceCodes

And it did not compile

Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Ted Turner  - "Sports is like a war without the killing."

Eric

Nov 13, 2009, 4:28:49 PM
to InterSystems: Ensemble in Healthcare

I don't have a good answer since I am not using arrays much but if you
create an ArrayOfCodes class that extends %Library.ArrayOfDataTypes
and then declare your property as Property CrossReferenceCodes as
ArrayOfCodes, you can create an index on it (tested with 2010.1).

I didn't check if the index is used properly nor did I do any other
tests since I prefer relationships but if you experiment with this I
will be interested in knowing the results;)


Eric

Nov 14, 2009, 11:10:38 AM
to InterSystems: Ensemble in Healthcare

> Property CrossReferenceCodes As array Of %String;

Try:

Index NewIndex1 On CrossReferenceCodes(ELEMENTS) [ Data = CrossReferenceCodes(KEYS) ];

wolfkoelling

Nov 16, 2009, 4:54:06 AM
to InterSystems: Ensemble in Healthcare
If you want to create an index on both key and value:
Index <myIndexName> On CrossReferenceCodes(ELEMENTS, KEYS);

If you want to create an index on just the key of the array entry:
Index <myIndexName> On CrossReferenceCodes(KEYS);

If your property was defined as a list of datatypes:
Index <myIndexName> On CrossReferenceCodes(ELEMENTS);

As I said, this is not available in Caché 5.0, and while e.g. 2007.1
will accept all these examples and build the indices correctly,
the SQL engine will certainly ignore the array key index. As of 2009.1 all
of this is working fine, as far as I can see.

Wolf

wolfkoelling

Nov 16, 2009, 5:08:35 AM
to InterSystems: Ensemble in Healthcare
> Yes, but I initially stated: "My 2 cents answer applies to the current
> release: "

Indeed you did and I overlooked that. So me not agreeing with you made
no sense.

> I don't see any advantages to arrays for persistent objects, do you? Child
> relationships seem much more efficient.

Totally agree when it comes to parent/child relationships. However,
arrays (and lists) can play a role as an alternative to one/many
relationships. For example, in our system we have a class called
"Client" which is pretty much a core class, and we keep creating new
classes that relate to one or more clients. If I wanted to establish
one/many relationships then the Client class would be under near
constant development and have well in excess of 50 relationship
properties. Hardly practical. So typically we create an array of
Client property on the new class, thereby not touching the Client class
itself. Of course we know that we don't have millions of clients, not
even tens of thousands, and (unless business picks up in a dramatic
fashion) won't have in the future.


Wolf
-------------

Eric

Nov 16, 2009, 7:47:58 AM
to InterSystems: Ensemble in Healthcare

> have well in excess of 50 relationship
> properties. Hardly practical.

We use standard Foreign Keys for this. It provides referential
integrity + the cascading option that current relationships lack
(since they currently support just NOACTION for one-many
relationships). Using scalar fields with FK does not provide automatic
swizzling and does not allow the -> object syntax in SQL, but having
referential integrity and cascading options is more important for us.
I haven't tried using FK with arrays but it does not seem feasible.


Neerav Verma

Nov 16, 2009, 4:43:24 PM
to ensemble-in...@googlegroups.com
This worked

Index ModifiersIdx1 On Modifiers(ELEMENTS);
Index ModifiersIdx2 On Modifiers(KEYS);

This did not

Index ModifiersIdx1 On Modifiers(ELEMENTS, KEYS);


Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Ted Turner  - "Sports is like a war without the killing."

wolfkoelling

Nov 17, 2009, 5:32:41 AM
to InterSystems: Ensemble in Healthcare
Sorry, should have been

Index ModifiersIdx3 On (Modifiers(ELEMENTS), Modifiers(KEYS));
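
Put together in one class, the three index forms would look roughly like this. The class name MyApp.Charge is hypothetical; the property and index definitions are the ones from this thread.

```objectscript
/// Hypothetical class collecting the index definitions discussed here.
Class MyApp.Charge Extends %Persistent
{

Property Modifiers As array Of %String;

/// Index on the values stored in the array
Index ModifiersIdx1 On Modifiers(ELEMENTS);

/// Index on the array keys
Index ModifiersIdx2 On Modifiers(KEYS);

/// Composite index on value and key: each part names the property
/// again, and the list is wrapped in parentheses
Index ModifiersIdx3 On (Modifiers(ELEMENTS), Modifiers(KEYS));

}
```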

Wolf

Neerav Verma

Nov 17, 2009, 5:38:01 PM
to ensemble-in...@googlegroups.com
That worked

Thank You,

Neerav Verma
http://www.linkedin.com/in/vneerav
------------------------------------------------------
Samuel Goldwyn  - "I'm willing to admit that I may not always be right, but I am never wrong."