View this page "RESOURCE: More General Way to Use Data from External Tables"

61 views
Skip to first unread message

Lexi Hayden

unread,
Apr 3, 2009, 11:16:00 AM4/3/09
to InterSystems: DeepSee Community
Hi all,
I've discovered a general technique for building dimensions on data in
tables external to your base table. With this technique, it seems
theoretically possible to access anything in your database.

I'll add this to the doc. for 9.1. For now, click on
http://groups.google.com/group/intersystems-deepsee-community/web/resource-more-general-way-to-use-data-from-external-tables?hl=en
- or copy & paste it into your browser's address bar if that doesn't
work.

Glebanski

unread,
Apr 6, 2009, 7:45:19 AM4/6/09
to InterSystems: DeepSee Community
Very interesting.
But how about the performance?

Does it mean that when I run a pivot table containing this attribute
against the table with 100K rows this sql-statement will be executed
100 000 times?

Can someone enlighten us a little bit about the insides of Deep See?

Lexi Hayden

unread,
Apr 6, 2009, 10:52:16 AM4/6/09
to InterSystems: DeepSee Community
Fortunately, no, this SQL is not executed at runtime. It is used only
when the indices are built.

The DeepSee data model (created in the Architect) describes how to
build a fact table, which is a highly denormalized table with a high-
speed lookup mechanism (the indices). Each row of your base class gets
represented by a row in the fact table, and that row has all the
dimension and measure information for that base-class item.

Then at runtime (running the pivot table, I mean), DeepSee uses that
fact table. So instead of performing a complex query against the
database via SQL, possibly hitting multiple tables, we use a simple
query that selects rows from the fact table and uses them (in a manner
still slightly mysterious to me).

The fact table is central to DeepSee. In the Architect, you are
telling DeepSee what information to put into the fact table. Some of
that info. is simply copied from your normalized tables, but some of
it is computed (using any expression you can write in COS!). And that
computation occurs when the indices are built.

Kenneth Poindexter

unread,
Apr 6, 2009, 2:01:11 PM4/6/09
to intersystems-de...@googlegroups.com
I think it is important to point out that the original assumption could be correct when actually running the build process in Architect, or thru the Scheduler module.

I have found that it is critical to have proper indexes setup on your source data in order to prevent performance problems at build time. This ofcourse is only true if you are joining a base table with another. Of course, if you actually are able to do real-time updates of the fact table then much of this performance issue is eliminated, however so, it is important to make sure your source data is properly indexed to allow the build process to work faster, whether done in batch mode or real-time.

Ive seen re-builds take 4-5 hours to complete simply because the source data was not indexed properly.

An example, I have a patient table and a visit table. Patient has patientid as its primary key. Visit has visitid as its primary key. Visit also has a patientid property which does point to the patient class, but there is no index of the visit class by patientid. Thus, if my base class is patient and I want to include visittype from my visit class, if I join the tables using Link Property/Link To using patientid, the build process will actually traverse the entire visit table for each patient record in the patient table. However, if I create an index of visit table by patientid, then the build process will only traverse visits for the particular patient record it is working on.






Kenneth Poindexter
Sales Engineer - Federal





p:. (301) 693-3391

e:. kenneth.p...@intersystems.com
| InterSystems | Caché -&- Ensemble |

Jamie Titak

unread,
Jul 2, 2014, 2:41:57 PM7/2/14
to intersystems-de...@googlegroups.com
Hi Lexi,

I can't get this or any of the other links you posted to work.  Are these materials in a different location now?  Thanks for your help!

Jamie Titak
Senior Consultant-J2 Interactive

Lexi

unread,
Jul 2, 2014, 3:37:43 PM7/2/14
to intersystems-de...@googlegroups.com
Hi Jamie,
Apparently, yes, things have moved! I'll see if I can find out about that, probably next week.

If you need to create a level that uses data in some table other than the source table, first note that you can use Cache arrow syntax to access related tables, if there is a connection between the tables. But in other cases, the general technique is to create and use a method (or if you like) that retrieves the needed data.

The method would take, as input, the ID (or other needed properties) from the source table. Internally the method could use either embedded or dynamic SQL to query the other table or tables, with a suitable WHERE clause. The method would then return the value that you want to use for the level. It is good if there are appropriate indices in place on the target table.

Then in the level definition, use a source expression of the form ##class(MyPkg.MyClass).MyMethod(%source.ID). (Or pass other properties as needed.) This expression gets evaluated once per record in the source table, at build time.

There is an example in SAMPLES. The Patients cube is based on the table DeepSee.Study.Patients. Several of the levels use data in the table DeepSee.Study.Patients.

I hope this helps.
Cheers,
Lexi

Lexi

unread,
Jul 7, 2014, 3:46:19 PM7/7/14
to intersystems-de...@googlegroups.com
Hi again Jamie,

We did move the uploaded files. To find them, go to https://sites.google.com/site/intersystemsdeepsee/. The list on the left the has sections "Code Collection", "Useful Resources" and so on.

I hope this helps.
Cheers,
Lexi

Reply all
Reply to author
Forward
0 new messages