
SSAS2005 - When do partitions need to be processed?


Jesse O.

May 12, 2006, 6:59:04 PM
I've become increasingly frustrated trying to figure out why a ProcessUpdate
on some dimensions triggers processing of all the partitions in the measure
group within the same processing job.

We've got a cube with a single measure group containing anywhere from 90 to
113 partitions (90 daily partitions, plus a partition added every hour
through the day). Every hour we update this cube, adding a new partition to
the measure group and running ProcessUpdate on all the dimensions. Three of
our eight dimensions trigger partition processing during the hourly job,
which adds 30 minutes of processing time every hour. That's unacceptable,
and I've spent days trying to figure out why it happens.

At first I thought it was aggregations, but I removed all aggregations and
it still happened. Second, I thought it might affect only dimensions that
have hierarchies, but processing the Date dimension does not process the
partitions. Third, I thought it might be because data changed; however, I
can ProcessUpdate the dimension five times in a row without adding any data
or partitions to the cube and still get the partition processing. Fourth, I
wondered whether it only happens with reference dimensions. Not the case.

My question is this: when and why does this happen?


See examples below:


Example one: Does NOT process the partitions within the measure group:


<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Object>
        <DatabaseID>Sales</DatabaseID>
        <DimensionID>v Dim Calendar</DimensionID>
      </Object>
      <Type>ProcessUpdate</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>
Processing Dimension 'Date' completed successfully.
Start time: 5/12/2006 3:46:22 PM; End time: 5/12/2006 3:46:25 PM;
Duration: 0:00:03
Processing Dimension Attribute '(All)' completed successfully.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:23 PM;
Duration: 0:00:00
Processing Dimension Attribute 'Year' completed successfully. 3 rows have
been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[CalendarYEAR] AS [dbo_vDim_CalendarCalendarYEAR0_0]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Dimension Attribute 'Day Of The Week' completed successfully. 8
rows have been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[DayNameOrderID] AS
[dbo_vDim_CalendarDayNameOrderID0_0],[dbo_vDim_Calendar].[DayName] AS
[dbo_vDim_CalendarDayName0_1]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Dimension Attribute 'Day Of Month' completed successfully. 32
rows have been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[DayNumInMonth] AS [dbo_vDim_CalendarDayNumInMonth0_0]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Dimension Attribute 'Month' completed successfully. 18 rows
have been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[YEARMo] AS
[dbo_vDim_CalendarYEARMo0_0],[dbo_vDim_Calendar].[MonthName] AS
[dbo_vDim_CalendarMonthName0_1],[dbo_vDim_Calendar].[CalendarYEAR] AS
[dbo_vDim_CalendarCalendarYEAR0_2]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Dimension Attribute 'Day' completed successfully. 498 rows have
been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[FullDate] AS
[dbo_vDim_CalendarFullDate0_0],[dbo_vDim_Calendar].[DayLongName] AS
[dbo_vDim_CalendarDayLongName0_1],[dbo_vDim_Calendar].[YEARMo] AS
[dbo_vDim_CalendarYEARMo0_2]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Dimension Attribute 'CalendarDWID' completed successfully. 498
rows have been read.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_Calendar].[CalendarDWID] AS
[dbo_vDim_CalendarCalendarDWID0_0],[dbo_vDim_Calendar].[FullDate] AS
[dbo_vDim_CalendarFullDate0_1],[dbo_vDim_Calendar].[DayNameOrderID] AS
[dbo_vDim_CalendarDayNameOrderID0_2],[dbo_vDim_Calendar].[DayNumInMonth] AS
[dbo_vDim_CalendarDayNumInMonth0_3]
FROM [dbo].[vDim_Calendar] AS [dbo_vDim_Calendar]
Processing Hierarchy 'Date' completed successfully.
Start time: 5/12/2006 3:46:23 PM; End time: 5/12/2006 3:46:24 PM;
Duration: 0:00:01
Processing Cube 'Sales Current' completed successfully.
Start time: 5/12/2006 3:46:25 PM; End time: 5/12/2006 3:46:31 PM;
Duration: 0:00:06
Processing Measure Group 'Sales measures' completed successfully.
Start time: 5/12/2006 3:46:29 PM; End time: 5/12/2006 3:46:29 PM;
Duration: 0:00:00

Example two: Processes all the partitions within the measure group.


<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Object>
        <DatabaseID>Sales</DatabaseID>
        <DimensionID>v Dim Business Unit Current</DimensionID>
      </Object>
      <Type>ProcessUpdate</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>
Processing Dimension 'BU2' completed successfully.
Start time: 5/12/2006 3:41:54 PM; End time: 5/12/2006 3:41:58 PM;
Duration: 0:00:04
Processing Dimension Attribute 'Business Unit Flat' completed
successfully. 27 rows have been read.
Start time: 5/12/2006 3:41:56 PM; End time: 5/12/2006 3:41:57 PM;
Duration: 0:00:01
SQL queries 1
SELECT
DISTINCT
[dbo_vDim_BusinessUnitCurrent].[BusinessUnitID] AS
[dbo_vDim_BusinessUnitCurrentBusinessUnitID0_0],[dbo_vDim_BusinessUnitCurrent].[BusinessUnitName]
AS
[dbo_vDim_BusinessUnitCurrentBusinessUnitName0_1],[dbo_vDim_BusinessUnitCurrent].[ParentBusinessUnitID]
AS [dbo_vDim_BusinessUnitCurrentParentBusinessUnitID0_2]
FROM [dbo].[vDim_BusinessUnitCurrent] AS [dbo_vDim_BusinessUnitCurrent]
Processing Dimension Attribute '(All)' completed successfully.
Start time: 5/12/2006 3:41:55 PM; End time: 5/12/2006 3:41:55 PM;
Duration: 0:00:00
Processing Hierarchy 'Business Unit' completed successfully.
Start time: 5/12/2006 3:41:56 PM; End time: 5/12/2006 3:41:57 PM;
Duration: 0:00:01
Processing Cube 'Sales Current' completed successfully.
Start time: 5/12/2006 3:41:58 PM; End time: 5/12/2006 3:42:57 PM;
Duration: 0:00:59
Processing Measure Group 'Sales measures' completed successfully.
Start time: 5/12/2006 3:42:02 PM; End time: 5/12/2006 3:42:55 PM;
Duration: 0:00:53
Processing Partition 'H_2006051214' completed successfully.
Start time: 5/12/2006 3:42:03 PM; End time: 5/12/2006 3:42:05 PM;
Duration: 0:00:02
Processing Partition 'H_2006051212' completed successfully.
Start time: 5/12/2006 3:42:07 PM; End time: 5/12/2006 3:42:08 PM;
Duration: 0:00:01
Processing Partition 'H_2006051200' completed successfully.
Start time: 5/12/2006 3:42:10 PM; End time: 5/12/2006 3:42:12 PM;
Duration: 0:00:02
Processing Partition 'H_2006051205' completed successfully.
Start time: 5/12/2006 3:42:15 PM; End time: 5/12/2006 3:42:18 PM;
Duration: 0:00:03
Processing Partition 'H_2006051202' completed successfully.
Start time: 5/12/2006 3:42:20 PM; End time: 5/12/2006 3:42:21 PM;
Duration: 0:00:01
Processing Partition 'H_2006051204' completed successfully.
Start time: 5/12/2006 3:42:23 PM; End time: 5/12/2006 3:42:24 PM;
Duration: 0:00:01
Processing Partition 'H_2006051201' completed successfully.
Start time: 5/12/2006 3:42:26 PM; End time: 5/12/2006 3:42:27 PM;
Duration: 0:00:01
Processing Partition 'H_2006051203' completed successfully.
Start time: 5/12/2006 3:42:29 PM; End time: 5/12/2006 3:42:31 PM;
Duration: 0:00:02
Processing Partition 'H_2006051213' completed successfully.
Start time: 5/12/2006 3:42:32 PM; End time: 5/12/2006 3:42:34 PM;
Duration: 0:00:02
Processing Partition 'H_2006051208' completed successfully.
Start time: 5/12/2006 3:42:35 PM; End time: 5/12/2006 3:42:37 PM;
Duration: 0:00:02
Processing Partition 'template' completed successfully.
Start time: 5/12/2006 3:42:38 PM; End time: 5/12/2006 3:42:39 PM;
Duration: 0:00:01
Processing Partition 'H_2006051206' completed successfully.
Start time: 5/12/2006 3:42:41 PM; End time: 5/12/2006 3:42:42 PM;
Duration: 0:00:01
Processing Partition 'H_2006051209' completed successfully.
Start time: 5/12/2006 3:42:44 PM; End time: 5/12/2006 3:42:45 PM;
Duration: 0:00:01
Processing Partition 'H_2006051210' completed successfully.
Start time: 5/12/2006 3:42:47 PM; End time: 5/12/2006 3:42:48 PM;
Duration: 0:00:01
Processing Partition 'H_2006051207' completed successfully.
Start time: 5/12/2006 3:42:50 PM; End time: 5/12/2006 3:42:52 PM;
Duration: 0:00:02
Processing Partition 'H_2006051211' completed successfully.
Start time: 5/12/2006 3:42:53 PM; End time: 5/12/2006 3:42:55 PM;
Duration: 0:00:02


Jeje

May 13, 2006, 1:15:16 PM
well...

When you create partitions, do you associate each partition with a specific
slice of the cube?
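
For illustration, a minimal AMO sketch of attaching a slice to a partition
(the object IDs are borrowed from the logs above, and the MDX member key
format is an assumption):

using Microsoft.AnalysisServices;

class SetPartitionSlice
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("localhost");
        try
        {
            // Object IDs here follow the logs above; adjust as needed.
            Partition partition = server.Databases["Sales"]
                .Cubes["Sales Current"]
                .MeasureGroups["Sales measures"]
                .Partitions["H_2006051214"];

            // The slice is an MDX expression naming the member(s) this
            // partition covers; this key format is only an assumption.
            partition.Slice = "[Date].[Day].&[2006-05-12]";
            partition.Update();
        }
        finally
        {
            server.Disconnect();
        }
    }
}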

Consider using real-time partitions and the proactive caching feature,
and/or incremental processing.

I have seen a webcast where a billion-row database and the associated cubes
were updated in real time. The cube's own cache is updated, meaning that
adding new content to the cube fills the cache at the same time, so the cube
stays warm. A normal process clears the cache, and the cube goes cold and
responds more slowly during the first accesses.

some articles:
http://www.microsoft.com/technet/prodtechnol/sql/2005/rtbissas.mspx
http://sqljunkies.com/WebLog/sqlbi/archive/2004/10/09/4542.aspx


"Jesse O." <jesp...@hotmail.com> wrote in message
news:%23vQmreh...@TK2MSFTNGP05.phx.gbl...

Jesse O.

May 14, 2006, 10:49:23 PM
No, no slice set. We're using MOLAP. Thanks for your suggestion of proactive
caching.

I'm still confused as to why some dimensions process partitions while others
don't.

Perhaps someone from MS could chime in?

TIA.

Jesse.


"Jeje" <will...@hotmail.com> wrote in message
news:Ocp0ODrd...@TK2MSFTNGP02.phx.gbl...

Jeje

May 14, 2006, 11:44:39 PM
Generally, a reprocess is required when there is a change in a hierarchy.
For example, if an employee is in New York and moves to Boston, the
hierarchy changes, which requires processing the cubes to recalculate the
aggregations correctly. Changing a label (like the name of a customer)
requires only a simple update (if the key column is not the same as the
label column).

If you anticipate this type of change, you can use the slowly changing
dimension feature. I have not used it in AS2005, but I have in AS2000. This
option will slow down access to the cubes, because the system keeps
aggregates only at the top (All member) and bottom (the employee) levels;
intermediate levels are recalculated when the user asks for them. So when an
employee changes, there is no need to reprocess the cubes, because the
aggregations for Boston and New York are not stored on disk.

With AS2005 there is a proactive caching option at the dimension level, but
I have not used it. If this caching works like it does on a cube, it could
help you by re-aggregating data when a table in the database changes.

Take a close look at the proactive caching features.

"Jesse O." <jesp...@hotmail.com> wrote in message

news:uN%233so8d...@TK2MSFTNGP03.phx.gbl...

Darren Gosbell

May 15, 2006, 9:40:55 AM
This behaviour could depend on whether your dimension has flexible or rigid
attribute relationships. BOL does not have a very succinct definition, but
the following is from one of the tutorials:

>>>
...you can specify that the relationship is either flexible or rigid. If
you define a relationship as rigid, Analysis Services retains
aggregations when the dimension is updated. If a relationship that is
defined as rigid actually changes, Analysis Services generates an error
during processing unless the dimension is fully processed
>>>

Time inherently has rigid relationships (you can't change which month a
particular date belongs to), but I believe all other dimensions default to
flexible relationships, which means that processing them results in all the
aggregations (though not the leaf-level data) being dropped and
re-processed.
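
A minimal AMO sketch of marking a relationship rigid (the dimension and
attribute IDs are assumptions based on the XMLA and logs above); note that
changing a relationship type means the dimension must then be fully
processed:

using Microsoft.AnalysisServices;

class MakeRelationshipRigid
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("localhost");
        try
        {
            Dimension dim = server.Databases["Sales"].Dimensions["v Dim Calendar"];

            // Walk the related attributes of 'Day' and mark the 'Month'
            // relationship as rigid so its aggregations survive ProcessUpdate.
            foreach (AttributeRelationship rel in dim.Attributes["Day"].AttributeRelationships)
            {
                if (rel.AttributeID == "Month")
                    rel.RelationshipType = RelationshipType.Rigid;
            }

            dim.Update();
            // If a rigid relationship actually changes, ProcessUpdate raises
            // an error, so fully process after altering the design.
            dim.Process(ProcessType.ProcessFull);
        }
        finally
        {
            server.Disconnect();
        }
    }
}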

--
Regards
Darren Gosbell [MCSD]
Blog: http://www.geekswithblogs.net/darrengosbell

In article <#DAflH9d...@TK2MSFTNGP02.phx.gbl>, will...@hotmail.com
says...

Akshai Mirchandani [MS]

May 15, 2006, 2:54:59 PM
My guess here is that it is not aggregations but rather indexes being
processed.

I would suggest running Profiler against the server during processing and
watching the events there -- the events for the partitions should be
ProcessIndex...

The actual processing of partitions doesn't appear to be significant here
though. It shows as:

Processing Cube 'Sales Current' completed successfully.
Start time: 5/12/2006 3:41:58 PM; End time: 5/12/2006 3:42:57 PM;
Duration: 0:00:59

Also, are you doing ProcessUpdate on all the dimensions in one operation
(using Batch/Parallel)? That would unify the cube processing into the same
transaction so that it only needs to happen once instead of being repeated
for each of the dimensions.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.

"Darren Gosbell" <j...@newsgroups.nospam> wrote in message
news:MPG.1ed32ff1...@news.microsoft.com...

Jesse O.

May 16, 2006, 2:01:15 PM
Thanks for all the replies guys, I really appreciate it.

As for the time only taking 59 seconds: that's with a limited dataset.
This is the AMO code we use to process our dimensions. Any suggestions?

Public Sub ProcessAllDimensions()

    Dim oDimension As CubeDimension
    Dim BeginTime As DateTime

    For Each oDimension In objCube.Dimensions
        Try
            BeginTime = Now

            InsertCubeProcessingLog("Process Dimension All Begin", 1, 0, BeginTime, Now, _
                objDatabase.Name, objCube.Name, oDimension.Name, "Dimension", _
                "", "", "", "", "")

            oDimension.Dimension.Process(ProcessType.ProcessUpdate)

            InsertCubeProcessingLog("Process Dimension All End", 1, 1, BeginTime, Now, _
                objDatabase.Name, objCube.Name, oDimension.Name, "Dimension", _
                "", "", "", "", "")

        Catch
            Dim ErrorReplace As String = Err.Description
            ErrorReplace = ErrorReplace.Replace("'", "")

            InsertCubeProcessingLog("Process Dimension All Failed: " & Err.Number & " - " & ErrorReplace, _
                0, 0, BeginTime, Now, objDatabase.Name, objCube.Name, _
                oDimension.Name, "Dimension", "", "", "", "", "")

            ObjectCleanup()
        End Try
    Next oDimension

End Sub

"Akshai Mirchandani [MS]" <aks...@online.microsoft.com> wrote in message
news:eoT5REFe...@TK2MSFTNGP05.phx.gbl...

Jesse O.

May 16, 2006, 4:27:55 PM
I'm not sold on proactive caching yet, at least in our environment.

We have a typical data warehouse that is loaded and updated every hour by a
batch. Our cube processing job runs every five minutes to check whether a
batch has completed. If one has, the associated dimensions are updated and a
new partition is added to the cube. When 24 batches have completed for a
day, we delete that day's hourly partitions and create and process a single
day partition.
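
For concreteness, a rough AMO sketch of that cleanup step (the object IDs
are assumptions following the logs above, and the day partition is assumed
to already exist and be processed):

using Microsoft.AnalysisServices;

class RetireHourlyPartitions
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("localhost");
        try
        {
            MeasureGroup mg = server.Databases["Sales"]
                .Cubes["Sales Current"]
                .MeasureGroups["Sales measures"];

            // Drop the 24 hourly partitions for 2006-05-12; the names follow
            // the H_yyyymmddhh pattern seen in the processing log.
            for (int hour = 0; hour < 24; hour++)
            {
                Partition p = mg.Partitions.FindByName("H_20060512" + hour.ToString("00"));
                if (p != null)
                    p.Drop();
            }
        }
        finally
        {
            server.Disconnect();
        }
    }
}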

The amount of data we process every hour is large. Two of our dimensions
have over two million members, and our largest fact table grows by four
million rows an hour. We're planning to store 90 days of data in this cube,
which is processed hourly, plus another cube that processes daily and holds
all historical data. Our pain point right now is the time it takes to
process the dimensions every hour; it constitutes 95% of our processing
time.

I don't see a real need for proactive caching, since it generally seems to
be aimed at the transaction system rather than the data warehouse. Part of
that is due to my inexperience with SSAS2005 and not really having a good
grasp of proactive caching.

What do you guys think?


"Jeje" <will...@hotmail.com> wrote in message

news:%23DAflH9...@TK2MSFTNGP02.phx.gbl...

Shred

May 17, 2006, 2:42:02 AM
One of the things you could look at to speed up updates is snowflaking your
large dimensions. Here is a good blog post on the issue:
http://sqljunkies.com/WebLog/sqlbi/archive/2005/10/07/17040.aspx
I have not had the opportunity to test this, but my dimensions are not
nearly as large and we only update daily.

I am also going to be setting most of my dimensions to fixed for my 2nd
iteration, to try to stop the index rebuilds that need to happen all the
time.

Jesse O.

May 17, 2006, 3:36:47 PM
We already snowflake; good suggestion, however.

What do you mean by "fixed" dimensions?


"Shred" <Sh...@discussions.microsoft.com> wrote in message
news:E9924DF9-C043-4019...@microsoft.com...

Adrian Dumitrascu [MS]

May 17, 2006, 4:23:14 PM
To process multiple objects at once with AMO (in a Batch, in Parallel), here
is some sample code (sorry, it's C#; I don't have it available in VB.NET):

Server s = new Server();
s.Connect("localhost");

try
{
    // We'll put AMO in capture-mode; this means that all .Process, .Update,
    // etc. commands are saved in a log (Server.CaptureLog) instead of being
    // sent to the server. Then we can run the commands from the log in a
    // single Batch (eventually in Parallel).
    s.CaptureXml = true;

    // Now we'll call the .Process methods as normal.
    s.Databases["my database id"].Dimensions["my dimension id1"].Process();
    s.Databases["my database id"].Dimensions["my dimension id2"].Process();

    // Once we have called Process on all the objects we want, we'll exit
    // capture-mode and run the script.
    s.CaptureXml = false; // we exit the capture mode
    XmlaResultCollection results = s.ExecuteCaptureLog(true, true); // transactional and parallel

    // Now we'll check the results.
    foreach (XmlaResult result in results)
    {
        foreach (XmlaMessage message in result.Messages)
        {
            Console.WriteLine(message.Description);
            if (message is XmlaError)
            {
                // The processing failed; there is at least one error
                // reported here.
            }
        }
    }
}
finally
{
    s.Disconnect();
}


Adrian Dumitrascu

"Jesse O." <jesp...@hotmail.com> wrote in message

news:%23BQl6KR...@TK2MSFTNGP02.phx.gbl...

Jesse O.

May 18, 2006, 12:32:38 PM
Thanks Adrian.

We record a bunch of things for our processing in order to know which
partitions or dimensions need to be processed.

Can this logging be done with parallel processing? Have you ever seen
someone do it?


Table schema:
CREATE TABLE [dbo].[CubeProcessingLogSSAS2005](
    [CubeProcessingLogID] [int] IDENTITY(1,1) NOT NULL,
    [EventDescription] [varchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [IsSuccess] [tinyint] NULL,
    [RecordActive] [tinyint] NULL,
    [EventStartTime] [datetime] NULL,
    [EventEndTime] [datetime] NULL,
    [DatabaseName] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [CubeName] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [ObjectName] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [ObjectType] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [PartitionName] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [SelectStatement] [varchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [ProcessingType] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [CalendarDWID] [int] NULL,
    [TimeOfDayDWID] [int] NULL,
    CONSTRAINT [pk_CubeProcessingLogSSAS2005] PRIMARY KEY CLUSTERED
    (
        [CubeProcessingLogID] ASC
    )
)
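
One possible approach, building on Adrian's capture-log pattern: the batched
Process commands run server-side in a single call, so per-object timings
aren't observable from AMO; you could instead log one row per dimension
around the ExecuteCaptureLog call. A sketch (LogRow is a hypothetical
stand-in for the insert into the table above):

using System;
using Microsoft.AnalysisServices;

class BatchProcessWithLogging
{
    static void Main()
    {
        Server s = new Server();
        s.Connect("localhost");
        try
        {
            s.CaptureXml = true; // capture instead of execute
            foreach (Dimension dim in s.Databases["Sales"].Dimensions)
            {
                dim.Process(ProcessType.ProcessUpdate);
            }
            s.CaptureXml = false;

            DateTime begin = DateTime.Now;
            XmlaResultCollection results = s.ExecuteCaptureLog(true, true); // transactional, parallel
            DateTime end = DateTime.Now;

            // The whole batch shares one begin/end, so each dimension's row
            // gets the same timestamps.
            bool failed = results.ContainsErrors;
            foreach (Dimension dim in s.Databases["Sales"].Dimensions)
            {
                LogRow(dim.Name, begin, end, failed);
            }
        }
        finally
        {
            s.Disconnect();
        }
    }

    // Hypothetical: replace with an INSERT into dbo.CubeProcessingLogSSAS2005.
    static void LogRow(string objectName, DateTime begin, DateTime end, bool failed)
    {
        Console.WriteLine("{0}: {1} -> {2} (failed={3})", objectName, begin, end, failed);
    }
}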

"Adrian Dumitrascu [MS]" <adri...@microsoft.com> wrote in message
news:eGxY6%23eeGH...@TK2MSFTNGP05.phx.gbl...
