For example...
Table1 has two fields - "fld1" and "fld2"
View1 includes "fld1", "fld2", "calc1" (which is simply the sum of
fld1+fld2) and "sumcalc1" which would be the sum of "calc1" + the value of
"calc1" in the previous row...
here's an example...
fld1, fld2, calc1, sumcalc1
1,1,2,2
2,2,4,6
3,3,6,12
4,4,8,20
etc...
You can see from the above example that the last field there is a running
total...
How can this be accomplished in an SQL View?
Thanks, Brad
"Brad Pears" <br...@truenorthloghomes.com> wrote in message
news:OIgtOQBn...@TK2MSFTNGP03.phx.gbl...
Back to basics: a table is a set of UNORDERED ROWS (which are nothing
like the records in a sequential file system). Therefore, there is no
such thing as a "previous row" unless you have a value which defines
an ordering (Codd's Information Principle; Google it). You didn't give
anything for the ordering, though. In short, you missed the most basic
concept of RDBMS and still have your head in 1950's sequential file
systems. We usually do this in the front end with a report writer,
since the total is subject to change when rows are inserted and
deleted.
CREATE TABLE Foobar
(foo_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL PRIMARY KEY,
foo_value INTEGER NOT NULL);
Now, you can talk about a VIEW with a running total:
CREATE VIEW RunningFoobar (foo_date, foo_value, foo_value_runtot)
AS
SELECT F1.foo_date, F1.foo_value,
(SELECT SUM(F2.foo_value)
FROM Foobar AS F2
WHERE F2.foo_date <= F1.foo_date)
FROM Foobar AS F1;
This will run like cold glue as the table gets bigger. That is why we
do it in the front end with a report writer. If you can get to
another SQL product (DB2, Oracle, etc.), you can also use the OLAP
functions from SQL:2003 to write
CREATE VIEW RunningFoobar (foo_date, foo_value, foo_value_runtot)
AS
SELECT F1.foo_date, F1.foo_value,
SUM(F1.foo_value)
OVER (ORDER BY F1.foo_date ROWS BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW)
FROM Foobar AS F1;
This is quite fast.
Don't you need to insert some sample data first?
Come on, show us some date literals that will always work in SQL Server
regardless of regional / language settings. :-)
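For what it's worth, here is a minimal sketch against Celko's Foobar table
(the dates and values are made up): the unseparated yyyymmdd form is parsed
the same way by SQL Server whatever the language / DATEFORMAT setting, so it
is safe for datetime columns.
-- sample data; yyyymmdd literals work under any regional setting
INSERT INTO Foobar (foo_date, foo_value) VALUES ('20080401', 10);
INSERT INTO Foobar (foo_date, foo_value) VALUES ('20080402', 20);
INSERT INTO Foobar (foo_date, foo_value) VALUES ('20080403', 30);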
You can use the syntax outlined in Celko's post. Just a word of warning,
the performance will be awful for large datasets.
--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Taken from part of a seminar I did at SQLBits in March in Birmingham, UK.
/***
A look at RUNNING totals
Run these with the Actual Query Plan
SET STATISTICS IO ON
SET STATISTICS TIME ON
****/
create table #amts (
id int not null primary key clustered,
tran_date smalldatetime not null,
amount decimal(28,2) not null
)
create index ncidx_tran_date on #amts( tran_date )
truncate table #amts
insert #amts ( id, tran_date, amount ) values( 1, '20080208', 10 )
insert #amts ( id, tran_date, amount ) values( 2, '20080212', 10 )
insert #amts ( id, tran_date, amount ) values( 3, '20080208', 10 )
insert #amts ( id, tran_date, amount ) values( 4, '20080211', 10 )
insert #amts ( id, tran_date, amount ) values( 5, '20080212', 10 )
insert #amts ( id, tran_date, amount ) values( 6, '20080212', 10 )
-- Wrong: excluding only the current id still pulls in other rows that share
-- the same tran_date, so tied dates are over-counted
select *,
rolling = coalesce(
( select sum( amount )
from #amts a2
where a2.tran_date <= a1.tran_date
and a2.id <> a1.id ), 0 )
+ amount
from #amts a1
order by tran_date, id
-- Wrong: also requiring a2.id < a1.id drops earlier-dated rows whose id
-- happens to be higher (id is not in tran_date order in this data)
select *,
rolling = coalesce(
( select sum( amount )
from #amts a2
where a2.tran_date <= a1.tran_date
and a2.id < a1.id ), 0 )
+ amount
from #amts a1
order by tran_date, id
-- Wrong: the <= lets the sub-query include the current row and its
-- same-date peers, and the row's amount is then added again, double-counting
select *,
rolling = coalesce(
( select sum( amount )
from #amts a2
where a2.tran_date <= a1.tran_date
), 0 )
+ amount
from #amts a1
order by tran_date, id
-- Correct: strictly earlier tran_date, or the same tran_date with a lower
-- id as the tie-breaker
select *,
rolling = coalesce(
( select sum( amount )
from #amts a2
where a2.tran_date < a1.tran_date
or ( a2.tran_date = a1.tran_date
and a2.id < a1.id )
), 0 )
+ amount
from #amts a1
order by tran_date, id
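-- For reference, the result this should produce with the six sample rows
-- above (running total in tran_date, id order):
-- id  tran_date   amount  rolling
-- 1   2008-02-08  10.00   10.00
-- 3   2008-02-08  10.00   20.00
-- 4   2008-02-11  10.00   30.00
-- 2   2008-02-12  10.00   40.00
-- 5   2008-02-12  10.00   50.00
-- 6   2008-02-12  10.00   60.00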
go
-- Without the sub-query...
select a.*, dt.running_total + a.amount
from (
select a1.id, running_total = coalesce( sum( a2.amount ), 0 )
from #amts a1
left outer join #amts a2 on a2.tran_date < a1.tran_date
or ( a2.tran_date = a1.tran_date
and a2.id < a1.id )
group by a1.id
) as dt
inner join #amts a on a.id = dt.id
order by tran_date, id;  -- terminator required before the CTE that follows
/***
A look at using a CTE to help
***/
-- First we need a sequence
-- Notice the Sequence Generator Step
with Trans ( row_no, id, tran_date, amount )
as (
select row_no = row_number() over( order by tran_date, id ), id,
tran_date, amount
from #amts
)
select *
from Trans
go
-- Now the real query
-- Notice 3!!! Sequence Generator steps - the CTE is not materialised, so
-- ROW_NUMBER() is re-evaluated for each reference to Trans - and notice
-- where the Filter step is
with Trans ( row_no, id, tran_date, amount )
as (
select row_no = row_number() over( order by tran_date, id ), id,
tran_date, amount
from #amts
)
select t1.*,
running_amount = coalesce(
( select sum( amount )
from Trans t2
where t2.row_no < t1.row_no ), 0 )
+ amount
from Trans t1
order by row_no
go
/************************
On Real Data Volumes
*********************/
-- Rows in the limited data set... - check IO
select count(*) from transactions where account_id between 1000 and 1020
select * from transactions where account_id between 1000 and 1020;
-- 588
-- CTE way - row_number() approach
-- Problematic - row_number() is calculated on the entire table rather than
-- just the rows we want! Bug / optimiser shortfall?
-- Sub-query with CTE - ROW_NUMBER() just corrects the out-of-order id
-- (surrogate key)
with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount,
transaction_types_id, account_id )
as (
select row_no = row_number() over( order by account_id, tran_date, id ),
id, salesperson_id, tran_date, clear_date, amount, transaction_types_id,
account_id
from Transactions
where account_id between 1000 and 1020
)
select t1.*,
running_amount = coalesce(
( select sum( amount )
from Trans t2
where t2.row_no < t1.row_no
and t2.account_id = t1.account_id ), 0 )
+ amount
from Trans t1
order by account_id, row_no
go
-- Using a temporary table instead...
select row_no = row_number() over( order by account_id, tran_date, id ), id,
salesperson_id, tran_date, clear_date, amount, transaction_types_id,
account_id
into #t
from Transactions
where account_id between 1000 and 1020;
create unique clustered index clidx on #t ( account_id, row_no );
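-- (The clustered index on ( account_id, row_no ) is what this variant is
-- about: the correlated SUM below should now be satisfied by a bounded index
-- seek per row, and ROW_NUMBER() is computed once, when the rows are copied
-- into #t, rather than on every reference to the CTE.)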
with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount,
transaction_types_id, account_id )
as (
select *
from #t
)
select t1.*,
running_amount = coalesce(
( select sum( amount )
from Trans t2
where t2.row_no < t1.row_no
and t2.account_id = t1.account_id ), 0 )
+ amount
from Trans t1
order by account_id, row_no
go
/***
The other methods on real data...
****/
-- Just using a sub-query and using a surrogate key (id column)
select *,
rolling = coalesce(
( select sum( amount )
from Transactions a2
where a2.account_id = a1.account_id
and ( a2.tran_date < a1.tran_date
or ( a2.tran_date = a1.tran_date
and a2.id < a1.id ) )
), 0 )
+ amount
from Transactions a1
where account_id between 1000 and 5000
order by account_id, tran_date, id
go
-- Self join approach
select a.*, dt.running_total + a.amount
from (
select a1.id, running_total = coalesce( sum( a2.amount ), 0 )
from Transactions a1
left outer join Transactions a2 on a2.account_id = a1.account_id
and ( a2.tran_date < a1.tran_date
or ( a2.tran_date = a1.tran_date
and a2.id < a1.id ) )
where a1.account_id between 1000 and 5000
group by a1.id
) as dt
inner join Transactions a on a.id = dt.id
order by account_id, tran_date, id
go
-- Cursor - using UPDATE method
set nocount on
create table #Trans(
row_no int not null,
account_id int not null,
id int not null,
amount decimal(28,2) not null,
rolling_total decimal(28,2) not null,
salesperson_id int not null,
tran_date smalldatetime not null,
clear_date smalldatetime not null,
transaction_types_id int not null,
primary key clustered( account_id, row_no )
)
insert #Trans (
row_no,
account_id,
id,
amount,
rolling_total,
salesperson_id,
tran_date,
clear_date,
transaction_types_id )
select row_no = row_number() over( partition by account_id order by
tran_date, id ),
account_id,
id,
amount,
0,
salesperson_id,
tran_date,
clear_date,
transaction_types_id
from Transactions
where account_id between 1000 and 5000
declare trans_cur cursor forward_only for
select account_id, row_no, amount
from #Trans
order by account_id, row_no
for update of rolling_total
declare @account_id int,
@row_no int,
@amount decimal(28,2),
@rolling_total decimal(28,2)
open trans_cur
fetch next from trans_cur into @account_id, @row_no, @amount
begin tran
while @@fetch_status = 0
begin
if @row_no = 1 -- initialise on each partition
set @rolling_total = @amount
else
set @rolling_total = @rolling_total + @amount
update #Trans
set rolling_total = @rolling_total
where current of trans_cur
fetch next from trans_cur into @account_id, @row_no, @amount
end
close trans_cur
deallocate trans_cur
commit tran
select id, salesperson_id, tran_date, clear_date, amount,
transaction_types_id, rolling_total
from #Trans
order by account_id, tran_date, id
drop table #Trans
go
-- Cursor - INSERT method
set nocount on
create table #Trans(
account_id int not null,
id int not null,
amount decimal(28,2) not null,
rolling_total decimal(28,2) not null,
salesperson_id int not null,
tran_date smalldatetime not null,
clear_date smalldatetime not null,
transaction_types_id int not null
)
declare trans_cur cursor fast_forward for
select account_id,
row_no = row_number() over( partition by account_id order by
tran_date, id ),
id,
amount,
salesperson_id,
tran_date,
clear_date,
transaction_types_id
from Transactions
where account_id between 1000 and 5000
order by account_id, row_no
declare @account_id int,
@row_no int,
@id int,
@amount decimal(28,2),
@salesperson_id int,
@tran_date smalldatetime,
@clear_date smalldatetime,
@transaction_types_id int,
@rolling_total decimal(28,2)
open trans_cur
fetch next from trans_cur
into @account_id, @row_no, @id, @amount,
@salesperson_id, @tran_date, @clear_date, @transaction_types_id
begin tran
while @@fetch_status = 0
begin
if @row_no = 1 -- initialise on each partition
set @rolling_total = @amount
else
set @rolling_total = @rolling_total + @amount
insert #Trans ( account_id, id, amount, rolling_total, salesperson_id,
tran_date, clear_date, transaction_types_id )
values( @account_id, @id, @amount, @rolling_total, @salesperson_id,
@tran_date, @clear_date, @transaction_types_id )
fetch next from trans_cur
into @account_id, @row_no, @id, @amount,
@salesperson_id, @tran_date, @clear_date, @transaction_types_id
end
close trans_cur
deallocate trans_cur
commit tran
select id, salesperson_id, tran_date, clear_date, amount,
transaction_types_id, rolling_total
from #Trans
order by account_id, tran_date, id
drop table #Trans
go
with Trans ( row_no, id, salesperson_id, tran_date, clear_date, amount,
transaction_types_id, account_id )
as (
select row_no = row_number() over( order by account_id, tran_date, id ),
id, salesperson_id, tran_date, clear_date, amount, transaction_types_id,
account_id
from Transactions
where account_id between 1000 and 5000
)
select t1.*,
running_amount = coalesce(
( select sum( amount )
from Trans t2
where t2.row_no < t1.row_no
and t2.account_id = t1.account_id ), 0 )
+ amount
from Trans t1
order by account_id, row_no
go
--
Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]
"Brad Pears" <br...@truenorthloghomes.com> wrote in message
news:OIgtOQBn...@TK2MSFTNGP03.phx.gbl...
You need to crawl back under that rock you just emerged from.
What you suggest here is quite erroneous and misleading. Please retract
or qualify your statement.
Every retrieved recordset has (internally) a row number and a specific
order, whether or not an ORDER BY existed in the select. As long as the
cursor is not forward only, you can indeed go to the previous row and
this is an important and necessary feature for cursors being placed in
grids or dynamic reports. Next row and last row are in fact quite
standard operations and always yield the same row.
What is not guaranteed is that the retrieved rows will be in exactly the
same order if there is no order by clause or the ordering columns are
not sufficiently unique.
What you were trying to say in your typical blundering way is that there
is no implicit row number in a SQL table natively unless you provide a
column for this purpose and cluster an index on it. This means that the
result set will have its own row numbering which probably will not match
what you expect as a row number in the underlying table.
That said, previous row makes perfect sense when used appropriately.
Geoff
"--CELKO--" <jcel...@earthlink.net> wrote in message
news:ff4a8867-dd58-4ae8...@m44g2000hsc.googlegroups.com:
In many such cases a cursor runs much faster than set-based solutions.
Of course, some people who haven't written more than five cursors in
their lives might object...
This is difficult for the simple reason that any statement based
approach has a cost that rises steeply (roughly quadratically) with the
number of rows. It all depends on how you are going to use the result. If
the output is to a static destination (e.g. a printed report) you are
better off building a table from a cursor, accumulating and populating the
total in a loop of that cursor.
If your need is for values to be displayed in a grid then you have
little choice but to calculate the value, store it in a permanent
column, and accept the re-totalling needed for every row after any value
that changes.
We have multi-million row tables where subsets of the rows carry a
running balance and we found the most efficient way was to compute and
store the balances.
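As a rough illustration only (the table and column names are invented for
the example, not an actual schema), the store-the-balance approach boils
down to keeping a running_balance column and rebuilding it for the affected
client whenever that client's rows change:
-- hypothetical ClientTrans( id, client_id, tran_date, amount, running_balance )
declare @client_id int
set @client_id = 42
update t1
set running_balance = ( select sum( t2.amount )
                        from ClientTrans t2
                        where t2.client_id = t1.client_id
                          and ( t2.tran_date < t1.tran_date
                                or ( t2.tran_date = t1.tran_date
                                     and t2.id <= t1.id ) ) )
from ClientTrans t1
where t1.client_id = @client_id
Reads then pick up running_balance directly; only the client touched by an
insert or update needs re-totalling.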
Tough choices here.
Geoff
"Brad Pears" <br...@truenorthloghomes.com> wrote in message
news:OIgtOQBn...@TK2MSFTNGP03.phx.gbl:
The cursor solution was also slow in comparison to the nested join approach.
I'll try and write it up in the next couple of weeks - I'm presenting on it
again on 10th May at Scottish Developers Day.
Tony
--
Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]
"Alex Kuznetsov" <alk...@gmail.com> wrote in message
news:f0b1e974-843a-4246...@m3g2000hsc.googlegroups.com...
Very interesting. In my benchmarks usually if there is just one value
per group, then a set based solution is much faster. If there are,
let's say, 100K values per group, then a cursor is usually much faster,
and there is a threshold where both approaches perform more or less
the same. Let's discuss it next week.
> Very interesting. In my benchmarks usually if there is just one value
> per group, then a set based solution is much faster. If there are,
> let's say, 100K values per group, then a cursor is usually much faster,
> and there is a threshold where both approaches perform more or less
> the same. Let's discuss it next week.
This is indeed our experience, but with lower value sets. We have tables
that hold on the order of 2 million rows, where the subsets number only
1,000-5,000 or so. They are client transactions that are date based
and there is an overall balance required (regardless of client) and a
client-only balance. They are running balances because they represent
daily net asset levels.
On one level it seems that you wouldn't need to store these balances -
just create them when you need them. The problems start, though, when
you need those values in further calculations (such as average daily
balances for all clients between two dates). Nested joins in this
scenario just clag the machine up whereas cursor loops run fine. The
cursor loop also has the advantage of providing a simple mechanism to
show a progress bar with info. This kind of feedback prevents the user
from thinking something has gone wrong.
It is an interesting problem.
Geoff
In this case I am only talking about a maximum of likely 10 rows being
selected using an "order by" clause to ensure I am subtotalling properly...
So very few rows - meaning next to nothing as far as a performance hit goes. Also
this application is used infrequently by very few people so I really do not
have any of the problems that other folks have posted here.
Because I need to always carry forward the total from the previously read
row, I am assuming I should use a cursor?
Thanks, Brad
"Geoff Schaller" <geo...@softwareobjectivesx.com.au> wrote in message
news:4801b883$1...@dnews.tpgi.com.au...
The selected rows (a maximum of 10 - so hardly any really) are selected
using an order by clause on one of the fields in the table, so the previous
row I am talking about would be the row that was read before the current
one...
If you were doing this in code (VB.net for example), you would read through
the records sequentially (using the required "order by" clause) storing the
value of "calc2" from each row into a temporary variable, then adding that
variable to the next row's "calc1" field to come up with the new value for
"calc2". Then you would store "calc2" to the temporary variable before
reading the next record and do it all over again.
example... (pseudo code only)
dim iCalc1 as integer
dim iCalc2 as integer
dim oRows as collection
' Code to fill the collection goes here
iCalc2 = 0
for each oRow in oRows
    iCalc1 = oRow.fld1 + oRow.fld2
    iCalc2 = iCalc2 + iCalc1
    debug.print oRow.fld1 & oRow.fld2 & iCalc1 & iCalc2
next oRow
Now, how would you accomplish this same thing in an SQL query?
Thanks, Brad
"Aaron Bertrand [SQL Server]" <ten...@dnartreb.noraa> wrote in message
news:eerjHTBn...@TK2MSFTNGP03.phx.gbl...
LOL! Or who have to port code to other products. I think I have now
written around 10-20 solutions with the new SUM() OVER() features in
DB2 (we don't use Oracle here). It is really powerful and is supposed
to be faster than an explicit cursor -- I don't know the internal
mechanism used.
Create Table FooBar (fld1 int Primary Key, fld2 int);
Insert FooBar (fld1, fld2)
Select 1,1
Union All Select 2,2
Union All Select 3,3
Union All Select 4,4;
go
Create View FooBarView As
Select f.fld1,
f.fld2,
f.fld1+f.fld2 As calc1,
(Select Sum(fld1+fld2)
From FooBar f2 Where f2.fld1 <= f.fld1) As sumcalc1
From FooBar f
go
Select fld1, fld2, calc1, sumcalc1
From FooBarView
Order By fld1
go
Drop View FooBarView
Drop Table FooBar
As others have noted, this technique will not be efficient if you have a
large number of rows, but for a small number of rows, it will work just
fine.
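For reference, with that sample data the view returns exactly the rows in
Brad's original example:
fld1, fld2, calc1, sumcalc1
1, 1, 2, 2
2, 2, 4, 6
3, 3, 6, 12
4, 4, 8, 20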
Tom
"Brad Pears" <br...@truenorthloghomes.com> wrote in message
news:%23E7nW$jnIHA...@TK2MSFTNGP02.phx.gbl...
"Brad Pears" <br...@truenorthloghomes.com> wrote in message
news:#E7nW$jnIHA...@TK2MSFTNGP02.phx.gbl: