I am not contradicting anyone's opininon here but I always here the term
"there is always a set based solution" and one does not need to use cursors.
However seldom we find ourselves not using cursors because of the data we
have or its that just we need to do a row by row operation. Personally I
dont like using cursors because they are a big performance drag. However I
have seen many a times there is no choice. For example in our system we have
the following code. I am trying to find out if this code can be rewritten
without using a cursor and using a SET BASED SOLUTION. I keep hearing that
cursors can be avoided but with a code like this can they be really
avoided?.
If anyone has a better solution to the following code ( a set based solution
I would prefer) and not use a cursor or even a loop (if possible) I would
really really appreciate it.
DECLARE @SalesRepID int
DECLARE @ASSESSMENT_TYPE_CPP int
DECLARE @AssessmentID int
DECLARE @DeviceNames varchar(3000)
AS
SET NOCOUNT ON
SET @ASSESSMENT_TYPE_CPP = 4
DECLARE @CPPReports TABLE
(
AssessmentID int null,
Customer varchar(50) null,
ContactName varchar(50) null,
DateCreated varchar(12) null,
DeviceNames varchar(3000) null
)
INSERT INTO
@CPPReports
(
AssessmentID,
Customer,
ContactName,
DateCreated
)
SELECT
a.AssessmentID AS AssessmentID,
cp.CompanyName AS Customer,
cp.FirstName + ' ' + cp.LastName AS ContactName,
CONVERT(varchar,a.InsertDate,101) as DateCreated
FROM
tblAssessment a
INNER JOIN
tblCustomer c ON a.CustomerID = c.CustomerID
INNER JOIN
tblProfile cp ON c.ProfileID = cp.ProfileID
WHERE
a.salesrepid = @salesrepid AND
a.AssessmentTypeID = @ASSESSMENT_TYPE_CPP AND
a.Deleted = 0
DECLARE ReportCursor CURSOR FOR
SELECT
AssessmentID
FROM
@CPPReports
OPEN ReportCursor
FETCH NEXT FROM ReportCursor
INTO
@AssessmentID
WHILE @@FETCH_Status = 0
BEGIN
SET @DeviceNames = NULL
SELECT
@DeviceNames = COALESCE(@DeviceNames + '<BR>', '') + d.DeviceName
FROM
tblDeviceFact df
INNER JOIN
vwIPADevice d ON df.DeviceID = d.DeviceID
WHERE
AssessmentID = @AssessmentID
UPDATE
@CPPReports
SET
DeviceNames = @DeviceNames
WHERE
AssessmentID = @AssessmentID
FETCH NEXT FROM ReportCursor
INTO
@AssessmentID
END
--
SELECT
*
FROM
@CPPReports
ORDER BY
AssessmentID DESC
CLOSE ReportCursor
DEALLOCATE ReportCursor
Thanks
M
AS
SET NOCOUNT ON
SET @ASSESSMENT_TYPE_CPP = 4
Select Top 1 @AssessmentID = assessmentId
from @CPPReports
order by assessmentId
While 1 = 1 Begin
SET @DeviceNames = NULL
SELECT
@DeviceNames = COALESCE(@DeviceNames + '<BR>', '') + d.DeviceName
FROM
tblDeviceFact df
INNER JOIN
vwIPADevice d ON df.DeviceID = d.DeviceID
WHERE
AssessmentID = @AssessmentID
UPDATE
@CPPReports
SET
DeviceNames = @DeviceNames
WHERE
AssessmentID = @AssessmentID
Select Top 1 @AssessmentID = assessmentId
from @CPPReports
where assessmentId > @AssessmentID
order by assessmentId
If @@rowcount = 0 begin break end
END
--
SELECT
*
FROM
@CPPReports
ORDER BY
AssessmentID DESC
"Meher Malakapalli" <mmalak...@cohesioninc.com> wrote in message
news:OPMqsyLH...@TK2MSFTNGP10.phx.gbl...
--
----------------------------------------------------------------------------
Louis Davidson (dr...@hotmail.com)
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
"JI" <anon...@discussions.microsoft.com> wrote in message
news:eqSjk7L...@tk2msftngp13.phx.gbl...
> SELECT
> @DeviceNames = COALESCE(@DeviceNames + '<BR>', '') + d.DeviceName
> FROM
> tblDeviceFact df
> INNER JOIN
> vwIPADevice d ON df.DeviceID = d.DeviceID
> WHERE
> AssessmentID = @AssessmentID
Is not really a safe way to to this, and to do this in SQL Server, you
really would need a cursor.
All this being said, you could eliminate your outside cursor and the temp
table by using a user defined function to do this:
SELECT
a.AssessmentID AS AssessmentID,
cp.CompanyName AS Customer,
cp.FirstName + ' ' + cp.LastName AS ContactName,
CONVERT(varchar,a.InsertDate,101) as DateCreated,
dbo.rollupDeviceNames(@assessmentId) as deviceNames
FROM
tblAssessment a
INNER JOIN
tblCustomer c ON a.CustomerID = c.CustomerID
INNER JOIN
tblProfile cp ON c.ProfileID = cp.ProfileID
WHERE
a.salesrepid = @salesrepid AND
a.AssessmentTypeID = @ASSESSMENT_TYPE_CPP AND
a.Deleted = 0
So instead of coding a cursor, you let the SQL engine do the cursoring thing
internally, like it always does, and let it call the function for every row.
The function would be something along the lines of (pseudocode)
for each row in deviceFact where assesmentId = @assessmentId
set @deviceNames = COALESCE(@DeviceNames + '<BR>', '') + d.DeviceName
return @deviceNames
Someone will post the problem with the rollup the way you did it like this:
> SELECT
> @DeviceNames = COALESCE(@DeviceNames + '<BR>', '') + d.DeviceName
> FROM
> tblDeviceFact df
> INNER JOIN
> vwIPADevice d ON df.DeviceID = d.DeviceID
> WHERE
> AssessmentID = @AssessmentID
But suffice it to say that under some circumstances, this undocumented
"trick" falls apart. In your case it probably works well enough, and I have
used it in some cases for production needs. However, you will find that a
cursor will work pretty fast for a very small set like this.
ji
"Louis Davidson" <dr_donts...@hotmail.com> wrote in message
news:OT4FiDMH...@TK2MSFTNGP10.phx.gbl...
--
----------------------------------------------------------------------------
Louis Davidson (dr...@hotmail.com)
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
"JI" <anon...@discussions.microsoft.com> wrote in message
news:%23jTJhIM...@TK2MSFTNGP11.phx.gbl...
"Louis Davidson" <dr_donts...@hotmail.com> wrote in message
news:e7$9LWMHE...@TK2MSFTNGP09.phx.gbl...
"Adam Machanic" <amac...@air-worldwide.nospamallowed.com> wrote in message
news:eec57wMH...@TK2MSFTNGP10.phx.gbl...
I seriously doubt that a FAST_FORWARD cursor has more overhead than your
temp table. Perhaps equivalent, though. What do you think it's doing
internally?
Cursors have markers indicating what row they are on. Fast Forward presents
a lot of advantages due to the forward only nature and in that aspect you
are correct. But in a scalability comparison, utilizing the select top 1 and
loop versus the cursor --- the loop is better. There is no locking with the
loop, there is no snapshot view (dirty pages). It reads the row and
processes the other statements. What do you think it is doing?
"Adam Machanic" <amac...@air-worldwide.nospamallowed.com> wrote in message
news:uN$T7ANHE...@TK2MSFTNGP10.phx.gbl...
Why do you feel that a FAST_FORWARD cursor incurs locking?
http://www.sql-server-performance.com/cursors.asp
Note, the link does state that the fast_forward cursor does produce the
LEAST AMOUNT of overhead on SQL Server indicating that there is overhead.
defending cursors...interesting.
"Adam Machanic" <amac...@air-worldwide.nospamallowed.com> wrote in message
news:uit7YQNH...@TK2MSFTNGP11.phx.gbl...
e.g: SELECT CategoryId, ProductName FROM Northwind..Products ;
The goal is to return a SQL resultset with two columns, one with the
CategoryId and the other with a concatenated list of all the ProductNames
separated by commas.
#1 Dynamic SQL.
There is enough literature out there which demonstrates the drawbacks and
implications of using Dynamic SQL. A popular one, at least from database
programmer’s perspective, is www.sommarskog.se/dynamic_sql.html. The Dynamic
SQL approach is based on creating a SQL query string based on the number of
groups and then use a series of CASE expressions to pivot the data for
concatenation.
DECLARE @r VARCHAR( 8000 ), @n INT, @i INT
SELECT @r = 'SELECT CategoryId, ' + CHAR( 13 ),
@n = ( SELECT TOP 1 COUNT( ProductName ) FROM Products
GROUP BY CategoryId ORDER BY COUNT( ProductName ) DESC )
SET @i = 1 ;
WHILE @i <= @n BEGIN
SET @r = @r +
CASE WHEN @i = 1 THEN 'MAX( CASE Seq
WHEN ' + CAST( @i AS VARCHAR ) + '
THEN ProductName ELSE SPACE( 0 ) END ) + ' + CHAR(13)
WHEN @i = @n
THEN 'MAX( CASE Seq
WHEN ' + CAST( @i AS VARCHAR ) + '
THEN '', '' + ProductName
ELSE SPACE( 0 ) END ) ' + CHAR( 13 )
ELSE 'MAX( CASE Seq
WHEN ' + CAST( @i AS VARCHAR ) + '
THEN '', '' + ProductName
ELSE SPACE( 0 ) END ) + ' + CHAR( 13 ) END
SET @i = @i + 1
END
SET @r = @r + '
FROM ( SELECT CategoryId, ProductName,
( SELECT COUNT(*) FROM Products p2
WHERE p2.CategoryId = p1.CategoryId
AND p2.ProductName <= p1.ProductName )
FROM Products p1 ) D ( CategoryId, ProductName, Seq )
GROUP BY CategoryId;'
EXEC( @r )
There is a chance that the SQL string can grow beyond 8000 chars which is
the documented limit for a VARCHAR datatype. In such cases, one may have to
use multiple VARCHAR variables and do: EXEC(@r1 + @r2 + @r3 + ..)
#2 Scalar UDF with t-SQL update extension
The usage of an expression that involves the column that is being used in
the LHS of the SET clause in an UPDATE statement is rarely a logical
approach. However, in general, the optimizer seems to process these values
in the order of materialization, either in the internal work tables or any
other storage structures. With that assumption, this approach though not a
good logical methods, works under general circumstances.
CREATE FUNCTION udf_scalar_concat ( @CategoryId INT )
RETURNS VARCHAR( 8000 ) AS
BEGIN
DECLARE @t TABLE( p VARCHAR(40 ) )
DECLARE @r VARCHAR( 8000 )
SET @r = SPACE(0)
INSERT @t
SELECT ProductName FROM Products
WHERE CategoryId = @CategoryId
IF @@ROWCOUNT > 0
UPDATE @t
SET @r = @r + p + ','
RETURN( @r )
END
Usage:
SELECT CategoryId, dbo.udf_scalar_concat( CategoryId )
FROM Products
GROUP BY CategoryId ;
#3 Scalar UDF with recursion
Recursive functions in t-SQL have a drawback that the maximum nesting level
is 32. So this approach is applicable only for smaller datasets, especially
when the number of items within a group, that needs to be concatenated, is
less than 32.
CREATE FUNCTION udf_recur ( @cid INT, @i INT )
RETURNS VARCHAR(8000) AS BEGIN
DECLARE @r VARCHAR(8000), @l VARCHAR(8000)
SELECT @i = @i - 1, @r = ProductName + ', '
FROM Products p1
WHERE CategoryId = @cid
AND @i = ( SELECT COUNT( * ) FROM Products p2
WHERE p2.CategoryId = p1.CategoryId
AND p2.ProductName <= p1.ProductName ) ;
IF @i > 0 BEGIN
EXEC @l = dbo.recur @cid, @i
SET @r = @l + @r
END
RETURN @r
END
GO
This function can be used as follows:
SELECT CategoryId, dbo.udf_recur( CategoryId, COUNT(ProductName) )
FROM Products p1
GROUP BY CategoryId ;
#4 Table valued UDF with a WHILE loop
This approach is based on the idea by Steve/Lindawie where a table variable
with three columns is used within a table valued UDF. The first column
represents the group, second represents the currently processing value
within a group and the third represents the concatenated list of values.
CREATE FUNCTION udf_tbl_Concat()
RETURNS @t TABLE( CategoryId INT,
Product VARCHAR( 40 ),
list VARCHAR( 7991 ) )
BEGIN
INSERT @t ( CategoryId, Product, list )
SELECT CategoryId, MIN( ProductName ), MIN( ProductName )
FROM Products
GROUP BY CategoryId
WHILE ( SELECT COUNT( Product ) FROM @t ) > 0 BEGIN
UPDATE t
SET list = list + COALESCE(
( SELECT ', ' + MIN( ProductName ) FROM Products
WHERE Products.CategoryId = t.CategoryId
AND Products.ProductName > t.Product ), '' ),
Product = ( SELECT MIN(ProductName) FROM Products
WHERE Products.CategoryId = t.CategoryId
AND Products.ProductName > t.Product )
FROM @t t END
RETURN
END
SELECT CategoryId, list
FROM udf_tbl_Concat() ;
#5 The Cursors approach (least recommended)
The drawbacks of using cursors are well-known among the SQL Server
community. Given the fact that they are generally resource intensive,
procedural and inefficient, one should strive not to use cursor based
approaches for programming requirements unless as a last resort.
DECLARE @tbl TABLE ( id INT PRIMARY KEY, list VARCHAR( 8000 ) )
SET NOCOUNT ON
DECLARE @c INT, @p VARCHAR( 8000 ), @cNext INT, @pNext VARCHAR( 40 )
DECLARE c CURSOR FOR
SELECT CategoryId, ProductName
FROM Products
ORDER BY CategoryId, ProductName
OPEN c
FETCH NEXT FROM c INTO @cNext, @pNext
SET @c = @cNext
WHILE @@FETCH_STATUS = 0 BEGIN
IF @cNext > @c BEGIN
INSERT @tbl SELECT @c, @p
SELECT @p = @PNext, @c = @cNext
END ELSE
SET @p = COALESCE( @p + ',', SPACE( 0 ) ) + @pNext
FETCH NEXT FROM c INTO @cNext, @pNext
END
INSERT @tbl SELECT @c, @p
CLOSE c
DEALLOCATE c
SET NOCOUNT OFF
SELECT * FROM @tbl
With the upcoming version of SQL Server, some of these queries will become
obsolete since there are certain proprietary operators which can handle
pivoting datasets without much procedural logic. Also there are certain
additional capabilities of XML in SQL Server 2005 which can facilitate such
transpositions.
--
Anith
T-SQL is not a reporting language. You can return the results in a single
query if you perform the data formatting in your application. This will
improve performance and provide more flexibility. You might consider
formatting ContactName and DateCreated in your app as well.
Below is an example that you can tweak for your actual schema.
SELECT
a.AssessmentID AS AssessmentID,
cp.CompanyName AS Customer,
cp.FirstName + ' ' + cp.LastName AS ContactName,
CONVERT(varchar,a.InsertDate,101) as DateCreated,
d.DeviceName
FROM
tblAssessment a
INNER JOIN
tblCustomer c ON a.CustomerID = c.CustomerID
INNER JOIN
tblProfile cp ON c.ProfileID = cp.ProfileID
INNER JOIN
tblDeviceFact df ON df.AssessmentID = @AssessmentID
INNER JOIN
vwIPADevice d ON df.DeviceID = d.DeviceID
WHERE
a.salesrepid = @salesrepid AND
a.AssessmentTypeID = @ASSESSMENT_TYPE_CPP AND
a.Deleted = 0
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Meher Malakapalli" <mmalak...@cohesioninc.com> wrote in message
news:OPMqsyLH...@TK2MSFTNGP10.phx.gbl...
This is a statement that I still struggle with. T-SQL is a data
manipulation language, and you need to manipulate data to do reporting. It
is also true that row by row actions are usually easier to do on the
presentation layer, but will it always be faster?
Consider this set for a report
Col1, ..., Col100, RolledUpColumn1
...
Col1, ..., Col100, RolledUpColumn100
If Col1 through Col100 contain the exact same values for every row, and we
want to display the RolledUpColumn as:
Value1, Value2, ..., Value100
If we had a solid method of doing this at the data layer, it would certainly
be faster than the alternative. Now if you had additional groups applied to
the set, Col1!...Col100!, it would get really bad.
Is is terribly wrong to use SQL to manipulate data in a set wise fashion,
even if the final output is not a proper normalized table? I never hear
anyone tell someone that is would be better to do other aggregate type
functions on the front end:
Select value1, value2, sum(value) as value3
...
group by value1, value
is considered okay, so why not:
Select value1, value2, concatenate(value) as value3
...
group by value1, value
This is one of the first aggregate functions I will create when given the
possibility, since this kind of thing annoyingly pops up all of the time,
since users don't always think in normalized structures, they frequently
think in lists.
There are two major arguments, at least in my mind, against doing reporting
on the server:
1. Time and resources: The server has very finite limits on the resources
available to it ... mostly CPU and memory. It is therefore advantageous in
almost all environments to offload work from the database server so that the
database server can concentrate on doing what it does best ... namely, in a
reporting environment, the retrieval of data. Isn't is better to have a
hundred boxes doing formatting locally rather than a single box doing the
formatting for all those hundred boxes :-) As an aside, by doing the
formatting locally, one can also query the local Windows settings to
determine how best to format things like dates and time rather than having a
single pre-determined format for all users.
2. SQL, and T-SQL, is not a general purpose programming language. It is a
very specialized language. It is, as is often noted, a "data manipulation
language". While this does not mean that database vendors have not extended
the ANSI SQL standard, it does have two significant implications. First, by
doing the formatting in T-SQL you likely reduce portability. While you might
not care right now whether the code can run in another database, it is
possible that later the code might have to be ported. By sticking to ANSI
SQL coding conventions, you reduce potential problems. I realize that this
is a little bit like telling a 20-year old to begin saving for retirement,
but when that individual is 65, he's probably going to be a lot happier if
he did start thinking about retirement funds at the age of 20 rather than
waiting till he's 40 or so :-) Secondly, I'll assume you've heard the
phrase "when all you have is a hammer, everything looks like a nail". This
same concept applies here. While T-SQL can be used to do formatting, it's
just not the best tool for the job. Would you defend someone using assembler
language to code most commercial applications just because (a) they know
assembler, and (b) they claim that they can get "optimal" performance. The
latter might be true, but one has to consider the expense. Even if
performance could be doubled, would the expense and cost of coding in
assembler be worth it ... keeping in mind that coding in assembler would
almost certainly entail an order of magnitude (if not more) increase in cost
and delivery time.
I'm not trying to "pick an argument" with you. I'm just trying to elaborate
on why most SQL professionals voice this opinion.
Chief Tenaya
"Louis Davidson" <dr_donts...@hotmail.com> wrote in message
news:OtZBJkXH...@TK2MSFTNGP09.phx.gbl...
I have heard people going on about time and resources thing, but because I
have never been involved with a case where my SQL Server was ever CPU or
memory bound, even when I had thousands of concurrent users, and I was doing
some pretty wild data manipulation to present web pages (I was just serving
up data, but it was pretty complex.) Usually it was the web servers that
were taxed. On "most" systems, nothing is too overly taxed, and the only
performance problems are caused by poor design and infrastructure.
And SQL IS a data manipulation language, hence my problem with not using it
to transform data from one shape to another. It is not like using SQL to
format data is a foreign thing, other than a few cases (like the one that
got this thread started.) Plus it is very doggone good at it. The reason
that I had always used SQL Server to do data formatting for the past 12
years has been because it was so much more elegant of a solution to format
it in a Stored Procedure using set based manipulations and give it to the
client in a use it as it is format. Sure three tier development means that
some of this can be moved out of SQL, but it is so easy to change:
SELECT dateValue
to
SELECT convert(100, varchar(20),dateValue)
instead of coding hooks in a dll to do this formatting, even though it is
usually a better use of resources.
I am not arguing with you at all, I am doing my weekly (or so) try to
justify in my brain that it is better to push off work that SQL Server does
so very well to a place where the coding takes significantly longer to get
done (unless you are using a reporting tool or canned software, then it is
probably already done.)
--
----------------------------------------------------------------------------
Louis Davidson (dr...@hotmail.com)
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
"Tenaya" <c...@ct.ct> wrote in message
news:e8eouWZH...@TK2MSFTNGP09.phx.gbl...
First, thanks for understanding that I wasn't "attacking" you or anything
like that :-)
The bottom line, of course, is if it works, then fine and good. I'm not such
a purist that I'm going to insist that code conform to some theoretical
ideal. There have been times when I was in situations where I coded some
"amazing" things in T-SQL, that should never have been coded in T-SQL in the
first place.
But I'd still argue that if one "knows" the logical arguments for moving
coding off the database server, then it is worthwhile doing it. It sort of
goes back to the concept that while it might work today, under the current
workload, the same coding might labor unnecessarily down the road ... and if
I can avoid a potential problem in the future, then I think it's worthwhile
to do so. Forewarned is forearmed :-)
At the very least, I'd argue that formatting things that differ from country
to country ... date and time formats, or even just the representation of
numerics ... is better done closer to the client than at the database
server. As I mentioned in my first post ... isn't it better to query the
local Windows settings to determine what the user wants to see re: a date or
a time format, or whether a comma or a decimal should be used to separate
the fractional part of a number.
I won't argue that it's easier to make a change in one place than in many
places, but I do contend that it's even better to accomodate a user's wishes
rather than have to explain why one's application is "locked" into a
particular display that ignores the local Windows settings.
But, again, bottom line is it may be better not to change something if it's
not broken :-)
Chief Tenaya
"Louis Davidson" <dr_donts...@hotmail.com> wrote in message
news:%23DdWqBb...@tk2msftngp13.phx.gbl...
My view is that the presentation task is better done outside the data tier.
Modern programming languages offer a robust, specialized toolset for
presenting data in a variety of formats. Sure you can do a lot of cool
stuff in T-SQL but this doesn't mean you should. Formatting data outside of
SQL allows you to use the same optimized T-SQL code regardless of how data
are presented. You can render data as HTML, in WinForms, etc. without
creating customized SQL for each data presentation.
To add to Teneya's remarks, you limit your scalability options when you do
too much work in the data layer. Although this may not be a concern for
smaller applications, I think it makes sense to take the same approach
regardless of app size so that you can hone your skills to develop
applications that can scale as needed.
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Louis Davidson" <dr_donts...@hotmail.com> wrote in message
news:OtZBJkXH...@TK2MSFTNGP09.phx.gbl...
Thanks,
--
----------------------------------------------------------------------------
Louis Davidson (dr...@hotmail.com)
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services. All other replies will be ignored :)
"Tenaya" <c...@ct.ct> wrote in message
news:OpTKtRdH...@TK2MSFTNGP10.phx.gbl...