Web Images Videos Maps News Shopping Gmail more »
Recently Visited Groups | Help | Sign in
Google Groups Home
Message from discussion Set based solution
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Anith Sen  
View profile  
 More options Apr 7 2004, 3:49 pm
Newsgroups: microsoft.public.sqlserver.programming
From: "Anith Sen" <an...@bizdatasolutions.com>
Date: Wed, 7 Apr 2004 14:46:16 -0500
Local: Wed, Apr 7 2004 3:46 pm
Subject: Re: Set based solution
Just by glancing through your code, all you are doing is concatenating the
data in a column & updating another table. Concatenating column values from
multiple rows are generally best done in a client side application language,
since the capabilities of SQL & SQL based DBMSs are limited. However, you
can do this using different approaches in t-sql, but avoiding such methods
may be your best bet. Here is the kludge-list with the Northwind..Products
table which you can adapt to suit your own needs.

e.g: SELECT CategoryId, ProductName FROM Northwind..Products ;

The goal is to return a SQL resultset with two columns, one with the
CategoryId and the other with a concatenated list of all the ProductNames
separated by commas.

#1 Dynamic SQL.

There is enough literature out there which demonstrates the drawbacks and
implications of using Dynamic SQL. A popular one, at least from database
programmer’s perspective, is www.sommarskog.se/dynamic_sql.html. The Dynamic
SQL approach is based on creating a SQL query string based on the number of
groups and then use a series of CASE expressions to pivot the data for
concatenation.

DECLARE @r VARCHAR( 8000 ), @n INT, @i INT
SELECT @r = 'SELECT CategoryId, ' + CHAR( 13 ),
       @n = ( SELECT TOP 1 COUNT( ProductName ) FROM Products
               GROUP BY CategoryId ORDER BY COUNT( ProductName ) DESC )
SET @i = 1 ;
WHILE @i <= @n BEGIN
SET @r = @r +
  CASE WHEN @i = 1  THEN 'MAX( CASE Seq
       WHEN ' + CAST( @i AS VARCHAR ) + '
       THEN ProductName ELSE SPACE( 0 ) END ) + ' + CHAR(13)
       WHEN @i = @n
       THEN 'MAX( CASE Seq
       WHEN ' + CAST( @i AS VARCHAR ) + '
       THEN '', '' + ProductName
       ELSE SPACE( 0 ) END ) ' + CHAR( 13 )
       ELSE 'MAX( CASE Seq
       WHEN ' + CAST( @i AS VARCHAR ) + '
       THEN '', '' + ProductName
       ELSE SPACE( 0 ) END ) + ' + CHAR( 13 )  END
 SET @i = @i + 1
END
SET @r = @r + '
FROM ( SELECT CategoryId, ProductName,
              ( SELECT COUNT(*) FROM Products p2
                 WHERE p2.CategoryId = p1.CategoryId
                  AND p2.ProductName <= p1.ProductName )
        FROM Products p1 ) D ( CategoryId, ProductName, Seq )
       GROUP BY CategoryId;'
EXEC( @r )

There is a chance that the SQL string can grow beyond 8000 chars which is
the documented limit for a VARCHAR datatype. In such cases, one may have to
use multiple VARCHAR variables and do: EXEC(@r1 + @r2 + @r3 + ..)

#2 Scalar UDF with t-SQL update extension

The usage of an expression that involves the column that is being used in
the LHS of the SET clause in an UPDATE statement is rarely a logical
approach. However, in general, the optimizer seems to process these values
in the order of materialization, either in the internal work tables or any
other storage structures. With that assumption, this approach though not a
good logical methods, works under general circumstances.

CREATE FUNCTION udf_scalar_concat ( @CategoryId INT )
        RETURNS VARCHAR( 8000 ) AS
BEGIN
DECLARE @t TABLE( p VARCHAR(40 ) )
DECLARE @r VARCHAR( 8000 )
       SET @r = SPACE(0)
       INSERT @t
       SELECT ProductName FROM Products
        WHERE CategoryId = @CategoryId
       IF @@ROWCOUNT > 0
       UPDATE @t
          SET @r = @r + p + ','
      RETURN( @r )
END

Usage:

SELECT CategoryId, dbo.udf_scalar_concat( CategoryId )
  FROM Products
 GROUP BY CategoryId ;

#3 Scalar UDF with recursion

Recursive functions in t-SQL have a drawback that the maximum nesting level
is 32. So this approach is applicable only for smaller datasets, especially
when the number of items within a group, that needs to be concatenated, is
less than 32.

CREATE FUNCTION udf_recur ( @cid INT, @i INT )
RETURNS VARCHAR(8000) AS BEGIN
DECLARE @r VARCHAR(8000), @l VARCHAR(8000)
SELECT @i = @i - 1, @r = ProductName + ', '
  FROM Products p1
 WHERE CategoryId = @cid
   AND  @i = ( SELECT COUNT( * ) FROM Products p2
                WHERE p2.CategoryId = p1.CategoryId
                  AND p2.ProductName <= p1.ProductName ) ;
IF @i > 0 BEGIN
      EXEC @l = dbo.recur @cid, @i
      SET @r =  @l + @r
END
RETURN @r
END
GO

This function can be used as follows:

SELECT CategoryId, dbo.udf_recur( CategoryId, COUNT(ProductName) )
  FROM Products p1
 GROUP BY CategoryId ;

#4 Table valued UDF with a WHILE loop

This approach is based on the idea by Steve/Lindawie where a table variable
with three columns is used within a table valued UDF. The first column
represents the group, second represents the currently processing value
within a group and the third represents the concatenated list of values.

CREATE FUNCTION udf_tbl_Concat()
RETURNS @t TABLE( CategoryId INT,
        Product VARCHAR( 40 ),
        list VARCHAR( 7991 ) )
BEGIN
 INSERT @t ( CategoryId, Product, list )
 SELECT CategoryId, MIN( ProductName ),  MIN( ProductName )
   FROM Products
  GROUP BY CategoryId
WHILE ( SELECT COUNT( Product ) FROM @t ) > 0 BEGIN
UPDATE t
   SET list = list + COALESCE(
                 ( SELECT ', ' + MIN( ProductName ) FROM Products
                    WHERE Products.CategoryId = t.CategoryId
                      AND Products.ProductName > t.Product ), '' ),
       Product = ( SELECT MIN(ProductName) FROM Products
                    WHERE Products.CategoryId = t.CategoryId
                      AND Products.ProductName > t.Product )
  FROM @t t END
RETURN
END

SELECT CategoryId, list
  FROM udf_tbl_Concat() ;

#5 The Cursors approach (least recommended)

The drawbacks of using cursors are well-known among the SQL Server
community. Given the fact that they are generally resource intensive,
procedural and inefficient, one should strive not to use cursor based
approaches for programming requirements unless as a last resort.

DECLARE @tbl TABLE ( id INT PRIMARY KEY, list VARCHAR( 8000 ) )
SET NOCOUNT ON
DECLARE @c INT, @p VARCHAR( 8000 ), @cNext INT, @pNext VARCHAR( 40 )
DECLARE c CURSOR FOR
SELECT CategoryId, ProductName
  FROM Products
 ORDER BY CategoryId, ProductName
OPEN c
FETCH NEXT FROM c INTO @cNext, @pNext
SET @c = @cNext
WHILE @@FETCH_STATUS = 0 BEGIN
     IF @cNext > @c BEGIN
          INSERT @tbl SELECT @c, @p
          SELECT @p = @PNext, @c = @cNext
     END ELSE
          SET @p = COALESCE( @p + ',', SPACE( 0 ) ) + @pNext
     FETCH NEXT FROM c INTO @cNext, @pNext
END
INSERT @tbl SELECT @c, @p
CLOSE c
DEALLOCATE c
SET NOCOUNT OFF
SELECT * FROM @tbl

With the upcoming version of SQL Server, some of these queries will become
obsolete since there are certain proprietary operators which can handle
pivoting datasets without much procedural logic. Also there are certain
additional capabilities of XML in SQL Server 2005 which can facilitate such
transpositions.

--
Anith


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google