Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Break Up Large Table Query Into Results of N Rows
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  7 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
pbd22  
View profile  
 More options Jan 31, 9:57 am
Newsgroups: comp.databases.ms-sqlserver
From: pbd22 <dush...@gmail.com>
Date: Tue, 31 Jan 2012 06:57:42 -0800 (PST)
Local: Tues, Jan 31 2012 9:57 am
Subject: Break Up Large Table Query Into Results of N Rows
Hi.

I am working in SQL Server 2005 and want to break up a table of 1M
rows into distinct results of 80,0000.

I feel like this is basically

use [database]

select * from [table]
where email_address like '%gmail.com'
group by ????

I am messing up with the group by (if this is the right way to go).
How do achieve this?

Thanks.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
pbd22  
View profile  
 More options Jan 31, 3:50 pm
Newsgroups: comp.databases.ms-sqlserver
From: pbd22 <dush...@gmail.com>
Date: Tue, 31 Jan 2012 12:50:27 -0800 (PST)
Local: Tues, Jan 31 2012 3:50 pm
Subject: Re: Break Up Large Table Query Into Results of N Rows
On Jan 31, 9:57 am, pbd22 <dush...@gmail.com> wrote:

Anybody? To people use this group any more?

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Bob Barrows  
View profile  
 More options Jan 31, 4:49 pm
Newsgroups: comp.databases.ms-sqlserver
From: "Bob Barrows" <reb01...@NOyahooSPAM.com>
Date: Tue, 31 Jan 2012 16:49:46 -0500
Local: Tues, Jan 31 2012 4:49 pm
Subject: Re: Break Up Large Table Query Into Results of N Rows

We sure do, but I, for one, don't understand your question. maybe show us:

CREATE TABLE statement
INSERT STATEMENT with sample data
desired results from supplied sample data

and we'll have a better chance of understanding your request (first off,
what's 80,0000 mean?)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erland Sommarskog  
View profile  
 More options Jan 31, 5:03 pm
Newsgroups: comp.databases.ms-sqlserver
From: Erland Sommarskog <esq...@sommarskog.se>
Date: Tue, 31 Jan 2012 23:03:05 +0100
Local: Tues, Jan 31 2012 5:03 pm
Subject: Re: Break Up Large Table Query Into Results of N Rows

I am here! However, I only look in here about once a day. The traffic
does not really warrant anymore. If you want speedy answers to your
questioms, this is a busy place:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/threads
Although, a bit too many people answers questions that goes outside
what they really know.

Anyway, your problem is a little vague. What is the purpose with
this division? And must the batches be exactly 80000 rows in size
or can the number vary?

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
pbd22  
View profile  
 More options Feb 1, 10:34 am
Newsgroups: comp.databases.ms-sqlserver
From: pbd22 <dush...@gmail.com>
Date: Wed, 1 Feb 2012 07:34:49 -0800 (PST)
Local: Wed, Feb 1 2012 10:34 am
Subject: Re: Break Up Large Table Query Into Results of N Rows
On Jan 31, 5:03 pm, Erland Sommarskog <esq...@sommarskog.se> wrote:

Hi Erland and Bob,

Thanks both a bundle for your responses. Looks like there is life on
planet
Google Groups after all, much appreciated!

The specifics is that we are doing email deployments but google is
moving
all of the email sent to gmail users to their spam boxes. As a result,
we have
to "chunk" the gmail users out of the total amount and send in
manageable
batches. We have figured that 80,000 per batch out of the total gmail
users
in the table is possible.

And, Erland, to answer your question, I would prefer to turn this into
a stored
procedure we can use for the purpose of "chunking" email addresses.
Accordingly,
the number per batch should be a variable parameter.

Bob, the table we are querying against is pretty simple. Essentially,
it has one
one column - "email_address" which is a varchar. Its data is about 1
million email
addresses (but that number changes often). The result set table(s)
should only have
two columns, the count (INT) and the email_address (varchar). Please
see below.

The query I am trying to write is supposed to dump each result set
(batch)
to a text file in some folder on the hard drive. Each result set
should have
a count and the email addresses as columns. Something like this:

RESULT 1:

[COUNT] [EMAIL_ADDRESS]
 1              n...@gmail.com
 2              n...@gmail.com
 3              n...@gmail.com
 4              n...@gmail.com
 5              n...@gamil.com
...
80,000       n...@gmail.com

RESULT 2:

[COUNT] [EMAIL_ADDRESS]
80,001      n...@gmail.com
80,002      n...@gmail.com
80,003      n...@gmail.com
80,004      n...@gmail.com
80,005      n...@gamil.com
...
16,000       n...@gmail.com

And so on up to the to total amount of the gmail addresses out of the
original table.

So, the statement should read something like this (pseudo code):

select all distinct users
from the master table
where email_address like '%gmail.com'
return in batches of N (such as 80,000)
and write each batch to a text file on the
hard drive.

I hope I have explained myself well. Let me know if anything is
unclear.

Thanks a bundle for your help.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Bob Barrows  
View profile  
 More options Feb 1, 3:36 pm
Newsgroups: comp.databases.ms-sqlserver
From: "Bob Barrows" <reb01...@NOyahooSPAM.com>
Date: Wed, 1 Feb 2012 15:36:07 -0500
Local: Wed, Feb 1 2012 3:36 pm
Subject: Re: Break Up Large Table Query Into Results of N Rows

You keep making a point of saying "distinct". Does that imply that there are
duplicate email addresses in that 1-million row table?

> from the master table
> where email_address like '%gmail.com'
> return in batches of N (such as 80,000)
> and write each batch to a text file on the
> hard drive.

I assume you can handle this part, correct? It's only batching the data that
you need help with?

> I hope I have explained myself well. Let me know if anything is
> unclear.

> Thanks a bundle for your help.

SQL 2008 has some paging functionality builtin but you're using SQL 2005.
I can think of a couple approaches. Here's one:
1. Create a temp table (#batches) with an identity column (indentcol) and an
email column (email). Insert the distinct email addresses into it:
insert #batches (email)
select distinct email_address from master_table where email_address like
'%gmail.com'

Then use a WHILE loop to retrieve the batches, using a variable to keep
track of them.
declare @batchsize int --convert this to a parameter for your sproc
set @batchsize=80000

declare @lastrec int, @endrec int
set @lastrec=(select max(identcol) from #batches)
set @endrec=@batchsize
WHILE @endrec-@batchsize<=@lastrec
BEGIN
    select email from #batches where identcol >=@lastrec
    --process the batch
    delete #batches  where identcol >=@lastrec
    set @endrec=@endrec + @batchsize
END


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erland Sommarskog  
View profile  
 More options Feb 1, 5:54 pm
Newsgroups: comp.databases.ms-sqlserver
From: Erland Sommarskog <esq...@sommarskog.se>
Date: Wed, 01 Feb 2012 23:54:36 +0100
Local: Wed, Feb 1 2012 5:54 pm
Subject: Re: Break Up Large Table Query Into Results of N Rows

pbd22 (dush...@gmail.com) writes:
> The specifics is that we are doing email deployments but google is
> moving all of the email sent to gmail users to their spam boxes. As a
> result, we have to "chunk" the gmail users out of the total amount and
> send in manageable batches. We have figured that 80,000 per batch out of
> the total gmail users in the table is possible.
>...
> Bob, the table we are querying against is pretty simple. Essentially,
> it has one one column - "email_address" which is a varchar. Its data is
> about 1 million email addresses (but that number changes often). The
> result set table(s) should only have two columns, the count (INT) and
> the email_address (varchar). Please see below.

The simple-minded solution would be:

   WITH numbered AS (
      SELECT email, row_number() OVER(ORDER BY email) AS rowno
      FROM   addresses
      WHERE  email LIKE '...@gmail.com'
   )
   SELECT email
   FROM   numbered
   WHERE  rowno > (@batchno - 1) * @batchsize
     AND  rowno <= (@batchno - 1) * @batchsize

But it would be far more efficient to do:

   CREATE TABLE gmail_addresses (rowno int NOT NULL,
                                 gmail nvarchar(255) NOT NULL,
      CONSTRAINT pk_gmail PRIMARY KEY CLUSTERED (rowno),
      CONSTRAINT pk_unique UNIQUE NONCLUSTERED(gmail))

   INSERT gmail_addresses(rowno, gmail)
     SELECT row_number() OVER(ORDER BY (SELECT 1)), gmail
     FROM   (SELECT DISTINCT email
             FROM   addresses
             WHERE  email like  '...@gmail.com') AS x

Then you have materialised the row number once for all, and a selection
of 80000 accounts will be quick.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »