37 views

Skip to first unread message

Jan 20, 2005, 10:56:22 AM1/20/05

to

I have been losing sleep over this puzzle, and I'm convinced my train

of thought is heading in the wrong direction. It is difficult to

explain my circumstances, so I will present an identical make-believe

challenge in order to avoid confusing the issue further.

of thought is heading in the wrong direction. It is difficult to

explain my circumstances, so I will present an identical make-believe

challenge in order to avoid confusing the issue further.

Suppose I was hosting a dinner and I wanted to invite exactly 12 guests

from my neighborhood. I'm really picky about that... I have 12 chairs

besides my own, and I want them all filled or there ain't gonna be no

dinner.

I have created a table of all the families in my neighborhood that I

can tolerate for an hour, and a count of all the members in each of the

families. I added an identity column and prioritized my potential

guests by the size of each family in descending order, like so:

Identity FamilyName SeatsNeeded

1 McClusky 8

2 Olson 7

3 Adams 6

4 Wilson 3

5 Sayjak 2

What is the easiest way to identify the first or only group of guests

that required EXACTLY 12 seats? The only solution, by the way, is to

invite the McClusky's, the Wilson's and the Sayjak's (7+3+2=12).

The procedure I am currently using starts with the table #ungrouped

(similar to the example above), and inserts the first record from

#ungrouped into an empty table, #grouped (set up just like #ungrouped,

except records from "SeatsNeeded" become "SeatsTaken"). While the sum

of all SeatsTaken in #grouped is less than 12, I continue to add to it

unique records from #ungrouped where SeatsNeeded

<=(12-SUM(#grouped.SeatsTaken)). As soon as #grouped contains a total

of 12 under the SeatsTaken column, those records are tagged with a

GroupID in their source tables and removed from the temporary tables.

The process is repeated until the procedure can't find any more groups

of 12 (if a situation exists similar to my example table, I have to

identify the last group manually). The remaining records get grouped

when additional records become available.

I apologize if my SQuengLish is confusing or in any way insulting. I

guess the bottom line is that I have a solution that only works when

the first records is part of the result set, and the procedure has many

more records to sort through. It would be nice if I could find as many

groups of 12 as possible, as efficiently as possible, with as few

left-overs as possible.

Any ideas? Clues?

Silvio

Jan 20, 2005, 12:13:01 PM1/20/05

to

Silvio Again...

I made a mistake regarding my example record set. If I want to iscolate

the records where the sum of all values in SeatsNeaded was 12, from

this table:

Identity FamilyName SeatsNeeded

1 McClusky 8

2 Olson 7

3 Adams 6

4 Wilson 3

5 Sayjak 2

... the correct record set would include the OLSON's, the Wilson's and

the Sayjak's (7+3+2=12)... not the McClusky's.

Jan 20, 2005, 12:58:21 PM1/20/05

to

On 20 Jan 2005 07:56:22 -0800, stu_gots wrote:

(snip)

>Suppose I was hosting a dinner and I wanted to invite exactly 12 guests

>from my neighborhood. I'm really picky about that... I have 12 chairs

>besides my own, and I want them all filled or there ain't gonna be no

>dinner.

(snip)

Hi Silvio,

I don't have a ready-made solution for this particular problem. I do have

a solution to a related problem, that might or might not help. I'll get to

that later on in this message. First, some thoughts about your problem.

An interesting starting question is whether there is a mathematical

solution for this problem - I'm not a mathematician, but maybe someone

else reading this is?

Anyway, if there is no mathematical solution, or if it is too complicated

to use in SQL Server, your only other option is to use trial and error. In

a third generation programming language, I'd be tempted to make a

recursive procedure:

* add largest family that still fits room to the "to be invited" set

* if all seats filled: report success & return

* if empty seats left:

* call self

* if succes: report succes & return

* if failure: retry from start with next available family

* if all families have been tried: report failure & return

You COULD try to create such a procedure in SQL Server, but recursion is

not exactly a particular strength of SQL Server. It would also perform

terrible (from the rest of your message, I understand that your real

problem involves lots more families and not one but many dinner parties).

I'll mull your problem over for a while. If I come up with something

creative, I'll let you know.

Meanwhile, allow me to present you with the code I once wrote to solve a

related problem. To stay with your analogy: you have to invite ALL guests

from your neighbourhood to dinner; you want to get it over as quickly as

possible, so if you can serve them all with three dinner aprties, there

definitely won't be four. I bet you see both the similarities and the

differences; I'll leave it to you to decide if this code will help you or

not.

The code I'll post uses a different analogy: examinations. There are

different subjects (families); for each subject, the number of students

that enrolled for the exam (family members) is known. The examination room

has a capacity of 100 seats. All students taking an examination for a

subject will be invited on the same day; examination sessions for multiple

subjects might be combined as long as the room doesn't overflow. The aim

of my code is to combine as much subjects as possible, making sure the

least possible amount of days is used to get all examinations done. I

don't think my code will always find the best possible combination, but I

do believe that it gets pretty close, and it's a lot faster than using

cursors (as most peeple would do in this case).

Some extra complications in my example that are not present in your

analogy are:

* If more than 100 students enroll for the same subject (and only in that

case!), the examinations for that subject will be split: as many sessions

for exactly 100 students as needed, plus one session for the remaining

students. This last session might be combined with sessions for other

subjects.

* The examinations are not help once, but each quarter; the number of

students enrolling for a subject might change from quarter to quarter.

Here's the code I used to solve this. I originally wrote this script for

Joe Celko, so I used the ANSI-standard UPDATE syntax; converting it to

proprietary UPDATE FROM syntax would probably speed it up.

-- Create the tables

CREATE TABLE Entries (Quarter int NOT NULL

,SubjCode char(3) NOT NULL

,NumberOfCandidates int NOT NULL

,PRIMARY KEY (Quarter, SubjCode)

-- ,FOREIGN KEY (SubjCode) REFERENCES Subjects

,CHECK (Quarter IN (1, 2, 3, 4))

,CHECK (NumberOfCandidates > 0)

)

CREATE TABLE Sessions (Quarter int NOT NULL

,SessionNo int NOT NULL

,SpaceLeft int NOT NULL -- could be derived, but

that would hurt performance

,PRIMARY KEY (Quarter, SessionNo)

,CHECK (Quarter IN (1, 2, 3, 4))

)

CREATE TABLE SplitEntries (Quarter int NOT NULL

,SubjCode char(3) NOT NULL

,SplitNo int NOT NULL

,NumberOfCandidates int NOT NULL

,SessionNo int DEFAULT NULL

,PRIMARY KEY (Quarter, SubjCode, SplitNo)

,FOREIGN KEY (Quarter, SubjCode) REFERENCES

Entries

,FOREIGN KEY (Quarter, SessionNo) REFERENCES

Sessions

-- ,CHECK (Quarter IN (1, 2, 3, 4))

-- ,CHECK (NumberOfCandidates > 0 AND

NumberOfCandidates <= 100)

)

go

-- Insert some sample data

INSERT Entries (Quarter, SubjCode, NumberOfCandidates)

SELECT 1, 'A01', 117

UNION ALL

SELECT 1, 'A02', 30

UNION ALL

SELECT 1, 'A03', 263

UNION ALL

SELECT 1, 'A04', 15

UNION ALL

SELECT 1, 'B01', 57

UNION ALL

SELECT 1, 'B02', 32

UNION ALL

SELECT 1, 'B03', 14

UNION ALL

SELECT 2, 'A01', 198

UNION ALL

SELECT 2, 'A02', 54

UNION ALL

SELECT 2, 'A03', 202

UNION ALL

SELECT 2, 'A04', 19

UNION ALL

SELECT 2, 'B01', 42

UNION ALL

SELECT 2, 'B02', 40

UNION ALL

SELECT 2, 'B03', 18

UNION ALL

SELECT 2, 'B04', 16

go

-- Find subjects with over 100 entries, split them in units of max 100.

-- Declare max candidates per session as constant, to make maintenance

easier

DECLARE @MaxCandidatesPerSession int

SET @MaxCandidatesPerSession = 100

-- Make split entries for each full 100 in the original entries

-- A numbers table is needed for this step,

-- see http://www.aspfaq.com/show.asp?id=2516

INSERT INTO SplitEntries (Quarter, SubjCode, SplitNo, NumberOfCandidates)

SELECT Entries.Quarter,

Entries.SubjCode,

numbers.n / @MaxCandidatesPerSession,

@MaxCandidatesPerSession

FROM Entries

INNER JOIN numbers

ON numbers.n <= Entries.NumberOfCandidates

AND numbers.n % @MaxCandidatesPerSession = 0

UNION ALL

-- don't forget the remaining candidates after alloting the full 100's.

SELECT Entries.Quarter,

Entries.SubjCode,

1 + Entries.NumberOfCandidates / @MaxCandidatesPerSession,

Entries.NumberOfCandidates % @MaxCandidatesPerSession

FROM Entries

WHERE Entries.NumberOfCandidates % @MaxCandidatesPerSession <> 0

-- Here comes the code to combine the (split) entries into as little

-- examination sessions as possible.

-- Declare max candidates per session as constant, to make maintenance

easier

DECLARE @MaxCandidatesPerSession int

SET @MaxCandidatesPerSession = 100

-- Helper variables to control WHILE loops

DECLARE @AllDone char(1)

SET @AllDone = 'N'

DECLARE @SessionsFull char(1)

-- For each quarter, compute the minimal number of sessions needed and

insert rows for those sessions.

-- Note: minimal number of sessions needed = total applications / max per

session, rounded up.

-- A numbers table is needed for this step,

-- see http://www.aspfaq.com/show.asp?id=2516

INSERT INTO Sessions (Quarter, SessionNo, SpaceLeft)

SELECT Quarter,

(SELECT COALESCE(MAX(s.SessionNo), 0)

FROM Sessions AS s

WHERE s.Quarter = NeedByQuarter.Quarter) + n,

@MaxCandidatesPerSession

FROM (SELECT Quarter,

(SUM(NumberOfCandidates) + @MaxCandidatesPerSession -

1) / @MaxCandidatesPerSession AS MinNeeded

FROM SplitEntries

GROUP BY Quarter) AS NeedByQuarter

CROSS JOIN numbers

WHERE n BETWEEN 1 AND NeedByQuarter.MinNeeded

-- Keep adding and filling sessions until all split entries are assigned a

session

WHILE @AllDone = 'N'

BEGIN

SET @SessionsFull = 'N'

-- Keed adding split entries to sessions until they are all as full as

possible

WHILE @SessionsFull = 'N'

BEGIN

-- Add another split entry to each session that still has room for

another split entry.

-- Note that this is not a best fit algorithm, but rather a compromise

-- to keep the query simple (cough) and limit execution time.

-- Each session gets a ranking based on descending SpaceLeft (within

quarter), with SessionNo as tiebreaker.

-- Each split entry gets a ranking, based on descending

NumberOfCandidates (within quarter),

-- with SubjectCode and SplitNo as tiebreakers.

-- Split entries that won't fit any available session are omitted -

they'll wait for the next round of sessions.

-- The tables are joined on rank and if the NumberOfCandidates doesn't

exceed the space left, it is added.

-- The subquery is repeated twice, to satisfy ANSI standard syntax.

-- Execution time on MS SQL Server decreases if proprietary T-SQL

update syntax is used.

UPDATE SplitEntries

SET SessionNo = (SELECT s.SessionNo

FROM Sessions AS s

WHERE s.Quarter = SplitEntries.Quarter

AND s.SpaceLeft >=

SplitEntries.NumberOfCandidates

AND (SELECT COUNT(*)

FROM Sessions AS s2

WHERE s2.Quarter =

SplitEntries.Quarter

AND (s2.SpaceLeft > s.SpaceLeft

OR (s2.SpaceLeft = s.SpaceLeft

AND s2.SessionNo <=

s.SessionNo)))

= (SELECT COUNT(*)

FROM SplitEntries AS se2

WHERE se2.Quarter =

SplitEntries.Quarter

AND se2.SessionNo IS

NULL

AND se2.NumberOfCandidates <=

(SELECT MAX(SpaceLeft)

FROM Sessions AS s3

WHERE s3.Quarter = SplitEntries.Quarter)

AND (se2.NumberOfCandidates >

SplitEntries.NumberOfCandidates

OR (se2.NumberOfCandidates =

SplitEntries.NumberOfCandidates

AND se2.SubjCode <

SplitEntries.SubjCode)

OR (se2.NumberOfCandidates =

SplitEntries.NumberOfCandidates

AND se2.SubjCode =

SplitEntries.SubjCode

AND se2.SplitNo <=

SplitEntries.SplitNo))))

WHERE SessionNo IS NULL

AND NumberOfCandidates <= (SELECT MAX(SpaceLeft)

FROM Sessions AS b3

WHERE b3.Quarter =

SplitEntries.Quarter)

AND EXISTS (SELECT s.SessionNo

FROM Sessions AS s

WHERE s.Quarter = SplitEntries.Quarter

AND s.SpaceLeft >= SplitEntries.NumberOfCandidates

AND (SELECT COUNT(*)

FROM Sessions AS s2

WHERE s2.Quarter = SplitEntries.Quarter

AND (s2.SpaceLeft > s.SpaceLeft

OR (s2.SpaceLeft = s.SpaceLeft

AND s2.SessionNo <= s.SessionNo)))

= (SELECT COUNT(*)

FROM SplitEntries AS se2

WHERE se2.Quarter = SplitEntries.Quarter

AND se2.SessionNo IS NULL

AND se2.NumberOfCandidates <= (SELECT

MAX(SpaceLeft)

FROM

Sessions AS s3

WHERE

s3.Quarter = SplitEntries.Quarter)

AND (se2.NumberOfCandidates >

SplitEntries.NumberOfCandidates

OR (se2.NumberOfCandidates =

SplitEntries.NumberOfCandidates

AND se2.SubjCode <

SplitEntries.SubjCode)

OR (se2.NumberOfCandidates =

SplitEntries.NumberOfCandidates

AND se2.SubjCode =

SplitEntries.SubjCode

AND se2.SplitNo <=

SplitEntries.SplitNo))))

IF @@ROWCOUNT = 0

BEGIN

-- If no updates were done, all sessions are as full as possible and

inner loop stops

SET @SessionsFull = 'Y'

END

ELSE

BEGIN

-- One or more split entries were assigned to a session -

recalculate remaining room in all sessions

UPDATE Sessions

SET SpaceLeft = @MaxCandidatesPerSession

- (SELECT SUM(NumberOfCandidates)

FROM SplitEntries

WHERE SplitEntries.Quarter = Sessions.Quarter

AND SplitEntries.SessionNo =

Sessions.SessionNo)

END

END -- End inner loop (adding split entries to sessions)

-- All sessions are filled as far as possible, but maybe some split

entries are still not allotted to any session.

-- For each quarter, compute and insert the minimal number of extra

sessions needed.

-- (Note - this is basicallly a repeat of the query before the outermost

WHILE loop,

-- with an extra WHERE to skip split entries already assigned to

a session)

INSERT INTO Sessions (Quarter, SessionNo, SpaceLeft)

SELECT Quarter,

(SELECT COALESCE(MAX(s.SessionNo), 0)

FROM Sessions AS s

WHERE s.Quarter = NeedByQuarter.Quarter) + n,

@MaxCandidatesPerSession

FROM (SELECT Quarter,

(SUM(NumberOfCandidates) + @MaxCandidatesPerSession

- 1) / @MaxCandidatesPerSession AS MinNeeded

FROM SplitEntries

WHERE SessionNo IS NULL

GROUP BY Quarter) AS NeedByQuarter

CROSS JOIN numbers

WHERE n BETWEEN 1 AND NeedByQuarter.MinNeeded

IF @@ROWCOUNT = 0

BEGIN

-- No new sessions inserted can only mean that there are no more split

entries with SessionNo still NULL

SET @AllDone = 'Y'

END

END -- End outer loop (adding sessions until all split entries are

alotted)

-- All done!

Sorry for the line wrapping. My SQL code was obviously not formatted with

the 80-character usenet limitation in my mind <g>

Best, Hugo

--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jan 21, 2005, 12:00:03 PM1/21/05

to

Thanks for the ideas. I'm not sure yet what it was, but soon after I

read your post I came up with a solution. I think I've completely

solved my root problem, and I kind of feeel silly for not seeing it in

the first place... I guess I never found any examples.

read your post I came up with a solution. I think I've completely

solved my root problem, and I kind of feeel silly for not seeing it in

the first place... I guess I never found any examples.

Anyway, this seems to be the fastest and easiest way to (in my case) to

identify a group of "orders" where the total number of "sets" = 12.

More on this later, here's what I got so far...

--Again, instead of inviting families to fill exactly 12 seats,

--I'm looking for the fewest OrderIDs

--where the number of Sets they contain totals exactly 12.

--This returns a value of 2,3,7

CREATE TABLE #Example

(

OrderID int,

SetCount int NULL

)

INSERT #Example VALUES (1,8)

INSERT #Example VALUES (2,7)

INSERT #Example VALUES (3,6)

INSERT #Example VALUES (4,3)

INSERT #Example VALUES (5,2)

INSERT #Example VALUES (6,DEFAULT)

SELECT TOP 1 a.SetCount AS SetA,

b.SetCount AS SetB,

c.SetCount AS SetC

FROM #Example a

INNER JOIN #Example b ON a.OrderID <> b.OrderID

INNER JOIN #Example c ON a.OrderID <> c.OrderID

AND c.OrderID <> b.OrderID

GROUP BY a.SetCount, b.SetCount, c.SetCount

HAVING (SUM(a.SetCount + b.SetCount + c.SetCount) = 12)

Jan 21, 2005, 12:18:37 PM1/21/05

to

"stu_gots" <slv_...@yahoo.com> wrote in message

news:1106236582....@c13g2000cwb.googlegroups.com...

> I have created a table of all the families in my neighborhood that I

> can tolerate for an hour, and a count of all the members in each of the

> families. I added an identity column and prioritized my potential

> guests by the size of each family in descending order, like so:

>

> Identity FamilyName SeatsNeeded

> 1 McClusky 8

> 2 Olson 7

> 3 Adams 6

> 4 Wilson 3

> 5 Sayjak 2

>

> What is the easiest way to identify the first or only group of guests

> that required EXACTLY 12 seats? The only solution, by the way, is to

> invite the McClusky's, the Wilson's and the Sayjak's (7+3+2=12).

news:1106236582....@c13g2000cwb.googlegroups.com...

> I have created a table of all the families in my neighborhood that I

> can tolerate for an hour, and a count of all the members in each of the

> families. I added an identity column and prioritized my potential

> guests by the size of each family in descending order, like so:

>

> Identity FamilyName SeatsNeeded

> 1 McClusky 8

> 2 Olson 7

> 3 Adams 6

> 4 Wilson 3

> 5 Sayjak 2

>

> What is the easiest way to identify the first or only group of guests

> that required EXACTLY 12 seats? The only solution, by the way, is to

> invite the McClusky's, the Wilson's and the Sayjak's (7+3+2=12).

Ok, I have a solution. It is not going to have fabulous performance but it

seems to work.

The basic procedure that I use is:

Create a List (length = 1) of Families

Loop until length = count(Items)

Find Member Size of Each List

Delete all Lists where Size > desiredsize

If Size = desiredsize add list to Solutions

If you want to shorten runtime and find only smallest solutions end

here.

Build new lists from all remaining lists by adding each family not in the

list to the list.

End

Expanding the list back into the family names is left as an excercise for

the reader.

Regards,

Jim

-- BEGIN SQL

--CREATE SAMPLE DATA

create table #Dining (Fam integer, famname varchar(10), members integer)

insert into #Dining VALUES (1,'McClusky',8)

insert into #Dining VALUES (2,'Olson',7)

insert into #Dining Values (3,'Adams',6)

insert into #Dining Values (4,'Wilson',3)

insert into #Dining Values (5,'Sayjak',2)

insert into #Dining Values (6,'Test',4)

--CREATE TEMP TABLES FOR PROCESS

create table #sol(fams varchar(200), Total integer, maxfam integer, length

integer)

create table #solution(fams varchar(200), Total integer)

declare @length integer

declare @curmax integer -- Number of Items in table

declare @desiredsize integer

set @desiredsize = 13 -- Size to Find

select @curmax = Count(*) from #Dining

set @length = 1

insert into #sol(fams, maxfam, length) Select Fam, Fam, @length from

#Dining -- Singles

while @length < @curmax

BEGIN

set @length = @length + 1

insert into #sol(fams, maxfam, length ) Select a.Fams + ',' + CAST(b.Fam

as varchar(2)), b.fam, @length

from #sol a, #dining b

where a.maxfam < b.fam

delete #sol where #sol.length < @length

update #sol Set Total = t.mem from (Select Fams, sum(members) mem from

#Dining b, #sol a where ','+a.Fams+',' like '%,'+CAST(b.Fam as

varchar(2))+',%'

Group by Fams) t inner join #sol a on t.Fams = a.Fams

insert into #solution(fams,Total) SELECT #sol.fams, #sol.Total FROM #sol

where Total = @desiredsize

delete #sol where Total > @desiredsize

END

select * from #solution

--CLEAN UP

drop table #Dining

drop table #sol

drop table #solution

--END SQL

Jan 21, 2005, 1:17:58 PM1/21/05

to

To elaborate, I am not interested cutting seconds or preserving system

resources. I'm working in pre-press during a swing shift, assembling

about 200 orders per evening containing a total of about 800 sets. This

is a print production solution... only 12 sets can pass through the

press at once in a variety of different circumstances too difficult to

explain. Orders, by the way, can not be split up to print in different

groups because of consistancy issues, and are consequently not sold in

groups of more than 12.

resources. I'm working in pre-press during a swing shift, assembling

about 200 orders per evening containing a total of about 800 sets. This

is a print production solution... only 12 sets can pass through the

press at once in a variety of different circumstances too difficult to

explain. Orders, by the way, can not be split up to print in different

groups because of consistancy issues, and are consequently not sold in

groups of more than 12.

This will be a long series of loops that applies the grouping procedure

to a bout 20 or more different subgroups of orders. The object will be

to have as few ungrouped orders as possible when the procedure is out

of places to turn. Perhaps there is a better procedure than the cross

join I posted a bit ago (the full context of which would actually

return 12 values... '2', '3', '7', and nine NULLs... I just gave you

the abridged version). The reason I'm so thrilled my the cross join is

that it will do the job a lot better than the current method of

grouping orders (with two hands and a pot of coffee), and it is simple

enough for the beginner... someone will have to take over for me

eventually. I have a long way to go before I can automate the process,

but I'm pretty sure the most difficult brain-teasers are behind me.

What do you think... errors in my logic?

-badabing!

Jan 21, 2005, 1:55:44 PM1/21/05

to

The following code will return all the permutations of 3 families to

get the result. It returns families #2, #4, #5. I couldn't figure out

how to narrow the permutations down to only combinations though.

get the result. It returns families #2, #4, #5. I couldn't figure out

how to narrow the permutations down to only combinations though.

select A,B,C

from

(

select a.orderid as A, b.orderid as B, c.orderid as C,

isnull(a.setcount,0) as AA, isnull(b.setcount,0) as BB,

isnull(c.setcount,0) as CC

from #Example A, #Example B, #Example C

) Derived

where

A<>B and A<>C and B<>C and

(AA+BB+CC) =12

Jan 21, 2005, 2:11:42 PM1/21/05

to

Just thought it thru. Added top clause and order by clause. It now

returns one row #2, #4, #5. (Bonus style points for me :) Note this

only checks for combinations of 3 families. You'll need to modify for

combos of 4 etc.

returns one row #2, #4, #5. (Bonus style points for me :) Note this

only checks for combinations of 3 families. You'll need to modify for

combos of 4 etc.

select top 1 A,B,C

from

(

select a.orderid as A, b.orderid as B, c.orderid as C,

isnull(a.setcount,0) as AA, isnull(b.setcount,0) as BB,

isnull(c.setcount,0) as CC

from #Example A, #Example B, #Example C

) Derived

where

A<>B and A<>C and B<>C and

(AA+BB+CC) =12

order by A,B,C

Jan 21, 2005, 3:35:54 PM1/21/05

to

Exactly what I was thinking. I suppose I could make use of up to 12

output fields (A, B, C, ..., K, L). Most of them would never contain

more than Null anyway, and I'd have them in the rare instance when my

procedure came down to a final group of 12 Orders, all with one set.

output fields (A, B, C, ..., K, L). Most of them would never contain

more than Null anyway, and I'd have them in the rare instance when my

procedure came down to a final group of 12 Orders, all with one set.

Jan 21, 2005, 3:49:24 PM1/21/05

to

"stu_gots" <slv_...@yahoo.com> wrote in message

news:1106331478....@c13g2000cwb.googlegroups.com...> This will be a long series of loops that applies the grouping procedure

> to a bout 20 or more different subgroups of orders. The object will be

> to have as few ungrouped orders as possible when the procedure is out

> of places to turn. Perhaps there is a better procedure than the cross

> join I posted a bit ago (the full context of which would actually

> return 12 values... '2', '3', '7', and nine NULLs... I just gave you

> the abridged version). The reason I'm so thrilled my the cross join is

> that it will do the job a lot better than the current method of

> grouping orders (with two hands and a pot of coffee), and it is simple

> enough for the beginner... someone will have to take over for me

> eventually. I have a long way to go before I can automate the process,

> but I'm pretty sure the most difficult brain-teasers are behind me.

> What do you think... errors in my logic?

>

> -badabing!

>

to maintain to me, especially if the batch size changes. Is there some

reason you are subgrouping your orders? BTW this is a variant of a problem

called the Bin-Packing Problem and this problem is considered mathematically

hard (Technically, NP-Complete, for all of you geeks out there). The upshot

of which is that finding the 'best' solution isn't possible for the general

case in a reasonable amount of time. (While I know you said you aren't

concerned about time, reasonable in this case goes into hundreds of years

for not terrible large sets of data). For example: With the minimal data

set (5 samples) in the sample there are 27 different possible batches. With

10 there are 1014. With 200 there are 1.6x10^60.

Having Said that, I have made a couple revisions to accomodate batch

assignment. It will assign all jobs possible to batches, each batch will

have SetQty of 12. The Remainder of Jobs will not be batchable for Qty 12.

The current ordering creates each batch with the smallest number of jobs and

I suspect this to be reasonable at batching the most possible jobs, although

this is not guaranteed. The only caveat in the current routine is that any

set with qty 12 gets set to Batch #0 to indicate it must be run separately.

Ideally each of these should be given individual batch numbers and probably

the batches should be prioritized in some manner.

Let me know how it goes.

Jim

--BEGIN SQL

--CREATE SAMPLE DATA

create table #Jobs (JobNum integer, JobName varchar(10), Qty integer)

insert into #Jobs VALUES (1,'McClusky',8)

insert into #Jobs VALUES (2,'Olson',7)

insert into #Jobs Values (3,'Adams',6)

insert into #Jobs Values (4,'Wilson',3)

insert into #Jobs Values (5,'Sayjak',2)

insert into #Jobs Values (6,'Test',4)

insert into #Jobs VALUES (7,'MAX',12)

--CREATE TEMP TABLES FOR PROCESSING

create table #sol(JobNums varchar(200), Total integer, maxJobNum integer,

length integer)

create table #solution(JobNums varchar(200), Total integer)

create table #batches(batchnum integer, setnum integer, qty integer)

--DECLARE VARIABLES

declare @length integer

declare @curmax integer

declare @desiredsize integer

declare @found integer

declare @done integer

declare @batchnum integer

set @desiredsize = 12 -- BATCH SIZE SHOULD BE PASSED INTO SP

set @done = 0

set @batchnum = 0

-- SINCE EACH SET MUST BE > 0 THERE CAN BE NO MORE THAN 12 SETS IN A VALID

SOLUTION

set @curmax = @desiredsize

-- SET ALL SINGLES to Batch #0

insert into #batches SELECT @batchnum, JobNum, Qty From #Jobs where Qty =

@desiredsize

While (@done = 0)

BEGIN

delete from #sol

set @length = 1

--START WITH SINGLE SET BATCHES

insert into #sol(JobNums, maxJobNum, length)

Select JobNum, JobNum, @length

from #Jobs left join #batches on #Jobs.JobNum = #batches.SetNum

where SetNum is Null -- Singles

set @found = 0

while (@length < @curmax) and (@found = 0)

BEGIN

set @length = @length + 1

--ADD A SET TO EACH BATCH

insert into #sol(JobNums, maxJobNum, length ) Select a.JobNums + ',' +

CAST(b.JobNum as varchar(2)), b.JobNum, @length

from #sol a, (#Jobs b left join #batches c on b.JobNum =

c.SetNum)

where a.maxJobNum < b.JobNum and c.SetNum is Null

--DELETE SMALLER BATCHES

delete #sol where #sol.length < @length

--CALC BATCH SIZE

update #sol Set Total = t.mem from (Select JobNums, sum(Qty) mem from

#Jobs b, #sol a where ','+a.JobNums+',' like '%,'+CAST(b.JobNum as

varchar(2))+',%'

Group by JobNums) t inner join #sol a on t.JobNums = a.JobNums

--SET SOLUTIONS IF BATCH SIZE = DESIRED SIZE

insert into #solution(JobNums,Total) SELECT #sol.JobNums, #sol.Total

FROM #sol where Total = @desiredsize

IF @@ROWCOUNT > 0

BEGIN

--IF FOUND START OVER

set @found = 1

END

--GET RID OF ALL TOO LARGE BATCHES

delete #sol where Total >= @desiredsize

END

IF @found=0 --WE SIZED OUT SO WE ARE DONE

BEGIN

set @done = 1

END

--ASSIGN FOUND SOLUTIONS TO A BATCH

set @batchnum = @batchnum+1

insert into #batches select @batchnum, a.JobNum, a.Qty from #Jobs a inner

join #solution b on ','+b.JobNums+',' like '%,'+CAST(a.JobNum as

varchar(2))+',%'

delete from #solution

END

--SHOW FOUND BATCHES

select * from #batches

--SHOW REMAINDER

select a.* from #Jobs a left join #batches b on a.JobNum = b.SetNum where

b.SetNum is null

drop table #Jobs

drop table #sol

drop table #solution

drop table #batches

Jan 21, 2005, 3:56:13 PM1/21/05

to

Actually, I think you may have to create several blocks of code. The

first block checks if combos of 2 return a resultset. If not then try

combos of 3. If not then try combos of 4. I may be wrong but here's

why. If I take your sample data but don't insert the 6th family with

no setcounts. The following query returns zero rows. The cross join

assumes that you're going to use 4 and exactly 4 families

first block checks if combos of 2 return a resultset. If not then try

combos of 3. If not then try combos of 4. I may be wrong but here's

why. If I take your sample data but don't insert the 6th family with

no setcounts. The following query returns zero rows. The cross join

assumes that you're going to use 4 and exactly 4 families

CREATE TABLE #Example

(

OrderID int,

SetCount int NULL

)

INSERT #Example VALUES (1,8)

INSERT #Example VALUES (2,7)

INSERT #Example VALUES (3,6)

INSERT #Example VALUES (4,3)

INSERT #Example VALUES (5,2)

select

top 1 A,B,C,D

from

(

select a.orderid as A, b.orderid as B, c.orderid as C, d.orderid as D,

isnull(a.setcount,0) as AA, isnull(b.setcount,0) as BB,

isnull(c.setcount,0) as CC, isnull(d.setcount,0) as DD

from #Example A, #Example B, #Example C, #Example D

) Derived

where

A<>B and

A<>C and B<>C and

A<>D and B<>D and C<>D and

(AA+BB+CC+DD) =12

order by A,B,C,D

Jan 21, 2005, 5:03:04 PM1/21/05

to

On 21 Jan 2005 09:00:03 -0800, stu_gots wrote:

(snip)

>Anyway, this seems to be the fastest and easiest way to (in my case) to

>identify a group of "orders" where the total number of "sets" = 12.

>More on this later, here's what I got so far...

(snip)

Hi Silvio,

This will work, but the downside is that it will only search for sets of

three. Expanding the code to look for sets of other sizes as well will

make it quite ugly.

A quick improvement to this code is this:

SELECT TOP 1 a.SetCount AS SetA,

b.SetCount AS SetB,

c.SetCount AS SetC

FROM #Example AS a

INNER JOIN #Example AS b

ON b.OrderID > a.OrderID

INNER JOIN #Example AS c

ON c.OrderID > b.OrderID

WHERE a.SetCount + b.SetCount + c.SetCount = 12

The change of <> to > means each combination is tried only once.

The removal of an unneeded (and potentially dangerous? just a gut feeling,

didn't really dive into it) GROUP BY and SUM simplifies the query.

More in a reply to one of your other posts.

Jan 21, 2005, 5:12:48 PM1/21/05

to

OK... now you're all just reading way into my project and spoiling all

my fun. Just kidding... thank you all again for the input. It just

doesn't seem right that you should offer up all these practically

cut-and-paste-able examples.

my fun. Just kidding... thank you all again for the input. It just

doesn't seem right that you should offer up all these practically

cut-and-paste-able examples.

I am intending to cycle through all the variations that must run in

groups, such as paper stock, run quantity, the type ink and number of

inks (only two inks per pass, one pass per day, and some inks must run

alone). We would rarely have more than 100 orders for the most common

run variation, one ink (black) on plain white stock. I can use a

simpler procedure to find groups in large result sets... up until this

morning, my biggest dilema was finding the less obvious groups in

smaller result sets.

What really complicates the task is the ability to duplicate a job set

one or more times on the printing plate. For instance, if I had six

remaining sets I could place each twice on the plate for a total of 12

sets, which would then be printed on half as much paper. Or, in another

instance, one set of 1000 and one set of 2000 could be placed on the

plate four and eight times, respectively, and printed 250 times. This

was the first issue I tackled, but the last option I'll use. Reducing

the number of impressions for each run, increases the overall number of

plate changes throughout the day, which in turn leads to more over-time

hours for the well-paid press staff.

It is also essential that I arrange my final production schedule for

two identical offset printers, and ensure the fewest number of ink

changes possible. Confusing enough, yet?

It gets worse. But I have a good grasp on it so far, and with some of

the ideas I saw here, I should have something to show for my sleepless

nights in about two weeks.

Silvio

Jan 21, 2005, 5:20:42 PM1/21/05

to

On 21 Jan 2005 10:17:58 -0800, stu_gots wrote:

>To elaborate, I am not interested cutting seconds or preserving system

>resources. I'm working in pre-press during a swing shift, assembling

>about 200 orders per evening containing a total of about 800 sets. This

>is a print production solution... only 12 sets can pass through the

>press at once in a variety of different circumstances too difficult to

>explain. Orders, by the way, can not be split up to print in different

>groups because of consistancy issues, and are consequently not sold in

>groups of more than 12.

>

>This will be a long series of loops that applies the grouping procedure

>to a bout 20 or more different subgroups of orders. The object will be

>to have as few ungrouped orders as possible when the procedure is out

>of places to turn.

Hi Silvio,

Before starting to code, I'd like to get the business requirement a little

more clear. Your description IS already very clear, but I still have one

question. If (as you describe) you make as many batches of exactly 12 sets

as possible, you'll end up with one or more orders that can't be combined

to a batch of 12. How do you deal with those? My logic says that you'd

combine them into one (or maybe even more) slightly smaller batches. But

it might also be that those orders are simply not served, but postponed to

the next day when they (hopefully) CAN be combined with new orders for a

total of 12 sets.

If there is a machine limitation (or a cost-based requirement) that each

batch always prints exactly 12 sets, then you're (IMO) on the right track.

But if you really want to serve all orders in as little batches as

possible, you might want to rethink. It *seems* logical that striving for

full batches results in the least number of batches - but that's not

always the case! Consider 6 orders, 3 for 4 sets and 3 for 7 sets. If you

look for combinations of 12 sets, you'll combine the first three orders

into one batch. You'll be left with the other three orders, each for 7

sets, that can't be combined and have to go in seperate batches. The total

number of batches is 4. However, if you combine each order for 7 sets with

one order for 4 sets, you'd serve all orders in three batches of 11 sets

each.

If your true goal is to minimize the number of batches, I'd urge you to

take a look at the code I posted yesterday. If needed, I can help you to

adapt it to your situation.

> Perhaps there is a better procedure than the cross

>join I posted a bit ago (the full context of which would actually

>return 12 values... '2', '3', '7', and nine NULLs... I just gave you

>the abridged version).

Maybe, maybe not. The main problem I see with this query is that it forms

ony one batch at a time. You then have to go and remove the rders in that

batch from the total set and repeat the procedure. Not very efficient

(though I fully agree that it beats manual grouping <g>).

I'm still mulling over some possibilities to do this set-based, forming

all (or at least as many as possible) 12-set batches at once. I'm not sure

yet if the ideas I have will eventually combine or not. But do let me know

if you actually wwanted to serve all orders in as little batches as

possible - in that case, I'll stop mulling and start customizing my

solution for the bin packing problem to your specific needs.

Jan 22, 2005, 8:07:24 AM1/22/05

to

To elaborate on my previous post, no orders are postponed... the orders

that can't be put in groups totaling 12 sets (e.g., 3 orders each with

1 set, 1000 copies per set, black ink on standard white stock) can be

grouped with what was left over from the next group with identical

traits, only fewer quantity (6 orders each with 1 set, 500 copies,

black ink, standard stock).

that can't be put in groups totaling 12 sets (e.g., 3 orders each with

1 set, 1000 copies per set, black ink on standard white stock) can be

grouped with what was left over from the next group with identical

traits, only fewer quantity (6 orders each with 1 set, 500 copies,

black ink, standard stock).

So these orders...

1000 Copies, blk, white stock

ID SetCount

1 1

2 1

3 1

And these orders...

500 Copies, blk, white stock

ID SetCount

4 1

5 1

6 1

7 1

8 1

9 1

Become this group of 12, scheduled for 500 imprents (copies)...

ID SetCount (this time meaning the number of copies on the plate)

1 2

2 2

3 2

4 1

5 1

6 1

7 1

8 1

9 1

> Consider 6 orders, 3 for 4 sets and 3 for 7 sets. If you

> look for combinations of 12 sets, you'll combine the first three

orders

> into one batch. You'll be left with the other three orders, each for

7

> sets, that can't be combined and have to go in seperate batches. The

total

> number of batches is 4. However, if you combine each order for 7 sets

with

> one order for 4 sets, you'd serve all orders in three batches of 11

sets

> each.

I am somewhat new to this business, but schedulers will argue whether

it is financially more efficient overall to ensure that every plate is

full, or occasionally leave a spot on the plate empty (as would be the

case with an 11-set run). Some will waste four or five spots in a night

our of many dozen runs, but the seemingly good schedulers don't do it

at all. There is a lot more to the press side than printing, and I get

the feeling they don't like empty spaces on the plates.

I don't mean to get help with any of this, it's really to big... I just

thought you'd be interested that a lot of wholesale offset printers do

this the old fashion way.

Sil

Jan 22, 2005, 7:22:59 PM1/22/05

to

I find this discussion quite interesting. We have a somewhat related setup;

we print a large number of strip maps each month, chosen from about 750

stock numbers according to the amount ordered and the amount left in

inventory. Maps are printed in pairs (actually there are 6 images on a

plate, but for production reasons we always do 3 of one map and 3 of the

other). The production quantities vary wildly, from a few thousand to half

a million. We are allowed to overprint by any amount since we just retain

the stock for orders later in the year, but once a year the maps are

revised and any leftover stock must be destroyed unpaid for, so it's

important to keep the quantities down. On the other hand if we print enough

in one run to last several months, that is a big win. The final constraint

is that the orders - which are delivered to any of 300 clubs each month -

are picked and packed in a specific order, and the further "out-of-order"

the press run gets, the more temporary storage space is wasted.

we print a large number of strip maps each month, chosen from about 750

stock numbers according to the amount ordered and the amount left in

inventory. Maps are printed in pairs (actually there are 6 images on a

plate, but for production reasons we always do 3 of one map and 3 of the

other). The production quantities vary wildly, from a few thousand to half

a million. We are allowed to overprint by any amount since we just retain

the stock for orders later in the year, but once a year the maps are

revised and any leftover stock must be destroyed unpaid for, so it's

important to keep the quantities down. On the other hand if we print enough

in one run to last several months, that is a big win. The final constraint

is that the orders - which are delivered to any of 300 clubs each month -

are picked and packed in a specific order, and the further "out-of-order"

the press run gets, the more temporary storage space is wasted.

The solution I came up with is embarrassingly simple-minded. Step 1, any

maps whose print volume is larger than an upper limit are paired with

themselves. Step 2, we walk through the list of maps (procedurally, not

set-based SQL) looking for any exact-press-quantity match within a distance

25 stock numbers; if one is found, that makes a pair. Step 3, we walk

through what remains of the list finding the matching map within 25 numbers

that is within 25,000 of the same press quantity; those make a pair. Step

4, we pair anything left over with itself.

The whole procedure takes a few minutes because it is procedural, not set-

based, so there's no optimization going on. There are also one or two other

constraints I haven't mentioned. But the fact is that it isn't worth my

time to develop an algorithm for a more optimal press run (even though I've

suggested many times that I search for a new algorithm), because as long as

we let the inventory run out before revision time so that the destroys are

low, anything extra can just sit on the shelf. They just have me tweak the

formula for deciding the initial required print volume once in a while, but

the matching algorithm stays the same.

Jan 23, 2005, 7:49:30 AM1/23/05

to

I'm doing fine business stationary... not everything is grouped in

12's, but cards are the biggest business and the only scheduling

headache, 12 of them fit on 8.5x11 stock. Sets, by the way, are cards

with different names but identical layouts and main lines (eg. Wayne

County Sherrif's Office). The grouping job in itself isn't even a real

headache, as one person can do it in a little over an hour. But as the

paper work for each order is grouped in sets of 12, the scheduler needs

to write out a form labeling the group's run number, paper stock, inks

and run quantity. Then, those who do the grunt work of assembling the

plates, need to hand-write on the scheduler's form specific details

about every set (all information taken straight from a printout of

stored information). Then, the scheduler spends at least an hour

preparing reports for the night in Excel. It's really quite a scene.

I'm sure I can eliminate half those steps in less than 5 minutes. For

now though, as you might guess, I'm doing the grunt work from above...

and have no access to the FoxPro database, other than editing and

viewing orders.

12's, but cards are the biggest business and the only scheduling

headache, 12 of them fit on 8.5x11 stock. Sets, by the way, are cards

with different names but identical layouts and main lines (eg. Wayne

County Sherrif's Office). The grouping job in itself isn't even a real

headache, as one person can do it in a little over an hour. But as the

paper work for each order is grouped in sets of 12, the scheduler needs

to write out a form labeling the group's run number, paper stock, inks

and run quantity. Then, those who do the grunt work of assembling the

plates, need to hand-write on the scheduler's form specific details

about every set (all information taken straight from a printout of

stored information). Then, the scheduler spends at least an hour

preparing reports for the night in Excel. It's really quite a scene.

I'm sure I can eliminate half those steps in less than 5 minutes. For

now though, as you might guess, I'm doing the grunt work from above...

and have no access to the FoxPro database, other than editing and

viewing orders.

Funny thing, though... all the details I'd need are provided to us

every night in a giant Excel spreadsheet, which the database spits out

shortly after 5 p.m., after all the orders are entered. Initially, I'm

going to set this up with a Web page as the front end, and one of my

home computers as the back end. I'll upload to my computer at home the

Excel file, which contains all the details for every set from every

order:

OrderID, SetID, Mainline, Name, Stock, Quantity, Ink1, Ink2, Ink3

...which I am able to transform into three related tables, [Orders],

[Sets], and [Inks], and run the many processes I've been discussing

above. Eventually I'll have it spit out an Access .mdb file, complete

with all the automatic reports they need to accompany each and every

group.

Once I can show that it works well, I'll begin to tune it up a little,

and likely move the whole instance of SQL on site...but as you might

have also guessed, I'm still quite new to instances, stored procedures

and everything else that makes SQL Server different from Access, so I'm

hesitant to go that far any time soon. I no doubt will post problems as

they arise, given the amount of help I received... likely, though, in a

seperate topic. In fact, I have another question in mind if I don't

find a solution by this afternoon.

Sil

Jan 23, 2005, 10:52:08 PM1/23/05

to

Google for "knapsack problem". I believe your problem

has been proved NP complete, meaning that if you want

an optimal solution (as opposed to just *some* soln),

you cannot do better than an exhaustive search.

has been proved NP complete, meaning that if you want

an optimal solution (as opposed to just *some* soln),

you cannot do better than an exhaustive search.

In this case, you could do an exhaustive search for

all parties to which 1 family, 2 families, 3, etc,

up to 12 families, are invited (the 12 family case

is a 12-way self-join). The result is the union of

all the queries.

If you want just any solution, or an "often nearly optimal"

solution, there might be heuristics that will help,

depending on the statistics of your data (like for example,

families only ever have from 2 to 6 members).

Clifford Heath.

Jan 24, 2005, 7:24:26 AM1/24/05

to

Incidentally, I tried the 12-way self-join, as I hinted I would in a

previous post. Since most of the 12 result fields would come up null

while grouping an average of six families (the bulk of the families are

grouped with a simpler algorith until it fails to find the last

remaining groups), I figured the 12-way self-join wouldn't take too

long... actually, it turned out to take almost four minutes with the

table:

previous post. Since most of the 12 result fields would come up null

while grouping an average of six families (the bulk of the families are

grouped with a simpler algorith until it fails to find the last

remaining groups), I figured the 12-way self-join wouldn't take too

long... actually, it turned out to take almost four minutes with the

table:

ID SetCount

1 8

2 7

3 6

4 3

5 2

6 Null

I can't be sure that my tables won't include more than 12 records when

the procedure resorts to the cartesian solution, so I limited the

self-join to six. That should certainly be more than enough in my

situation.

I'll take a look at what this "heuristics" business is all about...

haven't heard of it yet.

Jan 24, 2005, 7:51:11 AM1/24/05

to

Fantastic! I should have seen that knapsack problem before. It's not a

big deal that I can't find THAT optimal solution, although it would be

nice. It sure illustrate a serious flaw in artificial intelligence that

I never noticed.

big deal that I can't find THAT optimal solution, although it would be

nice. It sure illustrate a serious flaw in artificial intelligence that

I never noticed.

Jan 24, 2005, 4:58:59 PM1/24/05

to

Hi Silvio,

The code I posted earlier in this thread for the knapsack or bin packing

problem results in an almost optimal solution, but much faster than trying

all possibilities. Did you already try to adapt it for your case?

I've been trying to find an efficient way to form groups of exactly 12

sets, but I'm afraid I'm stuck. I've decided to explain the way I was

heading - maybe you are able to go on where I am stranded.

My idea was to use a temp table and populate it first with "groups" of one

order - the number of groups would equal the number of orders. Next step

would be to join these groups with the orders, to create all possible

groups of two orders, except those that would exceed the 12-set limit. The

number of groups in this stage would be maximum (#orders) * (#orders - 1);

in reality slightly less. I'd repeat joining the temp "groups" table to

the orders table until none of the groups could be expanded without

exceeding the 12-set limit. Then, I'd remove all groups with less than 12

sets.

This is the point where I'm stuck. The above would result in a table with

all possible groups that total 12 sets, but many orders would be used in

more than one group. The next step would have been to use this temp table

to find a combination of groups in which as many orders as possible are

used, but no order is used more than once. I don't see an elegant way to

solve this step.

And since the first part (the steps to create the temp "groups" table) is

quite brute-force, I actually think it wouldn't beat a "one group at a

time" solution anyway.

That being said, it certainly was an intriguing and thought-provoking

puzzle. Thanks!!

Feb 2, 2005, 5:22:08 PM2/2/05

to

"stu_gots" <slv_...@yahoo.com> wrote in message

news:1106569466.8...@c13g2000cwb.googlegroups.com...I've come to this thread late and haven't read through it fully but

I'll offer my take.

Say we have a set of natural numbers, i.e., a collection of distinct

integers each >= 1, and we're interested in finding subsets whose

elementwise sum equals S. We know that for any positive integer

N, 1+2+...+N = N(N+1)/2, e.g., 1+2+3+4+5 = 5(5+1)/2 = 15. This

also tells us that the greatest number of distinct positive integers that

sum to 15 is 5. Generally, the greatest number of distinct positive

integers that sum to S is found by solving N(N+1)/2 = S for N and then

taking the floor of the result, that is, the integer portion. If S=12, then

no more than four distinct positive integers can sum to this value. So

the maximum number of distinct positive integers summing to S is

proportional to the square root of S, or O(SQRT(S)).

When we're potentially dealing with a multiset, and not just a set, of

natural numbers then the multiplicity of integers must be accounted

for. This means that the greatest number of positive integers that sum

to S is S, that is, S ones. This relationship is linear, or O(S), and

obviously

grows much faster than the square root of S. However, we can consider

the multiset of positive integers to be a set where each element is a tuple

of a value and its number of occurrences, e.g., the multiset {1,1,2,4,4,4}

can be represented as the set {(1,2),(2,1),(4,3)}. This allows us to revert

to finding sequences of distinct positive integers, however, each value is

also accompanied, and multiplied, by an appropriate number of

occurrences to sum to S. So we can craft a solution that is similarly

O(SQRT(S)), within a constant factor of two of the solution for simply

distinct positive integers summing to S. If S=12 and our addends are

drawn from a multiset, then no more than eight values are possible,

four distinct positive integers and their four corresponding occurrence

numbers, e.g., 1*1+2*2+3*1+4*1 which setwise is {(1,1),(2,2),(3,1),(4,1)}.

Let's look at an SQL solution that will allow sequences of up to five

distinct

positive integers and their corresponding occurrence numbers. This will

allow us to find all addends that sum to < 6(7)/2 = 21, that is, to 20.

CREATE TABLE V

(

i INT NOT NULL CHECK (i>0) PRIMARY KEY, -- distinct positive integer

n INT NOT NULL CHECK (n>0) -- occurrence number

)

-- Sample data

INSERT INTO V (i, n)

VALUES (1, 20) -- 20 values of 1

INSERT INTO V (i, n)

VALUES (2, 20)

INSERT INTO V (i, n)

VALUES (3, 20)

INSERT INTO V (i, n)

VALUES (4, 20)

INSERT INTO V (i, n)

VALUES (5, 20)

INSERT INTO V (i, n)

VALUES (6, 20)

INSERT INTO V (i, n)

VALUES (7, 20)

INSERT INTO V (i, n)

VALUES (8, 20)

INSERT INTO V (i, n)

VALUES (9, 20)

INSERT INTO V (i, n)

VALUES (10, 20)

INSERT INTO V (i, n)

VALUES (11, 20)

INSERT INTO V (i, n)

VALUES (12, 20)

CREATE VIEW Digits (d)

AS

SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL

SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL

SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL

SELECT 9

-- Table of all nonnegative integers to some arbitrary upper bound

CREATE TABLE N

(

i INT NOT NULL CHECK (i>=0) PRIMARY KEY

)

INSERT INTO N (i)

SELECT Ones.d + 10*Tens.d + 100*Hundreds.d -- from 0 to 999

FROM Digits AS Ones

CROSS JOIN

Digits AS Tens

CROSS JOIN

Digits AS Hundreds

DECLARE @s INT -- targeted sum

SET @s = 12 -- find all positive integers that sum to this value

SELECT V1.i AS v1, N1.i AS n1,

V2.i AS v2, N2.i AS n2,

V3.i AS v3, N3.i AS n3,

V4.i AS v4, N4.i AS n4,

V5.i AS v5, N5.i AS n5

FROM V AS V1

INNER JOIN

N AS N1

ON V1.i <= @s AND

N1.i BETWEEN 1 AND V1.n AND

N1.i <= @s / V1.i

LEFT OUTER JOIN

V AS V2

ON V2.i < @s AND

V2.i > V1.i AND

V2.i <= @s - V1.i * N1.i

LEFT OUTER JOIN

N AS N2

ON N2.i BETWEEN 1 AND V2.n AND

N2.i <= (@s - V1.i * N1.i) / V2.i

LEFT OUTER JOIN

V AS V3

ON V3.i < @s AND

V3.i > V2.i AND

V3.i <= @s - V1.i * N1.i - V2.i * N2.i

LEFT OUTER JOIN

N AS N3

ON N3.i BETWEEN 1 AND V3.n AND

N3.i <= (@s - V1.i * N1.i - V2.i * N2.i) / V3.i

LEFT OUTER JOIN

V AS V4

ON V4.i < @s AND

V4.i > V3.i AND

V4.i <= @s - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i

LEFT OUTER JOIN

N AS N4

ON N4.i BETWEEN 1 AND V4.n AND

N4.i <= (@s - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i)

/ V4.i

LEFT OUTER JOIN

V AS V5

ON V5.i < @s AND

V5.i > V4.i AND

V5.i <= @s - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i -

V4.i * N4.i

LEFT OUTER JOIN

N AS N5

ON N5.i BETWEEN 1 AND V5.n AND

N5.i =

(@s - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i - V4.i *

N4.i) / V5.i

WHERE V1.i * N1.i +

COALESCE(V2.i * N2.i, 0) +

COALESCE(V3.i * N3.i, 0) +

COALESCE(V4.i * N4.i, 0) +

COALESCE(V5.i * N5.i, 0) = @s

With the sample data and target sum given, 77 results are returned in a

fraction of a second (SQL Server 2005 Beta 2 on a very fast single-processor

box).

--

JAG

Feb 11, 2005, 4:40:20 AM2/11/05

to

I haven't tried it yet, but this code appears to be good as gold to me.

I'll post again in a bit.

I'll post again in a bit.

Sil

Feb 12, 2005, 9:29:33 AM2/12/05

to

Thanks y'all,

Those were some interesting results, John. I'm thinking if I break the

results out over a cross tab with each of the 77 rows representing a

possible combination and each column representing a possible number of

sets (only 1 through 12 in my case), I should be able to identify the

fewest combinations where the column totals come closest to the values

in TABLE V... i.e., identify the fewest groups of 12, with the fewest

leftovers.

The sample data you provided in TABLE V, for instance, would be grouped

easily with my first and simplest procedure as 20 groups of 12's, 20

groups of 11's and 1's, 20 groups of 10's and 2's, etc... ending with

10 groups of 6's. In this case, I would have found the most ideal

grouping possible (a total of 130 groups) simply by starting at the top

record and adding the next highest value with a sum not exceeding 12.

Most of the time, however, this procedure has leftovers or gets tripped

up by a combination similar to the one in my first post. If leftovers

are unavoidable, I can simply group them with sets from a lesser

quantity... 8 sets of 500 can be grouped with 2 sets of 1000, the

latter placed twice on a plate so a run count of 500 sheets produced

1000 copies. For that I'm pretty much using a variation of the first

code I posted along with numerous suggestions. I guess I'll attach an

example below in case anyone's curious and still following this. To

be certain, I'm neither a mathematician nor a dba, and what I do

understand is difficult for me to explain!

Anyway I'm not sure I said it before, but I don't think a perfect

solution to my entire task is a knapsack problem. If it's not, I'm

pretty certain you've lead me in the right direction. In any case, I

think I need to revisit everything from the start once I get to the

bottom of all this mathematics business.

Thanks again y'all,

Sil

--BEGIN SQL

CREATE TABLE #Jobs

(TempID int identity PRIMARY KEY,

JobID int NULL,

Quantity int DEFAULT 0,

SetCount int DEFAULT 0,

GroupID int NULL)

--some examples of leftover sets of various quantities

--to be ordered by quantity then sets

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (106, 250, 1)

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (105, 250, 2)

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (104, 500, 1)

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (103, 500, 2)

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (102, 1000, 2)

INSERT #Jobs (JobID, Quantity, SetCount)

VALUES (101, 1500, 1)

--each record will be assigned a GroupID when finished.

--There will be three distinct groups using these records.

INSERT #Jobs DEFAULT VALUES

--I still can't see any way around adding these

--null and zero values for the upcoming insert statement.

CREATE TABLE #Groups

(GroupID int identity PRIMARY KEY,

RunCount int,

SetA int,

SetB int,

SetC int,

SetD int,

SetE int,

SetF int)

CREATE TABLE #Divisors

(DivisorID tinyint identity PRIMARY KEY,

Divisor tinyint NOT NULL)

--this table will help calculate the run counts.

--I'm not sure of the significance, but the run count

--is always the lowest quantity in the group

--divided by one of these numbers....

INSERT #Divisors (Divisor)

VALUES (1)

INSERT #Divisors (Divisor)

VALUES (2)

INSERT #Divisors (Divisor)

VALUES (3)

INSERT #Divisors (Divisor)

VALUES (4)

INSERT #Divisors (Divisor)

VALUES (6)

INSERT #Divisors (Divisor)

VALUES (8)

INSERT #Divisors (Divisor)

VALUES (9)

INSERT #Divisors (Divisor)

VALUES (12)

DECLARE @RunCount smallint

DECLARE @MinQ smallint

DECLARE @Cursor tinyint

DECLARE @Divisor tinyint

DECLARE @Group tinyint

SET @Cursor = (SELECT TOP 1 DivisorID FROM #Divisors)

--or SET @Cursor = 1, for short.

WHILE (SELECT COUNT(*) FROM #Jobs

WHERE (GroupID IS NULL AND JobID IS NOT NULL)) > 0

BEGIN

SET @Divisor = (SELECT TOP 1 Divisor FROM #Divisors

WHERE DivisorID = @Cursor)

SET @MinQ = (SELECT TOP 1 Quantity

FROM #Jobs WHERE (GroupID IS NULL AND JobID IS NOT NULL)

ORDER BY Quantity)

SET @RunCount = @MinQ / @Divisor

--@RunCount is for reporting only, to identify the

--common denominator that will be used to group the jobs

--if the following insert statement succeeds. Since @RunCount

--is often a rounded intiger, it is not used in calculations.

--Would be nice to round this particular variable up, though.

INSERT #Groups (RunCount, SetA, SetB, SetC, SetD, SetE, SetF)

SELECT TOP 1 @RunCount,

A.JobID,

B.JobID,

C.JobID,

D.JobID,

E.JobID,

F.JobID

FROM #Jobs A

INNER JOIN #Jobs B ON A.TempID < B.TempID OR B.JobID IS NULL

INNER JOIN #Jobs C ON B.TempID < C.TempID OR C.JobID IS NULL

INNER JOIN #Jobs D ON C.TempID < D.TempID OR D.JobID IS NULL

INNER JOIN #Jobs E ON D.TempID < E.TempID OR E.JobID IS NULL

INNER JOIN #Jobs F ON E.TempID < F.TempID OR F.JobID IS NULL

WHERE (@Divisor * A.Quantity / @MinQ * A.SetCount)

+ (@Divisor * B.Quantity / @MinQ * B.SetCount)

+ (@Divisor * C.Quantity / @MinQ * C.SetCount)

+ (@Divisor * D.Quantity / @MinQ * D.SetCount)

+ (@Divisor * E.Quantity / @MinQ * E.SetCount)

+ (@Divisor * F.Quantity / @MinQ * F.SetCount) = 12

AND (A.Quantity = @MinQ)

AND (A.GroupID IS NULL)

AND (B.GroupID IS NULL)

AND (C.GroupID IS NULL)

AND (D.GroupID IS NULL)

AND (E.GroupID IS NULL)

AND (F.GroupID IS NULL)

ORDER BY A.Quantity, A.TempID DESC, B.Quantity, B.TempID DESC,

C.Quantity, C.TempID DESC, D.Quantity, D.TempID DESC,

E.Quantity, E.TempID DESC, F.Quantity, F.TempID DESC

IF @@ROWCOUNT = 0

BEGIN

SET @Cursor = (SELECT TOP 1 DivisorID FROM #Divisors

WHERE DivisorID > @Cursor ORDER BY DivisorID)

CONTINUE

END

--If the insert statement could not any jobs where the sets

--laid out proportionally on the plate do not fill 12 spots,

--@cursor advances and the while statement restarts.

--Otherwise, the new GroupID is assigned to each job

--in the new group...

SET @Group = (SELECT TOP 1 GroupID FROM #Groups

ORDER BY GroupID DESC)

UPDATE #Jobs

SET GroupID = @Group

WHERE (JobID = (SELECT SetA FROM #Groups WHERE GroupID=@Group))

OR (JobID = (SELECT SetB FROM #Groups WHERE GroupID=@Group))

OR (JobID = (SELECT SetC FROM #Groups WHERE GroupID=@Group))

OR (JobID = (SELECT SetD FROM #Groups WHERE GroupID=@Group))

OR (JobID = (SELECT SetE FROM #Groups WHERE GroupID=@Group))

OR (JobID = (SELECT SetF FROM #Groups WHERE GroupID=@Group))

SET @Cursor = 1

END

--@Cursor returns to 1, and the while statement continues

--grouping any remaining jobs.

DELETE FROM #Jobs WHERE JobID IS NULL

--just to get the null and zero record out of the report.

SELECT * FROM #Groups

SELECT * FROM #Jobs

--clean up

DROP TABLE #Groups

DROP TABLE #Jobs

DROP TABLE #Divisors

Feb 17, 2005, 11:30:11 PM2/17/05

to

Here's a bin-packing solution to your problem that builds on my

previous post. Note that bin-packing problems are NP-complete so

performance with other than modest-sized input could be problematic.

However, this solution is fully relational and should be useful as it

performs reasonably for small inputs. Note that the solution allows

you to specify bin capacity, that is, it's not assumed to be 12. All

code is repeated here for completeness.

-- 0-9

CREATE VIEW Digits (d)

AS

SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL

SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL

SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL

SELECT 9

-- Nonnegative integers to some suitable upper bound

CREATE TABLE N

(

i INT NOT NULL CHECK (i>=0) PRIMARY KEY

)

INSERT INTO N (i)

SELECT Ones.d + 10*Tens.d + 100*Hundreds.d

FROM Digits AS Ones

CROSS JOIN

Digits AS Tens

CROSS JOIN

Digits AS Hundreds

-- Table of values with the number of occurrences for each

-- corresponding value

CREATE TABLE V

(

i INT NOT NULL CHECK (i>0) PRIMARY KEY, -- value

n INT NOT NULL CHECK (n>0) -- occurrence number

)

-- Sample data

INSERT INTO V (i, n)

VALUES (7, 1) -- 1 occurrence of 7

INSERT INTO V (i, n)

VALUES (5, 2) -- 2 occurrences of 5

INSERT INTO V (i, n)

VALUES (2, 3)

INSERT INTO V (i, n)

VALUES (1, 4)

-- s : sum

-- vi : ith value

-- ni : number of occurrences of vi

-- rni: remaining occurrences of vi after ni are used

-- Note that the sum is not simply a variable but an attribute,

-- that is, it can be read or restricted, e.g., s < 10

-- s = v1*n1 + ... + vn*nn

CREATE VIEW Addends

(s, v1, n1, v2, n2, v3, n3, v4, n4, v5, n5, rn1, rn2, rn3, rn4, rn5)

AS

SELECT S.i,

V1.i, N1.i,

V2.i, N2.i,

V3.i, N3.i,

V4.i, N4.i,

V5.i, N5.i,

V1.n - N1.i,

V2.n - N2.i,

V3.n - N3.i,

V4.n - N4.i,

V5.n - N5.i

FROM N AS S

INNER JOIN

V AS V1

ON S.i > 0 AND

V1.i <= S.i

INNER JOIN

N AS N1

ON N1.i BETWEEN 1 AND V1.n AND

N1.i <= S.i / V1.i

LEFT OUTER JOIN

V AS V2

ON V2.i < S.i AND

V2.i > V1.i AND

V2.i <= S.i - V1.i * N1.i

LEFT OUTER JOIN

N AS N2

ON N2.i BETWEEN 1 AND V2.n AND

N2.i <= (S.i - V1.i * N1.i) / V2.i

LEFT OUTER JOIN

V AS V3

ON V3.i < S.i AND

V3.i > V2.i AND

V3.i <= S.i - V1.i * N1.i - V2.i * N2.i

LEFT OUTER JOIN

N AS N3

ON N3.i BETWEEN 1 AND V3.n AND

N3.i <= (S.i - V1.i * N1.i - V2.i * N2.i) / V3.i

LEFT OUTER JOIN

V AS V4

ON V4.i < S.i AND

V4.i > V3.i AND

V4.i <= S.i - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i

LEFT OUTER JOIN

N AS N4

ON N4.i BETWEEN 1 AND V4.n AND

N4.i <=

(S.i - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i) / V4.i

LEFT OUTER JOIN

V AS V5

ON V5.i < S.i AND

V5.i > V4.i AND

V5.i <=

S.i - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i - V4.i * N4.i

LEFT OUTER JOIN

N AS N5

ON N5.i BETWEEN 1 AND V5.n AND

N5.i =

(S.i - V1.i * N1.i - V2.i * N2.i - V3.i * N3.i - V4.i * N4.i) /

V5.i

WHERE V1.i * N1.i +

COALESCE(V2.i * N2.i, 0) +

COALESCE(V3.i * N3.i, 0) +

COALESCE(V4.i * N4.i, 0) +

COALESCE(V5.i * N5.i, 0) = S.i

-- Helper view that returns addends as strings

-- s : sum

-- vi : ith value

-- svi : ith value as a string

-- ni : number of occurrences of vi

-- sv : all values as a string

-- srn : all remaining occurrences of values as a string

CREATE VIEW AddendsStr

(s, v1, sv1, n1,

v2, sv2, n2,

v3, sv3, n3,

v4, sv4, n4,

v5, sv5, n5,

sv, srn)

AS

SELECT s,

v1, CAST(COALESCE(v1, 0) AS CHAR(10)), COALESCE(n1, 0),

v2, CAST(COALESCE(v2, 0) AS CHAR(10)), COALESCE(n2, 0),

v3, CAST(COALESCE(v3, 0) AS CHAR(10)), COALESCE(n3, 0),

v4, CAST(COALESCE(v4, 0) AS CHAR(10)), COALESCE(n4, 0),

v5, CAST(COALESCE(v5, 0) AS CHAR(10)), COALESCE(n5, 0),

CAST(COALESCE(v1, 0) AS CHAR(10)) +

CAST(COALESCE(v2, 0) AS CHAR(10)) +

CAST(COALESCE(v3, 0) AS CHAR(10)) +

CAST(COALESCE(v4, 0) AS CHAR(10)) +

CAST(COALESCE(v5, 0) AS CHAR(10)),

CAST(COALESCE(rn1, 0) AS CHAR(10)) +

CAST(COALESCE(rn2, 0) AS CHAR(10)) +

CAST(COALESCE(rn3, 0) AS CHAR(10)) +

CAST(COALESCE(rn4, 0) AS CHAR(10)) +

CAST(COALESCE(rn5, 0) AS CHAR(10))

FROM Addends

-- View that gives all bin-packing solutions (not just fewest bins)

-- Maximum arbitrarily defined as five groups (bins) with each group

-- having a maximum of five values

-- s : sum

-- bin_capacity : capacity of each bin, e.g., capacity of 12 for sum

-- bin_tally : number of occupied bins

-- vij : jth value of ith group

-- nij : number of occurrences of vij

-- Note that bin_capacity is an attribute so it can be restricted

CREATE VIEW BinPackingAddends

(s, bin_capacity, bin_tally,

v11, n11, v12, n12, v13, n13, v14, n14, v15, n15,

v21, n21, v22, n22, v23, n23, v24, n24, v25, n25,

v31, n31, v32, n32, v33, n33, v34, n34, v35, n35,

v41, n41, v42, n42, v43, n43, v44, n44, v45, n45,

v51, n51, v52, n52, v53, n53, v54, n54, v55, n55)

AS

SELECT S.total, BC.i,

1 + SIGN(COALESCE(A2.v1, 0)) + SIGN(COALESCE(A3.v1, 0)) +

SIGN(COALESCE(A4.v1, 0)) + SIGN(COALESCE(A5.v1, 0)),

A1.v1, NULLIF(A1.n1, 0),

A1.v2, NULLIF(A1.n2, 0),

A1.v3, NULLIF(A1.n3, 0),

A1.v4, NULLIF(A1.n4, 0),

A1.v5, NULLIF(A1.n5, 0),

A2.v1, NULLIF(A2.n1, 0),

A2.v2, NULLIF(A2.n2, 0),

A2.v3, NULLIF(A2.n3, 0),

A2.v4, NULLIF(A2.n4, 0),

A2.v5, NULLIF(A2.n5, 0),

A3.v1, NULLIF(A3.n1, 0),

A3.v2, NULLIF(A3.n2, 0),

A3.v3, NULLIF(A3.n3, 0),

A3.v4, NULLIF(A3.n4, 0),

A3.v5, NULLIF(A3.n5, 0),

A4.v1, NULLIF(A4.n1, 0),

A4.v2, NULLIF(A4.n2, 0),

A4.v3, NULLIF(A4.n3, 0),

A4.v4, NULLIF(A4.n4, 0),

A4.v5, NULLIF(A4.n5, 0),

A5.v1, NULLIF(A5.n1, 0),

A5.v2, NULLIF(A5.n2, 0),

A5.v3, NULLIF(A5.n3, 0),

A5.v4, NULLIF(A5.n4, 0),

A5.v5, NULLIF(A5.n5, 0)

FROM (SELECT SUM(i*n)

FROM V) AS S(total)

INNER JOIN

N AS BC -- bin capacity

ON BC.i > 0 AND

BC.i <= S.total

INNER JOIN

AddendsStr AS A1

ON A1.s <= BC.i

LEFT OUTER JOIN

AddendsStr AS A2

ON A2.s <= A1.s AND

A2.s <= S.total - A1.s AND

A2.n1 <=

COALESCE(CAST(SUBSTRING(A1.srn,

NULLIF(CHARINDEX(A2.sv1, A1.sv, 1), 0),

10) AS INT), A2.n1) AND

A2.n2 <=

COALESCE(CAST(SUBSTRING(A1.srn,

NULLIF(CHARINDEX(A2.sv2, A1.sv, 1), 0),

10) AS INT), A2.n2) AND

A2.n3 <=

COALESCE(CAST(SUBSTRING(A1.srn,

NULLIF(CHARINDEX(A2.sv3, A1.sv, 1), 0),

10) AS INT), A2.n3) AND

A2.n4 <=

COALESCE(CAST(SUBSTRING(A1.srn,

NULLIF(CHARINDEX(A2.sv4, A1.sv, 1), 0),

10) AS INT), A2.n4) AND

A2.n5 <=

COALESCE(CAST(SUBSTRING(A1.srn,

NULLIF(CHARINDEX(A2.sv5, A1.sv, 1), 0),

10) AS INT), A2.n5)

LEFT OUTER JOIN

AddendsStr AS A3

ON A3.s <= A2.s AND

A3.s <= S.total - A1.s - A2.s AND

A3.n1 <=

COALESCE(CAST(SUBSTRING(A2.srn + A1.srn,

NULLIF(CHARINDEX(A3.sv1,

A2.sv + A1.sv, 1), 0),

10) AS INT), A3.n1) AND

A3.n2 <=

COALESCE(CAST(SUBSTRING(A2.srn + A1.srn,

NULLIF(CHARINDEX(A3.sv2,

A2.sv + A1.sv, 1), 0),

10) AS INT), A3.n2) AND

A3.n3 <=

COALESCE(CAST(SUBSTRING(A2.srn + A1.srn,

NULLIF(CHARINDEX(A3.sv3,

A2.sv + A1.sv, 1), 0),

10) AS INT), A3.n3) AND

A3.n4 <=

COALESCE(CAST(SUBSTRING(A2.srn + A1.srn,

NULLIF(CHARINDEX(A3.sv4,

A2.sv + A1.sv, 1), 0),

10) AS INT), A3.n4) AND

A3.n5 <=

COALESCE(CAST(SUBSTRING(A2.srn + A1.srn,

NULLIF(CHARINDEX(A3.sv5,

A2.sv + A1.sv, 1), 0),

10) AS INT), A3.n5)

LEFT OUTER JOIN

AddendsStr AS A4

ON A4.s <= A3.s AND

A4.s <= S.total - A1.s - A2.s - A3.s AND

A4.n1 <=

COALESCE(

CAST(

SUBSTRING(A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A4.sv1,

A3.sv + A2.sv + A1.sv, 1), 0),

10) AS INT), A4.n1) AND

A4.n2 <=

COALESCE(

CAST(

SUBSTRING(A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A4.sv2,

A3.sv + A2.sv + A1.sv, 1), 0),

10) AS INT), A4.n2) AND

A4.n3 <=

COALESCE(

CAST(

SUBSTRING(A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A4.sv3,

A3.sv + A2.sv + A1.sv, 1), 0),

10) AS INT), A4.n3) AND

A4.n4 <=

COALESCE(

CAST(

SUBSTRING(A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A4.sv4,

A3.sv + A2.sv + A1.sv, 1), 0),

10) AS INT), A4.n4) AND

A4.n5 <=

COALESCE(

CAST(

SUBSTRING(A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A4.sv5,

A3.sv + A2.sv + A1.sv, 1), 0),

10) AS INT), A4.n5)

LEFT OUTER JOIN

AddendsStr AS A5

ON A5.s <= A4.s AND

A5.s = S.total - A1.s - A2.s - A3.s - A4.s AND

A5.n1 =

COALESCE(

CAST(

SUBSTRING(A4.srn + A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A5.sv1,

A4.sv + A3.sv + A2.sv + A1.sv,

1), 0),

10) AS INT), A5.n1) AND

A5.n2 =

COALESCE(

CAST(

SUBSTRING(A4.srn + A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A5.sv2,

A4.sv + A3.sv + A2.sv + A1.sv,

1), 0),

10) AS INT), A5.n2) AND

A5.n3 =

COALESCE(

CAST(

SUBSTRING(A4.srn + A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A5.sv3,

A4.sv + A3.sv + A2.sv + A1.sv,

1), 0),

10) AS INT), A5.n3) AND

A5.n4 =

COALESCE(

CAST(

SUBSTRING(A4.srn + A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A5.sv4,

A4.sv + A3.sv + A2.sv + A1.sv,

1), 0),

10) AS INT), A5.n4) AND

A5.n5 =

COALESCE(

CAST(

SUBSTRING(A4.srn + A3.srn + A2.srn + A1.srn,

NULLIF(CHARINDEX(A5.sv5,

A4.sv + A3.sv + A2.sv + A1.sv,

1), 0),

10) AS INT), A5.n5)

WHERE A1.s +

COALESCE(A2.s, 0) +

COALESCE(A3.s, 0) +

COALESCE(A4.s, 0) +

COALESCE(A5.s, 0) = S.total

-- View that returns all bin-packing solutions that use the

-- fewest bins

CREATE VIEW FewestBinPackingAddends

(s, bin_capacity, bin_tally,

v11, n11, v12, n12, v13, n13, v14, n14, v15, n15,

v21, n21, v22, n22, v23, n23, v24, n24, v25, n25,

v31, n31, v32, n32, v33, n33, v34, n34, v35, n35,

v41, n41, v42, n42, v43, n43, v44, n44, v45, n45,

v51, n51, v52, n52, v53, n53, v54, n54, v55, n55)

AS

SELECT s, T.bin_capacity, bin_tally,

v11, n11, v12, n12, v13, n13, v14, n14, v15, n15,

v21, n21, v22, n22, v23, n23, v24, n24, v25, n25,

v31, n31, v32, n32, v33, n33, v34, n34, v35, n35,

v41, n41, v42, n42, v43, n43, v44, n44, v45, n45,

v51, n51, v52, n52, v53, n53, v54, n54, v55, n55

FROM (SELECT bin_capacity, MIN(bin_tally)

FROM BinPackingAddends

GROUP BY bin_capacity) AS T(bin_capacity, fewest)

INNER JOIN

BinPackingAddends AS A

ON A.bin_tally = T.fewest AND

A.bin_capacity = T.bin_capacity

-- Given sample data, query for all addends where the bin

-- capacity is as specified. This returns 3733 rows in under

-- 20 sec on a 3.8 GHz P4 with 2 GB RAM running SQL Server 2005

-- Beta 2.

SELECT *

FROM BinPackingAddends

WHERE bin_capacity = 12

Let's look at one row returned (randomly chosen)

s = 27 -- sum of values in table V

bin_capacity = 12 -- as specified

bin_tally = 4 -- number of bins used by this result

1st bin = {(5,1),(7,1)} -- bins contain tuples of (value, occurrences)

2nd bin = {(5,1)}

3rd bin = {(5,1)}

4th bin = {(1,1), (2,2)}

-- The following query will return only those results from the

-- previous query that require the fewest bins (in this case, the

-- fewest bins needed is 3 and there are 218 results). This

-- executed in just over 30 sec.

SELECT *

FROM FewestBinPackingAddends

WHERE bin_capacity = 12

Let's look at one row returned (randomly chosen)

s = 27 -- sum of values in table V

bin_capacity = 12 -- as specified

bin_tally = 3 -- number of bins used by this result

1st bin = {(1,2),(2,1),(7,1)}

2nd bin = {(1,2),(2,2),(5,1)}

3rd bin = {(5,1)}

-- Note that if we wanted optimal solutions for different bin

-- capacities the query is straightforward, e.g.,

SELECT *

FROM FewestBinPackingAddends

WHERE bin_capacity BETWEEN 10 AND 12

--

JAG

Feb 21, 2005, 7:30:20 PM2/21/05

to

Sorry for not responding sooner. I did get a grasp on this most recent

code, although it appears to be incomplete. I added the last lines to

the CREATE VIEW statement and got it working, but I don't think it

actually finds the best combination of combinations. I see where it's

going with the extra columns to calculate the remaining occurances of

each value after they have been grouped... am I supposed to figure out

the rest myself?

code, although it appears to be incomplete. I added the last lines to

the CREATE VIEW statement and got it working, but I don't think it

actually finds the best combination of combinations. I see where it's

going with the extra columns to calculate the remaining occurances of

each value after they have been grouped... am I supposed to figure out

the rest myself?

In any case, I adapted your first code (John G.) to my situation, and I

think it works as good as I'm going to get it. I'd attach an example,

but I gotta leave for work. I'll do it later....

Thanks again all,

Sil

Feb 24, 2005, 8:48:16 PM2/24/05

to

OK, the basic idea behind my last post was right but the implementation

had some real problems. The code below should remedy that. There's a

view Addends that returns sets (rows) of natural numbers where each set

sums to a particular value. Then there's a view BinPackingAddends that

returns sets of bins where the collective sum of each set of bins is a

particular value and where each individual bin's sum is no greater than

another value, e.g., 12. Finally, the view FewestBinPackingAddends

returns all solutions from BinPackingAddends that use the fewest bins.

The code here is complete, runs with the data provided, and appears to

work properly.

As mentioned previously, this is a bin-packing problem, and thus NP-

complete, so performance with other than modest-sized input could be

long-running indeed. However, this solution is purely relational and

performs quite well for small inputs. Note that the solution allows

bin capacity to be specified, that is, it's not hard-coded to be 12.

VALUES (4, 3) -- 3 occurrences of 4

INSERT INTO V (i, n)

VALUES (7, 3) -- 3 occurrences of 7

-- Sum is 3*4 + 3*7 = 33 so each set of bins must sum to this value,

-- however, individual bins can have an arbitrary capacity, e.g., 12

-- Addends view

-- s : sum, that is, v1*n1 + ... + vn*nn

-- vi : ith value

-- ni : number of occurrences of vi used (ni <= mni)

-- mni : max number of occurrences of vi

CREATE VIEW Addends

(s, v1, n1, mn1, v2, n2, mn2, v3, n3, mn3, v4, n4, mn4, v5, n5, mn5)

AS

SELECT S.i,

V1.i, N1.i, V1.n,

V2.i, N2.i, V2.n,

V3.i, N3.i, V3.n,

V4.i, N4.i, V4.n,

V5.i, N5.i, V5.n

FROM N AS S

INNER JOIN

V AS V1

ON S.i > 0 AND

S.i <= (SELECT SUM(i*n)

FROM V) AND

-- BinPackingAddends view

<