Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How to represent category, subcategory, product

7 views
Skip to first unread message

TH

unread,
Nov 21, 2006, 2:39:06 PM11/21/06
to
I don't know if this counts as database theory or not so feel free to
tell me this is an inappropriate group for this post but anyway: can
anyone please offer any advice as to what would be the right pattern
(is pattern the right word?) to model the following scenario in a
relational database?

A category contains has many subcategories, a subcategory belongs to
one category. So:

CatTable: (CatID, CatName)
SubcatTable: (SubcatID, SubcatName, CatID)

So far so good. Now a product is either in a subcategory or directly in
a category with no subcategorisation. Of course if it's in a
subcategory, its category is implied by the subcategory's parent
category.

I could use:

ProdTable: (ProdID, ProdName, CatID, SubcatID)

But this doesn't seem quite right because it allows the possibility
that Prod.CatId != Prod.Subcat.CatId.

Is there a better way of representing this? I'd also be grateful if
anyone knows of a good resource for finding relational representations
of common scenarios like this.

Thanks! TH.

Gene Wirchenko

unread,
Nov 21, 2006, 3:22:32 PM11/21/06
to
"TH" <thin...@googlemail.com> wrote:

[snip]

>So far so good. Now a product is either in a subcategory or directly in
>a category with no subcategorisation. Of course if it's in a

Why not always have a subcategorisation?
cat, shorthair
cat, longhair
cat, other <-- your non-subcategorised
(cat, not otherwise classified)

>subcategory, its category is implied by the subcategory's parent
>category.

You might think of it as being at the top of a list, but that is
just presentation.

[snip]

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:
I have preferences.
You have biases.
He/She has prejudices.

Lennart

unread,
Nov 21, 2006, 3:43:13 PM11/21/06
to

TH wrote:
[...]

>
> But this doesn't seem quite right because it allows the possibility
> that Prod.CatId != Prod.Subcat.CatId.
>

Heres one way of handling this:

create table SubcatTable (
SubcatID ... not null primary key
, SubcatName ... not null
, CatID ... not null references Cattable...
, unique (SubcatID, CatID)
)

create table ProdTable (
ProdID ...,
ProdName ...,
CatID...,
SubcatID...,
foreign key (SubcatID, CatID)
references SubcatTable (SubcatID, CatID)
)

/Lennart

[...]

Neo

unread,
Nov 21, 2006, 4:00:15 PM11/21/06
to
> But this doesn't seem quite right because it allows the possibility that Prod.CatId != Prod.Subcat.CatId.

T_Cat (CatID)
T_Product (PrID)
T_ProdCat (PrID, CatID)

Derive subCats.

Lennart

unread,
Nov 21, 2006, 4:09:40 PM11/21/06
to

You've lost me here, how can you derive subcategory from PrID, CatID?

/Lennart

Neo

unread,
Nov 21, 2006, 4:19:13 PM11/21/06
to
> > Derive subCats.
>
> You've lost me here, how can you derive subcategory from PrID, CatID?

Suppose set1 has members A, B, C.
Suppose set2 has members B, C.
Here we can derive set2 is a proper subset of set1 because
every member of set2 is in set1
but not every member of set1 is in set2.

Lennart

unread,
Nov 21, 2006, 4:37:41 PM11/21/06
to

Agreed, but how do you map this to the relations category, subcategory
and product?

/Lennart

Neo

unread,
Nov 21, 2006, 4:55:21 PM11/21/06
to
> Agreed, but how do you map this to the relations category, subcategory and product?

T_Cat
1 Device
2 Computer
3 Printer

T_Prod
1 DellPc1
2 HpPr1

T_ProdCat
PROD CAT
DellPc1 Device
DellPc1 Computer
HpPr1 Device
HpPr1 Printer

Based on above data, we can derive Computer and Printer are currently
subCats of Device,
because:

Device = {DellPc1, HpPr1}
Computer = {DellPc1}
Printer = {HpPr1}

Computer and Printer are proper subcategories of Device.

Bob Badour

unread,
Nov 21, 2006, 4:58:33 PM11/21/06
to
TH wrote:

> I don't know if this counts as database theory or not so feel free to
> tell me this is an inappropriate group for this post but anyway: can
> anyone please offer any advice as to what would be the right pattern
> (is pattern the right word?) to model the following scenario in a
> relational database?

Design requires tradeoffs to balance the needs of all your requirements.
I assume the vast majority of your requirements are missing from the
very simple problem statement you gave.

To design well, one must be able to devise any number of designs and
evaluate the consequences of different designs.

TH

unread,
Nov 21, 2006, 6:47:21 PM11/21/06
to
Wow, what a great response! I waited days with no reply on a SQL Server
group, I should have come here first.

Thanks Gene, Lennart, Neo, Bob for your suggestions, some very helpful
ideas there. As Bob suggests, this was something of a reduction of my
current requirements so I'll have to consider how the various options
sit with the rest of the design, but I can see how each of approaches
would be appropriate in situations I've come across.

I think Lennart's method fits the best in this case. I must admit I
didn't realise it was possible to have a two-field foreign key.
Although for some reason which I can't fathom, this method appears to
allow a Product to reference a non-existent Category or Subcategory as
long at the second part of the key is a null. I don't know if it's
possible to tighten up this aspect?

Neo, your suggestion is intriguing, I'll be looking into it. It sounds
a very unusual approach. Is it? It seems you'd need a pretty heavy
piece of SQL every time you wanted to reference the derived table of
subcategories, and I don't suppose it would be possible for another
table to have a foreign key onto the derived table, i.e. for an entity
which could only ever be a direct member of a subcategory, not a top
level category.

I'm surprised how difficult it's been to find a decent resource which
helps teach these building blocks of relational database design, I
haven't found any database counterparts of the libraries of algorithms
or design patterns that you find in the programming world. Everything
I've seen in the way of database tutorials so far seems to teach you
normal form and send you on your way. Surely there must be some way to
break into this circle without nagging the sages of usenet!

Thanks all, TH.

Neo

unread,
Nov 21, 2006, 9:41:00 PM11/21/06
to
> Neo, your suggestion is intriguing, I'll be looking into it.
> It sounds a very unusual approach. Is it?

Initial categories/subcats
Person
Engineer
Doctor
Animal
Dog
Cat

Later comes along a martian. He can perform all the functions of a
doctor. What category does he go under? Later comes a toy dog made by
Sony. It barks and retrieves the newspaper. What category does it go
under?

Lennart

unread,
Nov 22, 2006, 12:31:55 AM11/22/06
to

Neo wrote:
[...]

> Based on above data, we can derive Computer and Printer are currently
> subCats of Device,
> because:
>
> Device = {DellPc1, HpPr1}
> Computer = {DellPc1}
> Printer = {HpPr1}
>
> Computer and Printer are proper subcategories of Device.

Now I see you, once again :-)

Lennart

unread,
Nov 22, 2006, 9:11:50 AM11/22/06
to

TH wrote:
> Wow, what a great response! I waited days with no reply on a SQL Server
> group, I should have come here first.
>
> Thanks Gene, Lennart, Neo, Bob for your suggestions, some very helpful
> ideas there. As Bob suggests, this was something of a reduction of my
> current requirements so I'll have to consider how the various options
> sit with the rest of the design, but I can see how each of approaches
> would be appropriate in situations I've come across.
>
> I think Lennart's method fits the best in this case. I must admit I
> didn't realise it was possible to have a two-field foreign key.
> Although for some reason which I can't fathom, this method appears to
> allow a Product to reference a non-existent Category or Subcategory as
> long at the second part of the key is a null. I don't know if it's
> possible to tighten up this aspect?
>

Add another fk pointing to category.

create table ProdTable (
ProdID ...,
ProdName ...,
CatID...,
SubcatID...,
foreign key (SubcatID, CatID)

references SubcatTable (SubcatID, CatID),
foreign key (CatID)
references catTable
)

now if subcatid is null, catid must still exist


/Lennart

kvnkr...@gmail.com

unread,
Nov 22, 2006, 1:28:19 PM11/22/06
to
Cat:
CatID
CatName
ParentCatID

Prod:
ProdID
ProdName
CatID

You can derive levels of subcategorization via closure on ParentCatID
-> CatID (a NULL ParentCatID indicates a root "category").

JOG

unread,
Nov 22, 2006, 5:37:27 PM11/22/06
to

> > Computer and Printer are proper subcategories of Device.Now I see you, once again :-)

I'd take neo's responses with a pinch of salt TH. He is promoting his
own pet project. In terms of a serious answer to your question, I echo
the sentiments that your particular modelling needs will affect the
correct implementation, and that without any understanding of them
suggestions here are going to be completely in the dark. However, for
what it is worth I have learnt that reducing my problem to an
underlying set of propositions and analyzing them often helps me far
more than thinking in terms of tables, connections and categorizations.
Consider a contrived category/sub_category set of data such as:

* bennie is a member of the set of cats and of the set of mammals.
* tc is a member of the set of cats and of the set of mammals.
* yogi is a member of the set of bears and of the set of mammals.
* kermit is a member of the set of frogs and of the set of amphibians.
* nemo is a member of the set of fish

By normalizing to 3NF it becomes clear that you are actually dealing
with two distinct predicate structures:

* bennie is a cat.
* tc is a cat.
* yogi is a bear.
* kermit is a frog.
* nemo is a fish.

* All cats are mammals.
* All bears are mammals.
* All frogs are amphibians.

or written as relations:

R1(id, category) = { (bennie, cat), (tc, cat), (yogi, bear), (kermit,
frog), (nemo, fish) }
R2(sub_category, super_category) = { (cat, mammal), (bear, mammal),
(frog, amphibian) }

This seems an intuitive and logical approach to me. To get a list of
everything in a subcategory:
SELECT * FROM R1 INNER JOIN R2 ON (R1.category = R2.sub_category)

To get a list of everything not in a subcategory:
SELECT * FROM R1 ANTI JOIN R2 ON (R1.category = R2.sub_category)
or:
SELECT * FROM R1 WHERE NOT EXISTS (SELECT * FROM R2 WHERE R1.category
= R2.sub_category)

To get a list of every item and its root parent category:
SELECT id, super_category AS category FROM R1 INNER JOIN R2 ON
(R1.category = R2. sub_category)
UNION
SELECT id, category FROM R1 ANTI JOIN R2 ON (R1.category = R2.
sub_category)

I emphasize that this may not be the correct approach for what /your
specific situation/ requires so do not take it as a prescription, but
rather merely as food for thought.

Lennart

unread,
Nov 23, 2006, 1:56:30 AM11/23/06
to

JOG wrote:
[...]

> I'd take neo's responses with a pinch of salt TH. He is promoting his
> own pet project.

Whether the idea is good or bad where irrelevant to me at that point. I
didnt understand it at all and asked Neo to explain, which he did.

[...]


/Lennart

paul c

unread,
Nov 23, 2006, 8:28:11 AM11/23/06
to
JOG wrote:
...

> * kermit is a member of the set of frogs and of the set of amphibians.
...

> By normalizing to 3NF it becomes clear that you are actually dealing
> with two distinct predicate structures:
...

> * All frogs are amphibians.
...

I like your point that making up tables without first forming the
predicates is jumping the gun but in this example I don't see how the
frogs are necessarily a subset of the amphibians (ignoring outside
information).

p

Bob Badour

unread,
Nov 23, 2006, 8:35:32 AM11/23/06
to
paul c wrote:

If some frogs are not amphibians, then we don't have a
category/subcategory problem in the first place.

-CELKO-

unread,
Nov 23, 2006, 8:50:51 AM11/23/06
to
The classic scenario calls for a root class with all the common
attributes and then specialized sub-classes under it. As an example,
let's take the class of Vehicles and find an industry standard
identifier (VIN), and add two mutually exclusive sub-classes, Sport
utility vehicles and sedans ('SUV', 'SED').

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
UNIQUE (vin, vehicle_type),
..);

Notice the overlapping candidate keys. I then use a compound candidate
key (vin, vehicle_type) and a constraint in each sub-class table to
assure that the vehicle_type is locked and agrees with the Vehicles
table. Add some DRI actions and you are done:

CREATE TABLE SUV
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
CHECK(vehicle_type = 'SUV'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type = 'SED'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

I can continue to build a hierarchy like this. For example, if I had a
Sedans table that broke down into two-door and four-door sedans, I
could a schema like this:

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type IN ('2DR', '4DR', 'SED')),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE TwoDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
CHECK(vehicle_type = '2DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE FourDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
CHECK(vehicle_type = '4DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans (vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

The idea is to build a chain of identifiers and types in a UNIQUE()
constraint that go up the tree when you use a REFERENCES constraint.
Obviously, you can do variants of this trick to get different class
structures.

If an entity doesn't have to be exclusively one subtype, you play with
the root of the class hierarchy:

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
PRIMARY KEY (vin, vehicle_type),
..);

Now start hiding all this stuff in VIEWs immediately and add an INSTEAD
OF trigger to those VIEWs.

JOG

unread,
Nov 23, 2006, 10:35:43 AM11/23/06
to
paul c wrote:

See bob's response, but also consider that we must always use outside
information. A relation value is just a single member of a (perhaps
infinite) set of all values for a relvar. And one certainly cannot
construct an ideal schema from considering a single relation value in
isolation (my point is that it can give you a lot of pointers
nonetheless). But yes, because enumerating the complete set of possible
rel values is also often unachievable, one finds the middleground,
using heuristics and external knowledge to make our best effort of
organizing the data. All best, J.

Neo

unread,
Nov 23, 2006, 10:55:19 AM11/23/06
to
> The classic scenario calls for a root class with all the common
> attributes and then specialized sub-classes under it. As an example,
> let's take the class of Vehicles and find an industry standard
> identifier (VIN), and add two mutually exclusive sub-classes, Sport
> utility vehicles and sedans ('SUV', 'SED').

Below is an alternative solution using dbd where a vehicle can have 0
to many classifications. Disadv: subClasses need to be derived. Adv:
flexibility to classify things as desired.

(new 'vin)

(new 'vehicle0)
(set vehicle0 vin (val+ '@1234))

(new 'vehicle1 'car '1-door)
(set vehicle1 vin (val+ '45?67))

(new 'vehicle2 'car 'suv)
(set vehicle2 vin (val+ 'JFH#3638))

(new 'vehicle3 'car 'sedan '5-door)
(set vehicle3 vin (val+ '8-9-A0))

(new 'vehicle4 'car 'suv 'sedan)
(set vehicle4 vin (val+ '3G$67))

(new 'vehicle5 'car 'boat 'suv 'sedan)
(set vehicle5 vin (val+ 'NH&74H6))

(new 'vehicle6 'boat 'airplane 'suv 'sedan 'rv '9-door)
(set vehicle6 vin (val+ 'N5*J4H6))

(new 'vehicle7 'car 'economy 'beetle 'bug)
(set vehicle7 vin (val+ 'VW=578))

(; Find suvs)
(; Returns vehicle2, vehicle4, vehicle5, vehicle6)
(get suv instance *)

Lennart

unread,
Nov 23, 2006, 12:08:59 PM11/23/06
to

Neo wrote:
[...]

> Below is an alternative solution using dbd where a vehicle can have 0
> to many classifications. Disadv: subClasses need to be derived.

With the possibility that I missunderstoud your proposal, is'nt the
risk of adding new subclasses by mistake (example sedan, Sedan sEdan,
etc), another disadv. ?

/Lennart

[...]

Neo

unread,
Nov 23, 2006, 12:46:42 PM11/23/06
to
> is'nt the risk of adding new subclasses by mistake
> (example sedan, Sedan sEdan, etc), another disadv. ?

Via the new function, such a mistake is possible. The new function
creates a new thing, names it and classifies it as specified (ie
sedan). If the specified class (ie sEdan) does not exist, it will
create it.

To avoid this situation, use the following method:
(new 'vehicle8)
(set sedan instance vehicle8)

Via above method, the following will not be accepted ...
(set sEdan instance vehicle8)

... and dbd prompts user with "What is sEdan?"

For more info, see www.dbfordummies.com/nli/new.asp

0 new messages