here is a short description of the schema :
- I've orders with an amount
- An order can be associated with several deliveries, each one having
also an amount (a delivery has a FK referencing the order).
I would like to produce following result that joins both tables without
repeating order amounts :
Order A $100 Develiery1 $50
Develiery2 $30
Develiery3 $20
Order B $200 Develiery4 $200
Order C $120 Develiery5 $70
Develiery6 $50
....
I would like to find a select request without using PL/SQL procedure if
it is possible.
Thanks a lot for your help.
> here is a short description of the schema :
I don't use Oracle, but your question doesn't seem to have anything
Oracle-specific. I'll answer using PostgreSQL.
> - I've orders with an amount
pif_orders=> CREATE TABLE stock_order (
code VARCHAR NOT NULL,
amount NUMERIC(10,2) NOT NULL,
PRIMARY KEY (code)
);
pif_orders=> INSERT INTO stock_order (code, amount) VALUES
('A', 100.00),
('B', 200.00),
('C', 120.00);
> - An order can be associated with several deliveries, each one having
> also an amount (a delivery has a FK referencing the order).
pif_orders=> CREATE TABLE stock_delivery (
order_code VARCHAR NOT NULL
REFERENCES stock_order (code),
sequence INTEGER NOT NULL,
amount NUMERIC(10,2),
PRIMARY KEY (order_code, sequence)
);
pif_orders=> INSERT INTO stock_delivery (order_code, sequence, amount)
VALUES
('A', 1, 50.00),
('A', 2, 30.00),
('A', 3, 20.00),
('B', 1, 200.00),
('C', 1, 70.00),
('C', 2, 50.00);
> I would like to produce following result that joins both tables
> without repeating order amounts :
>
> Order A $100 Develiery1 $50
> Develiery2 $30
> Develiery3 $20
> Order B $200 Develiery4 $200
> Order C $120 Develiery5 $70
> Develiery6 $50
That's not a relation (some tuples are incomplete), so it's not
something you should expect the relational DBMS to produce. Sometimes a
particular DBMS product might have something that can be distorted to do
that kind of thing, but it will always be limited and usually lead to
frustration since it's not the role of the DBMS to produce generic
reports.
Instead, use a ‘SELECT’ to produce the relation:
pif_orders=> SELECT DISTINCT
order_code,
order_amount,
delivery_sequence,
delivery_amount
FROM
(
SELECT
code AS order_code,
amount AS order_amount
FROM stock_order
) AS o
NATURAL JOIN
(
SELECT
order_code,
sequence AS delivery_sequence,
amount AS delivery_amount
FROM stock_delivery
) AS d
ORDER BY order_code, delivery_sequence;
order_code | order_amount | delivery_sequence | delivery_amount
------------+--------------+-------------------+-----------------
A | 100.00 | 1 | 50.00
A | 100.00 | 2 | 30.00
A | 100.00 | 3 | 20.00
B | 200.00 | 1 | 200.00
C | 120.00 | 1 | 70.00
C | 120.00 | 2 | 50.00
(6 rows)
and then use a function or user application to process the relation into
whatever output text you like for the report. Turning relations into
reports isn't the job of the DBMS.
--
\ “It's up to the masses to distribute [music] however they want |
`\ … The laws don't matter at that point. People sharing music in |
_o__) their bedrooms is the new radio.” —Neil Young, 2008-05-06 |
Ben Finney
Ben Finney wrote on 28.11.2009 00:12:
> pif_orders=> SELECT DISTINCT
> order_code,
> order_amount,
> delivery_sequence,
> delivery_amount
> FROM
> (
> SELECT
> code AS order_code,
> amount AS order_amount
> FROM stock_order
> ) AS o
> NATURAL JOIN
> (
> SELECT
> order_code,
> sequence AS delivery_sequence,
> amount AS delivery_amount
> FROM stock_delivery
> ) AS d
> ORDER BY order_code, delivery_sequence;
This can be done a lot easier including the surpressing of repeating values
(works in Postgres and Oracle):
select case
when lag(o.code) over (partition by o.code order by d.sequence) = o.code then ''
else o.code
end,
case
when lag(o.code) over (partition by o.code order by d.sequence) = o.code then null
else o.amount
end as order_amount,
d.sequence as delivery_sequence,
d.amount as delivery_amount
from stock_order o
join stock_delivery d on o.code = d.order_code
order by o.code, d.sequence;
And even if the "blank" stuff should be in there I don't understand the complicated sub-query you were using :)
Thomas
> And even if the "blank" stuff should be in there I don't understand
> the complicated sub-query you were using :)
For didactic purposes. I prefer to have attribute names that make sense
in the context of the relation; and then use attribute renames in the
select so that a natural join will just work. Join conditions give me
hives; every join should be natural anyway :-)
--
\ “I fly Air Bizarre. You buy a combination one-way round-trip |
`\ ticket. Leave any Monday, and they bring you back the previous |
_o__) Friday. That way you still have the weekend.” —Steven Wright |
Ben Finney
I agree :)
> Join conditions give me hives; every join should be natural anyway :-)
Hmm, natural joins give _me_ the creeps.
They are not relying on FK relations but on naming conventions. Now if I have a column ID in table product and that is referenced by a product_id in another table, the natural join would yield a cartesian join...
> Thomas Kellerer <OTPXDA...@spammotel.com> writes:
>
>> And even if the "blank" stuff should be in there I don't understand
>> the complicated sub-query you were using :)
>
> For didactic purposes. I prefer to have attribute names that make sense
> in the context of the relation; and then use attribute renames in the
> select so that a natural join will just work. Join conditions give me
> hives; every join should be natural anyway :-)
Hmm. Interesting assertion. Why avoid things like outer joins?
I've recently been working on improving the performance of an
application and by far, the most beneficial performance gains I've made
has been from replacing really inefficient sub-queiries with a much
simpler and more efficient outer join.
From experience, I know that people seem to have problems with things
like outer joins, but I've never really understood what the issue is. I
could understand some confusion with older versions of Oracle that had
its own oracle method of defining outer joins, but now that it is more
standards compliant in its join syntax, I don't see what the problem
is. I also know you can get into trouble when working with different
RDMS as they can behave differently depending on how the join is
defined, which can be a little frustrating, but provided adequate
testing, is not a huge issue.
However, I find it odd that people will use a more complex and usually
inefficient sub-query rather than a simpler join. Am I missing something
here?
Tim
--
tcross (at) rapttech dot com dot au
> Ben Finney wrote on 28.11.2009 01:36:
> > Thomas Kellerer <OTPXDA...@spammotel.com> writes:
> >
> >> And even if the "blank" stuff should be in there I don't understand
> >> the complicated sub-query you were using :)
> >
> > For didactic purposes. I prefer to have attribute names that make sense
> > in the context of the relation; and then use attribute renames in the
> > select so that a natural join will just work.
>
> I agree :)
>
> > Join conditions give me hives; every join should be natural anyway :-)
>
> Hmm, natural joins give _me_ the creeps.
>
> They are not relying on FK relations
Right, foreign key relations are database constraints, not join
conditions. You can't rely on a foreign key anyway, you need to specify
it either with an explicit join condition or (as I've done) an explicit
name for a natural join.
> but on naming conventions.
Note that I'm explicitly using distinct attribute names in the select
operation, regardless of what the attributes are named in the base
relation. That also ensures that all the names in the ‘SELECT’ clause,
the ‘WHERE’ clause, et cetera are all unique without further
qualification.
That is, instead of:
SELECT DISTINCT
stock_order.code,
stock_item.code,
stock_item.name,
stock_order.amount,
stock_delivery.code,
stock_delivery.amount
FROM
stock_order
INNER JOIN stock_delivery
ON stock_order.code = stock_delivery.order_code
INNER JOIN stock_item
ON stock_order.item_code = stock_item.code
INNER JOIN stock_category
ON stock_item.category_code = stock_category.code
WHERE
stock_order.amount > 100
AND stock_category.name = 'widget'
I'm using names that are both shorter and don't need to be qualified
further:
SELECT DISTINCT
order_code,
item_code,
item_name,
order_amount,
delivery_code,
delivery_amount
FROM
(
SELECT DISTINCT
code AS order_code,
item_code,
amount AS order_amount
FROM stock_order
) AS o
NATURAL JOIN (
SELECT DISTINCT
code AS delivery_code,
amount AS delivery_amount
FROM stock_delivery
) AS d
NATURAL JOIN (
SELECT DISTINCT
code AS item_code,
category_code,
name AS item_name
FROM stock_item
) AS i
NATURAL JOIN (
SELECT DISTINCT
code AS category_code,
name AS category_name
FROM stock_category
) AS c
WHERE
order_amount > 100
AND category_name = 'widget'
I find that both more explicit, less repetitious, and much clearer, than
a bunch of table qualifications and join conditions; especially since
the same technique continues to work for arbitrarily complex
sub-selects.
--
\ “Pray, v. To ask that the laws of the universe be annulled in |
`\ behalf of a single petitioner confessedly unworthy.” —Ambrose |
_o__) Bierce, _The Devil's Dictionary_, 1906 |
Ben Finney
> Ben Finney <bignose+h...@benfinney.id.au> writes:
>
> > I prefer to have attribute names that make sense in the context of
> > the relation; and then use attribute renames in the select so that a
> > natural join will just work. Join conditions give me hives; every
> > join should be natural anyway :-)
>
> Hmm. Interesting assertion. Why avoid things like outer joins?
Because NULL in query results makes them non-relational.
> I've recently been working on improving the performance of an
> application and by far, the most beneficial performance gains I've
> made has been from replacing really inefficient sub-queiries with a
> much simpler and more efficient outer join.
Well, you're the one with that evidence and I don't have it to examine.
I won't deny that you've experienced what you claim.
But I would say two things about that: either the sub-selects *could* be
written a whole lot more efficiently that they were, before resorting to
outer joins; or the DBMS product isn't doing its job of making the best
execution plan from a declarative query.
> However, I find it odd that people will use a more complex and usually
> inefficient sub-query rather than a simpler join. Am I missing
> something here?
I find natural joins with distinctly-named attributes from sub-selects
much clearer, not least because they are more explicit. There's no need
to visualise the different styles of join; the rules for a natural join
are clear.
As for efficiency: If two query expressions are semantically identical,
the DBMS should recognise that and yield identical execution plans. I
consider it to have a bug if that's not the case.
--
\ “When I was a kid I used to pray every night for a new bicycle. |
`\ Then I realised that the Lord doesn't work that way so I stole |
_o__) one and asked Him to forgive me.” —Emo Philips |
Ben Finney
order_id sum delivery_id cost
> Order A $100 Develiery1 $50
> Develiery2 $30
> Develiery3 $20
> Order B $200 Develiery4 $200
> Order C $120 Develiery5 $70
> Develiery6 $50
> ....
>
>
> I would like to find a select request without using PL/SQL procedure if
> it is possible.
perhaps something like this. (untested)
SELECT o.order_id,o_tot,d_delivery_id,d.cost
FROM ( select order.id, sum(cost) as tot, min(delivery_id) from order
join delivery on delivery.order_id = order.order_id ) as o
RIGHT OUTER JOIN delivery on delivery.order_id = o.order_id
ORDER BY delivery.order_id,delivery_id
but it does not strike me as a good idea.
>
> and then use a function or user application to process the relation into
> whatever output text you like for the report. Turning relations into
> reports isn't the job of the DBMS.
>
So what is the job of the DBMS? From wikipedia (just 'cause it was
the first thing that popped up in google, you are welcome to cite
others):
"A Database Management System (DBMS) is a set of computer programs
that controls the creation, maintenance, and the use of the database
with computer as a platform or of an organization and its end users."
Turning relations into reports may or may not be the job of the
database, but by definition it quite well can be the job of the
Database Management System. In fact, I would go so far as to say
using a DBMS as a bit bucket is a mistake I've seen often. There are
some things apps may do better, and some things the DBMS may do
better. Which things may be debatable, but saying either should do
all or neither is wrong.
Can't a function be part of the DBMS?
jg
--
@home.com is bogus.
http://radar.oreilly.com/2009/11/steve-souders-making-web-sites.html
Yet it does have something Oracle-specfic: the ability, through
SQL*Plus, to suppress repeating output, and analytic functions to
compute the sum in-line so the stock_order table isn't necessary
(columns are renamed, as necessary, to remove reserved words):
break on order_cde on ttl_ord_amt
select 'Order '||order_code as order_cde.
sum(delivery_amount) over (partition by order_code order by
order_seq) as ttl_ord_amt,
'Delivery '||delivery_seq as delivery_seq,
'$'||amount
from stock_delivery
order by 1, 3;
Of course the stock_order table can be used:
break on order_cde on ttl_ord_amt
select 'Order '||d.order_code as order_cde.
o.amoount as ttl_ord_amt,
'Delivery '||d.delivery_seq as delivery_seq,
'$'||d.amount
from stock_delivery d join stock_order o on (d.order_code = o.code)
order by 1, 3;
There is nothing wrong with the relation as all 'tuples' are complete;
you're confusing display magic with missing data and there are no
missing values in that result set, simply suppressed output courtesy
of SQL*Plus.
David Fitzjarrell
You don't need SQL*Plus for the formatting, analytical functions are enough
But you have a good point about the stock_order table not being necessary:
select case
when lag(order_code) over (partition by order_code order by delivery_sequence) = order_code then ''
else order_code
end as display_code,
case
when lag(order_code) over (partition by order_code order by delivery_sequence) = order_code then null
else sum(amount) over (partition by order_code)
end as order_amount,
delivery_sequence,
delivery_amount
from stock_delivery
order by order_code, delivery_sequence
Runs on PostgreSQL and DB2 just as well.
Thomas
> On Nov 27, 3:12 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
> wrote:
>
> > and then use a function or user application to process the relation
> > into whatever output text you like for the report. Turning relations
> > into reports isn't the job of the DBMS.
>
> So what is the job of the DBMS?
The job of a relational DBMS is to implement the relational model, by
implementing a logical set of rlations in a physical database
implementation, and to provide an interface to applications for managing
and querying the relations in that database. The output of those
operations should be either status messages or new relations.
> From wikipedia (just 'cause it was the first thing that popped up in
> google, you are welcome to cite others):
I'm paraphrasing my understanding of the writings of E. F. Codd, C. J.
Date, and Hugh Darwen.
> "A Database Management System (DBMS) is a set of computer programs
> that controls the creation, maintenance, and the use of the database
> with computer as a platform or of an organization and its end users."
That seems fair, since it's not restricted to relational DBMS. I wasn't
specific before about the fact that we're talking about *relational*
database management systems behaviour, but I believe that's been the
topic consistently through this thread.
> Turning relations into reports may or may not be the job of the
> database
It's not; the job of the database is to be an implementation of a data
model. As you imply by making the distinction, DBMS ≠ database.
> but by definition it quite well can be the job of the Database
> Management System.
Not the relational DBMS. A DBMS might do *more* than the job of a
relational DBMS, but the distinction I'm making is that one should not
expect to find that behaviour in a relational DBMS since that's not its
job.
> There are some things apps may do better, and some things the DBMS may
> do better. Which things may be debatable, but saying either should do
> all or neither is wrong.
The output of a query into a relational DBMS should always be a relation
(or a status message). It's up to the application to use non-relational
means to turn that relation into whatever other form it requires.
> Can't a function be part of the DBMS?
A non-relational function shouldn't be part of a relational DBMS, no. An
application can *store* its non-relational functions in the *database*,
of course, and access them there; but database ≠ DBMS.
--
\ “Those who write software only for pay should go hurt some |
`\ other field.” —Erik Naggum, in _gnu.misc.discuss_ |
_o__) |
Ben Finney
> But you have a good point about the stock_order table not being
> necessary:
[…]
> Runs on PostgreSQL and DB2 just as well.
Interesting. But you've changed it from a relational to a non-relational
result (some tuples now contain nulls). It also had some other errors
(referencing names that weren't defined).
I'm new to windowing operations (thank you for the motivation to learn
them), but I think this is an improvement of both the above points, plus
a bit clearer for me to read:
SELECT
CASE
WHEN (
LAG(order_code) OVER order_sequence
) = order_code THEN ''
ELSE order_code
END AS display_code,
CASE
WHEN (
LAG(order_code) OVER order_sequence
) = order_code THEN ''
ELSE CAST (
SUM(delivery_amount) OVER (PARTITION BY order_code)
AS VARCHAR)
END AS order_amount,
delivery_sequence,
delivery_amount
FROM (
SELECT
order_code,
sequence AS delivery_sequence,
amount AS delivery_amount
FROM stock_delivery
) AS d
WINDOW order_sequence AS (
PARTITION BY order_code ORDER BY delivery_sequence ASC)
ORDER BY order_code ASC, delivery_sequence ASC;
PostgreSQL 8.4 gives me:
=====
display_code | order_amount | delivery_sequence | delivery_amount
--------------+--------------+-------------------+-----------------
A | 100.00 | 1 | 50.00
| | 2 | 30.00
| | 3 | 20.00
B | 200.00 | 1 | 200.00
C | 120.00 | 1 | 70.00
| | 2 | 50.00
(6 rows)
=====
--
\ “Pinky, are you pondering what I'm pondering?” “Uh, I think so, |
`\ Brain, but I get all clammy inside the tent.” —_Pinky and The |
_o__) Brain_ |
Ben Finney
>joel garry <joel-...@home.com> writes:
[snip]
>> Can't a function be part of the DBMS?
>
>A non-relational function shouldn't be part of a relational DBMS, no. An
>application can *store* its non-relational functions in the *database*,
>of course, and access them there; but database ? DBMS.
Addition of integral values is a function mapping two integral
values to an integral value. Note the lack of a relation in the RDBMS
sense. I think it would be a very useful thing for an RDBMS to have
it.
(How do you more precisely make the division between in and out?)
Sincerely,
Gene Wirchenko
> On Tue, 01 Dec 2009 08:50:34 +1100, Ben Finney
> <bignose+h...@benfinney.id.au> wrote:
>
> >joel garry <joel-...@home.com> writes:
>
> [snip]
>
> >> Can't a function be part of the DBMS?
> >
> >A non-relational function shouldn't be part of a relational DBMS, no.
This was far too broad, I now see. Originally it was in the context of a
*query* returning a non-relational result, which is really as far as I
should have taken it.
> Addition of integral values is a function mapping two integral
> values to an integral value.
Right; of course, there are heaps of functions operating on attribute
values that can be used *within* relational operations to modify the
relation that will be returned. Such functions definitely belong as part
of the relational DBMS.
What I was trying to express was that relational operations — like the
various relational operations that ‘SELECT’ implements — should only
return data as relations (they might also return status responses). They
should never return non-relation data.
Transforming relational data into a non-relation is not the job of the
relational DBMS, but the job of applications that receive relational
data from the DBMS.
--
\ “Anyone who believes exponential growth can go on forever in a |
`\ finite world is either a madman or an economist.” —Kenneth |
_o__) Boulding |
Ben Finney
>Gene Wirchenko <ge...@ocis.net> writes:
>
>> On Tue, 01 Dec 2009 08:50:34 +1100, Ben Finney
>> <bignose+h...@benfinney.id.au> wrote:
>>
>> >joel garry <joel-...@home.com> writes:
>>
>> [snip]
>>
>> >> Can't a function be part of the DBMS?
>> >
>> >A non-relational function shouldn't be part of a relational DBMS, no.
>
>This was far too broad, I now see. Originally it was in the context of a
>*query* returning a non-relational result, which is really as far as I
>should have taken it.
I think you still have problems with the division. (I do not
claim to be able to define the split myself. I know how difficult it
can be to rigourously define something.)
>> Addition of integral values is a function mapping two integral
>> values to an integral value.
>
>Right; of course, there are heaps of functions operating on attribute
>values that can be used *within* relational operations to modify the
>relation that will be returned. Such functions definitely belong as part
>of the relational DBMS.
>
>What I was trying to express was that relational operations � like the
>various relational operations that �SELECT� implements � should only
>return data as relations (they might also return status responses). They
>should never return non-relation data.
A relation does not have order. This would not be a relation
select clicode,cliname from clients order by clicode
because of the order by clause.
Having to go through a sort stage when the DBMS might well have
been able to easily handle it would be counterproductive.
For example, a group by in SQL forces the result to be sorted by
the grouping unless otherwise overridden.
>Transforming relational data into a non-relation is not the job of the
>relational DBMS, but the job of applications that receive relational
>data from the DBMS.
See above about sorting.
Sincerely,
Gene Wirchenko
> On Tue, 01 Dec 2009 13:45:44 +1100, Ben Finney
> <bignose+h...@benfinney.id.au> wrote:
> >What I was trying to express was that relational operations — like
> >the various relational operations that ‘SELECT’ implements — should
> >only return data as relations (they might also return status
> >responses). They should never return non-relation data.
>
> A relation does not have order. This would not be a relation
> select clicode,cliname from clients order by clicode
> because of the order by clause.
Hmm. I'm not sure it's right to say the result would not *be* a
relation; but I certainly take the point about ‘ORDER BY’ requesting
order be imposed on an orderless relation.
I guess I would want to say that a ‘SELECT’ result, though it is ordered
(either implicitly or at the request of the query author), is
nevertheless *compatible with* a relation, as distinct from the way that
the desired “omit some of the values to make the report look closer to
what I want” that started this thread is incompatible with relations.
> Having to go through a sort stage when the DBMS might well have
> been able to easily handle it would be counterproductive.
This is certainly a good point, and I agree.
I'm glad I've been asked about my definitions, it has led to some
clarification. Does anyone have a better definition they'd like to offer
of what “the job of a relational DBMS” is?
--
\ “I do not believe in immortality of the individual, and I |
`\ consider ethics to be an exclusively human concern with no |
_o__) superhuman authority behind it.” —Albert Einstein, letter, 1953 |
Ben Finney
Well that was the initial question, wasn't it? Surpress the repating values.
And it is the same solution which I have already posted, just without the join (but the basic idea and technique ist the same)
It also had some other errors
> (referencing names that weren't defined).
I changed the names so that they would not conflict with reserved words.
I rather rename columns than create unnecessary sub-selects :)
Putting the partition into the WINDOW clause at the end will only work for Postgres, not for Oracle or DB2 (they don't support that). An for just a single usage of the window it doesn't add to the clarity in my opinion. But formatting (and the usage of unnecessary sub-select) is a matter of personal style!
Regards
Thomas
My take is that a Natural join should never be used. Queries written
using natural joins are diasters waiting to happen. All you need is
for someone to add a new column name to one of the tables that also
happens to have a corresponding but probably unrelated column with the
same name in the other table and now you join is garbage.
IMHO -- Mark D Powell --
And perhaps relvars can torture non-relations to relations. But this
is all too deep for me. I mostly agree with you guys. There should
be clear delineations between relational and non-relational parts of
systems, as well as standard SQL versus extensions and procedurals.
>
> >> Addition of integral values is a function mapping two integral
> >> values to an integral value.
>
> >Right; of course, there are heaps of functions operating on attribute
> >values that can be used *within* relational operations to modify the
> >relation that will be returned. Such functions definitely belong as part
> >of the relational DBMS.
>
> >What I was trying to express was that relational operations — like the
> >various relational operations that ‘SELECT’ implements — should only
> >return data as relations (they might also return status responses). They
> >should never return non-relation data.
>
> A relation does not have order. This would not be a relation
> select clicode,cliname from clients order by clicode
> because of the order by clause.
>
> Having to go through a sort stage when the DBMS might well have
> been able to easily handle it would be counterproductive.
>
> For example, a group by in SQL forces the result to be sorted by
> the grouping unless otherwise overridden.
This is certainly wrong in Oracle, and ought to be wrong everywhere
else. I don't say that as an Oracle wonk, but rather because ordering
needs to be explicit by definition.
jg
--
@home.com is bogus.
I thought that one was about the order of attributes, not rows.
>>
>> Having to go through a sort stage when the DBMS might well have
>> been able to easily handle it would be counterproductive.
>>
>> For example, a group by in SQL forces the result to be sorted by
>> the grouping unless otherwise overridden.
>
> This is certainly wrong in Oracle, and ought to be wrong everywhere
> else. I don't say that as an Oracle wonk, but rather because ordering
> needs to be explicit by definition.
>
> jg
> --
> @home.com is bogus.
> http://codeoffsets.com/
>
Shakespeare
Never has been true. Group by does *not* sort the result.
Not even in Oracle 8 and and certainly not for any Oracle version > 9
(and not for any Postgres as well)
Thomas
Kinda. The tuples of a relation (AKA rows in a table) have no order,
which leads to no order on the attributes. Applications use queries
to access data. So we wind up with confusion between application
access and relational inquiry with SQL. It makes no sense to maintain
that SQL (of any variety) only does or should do relational
operations. It happens to be the single language of both relational
queries and applications, using the relational terminology. So Ben is
both right and wrong about user applications being the one to process
the data for users, as SQL confounds the difference. I used to (like,
15-20 years ago) think this was a big shortcoming of SQL, since it
does only part of the user application. Now that the various extended
SQL's can do all of this (even *gasp* html and xml), and the
standardization is close enough to allow it to be used as a data
access language by various tools, that is less of an issue, though the
basic differences between engines makes for a different problem. I
think we can all agree that the popular engines distance themselves
from relational theory to a great degree - but I don't think that
leads to judging these distances as bad at all. I know that is
flamebait in comp.databases, but, well, I didn't start the
crossposting.
As far as analytics, I think it is good to be able to shove some
processing into the sql engine, but I think we are also seeing that it
is hard to predict that performance will be better by doing that, as a
generality. Basically the same problem as forcing order into group
by, there are times we as programmers know better how to manipulate
the data, especially sorting and ordering mass quantities.
jg
--
@home.com is bogus.
"What does HTML stand for?"
"I don't, ah, HoT Man Links?" - heard on radio call-in trivia
contest.
[snip]
>>> A relation does not have order. This would not be a relation
>>> select clicode,cliname from clients order by clicode
>>> because of the order by clause.
>
>I thought that one was about the order of attributes, not rows.
It is both. Remember that a relation is a set of tuples. A set
has no order.
[snip]
Sincerely,
Gene Wirchenko
It does in my dialect (Visual FoxPro). If you are correct, I
overgeneralised.
Granted that that would be implementation-dependent, but it would
be convenient for it to do it that way while grouping. (How else
would it?) My point is that if the DBMS does any sorting internally,
the relational-data-to-app converter should get the benefit of it if
it needs it. It would be silly for the DBMS to sort internally and
then the converter do it again.
Sincerely,
Gene Wirchenko
Never? It did at least in Oracle 7. As a well known and too many times
used side effect. I remember having to reprogram queries because
developed programs relied on the sort....
Shakespeare
> It would be silly for the DBMS to sort internally and
> then the converter do it again.
Agreed. But I think any modern optimizer will combine these steps if that would be quicker
Regards
Thomas
I don't know about Oracle 7, but check out these articles by Tom Kyte:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:74320098178823
http://tkyte.blogspot.com/2005/08/order-in-court.html
Thomas
Oracle never sorted the result set, it sorted the table data to effect
grouping. That the result set retained the ordering enforced to
provide a group by result was, and is, a side effect of the
operation. In that regard Oracle still sorts (orders) the groupings,
they're simply in hash key order rather than column order. Still,
it's not the result set being ordered, it's the interim processing
doing the ordering. And, as stated earlier, it was merely a happy
coincidence that the one affected the other.
David Fitzjarrell
And you did it wrong, even in Oracle7.
Grouping needs some sort but in an internal (binary) one
which may not be a user sort (any country using a language with
national characters has experimented this).
Regards
Michel
That is one of the reasons why we had to change it. Note that I did call
it a side effect.... and side effects are never to rely on..
Shakespeare
Even back on version 7.x we experienced a group by query not returning
the result set rows in the same order as the group by expression.
Technically since at least the version 7 manuals the order by has
always been requried to ensure the order of the data returned. The
reality before 10g is that most statements using group by would return
the data in the group by order; however, with 10g and the hash group
by plan feature, the odds of the data not always returning in the
group by expression order have gone way up. I would suggest that
whenever the order that the result set is returned/displayed in is
important that the order by claude be made a manditory coding
practice.
HTH -- Mark D Powell --
Should anyone care, I ran across this while meditating upon Chris Date
giving a seminar in Dallas next month:
http://books.google.com/books?id=406_pJtiJ6sC&pg=PT175&lpg=PT175&dq=%22What+about+ORDER+BY%3F%22+chris+date&source=bl&ots=xXj0qFAVr9&sig=MH2zmtp51-x1BtSjgYBJO4MCcLc&hl=en&ei=ougeS8axMYbAsQPT9qX3CQ&sa=X&oi=book_result&ct=result&resnum=1&ved=0CAgQ6AEwAA#v=onepage&q=&f=false
>
jg
--
@home.com is bogus.
http://www.signonsandiego.com/news/2009/dec/07/man-gets-18-months-in-prison-for-computer-fraud/
> On Nov 30, 8:52 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
> wrote:
> > Hmm. I'm not sure it's right to say the result would not *be* a
> > relation; but I certainly take the point about ‘ORDER BY’ requesting
> > order be imposed on an orderless relation.
>
> Should anyone care, I ran across this while meditating upon Chris Date
> giving a seminar in Dallas next month:
[…]
Yes, that about covers it. While ‘ORDER BY’ produces something that
isn't a relation, it clearly is an operation useful enough that it
belongs in a relational DBMS.
--
\ “If you do not trust the source do not use this program.” |
`\ —Microsoft Vista security dialogue |
_o__) |
Ben Finney