Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

PostgreSQL is much faster than MySQL, only when...

33 views
Skip to first unread message

Marek Lewczuk

unread,
Nov 25, 2003, 7:08:55 AM11/25/03
to
Hello,
I have changed DB from MySQL to PostgreSQL. When I have run my
application on PostgreSQL it was disaster - it was much slower than
MySQL...

I have tried to change PG configuration file etc.. no luck. After many
long days of thinking what is wrong I have made several tests with
"EXPLAIN" statement, and to my amusement there was many SeqScan - MySQL
didn't show that things. I have made some changes in PG db structure
(new indexes etc) and then I have made another test: PostgreSQL time -
10ms, MySQL time - 1 sec. I've tried to change the structure in MySQL
but it didn't change the results.

So my advise for all users which think that MySQL is faster than PG -
re-check your DB structure, check all your querys using EXPLAIN... maybe
new indexes should be added etc.


Regards,
Marek


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majo...@postgresql.org so that your
message can get through to the mailing list cleanly

Martijn van Oosterhout

unread,
Nov 25, 2003, 7:15:51 AM11/25/03
to
And ofcourse, you ran ANALYZE before doing any timings, right?

--
Martijn van Oosterhout <kle...@svana.org> http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Marek Lewczuk

unread,
Nov 25, 2003, 7:23:03 AM11/25/03
to
Użytkownik Martijn van Oosterhout napisał:

> And ofcourse, you ran ANALYZE before doing any timings, right?

Of course.


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majo...@postgresql.org

Marek Lewczuk

unread,
Nov 25, 2003, 8:07:13 AM11/25/03
to
Użytkownik te...@greatgulfhomes.com napisał:

> You said:
> "When I have run my application on PostgreSQL it was disaster"
>

> I think you meant:
> "When I have run my application on MySQL it was disaster"

Well, my english isn't perfect or it was simple mistake... but what I
wanted to say is that my application wasn't working well with PostgreSQL
before the db structure has been changed (new indexes). After that it
was much better...


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Randolf Richardson

unread,
Nov 25, 2003, 1:08:53 PM11/25/03
to
Cross-posted to "comp.databases.postgresql.advocacy" because
PostgreSQL could be very helpful to MySQL DBAs who wish to optimize their
databases -- just convert it to PostgreSQL on a test system and use EXPLAIN
and ANALYZE to identify the weaknesses.

Even if these MySQL DBAs don't use PostgreSQL for anything else, at
the very least it will increase awareness of the product. =)

> Hello,
> I have changed DB from MySQL to PostgreSQL. When I have run my
> application on PostgreSQL it was disaster - it was much slower than
> MySQL...
>
> I have tried to change PG configuration file etc.. no luck. After many
> long days of thinking what is wrong I have made several tests with
> "EXPLAIN" statement, and to my amusement there was many SeqScan - MySQL
> didn't show that things. I have made some changes in PG db structure
> (new indexes etc) and then I have made another test: PostgreSQL time -
> 10ms, MySQL time - 1 sec. I've tried to change the structure in MySQL
> but it didn't change the results.
>
> So my advise for all users which think that MySQL is faster than PG -
> re-check your DB structure, check all your querys using EXPLAIN... maybe
> new indexes should be added etc.

--
Randolf Richardson - r...@8x.ca
Vancouver, British Columbia, Canada

Please do not eMail me directly when responding
to my postings in the newsgroups.

Robert Treat

unread,
Nov 25, 2003, 4:26:20 PM11/25/03
to
On Tue, 2003-11-25 at 08:07, Marek Lewczuk wrote:
> Użytkownik te...@greatgulfhomes.com napisał:
>
> > You said:
> > "When I have run my application on PostgreSQL it was disaster"
> >
> > I think you meant:
> > "When I have run my application on MySQL it was disaster"
>
> Well, my english isn't perfect or it was simple mistake... but what I
> wanted to say is that my application wasn't working well with PostgreSQL
> before the db structure has been changed (new indexes). After that it
> was much better...
>

Yes. I think the gist of your post was "out of the box postgresql
performed like garbage compared to mysql, but then i spent some time
tweaking and tuning, taking advantage of indexes, and now it performs so
quickly that i am unable to make any changes within mysql to match
postgresql's performance"

I'm curious as to what type of application you run and what first
prompted you to switch to postgresql?

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Shridhar Daithankar

unread,
Nov 26, 2003, 1:13:41 AM11/26/03
to
Robert Treat wrote:
> Yes. I think the gist of your post was "out of the box postgresql
> performed like garbage compared to mysql, but then i spent some time
> tweaking and tuning, taking advantage of indexes, and now it performs so
> quickly that i am unable to make any changes within mysql to match
> postgresql's performance"

I would put it this way. 'Applications designed for mysql does not perform good
when ported to postgresql when porting means "just make it work with postgresql"'.

Rather obvious isn't it?

Shridhar

Marek Lewczuk

unread,
Nov 26, 2003, 3:21:25 AM11/26/03
to
Użytkownik Robert Treat napisał:

> On Tue, 2003-11-25 at 08:07, Marek Lewczuk wrote:
>
>
> Yes. I think the gist of your post was "out of the box postgresql
> performed like garbage compared to mysql, but then i spent some time
> tweaking and tuning, taking advantage of indexes, and now it performs so
> quickly that i am unable to make any changes within mysql to match
> postgresql's performance"
Exactly. But as I wrote before - tweaking and tuning wasn't so
important... You cannot use the same db structure in both databases -
even if you think that all indexes, keys are well for MySQL it could be
not sufficient for PostgreSQL. Many people use the same structure and
they see that PostgreSQL is slower... so I just wanted to tell them,
that it is necessary to re-check structure which will be used with PG.


>
> I'm curious as to what type of application you run and what first
> prompted you to switch to postgresql?

It's analysis system for automotive market. There are many complicated
querys, with many joins... PostgreSQL is much more flexible, more
powerful and more stable than MySQL. There are also many other features
which are just not available in MySQL...

>
> Robert Treat

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

cnliou

unread,
Nov 26, 2003, 4:41:44 AM11/26/03
to
>> I'm curious as to what type of application you run and
what first
>> prompted you to switch to postgresql?

I would like to add my 2 cents to this although I have not
been asked by anyone about this :-)

If I remember correctly the fact about 7 years ago...
when PostgreSQL already supported
- sub-query
- transaction
- triggers
- stored procedures (functions)
, mySQL did not have any of these critical and essential
capabilities required by any serious bussiness applications.

When I noticed that fact, I have never turned my head back
to mySQL again since then. Why? Because I thought I would
take a train instead of a plane (if it indeed is), when I am
100% sure that the plane will not fly due to bad wether.

Regards,

CN

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Barbara Lindsey

unread,
Nov 26, 2003, 8:53:57 AM11/26/03
to
How does the performance compare using models mimicked from Oracle? Is
there any information anywhere on that?

--
Barbara E. Lindsey,
COG RDC
Phone: (352) 392-5198 ext. 314 Fax: (352) 392-8162

----
CONFIDENTIALITY NOTICE: The information contained in this electronic
message is legally privileged and confidential and intended only for the
use of the individual(s) or entity(ies) named above. If the reader of
this message is not the intended recipient, you are hereby notified that
any dissemination, distribution, or copying of this email or any of it's
components is strictly prohibited. If you have received this email in
error, please contact the sender.
----


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Bruce Momjian

unread,
Nov 26, 2003, 1:14:07 PM11/26/03
to
Barbara Lindsey wrote:
> How does the performance compare using models mimicked from Oracle? Is
> there any information anywhere on that?

From reports we have received, we are usually +/- 10% compared to Oracle
on most queries.

--
Bruce Momjian | http://candle.pha.pa.us
pg...@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Chris Travers

unread,
Nov 27, 2003, 12:35:53 AM11/27/03
to
Hi all;

Since we are on the topic of what prompted us to use PostgreSQL, I figured I
would share my experiences as well, and some additional thoughts that I had.

I chose PostgreSQL about 2 years ago when I realized that the application I
was building needed something more robust than MySQL.  I found the learning
curve to be a bit steep at first, but eventually got the hang of it and have
developed a real appreciation for PostgreSQL.  Unlike many members here I
won't say I would never go back to MySQL, because there are some
circumstances where I would (where I need an RDBMS-like interface but not an
RDBMS, f. ex. MySQL's heap tables could be very useful for real-time control
systems).  However, I am actually seeing less and less value in MySQL as I
become more familiar with the extensibility features in PostgreSQL.

Anyway one additional thought I had was that it should be possible to write
parsers for text files in PL/PERLu and then plug those in as views.  In this
way, files such as /etc/passwd or even /var/log/messages could be used as if
they were tables in the database.  Information from these tables could then
be selected into other tables, presented directly to an application, etc.
While it mught be possible to do the same for MySQL, it would be much more
work.  Imagine being able to run a query such as:

select * from logs.iptables
where proto = 'TCP' and td_entered > '2003-01-01' and port =
    (select port from sysinfo.services where service='ssh');

in order to get every hit against the ssh service on the firewall in the
current year.  Of course, I imagine, it would take some time to run ;-)  But
for generating reports of system activirt, I would not assume this to be a
problem.

Next question-- any ideas how one could generate something like MySQL's heap
tables (maybe in shared memory?) within PostgreSQL?

Best WIshes,
Chris Travers

Shridhar Daithankar

unread,
Nov 27, 2003, 1:54:47 AM11/27/03
to
Chris Travers wrote:
> Anyway one additional thought I had was that it should be possible to write
> parsers for text files in PL/PERLu and then plug those in as views. In this
> way, files such as /etc/passwd or even /var/log/messages could be used as if
> they were tables in the database. Information from these tables could then
> be selected into other tables, presented directly to an application, etc.
> While it mught be possible to do the same for MySQL, it would be much more

Copy can import delimited data with user specified delimiter. May be you can
upload the files in batch to database periodically.

Of course if you need multiple delimiter parsing, then you need to do something
about it. Sed/AWK scripts probably..

> Next question-- any ideas how one could generate something like MySQL's heap
> tables (maybe in shared memory?) within PostgreSQL?

Pardon the dumb question(Used mysql seriously onced 2 years back.). What is heap
table? What do you need that postgresql does not offer? Just trying to understand..

Shridhar

Randolf Richardson

unread,
Nov 27, 2003, 2:09:23 AM11/27/03
to
> Since we are on the topic of what prompted us to use PostgreSQL, I
> figured I would share my experiences as well, and some additional
> thoughts that I had.
>
> I chose PostgreSQL about 2 years ago when I realized that the
> application I was building needed something more robust than MySQL. I
> found the learning curve to be a bit steep at first, but eventually got
> the hang of it and have developed a real appreciation for PostgreSQL.
> Unlike many members here I won't say I would never go back to MySQL,
> because there are some circumstances where I would (where I need an
> RDBMS-like interface but not an RDBMS, f. ex. MySQL's heap tables could
> be very useful for real-time control systems). However, I am actually
> seeing less and less value in MySQL as I become more familiar with the
> extensibility features in PostgreSQL.

I'm moving from Oracle 8i to PostgreSQL. I don't really do much fancy
stuff in SQL now, but I do use views a little bit and so MySQL just isn't
an option for me.

I just finished reading another book on PostgreSQL so I can imagine
why you like it so much. The only two things I'd really like to see in
PostgreSQL are:

1. Support for "CONNECT BY" (useful for family trees), which I
understand is being worked on now. The only twist is that I'd also like to
see a specialized index option for supporting this as this is something
that Oracle can't do -- my vision of such an index file is that it would be
quite different from others in that it would be designed to increase the
speed of traversing through the tree. I'm not convinced that Oracle's
solution to this is optimal in this regard.

2. Moving to table spaces (PostgreSQL version 8 maybe?) rather
than just storing a whole bunch of files in a single directory. Oracle's
implementation is nice because tables, indexes, etc., can span multiple
table spaces, and there are great performance optimization and scalability
advantages that otherwise just aren't possible without them. I read in
another thread (approx. 2 months old) earlier this evening that some folks
would like to see OIDs deprecated, and if this is the case then the sub-
directories under "base/" will obviously need a different naming mechanism,
so instead of re-thinking this perhaps it would be a good opportunity for
the PostgreSQL team to look at the possibility of implementing things
within table spaces.

> Anyway one additional thought I had was that it should be possible to
> write parsers for text files in PL/PERLu and then plug those in as
> views. In this way, files such as /etc/passwd or even /var/log/messages
> could be used as if they were tables in the database. Information from
> these tables could then be selected into other tables, presented
> directly to an application, etc. While it mught be possible to do the
> same for MySQL, it would be much more work. Imagine being able to run a
> query such as:
>
> select * from logs.iptables

> where proto = 'TCP' and td_entered > '2003-01-01' and port (select


> port from sysinfo.services where service='ssh');
>
> in order to get every hit against the ssh service on the firewall in the
> current year. Of course, I imagine, it would take some time to run ;-)
> But for generating reports of system activirt, I would not assume this
> to be a problem.

A very interesting idea, but my feeling is that pure PERL is best
suited for dealing with flat text files.

> Next question-- any ideas how one could generate something like MySQL's
> heap tables (maybe in shared memory?) within PostgreSQL?

I have no idea. Anyone else?

Chris Travers

unread,
Nov 27, 2003, 3:08:14 AM11/27/03
to
Hi;

I am trying to determine if there is anything I can do with MySQL that I
can't do with PostgreSQL. HEAP tables are the last one I can't figure out.

HEAP tables are MySQL tables which reside in memory and are visible across
sessions. For obvious reasons, this violates the D in ACID.

In many cases this can be very usefull-- for example if I am building a
system where I want the simplicity of a SQL-like interface with the low
latency of memory-only devices. WOuld I have to run the entire db in a
ramdisk, or can anyone think of a way to do something like this?

It has occurred to me that when table spaces are supported, then this would
be much easier (have a HEAP table space sitting on a ramdisk).

Best WIshes,
Chris Travers


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majo...@postgresql.org)

Shridhar Daithankar

unread,
Nov 27, 2003, 4:18:06 AM11/27/03
to
Chris Travers wrote:
> HEAP tables are MySQL tables which reside in memory and are visible across
> sessions. For obvious reasons, this violates the D in ACID.

Postgresql depend upon two level of buffers. One is it's own buffers and others
are OS buffer cache. So when you say update the table, the relevant portion is
fetched in postgresql buffers and updated. OS does the physical fetch for
postgresql and keeps a copy in it's own buffer cache.

Even if postgresql does not keep them in it's own buffers after a certain time,
most probably OS buffer cache will. So effectively it is cached in RAM. However
other IO operations might need memory and it might get reused under load.

Postgresql does not have a special name for it. It goes for every operation it
makes. So if your table is small enough, it is already cached in the RAM by all
chances.

There should be no performance difference between small tables in postgresql and
heap tables in mysql.

> In many cases this can be very usefull-- for example if I am building a
> system where I want the simplicity of a SQL-like interface with the low
> latency of memory-only devices. WOuld I have to run the entire db in a
> ramdisk, or can anyone think of a way to do something like this?

No. Just build a normal table in postgresql. That's it. And the bright side of
it, it satisfy D.

Hope this helps. I would be interested in numbers that say postgresql is slower
than mysql heap tables. (You can force postgresql to load entire table by doin
select * from table. Of course the table is expected to be small enough.. Then
compare the results. It will always be slow first time..)

Shridhar


---------------------------(end of broadcast)---------------------------

Tony

unread,
Nov 27, 2003, 4:19:56 AM11/27/03
to
Hi All,

I've just been reading an article in PHP Architect magazine
(http://www.phparch.com) which is the cover story for October called
"Migrating from MySQL to PostgreSQL". I must say that this is a highly
compelling article, especially for me, and is aimed at programmers that
aren't necessarilly SQL experts or DBAs. For instance, like many PHP
Web developers who use MySQL instead of flat files to store stuff!
Instead of using a DB as a powerful tool. This article presents reasons
as to why a more standards compliant DB is good for programmers, and why
in some cases MySQL can be less of a friend to programmers than perhaps
PostgreSQL.

I honestly believe that if the advocates of PostgreSQL wrote an article
or case study along the lines of this article, it would go a long way to
attracting many more programmers. In my experience all of the articles
and tutorials are written from the perspective of why PG is a better DB
as a DB. Rather than emphasise aspects like "PG is great because you
can move complicated code like this <insert complicated PHP/Perl code
here> ..... normally dealt programatically to your DB which can be both
faster and applied to any other programmers (VB, Java) that you are
sharing the important enterprise data with. I've not seen anything in
articles aimed at PHP/MySQL users saying, "Hey, look at how these
triggers can make your life soooo much easier" or "Hey, look at how
cascading can save you oh so much coding" or "Hey look at all this
programmatical logic that can be put into queries just by writing your
own functions"

I have recently compared the PostgreSQL users to the Debian users (meant
as a complement) by the fact that they are in general highly
knowledgable of thier own subject and peripheral subjects too. They are
passionate and well versed, and happy to nudge people in the direction
of enlightenment without spoonfeeding them. But in the same way, the
advocacy (IMHO) falls into the same boat as Debian. There is a certain
self-assuredness that PostgreSQL is a far superior product and if
someone can't see how obvious that is then maybe PG isn't for them (a
little harsh I know but I'm trying to illustrate a point).

My point is that there are thousands, tens of thousands of programmers
out there, that need to know why and how PG is so great. My eyes have
now been fully opened by this article, and got rid of my nagging feeling
that there was something great about PG that I "Just wasn't grasping,
and couldn't put my finger on". Maybe the advocacy team should be
aiming for all those programmers that desperately need PG, but don't
know it yet, and probably don't have time to garner enough DB experience
to understand why they need it!

Sadly the PHP Architect article is not free, I bought the electronic
magazine for about $2, but believe it's worth every penny and more.

Just my 2 cents.

Apologies if the PG articles ARE out there and please notice that my
comments do not say that they don't exist, but that I have never seen them.

Cheers

Tony.

Paul Thomas

unread,
Nov 27, 2003, 5:02:08 AM11/27/03
to

On 27/11/2003 08:08 Chris Travers wrote:
> Hi;
>
> I am trying to determine if there is anything I can do with MySQL that I
> can't do with PostgreSQL. HEAP tables are the last one I can't figure
> out.
>
> HEAP tables are MySQL tables which reside in memory and are visible
> across
> sessions. For obvious reasons, this violates the D in ACID.
>
> In many cases this can be very usefull-- for example if I am building a
> system where I want the simplicity of a SQL-like interface with the low
> latency of memory-only devices. WOuld I have to run the entire db in a
> ramdisk, or can anyone think of a way to do something like this?
>
> It has occurred to me that when table spaces are supported, then this
> would
> be much easier (have a HEAP table space sitting on a ramdisk).


If your table is that small, you could just cache it in your application.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Chris Travers

unread,
Nov 27, 2003, 5:52:30 AM11/27/03
to
HEAP tables in MySQL can be basically used as a network-aware SQL interface
for shared memory segments. I was more looking at the possibility of using
these for coordinating various processes on different systems on a network.

If there is not much performance difference between reasonably small tables
and HEAP tables in MySQL, then I could simply create a simple database and
use that. I was just wondering if there was any other tricks that could be
used, aside from functions written in C ;-)

Jonathan Bartlett

unread,
Nov 27, 2003, 8:28:32 AM11/27/03
to
This should be pretty easy. Look into PostgreSQL's rule system. You
should just need a lot of "do instead" functions for the table.

Jon

Paul Thomas

unread,
Nov 28, 2003, 11:17:08 AM11/28/03
to

Maybe there's not such a need for the advanced features of PostgreSQL
amongst PHP programmers as you seem to believe. Most of the PHP stuff I've
seen is read-only content display stuff and that doesn't really require a
top-notch RDBMS; a more limited database should also be up to the job. For
complex transactional web applications, J2EE/Model II is a far superior
technology to scripts/Model I and that means a different target audience
for the apps where PostgreSQL can offer those essential extra features.
Whilst most J2EE developers will be using Oracle/DB2/MSSQL as their
back-end, the awareness of PostgreSQL seems quite high and, in the few
usenet groups I monitor, I don't recall anyone being flamed for
recommending PostgreSQL over MySQL. Maybe seasoned, professional
developers don't like being told that they're crap programmers just
because they ask for something as fundamental as referential integrity!

Coming to your point about advocacy, I certainly don't recognize what you
describe. Of course the members of the advocacy group believe in the
quality of PostgreSQL (a view shared by most of the subscribers to list).
What I think you need to bear in mind is that PostgreSQL is a genuinely
open-source product _not_ a commercial product in GPL clothing like MySQL.
The developers and advocates are not making $xx per box shifted or trying
to seduce users down a supposedly free path into their licensed software
lair. That has a big effect on advocacy. Instead of smarmy marketing types
who rely on spread FUD and misinformation about every product they
consider a competitor, we have a group of people acting with honesty and
integrity. Welcome to the real world of open source :-)

--
Paul Thomas
+------------------------------+---------------------------------------------+


| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+

---------------------------(end of broadcast)---------------------------

Jason Tesser

unread,
Nov 28, 2003, 12:10:15 PM11/28/03
to
hi,

<huge snip>

> Maybe there's not such a need for the advanced features of PostgreSQL
> amongst PHP programmers as you seem to believe. Most of the PHP stuff I've
> seen is read-only content display stuff and that doesn't really require a
> top-notch RDBMS; a more limited database should also be up to the job. For
> complex transactional web applications, J2EE/Model II is a far superior
> technology to scripts/Model I and that means a different target audience
> for the apps where PostgreSQL can offer those essential extra features.
> Whilst most J2EE developers will be using Oracle/DB2/MSSQL as their
> back-end, the awareness of PostgreSQL seems quite high and, in the few
> usenet groups I monitor, I don't recall anyone being flamed for
> recommending PostgreSQL over MySQL. Maybe seasoned, professional
> developers don't like being told that they're crap programmers just
> because they ask for something as fundamental as referential integrity!

I completely disagree. I do a lot of programming with PHP and the features
of Postgres come in handy. Let me give you an example of just some
basic things. Triggers! Why should I have to write insert and update
triggers in the logic (PHP) if I can handle it at the database level. Sql
is 10x as fast as the language. Better to handle what you can at the database
level. Same with views and stored procedures. MySQL cannot even handle
sub-queries yet. I also use Python for standalone interfaces to the data.
Why should I not be able to use the same views and triggers etc in there
that I use for my web apps. PHP is quite powerful if used correctly.
Java has its own issues and I am not sure it is as far supiour as you
are claming it is. But that is not for this dscussion. MySQL may be more
popular with (cheap) web hosting places but that doesn't mean it is the best
or that Postgres wouldn't serve better even in this area. I am glad
to see the article written for PHP mag as Postgres would help alot of PHP guys that are using MySQL.

<another snip>

Randolf Richardson

unread,
Nov 28, 2003, 7:16:07 PM11/28/03
to
Cross-posted to "comp.databases.postgresql.advocacy" because I believe
this is something that should be considered for future additions to the
PostgreSQL advocacy web site.

"pa...@tmsl.demon.co.uk (Paul Thomas)" wrote in
comp.databases.postgresql.general:

--

Chris Travers

unread,
Nov 28, 2003, 7:34:02 PM11/28/03
to
Randolf Richardson Wrote:
> 2. Moving to table spaces (PostgreSQL version 8 maybe?) rather
> than just storing a whole bunch of files in a single directory. Oracle's
> implementation is nice because tables, indexes, etc., can span multiple
> table spaces, and there are great performance optimization and scalability
> advantages that otherwise just aren't possible without them. I read in
> another thread (approx. 2 months old) earlier this evening that some folks
> would like to see OIDs deprecated, and if this is the case then the sub-
> directories under "base/" will obviously need a different naming
mechanism,
> so instead of re-thinking this perhaps it would be a good opportunity for
> the PostgreSQL team to look at the possibility of implementing things
> within table spaces.

I believe this is being worked on also.

>
> > Anyway one additional thought I had was that it should be possible to
> > write parsers for text files in PL/PERLu and then plug those in as
> > views. In this way, files such as /etc/passwd or even /var/log/messages
> > could be used as if they were tables in the database.

<SNIP>


> A very interesting idea, but my feeling is that pure PERL is best
> suited for dealing with flat text files.

True,. and I use it for that (see my project at
http://sourceforge.net/projects/fwreport). However the ability to take a
similar parser and then use it to present the same information to a RDBMS
would then provide some additional flexibility, as you could use the RDBMS
for managing the query interface to the files. Not very useful if you only
want to see the same files the same way every time, but very useful if you
need to extract different information from them.

>
> > Next question-- any ideas how one could generate something like MySQL's
> > heap tables (maybe in shared memory?) within PostgreSQL?
>
> I have no idea. Anyone else?
>
> --
> Randolf Richardson - r...@8x.ca
> Vancouver, British Columbia, Canada
>
> Please do not eMail me directly when responding
> to my postings in the newsgroups.
>

> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majo...@postgresql.org so that your
> message can get through to the mailing list cleanly
>
>

Randolf Richardson

unread,
Nov 28, 2003, 8:12:43 PM11/28/03
to
> Randolf Richardson Wrote:
>> 2. Moving to table spaces (PostgreSQL version 8 maybe?) rather
>> than just storing a whole bunch of files in a single directory.
Oracle's
>> implementation is nice because tables, indexes, etc., can span multiple
>> table spaces, and there are great performance optimization and
scalability
>> advantages that otherwise just aren't possible without them. I read in
>> another thread (approx. 2 months old) earlier this evening that some
folks
>> would like to see OIDs deprecated, and if this is the case then the sub-
>> directories under "base/" will obviously need a different naming
> mechanism,
>> so instead of re-thinking this perhaps it would be a good opportunity
for
>> the PostgreSQL team to look at the possibility of implementing things
>> within table spaces.
>
> I believe this is being worked on also.

Yes!!!!!!! That's excellent! I look forward to helping with testing
it on the NetWare port (and possibly FreeBSD if I ever get it running).

[sNip]


>> A very interesting idea, but my feeling is that pure PERL is best
>> suited for dealing with flat text files.
>
> True,. and I use it for that (see my project at
> http://sourceforge.net/projects/fwreport). However the ability to take a
> similar parser and then use it to present the same information to a RDBMS
> would then provide some additional flexibility, as you could use the
RDBMS
> for managing the query interface to the files. Not very useful if you
only
> want to see the same files the same way every time, but very useful if
you
> need to extract different information from them.

[sNip]

Obviously there's no disagreement here from anyone about the
usefulness of this. I guess I should have clarified a bit more in my post
because I'm concerned about PostgreSQL getting too fragmented from a
project management perspective -- it's a database engine, with many
different interfaces available, and to make it into a "Swiss Army Knife"
(or a Chinese knock-off I saw once which had even more functionality) could
result in possibly slower productivity in the long run (not to mention
greater system resource requirements, &c).

Albeit that's only a concern -- those developers who do all the hard
work on PostgreSQL would obviously know best what the answer to this is.

cnliou

unread,
Nov 28, 2003, 11:37:01 PM11/28/03
to
"Jason Tesser" <JTe...@nbbc.edu>

> MySQL cannot even handle sub-queries yet.

Ohh! Really?
Allow me to pay my highest respect to the genius mySQL
programmers!
I completely have no clue on how to construct any single
tiny database on a DBMS having no sub-query capability.

Being too dumb, I solicit mySQL programmers' help by showing
me employee FOO's birthday and his/her latest job title
effective on or before 2003-1-1 from the following tables:

CREATE TABLE t1 (employee TEXT,BirthDay DATE);
CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle
TEXT);

And make the result like this:

FOO 1980-1-1 programmer

Please do not give me the answer that you will merge these
two tables to form one like this:

CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate
DATE,JobTitle TEXT);

Regards,
CN

Chris Travers

unread,
Nov 28, 2003, 9:41:17 PM11/28/03
to
"Jason Tesser" <JTe...@nbbc.edu> wrote:
> I completely disagree. I do a lot of programming with PHP and the
features
> of Postgres come in handy. Let me give you an example of just some
> basic things. Triggers! Why should I have to write insert and update
> triggers in the logic (PHP) if I can handle it at the database level. Sql
> is 10x as fast as the language. Better to handle what you can at the
database
> level. Same with views and stored procedures. MySQL cannot even handle
> sub-queries yet. I also use Python for standalone interfaces to the data.
> Why should I not be able to use the same views and triggers etc in there
> that I use for my web apps. PHP is quite powerful if used correctly.

I guess I am coming at this from the other direction: MySQL is popular and
many people use it for lightweight stuff. Partly this may be because better
tools exist for providing hosted solutions, and this is an area we could
improve (automatically adding entries to the pg_hba.conf, etc.-- may have to
look into doing this).

THe real problem I see is that this keeps PHP from being an ideal skill for
developing enterprise applications. The features you are mentioning are
extremely helpful, even necessary, when you have many applications working
against the same database. The triggers, etc. can give you some consistant
business logic, and you can use views to present information to the
applications in a way that is natural for them.

In essence, my point is that for single-use databases, MySQL isn't all that
bad (aside from consistancy issues). However, the popularity of the LAMP
development environment holds PHP back from being a serious corporate
development environment, IMO.

> Java has its own issues and I am not sure it is as far supiour as you
> are claming it is. But that is not for this dscussion. MySQL may be more
> popular with (cheap) web hosting places but that doesn't mean it is the
best
> or that Postgres wouldn't serve better even in this area. I am glad
> to see the article written for PHP mag as Postgres would help alot of PHP
> guys that are using MySQL.

Again, I think that the most important benefit would be lowering the barrier
to entry of serious development. You can start with
Linux/Apache/PostgreSQL/PHP for a simple site, and then use your knowledge
better to develop more serious applications. But the critical issue to
resolve is to make available a tool or set of tools to manage shared hosting
environments in an easier way. I would be happy to try to generate such a
set of tools.

Best Wishes,
Chris Travers

Oliver Elphick

unread,
Nov 29, 2003, 2:53:44 AM11/29/03
to
On Sat, 2003-11-29 at 04:37, cnliou wrote:
> "Jason Tesser" <JTe...@nbbc.edu>
>
> > MySQL cannot even handle sub-queries yet.
>
> Ohh! Really?
> Allow me to pay my highest respect to the genius mySQL
> programmers!
> I completely have no clue on how to construct any single
> tiny database on a DBMS having no sub-query capability.
>
> Being too dumb, I solicit mySQL programmers' help by showing
> me employee FOO's birthday and his/her latest job title
> effective on or before 2003-1-1 from the following tables:
>
> CREATE TABLE t1 (employee TEXT,BirthDay DATE);
> CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle
> TEXT);
>
> And make the result like this:
>
> FOO 1980-1-1 programmer
>
> Please do not give me the answer that you will merge these
> two tables to form one like this:
>
> CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate
> DATE,JobTitle TEXT);

I have great trouble following your meaning, but I think you are talking
about joining two tables in a query:

SELECT t1.employee, t1.birthday, t2.jobtitle
FROM t1, t2
WHERE t1.employee = t2.employee;

That is not the same as using a sub-query:

SELECT employee
FROM t1
WHERE birthday > (
SELECT MIN(effectivedate)
FROM t2
);

(select employees who were born after the longest-serving employee
started work.)
--
Oliver Elphick Oliver....@lfix.co.uk
Isle of Wight, UK http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"Who shall ascend into the hill of the LORD? or who
shall stand in his holy place? He that hath clean
hands, and a pure heart..." Psalms 24:3,4

Tony

unread,
Nov 29, 2003, 4:21:58 AM11/29/03
to
HI All,

I'm glad that this thread prompted some thoughtful response. I think
one of my main points I was trying to make, Jason hit the nail on the
head. The article to which I was referring uses a great example which I
have experienced many times before, but in order to grasp this, PHP et
al, must be thought of as a scripting language which crosses many
corporate boundries, and it is easy to assume that it's primary use
(simple web site back ends) are the only thing to discuss. But the
situation has changed enourmously since the release of PHP v4. Now many
consultant/developer/sys-admins like myself are going to client site on
a contract (this is especially true in the UK, I can't speak for
anywhere else) and finding complex stocktrading systems, inventory
systems, CRM systems, and others, all written in PHP backed by MySQL.
Whether this is right or wrong, good choice or bad choice is not what
I'm interested in debating. The point is that when these systems where
architected, the developers used MySQL not because they were dumb, but
because many of them develop awesome code and can get around most
problems in the code, with a little ingenuity. Many simply do not have
the insight into the potential benefits of *proper* RDBMS can offer.
Had they had the benefit of such knowledge the code they have written
would be faster (in DB) and more legible. Sadly often the developers are
the only source of DBA for some of these companies.

The second scenario, is with admin systems, written by people like
myself for companies, whether they be simple or complex systems, that
are intended as a temporary work around to an immediate problem.
In a very short space of time the stop-gap application you had written
to sort out the immediate problem quickly becomes a core business
application (I recently returned to a site after not being there for two
years and the temporary address book/ email system that I knocked up in
an afternoon was not only still being used, but now relied upon heavily).

So on to my point, MySQL guys will happily say "Hey, we're not saying
that the features MySQL is missing aren't important, and we're working
towards them, but in the meantime these issues can be worked around like
this....." and happily play the whole thing down. Many LAMP developers
aren't aware of the benefits of stored procedures, of triggers and other
good stuff. Like myself, if they were aware how much easier life could
be if these things were accessible to them, they'd probably be converts too.

There is not enough emphasis put on the basic importance of these
functions in PG. Someone needs to standup and say "Hey, look how this
can simplify your programming lives" until I started using
Druid/Postgres, I had no idea why I needed triggers or what a cascade
effect did, or why I might want one.

The Linux community has grown at least in part because it has
educated potential users and journo's to its benefits. I believe if
the PG advocacy team did the same, then it would attract many more
serious LAMP developers.

Like Linux vs. Windows, PG has an awful lot going for it in respect to
MySQL, so why not crow about it. It needs to be pointed at a crowd that
are DB novices, they need to be told why PG is worth the time/knowledge
investment, because anyone who reads the MySQL site, will come away with
the impression that the Trigger, Stored Procs, and other things are a
luxurious overhead not necessary for getting the job done.

I'd gladly help out with such a paper, but find myself in the sad
position of my prose being open to attack due to my newbieness in the DB
world and not able to speak authoratatively on the subject.

Have a think, I'd like to know if others agree.

Cheers

T.

Unihost Web Hosting

unread,
Nov 29, 2003, 6:41:31 AM11/29/03
to
Further to this post, what might actually work is to convince O' Reilly
(since they have PostgreSQL book/s) to do some articles like they have
for PG, but making full use of the PG database. For instance, building
a simple data-warehouse using PG. Articles that show off an OSS
product/project in a clearly enterprise light in a step-by-step
fashion. There have been so many articles on DB design using MySQL.
How about an article on DB design using all the functionality of a real
ORDBMS.

Just a few thoughts.

Cheers

T.

Tony wrote:


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Randal L. Schwartz

unread,
Nov 29, 2003, 8:54:46 AM11/29/03
to
>>>>> "Unihost" == Unihost Web Hosting <to...@unihost.net> writes:

Unihost> Further to this post, what might actually work is to convince O'
Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like
Unihost> they have for PG, but making full use of the PG database. For
Unihost> instance, building a simple data-warehouse using PG. Articles that
Unihost> show off an OSS product/project in a clearly enterprise light in a
Unihost> step-by-step fashion. There have been so many articles on DB design
Unihost> using MySQL. How about an article on DB design using all the
Unihost> functionality of a real ORDBMS.

Well, since I need 2.5 ideas per month for the three columns I'm still
writing, I'm certainly in a position to write nice things about PG,
although I always have to work it in from a Perl slant.

Actually, I'm sure that any of the magazines I'm in would appreciate
an additional article or two from me.

If you can think of something that fits in 2000 words or so (or 4000
if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
some inspiration.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Nigel J. Andrews

unread,
Nov 29, 2003, 9:04:06 AM11/29/03
to

I think he means for the employee FOO show only the latest job title. Or in
other words:

SELECT t1.employee, t1.birthday, t2.jobtitle
FROM t1, t2
WHERE
t1.employee = t2.employee

AND
t1.employee = 'FOO'
AND
t2.effectivedate > CAST('2003-1-1' TO DATE)
ORDER BY t2.effectivedate DESC
LIMIT 1


which of course uses a PostgreSQL customisation.

I've got a feeling it's possible doing self joins and the like but I'll leave
it at that I think.


--
Nigel

Robert Treat

unread,
Nov 29, 2003, 12:24:22 PM11/29/03
to
On Thursday 27 November 2003 04:18, Shridhar Daithankar wrote:
> Chris Travers wrote:
<snip explination of postgresql cacheing tables in memory>

>
> Hope this helps. I would be interested in numbers that say postgresql is
> slower than mysql heap tables. (You can force postgresql to load entire
> table by doin select * from table. Of course the table is expected to be
> small enough.. Then compare the results. It will always be slow first
> time..)
>

the difference is that with mysql, nothing pushes the table out of memory; it
always stays in memory. in postgresql, a big query on another tables, or
perhaps a vacuum, or other highly active applications on the same server can
cause the small tables to be pushed out of memory. both approches have
positives and negatives, and in many cases you would probably notice no
differance

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Alvaro Herrera

unread,
Nov 29, 2003, 3:04:50 PM11/29/03
to
On Sat, Nov 29, 2003 at 12:24:22PM -0500, Robert Treat wrote:

> [MySQL's heap tables]

> the difference is that with mysql, nothing pushes the table out of memory; it
> always stays in memory. in postgresql, a big query on another tables, or
> perhaps a vacuum, or other highly active applications on the same server can
> cause the small tables to be pushed out of memory. both approches have
> positives and negatives, and in many cases you would probably notice no
> differance

If this is a small heavily used table, 7.5 with the new ARC buffer
management policy should do much better. Even better, the table does
not actually need to be small: the heavily used portion will stay in
memory where it can be very fast, and the rest will be just wait its
turn on disk.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Coge la flor que hoy nace alegre, ufana. ¿Quién sabe si nacera otra mañana?"

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majo...@postgresql.org

Randolf Richardson

unread,
Nov 29, 2003, 4:14:34 PM11/29/03
to
[sNip]

>> the difference is that with mysql, nothing pushes the table out of
>> memory; it always stays in memory. in postgresql, a big query on
>> another tables, or perhaps a vacuum, or other highly active
>> applications on the same server can cause the small tables to be pushed
>> out of memory. both approches have positives and negatives, and in
>> many cases you would probably notice no differance
>
> If this is a small heavily used table, 7.5 with the new ARC buffer
> management policy should do much better. Even better, the table does
> not actually need to be small: the heavily used portion will stay in
> memory where it can be very fast, and the rest will be just wait its
> turn on disk.

Is this a configurable option by any chance? If not, then perhaps it
should be on a per-table, per-index (etc.) basis.

Christopher Browne

unread,
Nov 29, 2003, 10:17:01 PM11/29/03
to
A long time ago, in a galaxy far, far away, Randolf Richardson <r...@8x.ca> wrote:
> [sNip]
>>> the difference is that with mysql, nothing pushes the table out of
>>> memory; it always stays in memory. in postgresql, a big query on
>>> another tables, or perhaps a vacuum, or other highly active
>>> applications on the same server can cause the small tables to be
>>> pushed out of memory. both approches have positives and
>>> negatives, and in many cases you would probably notice no
>>> differance
>>
>> If this is a small heavily used table, 7.5 with the new ARC buffer
>> management policy should do much better. Even better, the table
>> does not actually need to be small: the heavily used portion will
>> stay in memory where it can be very fast, and the rest will be just
>> wait its turn on disk.
>
> Is this a configurable option by any chance? If not, then
> perhaps it should be on a per-table, per-index (etc.) basis.

It is a MUCH BETTER thing to have policies that don't require
configuration effort.

One of the characteristic problems with Oracle is that you have
immense numbers of "knobs" to tune. You can get it to work "just
right" if you throw a large enough horde of DBAs at it.

In the case of the ARC policy, what Jan is trying to do is to come up
with a strategy that is an improvement irrespective of the
characteristics of the table. If that works out as hoped for, there
will be no need to "configure" anything in order to take advantage of
it.

You'd find your applications running faster simply by installing a 7.5
server; no need to configure anything. It's like getting Pentium chip
with improved execution strategies; you don't have to recompile
anything (the way IA-64 mandates it); you just install the app on the
new box and watch it speed up.
--
"cbbrowne","@","acm.org"
http://www.ntlug.org/~cbbrowne/spreadsheets.html
Rules of the Evil Overlord #222. "I reserve the right to execute any
henchmen who appear to be a little too intelligent, powerful, or
devious. However if I do so, I will not at some subsequent point shout
"Why am I surrounded by these incompetent fools?!"
<http://www.eviloverlord.com/>

Chris Travers

unread,
Nov 29, 2003, 11:26:36 AM11/29/03
to
Tony <to...@unihost.net> Wrote:
> Now many
> consultant/developer/sys-admins like myself are going to client site on
> a contract (this is especially true in the UK, I can't speak for
> anywhere else) and finding complex stocktrading systems, inventory
> systems, CRM systems, and others, all written in PHP backed by MySQL.

I started the CRM system I am developing on MySQL before realizing it was
the wrong choice. Part of it is simple because people have heard of the
software and don't have the time/stamina/patience to do proper research into
the benefits of alternatives. There is also a learning curve when going
from MySQL to a more standards-compliant RDBMS like PostgreSQL. Heck, I
found that going from PostgreSQL to Firebird give me headaches :-P And
these RDBMS's have most of the same features!

> Whether this is right or wrong, good choice or bad choice is not what
> I'm interested in debating. The point is that when these systems where
> architected, the developers used MySQL not because they were dumb, but
> because many of them develop awesome code and can get around most
> problems in the code, with a little ingenuity. Many simply do not have
> the insight into the potential benefits of *proper* RDBMS can offer.

I would actually venture to say that many of them are using the RDBMS as a
sort of object-persistance store, and not really trying to use the
*relational* features of the software. They might as well be using Berkeley
DB 4. I know that is how I started with MySQL.

What most of these programmers do not understand is that an RDBMS is not
simply a search-engine for stored persistant objects, but is actually a
fully-featured information storage management system. With the right
features, this information can be stored, queried, presented in another
form, etc. all while ensuring that the stored information is EXACTLY what
was intended. The tasks that the RDBMS handles include data storage,
integrity enforcement, and data presentation. Most MySQL programmers only
use it for data storage. Sadly, this is about all MySQL is good for, and
hence the barrier to learning how to USE a REAL RDBMS are a bit higher
because of the prevalence of the likes of MySQL and MS Access.

> The second scenario, is with admin systems, written by people like
> myself for companies, whether they be simple or complex systems, that
> are intended as a temporary work around to an immediate problem.
> In a very short space of time the stop-gap application you had written
> to sort out the immediate problem quickly becomes a core business
> application (I recently returned to a site after not being there for two
> years and the temporary address book/ email system that I knocked up in
> an afternoon was not only still being used, but now relied upon heavily).

But again, if you start with the right tools, it is easier to modify later
to adapt to changing needs. I think that this is one of the messages we
should be presenting. With updateable views, different applications can
even have access to different presentations of the data.

> So on to my point, MySQL guys will happily say "Hey, we're not saying
> that the features MySQL is missing aren't important, and we're working
> towards them, but in the meantime these issues can be worked around like
> this....." and happily play the whole thing down. Many LAMP developers
> aren't aware of the benefits of stored procedures, of triggers and other
> good stuff. Like myself, if they were aware how much easier life could
> be if these things were accessible to them, they'd probably be converts
too.

Agreed completely. Now we just have to sell the PostgreSQL solution. Here
is what the MySQL people will say (and we need good evidence to counter):
1: MySQL is faster.
2: MySQL has more community support.
3: MySQL has replication as part of its core distribution. MySQL's
replication is better tested...

> There is not enough emphasis put on the basic importance of these
> functions in PG. Someone needs to standup and say "Hey, look how this
> can simplify your programming lives" until I started using
> Druid/Postgres, I had no idea why I needed triggers or what a cascade
> effect did, or why I might want one.

The basic issue is that many programmers are not taught to value information
management systems, such as RDBMS's. These programmers are interested only
in the data storage issues of the database, and not on how to use it to
manage the information stored therein. Changing this may take a lot of
effort. Also, using an RDBMS to its full extent rubs some OO programmers
the wrong way because it strikes them as violating rules of OO design. Of
course, then why not use an OO database? ;-)

> The Linux community has grown at least in part because it has
> educated potential users and journo's to its benefits. I believe if
> the PG advocacy team did the same, then it would attract many more
> serious LAMP developers.

I agree. But it will take some time to sell, and will require some
extremely basic work to get going.

> I'd gladly help out with such a paper, but find myself in the sad
> position of my prose being open to attack due to my newbieness in the DB
> world and not able to speak authoratatively on the subject.

I would be happy to help out as well. How about a paper entitled:
Why PostgreSQL?
It would cover why PostgreSQL is ideal for serious application development.

Note, I have only about 2 years experience with PostgreSQL, and so I would
feel more comfortable with lengthy peer review of whatever I write, but I am
reasonably familiar with both worlds, and can make a strong case for
PostgreSQL, I think.

> Have a think, I'd like to know if others agree.
>
> Cheers
>
> T.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
>


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Tony

unread,
Nov 30, 2003, 5:38:31 AM11/30/03
to
Comments within:


Chris Travers wrote:
Tony <to...@unihost.net> Wrote:
  
 Now many
consultant/developer/sys-admins like myself are going to client site on
a contract (this is especially true in the UK, I can't speak for
anywhere else) and finding complex stocktrading systems, inventory
systems, CRM systems, and others, all written in PHP backed by MySQL.
    
I started the CRM system I am developing on MySQL before realizing it was
the wrong choice.  Part of it is simple because people have heard of the
software and don't have the time/stamina/patience to do proper research into
the benefits of alternatives.  There is also a learning curve when going
from MySQL to a more standards-compliant RDBMS like PostgreSQL.  Heck, I
found that going from PostgreSQL to Firebird give me headaches :-P  And
these RDBMS's have most of the same features!

  
It's the learning curve part that I'm finding difficult, not because it's a too complicated, but because I can't find a good source of information to learn from.  I'm sure I'll get flamed for this, but I seem to be unable to find information on proper design principle, including where and when to use triggers, stored procs, etc, etc, that isn't 20 years old already.  In the liquid world of IT, I find it worrying (perhaps incorrectly) learning from a book written 14 years ago. 


  
Whether this is right or wrong, good choice or bad choice is not what
I'm interested in debating.   The point is that when these systems where
architected, the developers used MySQL not because they were dumb, but
because many of them develop awesome code and can get around most
problems in the code, with a little ingenuity.  Many simply do not have
the insight into the potential benefits of *proper* RDBMS can offer.
    
I would actually venture to say that many of them are using the RDBMS as a
sort of object-persistance store, and not really trying to use the
*relational* features of the software.  They might as well be using Berkeley
DB 4.  I know that is how I started with MySQL.

  

Agreed...

What most of these programmers do not understand is that an RDBMS is not
simply a search-engine for stored persistant objects, but is actually a
fully-featured information storage management system.  With the right
features, this information can be stored, queried, presented in another
form, etc. all while ensuring that the stored information is EXACTLY what
was intended. The tasks that the RDBMS handles include data storage,
integrity enforcement, and data presentation.  Most MySQL programmers only
use it for data storage.  Sadly, this is about all MySQL is good for, and
hence the barrier to learning how to USE a REAL RDBMS are a bit higher
because of the prevalence of the likes of MySQL and MS Access.
  

Indeed, and I believe that the lack of education with regards to this (or even available information presented at the right level) perpetuates the issue, along with the mis-information put forward by MySQL that these aspects aren't really all that important anyway.
1.  Let's do apples to apples, NOT apples to Oranges as has been done many times in the past.   It would be far more useful IMHO to put forward a "real world" scenario (rather than purely transactions per second) which shows 2 examples of the same application, the first MySQL written with much of the logic in the code, PHP, Perl, whatever. the second would be written in the same language, with the appropriate use of the true RDBMS features in full effect.  Then let's comapare results.  I'm sure that the more we can usefully abstract to the DB the more performace you'll see from PG, especially as load increases.  The big issue I've found is not raw speed, if half of the logic is in regexps and algorithms in the scripting, this will be many times slower to run than the same en DB.  Although correct me if I'm way off the mark here.  Also the point needs to be made that, it is far more trivial to write other interfaces to the program/application.  Who wouldn't enjoy the ability to work on their CRM with a Python or Java frontend, without the need to write 75% of the logic in a different language.

  
There is not enough emphasis put on the basic importance of these
functions in PG.  Someone needs to standup and say "Hey, look how this
can simplify your programming lives"  until I started using
Druid/Postgres, I had no idea why I needed triggers or what a cascade
effect did, or why I might want one.
    
The basic issue is that many programmers are not taught to value information
management systems, such as RDBMS's.  These programmers are interested only
in the data storage issues of the database, and not on how to use it to
manage the information stored therein. Changing this may take a lot of
effort.  Also, using an RDBMS to its full extent rubs some OO programmers
the wrong way because it strikes them as violating rules of OO design.  Of
course, then why not use an OO database? ;-)
  

In some respects adding much of the logic to the DB doesn't fly in the face of OO, since you are providing an encapsulated data interface, only needing to change the programatic code in one place
(the DB) for potentially high returns in many places.  Hence taking OO one step further than the language and abstracting one level further.

  
The Linux  community has grown  at least in part because it has
educated  potential users and journo's to its benefits.  I believe if
the PG advocacy team did the same, then it would attract many more
serious LAMP developers.
    
I agree.  But it will take some time to sell, and will require some
extremely basic work to get going.

  
I'd gladly help out with such a paper, but find myself in the sad
position of my prose being open to attack due to my newbieness in the DB
world and not able to speak authoratatively on the subject.
    
I would be happy to help out as well.  How about a paper entitled:
Why PostgreSQL?
It would cover why PostgreSQL is ideal for serious application development.
  

What about "PostgreSQL - Ideal for any application development"

Note, I have only about 2 years experience with PostgreSQL, and so I would
feel more comfortable with lengthy peer review of whatever I write, but I am
reasonably familiar with both worlds, and can make a strong case for
PostgreSQL, I think.
  

I have only 2 months and have oodles of enthusiasm, but short on PG knowledge, and even shorter on real RDBMS basics and experience.  But I'll be happy to proof read as a novice. 
  
  

Chris Travers

unread,
Nov 30, 2003, 7:49:04 AM11/30/03
to
Re: [GENERAL] PostgreSQL Advocacy, Thoughts and CommentsRegarding the
learning curve issue, maybe people can recommend their favorite books. I
recommend "SQL Unleashed" (I forget the author), pub. Samms. "SQL For
Smarties" also gets recommended often around here, but again, I don't know
the author (or in this case, even the publisher). Maybe there are others
too that people can recommend.

As for the paper-- I think your title is good ("PostgreSQL - Ideal for any
application development"), and I will have to look at how to organize it. I
wanted to cover the following topics:
1) Different types of databases, and what is meant by Object Relational
2) Enterprise-ready features (Views, Stored Proceedures, Subselects, etc.)
3) RDBMS tasks and how these features fit in.

Perhaps a followup paper could be written as a basic treatise in database
design.

The other issues I am seing here involve finding a suitable venue for
publication. Any suggestions here are welcome also.

Best Wishes,
Chris Travers

Chris Travers

unread,
Nov 30, 2003, 7:59:09 AM11/30/03
to
Hi;

The MySQL manual states that Joins perform better than subselects. The
stated reason is that the planner can better optimize a join.

I am, however, very skeptical of this, and would generally assume things to
be the other way around.

How do subselects perform in relation to hash joins, etc?

Best Wishes,
Chris Travers


---------------------------(end of broadcast)---------------------------

Martijn van Oosterhout

unread,
Nov 30, 2003, 9:12:29 AM11/30/03
to
Hehe, it's kinda funny when people say things like that.

It depends on what you call a "join". From a certain point of view, an IN
expression is a join too, making subselects a slightly unusual join.

In some ways I think a subselect is the SQL way way of expressing a type of
join that cannot be expressed in the normal way and that with a different
query language, the concept may go away entirely.

I do know that in recent versions of postgres, certain types of subselects
are actually optimised into specific types of joins. It's all just
relational algebra people!

So the moral is, MySQL needs a better query optimiser?

Hope this helps,

--
Martijn van Oosterhout <kle...@svana.org> http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Tom Lane

unread,
Nov 30, 2003, 1:03:44 PM11/30/03
to
"Chris Travers" <ch...@travelamericas.com> writes:
> The MySQL manual states that Joins perform better than subselects.

Very possibly true ... in MySQL. Since they have such an immature
subselect implementation (not even out of alpha apparently), it'd
not be surprising if they can't optimize subselects worth a damn yet.
Our planner has been hacked on repeatedly to do a good job with
subselects --- and I wouldn't want to imply that the process is done
yet.

One of the more amusing aspects of watching MySQL's response to the
"feature race" is how they invariably gloss over the difference between
having a minimal implementation of a feature, and having a feature that
is mature, complete, and efficient. Subselects are one example where
there's a lot of mileage yet to cover after you get to the point where
you can say "it works".

regards, tom lane

Oleg Bartunov

unread,
Nov 30, 2003, 1:27:11 PM11/30/03
to
Randal,

we have contrib/tsearch2 which could be used for full text searching
and perl wrapper illustrating simple search engine would be quite
useful.

Oleg

On Sat, 29 Nov 2003, Randal L. Schwartz wrote:

> >>>>> "Unihost" == Unihost Web Hosting <to...@unihost.net> writes:
>
> Unihost> Further to this post, what might actually work is to convince O'
> Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like
> Unihost> they have for PG, but making full use of the PG database. For
> Unihost> instance, building a simple data-warehouse using PG. Articles that
> Unihost> show off an OSS product/project in a clearly enterprise light in a
> Unihost> step-by-step fashion. There have been so many articles on DB design
> Unihost> using MySQL. How about an article on DB design using all the
> Unihost> functionality of a real ORDBMS.
>
> Well, since I need 2.5 ideas per month for the three columns I'm still
> writing, I'm certainly in a position to write nice things about PG,
> although I always have to work it in from a Perl slant.
>
> Actually, I'm sure that any of the magazines I'm in would appreciate
> an additional article or two from me.
>
> If you can think of something that fits in 2000 words or so (or 4000
> if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
> some inspiration.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Rick Morris

unread,
Nov 30, 2003, 7:41:47 PM11/30/03
to
Thanks, Tony

It's good to hear that my article hasn't fallen on deaf ears (er...
rather... blind eyes). I wrote this article as a result of reading many
messages in the pgsql mail archives wishing for such a thing. I also
intended to write another article (actually, a series of articles) to be
distributed freely for online use, and not focusing only on PHP. I am
still working on this, but a recent job change and an urgent project
have slowed me down a little. I plan to have at least a couple of these
ready before the new year, when I will post links to them here
(actually, in the pgsql-advocacy), and will welcome any reviews,
criticism, etc... I will also be more than happy to participate in any
way (reviews, proofreading, etc...) for any others writing articles.
Count me in, as a (more than) happy PostgreSQL user.

And yes, I think you hit the nail on the head here: often those who are
very familiar with PostgreSQL don't realize just how little
understanding many developers have of its potential. Since I was, until
2-3 years ago, an 'outsider' to many of the serious concepts of the
relational model, I can empathize with them somewhat, and it has helped
me present the case.

I was recently made moderator of the Database section at Dev Shed
Forums, where I have hopefully had a hand in explaining PostgreSQL to
more than a few developers. Based on what I am seeing, I sincerely
believe PostgreSQL is going to get LOT more recognition over the next
couple of years. This is especially the case if a few of us start
presenting *positive explanatory material*, in addition to the
comparisons to other DBMSs. The sad fact is, most developers need to be
educated in the basics, just to be able to understand what the benefits are.

Regards,

Rick Morris

Tony wrote:

> Sadly the PHP Architect article is not free, I bought the electronic
> magazine for about $2, but believe it's worth every penny and more.
>
> Just my 2 cents.
>
> Apologies if the PG articles ARE out there and please notice that my
> comments do not say that they don't exist, but that I have never seen
> them.
>
> Cheers
>
> Tony.


>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if
> your

> joining column's datatypes do not match

Rick Morris

unread,
Nov 30, 2003, 8:59:19 PM11/30/03
to
Tony wrote:

> HI All,
>
> I'm glad that this thread prompted some thoughtful response. I think
> one of my main points I was trying to make, Jason hit the nail on the
> head. The article to which I was referring uses a great example which
> I have experienced many times before, but in order to grasp this, PHP
> et al, must be thought of as a scripting language which crosses many
> corporate boundries, and it is easy to assume that it's primary use
> (simple web site back ends) are the only thing to discuss. But the
> situation has changed enourmously since the release of PHP v4. Now
> many consultant/developer/sys-admins like myself are going to client
> site on a contract (this is especially true in the UK, I can't speak
> for anywhere else) and finding complex stocktrading systems, inventory
> systems, CRM systems, and others, all written in PHP backed by MySQL.

So true! I am in the U.S (Florida), and I am seeing the same thing here.
Starting around 2000, many fairly complex, mission-critical PHP/MySQL
apps were developed, which are just beginning to surface. We all know
how prevalent PHP and MySQL became overnight, but how many of us realize
that it was not just used for 'lightweight' applications?. Imagine how
big a problem all these PHP/MySQL applications are going to become over
the next few years. I have had the dubious pleasure of moving a few of
these from MySQL to PostgreSQL already (Yes, financial systems using
MySQL's unconstrained numeric types!!), and I shudder to think about all
the companies that might end up with *years* of poorly-constrained data.

> Whether this is right or wrong, good choice or bad choice is not what
> I'm interested in debating. The point is that when these systems
> where architected, the developers used MySQL not because they were
> dumb, but because many of them develop awesome code and can get around
> most problems in the code, with a little ingenuity. Many simply do
> not have the insight into the potential benefits of *proper* RDBMS can
> offer. Had they had the benefit of such knowledge the code they have
> written would be faster (in DB) and more legible. Sadly often the
> developers are the only source of DBA for some of these companies.

Most medium/small business managers don't even know there is a
difference between the two.

<snip>

> Like Linux vs. Windows, PG has an awful lot going for it in respect to
> MySQL, so why not crow about it. It needs to be pointed at a crowd
> that are DB novices, they need to be told why PG is worth the
> time/knowledge investment, because anyone who reads the MySQL site,
> will come away with the impression that the Trigger, Stored Procs, and
> other things are a luxurious overhead not necessary for getting the
> job done.
>
> I'd gladly help out with such a paper, but find myself in the sad
> position of my prose being open to attack due to my newbieness in the
> DB world and not able to speak authoratatively on the subject.

You're not doing too badly, really. Your writing is good and clear, and
your knowlege is well above the typical corporate IT magazine hack ;-).

Regards,

Rick Morris

Brendan Jurd

unread,
Dec 1, 2003, 12:22:07 AM12/1/03
to
Hello,

I'd just like to add that this thread is perfectly aligned with my own
experiences on the DB front.

When I started out as a developer, I was familiar with the concepts of
good database design, but *not* with the range of DBMSs available, and
their respective advantages and disadvantages. I started using MySQL
because it was popular, fast, free and (sort of) easy to set up. Then
my major project, a payroll management system, previously PHP/MSSQL,
needed a new environment established very quickly. I used MySQL because
time was short and it was what I knew. To get around the lack of proper
foreign key constraints I did crazy things like manually maintaining a
table of "foreign keys" and enforcing them from PHP.

When the client on that project hired a new sysadmin, the sa took one
look at the system and said "Uhh, dude. Do you know anything about
postgres?"

I'd heard of postgres but didn't really know anything about it. After
having the sysadmin tell me about postgres' capabilities, and checking
out the manual for myself, I realised that I'd been barking up the wrong
proverbial tree for months. So we migrated the system to postgres
earlier this year, and I couldn't be more pleased with it. The
immediate payoff of having referential integrity enforced properly was
wonderful, but the advantages just kept on coming. I started using
postgres functions and views to shift more of the work of "organising
data" over to the piece of software that *should* be handling it. Some
of those views enabled me to chop vast tracts of superfluous PHP code
out of the system. The more I used postgres, the more I came to
understand -- this is the way it ought to be.

BJ

Rick Morris wrote:

---------------------------(end of broadcast)---------------------------

Robert Treat

unread,
Dec 1, 2003, 1:31:38 AM12/1/03
to

Here's one of the more famous articles of this type which documents the switch
from mysql to postgresql by the folks at sourcforge. (and we didnt even have
to give them thousands of $$$ in development and product support like IBM...
:-)

http://www.phpbuilder.com/columns/tim20001112.php3

You'll note in one test postgresql performs at 6x mysql... and this was with
version 7.1.


Another interesting note is the mention that interbase was looked at and then
passed over due, in part, to "serious community and corporate issues around
it's Open Sourceness".

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---------------------------(end of broadcast)---------------------------

Scott Ribe

unread,
Dec 1, 2003, 2:28:48 PM12/1/03
to
> I would be happy to help out as well. How about a paper entitled:
> Why PostgreSQL?
> It would cover why PostgreSQL is ideal for serious application development.

Good idea.

> Note, I have only about 2 years experience with PostgreSQL, and so I would
> feel more comfortable with lengthy peer review of whatever I write, but I am
> reasonably familiar with both worlds, and can make a strong case for
> PostgreSQL, I think.

Same here...


--
Scott Ribe
scott...@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Jan Wieck

unread,
Dec 1, 2003, 4:09:39 PM12/1/03
to
Christopher Browne wrote:

The stuff is in CVS HEAD. Randolf, look at the README file in
src/backend/storage/buffer for some explanations.


Jan

>
> You'd find your applications running faster simply by installing a 7.5
> server; no need to configure anything. It's like getting Pentium chip
> with improved execution strategies; you don't have to recompile
> anything (the way IA-64 mandates it); you just install the app on the
> new box and watch it speed up.


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanW...@Yahoo.com #

Rick Gigger

unread,
Dec 1, 2003, 7:45:02 PM12/1/03
to
Here is a link to the sql for smarties book:

http://www.amazon.com/exec/obidos/tg/detail/-/1558603239/102-3995931-7263333?v=glance

by Joe Celko

Has some cool ways of handling trees in sql

Randolf Richardson

unread,
Dec 2, 2003, 1:05:13 PM12/2/03
to
[sNip]

> I was recently made moderator of the Database section at Dev Shed
> Forums, where I have hopefully had a hand in explaining PostgreSQL to
> more than a few developers. Based on what I am seeing, I sincerely
> believe PostgreSQL is going to get LOT more recognition over the next
> couple of years. This is especially the case if a few of us start
> presenting *positive explanatory material*, in addition to the
> comparisons to other DBMSs. The sad fact is, most developers need to be
> educated in the basics, just to be able to understand what the benefits
> are.
[sNip]

What will also be helpful is for people to see working systems in
action that use PostgreSQL in the backend. I've got a few projects that are
in the works, but nothing that's ready for public consumption yet (but once I
do, I will certainly let the Advocacy group know about it and will be
included a link from a section of the web site to both the PostgreSQL home
page and the Advocacy page).

Chris Travers

unread,
Dec 3, 2003, 10:19:02 AM12/3/03
to

Here is a paper I have written for the purposes of providing some additional educational material for the MySQL crowd.  Permission is granted for it to be redistributed via the mailing list and archives, but not to be posted elsewhere until I have a final version completed.

I am interested in peer review regarding the concepts in this paper, etc.  Any feedback?

 

Choosing a Database Manager

A Study of Open Source Relational Database Management Systems

By Chris Travers

 

Introduction

There sometimes comes a point of epiphany, dreaded by programmers, when it becomes clear that at least one of the components of an application is completely inadequate for the task at hand, and thus limits progress or usefulness of the program in some fundamental way.  All too often this is the relational database management system (RDBMS).

 

Choosing the proper relational database management system can make or break an application.  The choice of a primary RDBMS is thus very important for the long-term viability of an application.  While this paper focuses on open source RDBMS’s, the same methodology can be applied towards commercial systems as well, such as Oracle, MS SQL, and DB2. Also, the concept of application specific database managers will be discussed briefly in order to provide some degree of contrast to the relational paradigm.

Types of Database Managers

Application Specific Database Managers

Application specific database managers include the GDBM (GNU Database Manager), and the Berkeley Database Manager (available, open source or commercially licensed from Sleepycat software).  Basically, such a database provides a simple persistence layer for database objects.  They do not allow for complex querying, and are often used in single-purpose applications where the data does not need to be accessed by any outside application.

 

If two applications share the same database using these database managers, they must be intimately familiar with the internal data structures that are stored in the database manager.  No presentation is given to the data aside from how it is represented in the database.

Relational Database Managers

In contrast to the application specific database managers, relational database managers present an abstraction layer between the information in the database and the application using that information.  The information can then be subject to much more complex queries, and the information can be represented to the application in whatever form makes sense to the application.  Usually, SQL is the language used for such queries.

 

In a relational database manager, data is stored in “tables” and tables can be joined together based on common criteria.  In this way, several different applications can access the same information in different ways.

What does “Object Relational” mean?

PostgreSQL is “Object Relational” which means that although it is a relational database management system, the database “objects” have at least some of the traits of objects in object-oriented programming.  This tables can be inherited, though this does not apply to user defined data types, etc.  With an Object Relational system, purely relational databases can be designed and used, but also inheritance can be used in various ways.  Use of such features is beyond the scope of this paper, however.

Features Important to Enterprise Applications

Subselects

Subselects are important in large applications for two reasons.  First they allow performance tuning for certain types of queries above and beyond what a planner can normally do.  Very often subselects will outperform inner joins because the planner can focus on a simpler pattern with a maximum of one table scan per subselect and a smaller memory footprint than other forms of single-scan joins.

 

Secondly, when compared to inner joins, subselects often make for more readable queries, especially where several tables are involved.  Greater readability makes for easier maintenance of the application.

Views

A view is a stored query that appears to be a table from the application’s perspective.  Views can be used to create alternative representations of the data for other applications or for reporting purposes.  In many database managers, views can be updateable, and sometimes data can even be inserted or deleted from the view.  Thus views can be used to create logical tables that present data in a way that is less normalized than the actual structure of the information in the tables themselves.

 

Views can also be used for partitioning tables for permissions purposes-- vertically (groups of rows), horizontally (groups of columns), or both.  They can also be used to create complex virtual tables for reporting or online analytic processing, or OLAP.  In this way, views can help manage information and also help make sense of complex patterns as well.

Stored Procedures

Stored Procedures, also known as User Defined Functions, allow for a greater quantity of business logic to be stored in the database.  These functions can allow for the development of more complex views than would otherwise be possible, and can also allow for the development of a database-level API that can then be accessed by any application that has access to the database.  In this way, they can reduce development time by allowing the development of “modules” which are common to all clients of the database.

 

Stored procedures also reduce network latency issues by reducing the number of communications between the client and server.  Whole sets of queries can be run before any result set needs to be sent to the client.  In this way, performance can be improved, given sufficient hardware on the side of the database server.

Triggers

Triggers are a mechanism for automatic maintenance and management of information in the database.  Triggers can be activated when information is modified in any table. Triggers are often used to enforce referential integrity (ensuring that records in one table reference records in another table that actually exist).

 

One example of another use of a trigger is in maintaining some other record of the change made to the information of a database.  This information can later be used for reporting purposes to ensure that a consistent picture of the database can be constructed for any arbitrary point in time.  For example, I could store historical information about prices for items from my online store, and then do historical processing to see how sales were affected by price changes at a later date since all historical information would be maintained in the database.

Database Tasks: How These Features Help

Data Integrity Enforcement

For data to be useful in a large business program, it must be meaningful.  In many cases, the largest threat to the ability to make the data meaningful is the so-called “orphaned record” problem.  Orphaned records refer to records in other tables that, from the point of view of the application, no longer exist.  This can happen when records are deleted or their primary keys are updated.  Triggers can be used to ensure that this problem never exists by doing one of several things:  cascading updates/deletes, restricting updates/deletes when records reference the column being updated or row being deleted, or setting the foreign key in the referencing table to NULL (which means, in database terms, “unknown”).

 

Triggers can also restrict the types of information being inserted or updated in the database, ensuring that only meaningful input is stored.  They can also be used to maintain historical information, allowing a consistent picture of all tables to be built for any point in time.

Data Presentation

Assume for a moment that your database has several front-ends that access the information in different ways.  Perhaps this is not the case now, but it could be in the future.  It is extremely helpful if the database manager can present the information to the programs in ways that would be natural for them.  This effectively negates the requirement that every application accessing the database must have in-depth knowledge of the database schema.  Instead, the application’s expected database schema can be presented to it.

 

A very useful side effect of this capability is in integrating applications that may or may not have been designed to be integrated.  In this way, larger, more unified databases can be built for use of internal applications. 

 

Views and, depending on complexity, stored procedures would be needed to implement this sort of system, which could also provide advanced reporting capabilities, also based on views and stored procedures.

Common Modules As Stored Procedures

When several programs access the same database, there may be a reasonable amount of common functionality in the functions used by the programs.  These can be simplified using views to some extent, but often this is simply not enough.  Using stored procedures, the functions that two or more front-ends have in common can be moved to the database backend, so that there is one single point of maintenance and reference.  This technique can reduce development and maintenance time and is essential for any database being used by multiple applications.

Questions to Ask

Where will my application be deployed?

The choice of a database manager must, no doubt, take into account availability of the solutions at the point of deployment.  Of course, applications run at the business’s site can use whatever solutions are found to be affordable and/or best suited to the task at hand.  Hosted web applications are more at the mercy of what is available, however.

How necessary is integration with other programs?

Integration capability is always a selling point, as it allows for more powerful solutions to be built in pieces and lowers overall system maintenance.  However, the ability to integrate with other solutions may come at a cost, especially regarding availability of the chosen database manager in a shared hosting environment.

Major Open Source Relational Database Managers

MySQL

MySQL is the most common open source RDBMS among low-cost web hosting providers.  As of version 4.0, it does not support subselects, views, stored procedures, or triggers.  Additionally, its enforcement of data integrity is questionable at best, and it is sometimes known to truncate numeric or text values when they are too large to fit.  It is also sometimes possible to insert invalid timestamps into the database.  This can cause extreme problems for any application that relies on the information in the database being exactly what was entered (for example, accounting applications).

 

Even when views are supported in MySQL, it may be some time before these views can be used for insert, update, or delete queries.  A similar period of maturation should be expected for other features such as stored procedures and triggers.

 

Despite its limitations, MySQL’s ubiquity and the availability of a Windows port makes it an attractive database platform.  For internal business tools, however, there are better choices.

Firebird

Firebird is based on a source-code release of Borland’s Interbase RDBMS.  It is mature, and supports views, subselects, triggers, and stored procedures. As of version 1.0, it has a native Windows port available, making it ideal for in-house applications that must support both Windows and UNIX operating systems.  However, I found the learning curve to be steeper than other RDBMS’s particularly regarding the development of stored procedures.

PostgreSQL

PostgreSQL is an object-relational database management system that features every feature listed here, and extensible type system, and the ability to write stored procedures in a variety of different languages.  Indeed, there are four stored procedure languages (Perl, Tcl, SQL, and PL/PGSQL) distributed with PostgreSQL and many more available from third-party sites.

 

Although the learning curve is steeper than that of MySQL, I found it easier than Firebird.

 

As of version 7.4, PostgreSQL does not yet have a native Windows port available, making it unsuitable at the moment for applications that must be able to run on Windows database servers.

 

Although PostgreSQL has not achieved the ubiquity among shared hosting providers that MySQL has, there are many providers offering hosting at a variety of costs.  You can find a list of them at http://techdocs.postgresql.org/hosting.php.

Conclusion

Selecting the right database manager is crucial to the long-term viability of any application.  Care should be taken to ensure that the database manager chosen will continue to meet the needs of the application in the future as well as in the present.

Alex Satrapa

unread,
Dec 4, 2003, 2:33:37 AM12/4/03
to
Chris Travers wrote:
>
> Here is a paper I have written for the purposes of providing some
> additional educational material for the MySQL crowd.

Here's my contribution:


Why I choose PostgreSQL (PostgreSQL in 21 Seconds)

I choose referential integrity, meaning my lookups always work.

I choose stored procedures, meaning all my developers - Windows or Unix,
Perl, C++ or Java - can access the database in the same way, using the
same locking, with the same checking and cleaning

I choose subselects and outer joins, which allow me to build complex
queries to get exactly the information I want from the database, rather
than wasting my time munging data in my code. Even better, I can put
those common queries into stored procedures, so other developers can get
the same results as I do!

I choose partial indexes, so lookups on NULL fields are just as fast if
not faster.

I choose a user community that believes getting the results right is
more important than getting them quickly.

I choose getting the right results, right now!

I choose funny capitalisation, and a name that can't be pronounced!

I choose PostgreSQL.

Rory Campbell-Lange

unread,
Dec 15, 2003, 12:30:07 PM12/15/03
to
On 29/11/03, Randal L. Schwartz (mer...@stonehenge.com) wrote:
> Well, since I need 2.5 ideas per month for the three columns I'm still
> writing, I'm certainly in a position to write nice things about PG,
> although I always have to work it in from a Perl slant.
>
> Actually, I'm sure that any of the magazines I'm in would appreciate
> an additional article or two from me.
>
> If you can think of something that fits in 2000 words or so (or 4000
> if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
> some inspiration.

Hi Randal

I think I may have an idea for an article which would address a common
problem for people writing database client interfaces:

The problem is simply explained.

Problem title:

The "page of pages" problem (!)

The problem:

You want to return a subset of a large number items using some
fairly complex search criteria. You want to make only one database
call, benefit from a cached query, and don't want to have all the
rows in memory. How do you get the total count of pages for the
relevant search criteria?

Why is this relevant?

Moving logic that is inherent to the database to the database
provides a potentially rich yet simple interface to database
queries that can benefit a number of client applications.

Typically this sort of query would be written as at least two
dynamically generated queries in the client program that has to be
parsed by the backend before it is executed. By using functions we
can hide complex joins behind simple field names, and provide
flexible (if limited) search capabilites, as well as caching and
sensible error messages.

Approach:

Using Postgres one can construct a function and then do either

SELECT * from function fn_explore($searchstring, $limit, $offset);
OR
SELECT
*
FROM
function fn_explore()
WHERE
searchterm ~* 'test'
LIMIT
5
OFFSET
10;

What is cool about the second format is that (if the function
returned a type 'explore_result' as below), your PHP/Perl programmer
can at their interface do something like

'... where id <> 1 AND author IN ('james', 'bill')...'

However I don't know how you get back the total rows in this case,
also maybe the caching effects are minimised?

--------------------------------------------------------------------
Type definition:

CREATE TYPE explore_result as (
id INTEGER, -- some sort of row id
total INTEGER, -- total rows for query
author VARCHAR,
image BYTEA
/*
Not needed unless search is done outside db.
, searchterm VARCHAR
*/
);

--------------------------------------------------------------------
Sketch function definition:

CREATE OR REPLACE FUNCTION
fn_explore (integer, integer, integer) RETURNS setof explore_result
AS '
DECLARE
searchstring ALIAS for $1;
offsetter ALIAS for $2;
limiter ALIAS for $3;
resulter explore_page%rowtype;
BEGIN

/*
variable verifation section chopped
*/

FOR resulter IN
SELECT
n_id as id,
LOJ.pagetotal as total
pers.t_name as author,
image.b_contents as image
/*
need searchterm returned if we are doing search outside
the database
, COALESCE(t_title || '' '' || t_text, '''') as searchterm

FROM
db
/*
-> self join on db LOJ for unoffset, unlimited row count
refer to searchterm stuff below
*/
WHERE
/* note, if we are doing a search outside of the
* function and t_title or t_text could be empty then we
* need to coalesce to an empty string
* COALESCE(t_title || '' '' || t_text, '''') as searchterm
*/
searchstring ~ t_title || '' '' || t_text
ORDER BY
dt_modified DESC
LIMIT
limiter
OFFSET
offsetter
,

LOOP

RETURN NEXT
resulter;

END LOOP;

RETURN;
END;'
LANGUAGE plpgsql;

--
Rory Campbell-Lange
<ro...@campbell-lange.net>
<www.campbell-lange.net>

0 new messages