Mergesort: why's it efficient?
Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: a...@research.att.com (Andrew Koenig)
Date: 1996/12/05
Subject: Re: Mergesort: why's it efficient?

In article <ARE.96Dec4135...@laphroig.mch.sni.de> a...@laphroig.mch.sni.de writes:
> If you limit N to below a fixed value, then it doesn't make any sense
> to speak about O notation, because that makes sense only in the limit:
> to say f(n) = O(g(n)) means lim f(n)/g(n) < c for some constant c.

Actually it does, because you don't necessarily know the value in advance.
--
--Andrew Koenig
a...@research.att.com

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: mik...@ix.netcom.com (Mike Rubenstein)
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

No.  This shows a misunderstanding of what O(f(n)) means.

For one thing, O(f(n)) means that the time (or other quantity of
interest) is smaller than A*f(n) for some constant A.  It does not
imply that it is not much smaller.

Thus heapsort is O(n!) and bubble sort is O(n^2).  This does not mean
that bubble sort is better than heapsort.

More to the point, O(f(n)) means that if n is sufficiently large the
quantity is smaller than A*f(n).  But performance may be quite good if
n is not sufficiently large.  Consider two sort algorithms:

1.  Use heapsort if the number of records is less than 10^20
and bubble sort if it is larger.

2.  Use bubble sort if the number of records is less than
10^20 and heapsort if it is larger.

The first is O(n^2) time and the second is O(n log n).  Which would
you choose?

Finally, even if the proportionality is exact, the constant must be
considered.  Suppose we have two sort algorithms one in which the time
is 10^-30 * n^2 hours and the second in which the time is 100 n log n
hours.  The first is O(n^2) and the second O(n log n).  Which is
better for practical sized problems?

Michael M Rubenstein

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: Erik Naggum <nob...@naggum.no>
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

* Dann Corbit
| If the problem is tiny, it may not matter what sort of algorithm, since
| all will finish in a blink.  If the problem is huge, then it matters
| enormously, because for some problem size, O(N*log(N)) will always be
| better than O(n^2).

it appears that many people don't understand the O notation.  O indicates
complexity, not execution speed.  the execution speed depends on a constant
factor that is insignificant compared to the variable factor as it grows
without bound.  a function of complexity O(n log n) may have a constant
factor (k) that makes execution time (or other resources) _higher_ than a
function of complexity O(n^2) with a much smaller constant (c) as long as
the relation

k       n
--- >> -----
c     log n

holds.  however, your statement that O(n log n) will _always_ be better
than O(n^2) does not hold, precisely because of the possibility of the
above relation.

since Andrew Koenig surprised many by limiting n and then arguing how
insignificant log n is compared to n, as if this is anything but obvious,
others have followed in his muddled footsteps and also conveniently forget
the constant factor or the actual meaning of the O notation.  I'm frankly
amazed that this is possible for even half-educated computer scientists.

the O notation does _not_ indicate execution time.  the O notation is the
_simplest_ function that yields an upper bound on the _relation_ between
the length of the input and the execution time (or space).

I recommend that Andrew Koenig and others who have been confused by him
read Alfred V. Aho and Jeffrey D. Ullman's excellent book "Foundations of
Computer Science" (Computer Science Press, 1992; ISBN 0-7167-8233-2), which
has grown out of the course CS109 -- Introduction to Computer Science at
Stanford University.  the topic of chapter 3 is the running time of
programs.  section 3.10 specifically analyses merge sort.  it should be
instructive to more than the target audience of students, and any
programmer worth his salt _should_ be able to understand this chapter with
little effort.  in particular, their treatment of constants and of the
simplification involved in the O notation is fundamental knowledge and
_should_ not come as a surprise to anyone who has been following these
discussions.  _please_ take the time to look up this book.  (Dr. Aho and
Dr. Ullman have both worked at Bell Labs, which is why I chose this
textbook over many other candidates.  this will hopefully make it harder
for Andrew Koenig to ignore them, as he is wont to do for most everything
else shown to correct or contradict his inaccurate statements.)

#\Erik
--
stupid flames, courtesy copies, etc, should be sent to "nobody".

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: r...@viking.ruhr.com (Rolf Czedzak)
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

Dann Corbit  wrote:          <01bbe2fb\$c692d1c0\$c761e...@DCorbit.solutionsiq.com>

DC> Rolf Czedzak <r...@viking.ruhr.com> wrote in article
DC> <6MIHn7ir...@09.viking.ruhr.com>...
DC> [mega-snip]
DC> > AK> Again: O(N log N) is almost as good as O(N).  It is not almost
DC> > AK> as good as O(1), and is much better than O(N^2).
DC> >
DC> > Its uncomparable. O(n) might easily be outscored by O(N^2) for all
DC> > practical problems.
DC> False.

Hi Dann,

I wrote 'uncomparable' and 'might' + 'practical problems' in reply to
Andrew's statement about real-life problems (= upper limits for N)
and the comparability of O-classes, which in fact are classifiers of
asymptotic behaviour.
What could be wrong with this?

Rolf

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

In article <32b06793.15188...@nntp.ix.netcom.com>,
mik...@ix.netcom.com (Mike Rubenstein) wrote:

[references snipped]

> No.  This shows a misunderstanding of what O(f(n)) means.

[snip]

> Thus heapsort is O(n!) and bubble sort is O(n^2).  This does not mean
> that bubble sort is better than heapsort.

[snip]

I think most people here are using O when they really mean Theta.
Although this is a bit sloppy in terminology, in practice, I've found most
people do this, especially online, where it's not convenient to represent
a 'Theta'.

-Tim

--
---------------------------------------
Tim Fischer
Coda Music Technology

The following email address is mangled to prevent automated
unsolicited junk mail.  Replace the '_AT_' with an '@':

tfischer_AT_codamusic.com

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: Erik Naggum <nob...@naggum.no>
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

* Tim Fischer
| I think most people here are using O when they really mean Theta.

we mean Omega, sometimes called "big-oh".

textbooks on this topic abound.  get one.  read it.

#\Erik
--
stupid flames, courtesy copies, etc, should be sent to "nobody".

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: Graham Hughes <graham.hug...@resnet.ucsb.edu>
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

-----BEGIN PGP SIGNED MESSAGE-----

Erik Naggum <nob...@naggum.no> writes:
>we mean Omega, sometimes called "big-oh".

Not always.  When people call quicksort O(n log n), they really do mean
Theta.
- --
Graham Hughes (graham.hug...@resnet.ucsb.edu)
alt.PGPlike-key."gra...@A-abe.resnet.ucsb.edu".finger.look.examine
alt.homelike-page."http://A-abe.resnet.ucsb.edu/~graham/".search.browse.view
alt.silliness."http://www.astro.su.se/~robert/aanvvv.html".look.go.laugh

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQCVAwUBMqfniSqNPSINiVE5AQEVKgQAgwBptSneV/Fp2n50LUJ59L0ZdlUP5SPC
D8Ber9e4ITn0+rnGcs2BfpfHdM+ys1c6Ot//tsHqJcfWMwHT0eZgfapeIqdkB/NO
29SBfV0oJh8=
=y74a
-----END PGP SIGNATURE-----

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: "Dann Corbit" <dcor...@solutionsiq.com>
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

Mike Rubenstein <mik...@ix.netcom.com> wrote in article
<32b06793.15188...@nntp.ix.netcom.com>...
[snip]

> Thus heapsort is O(n!) and bubble sort is O(n^2).  This does not mean
> that bubble sort is better than heapsort.

In what sense is heapsort O(n!)?

[snip]

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: "Dann Corbit" <dcor...@solutionsiq.com>
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

Erik Naggum <nob...@naggum.no> wrote in article
<3058824781104...@naggum.no>...

> * Dann Corbit
> | If the problem is tiny, it may not matter what sort of algorithm, since
> | all will finish in a blink.  If the problem is huge, then it matters
> | enormously, because for some problem size, O(N*log(N)) will always be
> | better than O(n^2).
[snip]

> the O notation does _not_ indicate execution time.  the O notation is the
> _simplest_ function that yields an upper bound on the _relation_ between
> the length of the input and the execution time (or space).

But there is a relationship between big-O notation and execution time.
Notice that I prefaced my statement with "for some problem size", meaning
that given a large enough input set, an O(n) algorithm WILL outperform
(for the average case) an O(n^2) algorithm:

|                      *    /
|                      *  /
|                      */
| T                   /*
| I                 / *
| M               /  *
| E             /   *
|             /    *
|           /     *
|         /     *
|            *
+---------------------------------
f(0)     f(i)          f(k)

Input set size (n)

The asterisks are a parabolic curve y0 = a*x^2 + b.
The diagonal line is some linear equation y1 = m*x + b.
Early on, because of startup costs and whatever, the algorithm
may not clearly fit big-O behavior by examination.
But by the time our algorithm f() takes an input of size i,
from that point forward, the running time of the algorithm
will always lie below the graphs of the functions y0 and y1.
Further, at some input set size f(k), the graph of y0, being
parabolic, will always be above the graph of y1, being linear.
Of course, a specific case may run in linear time (or even 0
time) for the n^2 algorithm, but its average behavior is
more accurately described as being bounded by the parabola for
some constants a and b.

I also mentioned benchmarking as being necessary to make the
proper choice.  Nonetheless, there is clearly a relationship
between complexity and running time.  In the case of big-O
notation applied to average-case behavior (vs. worst-case as
an alternative), we can statistically predict how long the
routine will take to complete on average.  We cannot predict
how long a single run will take, except as a matter of
probability.
[snip]

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: mik...@ix.netcom.com (Mike Rubenstein)
Date: 1996/12/06
Subject: Re: Mergesort: why's it efficient?

a...@laphroig.mch.sni.de wrote:
> If you limit N to below a fixed value, then it doesn't make any sense
> to speak about O notation, because that makes sense only in the limit:
> to say f(n) = O(g(n)) means lim f(n)/g(n) < c for some constant c.

Not quite.  f(n) = O(g(n)) means that for sufficiently large n,
f(n) / g(n) < c.  There is no requirement that there be a limit.  For
example, sin(n) = O(1) even though sin(n) / 1 does not approach a
limit.
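For reference, the definition being used here can be written out in full;
this is the standard textbook form, not a quotation from the thread:

```latex
f(n) = O(g(n)) \;\iff\; \exists\, c > 0 \;\; \exists\, n_0 \;\; \forall n \ge n_0 :\; |f(n)| \le c \, |g(n)|
```

Taking c = 1, g(n) = 1, and any n_0, every n satisfies |sin(n)| <= 1, so
sin(n) = O(1) holds even though sin(n)/1 never settles to a limit.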

Michael M Rubenstein

 Discussion subject changed to "virtual functions" by T. Wang
Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help
From: "T. Wang" <yen...@u.washington.edu>
Date: 1996/12/06
Subject: virtual functions

I am a beginner studying C++ programming,
and I am trying to understand virtual functions in C++.

I don't know when to use virtual, and which functions
should I declare as virtual?

Can anyone explain this to me?

Thanks very much !

Tracy,

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help
From: s...@zanshin.com (Steve R Webster)
Date: 1996/12/06
Subject: Re: virtual functions

In article
<Pine.A41.3.95b.961206150613.14140A-100...@dante13.u.washington.edu>,
"T. Wang" <yen...@u.washington.edu> wrote:
> I am a beginner studying C++ programming,
> and I am trying to understand virtual functions in C++.

> I don't know when to use virtual, and which functions
> should I declare as virtual?

> Can anyone explain this to me?

Try http://www.iftech.com/oltc/cpp/cpp8.stm (the web is vast) for an intro.

-steve

 Discussion subject changed to "Mergesort: why's it efficient?" by Steve Heller
Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: hel...@utdallas.edu (Steve Heller)
Date: 1996/12/07
Subject: Re: Mergesort: why's it efficient?

"Dann Corbit" <dcor...@solutionsiq.com> wrote:
>I also mentioned benchmarking as being necessary to make the
>proper choice.  None the less, there is clearly a relationship
>between complexity and running time.  In the case of Big O
>notation applied as average case behavior (Vs worst case as
>an alternative) we can statistically predict how long the
>routine will take to complete on average.  We cannot predict
>how long a single run will take, except as a function of
>probability.
>[snip]

That's another reason I like distribution counting (at least if you
have enough physical memory): its run time for a given number (and
size) of elements is essentially independent of the distribution or
arrangement of the keys and can therefore be predicted accurately.

Steve Heller, author and software engineer
http://ourworld.compuserve.com/homepages/steve_heller

Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help, comp.lang.lisp
From: a...@research.att.com (Andrew Koenig)
Date: 1996/12/07
Subject: Re: Mergesort: why's it efficient?

In article <3058824781104...@naggum.no> Erik Naggum <nob...@naggum.no> writes:
> since Andrew Koenig surprised many by limiting n and then arguing how
> insignificant log n is compared to n, as if this is anything but obvious,
> others have followed in his muddled footsteps and also conveniently forget
> the constant factor or the actual meaning of the O notation.  I'm frankly
> amazed that this is possible for even half-educated computer scientists.

Notation is usually defined by usage and implicit understanding.
When a complexity theorist uses O notation in formal writing, that
means something slightly different from what the practical programmers
I've seen mean when they use it in informal conversation.

For example, I understand that, formally speaking, if I can describe
f(n) as O(n), I can also describe it as O(n^2).  However, that property
of the notation is not terribly useful in the context of my previous posting.

I don't think that anyone who wants to understand what I was saying
will have the slightest difficulty doing so.
--
--Andrew Koenig
a...@research.att.com

 Discussion subject changed to "virtual functions" by Thomas HORAK
Newsgroups: comp.lang.c++, comp.sys.mac.programmer.help
From: HO...@horak.franken.de (Thomas HORAK)
Date: 1996/12/08
Subject: Re: virtual functions

Need help on virtual functions ? Here's a brief introduction.

If a base class contains a virtual function, and a class derived from it
declares the same function (same name and type), then a call of this
function for an object of the derived class will invoke the function of
the derived class.  The derived function is also invoked if it is called
via a pointer or reference to the base class.

So when are virtual functions to be used?
Well, the simplest answer is: when they are needed.  If you don't need
this feature, don't use the keyword 'virtual'.

One use of virtual functions is the following:

A base class defines an interface for which a variety of implementations
are provided by derived classes.  Take the classic example of a base class
'shape' and derived classes 'circle' and 'rectangle'.
Now the point is that a function exists which uses the interface of the
base class but does not know the derived classes.  This function is passed
a pointer or a reference to a base object.  The task of this function is,
for example, to print out the name of the passed object.  If a pointer or
reference to an instance of one of the derived classes is passed to this
function, the appropriate virtual function is called.  (Converting a
pointer to a derived object into a pointer to a base object is a standard
conversion.)
This is very useful if derived classes are added to the project and the
code of the function for printing out the name cannot be modified (it is
in a library).  In this case the dynamic binding of the virtual function
is needed.
It is not that easy to find out which member functions of a base class
should be virtual.  This requires looking into the future.  A base class
is designed to be used for derivation.  So design the interface so that
you and other programmers can use it as a base class in order to solve
your specific problems (which you do not know yet).  The interface must
be flexible enough to offer statically and dynamically bound functions in
order to cope with future problem-solving.

Example:

#include <iostream>
using namespace std;

class shape {
public:
    virtual void print_name ( void ) { cout << "I am a shape"; }
};

class circle : public shape {
public:
    void print_name ( void ) { cout << "I am a circle"; }
};

class rectangle : public shape {
public:
    void print_name ( void ) { cout << "I am a rectangle"; }
};

void who_am_i ( shape * s )
{
    s->print_name();
}

Years later a third class is added:

class polygon : public shape {
public:
    void print_name ( void ) { cout << "I am a polygon"; }
};

The function who_am_i cannot be changed or modified, but it still prints
out the right string if passed a pointer to an instance of polygon.

--
+----------------------+
| Thomas Horak         |
| t...@horak.franken.de |
+----------------------+
## CrossPoint v3.11 ##

 Discussion subject changed to "Mergesort: why's it efficient?" by Reini Urban
Newsgroups: comp.sys.mac.programmer.help, comp.lang.lisp
From: <rurban-At-sbox.tu-graz.ac.at> (Reini Urban)
Date: 1996/12/08
Subject: Re: Mergesort: why's it efficient?

>Knuth Vol. 3, Sorting and Searching.
>Numerical Recipes (Press, Flannery, Teukolsky and Vetterling)

Sedgewick: Algorithms
is quite complete too and has nice illustrations; that's why I prefer
that one.
---
Reini Urban, TU Graz, Architecture & X-RAY
Attention! From: header is garbled on purpose!

Newsgroups: comp.sys.mac.programmer.help, comp.lang.lisp
Date: 1996/12/14
Subject: Re: Mergesort: why's it efficient?

Sorry, I just got in on this discussion, but here's the straight goods.

On average, Quicksort is actually faster than Mergesort, but Mergesort has
a couple of nice features:

Mergesort GUARANTEES O(nlogn) runtime.  By the same token, it always
performs all of the steps.  Ie.  There are no skipped steps or
advantageous cases in most codings of the algorithm.  Unless you drop out
of the mergesort at a sufficiently small size of arrays to use an
insertion sort or something else which works well on small problems.

Furthermore Mergesort is a stable algorithm.  Ie.  It sorts identical
elements and leaves them in their original positions.  This is
advantageous in certain conditions.  Ie.  Imagine a contest for guessing
the number of jelly beans in a jar.  If a bunch of people have the correct
answer, you want the prize to go to the first person who submitted his
entry.....You get the picture.

THe main problem is extra space requirements, but its guaranteed runtime
is a big plus!

Keir
--
University of Waterloo

Newsgroups: comp.sys.mac.programmer.help, comp.lang.lisp
From: Marty Hall <h...@apl.jhu.edu>
Date: 1996/12/16
Subject: Re: Mergesort: why's it efficient?

> On average, Quicksort is actually faster than Mergesort

gently but immediately given compelling evidence that this is false
for the vast majority of situations.
- Marty

(proclaim '(inline skates))