What the deque?!

Scott Meyers

Jun 29, 2000
So there I was, minding my own business, when I noticed a glazy stare aimed
in my direction. This was bad news, because my business of late is
explaining the STL to professional C++ programmers, and a glazy stare is a
sign of a confused programmer. In particular, a programmer who wanted to
know what one would naturally use a deque for -- why the standard raises
the comparatively obscure deque up to the heights of a vector, list, and
string. It was a good question, and I didn't have a good answer for it.

This evening I turned to my handsome AW C++ library for deque motivation.
(One of the few perqs of being an author is that you can generally get some
free books out of it.) I consulted -- and pardon my name dropping, but I
just can't resist -- Budd, Knuth, Musser & Saini, Josuttis, Koenig & Moo,
Breymann, Sutter, Lippman, and Stroustrup in search of a realistic general
problem for which a deque is a natural solution. I didn't come up with
much. Lippman and Josuttis mention a queue, but they fail to explain why
one would prefer a std::deque to a std::queue to model a queue. Austern
provides the obligatory description of deque's behavior and technical
strengths and weaknesses, but he then goes out of his way to discourage use
of a deque vis a vis a vector. (In conjunction with his recent C++ Report
column, this puts Matt on record as discouraging the use of deque, set,
multiset, map, and multimap :-}) Stroustrup manages to come up with two
examples of how a deque might be used ("to model a section of a railroad or
to represent a deck of cards in a game"), but having only .3333 of his 1019
pages to devote to the topic, he provides no details. Personally, I'm
skeptical that there are a lot of programmers modeling railroads out there,
and because most card games have a fixed number of cards, it's not clear
why a vector or (gasp!) array used as a circular buffer wouldn't be as
suitable a choice as a deque.

So I'm still stumped and the programmer's eyes are still probably glazed.
Can you help me out? Why *is* deque so fundamental a data structure that
it made it into the standard library? And what *are* some typical
applications for which a deque is uniquely suited?

I can describe deque's behavior and technical strengths/weaknesses until
the ruminants come home, but that's not what I need. What I need are some
plausible examples of problems where a deque's behavior, strengths, and
weaknesses are a better match for the problems than are the other standard
containers or container adapters.

Thanks,

Scott

{The following may not have the kinds of examples you want, but:
For contrasting points of view, see GotW #53 and #54 (and
incidentally #46), and my columns "Using the Standard Library,
Part 1: Vectors and Deques" (C++ Report, Jul/Aug 99) and the
"... Part 2"'s final 'In the Mail' subhead (C++ Report, Oct 99).
Finally, "deque<bool>" may be the easiest portable way of avoiding
the standard-enforced (mis)behavior of vector<bool>.

This material didn't make it into XC++, but will be in MXC++. I do
note that some of my conclusions are controversial. -mod/hps}

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]


Thant Tessman

Jun 29, 2000

Scott Meyers wrote:

[...]

> I can describe deque's behavior and technical strengths/weaknesses until
> the ruminants come home, but that's not what I need. What I need are some
> plausible examples of problems where a deque's behavior, strengths, and
> weaknesses are a better match for the problems than are the other standard
> containers or container adapters.

I use deques for holding digits in my arbitrary-precision arithmetic
package. Addition, subtraction, and even multiplication can be done with
anything that supports push_back, but in order to be efficient, division
requires push_front, and random access. Deques are perfect. (E-mail me
if you want to see the implementation.)

-thant

Chris Riesbeck

Jun 30, 2000
In article <MPG.13c481907...@news.supernews.com>,
sme...@aristeia.com (Scott Meyers) wrote:

>Can you help me out? Why *is* deque so fundamental a data structure that
>it made it into the standard library? And what *are* some typical
>applications for which a deque is uniquely suited?

An instructor for our data structures class this quarter
asked the same thing, so I searched a dozen standard data
structures books. Ten have either no deques at all, or no
example of use.

Sedgewick uses a deque to implement a non-recursive
non-deterministic regular expression string matcher.
Basically, the non-deterministic parts need a stack, the
rest need a queue. It adds things to both ends, but only
removes from one end.

Another notes that the lines visible in an editor screen
buffer is a deque, since you can scroll backward and
forward. No actual implementation and I'd be surprised if
that's how any editor is actually implemented, but perhaps
an editor-like application with a moving window might use
it.

The instructor suggested a size-limited undo stack, because
after you push on MAX items-to-undo, any further pushes
require popping something off the other end.
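
A minimal sketch of such a size-limited undo history (the class and names are
invented for illustration):

#include <cstddef>
#include <deque>
#include <string>

class UndoHistory {
public:
    explicit UndoHistory(std::size_t max) : max_(max) {}

    void record(const std::string& action)
    {
        if (history_.size() == max_)
            history_.pop_front();          // quietly drop the oldest entry
        history_.push_back(action);        // newest entry goes on the back
    }

    bool undo(std::string& action)         // retrieve the most recent entry
    {
        if (history_.empty())
            return false;
        action = history_.back();
        history_.pop_back();
        return true;
    }

private:
    std::size_t max_;
    std::deque<std::string> history_;
};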

Carlos Moreno

Jun 30, 2000
Scott Meyers wrote:
>
> So I'm still stumped and the programmer's eyes are still probably glazed.
> Can you help me out? Why *is* deque so fundamental a data structure that
> it made it into the standard library? And what *are* some typical
> applications for which a deque is uniquely suited?

An example:

Geometry. Many algorithms in polygon processing require insertions
and removals of points -- sometimes anywhere in the polygon, sometimes
only at both ends.

So this rules out vector (no front operations, although insert and
erase let you work around that limitation -- at a brutal cost in
terms of execution time), and leaves two choices: list and deque.

Now, it turns out that many algorithms dealing with convex polygons
may take advantage of binary search techniques; but binary search
requires subscripted access, which leaves out of the picture the
use of list!
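
A small sketch of that point: because deque iterators are random-access,
std::lower_bound runs in O(log n) on a deque of vertices, while both ends stay
cheap to grow and shrink (names invented; this is not from the article):

#include <algorithm>
#include <deque>

struct Point { double x, y; };

// The upper chain of a convex polygon, vertices ordered by increasing x.
std::deque<Point> upper_chain;

// Find the first vertex at or to the right of x0 in O(log n).
std::deque<Point>::iterator find_vertex(double x0)
{
    return std::lower_bound(upper_chain.begin(), upper_chain.end(), x0,
                            [](const Point& p, double x) { return p.x < x; });
}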

If you want more details, you can check my web-site, where I have
the article that I published in the C/C++ Users Journal a couple of
years ago, on Geometric Operations -- funnily enough, you will see that
I used list to represent polygons! (sometimes irony can be so
thick! :-)).

If interested, go to:

http://www.mochima.com/articles

And follow the link to "Efficient 2-D Geometric Operations" --
you may find the algorithm for finding the convex hull of a
simple polygon a nice illustration of the convenience of the
deque (again, I used std::list because in many other cases,
insertions and removals are required at any point in the
polygon, and I was trying to keep it as simple as possible,
so I didn't use different containers for different algorithms).

Cheers,

Carlos
--

Fred Sanford

Jun 30, 2000
Mr. Meyers,

My name is Fred Sanford, and I own a junk yard. I've designed a Winsock
application that listens for connections on port 85. I've developed a client
application as well, and have distributed it to everyone who owns junk. The
junk owners can (any time after 6pm) execute their client application and
start typing in the names and prices of items that they would like to unload
to my junk yard.

On the server side, all incoming junk notifications from the clients are
placed in a std::deque<JUNK_INFO>. Items are added to the queue using
deque::push_back(). Every 21 seconds, a timer is fired and my server must
process each junk request *in the order that it was received.* I read each
item from the queue by examining m_myQueue[0], and then calling
m_myQueue.pop_front() to erase the piece of data that I just read.

I am a very old man who died several years ago, and have come back from
the dead to type this message.

Now, if I were using a vector instead of a deque, I would be in trouble
because vector does not support pop_front(). Instead, I would have to use
vector::erase() which is much slower. This is what makes deque so unique --
I can call pop_front() to efficiently erase from the front of the queue.
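
A minimal sketch of that pattern (JUNK_INFO stands in for whatever payload
type the server uses; the names are illustrative only):

#include <deque>

struct JUNK_INFO { /* name, price, ... */ };

std::deque<JUNK_INFO> m_myQueue;

void on_notification(const JUNK_INFO& item)
{
    m_myQueue.push_back(item);                 // arrivals go on the back
}

void on_timer()                                // fired every 21 seconds
{
    while (!m_myQueue.empty()) {
        const JUNK_INFO& item = m_myQueue[0];  // oldest request first
        // ... process item ...
        m_myQueue.pop_front();                 // cheap removal from the front
    }
}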

In your heart you have said in a loud voice, "Why shall I not use deque
all the time and forget about vector all together?"

My friend: let me tell you -- there is more overhead involved in
instantiating and maintaining a deque than a vector. A deque should only be
used then if you (in your heart) need to call pop_front() on the STL
container that you have used in the junk yard example that was described by
me in the paragraph that you read just above in my reply to the original
poster about the STL question.

"And that's why deque rules all," he said.

"Scott Meyers" <sme...@aristeia.com> wrote in message
news:MPG.13c481907...@news.supernews.com...

{58 quoted lines snipped, including a quoted .sig and banner.
Please don't overquote. -mod/hps}

Andrei Alexandrescu

Jun 30, 2000
Scott Meyers <sme...@aristeia.com> wrote in message
news:MPG.13c481907...@news.supernews.com...
> Lippman and Josuttis mention a queue, but they fail to explain why
> one would prefer a std::deque to a std::queue to model a queue.

I guess you meant std::list, not std::queue.

> Can you help me out? Why *is* deque so fundamental a data structure that
> it made it into the standard library? And what *are* some typical
> applications for which a deque is uniquely suited?

First of all, the obvious stuff: queue<deque> has lower space and time
overhead compared to queue<list>. If all you store is a small amount
of data, you waste a lot of space with list, in addition to using
dynamic allocation for each node.

Second, Herb's arguments: deque is much more memory-friendly than
vector. But this might be an implementation detail.

In a commercial app, I use a deque for calculating the transfer speed.
I enter samples on one side, and I discard them on the other side. This
way I keep a history of the recent download chunks, and I use that to
compute speed. Because each entry holds only two integers (bytes and
time), it would be wasteful to use a list. (Maybe this is only a
detail, though). It's also true I could have used a circular buffer,
but deque was so handy.
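
A rough sketch of that sliding-window idea (the Sample type, the 30-second
window, and all names are assumptions, not details of the actual application):

#include <deque>

struct Sample { long bytes; long time_ms; };

class SpeedMeter {
public:
    void add(const Sample& s)
    {
        window_.push_back(s);                        // newest sample at the back
        while (s.time_ms - window_.front().time_ms > 30000)
            window_.pop_front();                     // drop samples older than 30 s
    }

    double bytes_per_second() const
    {
        if (window_.size() < 2)
            return 0.0;
        long total = 0;
        for (std::deque<Sample>::const_iterator i = window_.begin();
             i != window_.end(); ++i)
            total += i->bytes;
        long span_ms = window_.back().time_ms - window_.front().time_ms;
        return span_ms > 0 ? total * 1000.0 / span_ms : 0.0;
    }

private:
    std::deque<Sample> window_;
};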

In the same app, we use a queue<deque> for passing asynchronous
messages. Again, the messages are small, so a list would have been
inefficient.

Hmmmmmmm, I guess I wasn't convincing.


Andrei

Dennis Yelle

Jun 30, 2000
The moderator said:
[...]

> Finally, "deque<bool>" may be the easiest portable way of avoiding
> the standard-enforced (mis)behavior of vector<bool>.
>
> This material didn't make it into XC++, but will be in MXC++. I do
> note that some of my conclusions are controversial. -mod/hps}

What is this "standard-enforced (mis)behavior of vector<bool>"?

Dennis Yelle
--
I am a computer programmer and I am looking for a job.
There is a link to my resume here: http://table.jps.net/~vert

Raja R Harinath

Jun 30, 2000
sme...@aristeia.com (Scott Meyers) writes:
> ... In particular, a programmer who wanted to know what one would
> naturally use a deque for -- why the standard raises the
> comparatively obscure deque up to the heights of a vector, list, and
> string.

Maybe it was included in STL to demonstrate a resizeable data
structure with O(1) push_back cost, not amortized O(1) like a vector.
It stayed on when STL was incorporated en masse into the C++ Standard
Library.

- Hari
--
Raja R Harinath ------------------------------ hari...@cs.umn.edu
"When all else fails, read the instructions." -- Cahn's Axiom
"Our policy is, when in doubt, do the right thing." -- Roy L Ash

Scott Meyers

Jun 30, 2000
On 29 Jun 2000 16:46:22 -0400, Thant Tessman wrote:
> I use deques for holding digits in my arbitrary-precision arithmetic
> package.

Thank you for an interesting example of what certainly sounds like a very
legitimate use of deque.

Scott

Markus Redeker

Jun 30, 2000
Written by the moderator:

> Finally, "deque<bool>" may be the easiest portable way of avoiding
> the standard-enforced (mis)behavior of vector<bool>.

Which (mis)behaviour?


Markus

Wil Evers

Jun 30, 2000
In article <MPG.13c481907...@news.supernews.com>,
Scott Meyers wrote:

> In particular, a programmer who wanted to know what one would
> naturally use a deque for -- why the standard raises the
> comparatively obscure deque up to the heights of a vector,
> list, and string. It was a good question, and I didn't have
> a good answer for it.

> This evening I turned to my handsome AW C++ library for deque
> motivation.

[snip]

> I didn't come up with much. Lippman and Josuttis mention a queue,
> but they fail to explain why one would prefer a std::deque to a
> std::queue to model a queue.

std::queue is just a wrapper around some other container type that
has front(), pop_front(), back() and push_back(). By providing a
restricted interface, std::queue enforces that the wrapped
container, usually a std::deque or std::list, is strictly used as
a FIFO.

[snip]

> Why *is* deque so fundamental a data structure that it made it
> into the standard library? And what *are* some typical
> applications for which a deque is uniquely suited?

std::deque is the only container type in the STL that has (1)
amortized constant time insertion and removal at both its front and
its back and (2) constant time random access to its elements. In
contrast, a std::vector only has efficient insertions and removals at
its back, and a std::list doesn't have efficient random access (and,
of course, a plain C array doesn't have insertions and removals at
all).

To come up with a hopefully convincing example, consider the
implementation of an input buffer for a lexical analyzer that
requires some arbitrary amount of backtracking. For efficiency,
you only want to read each source character once, and to avoid
memory problems, you don't want to keep all of the source in
memory.

The section of the source that is being scanned for tokens is kept
in a deque. When more input is required, it is appended to the end
of the queue; when a token has been processed, the corresponding
characters are removed from the front of the queue; during the actual
scanning, random access is used to quickly retrieve the character at
the 'cursor' position.
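
A minimal sketch of such a buffer (the class and its interface are invented
for illustration, not taken from an actual lexer):

#include <cstddef>
#include <deque>
#include <istream>

class InputBuffer {
public:
    explicit InputBuffer(std::istream& in) : in_(in) {}

    // Random access relative to the start of the unconsumed input; reads
    // more characters from the stream on demand. Returns -1 at end of input.
    int peek(std::size_t cursor)
    {
        char c;
        while (cursor >= buf_.size()) {
            if (!in_.get(c))
                return -1;
            buf_.push_back(c);                 // new input appended at the back
        }
        return static_cast<unsigned char>(buf_[cursor]);
    }

    // Discard the characters of a token that has been fully processed.
    void consume(std::size_t n)
    {
        for (std::size_t i = 0; i < n && !buf_.empty(); ++i)
            buf_.pop_front();                  // removal at the front
    }

private:
    std::istream& in_;
    std::deque<char> buf_;
};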

Having said that, I must admit that I don't actually use a std::deque
for this purpose. The problem is that the requirements std::deque
places on its iterators and references dictate an implementation that
uses at least two levels of indirection to get from the container
object to its elements. For simple things like characters, this
overhead can be significant. This is also the reason why a deque's
capacity cannot be set in advance.

I have always wondered why this is so; about the only thing I can
imagine is that it leads to a slightly more efficient implementation
of insertions and removals in the middle of the container. So I
guess the bottom line is I have no idea what std::deque is for
either.

- Wil

--
Wil Evers, DOOSYS IT Consultants, Maarssen, Holland
[Wil underscore Evers at doosys dot com]

Ivan Vecerina

Jun 30, 2000
Markus Redeker said:
>Written by the moderator:
>> Finally, "deque<bool>" may be the easiest portable way of avoiding
>> the standard-enforced (mis)behavior of vector<bool>.
>
>Which (mis)behaviour?

vector<bool> is optimized to store 1 bit per item.

The misbehavior that makes it unique among all vectors is that
you cannot write, for example:

vector<bool> v;
....
bool& p = v[i];

as individual bits of each char are used to store data.

--
"Si on avait pu naitre avant l'Homme !" - Cioran
----------------------------------------------------
Ivan Vecerina, Dr. med. , Senior Software Engineer
http://www.surgnav.com - http://www.cedara.com

sanderde

Jul 1, 2000
It would seem any FIFO would do nicely. The undo command on a word
processor, history for your command shell, history on your
browser...

David Sanders
sand...@bellsouth.net

On 29 Jun 2000, Scott Meyers wrote:

[snip]


{Either this message was flagged by the system as overquoted,
or you have quoted headers, .sigs, or other items which should
have been trimmed. Please trim quotes, and please don't quote
banners and .sigs. --mod}

bran...@cix.compulink.co.uk

Jul 1, 2000
In article <d9ya3nn...@han.cs.umn.edu>, hari...@cs.umn.edu (Raja R
Harinath) wrote:
> Maybe it was included in STL to demonstrate a resizeable data
> structure with O(1) push_back cost, not amortized O(1) like a vector.

Surely deques push_back cost has the same complexity as vectors?
Std::deque is normally implemented as an array of arrays. Sometimes
adding an item will cause the first array to be resized and its contents
copied, which will be O(N) in the number of items.

Admittedly the constant multiplier may be smaller because (a) the size
of that array may be smaller than the size a vector would have (by a
fixed factor); and (b) the things copied are pointers which may be
smaller then the container's items. I suspect one of these will always be
true in real implementations. However, my local implementation has a
variable called DEQUESIZE which is effectively the number of items stored
in the second level arrays. If we set DEQUESIZE to 1, (a) will be false
and if we store chars, (b) will be false. Yet this would still be a
conforming implementation (as I understand it).

The performance of deque::push_front() is amortised O(1) whereas a vector
version would be O(N). However, as Scott Meyers says, a simple resizable
ring-buffer could deliver the same. I think a ring-buffer would be a
conforming implementation for deque.

I believe vector could trivially be implemented by deque (at least until
the "contiguous storage" clause is added).

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Pete Becker

Jul 1, 2000
bran...@cix.compulink.co.uk wrote:
>
> In article <d9ya3nn...@han.cs.umn.edu>, hari...@cs.umn.edu (Raja R
> Harinath) wrote:
> > Maybe it was included in STL to demonstrate a resizeable data
> > structure with O(1) push_back cost, not amortized O(1) like a vector.
>
> Surely deques push_back cost has the same complexity as vectors?
> Std::deque is normally implemented as an array of arrays. Sometimes
> adding an item will cause the first array to be resized and its contents
> copied, which will be O(N) in the number of items.
>

When the "first" array is full you add another array in front of it. No
copying needed.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)

Howard Hinnant

Jul 1, 2000
In article <8jjjrl$kgl$1...@plutonium.compulink.co.uk>,
bran...@cix.compulink.co.uk wrote:

| I think a ring-buffer would be a
| conforming implementation for deque.

Nope. A ring buffer may invalidate outstanding references to elements
during a push_front or push_back (when its capacity is exceeded). A deque
can't do this (reference 23.2.1.3 / 1). That's not to say that a ring
buffer wouldn't be a valuable addition to one's toolbox. I certainly
think it is. The Metrowerks std::lib has one (currently named
std::__cdeque).

| I believe vector could trivially be implemented by deque (at least until
| the "contiguous storage" clause is added).

But the overhead in both cpu and memory would likely be unacceptable. I
consider deque to be the most complicated container in the std::lib.

-Howard

S.K.Mody

Jul 2, 2000
Ivan Vecerina <iv...@mail.com> wrote in message
news:8jiugc$76f$1...@news1.urbanet.ch...

> Markus Redeker said:
> >Written by the moderator:
> >> Finally, "deque<bool>" may be the easiest portable way of avoiding
> >> the standard-enforced (mis)behavior of vector<bool>.
> >
> >Which (mis)behaviour?
>
> vector<bool> is optimized to store 1 bit per item.
>
> Its misbehavior that makes it unique among all vectors is that
> you cannot write, for example:
>
> vector<bool> v;
> ....
> bool& p = v[i];
>
> as individual bits of each char are used to store data.
>
> --

I think the optimization of vector<bool> is provided via
a specialized allocator rather than through the vector class
itself. In fact strictly speaking you _have_ to use a special
allocator in order to be standards conforming.

You can get the regular unoptimized behaviour by
using vector< bool, std::allocator<bool> >. Of course
there is still the question of why the optimized allocator
was chosen as the default.

Regards,
S. K. Mody.

bran...@cix.compulink.co.uk

Jul 2, 2000
In article <395E3114...@acm.org>, peteb...@acm.org (Pete Becker)
wrote:

> > Surely deques push_back cost has the same complexity as vectors?
> > Std::deque is normally implemented as an array of arrays. Sometimes
> > adding an item will cause the first array to be resized and its
> > contents copied, which will be O(N) in the number of items.
> >
>
> When the "first" array is full you add another array in front of it. No
> copying needed.

By "first" I mean the base-level array, not a second-level array. When
you add another (second-level) array, the base-level array may need to be
resized. That may involve copying its elements (which will be pointers to
the second-level arrays), which will be O(N).

(We were talking about push_back, not push_front, if that helps. Thus we
would be adding a new array after the last rather than in front of the
first.)

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Howard Hinnant

Jul 2, 2000
In article <xJH75.10999$HD6.2...@iad-read.news.verio.net>, "S.K.Mody"
<Mod...@hotmail.com> wrote:

| I think the optimization of vector<bool> is provided via
| a specialized allocator rather than through the vector class
| itself. In fact strictly speaking you _have_ to use a special
| allocator in order to be standards conforming.

Section 23.2.5 / 1 states:

To optimize space allocation, a specialization of vector for bool elements
is provided:

template <class Allocator>
class vector<bool, Allocator>
{
public:
  ...
}

| You can get the regular unoptimized behaviour by
| using vector< bool, std::allocator<bool> >. Of course
| there is still the question of why the optimized allocator
| was chosen as the default.

Not in a conforming lib.

-Howard

bran...@cix.compulink.co.uk

Jul 2, 2000
In article <hinnant-0107...@ith3-4a0.twcny.rr.com>,
hinnant@anti-spam_metrowerks.com (Howard Hinnant) wrote:
> | I think a ring-buffer would be a conforming implementation for
> | deque.
>
> Nope. A ring buffer may invalidate outstanding references to elements
> during a push_front or push_back (when its capacity is exceeded). A
> deque can't do this (reference 23.2.1.3 / 1).

OK. The STD evidently has a very specific data structure in mind for
deque, just as it has for map.

So Scott Meyers question is not, "What is a deque for?" so much as, "Why
implement deque with an array-of-arrays instead of with a ring-buffer
or a linked list?" And the answer depends on these rather subtle issues
of invalidation, performance trade-offs, and contiguity, which most of
the time we can ignore if we only care about basic LIFO-type behaviour.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Pete Becker

Jul 2, 2000
bran...@cix.compulink.co.uk wrote:
>
> In article <395E3114...@acm.org>, peteb...@acm.org (Pete Becker)
> wrote:
> > > Surely deques push_back cost has the same complexity as vectors?
> > > Std::deque is normally implemented as an array of arrays. Sometimes
> > > adding an item will cause the first array to be resized and its
> > > contents copied, which will be O(N) in the number of items.
> > >
> >
> > When the "first" array is full you add another array in front of it. No
> > copying needed.
>
> By "first" I mean the base-level array, not a second-level array.

I see now what you meant. Sorry about confusing things.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Matt Osborn

Jul 2, 2000
Scott Meyers wrote:
>
> So there I was, minding my own business, when I noticed a glazy stare aimed
> in my direction. This was bad news, because my business of late is
> explaining the STL to professional C++ programmers, and a glazy stare is a
> sign of a confused programmer. In particular, a programmer who wanted to
> know what one would naturally use a deque for -- why the standard raises
> the comparatively obscure deque up to the heights of a vector, list, and
> string. It was a good question, and I didn't have a good answer for it.

<snip>

I use them to 'unwind' recursion.
For instance, directory searches. Push base directory onto the deque,
enter a loop that works the front of the deque searching for files
within the directory. Any subdirectories found are pushed onto the
deque. After current directory is empty, pop the front and continue at
the top of the loop.
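
A sketch of that worklist pattern, written against the much later C++17
<filesystem> library (the thread predates it, so this is only an illustration
of the idea):

#include <deque>
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

int main()
{
    std::deque<fs::path> work;
    work.push_back(".");                         // push the base directory

    while (!work.empty()) {
        fs::path dir = work.front();             // work the front of the deque
        work.pop_front();
        for (const auto& entry : fs::directory_iterator(dir)) {
            if (entry.is_directory())
                work.push_back(entry.path());    // subdirectories go on the back
            else
                std::cout << entry.path() << '\n';
        }
    }
}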

Herb Sutter

Jul 3, 2000
Someone wrote:
>> > > Surely deques push_back cost has the same complexity as vectors?

No. In particular, deque::push_back() and its kin never need to copy a
T, ever; deque can therefore support "constant time insert and erase
operations at the beginning or the end." In constrast, vector can only
support "(amortized) constant time insert and erase operations at the
end."

>> > > Std::deque is normally implemented as an array of arrays. Sometimes

Yes, but more accurately, deque is an array of pointers to arrays. It's
not an array of directly-held arrays that are therefore contiguous in
memory... there'd be no point to that, since that already exists in the
standard library (it's spelled vector).
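
For readers who haven't seen one, a purely illustrative picture of that layout
(not any particular vendor's implementation):

#include <cstddef>

template <class T>
struct DequeRep {
    T**         map;        // the "index": an array of pointers to pages
    std::size_t map_size;   // capacity of the index
    std::size_t start;      // offset of the first element within the first page
    std::size_t size;       // number of T objects stored
    // Each map[i] points to a fixed-size page holding up to PageSize T's;
    // growing at either end allocates a new page and stores one more pointer.
};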

>> > > adding an item will cause the first array to be resized and its
>> > > contents copied, which will be O(N) in the number of items.

To which Pete replied, and later re-replied:


>> > When the "first" array is full you add another array in front of it. No
>> > copying needed.
>>
>> By "first" I mean the base-level array, not a second-level array.
>
>I see now what you meant. Sorry about confusing things.

Maybe I'm dense, but I still think Pete's first answer was right (even
though it answered a different question). No T objects already in the
deque ever need to be copied on deque::push_front() or
deque::push_back(). If your base-level array (what I would call the
"index") runs out of room, you just allocate a bigger one and copy over
the contents of the old one, which contents are just plain old T*
pointers to the second-level arrays (what I would call "pages").

The complexity requirements in the standard talk about T operations, and
even this grow case just copies a bunch of array pointers -- that's O(1)
T operations. Sure, if you include non-T operations, there will be N/M =
O(N) bald pointer copies... but what the OP claimed was that deque's
push_back() has the same complexity as vector's, which isn't true.

Herb

Herb Sutter

Jul 3, 2000
hinnant@anti-spam_metrowerks.com (Howard Hinnant) writes:
>In article <xJH75.10999$HD6.2...@iad-read.news.verio.net>, "S.K.Mody"
><Mod...@hotmail.com> wrote:
>| You can get the regular unoptimized behaviour by
>| using vector< bool, std::allocator<bool> >.
>
>Not in a conforming lib.

Yep, vector<bool, anything> gets you that standard (mis)behavior.

Back to deque: One use of deque that I recommend is that it's by far the
simplest portable way to avoid the vector<bool> specialization,
requiring you only to write "deque" instead of "vector" at the
declaration -- pretty easy.
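
A minimal sketch of the difference (just an illustration of the point above):

#include <deque>
#include <vector>

int main()
{
    std::deque<bool> dflags(8, false);
    bool& b = dflags[2];          // fine: deque<bool> stores real bool objects
    b = true;

    std::vector<bool> vflags(8, false);
    // bool& v = vflags[2];       // does not compile: vector<bool> packs bits
    std::vector<bool>::reference r = vflags[2];   // you get a proxy instead
    r = true;

    return 0;
}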

The next simplest solution is probably to use vector<char> with casts,
which requires casts all over the place and is therefore ugly,
intrusive, and dangerous. (Yes, vector<int> would avoid some of the
casts, but typically requires four times as much space.)

The only other way I know to avoid vector<bool> and still use a
reasonably similar standard container is: use an array of bool. If on
reading that you immediately said "yuck" and recoiled, thinking that
we're supposed to use real container templates these days and not
builtin arrays, give yourself 10 points. Fortunately the standard
containers are very robust and wholesome... vector<bool> is the only
exception, and at the end of the day it hopefully won't affect too many
programmers.

Scott Meyers

Jul 3, 2000
On 2 Jul 2000 17:46:58 -0400, bran...@cix.compulink.co.uk wrote:
> So Scott Meyers question is not, "What is a deque for?" so much as, "Why
> implement deque with an array-of-arrays instead of with a ring-buffer
> or a linked list?"

That's a reasonable question, but my question was what I said it was: for
what kinds of problems is a deque a natural fit? Thanks to the postings to
this newsgroup as well as some private mail I received, I believe I can now
say, "A deque naturally models a number of variants of the queue data
structure (not to be confused with the specific queue interface in the
standard library), especially when classic queue usage is accompanied by a
need to have constant-time random access to queue contents and/or a need to
also add/remove elements at the 'wrong' end of the queue."

Thanks to everybody for continuing my education.

Scott

Ken Bloom

Jul 3, 2000
"Matt Osborn" <msos...@attglobal.net> wrote in message
news:395FFEDA...@attglobal.net...

> Scott Meyers wrote:
> >
> > So there I was, minding my own business, when I noticed a glazy stare aimed
> > in my direction. This was bad news, because my business of late is
> > explaining the STL to professional C++ programmers, and a glazy stare is a
> > sign of a confused programmer. In particular, a programmer who wanted to
> > know what one would naturally use a deque for -- why the standard raises
> > the comparatively obscure deque up to the heights of a vector, list, and
> > string. It was a good question, and I didn't have a good answer for it.
>
> <snip>
>
> I use them to 'unwind' recursion.
> For instance, directory searches. Push base directory onto the deque,
> enter a loop that works the front of the deque searching for files
> within the directory. Any subdirectories found are pushed onto the
> deque. After current directory is empty, pop the front and continue at
> the top of the loop.

But as you describe it, all of this could be done with a queue.
A deque is meant for purposes matching one of the following patterns:
a (F/L)IFO or similar pattern, or the use of a stack or queue where
indexing into the stack/queue is also required (as in a game of Klondike
I had to write for my computer science class).

Tim Sharrock

Jul 3, 2000
On 30 Jun 2000 14:30:32 -0400, "Andrei Alexandrescu"
<andre...@hotmail.com> wrote:

>Scott Meyers <sme...@aristeia.com> wrote in message
>news:MPG.13c481907...@news.supernews.com...

>> Can you help me out? Why *is* deque so fundamental a data structure that
>> it made it into the standard library? And what *are* some typical
>> applications for which a deque is uniquely suited?
>

>First of all, the obvious stuff: queue<deque> has lower space and time
>overhead compared to queue<list>. If all you store is a small amount
>of data, you waste a lot of space with list, in addition to using
>dynamic allocation for each node.

Unfortunately, I do not believe this is true. As is generally the case,
the standard makes no space guarantees, and deque typically allocates
blocks of elements. In the implementation I generally use (VC++6 SP3)
the deque implementation allocates at least 4096 bytes for each
non-empty deque. For short deques of a few tens of ints this is
significant overhead.

Tim

--
Tim Sharrock (t...@sharrock.org.uk)

Bill Wade

Jul 3, 2000

<bran...@cix.compulink.co.uk> (Dave Harris?) wrote

> hinnant@anti-spam_metrowerks.com (Howard Hinnant) wrote:
> > | I think a ring-buffer would be a conforming implementation for
> > | deque.
> >
> > Nope. A ring buffer may invalidate outstanding references to elements
> > during a push_front or push_back (when its capacity is exceeded). A
> > deque can't do this (reference 23.2.1.3 / 1).
>
> OK. The STD evidently has a very specific data structure in mind for
> deque, just as it has for map.

Probably.

However a deque can be implemented with a ring buffer of pointers to T (or
pointers to arrays of T). Random access iterators become more complicated
(they need to know how far to go before the circle wraps-around), which may
be why you don't see this very often. I'd probably implement the base
buffer as a buffer of pair<T*, deque_info*> to do this.

Earlier in the thread Dave wrote:
> Surely deques push_back cost has the same complexity as vectors?

You can make deque push_xxx O(1) worst-case (rather than amortized). The
trick is to keep two low-level buffers (the buffers of pointers) of size M
and 2*M. The larger buffer is allocated when the smaller buffer contains
M/2 elements. Whenever a new pointer is added to the smaller buffer, two
pointers are copied to the larger buffer. By the time the smaller buffer is
full, all of its pointers have been copied to the larger buffer (which is
now half full, and the process can be repeated). This analysis was for
circular buffers. Some of the constants have to be changed for non-circular
buffers.

We assume allocation is O(1), and copying a fixed number of pointers at a
time is certainly O(1), so every push is worst-case O(1).

I haven't run timings on this, but I suspect the complexity associated with
maintaining two base-level buffers would make this implementation
unattractive unless you seriously needed the strong guarantee on worst-case
performance and were working with very large deques.

Note that you can use the same type of technique to make a hash table change
its capacity (rehash) in worst-case O(1) time.

This is a special case of over-eager evaluation. Meyers mentions over-eager
evaluation in Item 18 of MEC++. His examples deal with statistical
operations, such as finding the average of a vector in O(1) time.

Pete Becker

Jul 3, 2000
Herb Sutter wrote:
>
> >> > > Surely deques push_back cost has the same complexity as vectors?
>
> No. In particular, deque::push_back() and its kin never need to copy a
> T, ever; deque can therefore support "constant time insert and erase
>> operations at the beginning or the end." In contrast, vector can only
> support "(amortized) constant time insert and erase operations at the
> end."

No, this doesn't follow. See below.

> The complexity requirements in the standard talk about T operations, and
> even this grow case just copies a bunch of array pointers -- that's O(1)
> T operations. Sure, if you include non-T operations, there will be N/M =
> O(N) bald pointer copies... but what the OP claimed was that deque's
> push_back() has the same complexity as vector's, which isn't true.
>

As David Harris pointed out, the key structure in a deque is the master
array, which holds pointers to sub-arrays that each hold up to M
objects. Think of that master array as a vector of pointers. The deque
adds a new pointer to that vector once for every M insert operations.
But it's still a vector, with amortized constant time complexity. The
advantages of a deque over vector come from inserting into the deque's
internal vector 1/M times as often, and perhaps from copying fewer bytes
when reallocating (when a pointer is smaller than a T object). Both of
these affect the multiplier, but not the time complexity itself. That
is, a deque will be faster than a vector, but asymptotically both are
amortized constant time, not constant time.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Jens Kilian

Jul 3, 2000
iv...@mail.com (Ivan Vecerina) writes:
> vector<bool> is optimized to store 1 bit per item.
>
> Its misbehavior that makes it unique among all vectors is that
> you cannot write, for example:
>
> vector<bool> v;
> ....
> bool& p = vector[i];
>
> as individual bits of each char are used to store data.

But you can do

vector<bool>::reference p = v[i];

just as you can with *any* vector<T>. I see this as a minor inconvenience.
--
mailto:j...@acm.org phone:+49-7031-464-7698 (HP TELNET 778-7698)
http://www.bawue.de/~jjk/ fax:+49-7031-464-7351
PGP: 06 04 1C 35 7B DC 1F 26 As the air to a bird, or the sea to a fish,
0x555DA8B5 BB A2 F0 66 77 75 E1 08 so is contempt to the contemptible. [Blake]

mi...@hotmail.com

Jul 3, 2000
Scott Meyers wrote:
>
> I can describe deque's behavior and technical strengths/weaknesses until
> the ruminants come home, but that's not what I need. What I need are some
> plausible examples of problems where a deque's behavior, strengths, and
> weaknesses are a better match for the problems than are the other standard
> containers or container adapters.
>
> Thanks,
>
> Scott
>
>

Here is a real world example where a deque is the perfect fit. In this
example a deque is used to enable the fast computation of Online
Arithmetic Moving Averages.
In this application, the computation of moving averages is just a small
part of a complex stochastic processes forecasting tool.

Here is the story. Imagine a stochastic process which generates some
value of a given variable Y at discrete points in time. Each time the
process you observe generates a new value, your goal is to predict in
realtime what the next value generated by this system will be. For that,
you use a complex stochastic model which requires (among many other
things) as input the arithmetic moving averages of the last L values of
Y for several lengths L. To see why a deque is required for fast
computation of these moving averages, let's look more precisely at what
an arithmetic moving average is. The arithmetic moving average of length
L of the variable Y computed at time t is just the arithmetic average of
the last L values of the variable Y.

Mov(L) = ( SUM_{i=0..L-1} Y(t-i) ) / L

Since these moving averages of different length L must be computed at
each point in time, the following recurrence formula is used instead of
the previous one to save computation time:

Mov(L) = (L * Mov(L) + Y(t) - Y(t-L)) / L

Thus, at current time t, the new value of the moving average of length L
is computed by using its current value and the value of the variable Y
at time t and t-L.

Since the process we try to model generates new data at a relatively fast
pace, we cannot keep all the values of the variable Y in main memory. All
we can do is keep the last ML values, where ML is the maximum length of
the moving averages we need to compute.

To sum up, here are the requirements for fast computation of these
moving averages:
* we can only keep a window of the last ML values of Y in main memory,
  and this window must be updated efficiently.
* we need fast indexing access in this window in order to get Y(t-L)
  and update the moving average.

The STL container which gives you the best compromise between fast random
access and fast pop_front and push_back is the deque. In my application,
if I replace the deque by a vector (using erase of the first element
instead of pop_front), the computation of the moving averages is 10 times
slower! Of course using a list is not an option here because the length
of the past window we must keep in RAM can be substantial and fast
indexing is required.
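
A sketch of the update for a single window length L (the class and names are
invented; the real tool tracks several lengths over one shared window, which
is where the fast indexing comes in):

#include <cstddef>
#include <deque>

class MovingAverage {
public:
    explicit MovingAverage(std::size_t L) : L_(L), avg_(0.0) {}

    void add(double y)
    {
        window_.push_back(y);                    // newest value at the back
        if (window_.size() <= L_) {
            avg_ += (y - avg_) / window_.size(); // still filling the window
        } else {
            double oldest = window_.front();     // Y(t-L)
            window_.pop_front();                 // slide the window
            avg_ += (y - oldest) / L_;           // the recurrence from above
        }
    }

    double value() const { return avg_; }

private:
    std::size_t L_;
    double avg_;
    std::deque<double> window_;
};

// With several lengths sharing one window of the last ML values, Y(t-L) is
// fetched as window[window.size() - 1 - L] -- the random access that rules
// out std::list.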

----
Michel Minsoul
Universite de LIEGE (BELGIUM)
Michel.Minsoul AT ulg.ac.be

S.K.Mody

Jul 3, 2000
Howard Hinnant <hinnant@anti-spam_metrowerks.com> wrote in message
news:hinnant-0207...@ith3-44d.twcny.rr.com...

> In article <xJH75.10999$HD6.2...@iad-read.news.verio.net>, "S.K.Mody"
> <Mod...@hotmail.com> wrote:
>
> | I think the optimization of vector<bool> is provided via
> | a specialized allocator rather than through the vector class
> | itself. In fact strictly speaking you _have_ to use a special
> | allocator in order to be standards conforming.
>
> Section 23.2.5 / 1 states:
>
> To optimize space allocation, a specialization of vector for bool elements
> is provided:
>
> template <class Allocator>
> class vector<bool, Allocator>
> {
> public:
> ...
> }
>
> | You can get the regular unoptimized behaviour by
> | using vector< bool, std::allocator<bool> >. Of course
> | there is still the question of why the optimized allocator
> | was chosen as the default.
>
> Not in a conforming lib.
>

Yes, sorry I'm wrong. I thought I made a reasonable guess
as to the behaviour.

Regards,
S. K. Mody.

Matt Osborn

Jul 3, 2000

I should have picked a better example. It is the additional capability
to order or modify the deque that makes the deque preferable to the queue.

bran...@cix.compulink.co.uk

Jul 4, 2000
In article <k030msgl9se73u9g7...@4ax.com>,
hsu...@peerdirect.com (Herb Sutter) wrote:
> The complexity requirements in the standard talk about T operations,
> and even this grow case just copies a bunch of array pointers -- that's
> O(1) T operations. Sure, if you include non-T operations, there will
> be N/M = O(N) bald pointer copies... but what the OP claimed was that
> deque's push_back() has the same complexity as vector's, which isn't
> true.

I'm the original poster. What I said was true if we use the usual
definition of algorithmic complexity. Evidently the standard has an
unusual definition.

To ignore non-T operations is strange, because copying a pointer may be
slower than copying a T (eg if T is a plain char the pointer may take 4
times as long). If that's what the C++ standard does, misunderstandings
are inevitable.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Joerg Barfurth

Jul 4, 2000
Herb Sutter <hsu...@peerdirect.com> wrote:

> The complexity requirements in the standard talk about T operations, and
> even this grow case just copies a bunch of array pointers -- that's O(1)
> T operations. Sure, if you include non-T operations, there will be N/M =
> O(N) bald pointer copies... but what the OP claimed was that deque's
> push_back() has the same complexity as vector's, which isn't true.

Complexity is about the asymptotic behavior of execution times or operation
counts. In particular, the values of any proportionality constants, as
well as any lower order terms, are ignored here. Restricting this to only
take a subset of all operations into account is not useful.

BTW: By your argument std::list would allow O(1) random access, as this
requires traversing O(n) node links, but no T operations. Therefore
std::list should provide operator [](size_type).

And finally there would be no issue of list::splice vs. list::size as
both of those would be O(1) in 'most any implementation.

But if you 'include non-T operations' the complexity of deque's
push_back is only amortized constant (O(1)) time, due to occasional
reallocations that are O(n) - which is the same as O(n/m) for fixed m.
This is exactly the same complexity as specified for vector's push_back.

Of course absolute numbers vary depending on use: A non-reallocating
push_back might be faster on average in vector, while reallocation in
deque may be a lot faster than in vector.

--
Jörg Barfurth joerg.b...@attglobal.net
--------------- using std::disclaimer; ------------------
Software Engineer joerg.b...@germany.sun.com
Star Office GmbH http://www.sun.com/staroffice

Gregory Bond

Jul 4, 2000
sme...@aristeia.com (Scott Meyers) writes:

> So I'm still stumped and the programmer's eyes are still probably glazed.
> Can you help me out? Why *is* deque so fundamental a data structure that
> it made it into the standard library? And what *are* some typical
> applications for which a deque is uniquely suited?

Hmm, anywhere a vector<> might be used, but where you can't afford to have
references to existing elements invalidated by a push_back()? (see
23.2.1.3 [in the CD2 draft]). I would expect this would come in handy
in complex container situations (e.g. a collection that is accessed by
index in a vector<> but also associatively via a map<>.)

Interesting note: References are not invalidated by push_back(), but
iterators are!!
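
A tiny demonstration of that guarantee (an illustration only, not from the
original post):

#include <cassert>
#include <deque>

int main()
{
    std::deque<int> d;
    d.push_back(42);
    int& ref = d.front();        // keep a reference to an existing element

    for (int i = 0; i < 100000; ++i)
        d.push_back(i);          // never invalidates ref (iterators, though, are invalidated)

    assert(ref == 42);           // with a vector, ref could well be dangling here
    return 0;
}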

Herb Sutter

Jul 4, 2000
Jens Kilian <Jens_...@agilent.com> writes:

>iv...@mail.com (Ivan Vecerina) writes:
>> vector<bool> is optimized to store 1 bit per item.
>>
>> Its misbehavior that makes it unique among all vectors is that
>> you cannot write, for example:
>>
>> vector<bool> v;
>> ....
>> bool& p = v[i];
>>
>> as individual bits of each char are used to store data.
>
>But you can do
>
> vector<bool>::reference p = v[i];
>
>just as you can with *any* vector<T>. I see this as a minor inconvenience.

Except that the standard guarantees that the first code will work for
standard containers, which means vector<bool> is not a conforming
container. Boom.

Herb

Herb Sutter

Jul 4, 2000
Pete Becker <peteb...@acm.org> writes:

>Herb Sutter wrote:
>> deque::push_back() and its kin never need to copy a
>> T, ever; deque can therefore support "constant time insert and erase
>> operations at the beginning or the end." In contrast, vector can only
>> support "(amortized) constant time insert and erase operations at the
>> end."
>
>No, this doesn't follow. See below.
>
>> The complexity requirements in the standard talk about T operations, and
>> even this grow case just copies a bunch of array pointers -- that's O(1)
>> T operations. Sure, if you include non-T operations, there will be N/M =
>> O(N) bald pointer copies... but what the OP claimed was that deque's
>> push_back() has the same complexity as vector's, which isn't true.
>
>As David Harris pointed out, the key structure in a deque is the master
>array, which holds pointers to sub-arrays that each hold up to M
>objects. Think of that master array as a vector of pointers. The deque
>adds a new pointer to that vector once for every M insert operations.

Right. And that involves no T operations.

>But it's still a vector, with amortized constant time complexity.

Right, which doesn't matter, because that's the complexity in terms of
T* operations, not in terms of T operations.

>The
>advantages of a deque over vector come from inserting into the deque's
>internal vector 1/M times as often, and perhaps from copying fewer bytes
>when reallocating (when a pointer is smaller than a T object). Both of
>these affect the multiplier, but not the time complexity itself. That
>is, a deque will be faster than a vector, but asymptotically both are
>amortized constant time, not constant time.

Everything you said is true except for the last sentence, because that
master array just holds T*'s, and manipulating those doesn't count
toward the complexity we're talking about. 23.1/2 states:

2 All of the complexity requirements in this clause are stated solely in
terms of the number of operations on the contained objects. [Example:
the copy constructor of type vector<vector<int> > has linear complexity,
even though the complexity of copying each contained vector<int> is
itself linear.]

In this case, the complexity of a deque<T> is stated solely in terms of
the number of T operations -- not T* or other operations, even if they
happen. It doesn't matter if deque<T> happens to use a vector<T*> under
the covers, which is amortized constant in T*, because that doesn't
contribute toward the complexity of deque<T> which is constant in T (not
T* or anything else).

In deque<T>'s push_front() and push_back(), there is never a need to
copy any of the already-contained T objects -- indeed, the only T
operation those functions ever need to perform is the obvious single
copy of the newly added object, and so these functions are always
constant time _in terms of T operations_.

Deque offers constant time push_back(). Vector offers "amortized"
constant time for push_back() (and incidentally to get that requires
exponential growth). The two do not have the same complexity.

Pete Becker

Jul 4, 2000
<bran...@cix.compulink.co.uk> (Dave Harris?) wrote

>
>You can make deque push_xxx O(1) worst-case (rather than amortized). The
>trick is to keep two low-level buffers (the buffers of pointers) of size M
>and 2*M. The larger buffer is allocated when the smaller buffer contains
>M/2 elements. Whenever a new pointer is added to the smaller buffer, two
>pointers are copied to the larger buffer. By the time the smaller buffer is
>full, all of its pointers have been copied to the larger buffer (which is
>now half full, and the process can be repeated). This analysis was for
>circular buffers. Some of the constants have to be changed for non-circular
>buffers.

Interesting technique. It replaces a single pointer assignment in the usual
implementation with three assignments, and triples the memory usage for the
base array. In exchange for this, it eliminates the copy that the usual
implementation does from time to time.

Those overheads aren't too bad, since this is actually a small part of the
overall push_back operation. What scares me about it is the edge cases.
Getting the array manipulation right and fast is hard enough with a single
array. The thought of having to get it right and fast for two arrays gives
me a headache.

>I haven't run timings on this, but I suspect the complexity associated with
>maintaining two base-level buffers would make this implementation
>unattractive unless you seriously needed the strong guarantee on worst-case
>performance and were working with very large deques.

I agree. It looks to me like the overhead of this approach is higher, on
average, than the usual implementation. I think the usual approach is good
enough, and I'll propose relaxing the requirement in the standard to permit it.


-- Pete

Dinkumware, Ltd.
"Genuine Software"

Pete Becker

Jul 4, 2000
Herb Sutter wrote:
>Pete Becker <peteb...@acm.org> writes:
>>Herb Sutter wrote:
>>> deque::push_back() and its kin never need to copy a
>>> T, ever; deque can therefore support "constant time insert and erase
>>> operations at the beginning or the end." In contrast, vector can only
>>> support "(amortized) constant time insert and erase operations at the
>>> end."
>>
>>No, this doesn't follow. See below.
>>
>>> The complexity requirements in the standard talk about T operations, and
>>> even this grow case just copies a bunch of array pointers -- that's O(1)
>>> T operations. Sure, if you include non-T operations, there will be N/M =
>>> O(N) bald pointer copies... but what the OP claimed was that deque's
>>> push_back() has the same complexity as vector's, which isn't true.
>>
>>As David Harris pointed out, the key structure in a deque is the master
>>array, which holds pointers to sub-arrays that each hold up to M
>>objects. Think of that master array as a vector of pointers. The deque
>>adds a new pointer to that vector once for every M insert operations.
>
>Right. And that involves no T operations.

The time complexity requirements apply to deque<T>::push_back(T) regardless
of how it is implemented.

>
>>But it's still a vector, with amortized constant time complexity.
>
>Right, which doesn't matter, because that's the complexity in terms of
>T* operations, not in terms of T operations.

No, because it affects the observable time complexity of push_back(T).

>
>>The
>>advantages of a deque over vector come from inserting into the deque's
>>internal vector 1/M times as often, and perhaps from copying fewer bytes
>>when reallocating (when a pointer is smaller than a T object). Both of
>>these affect the multiplier, but not the time complexity itself. That
>>is, a deque will be faster than a vector, but asymptotically both are
>>amortized constant time, not constant time.
>
>Everything you said is true except for the last sentence, because that
>master array just holds T*'s, and manipulating those doesn't count
>toward the complexity we're talking about. 23.1/2 states:
>
>2 All of the complexity requirements in this clause are stated solely in
> terms of the number of operations on the contained objects. [Example:
> the copy constructor of type vector <vector<int> > has linear complex-
> ity, even though the complexity of copying each contained vector<int>
> is itself linear. ]

You're reading this inside out. It is not a license to implement algorithms
badly if they deal internally with pointers instead of objects. What it
says is that the complexity requirements are based on the assumption that
operations on the contained objects, such as copying, are constant time. If
that assumption is not satisfied then the container need not satisfy the
complexity requirements.

>
>In this case, the complexity of a deque<T> is stated solely in terms of
>the number of T operations -- not T* or other operations, even if they
>happen.

There is no such thing as the "complexity of a deque<T>." It is only
meaningful to talk about the complexity of various operations on a deque<T>.

> It doesn't matter if deque<T> happens to use a vector<T*> under
>the covers, which is amortized constant in T*, because that doesn't
>contribute toward the complexity of deque<T> which is constant in T (not
>T* or anything else).

If the implementation of deque<T>::push_back does not satisfy the complexity
requirements that the standard imposes, it does not matter whether that
arises from a bad choice of an algorithm that deals with pointers or from a
bad choice of an algorithm that deals with objects. It doesn't satisfy the
complexity requirements.

>
>In deque<T>'s push_front() and push_back(), there is never a need to
>copy any of the already-contained T objects -- indeed, the only T
>operation those functions ever need to perform is the obvious single
>copy of the newly added object, and so these functions are always
>constant time _in terms of T operations_.

Defining time complexity in this way would make it utterly useless, since
it would impose no requirements on user-visible operations. As I pointed
out earlier, what the standard allows is violation of time complexity
requirements when the contained objects do not satisfy the assumptions made
by the standard. It does not allow violation of the time complexity
requirements through the ruse of dealing with pointers instead of objects.

>
>Deque offers constant time push_back().

David Harris has demonstrated that, in the usual implementation, it does not.
Rather, push_back is amortized constant time. As you add elements to the
container, most additions will take constant time, but once in a while
there will be a copy operation whose time is proportional to the number of
elements in the container.

Carlos Moreno

unread,
Jul 4, 2000, 3:00:00 AM7/4/00
to
> > The complexity requirements in the standard talk about T operations,
> > and even this grow case just copies a bunch of array pointers -- that's
> > O(1) T operations. Sure, if you include non-T operations, there will
> > be N/M = O(N) bald pointer copies... but what the OP claimed was that
> > deque's push_back() has the same complexity as vector's, which isn't
> > true.
>
> I'm the original poster.

Will you look at this... Another one that believes he is Scott
Meyers...!!

;-)

Carlos
--

Christopher Eltschka

unread,
Jul 4, 2000, 3:00:00 AM7/4/00
to
Jens Kilian wrote:
>
> iv...@mail.com (Ivan Vecerina) writes:
> > vector<bool> is optimized to store 1 bit per item.
> >
> > Its misbehavior that makes it unique among all vectors is that
> > you cannot write, for example:
> >
> > vector<bool> v;
> > ....
> > bool& p = vector[i];
> >
> > as individual bits of each char are used to store data.
>
> But you can do
>
> vector<bool>::reference p = v[i];
>
> just as you can with *any* vector<T>. I see this as a minor inconvenience.

What about this:

template<class T> bool read_from_config(std::string name, T& value);

void foo()
{
    std::vector<bool> v(10);
    for (int i = 0; i < v.size(); ++i)
    {
        std::ostringstream s;
        s << "item" << i;
        if (!read_from_config(s.str(), v[i]))  // doesn't compile!
            std::cerr << "Reading " << s.str() << " failed!" << std::endl;
    }
    // ...
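
For what it's worth, the usual workaround (my sketch, not part of the
example above) is to read into a real bool and assign through the proxy
afterwards; read_from_config is the same hypothetical function:

#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

template<class T> bool read_from_config(std::string name, T& value);

void foo_fixed()
{
    std::vector<bool> v(10);
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        std::ostringstream s;
        s << "item" << i;
        bool tmp;                              // a real bool to bind T& to
        if (!read_from_config(s.str(), tmp))
            std::cerr << "Reading " << s.str() << " failed!" << std::endl;
        else
            v[i] = tmp;                        // assignment via the proxy is fine
    }
}

It works, but the fact that generic-looking code needs a special case at
all is exactly the complaint.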

Chris Newton

unread,
Jul 4, 2000, 3:00:00 AM7/4/00
to
Pete Becker <peteb...@acm.org> wrote [abridged]...

> Herb Sutter wrote:
> >In deque<T>'s push_front() and push_back(), there
> >is never a need to copy any of the already-contained
> >T objects -- indeed, the only T operation those
> >functions ever need to perform is the obvious single
> >copy of the newly added object, and so these
> >functions are always constant time _in terms of
> >T operations_.
>
> Defining time complexity in this way would make it
> utterly useless, since it would impose no requirements
> on user-visible operations.

It seems to me that both Herb and Pete are correct in their own frames
of reference, but that those frames are not the same. There are two
relevant types of complexity here:
- complexity in terms of number of operations on contained objects;
- complexity in terms of time taken.
In this area, these two concepts can give different answers.

23.1/2 of the standard reads, "All of the complexity requirements in
this clause are stated solely in terms of the number of operations on
the contained objects." That would seem to support Herb's point of view
here. I respectfully disagree with Pete that this is "utterly useless",
since knowing how many times a method of your object is called clearly
can be important. If copying your object is expensive in terms of time,
then the two types of complexity above are going to coincide anyway for
practical purposes. If not, maybe there are some side effects of copying
that you want to know about anyway.

On the other hand, I think Pete has a valid point, in that if the
complexity requirements do not force an efficient implementation, their
value is limited. Perhaps the standard's failure to clarify this is a
weakness that could be addressed in the next revision, if only as a
paragraph clarifying the spirit of the existing requirements. I will
leave the gurus here to debate whether more than this is required, since
if copying your objects is cheap in time terms, it is at least
questionable whether the time complexity is relevant to real world code
anyway.

Regards,
Chris

Pete Becker

unread,
Jul 5, 2000, 3:00:00 AM7/5/00
to

The requirements for push_back (and for most other algorithms) impose two
complexity requirements: one for the number of operations on T and one for
time complexity. For example, for list::push_back the standard says:

Complexity: Insertion of a single element into
a list takes constant time and exactly one call
to the copy constructor of T. Insertion of
multiple elements into a list is linear in the
number of elements inserted, and the number of
calls to the copy constructor of T is exactly
equal to the number of elements inserted.

Given this clear example, the statement in 23.1/2, that "[a]ll of the
complexity requirements in this clause are stated solely in terms of the
number of operations on the contained objects," is simply wrong. Satisfying
requirements on the number of operations on T is only part of satisfying
the overall complexity requirements.

The general rule for push_back comes from the introduction to Table 68:

The operations in Table 68 are provided only for
the containers for which they take constant time:

and Table 68 describes push_front and push_back. So the requirement is
"constant time," with its usual meaning, not some artifical construct that
tries to turn "constant time" into a requirement on the number of
operations on T. (Consider an implementation of deque that used a
singly-linked list: push_back would require O(N) operations to find the
insertion point, but would only do one copy of T to insert it at that
point. That isn't "constant time.")
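
To caricature that in code (my sketch, not a real implementation): a
"deque" built on a singly-linked list with no tail pointer performs
exactly one copy of T per push_back, yet is clearly not constant time.

template <class T>
class slist_deque {
    struct node {
        T     value;
        node* next;
        node(const T& v) : value(v), next(0) {}
    };
    node* head;
public:
    slist_deque() : head(0) {}
    void push_back(const T& v) {       // one T copy...
        node* n = new node(v);
        if (!head) { head = n; return; }
        node* p = head;
        while (p->next)                // ...but a linear walk to the tail
            p = p->next;
        p->next = n;
    }
    // (iteration, erasure, destruction, etc. omitted)
};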

>
> On the other hand, I think Pete has a valid point, in that if the
> complexity requirements do not force an efficient implementation, their
> value is limited. Perhaps the standard's failure to clarify this is a
> weakness that could be addressed in the next revision, if only as a
> paragraph clarifying the spirit of the existing requirements.

The problem is in the paragraph "clarifying the spirit of the existing
requirements." The statement that it makes about how the requirements are
stated is wrong.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)



[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

bran...@cix.compulink.co.uk

unread,
Jul 5, 2000, 3:00:00 AM7/5/00
to
In article <2000070412...@chmls05.mediaone.net>,
peteb...@acm.org (Pete Becker) wrote:
> <bran...@cix.compulink.co.uk> (Dave Harris?) wrote
> >
> >You can make deque push_xxx O(1) worst-case (rather than amortized).

"Brangdon" is Dave Harris, but I didn't write those words. They were
written by "Bill Wade" <bill...@stoner.com> according to my newsfeed.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

A. G. McDowell

unread,
Jul 5, 2000, 3:00:00 AM7/5/00
to
In article <bkq2msgqd93c5bcka...@4ax.com>, Herb Sutter
<hsu...@peerdirect.com> writes

>
>Deque offers constant time push_back(). Vector offers "amortized"
>constant time for push_back() (and incidentally to get that requires
>exponential growth). The two do not have the same complexity.
>
I'd just like to point out - because this is one of the nicest little
tricks that I have seen - that "exponential growth" here does not mean
that a disproportionate amount of store is consumed. The sort of
strategy used here is to double the amount of store allocated each time
you find that the previous allocation was not enough. The real beauty of
this trick is of course two-fold:

1/On average, each unit stored is copied only some constant number of
times: if you need to copy at the very end, then the last half of the
characters added are copied once, the quarter before that twice, and so
on, so you get
1/2*1 + 1/4*2 + 1/8*3 + 1/16*4 + ... which approaches a finite limit
(2, in fact)

2/Even though the store used is "exponentially growing", you can never
allocate more than twice the amount of store that is required - so your
storage overhead is at most a constant factor of two: pretty good
compared, for instance, to a linked list of small objects, where you
have to allocate link fields. In fact you would still have only a
constant factor overhead if you charged for all the store used and then
freed again - this follows from the fact that you have copied only a
constant number of times.
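
A quick simulation (my sketch) makes the constant in point 1/ visible; N
is chosen to land just past a reallocation, which is the worst-case
moment:

#include <cstdio>

int main()
{
    const long long N = (1 << 20) + 1;    // 1,048,577 push_backs
    long long copies = 0, size = 0, capacity = 1;
    for (long long i = 0; i < N; ++i) {
        if (size == capacity) {           // reallocate: copy everything once
            copies += size;
            capacity *= 2;
        }
        ++size;
    }
    // Prints roughly 2.00 copies per element, the limit of the series
    // above; between reallocations the ratio drifts back towards 1.
    std::printf("%.2f copies per element\n", double(copies) / N);
    return 0;
}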
--
A. G. McDowell

Ken Bloom

unread,
Jul 5, 2000, 3:00:00 AM7/5/00
to
"Gregory Bond" <g...@itga.com.au> wrote in message
news:86ya3i3...@hellcat.itga.com.au...

> sme...@aristeia.com (Scott Meyers) writes:
>
> > So I'm still stumped and the programmer's eyes are still probably
glazed.
> > Can you help me out? Why *is* deque so fundamental a data structure
that
> > it made it into the standard library? And what *are* some typical
> > applications for which a deque is uniquely suited?
>
> Hmm, anywhere a vector<> might be used, but can't afford to have
> references to existing elements invalidated by a push_back()? (see
> 23.2.1.3 [in the CD2 draft]). I would expect this would come in handy
> in complex container situations (e.g. a collection that is accessed by
> index in a vector<> but also associatively via a map<>.)
>
> Interesting note: References are not invalidated by push_back(), but
> iterators are!!

Well, if you're talking about the CD2 draft, it is known to be incomplet,
incorect and iNconSiStenT. I hope they fixed this in the final standard.

Ron Hunsinger

unread,
Jul 6, 2000, 3:00:00 AM7/6/00
to
In article <bkq2msgqd93c5bcka...@4ax.com>, Herb Sutter
<hsu...@peerdirect.com> wrote:

> Everything you said is true except for the last sentence, because that
> master array just holds T*'s, and manipulating those doesn't count
> toward the complexity we're talking about. 23.1/2 states:
>
> 2 All of the complexity requirements in this clause are stated solely in
> terms of the number of operations on the contained objects. [Example:
> the copy constructor of type vector <vector<int> > has linear complex-
> ity, even though the complexity of copying each contained vector<int>
> is itself linear. ]

I think you're misreading that. It doesn't mean that bookkeeping operations
aren't counted. What they meant to say (and admittedly said badly, although
the example makes the intent clear) is that each operation on a contained
object is counted as if it were constant time even if it really isn't.

In the case of a vector<T>, you can compute the cost of copying a vector AS
IF you had a constant upper bound on the time needed to copy a T, even if
(as in the case where T = vector<int>) there is not actually any such upper
bound. You can do that, because 23.1/2 says that each operation on a T
(including copying it) is a single operation.

As an accountant will tell you, bookkeeping always counts.

-Ron Hunsinger

Christopher Eltschka

unread,
Jul 6, 2000, 3:00:00 AM7/6/00
to
Ken Bloom wrote:
>
> "Gregory Bond" <g...@itga.com.au> wrote in message
> news:86ya3i3...@hellcat.itga.com.au...
> > sme...@aristeia.com (Scott Meyers) writes:
> >
> > > So I'm still stumped and the programmer's eyes are still probably
> glazed.
> > > Can you help me out? Why *is* deque so fundamental a data structure
> that
> > > it made it into the standard library? And what *are* some typical
> > > applications for which a deque is uniquely suited?
> >
> > Hmm, anywhere a vector<> might be used, but can't afford to have
> > references to existing elements invalidated by a push_back()? (see
> > 23.2.1.3 [in the CD2 draft]). I would expect this would come in handy
> > in complex container situations (e.g. a collection that is accessed by
> > index in a vector<> but also associatively via a map<>.)
> >
> > Interesting note: References are not invalidated by push_back(), but
> > iterators are!!
>
> Well, if you're talking about the CD2 draft, it is known to be incomplet,
> incorect and iNconSiStenT. I hope they fixed this in the final standard.

If it ain't broken, don't fix it! ;-)

Iterators have to access the first-level array (otherwise
they couldn't find out where the next object is). The
first-level array may be reallocated. Thus iterators
will be invalidated just as with vector.
References only refer to the objects themselves, which are
in the second-level arrays which are never reallocated.
Therefore references don't get invalidated by push_back.
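
A small illustration (my code, assuming a conforming implementation) of
the difference:

#include <cassert>
#include <deque>

int main()
{
    std::deque<int> d;
    d.push_back(42);
    int& ref = d.front();                      // reference to an element
    std::deque<int>::iterator it = d.begin();  // iterator to the same element
    for (int i = 0; i < 10000; ++i)
        d.push_back(i);                        // may reallocate the first-level array
    assert(ref == 42);                         // OK: references stay valid
    // Dereferencing 'it' here would be undefined behaviour: push_back
    // invalidates deque iterators.
    (void)it;
    return 0;
}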

Ron Hunsinger

unread,
Jul 6, 2000, 3:00:00 AM7/6/00
to
In article <AqrKFBAybkY5Ew$a...@mcdowella.demon.co.uk>, "A. G. McDowell"
<mcdo...@mcdowella.demon.co.uk> wrote:

> 2/Even though the store used is "exponentially growing", you can never
> allocate more than twice the amount of store that is required - so your
> storage overhead is at most a constant factor of two: pretty good
> compared, for instance, to a linked list of small objects, where you
> have to allocate link fields. In fact you would still have only a
> constant factor overhead if you charged for all the store used and then
> freed again - this follows from the fact that you have copied only a
> constant number of times.

In fact, the storage overhead compares even more favorably than that.

Suppose you are comparing the maximum storage required to insert N elements
into two vector-like containers, one of which reallocates by a fixed number
of elements, and the other by a fixed ratio. For instance, adding M
elements in the first case, or doubling the size in the second case.

In the worst case, the last element added caused a reallocation. Under the
first algorithm, the old vector had a capacity for N-1 elements, and the
new container holds N-1+M. The total space needed at that instant is
2N + M - 2. Assuming N is large, the actual value for M isn't
relevant. The maximum memory requirement is 2N and change.

Under the same circumstances, the second algorithm copies from a vector of
N-1 elements to one of 2N-2 elements. Total memory requirement at that
instant is 3N-3 elements, only about 50% more than the fixed-growth method.

The best case is when N elements just fit, and it's the N+1st element that
would cause reallocation. Then, at the moment of the last reallocation, the
fixed growth method would be copying from N-M to N, needing space for a
total of 2N-M elements. (Call it 2N, assuming N much larger than M.) The
doubling algorithm would have grown from N/2 to N elements, for a total
cost of 3N/2. In this case, despite the apparent waste, the doubling method
actually winds up using 25% *less* memory than the more conservative
method.

In the average case, the fixed-growth method will have needed space for 2N
elements to grow to a size of N, while the doubling algorithm will have
needed only 9N/4. The "exponentially growing" store compares much more
favorably, in terms of total memory required, than one might expect.
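
For anyone who wants to see the numbers, a small simulation (my sketch)
of the high-water marks described above, for N = 1,000,000 and M = 1000:

#include <algorithm>
#include <cstdio>

int main()
{
    const long N = 1000000, M = 1000;

    long cap = 0, peak = 0;
    for (long size = 1; size <= N; ++size)     // fixed growth: cap += M
        if (size > cap) {
            peak = std::max(peak, cap + (cap + M));
            cap += M;
        }
    std::printf("fixed + %ld : peak %ld (%.2f * N)\n", M, peak, double(peak) / N);

    cap = 0; peak = 0;
    for (long size = 1; size <= N; ++size)     // doubling growth
        if (size > cap) {
            long newcap = cap ? cap * 2 : 1;
            peak = std::max(peak, cap + newcap);
            cap = newcap;
        }
    std::printf("doubling    : peak %ld (%.2f * N)\n", peak, double(peak) / N);
    return 0;
}

For this N the doubling strategy happens to land near its best case
(about 1.57 * N versus 2.00 * N for fixed growth); pick N just past a
power of two, say (1 << 20) + 1, to see the 3N worst case instead.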

-Ron Hunsinger

Pete Becker

unread,
Jul 6, 2000, 3:00:00 AM7/6/00
to
Ron Hunsinger wrote:
>
> In article <bkq2msgqd93c5bcka...@4ax.com>, Herb Sutter
> <hsu...@peerdirect.com> wrote:
>
> > Everything you said is true except for the last sentence, because that
> > master array just holds T*'s, and manipulating those doesn't count
> > toward the complexity we're talking about. 23.1/2 states:
> >
> > 2 All of the complexity requirements in this clause are stated solely in
> > terms of the number of operations on the contained objects. [Example:
> > the copy constructor of type vector <vector<int> > has linear complex-
> > ity, even though the complexity of copying each contained vector<int>
> > is itself linear. ]
>
> I think you're misreading that. It doesn't mean that bookkeeping operations
> aren't counted. What they meant to say (and admittedly said badly, although
> the example makes the intent clear) is that each operation on a contained
> object is counted as if it were constant time even if it really isn't.
>
> In the case of a vector<T>, you can compute the cost of copying a vector AS
> IF you had a constant upper bound on the time needed to copy a T, even if
> (as in the case where T = vector<int>) there is not actually any such upper
> bound. You can do that, because 23.1/2 says that each operation on a T
> (including copying it) is a single operation.
>

Actually, it says a great deal less than this. <g> Those words are a
holdover from the HP STL documentation that was submitted to the standards
committee. That documentation only discussed the number of T operations
that were made. We added discussions of time complexity (as in the
description of list::push_back that I quoted earlier), and it looks like
nobody noticed this front matter that simply says the wrong thing.


-- Pete

Dinkumware, Ltd.
"Genuine Software"

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Ian McCulloch

unread,
Jul 8, 2000, 3:00:00 AM7/8/00
to
Herb Sutter wrote:
>
> >advantages of a deque over vector come from inserting into the deque's
> >internal vector 1/M times as often, and perhaps from copying fewer bytes
> >when reallocating (when a pointer is smaller than a T object). Both of
> >these affect the multiplier, but not the time complexity itself. That
> >is, a deque will be faster than a vector, but asymptotically both are
> >amortized constant time, not constant time.
>
> Everything you said is true except for the last sentence, because that
> master array just holds T*'s, and manipulating those doesn't count
> toward the complexity we're talking about. 23.1/2 states:
>
> 2 All of the complexity requirements in this clause are stated solely in
> terms of the number of operations on the contained objects. [Example:
> the copy constructor of type vector <vector<int> > has linear complex-
> ity, even though the complexity of copying each contained vector<int>
> is itself linear. ]
>

It looks to me like the intent of that paragraph is to say that,
for the purposes of our function, a T operation occurs in constant time.
That isn't what it says though, and the difference is critical. What if
I wrote a version of std::copy that looked like this:

template <class Iter1, class Iter2>
void copy(Iter1 start1, Iter1 finish1, Iter2 start2)
{
    int n = std::distance(start1, finish1);
    while (start1 != finish1)
    {
        *start2++ = *start1++;
        for (int i = 0; i < n; ++i)
        {
            waste_some_time();  // some O(1) busy-work, no T operations
        }
    }
}

If the *start2++ = *start1++ operation is taken to be constant time (in
fact, at worst O(n) would suffice), then the time complexity of this
algorithm is clearly O(n^2). Taking the literal interpretation of the
quoted paragraph of the standard, it is O(n) and thus would be a
conforming implementation (aside from it not working on input
iterators). That seems incorrect to me - no matter how fast the T
operations are, the above algorithm simply cannot execute in O(n). The
situation is similar for deque insertion: even if it performs only one T
operation, the algorithm itself is only amortized constant time (by the
conventional meaning of time complexity). So, it seems that the standard
uses a different notion of time complexity than everyone else. A defect?

Cheers,
Ian McCulloch

Herb Sutter

unread,
Jul 8, 2000, 3:00:00 AM7/8/00
to
Pete Becker <peteb...@acm.org> writes:
>Actually, it says a great deal less than this. <g> Those words are a
>holdover from the HP STL documentation that was submitted to the standards
>committee. That documentation only discussed the number of T operations
>that were made. We added discussions of time complexity (as in the
>description of list::push_back that I quoted earlier), and it looks like
>nobody noticed this front matter that simply says the wrong thing.

Thanks for helping to clear up something I found difficult to grok about
Clause 23. I think we ought to fix this, if only to avoid having other
folks fall down the same hole I did.

Herb

John Potter

unread,
Jul 24, 2000, 3:00:00 AM7/24/00
to
On 8 Jul 2000 07:06:57 -0400, Herb Sutter <hsu...@peerdirect.com>
wrote:

> Pete Becker <peteb...@acm.org> writes:
> >Actually, it says a great deal less than this. <g> Those words are a
> >holdover from the HP STL documentation that was submitted to the standards
> >committee. That documentation only discussed the number of T operations
> >that were made. We added discussions of time complexity (as in the
> >description of list::push_back that I quoted earlier), and it looks like
> >nobody noticed this front matter that simply says the wrong thing.
>
> Thanks for helping to clear up something I found difficult to grok about
> Clause 23. I think we ought to fix this, if only to avoid having other
> folks fall down the same hole I did.

I think that you are opening a very large can of worms. The standard
is very sloppy about time complexities. For example, note that for
vector<int> v, v.insert(v.end(), istream_iterator<int>(cin),
istream_iterator<int>()) takes zero time. Must be interesting to
implement. :) The introduction to Table 68 states that the operations
take constant time. It is obvious that the constant time is amortized
since the table includes vector push_back. There is also the general
question of amortized over what? It is obvious that a sequence of
alternating ++ -- for set iterator will not be amortized constant time,
but a complete traversal using ++ or -- will be.

A vector with growth function new_capacity = capacity + 1000 does not
conform but will outperform new_capacity = capacity * 1000001 / 1000000
+ 1 which does conform.
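
Counting total element copies for the two policies (my sketch) shows how
lopsided this is in practice:

#include <cstdio>

int main()
{
    const long long N = 1000000;
    long long copies1 = 0, cap1 = 0;    // new_capacity = capacity + 1000
    long long copies2 = 0, cap2 = 0;    // new_capacity = capacity * 1000001 / 1000000 + 1
    for (long long size = 1; size <= N; ++size) {
        if (size > cap1) { copies1 += cap1; cap1 += 1000; }
        if (size > cap2) { copies2 += cap2; cap2 = cap2 * 1000001 / 1000000 + 1; }
    }
    // At N = 1,000,000 the conforming ratio-based growth does roughly a
    // thousand times more copying than the non-conforming fixed increment.
    std::printf("capacity + 1000          : %lld copies\n", copies1);
    std::printf("capacity * 1.000001 + 1  : %lld copies\n", copies2);
    return 0;
}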

Since the standard assumes that copy construction and allocation are
constant time, there is no way to measure times and show other than
a gross violation of the requirements. Page faults and multiprocessing
also contribute. Complexities in terms of operations can be counted
accurately. In the above set iterator example, the operation is a
pointer chase and associated compares.

Do you really want to take the time to fix all of those things and
the others not mentioned?

John
