Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

deque<> vs. list<> for a queue

49 views
Skip to first unread message

Bonita Montero

unread,
Nov 17, 2023, 5:06:11 AM11/17/23
to
I know that a deque is superior to a list if you want to have a queue
because it does chunk-allocations, thereby allogating the storage for
multiple nodes at once. But I was curious about what's the actual per-
formance-impact. And I was interested in how many allocations were
saved through a deque. So I overloaded new and delete, redirected the
allocation to malloc and free and counted the number of allocations.

#include <iostream>
#include <deque>
#include <string>
#include <sstream>
#include <chrono>
#include <optional>
#include <list>

using namespace std;
using namespace chrono;

uint64_t allocs = 0;

void *operator new( size_t n )
{
++allocs;
return malloc( n );
}

void operator delete( void *p ) noexcept
{
free( p );
}

int main()
{
auto bench = []( auto queue )
{
for( size_t i = 0; i != 10'000; ++i )
queue.emplace_back( (ostringstream() << "no short string
optimization: " << i).str() );
auto start = high_resolution_clock::now();
constexpr size_t ROUNDS = 100'000'000;
uint64_t allocsBefore = ::allocs;
for( size_t r = ROUNDS; r--; )
{
string str( move( queue.front() ) );
queue.pop_front();
queue.emplace_back( move( str ) );
}
double ns = (double)duration_cast<nanoseconds>(
high_resolution_clock::now() - start ).count() / ROUNDS;
uint64_t allocs = ::allocs - allocsBefore;
cout << ns << ": " << allocs << endl;
};
bench( deque<string>() );
bench( list<string>() );
}

With MSVC I get the following results on a AMD 7950X Zen4 CPU:
4.01774: 6384
23.5851: 100000000
So the deque is about eight times faster and has 15.600 times less
allocations. So the chunk size must be really large with MSVC's
deque.
With g++ 12.3.0 the results are the following:
3.13869: 6250000
11.8536: 100000000
So the deque is about 3.5 times faster and the chunks seem rather
small with 16 elements per chunk. Nevertheless libstdc++ seems to
have the faster memory allocator so that allocating so much more
chunks doesn't result in less performance. The better allocation
performance shows also with the list-performance.
So the decision is for a deque if you can afford the memory blend.

Bonita Montero

unread,
Nov 17, 2023, 5:29:34 AM11/17/23
to
I guess that MSVC's deque-implementation attaches an just emptied front
chunk to the back's capactity. Maybe I'll find a way to prove that.

Pavel

unread,
Nov 17, 2023, 9:43:06 PM11/17/23
to
This particular test will most likely be bested by boost::circular_buffer.

Bonita Montero

unread,
Nov 17, 2023, 11:19:11 PM11/17/23
to
Am 18.11.2023 um 03:42 schrieb Pavel:

> This particular test will most likely be bested by boost::circular_buffer.

I've got about 18 cycles per rotate from the back to the front unter
linux. A circular buffer wouldn't be much more efficient and has the
drawback that it is expensive to extend.
0 new messages