#include <iostream>
#include <new>

int main()
{
    int i, n;
    int *p;

    std::cout << "How many numbers would you like to type? ";
    std::cin >> i;

    p = new(std::nothrow) int[i];
    if (p == 0)
        std::cout << "Memory could not be allocated ";
    else
        for (n = 0; n < i; n++)
        {
            std::cout << "Enter number ";
            std::cin >> p[n];
        }

    std::cout << "You have entered ";
    for (n = 0; n < i; n++)
    {
        if (p[n] < p[i-1])
        {
            std::cout << p[n] << ", ";
        }
        else if (p[n] == p[i-1])
        {
            std::cout << p[n] << ".";
        }
    }
    delete [] p;
    std::cout << std::endl;
    return 0;
}
> Is it the right way to write the code?
> Here is my code using new and delete, where I get numbers from the
> user and print them in order. Can I do it better?
Well, if the scope of your question is whether or not you're using new and
delete correctly, you are. However, if "better" should be interpreted in a
wide, generic manner, a better approach is to use std::list, which
eliminates the need for new and delete altogether.
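For instance, a rough sketch of that approach (mine, untested against your
exact requirements) reads numbers until input fails, so no count is needed
up front:

#include <iostream>
#include <list>

int main()
{
    std::list<int> numbers;
    int value;

    std::cout << "Enter numbers, end with EOF: ";
    while (std::cin >> value)
        numbers.push_back(value);       // the list grows as needed

    for (std::list<int>::const_iterator it = numbers.begin();
         it != numbers.end(); ++it)
        std::cout << *it << ' ';
    std::cout << std::endl;
    return 0;
}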
Also, there are other unrelated problems with your code, namely that when
you are trying to print out the entered values, your punctuation is going to
come out right only if you entered your numbers in a strictly numerically
increasing order. If you typed them in an unsorted order, your punctuation
is going to come out broken.
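The delimiter should depend on the position, not the value; a one-line
sketch of the fix:

std::cout << p[n] << (n < i-1 ? ", " : ".");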
Hi Sam,
I guess this is better implemented with std::vector rather than list,
since the number of elements it's going to hold is known a priori
(the program gets it via cin). If the number of elements is not known
and random access is not needed, then, as you correctly pointed out,
list is the better choice. I can also see a few access-violation bugs
in the given program; using the STL would solve those issues.
Thanks & Regards,
Karthik.
Next is exceptions. Yes, there are no exceptions in your simple example,
but are you sure there aren't any in a more realistic scenario? You
will have a memory leak if anything between the new and the delete
throws an exception. Putting the code into a try block might be a
solution, but when your catch code looks like this
catch (...) { delete [] p; throw; }
you should think about RAII instead and encapsulate the array, or at
least the pointer, in an object which deletes the array in its
destructor.
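A minimal sketch of such a wrapper (the class name is mine; copying is
deliberately forbidden so the array can't be deleted twice):

#include <cstddef>

class IntArray
{
public:
    explicit IntArray(std::size_t n) : p_(new int[n]) {}
    ~IntArray() { delete [] p_; }
    int &operator[](std::size_t i) { return p_[i]; }
private:
    IntArray(const IntArray &);            // declared but never defined:
    IntArray &operator=(const IntArray &); // copying is forbidden
    int *p_;
};

With this, the array is released on every path out of the enclosing
scope, exception or not.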
The best solution to all these problems is to simply use
std::vector<int> instead of int[]. The vector takes care of proper
memory handling, and if you have to use old C code which expects a
pointer to an array plus the size of the array, like this one:
void oldCFunc (int * data, size_t size);
then you can call it with a vector too (as long as the vector isn't empty):
std::vector<int> myVector(20);
oldCFunc(&myVector[0], myVector.size());
> increasing order. If you typed them in an unsorted order, your punctuation
> is going to come out broken.
Thanks Sam for the reply.
Yes, it's broken when the user gives the numbers in unsorted order. How
can I do better with it? I have already done it with a vector.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
int main( void )
{
typedef std::vector<int> container_type;
container_type values;
container_type::value_type value;
std::cout << "Enter numbers, end with 'q': ";
while (std::cin >> value)
{
values.push_back(value);
}
if (! values.empty())
{
std::cout << "\nYou entered the " << values.size() << " values: ";
const container_type::iterator last (--values.end ());
// copy all but the last value from the container
// to the output stream, use ", " as delimiter.
std::copy ( values.begin()
, last
, std::ostream_iterator<container_type::value_type>
(std::cout, ", ") );
// copy the last value to the output stream
std::cout << *last << "." << std::endl;
}
else
std::cout << "\nYou entered no valid data." << std::endl;
return 0;
}
- Scooser
Thanks Scooser, it's great...
What features of list, as opposed to any of the other standard
containers, do you see as being particularly relevant to the OP's
problem?
--
Richard Herring
> > What features of list, as opposed to any of the other standard
> > containers, do you see as being particularly relevant to the OP's
> > problem?
>
> The ability to collect a list of numbers, of variable size, then iterate
> over them for the purpose of printing them back.
I'll repeat part of what Richard said for emphasis: "as opposed to
any of the other standard containers"
Do you believe list is the only standard container that can collect a
list of numbers of variable size then iterate over them to print them
out?
If you believe that, I'd suggest you spend a bit of time studying the
other standard containers -- starting with vector and deque.
--
Later,
Jerry.
std::vector can do all of that.
Not random insertion, deletion and splicing, then?
--
Richard Herring
> In message <cone.1253828350....@commodore.email-scan.com>,
> Sam <s...@email-scan.com> writes
>>Richard Herring writes:
>>
>>> In message
>>><cone.1253761562...@commodore.email-scan.com>, Sam
>>><s...@email-scan.com> writes
>>>>arunix writes:
>>>>
>>>>> Is it the right way to write the code?
>>>>> Here is my code using new and delete, where I get numbers from the
>>>>> user and print them in order. Can I do it better?
>>>>
>>>>Well, if the scope of your question is whether or not you're using
>>>>new and delete correctly, you are. However, if "better" should be
>>>>interpreted in a wide, generic manner, a better approach is to use
>>>>std::list,
>>> What features of list, as opposed to any of the other standard
>>>containers, do you see as being particularly relevant to the OP's
>>>problem?
>>
>>The ability to collect a list of numbers, of variable size, then
>>iterate over them for the purpose of printing them back.
>
> std::vector can do all of that.
Except that you either need to know the number of elements in advance, or
end up doing a lot of needless reallocations.
> Not random insertion, deletion and splicing, then?
Insertion and deletion in the middle of a std::list carry lower complexity
than they do for a std::vector, but thank you for playing anyway.
> In article <cone.1253828350....@commodore.email-
> scan.com>, s...@email-scan.com says...
>>
>> Richard Herring writes:
>
> [ ... ]
>
>> > What features of list, as opposed to any of the other standard
>> > containers, do you see as being particularly relevant to the OP's
>> > problem?
>>
>> The ability to collect a list of numbers, of variable size, then iterate
>> over them for the purpose of printing them back.
>
> I'll repeat part of what Richard said for emphasis: "as opposed to
> any of the other standard containers"
Repeating the same question will only yield the same answer: "the ability to
collect a list of numbers, of variable size, then iterate over them for the
purpose of printing them back". If you do not understand what that means,
look it up.
> Do you believe list is the only standard container that can collect a
> list of numbers of variable size then iterate over them to print them
> out?
Do you believe that grandstanding in this manner is really a successful
strategy for improving your self-image?
> If you believe that, I'd suggest you spend a bit of time studying the
> other standard containers -- starting with vector and deque.
It's always heartwarming to get helpful suggestions, from total strangers
who don't know me from Adam, to do what I've been doing for the last fifteen
years.
I think the point was that std::list is a poor choice as a default: in
practice its many small allocations are inefficient, and the functionality
of std::list is not well suited to what one ordinarily needs.
Cheers,
- Alf
> >>> In message
> >>><cone.1253761562.904448.4771....@commodore.email-scan.com>, Sam
> >>><s...@email-scan.com> writes
> >>>>arunix writes:
> >>The ability to collect a list of numbers, of variable size,
> >>then iterate over them for the purpose of printing them
> >>back.
> > std::vector can do all of that.
> Except that you either need to know the number of elements in
> advance, or end up doing a lot of needless reallocations.
You do far fewer allocations using vector than you will using
list. This is typical of the sort of thing vector is best for.
--
James Kanze
A lot? How much is "a lot"?
If you insert, let's say, one million elements into a vector with
push_back() (and without any reserve()), exactly how many reallocations
will be performed? I think the answer is about 20.
Maybe you consider 20 reallocations for 1 million elements "a lot",
but consider how many allocations will your std::list perform when you
push 1 million elements to it.
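You can watch this happen by tracking capacity() as you push (a quick
sketch; the exact count depends on the library's growth factor):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::vector<int>::size_type cap = v.capacity();
    int reallocations = 0;

    for (int i = 0; i < 1000000; ++i)
    {
        v.push_back(i);
        if (v.capacity() != cap)   // capacity grew, so the buffer moved
        {
            cap = v.capacity();
            ++reallocations;
        }
    }
    std::cout << reallocations << " reallocations for 1 million push_backs\n";
    return 0;
}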
>> Not random insertion, deletion and splicing, then?
>
> Insertion and deletion in the middle of a std::list carry lower
> complexity than they do for a std::vector, but thank you for playing anyway.
That assumes you already have an iterator to the element you want to
remove (or where the insertion should be done). If you don't, then there
will be little difference between std::list and std::vector.
Suppose you have a std::list with 1 million elements, and a
std::vector with 1 million elements, and you want to remove the element
in the middle. Which one is faster?
And exactly how is that different from any other STL container? Why is
std::list the best solution in this case?
> Sam wrote:
>>> std::vector can do all of that.
>>
>> Except that you either need to know the number of elements in advance,
>> or end up doing a lot of needless reallocations.
Indeed. push_back is amortized O(1). The number of element copies is
only about two per element: one for the initial insertion and, on
average, one more per element for the reallocations.
> A lot? How much is "a lot"?
>
> If you insert, let's say, one million elements into a vector with
> push_back() (and without any reserve()), exactly how many reallocations
> will be performed? I think the answer is about 20.
>
> Maybe you consider 20 reallocations for 1 million elements "a lot",
> but consider how many allocations will your std::list perform when you
> push 1 million elements to it.
What matters is that each element is not copied O(N) times, but O(1)
times. Actually, when you insert N elements, there are 2N copies
(instead of N copies if you knew the number of elements in advance).
See http://groups.google.fr/group/comp.programming/msg/3da8ae4c35834138?hl=en
(beware of the typo: inserting is O(1), not O(N)).
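That factor of two is easy to check with an element type that counts its
own copy constructions (a sketch; with a doubling vector expect roughly 2N):

#include <iostream>
#include <vector>

struct Counted
{
    static long copies;
    Counted() {}
    Counted(const Counted &) { ++copies; }
};
long Counted::copies = 0;

int main()
{
    const long N = 1000000;
    std::vector<Counted> v;

    for (long i = 0; i < N; ++i)
        v.push_back(Counted());   // one copy per insertion, plus the
                                  // copies made on each reallocation
    std::cout << Counted::copies << " copies for " << N << " push_backs\n";
    return 0;
}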
--
__Pascal Bourguignon__
We also have another problem here: what is the cost of copying one
int? What is the cost of allocating each int separately?
How much did you save by copying N ints instead of 2N ints?
Bo Persson
You probably missed the beginning of this thread. The OP was learning the
basics; and was trying to write a simple program that read and stored a list
of numbers, then printed them back.
The OP was trying to learn the basics of new/delete, and thus was
allocating an array for this purpose from the heap. Since doing that
requires knowing the number of elements in advance, the OP was reading the
number of elements first, allocating the array from the heap, populating the
array, then iterating over it and printing back its contents. The OP also
asked for general points and tips.
I pointed out that functionally similar results can be obtained by using
std::list. Additionally, that won't require getting the number of elements
beforehand. Given that the OP did not require random access, I believe that
std::list would be a better choice.
But, sometimes, given the number of eager beavers around here who are just
itching to demonstrate their expert knowledge of STL, the original question
can easily get lost amongst all the noise.
Hi.
For the OP's purpose, std::vector does not require obtaining the size of the
list in advance (see the push_back() member routine), and has generally better
performance: push_back is of amortized O(1) complexity. ;-)
I guess that's why people are sort of "upset" about your suggestion of std::list.
Not that it's terribly wrong or that std::vector is so much better or that it
matters at all in this case, and if one starts being pedantic about things then
a case can be made for std::deque as the probably best all-round potato.
But std::list is not generally a good *default* choice: it's a very special
purpose structure.
Cheers & hth.,
- Alf
> * Sam:
>> Juha Nieminen writes:
>>
>>> Sam wrote:
>>>>> I'll repeat part of what Richard said for emphasis: "as opposed to any
>>>>> of the other standard containers"
>>>>
>>>> Repeating the same question will only yield the same answer: "the
>>>> ability to collect a list of numbers, of variable size, then iterate
>>>> over them for the purpose of printing them back". If you do not
>>>> understand what that means, look it up.
>>>
>>> And exactly how is that different from any other STL container? Why is
>>> std::list the best solution in this case?
>>
>> Because, in the OP's case, his basic demo program did not require
>> random access, and std::list won't require obtaining the size of the
>> list in advance in order to get the best performance.
>
> Hi.
>
> For the OP's purpose, std::vector does not require obtaining the size of the
> list in advance (see the push_back() member routine), and has generally better
> performance: push_back is of amortized O(1) complexity. ;-)
I am curious how true that is. So I decided to measure the performance of
std::vector vs std::list in the OP's original context: collect and store a
relatively small list of numbers. How efficient is std::list, versus
std::vector, at creating a small list/vector?
Test program:
#include <cstddef>
#include <list>

int main()
{
    std::size_t j;

    for (j = 0; j < 100000; j++)
    {
        std::list<int> n;
        std::size_t i;

        for (i = 0; i < 10; i++)
            n.push_back(i);
    }
    return 0;
}
Compiled with gcc 4.4.1. Default compilation options. Three consecutive
runs:
[mrsam@lc2440 tmp]$ time ./t
real 0m0.219s
user 0m0.209s
sys 0m0.002s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.214s
user 0m0.208s
sys 0m0.001s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.216s
user 0m0.210s
sys 0m0.002s
Replace std::list with std::vector. The results:
[mrsam@lc2440 tmp]$ time ./t
real 0m0.305s
user 0m0.296s
sys 0m0.001s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.304s
user 0m0.297s
sys 0m0.000s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.303s
user 0m0.297s
sys 0m0.001s
Seems fairly consistent -- std::vector is about 30% slower.
Now, using -O2, the results for std::list:
[mrsam@lc2440 tmp]$ time ./t
real 0m0.093s
user 0m0.086s
sys 0m0.001s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.093s
user 0m0.086s
sys 0m0.000s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.092s
user 0m0.086s
sys 0m0.001s
and for std::vector:
[mrsam@lc2440 tmp]$ time ./t
real 0m0.111s
user 0m0.105s
sys 0m0.001s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.110s
user 0m0.105s
sys 0m0.001s
[mrsam@lc2440 tmp]$ time ./t
real 0m0.110s
user 0m0.104s
sys 0m0.001s
The performance gap has narrowed, somewhat, but std::list still has the
edge.
> I guess that's why people are sort of "upset" about your suggestion of std::list.
Yeah, finding out that they've been barking up the wrong tree, all this
time, can be quite shocking :-).
> But std::list is not generally a good *default* choice: it's a very special
> purpose structure.
In OP's use case, it does seem to be a better one.
No.
Can you figure out what's wrong with your measurement?
Well, OK, that may explain your apparent bafflement at the comments you're drawing.
The key to understanding this is to combine three pieces of information:
* The C++03 standard requires a contiguous buffer for std::vector.
* push_back has guaranteed amortized O(1) complexity.
* Allocation is generally way more costly (time) than copying a handful
of bytes.
Regarding the last point it seems that you have a fast allocator. This
influences the relative performance of std::list versus std::vector.
The first two points mean that that when the buffer space is exceeded a
std::vector /must/ increase the buffer size by a factor, not by a constant
amount. The most common, as far as I know, is to just double the buffer size.
This involves 1 allocation plus copying O(n) bytes, where n is the current size.
When a std::vector that uses buffer doubling is created by repeated push_back
operations you therefore have two main contributions to the time: roughly C*n
time for copying (where C is some constant), and roughly A*(1+log2(n)) time for
allocations (where A is some other constant, and log2(x) is the number such that
2^(log2(x)) = x).
In contrast, for the std::list each push_back generally involves one allocation,
so you have roughly C*n+A*n (with possibly not quite the same values for C and
A, but whatever), anyway, linear in n.
If you draw the graph of log2(x) you'll see that from x=1 on it starts going
upward at a good incline, but flattens out rapidly. For example, log2(4) = 2,
but then you have to go up to log2(8) to get 3, and next all the way up to
log2(16) to get 4. Further on, log2(128) = 7, but then you have to go 128 units
more to increase the result to 8. So on, it gets really flat very fast.
And so at the beginning, around n=10, which is what you chose, there is a
practical possibility that std::list can outperform a std::vector, because
(if there is no small-buffer optimization in the std::vector implementation)
you have five allocations for the vector and ten for the list, and with a
fast allocator the data copying of the vector may make up for that.
But if you go up to n=15 you still have just five allocations for the vector,
but now 15 for the list! At this point std::vector will probably outperform
std::list.
At n=16 you should expect a possible apparent drop in std::vector efficiency,
because here (with a buffer-doubling vector) you incur an additional allocation.
But then it's a free ride all the way up to and including n=32, where for the
list you have 32 allocations, and for std::vector you have just 6. Given that
the assumption in the third point above holds, you should expect std::list to be
decidedly inferior efficiency-wise in this range of n. Which is still *small*.
In short, the main problem with your test was that you tested only a single n
which was not just small but very small (namely 10). Instead your test should
have checked various values of n within some reasonable small range. And better
yet, systematically checking the general behavior of std::list versus
std::vector as n increases.
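To put numbers on that, here's a sketch that just counts allocations,
assuming a vector that starts at capacity 1 and doubles whenever it is
full (so it needs about 1 + log2(n) allocations against the list's n):

#include <iostream>

int main()
{
    for (int n = 1; n <= 32; ++n)
    {
        int vectorAllocs = 0;
        for (int cap = 0; cap < n; cap = cap ? 2 * cap : 1)
            ++vectorAllocs;        // one allocation per capacity change
        std::cout << "n=" << n << ": vector " << vectorAllocs
                  << " allocations, list " << n << "\n";
    }
    return 0;
}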
Rather the contrary -- there's quite a lot wrong with it. First of
all, what you originally claimed was supposed to be done was not just
insert some data into the collection, but iterate over the collection
to print out the items. Your test doesn't iterate over the items to
print them out, OR do anything else with them.
Second, since you've done nothing with the contents of the
collection, it's entirely possible (in fact rather likely, if you're
using a good compiler) that most of what you've done is being
optimized away completely. I can't say exactly how much the
particular compiler you're using happens to do in that direction, but
in the end it doesn't matter -- a meaningful benchmark has to
_ensure_ that it's measuring what it's supposed to, and yours does
nothing of the sort.
Third, you're creating an object (either the vector or the list) inside the
outer loop, so there's a good chance that the creation time is
affecting your results.
Fourth, you're measuring the time for the program as a whole instead
of even attempting to measure it for the part you care about.
Fifth, the total time taken by either program is small enough that
there's little certainty that the times you're getting mean much of
anything. To mean much, you need to increase the total time taken by
(for example) increasing the number of iterations in your outer loop.
Those aren't really the only problems, but they give some idea of how
far off the mark your test really is.
Now, there is one minor problem: if we actually print out the
contents, that printing will almost inevitably dominate the overall
time -- to the point that what we care about will probably be lost in
the noise. At the same time, we do want to ensure that the contents
of the collection is actually _used_, so the work being done can't be
optimized away. One way to meet both of these requirements is to
summarize the contents of the collection (e.g. add up all the values)
and only print out the sum.
One possible outcome of dealing with those glaring problems would be
code like this:
#include <vector>
#include <list>
#include <time.h>
#include <iostream>
#include <numeric>
int main()
{
    std::vector<int> n;
    clock_t start = clock();

    for (size_t j = 0; j < 900000; j++)
    {
        n.clear();
        for (size_t i = 0; i < 10; i++)
            n.push_back(i);
    }
    int total = std::accumulate(n.begin(), n.end(), 0);
    clock_t total_time = clock() - start;

    std::cout << total << "\n";
    std::cout << total_time << "\n";
    return 0;
}
Three runs of this gives:
D:\C\source>time_lst
45
46
D:\C\source>time_lst
45
46
D:\C\source>time_lst
45
46
Switching to list gives:
D:\C\source>time_lst
45
1435
D:\C\source>time_lst
45
1435
D:\C\source>time_lst
45
1419
There's still a bit of jitter in the timing, but it's of little
relevance -- either way, vector is quite a bit faster. Of course the
exact ratio also depends on the compiler and library. Trying with the
compilers I have handy, list does a bit better than shown above with
some of them -- but even at best, it's still about 15 times slower
than vector.
--
Later,
Jerry.
> * Sam:
>> Alf P. Steinbach writes:
>>
>>> No.
>>>
>>> Can you figure out what's wrong with your measurement?
>>
>> Yes: nothing.
>
> Well, OK, that may explain your apparent bafflement at the comments you're drawing.
Yes. No amount of grandiose pondering on various theoretical aspects of
various algorithms has any impact on the practical conclusion that, for the
OP's case, std::list is the best solution.
>
> The key to understanding this is to combine three pieces of information:
>
> * The C++03 standard requires a contiguous buffer for std::vector.
>
> * push_back has guaranteed amortized O(1) complexity.
While std::list::push_back()'s complexity is guaranteed constant, with no
additional qualifiers. Period.
> * Allocation is generally way more costly (time) than copying a handful
> of bytes.
>
> Regarding the last point it seems that you have a fast allocator. This
> influences the relative performance of std::list versus std::vector.
Indeed. I'm only using the most popular compiler, and the most popular C
library, on non-MSFT platforms. Gee, I really worked so hard to assemble
such a marginal environment in order to gather those metrics.
> In short, the main problem with your test was that you tested only a single n
> which was not just small but very small (namely 10). Instead your test should
Yeah -- just like the OP's test program, for which I suggested that he look
at std::list instead. I must've missed the part where he was trying to
allocate memory for a million ints, and meticulously typing all of them in
from the keyboard.
And your analysis, which was aimed to defend one's bias towards std::vector,
conveniently neglected to consider all the other factors that work to the
detriment of std::vector:
1) Objects with expensive copy constructors
2) Objects of large size
Both will work to make std::vector's overall performance worse than
std::list's, as the number of elements in the container increases.
Your argument for std::vector was resting on theoretical aspects of
std::vector and std::list's complexity outside of the boundaries of the
original OP's use case, and had no practical elements. However, once you
went beyond the original OP's requirements, and confined to theoretical
factors only, your argument was artificially limited to only those
theoretical factors that work in std::vector's favor, and they seemingly
omitted all the factors that work in std::list's favor.
Which, of course, makes yours a flawed argument.
> In article <cone.1253929234....@commodore.email-
> scan.com>, s...@email-scan.com says...
>>
>> Alf P. Steinbach writes:
>>
>> > No.
>> >
>> > Can you figure out what's wrong with your measurement?
>>
>> Yes: nothing.
>
> Rather the contrary -- there's quite a lot wrong with it. First of
> all, what you originally claimed was supposed to be done was not just
> insert some data into the collection, but iterate over the collection
> to print out the items. Your test doesn't iterate over the items to
> print them out, OR do anything else with them.
Sure, Einstein. Since the overhead of dumping three million ints to
std::cout dwarfs the relative differences between the containers' methods,
that would make the entire comparison meaningless, nullifying everyone's
arguments, including yours.
I'm sorry that I disappointed you, by failing to come up with the
meaningless results that you were so much looking for.
Not.
> Second, since you've done nothing with the contents of the
> collection, it's entirely possible (in fact rather likely, if you're
> using a good compiler) that most of what you've done is being
> optimized away completely.
Yea, it's so easy to dispense an endless stream of possibly this, and
possibly that, sitting in one's recliner. Coming up with actual data, to
back up one's theoretical musings, is always the difficult part.
> I can't say exactly how much the
> particular compiler you're using happens to do in that direction, but
Yes. You have no idea. But that doesn't stop you from making affirmative
conclusions, of course.
> Third, you're creating an object (either the vector or list) inside
> outer loop, so there's a good chance that the creation time is
> affecting your results.
Yes. We all know how expensive allocation and initialization of 8 or 12
bytes on the stack is.
Haaaa haaaa haaaa!
> Fourth, you're measuring the time for the program as a whole instead
> of even attempting to measure it for the part you care about.
You do realize, of course, that startup time is a fixed constant, so by
subtracting the fixed constant startup time from the total run time, the
relative performance differences would be even more pronounced. Especially
in the -O2 case. I can't wait to read your argument how the std::list
version's startup time is so much faster than the std::vector version's is.
Here, here's another shovel for you. You can dig your hole faster with both
hands.
> Fifth, the total time taken by either program is small enough that
> there's little certainty that the times you're getting mean much of
> anything.
Right. Instead of three samples, you require at least three million, before
you finally become convinced.
I'll get back to you.
> To mean much, you need to increase the total time taken by
> (for example) increasing the number of iterations in your outer loop.
Sure thing, Einstein. After increasing the number of iterations by a factor
of ten, the resulting run times for std::list are:
[mrsam@lc2440 tmp]$ time ./t
real 0m2.195s
user 0m2.099s
sys 0m0.000s
[mrsam@lc2440 tmp]$ time ./t
real 0m2.177s
user 0m2.080s
sys 0m0.004s
[mrsam@lc2440 tmp]$ time ./t
real 0m2.245s
user 0m2.080s
sys 0m0.008s
And for std::vector:
[mrsam@lc2440 tmp]$ time ./t
real 0m3.134s
user 0m2.995s
sys 0m0.002s
[mrsam@lc2440 tmp]$ time ./t
real 0m3.113s
user 0m2.989s
sys 0m0.004s
[mrsam@lc2440 tmp]$ time ./t
real 0m3.356s
user 0m2.983s
sys 0m0.010s
Gee, that makes std::vector look so much better!
> Now, there is one minor problem: if we actually print out the
> contents, that printing will almost inevitably dominate the overall
Right. What I said, above.
> time -- to the point that what we care about will probably be lost in
> the noise.
… and which makes all arguments of std::vector's superiority moot.
> n.clear();
Bzzzt. You just got caught cheating.
Thank you for playing. You can go home, now.
OK, I put it in first-year-student terms, but that wasn't enough.
I'm sorry, but I don't think it can be simplified more than that. At least
not by me.
>> The key to understanding this is to combine three pieces of information:
>>
>> * The C++03 standard requires a contiguous buffer for std::vector.
>>
>> * push_back has guaranteed amortized O(1) complexity.
>
> While std::list::push_back()'s complexity is guaranteed constant, with
> no additional qualifiers. Period.
>
>> * Allocation is generally way more costly (time) than copying a
>> handful
>> of bytes.
>>
>> Regarding the last point it seems that you have a fast allocator. This
>> influences the relative performance of std::list versus std::vector.
>
> Indeed. I'm only using the most popular compiler, and the most popular C
> library, on non-MSFT platforms. Gee, I really worked so hard to assemble
> such a marginal environment in order to gather those metrics.
>
>> In short, the main problem with your test was that you tested only a
>> single n which was not just small but very small (namely 10). Instead
>> your test should
>
> Yeah -- just like the OP's test program, for which I suggested that he
> look at std::list instead. I must've missed the part where he was trying
> to allocate memory for a million ints, and meticulously typing all of
> them in from the keyboard.
Try with more than ten elements.
Talking about millions is, sorry to be blunt, idiocy.
> And your analysis, which was aimed to defend one's bias towards
> std::vector, conveniently neglected to consider all the other factors
> that work to the detriment of std::vector:
>
> 1) Objects with expensive copy constructors
>
> 2) Objects of large size
>
> Both will work to make std::vector's overall performance worse than
> std::list's, as the number of elements in the container increases.
Sorry, that's incorrect.
> Your argument for std::vector was resting on theoretical aspects of
> std::vector and std::list's complexity outside of the boundaries of the
> original OP's use case,
Sorry, that's incorrect.
> and had no practical elements. However, once you
> went beyond the original OP's requirements, and confined to theoretical
> factors only, your argument was artificially limited to only those
> theoretical factors that work in std::vector's favor, and they seemingly
> ommited all the factors that work in std::list's favor.
I'm sorry, but an analysis of complexity isn't a set of arguments for and
against. It isn't an emotional thing feeling your way to a decision about which
arguments are strongest or most convincing. It's simple *math*, in this case at
the secondary school level, and you can check that math by simple programs.
However, since you failed to understand it, and since I don't know how to
simplify it even more, I think that's that.
Cheers & hth.,
- ALf
Exactly. And the normal solution for that is std::vector.
> The OP was trying to learn the basics of new/delete, and thus
> was allocating an array for this purpose from the heap. Since
> doing that requires knowing the number of elements in advance,
> the OP was reading the number of elements first, allocating the
> array from the heap, populating the array, then iterating over
> it and printing back its contents. The OP also asked for
> general points and tips.
Clearly. And the correct response to this is vector, unless
there are particular needs which suggest otherwise.
>> I pointed out that functionally similar results can be
>> obtained by using std::list.
The most similar functionality to a C style array allocated
dynamicly is std::vector. In general, std::vector is the
container one uses by default. There are, in practice, very,
very few cases where std::list is appropriate.
> Additionally, that won't require getting the number of elements
> beforehand. Given that the OP did not require random access, I
> believe that std::list would be a better choice.
Better than dynamically allocating a C style array, yes. But
nowhere near as good as using std::vector.
--
James Kanze
I wouldn't use the word "disappointing". But the I'm-righter-than-you-
are-and-proud-of-it attitude we can see here isn't going to help
anybody. It certainly doesn't improve your popularity.
I'm not really sure what you did. After some testing with G++ 4.3.1
WITHOUT fancy custom allocators or any kind of cheating (like using
vector<>::reserve or vector<>::clear) I came to the conclusion that if
you don't need special list operations that are known to be fast
(splicing, insert, etc.) a vector is both faster and consumes less
memory. No surprise. OK, I didn't test the memory consumption part, but
I think it should be obvious why this is -- even counting the
temporary memory usage a vector needs when it grows. Each node
consumes at least one pointer in addition (the XOR trick makes an
iterator slightly larger though) and there's the per-element heap
allocation overhead. For testing the speed I created & destroyed an
int-container 5000 times and each time filled it (push_back) with
10,000 elements and later iterated over the elements to compute the
sum. Here are my results:
----------8<---------- list ----------8<----------
sg@home:~/ccpp/heap$ time ./a.out l
real 0m3.332s
user 0m3.228s
sys 0m0.004s
sg@home:~/ccpp/heap$ time ./a.out l
real 0m2.976s
user 0m2.956s
sys 0m0.000s
sg@home:~/ccpp/heap$ time ./a.out l
real 0m3.985s
user 0m3.860s
sys 0m0.004s
----------8<---------- vector ----------8<----------
sg@home:~/ccpp/heap$ time ./a.out v
real 0m0.261s
user 0m0.260s
sys 0m0.000s
sg@home:~/ccpp/heap$ time ./a.out v
real 0m0.306s
user 0m0.280s
sys 0m0.004s
sg@home:~/ccpp/heap$ time ./a.out v
real 0m0.276s
user 0m0.276s
sys 0m0.000s
----------8<--------------------8<----------
I think it's a fair comparison. The use case is similar to the one of
the OP. Again: no tricks. For example, I created and destroyed the
containers 5000 times so that std::vector could not reuse its
allocated memory.
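For reference, the test was along these lines (a sketch reconstructed from
the description above; the container is selected by the command-line
argument, and long long is used for headroom in the sum):

#include <iostream>
#include <list>
#include <vector>

template <class Container>
long long run()
{
    long long sum = 0;
    for (int round = 0; round < 5000; ++round)
    {
        Container c;                          // fresh container every round
        for (int i = 0; i < 10000; ++i)
            c.push_back(i);
        for (typename Container::const_iterator it = c.begin();
             it != c.end(); ++it)
            sum += *it;                       // iterate and sum the elements
    }
    return sum;
}

int main(int argc, char *argv[])
{
    const bool useList = (argc > 1 && argv[1][0] == 'l');
    std::cout << (useList ? run<std::list<int> >()
                          : run<std::vector<int> >()) << "\n";
    return 0;
}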
> > n.clear();
> Bzzzt. You just got caught cheating.
As you can see, even without clear or reserve (which, by the way, would be
appropriate in the OP's use case), the rather naive program using
vectors outperforms the one using std::list by a factor of about
10 in execution speed.
I have trouble explaining the discrepancy between your results and
mine -- especially since you seem to be using G++ as well. I'm not
saying you're trolling. But it would explain it at least. Maybe you
have a better explanation. I'm all ears.
> Thank you for playing. You can go home, now.
Gee, you really know how to talk to people!
Cheers,
SG
Well, Sam has a point -- although he seems not to be aware of a
vector's complexity analysis w.r.t. copies. Assuming that a vector
doubles its capacity while growing, it always requires about twice as
many copy constructions as a list -- regardless of the size, however,
which is where Sam is mistaken.
So, there *are* situations where the performance of a growing
std::vector will be worse than the performance of std::list. This is
the case when creating a copy is more expensive than allocating
memory.
@Sam: Note that this does not depend on the container's size but only
on the cost of a copy construction. Since the OP wants to store
simple scalars, and copying a scalar is much cheaper than allocating
memory, std::vector still looks like the best container for the job.
I'm still interested in how you got your results.
Cheers,
SG
> Sam <s...@email-scan.com> wrote:
>>
>> I'm sorry that I disappointed you, by failing to come up with the
>> meaningless results that you were so much looking for.
>
> I wouldn't use the word "disappointing". But the I'm-righter-than-you-
> are-and-proud-of-it attitude we can see here isn't going to help
> anybody. It certainly doesn't improve your popularity.
When I actually decide to enter some kind of a popularity contest, I'll let
you know.
> (splicing, insert, etc.) a vector is both faster and consumes less
> memory.
Yes. All the additional array elements that are held in reserve for future
vector growth really use something called "hypothetical memory" instead
of the real one. I didn't know that!
> allocation overhead. For testing the speed I created & destroyed an
> int-container 5000 times and each time filled it (push_back) with
Poor OP. He must've spent a lot of time meticulously typing in all those
5000 ints that his test program prompted for.
>
> I think it's a fair comparison. The use case is similar to the one of
> the OP.
Really? The OP's test program had him type in 5000 ints? Gosh, that sure is
a whole lot of typing.
> I have trouble explaining the discrepancy between your results and
> mine -- especially since you seem to be using G++ as well. I'm not
> saying you're trolling. But it would explain it at least. Maybe you
> have a better explanation. I'm all ears.
My explanation is very simple: I typed in the exact code that I showed in
the thread. Well, I typed it in first, then I copy-pasted it into the
original message. Then I ran it and timed the results. It's that simple.
Perhaps the explanation is that there seems to be some meme going around
that the OP was entering billions and billions of ints by hand, when
prompted by his test program. He did not, and I can't explain why everyone
still thinks he did, and why that's the test case that should be used to
evaluate list's and vector's relative performance.
>
> Try with more than ten elements.
>
> Talking about millions is, sorry to be blunt, idiocy.
I agree. Now, go complain to everyone who insists on profiling million-sized
lists and vectors. It ain't me.
>> And your analysis, which was aimed to defend one's bias towards
>> std::vector, conveniently neglected to consider all the other factors
>> that work to the detriment of std::vector:
>>
>> 1) Objects with expensive copy constructors
>>
>> 2) Objects of large size
>>
>> Both will work to make std::vector's overall performance worse than
>> std::list's, as the number of elements in the container increases.
>
> Sorry, that's incorrect.
Sorry -- I forgot. When a vector has to be resized in order to accommodate
growth, all existing objects get copied into the new vector by magic,
without having to use copy constructors and destructors. Gotta love those
magical constructors in TR1. And all the memory that needs to be allocated
in reserve for large objects, well, that actually comes from hypothetical
memory, so nobody needs to worry about it, either.
Indeed. Ignore the metrics. Pay no attention to the man behind the curtain.
>> I pointed out that functionality similar results can be
>> obtained by using std::list.
>
> The most similar functionality to a C style array allocated
> dynamicly is std::vector.
I missed the part where the OP stated that an array-like structure was a
requirement. He simply asked for an alternative to heap allocation. Your
vector requirement is a straw-man argument.
> In general, std::vector is the
> container one uses by default.
Only, I suppose, by those who are -- for some reason -- frightened to death
by other containers, and stay away from them for no rhyme or reason.
> There are, in practice, very,
> very few cases where std::list is appropriate.
Yup. When I want a container to hold a small number of large objects with
expensive copy constructors, and random access is not required, a vector is
a perfect fit. Everyone has plenty of RAM and CPU these days. No worries.
It's threads like these that give me a warm, fuzzy feeling that I'll never
have to worry about making a living, as long as there are still plenty of
cowboys around whose messes will always need to be cleaned up. For a hefty
fee, of course.
Now, you know. Knowing is half the battle.
Seriously, with T=int I still expect it to be more efficient, even with
the reserved memory and in the worst-case scenario where a vector
needs to grow. I have no reason to believe otherwise. What do you
think the overhead is for allocating a list node?
You're right in that the containers' performance depends on T.
To summarize:
In case it's only about appending elements to a container and
iterating through it, both kinds of container can be used. There is a
certain sizeof(T) value above which std::list consumes less memory and
below which std::vector consumes less memory. There's also a certain
copy-construction cost (time) above which std::list will be faster and
below which std::vector will be faster -- all assuming the exact size
is not known beforehand.
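A sketch to probe that threshold (the 1 KB payload is an arbitrary
assumption for illustration, not a measured tipping point):

#include <ctime>
#include <iostream>
#include <list>
#include <vector>

struct Heavy { char payload[1024]; };   // deliberately expensive to copy

template <class Container>
double fillSeconds(int n)
{
    std::clock_t start = std::clock();
    Container c;
    for (int i = 0; i < n; ++i)
        c.push_back(Heavy());
    return double(std::clock() - start) / CLOCKS_PER_SEC;
}

int main()
{
    const int n = 20000;
    std::cout << "vector: " << fillSeconds<std::vector<Heavy> >(n) << " s\n";
    std::cout << "list:   " << fillSeconds<std::list<Heavy> >(n) << " s\n";
    return 0;
}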
> > allocation overhead. For testing the speed I created & destroyed an
> > int-container 5000 times and each time filled it (push_back) with
>
> Poor OP. He must've spent a lot of time meticulously typing in all those
> 5000 ints that his test program prompted for.
What does the size have to do with anything? The execution time and
memory usage scale linearly with the number of elements for both
containers!
As for the user not wanting to type in 5000 ints: Why do you even
bring up your memory cost argument?!
Cheers,
SG
Do you know what I find hilarious? First you argue how much more
efficient std::list is compared to std::vector, and then you argue that
efficiency is irrelevant because the original poster just wanted to
input a dozen values by hand into his program.
So what is it? Is efficiency a reason to choose std::list in this
case, or isn't it? Try to decide already.
If efficiency is irrelevant, then you have to consider ease of usage
as the second most important feature. Which one is, in your opinion,
easier to use, std::list or std::vector? Suppose, that you want to, for
example, iterate through the elements in the container:
for(std::list<int>::iterator iter = values.begin();
iter != values.end(); ++iter)
...
vs.
for(size_t i = 0; i < values.size(); ++i)
...
Take your pick.
Besides, if reallocations are such a huge worry, then std::deque is in
this case way more efficient than std::list (because it performs
considerably fewer memory allocations and consumes considerably less memory).
std::list is definitely not the only alternative to std::vector.
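For the OP's task it looks just like the vector version (a sketch):

#include <cstddef>
#include <deque>
#include <iostream>

int main()
{
    std::deque<int> values;
    int v;

    while (std::cin >> v)
        values.push_back(v);   // grows chunk by chunk, no big reallocation
    for (std::size_t i = 0; i < values.size(); ++i)
        std::cout << values[i] << "\n";
    return 0;
}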
In the small-size regime where Sam tested, the size matters for the
contribution from allocations. That contribution is linear for list and
logarithmic for vector.
However, Sam only tested n = 10, and there were no other allocations in the
program, which might make the allocation operations unnaturally fast.
If only he'd try n = 11, 12, 13...
Cheers,
- Alf
> Sam <s...@email-scan.com> wrote:
>>
>> Poor OP. He must've spent a lot of time meticulously typing in all those
>> 5000 ints that his test program prompted for.
>
> What does the size have to do with anything?
Because that was the size of his domain set. His test program did not
prompt for a thousand ints but, obviously, just a couple. As such, for his
test case and requirements, std::list is a better choice.
> As for the user not wanting to type in 5000 ints: Why do you even
> bring up your memory cost argument?!
Why do I bring up some facts that may be quite relevant? Just consider it a
weakness of mine.
All these self-appointed experts around here are desperately trying to have
it both ways, in order to save face. As with any other issue of this kind,
it can be approached from a practical, or a theoretical viewpoint.
Practically, in the OP's case, std::list is a win. This has been shown.
However, this unexplained phobia some have of std::list leads them to
construct a theoretical argument, that, in theory, std::vector gives better
performance -- complete with theoretical analysis.
But yet, the theoretical analysis fails to include all theoretical factors,
such as the cost of the additional copy constructors and destructors
that are inherent in the std::vector implementation. Or that pointers
or iterators to members of a std::vector cannot be held
independently, since, unlike with std::list, they may be invalidated by any
insert operation. No, we can't include those pesky inconveniences in our
glowing analysis of std::vector. It's quite easy to conclude that something
is a panacea when one willfully excludes all factors to the contrary.
I think that it all stems from nothing more than simple unfamiliarity and
inexperience, with proper usage of std::list. Fear of the unknown, and
nothing more. Arrays are an easy concept to wrap one's brain around, but
linked lists are slightly more complicated, and require more care, for
proper usage. These std::lists are too complicated, folks, that's why
they're bad. Bad, bad, bad.
> If efficiency is irrelevant, then you have to consider ease of usage
> as the second most important feature. Which one is, in your opinion,
> easier to use, std::list or std::vector? Suppose, that you want to, for
> example, iterate through the elements in the container:
>
> for(std::list<int>::iterator iter = values.begin();
> iter != values.end(); ++iter)
> ...
>
> vs.
>
> for(size_t i = 0; i < values.size(); ++i)
> ...
>
> Take your pick.
Yes. We all should take lessons on efficiency from someone who fails to
realize that std::list::end() is an invariant, and that evaluating it on
every iteration always produces the same result.
What are you talking about? So what if it's an invariant? What does
that have to do with this? And exactly what does the end()
function "evaluate"? And even if calling end() was a couple of clock
cycles slower than doing something else, exactly why does it matter
here? The whole point was that efficiency is *irrelevant* when there are
just a dozen elements in the list.
Maybe you could answer by showing how you would write that first loop
above.
Then, oh great guru of C++ programming, please explain to us mere
mortals the exact reasons why std::list "is a win", because I can't see it.
std::vector is easier to use than std::list because the former can be
accessed using a simple indexing syntax (namely operator[]) while the
latter cannot (you need to use iterators, which tend to produce more
verbose code than calling operator[]). That reason alone makes
std::vector more suitable for this situation. Less code to write, and
the code is also simpler.
You argue that std::list is more efficient. You also, rather
contradictorily, argue that in this case efficiency is irrelevant. Of
course in situations where efficiency *is* relevant (namely, when the
amount of elements is large), std::vector is clearly more efficient than
std::list for this type of task (it's faster and consumes less memory).
Linked lists are only useful in relatively rare situations (where you
need things like splicing or removing elements from the middle, assuming
you already have an iterator pointing to the element to be removed).
It's, naturally, good to know how to use std::list and when it's the
best solution. However, *in this case* it's not the best solution
(mainly because it's more verbose to use).
And why do you propose std::list as the best solution here and not,
for example, std::deque? Can you explain why std::list is better?
std::deque has both the easier syntax advantage (i.e. operator[]) and
doesn't perform reallocations (which seems to be something you are so
afraid of).
> This has been shown.
Where exactly?
> But yet, the theoretical analysis fails to include all theoretical
> factors, such as the cost of the additional copy constructors and
> destructors that are inherent in the std::vector implementation.
Please explain why std::deque wouldn't be a good alternative. Why
precisely std::list?
> Or
> that pointers or iterators to members of a std::vector cannot
> be held independently, since, unlike with std::list, they may be
> invalidated by any insert operation. No, we can't include those pesky
> inconveniences in our glowing analysis of std::vector. It's quite easy
> to conclude that something is a panacea when one willfully excludes all
> factors to the contrary.
You are the one who seems to be excluding alternatives.
> I think that it all stems from nothing more than simple unfamiliarity
> and inexperience, with proper usage of std::list. Fear of the unknown,
> and nothing more.
You seem to have a pathological fear for hidden reallocation,
especially in situations where it doesn't matter (like this one).
> Arrays are an easy concept to wrap one's brain around,
> but linked lists are slightly more complicated, and require more care,
> for proper usage. These std::lists are too complicated, folks, that's
> why they're bad. Bad, bad, bad.
No, the reason is that std::list is not the most convenient solution
in this case. It's approximately the most complicated one.
> Sam wrote:
>> Juha Nieminen writes:
>>
>>> If efficiency is irrelevant, then you have to consider ease of usage
>>> as the second most important feature. Which one is, in your opinion,
>>> easier to use, std::list or std::vector? Suppose, that you want to, for
>>> example, iterate through the elements in the container:
>>>
>>> for(std::list<int>::iterator iter = values.begin();
>>> iter != values.end(); ++iter)
>>> ...
>>>
>>> vs.
>>>
>>> for(size_t i = 0; i < values.size(); ++i)
>>> ...
>>>
>>> Take your pick.
>>
>> Yes. We all should take lessons on efficiency from someone who fails
>> to realize that std::list::end() is an invariant, and that evaluating it
>> on every iteration always produces the same result.
>
> What are you talking about? So what if it's an invariant?
Yeah -- what does the fact that, when you're trying to measure the
performance of a given piece of code, you also end up measuring the same
bit of irrelevant code, executed repeatedly, have to do with getting
accurate results?
> What does
> that have to do with this?
Right. When one wants to evaluate the performance of a specific algorithm,
the fact that you're also including multiple executions of an irrelevant
method means nothing, I repeat, NOTHING!
> And exactly what does the end()
> function "evaluate"?
That's an easy one: every iteration of the loop.
> And even if calling end() was a couple of clock
> cycles slower than doing something else, exactly why does it matter
> here?
Feh. Accuracy -- what an outmoded concept.
> The whole point was that efficiency is *irrelevant* when there are
> just a dozen elements in the list.
I don't know whose point that is, but it's not mine.
> Maybe you could answer by showing how you would write that first loop
> above.
for(std::list<int>::iterator b = values.begin(), e=values.end(); b != e; ++b)
And C++0x also offers a more compact syntax that's even more accommodating
to compile-time optimizations.
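For instance (a sketch of the C++0x range-based for, going by the current
draft):

for (int value : values)
    std::cout << value << ", ";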
> Sam wrote:
>> All these self-appointed experts around here are desperately trying to
>> have it both ways, in order to save face. As with any other issue of
>> this kind, it can be approached from a practical, or a theoretical
>> viewpoint. Practically, in the OP's case, std::list is a win.
>
> Then, oh great guru of C++ programming, please explain to us mere
> mortals the exact reasons why std::list "is a win", because I can't see it.
Elsewhere in the thread, you will find concrete data showing faster results
in std::list's case.
> std::vector is easier to use than std::list because the former can be
> accessed using a simple indexing syntax (namely operator[]) while the
Splendid. However, when random access is not required, this offers no value
added functionality.
> latter cannot (you need to use iterators, which tend to produce more
> verbose code than calling operator[]). That reason alone makes
> std::vector more suitable for this situation. Less code to write, and
> the code is also simpler.
Damn right! Less code=good, even if less code=worse performance. You'll find
plenty of compact, tight code on www.ioccc.org. We should all strive to
produce applications that look so compact, and tight.
> You argue that std::list is more efficient. You also, rather
> contradictorily, argue that in this case efficiency is irrelevant. Of
I made no such argument. Efficiency of containers with thousands of objects
is irrelevant to how efficient the given container is with a much smaller
number of objects, when this is the boundary of your domain set. This is not
the same thing as stating that efficiency is irrelevant, period.
It's just like saying that the efficiency of a passenger bus doesn't matter
to me when all I need is a personal vehicle. According to your illogical
train of thought, just because I don't care how efficient someone else's bus
is, efficiency is not important to me. That, of course, is sheer absurdity.
>> I think that it all stems from nothing more than simple unfamiliarity
>> and inexperience, with proper usage of std::list. Fear of the unknown,
>> and nothing more.
>
> You seem to have a pathological fear for hidden reallocation,
> especially in situations where it doesn't matter (like this one).
Yes. It doesn't matter if any addition to the container might wind up, at
any given point in time, triggering a reallocation of a bunch of objects.
Even though I have pointers to some of them, I'm pretty sure that after all
of them are reallocated, all the pointers will be updated, by magic.
[ ... ]
> > Rather the contrary -- there's quite a lot wrong with it. First
> > of all, what you originally claimed was supposed to be done was
> > not just insert some data into the collection, but iterate over
> > the collection to print out the items. Your test doesn't iterate
> > over the items to print them out, OR do anything else with them.
>
> Sure, Einstein. Since the overhead of dumping three million ints to
> std::cout dwarfs the relative differences between the containers' methods,
> that would make the entire comparison meaningless, nullifying everyone's
> arguments, including yours.
Reread what I said, and note (in particular) the emphasis on "OR do
anything else with them."
Next time, read what's there before you make yourself look like such
a complete idiot!
> > Second, since you've done nothing with the contents of the
> > collection, it's entirely possible (in fact rather likely, if you're
> > using a good compiler) that most of what you've done is being
> > optimized away completely.
>
> Yea, it's so easy to dispense an endless stream of possibly this, and
> possibly that, sitting in one's recliner. Coming up with actual data, to
> back up one's theoretical musings, is always the difficult part.
Yup -- you certainly seem to find it difficult anyway. So far, all
you've managed to show is that you're an idiot -- well, okay, that's
getting back to speculation -- you might be an asshole or a liar
instead!
> > I can't say exactly how much the
> > particular compiler you're using happens to do in that direction, but
>
> Yes. You have no idea. But that doesn't stop you from making affirmative
> conclusions, of course.
If it's impossible to tell for certain what was measured in a
benchmark, then yes, that supports an affirmative conclusion that it
doesn't produce meaningful results.
> > Third, you're creating an object (either the vector or list) inside
> > outer loop, so there's a good chance that the creation time is
> > affecting your results.
>
> Yes. We all know how expensive allocation and initialization of 8 or 12
> bytes on the stack is.
The whole point of the benchmark is to find out how expensive the
operations involved are. It appears that you're starting with an
assumption about what the result should be, codifying that into your
"benchmark" and then expecting everybody to consider it a huge
revelation when those assumptions show up in the results.
> Haaaa haaaa haaaa!
>
> > Fourth, you're measuring the time for the program as a whole instead
> > of even attempting to measure it for the part you care about.
>
> You do realize, of course, that startup time is a fixed constant, so by
> subtracting the fixed constant startup time from the total run time, the
> relative performance differences would be even more pronounced.
Quite the contrary, I know perfectly well that startup time is NOT a
fixed constant. Startup time depends most obviously on code size, but
also (to some degree) on how the code is arranged in the file.
It's rarely as important anymore, but at one time I found it quite
useful to optimize startup times. At times I've improved startup
times by a factor of about three without changing the functionality
of the program at all (though I'll openly admit that's an extreme
case).
> Especially
> in the -O2 case. I can't wait to read your argument how the std::list
> version's startup time is so much faster than the std::vector version's is.
Yet again, you've got things backwards. I don't have to know that
there is a difference for the benchmark to be meaningless. Exactly
the opposite is true: I have to know that there's NOT a difference
for the benchmark to be meaningful. If there's any uncertainty, the
benchmark is meaningless.
> > Fifth, the total time taken by either program is small enough
> > that there's little certainty that the times you're getting mean
> > much of anything.
>
> Right. Instead of three samples, you require at least three million, before
> you finally become convinced.
Your "measurements" would remain meaningless no matter how often they
were repeated.
> I'll get back to you.
Yes, when you get a clue, do that.
> > To mean much, you need to increase the total time taken by
> > (for example) increasing the number of iterations in your outer loop.
>
> Sure thing, Einstein. After increasing the number of iterations by a factor
> of ten, the resulting run times for std::list are:
As the previous comments made obvious to anybody with an IQ higher
than a worm's, you need to do a LOT more than just increase the
number of iterations. Get back to me when you get a clue and at least
make some attempt at fixing the other problems with your "benchmark."
[ ... ]
> Gee, that makes std::vector look so much better!
What it did was make you look even worse -- difficult though it was
to believe that such a thing was even possible!
> > Now, there is one minor problem: if we actually print out the
> > contents, that printing will almost inevitably dominate the overall
>
> Right. What I said, above.
>
> > time -- to the point that what we care about will probably be lost in
> > the noise.
>
> ... which makes all arguments of std::vector's superiority moot.
If the OP simply wanted to write this one program, you'd be right.
Then again, it seems pretty obvious that a program that simply prints
out what it just asked you to type in is rarely useful in itself (and
even if you really wanted to do that, you could do it pretty easily
without any container at all).
This gets back to that "to anybody with an IQ higher than a worm's"
situation, where it's pretty clear that this program is not really an
end in itself. Instead, it's pretty obvious that (apparently unlike
you) the OP wants to learn at least a little bit about how to
program.
That being the case, the difference between vector and list is both
meaningful and relevant.
> > n.clear();
>
> Bzzzt. You just got caught cheating.
Yes and no. It's true that there is undoubtedly _some_ difference in
speed between emptying an existing vector and creating a new one.
On the other hand, consider the alternatives: if we define the
collection inside the loop, we've also restricted its scope to the
loop, so we have to move the accumulate inside the loop as well:
for (size_t j=0; j<900000; j++)
{
std::list<int> n;
for (size_t i=0; i<10; i++)
n.push_back(i);
total = std::accumulate(n.begin(), n.end(), 0);
}
The accumulate was added simply to assure that the contents of the
collection were used, so the manipulation of the collection wouldn't
be optimized away completely. In this code, however, it's become a
substantial part of what we're timing instead. Arguably, this still
makes some sense though -- it's pretty pointless to create a bunch of
collections we never use, so using the collection every time is at
least somewhat reasonable. It doesn't make a lot of difference though
-- running this a few times, vector still wins (albeit by a rather
smaller margin).
Another possibility would be something like this:
std::vector<int> *n = NULL;
for (size_t j=0;j<iterations; j++) {
delete n;
n = new std::vector<int>;
for (size_t i=0; i<10; i++)
n->push_back(i);
}
int total = std::accumulate(n->begin(), n->end(), 0);
In some ways this is great -- we're creating a new collection each
time, and still only accumulating once to prevent the loop from being
optimized away. Unfortunately, with this we'd also be timing a new
and a delete at every iteration, AND all the references to the
collection are via a pointer as well. This adds a lot of overhead to
what we originally cared about.
A quick test shows that with either of these, list is still slower
than vector though.
--
Later,
Jerry.
> In article <cone.1253941921....@commodore.email-
> scan.com>, s...@email-scan.com says...
>>
>> Jerry Coffin writes:
>
> [ ... ]
>
>> > Rather the contrary -- there's quite a lot wrong with it. First
>> > of all, what you originally claimed was supposed to be done was
>> > not just insert some data into the collection, but iterate over
>> > the collection to print out the items. Your test doesn't iterate
>> > over the items to print them out, OR do anything else with them.
>>
>> Sure, Einstein. Since the overhead of dumping three million ints to
>> std::cout dwarfs the relative differences between the containers' methods,
>> that would make the entire comparison meaningless, nullifying everyone's
>> arguments, including yours.
>
> Reread what I said, and note (in particular) the emphasis on "OR do
> anything else with them."
Reread the entire thread, especially the part concerning the original issue
in this thread, namely which is the more efficient container for storing a
list of ints. The overhead of dumping three million ints to std::cout is
exactly the same irrespective of where those three million ints are
retrieved from. So, including the time to do that, when all one is trying to
do is measure the efficiency of the container itself, is quite absurd.
> Next time, read what's there before you make yourself look like such
> a complete idiot!
I never feel the need to accuse someone of being "a complete idiot". The
reason for that is quite simple. When someone's truly "a complete idiot", as
you put it, it is never necessary to explicitly say so, because it
should be self-evident. It's only when it's not, and you have nothing to
back up your side of the argument, that you'd feel obligated to make
such an assertion.
>> Yea, it's so easy to dispense an endless stream of possibly this, and
>> possibly that, sitting in one's recliner. Coming up with actual data, to
>> back up one's theoretical musings, is always the difficult part.
>
> Yup -- you certainly seem to find it difficult anyway. So far, all
I can assure you it was no trouble at all. It took me only five minutes to
compile the simple test program, run and time it, and disprove the silly
claims of the std::vector fan club.
>> > I can't say exactly how much the
>> > particular compiler you're using happens to do in that direction, but
>>
>> Yes. You have no idea. But that doesn't stop you from making affirmative
>> conclusions, of course.
>
> If it's impossible to tell for certain what was measured in a
> benchmark, then yes, that supports an affirmative conclusion that it
> doesn't produce meaningful results.
Thank you for agreeing with me that including the time it takes to dump
everything in the container to std::cout would be utterly meaningless.
I knew that it was only a matter of time before you were forced to admit
that I was right.
>> Yes. We all know how expensive allocation and initialization of 8 or 12
>> bytes on the stack is.
>
> The whole point of the benchmark is to find out how expensive the
> operations involved are.
Right. See my basic test results.
> It appears that you're starting with an
> assumption about what the result should be, codifying that into your
No, just the sole assumption that what should be measured is exactly what
OP's original test case was: store a small list of ints in a container.
It's only when those results burst the bubble of the std::vector fan club
that the demands to measure something completely different start to
show up.
>> You do realize, of course, that startup time is a fixed constant, so by
>> subtracting the fixed constant startup time from the total run time, the
>> relative performance differences would be even more pronounced.
>
> Quite the contrary, I know perfectly well that startup time is NOT a
> fixed constant.
Absolutely. When your main() allocates an std::list on the stack, instead of
a std::vector, then populates it, it now takes a hundred times longer to
bootstrap the process, before main() gets invoked.
You always learn something new every day, around here.
>> Especially
>> in the -O2 case. I can't wait to read your argument how the std::list
>> version's startup time is so much faster than the std::vector version's is.
>
> Yet again, you've got things backwards. I don't have to know that
> there is a difference for the benchmark to be meaningless. Exactly
Yes, you don't have to know anything, in order to make your arguments. That
is very obvious.
>> Sure thing, Einstein. After increasing the number of iterations by a factor
>> of ten, the resulting run times for std::list are:
>
> As the previous comments made obvious to anybody with an IQ higher
> than a worm's, you need to do a LOT more than just increase the
> number of iterations.
Obviously you're quite upset to see how much those results ended up bursting
your bubble. Do not despair, Grasshopper. You will eventually achieve the
enlightenment you now seek.
> That being the case, the difference between vector and list is both
> meaningful and relevant.
>
>> > n.clear();
>>
>> Bzzzt. You just got caught cheating.
>
> Yes and no. It's true that there is undoubtedly _some_ difference in
> speed between emptying an existing vector and creating a new one.
But, by the same measuring stick you demand others abide by, if there's
/any/ doubt as to the validity of the benchmark, the results are null and
void. You stated this position yourself. This has to be a world record for
backpedaling.
*SPNAK*
> Another possibility would be something like this:
>
> std::vector<int> *n = NULL;
>
> for (size_t j=0;j<iterations; j++) {
> delete n;
> n = new std::vector<int>;
Oh dear. Bzzt again.
I believe I already gave you two shovels. Here's a third one. You can now
use both hands, and one foot simultaneously, to dig your hole.
In the OP's use case, Sam took it to mean a small number of ints, ~10.
Sam did correctly show that inserting ~10 ints into a std::list is
faster than std::vector for whatever platform, compiler, etc., he was
using.
However, the other people in this thread implicitly assumed, rightly
so, that creating and inserting ~10 ints into a container, many times,
is not a valid real use case. With this premise, it follows that the
number of times you insert ~10 ints into a container is a small number
of times, and thus efficiency is not a concern. However, Sam argued
that std::list is faster in this use case, and that's why you should
use it. Thus the apparent contradiction: Sam is arguing to use
something because it's faster when speed is of no concern.
Moreover, even if such a contrived use case did exist where you insert
~10 ints into a container, many times, then std::vector is still
better for the nontrivial case, as I shall show now.
If a real program did insert 10 ints into a container, many times, it
would do so either with the containers' lifetimes overlapping, or
serially in a loop.
In the loop case, it would be far more efficient to hoist the
std::vector outside of the loop and reuse it. The only memory
allocations would be on the first iteration, and it would be \much\
faster than std::list.
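A minimal sketch of that hoisting, reusing the thread's 900000x10 loop
shape (the accumulate is there only so the work can't be optimized
away):
#include <cstddef>
#include <numeric>
#include <vector>

int main()
{
    long grand_total = 0;
    std::vector<int> n;                    // hoisted: constructed once
    for (std::size_t j = 0; j < 900000; ++j) {
        n.clear();                         // empties, but keeps the capacity
        for (int i = 0; i < 10; ++i)
            n.push_back(i);
        grand_total += std::accumulate(n.begin(), n.end(), 0);
    }
    return static_cast<int>(grand_total % 128);  // use the result
}
After the first iteration grows the buffer to hold 10 ints, no further
allocations happen at all.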
In the overlapping lifetime case, then you have many lists living in
the program at the same time. If you used std::list over std::vector
in a nontrivial case (aka lots of containers), then you would see
worse performance because of more cache misses due to less locality of
std::list and more memory required of std::list. The extra memory
consumption plus the lack of locality would kill performance.
Your program basically tests creating a container in a tight loop, so
the containers' lifetimes do not overlap. Thus your testing is
worthless. Either the program will create lots of containers of int
with overlapping lifetime, in which case std::vector is a clear
winner, or it'll create a bunch of containers of int serially in a
loop, in which case hoisting the std::vector out of the loop is again
the clear winner.
Sorry, meant to add that the moral of this story is that micro-
benchmarks are often misleading as they don't often replicate caching
issues that a real program will face. Ex: it's often better to compile
a whole program for size optimizations than speed optimizations, even
though micro-benchmarks will show otherwise.
> Moreover, even if such a contrived use case did exist where you insert
> ~10 ints into a container, many times, then std::vector is still
> better for the nontrivial case, as I shall show now.
Oh, goody! You want to prove that black=white? Go for it!
> If a real program did insert 10 ints into a container, many times, it
> would do so so that the lifetimes of the containers would overlap, or
> it would do so serially in a loop.
And that's where you go off the road. You've gone off beyond the domain of
the original situation. That's the only way you contort and twist the
original situation in order to fit your preconceived notions.
> In the loop case, it would be far more efficient to hoist the
> std::vector outside of the loop and reuse it. The only memory
> allocations would be on the first iteration, and it would be \much\
> faster than std::list.
And if my grandma had wheels, she'd be a wagon.
> In the overlapping lifetime case, then you have many lists living in
> the program at the same time. If you used std::list over std::vector
> in a nontrivial case (aka lots of containers), then you would see
> worse performance because of more cache misses due to less locality of
> std::list and more memory required of std::list. The extra memory
> consumption plus the lack of locality would kill performance.
Oh goody! We now have to expand the original test scenario to include exotic
things like cache misses, and such.
And speaking of "extra memory consumption", I guess that all the additional
memory std::vector needs to allocate on account of future growth -- I
suppose that all gets stored in IAM (imaginary access memory), and doesn't
count. You are only concerned about "extra memory consumption" in
std::list's case, but in std::vector's case it's perfectly fine, because it
doesn't count.
> Your program basically tests creating a container in a tight loop, so
> the containers' lifetimes do not overlap. Thus your testing is
> worthless.
My test programs instantiate and destroy instances of std::vector and
std::list, repeatedly, in a tight loop, in order to magnify any difference in
the performance of those containers, within the scope of the original
domain, with a minimum of unrelated noise.
What would truly be "worthless", as you say, would be to try to measure the
relative performance of only one instantiation/destruction cycle; any
attempt to do so will have to deal with a proportionally large amount of
noise. Your assertion that only one instantiation/destruction cycle should
be measured is, of course, absurd.
That's ok. I'm sure that on some bright day, in the future, you'll figure it
out.
My argument was: Given that (by your very words) efficiency is
irrelevant here because the program has to only handle a few values
entered by hand by the user, then the second most important issue to
consider is ease of usage. Then I proceeded to demonstrate that using
std::vector results in much simpler code than using std::list.
Your counter-argument is: In my example code there might be a few
clock cycles wasted in calling the end() method of std::list.
Not only did you refuse to answer my question (ie. which one of the
two is simpler and easier to write), but you dismissed the whole
question with a ridiculous and irrelevant counter-argument.
>> What does
>> that have anything to do with this?
>
> Right. When one wants to evaluate the performance of a specific
> algorithm
I did not talk about performance. I talked about ease of usage. Your
argument is irrelevant and ridiculous.
>> And exactly what does the end()
>> function "evaluate"?
>
> That's an easy one: every iteration of the loop.
The end() function evaluates every iteration of the loop? What does
that even mean? How can the end() function evaluate "every iteration of
the loop"? I don't think you even know what you are talking about.
>> And even if calling end() was a couple of clock
>> cycles slower than doing something else, exactly why does it matter
>> here?
>
> Feh. Accuracy -- what an outmoded concept.
Right, don't answer my question. I repeat: Given that there is only a
dozen or so values in the list, entered by hand by the user (your very
words), why does it matter whether calling the end() function may
perhaps be a few clock cycles slower than doing something else?
>> The whole point was that efficiency is *irrelevant* when there's
>> just a dozen of elements in the list.
>
> I don't know whose point that is, but it's not mine.
It was your point because you argued earlier in the thread that it
makes no sense to measure std::list vs. std::vector performance with a
million elements because the user only enters a dozen or so elements by
hand. With a dozen elements the execution performance of the code is
completely irrelevant.
>> Maybe you could answer by showing how you would write that first loop
>> above.
>
> for(std::list<int>::iterator b = values.begin(), e=values.end(); b != e;
> ++b)
And that, in your opinion, is better than this:
for(size_t i = 0; i < values.size(); ++i)
With how many elements?
Sure, std::list may be faster when we have 10 elements. However, it
quickly becomes slower when the amount of elements grows even slightly.
And I'm not talking about a million elements. In my system the cutoff
point seems to be about 30 elements: When there are at least that many
elements, std::vector becomes faster than std::list. Of course the
larger the amount of elements, the faster std::vector will be compared
to std::list.
So if the original poster wants to handle, for example, 32 elements,
using std::list will be a complete loss in all possible ways. Not only
will it be slower, but the code will be more complicated.
(Not that it really matters with 32 elements anyways. However, it does
matter if the amount of elements grows much larger, let's say for
example 1000 elements. In that case std::vector beats std::list by a
factor of 10.)
And no, I did the test fair and square. I did not "cheat" by using
reserve() or reusing a std::vector instance on the outer scope: I
instantiated a std::vector at each loop and destroyed it at the end, the
same as with the std::list.
For comparison, I also used std::deque. Here's the program:
//--------------------------------------------------------------------------
#include <list>
#include <vector>
#include <deque>
#include <ctime>
#include <iostream>
int main()
{
const int iterations = 1000000;
const int elements = 26;
int sum = 0;
std::clock_t iClock = std::clock();
for(int iteration = 0; iteration < iterations; ++iteration)
{
std::list<int> l;
for(int i = 0; i < elements; ++i) l.push_back(i);
for(std::list<int>::iterator b = l.begin(), e = l.end();
b != e; ++b)
sum += *b;
}
std::cout << double(std::clock() - iClock) / CLOCKS_PER_SEC
<< std::endl;
iClock = std::clock();
for(int iteration = 0; iteration < iterations; ++iteration)
{
std::vector<int> v;
for(int i = 0; i < elements; ++i) v.push_back(i);
const size_t amount = v.size();
for(size_t i = 0; i < amount; ++i)
sum += v[i];
}
std::cout << double(std::clock() - iClock) / CLOCKS_PER_SEC
<< std::endl;
iClock = std::clock();
for(int iteration = 0; iteration < iterations; ++iteration)
{
std::deque<int> v;
for(int i = 0; i < elements; ++i) v.push_back(i);
const size_t amount = v.size();
for(size_t i = 0; i < amount; ++i)
sum += v[i];
}
std::cout << double(std::clock() - iClock) / CLOCKS_PER_SEC
<< std::endl;
return sum;
}
//--------------------------------------------------------------------------
(I calculate and return the 'sum' from main to make sure the compiler
doesn't optimize anything away.)
I compiled it with "-O3 -march=native". Here are some results:
10 elements:
std::list: 0.82 seconds.
std::vector: 1.4 seconds.
std::deque: 0.34 seconds.
26 elements:
std::list: 2.2 seconds.
std::vector: 2.12 seconds.
std::deque: 0.42 seconds.
100 elements:
std::list: 8.25 seconds.
std::vector: 3.14 seconds.
std::deque: 0.78 seconds.
1000 elements:
std::list: 79.8 seconds.
std::vector: 7.18 seconds.
std::deque: 8.46 seconds.
Not surprisingly std::deque beats the other two hands down when the
amount of elements is smallish. (When the amount of elements grows
significantly, std::vector catches up with its lower constant factors.)
Most importantly, std::deque beats your proposed solution in this
particular case (ie. about 10 elements) on all counts.
>> std::vector is easier to use than std::list because the former can be
>> accessed using a simple indexing syntax (namely operator[]) while the
>
> Splendid. However, when random access is not required, this offers no
> value added functionality.
However, when splicing is not required, std::list offers no value added.
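For reference, a quick sketch of the splicing in question: splice()
relinks nodes between lists in constant time, which no other standard
container can do:
#include <iostream>
#include <list>

int main()
{
    std::list<int> a, b;
    for (int i = 0; i < 3; ++i) a.push_back(i);    // a: 0 1 2
    for (int i = 10; i < 13; ++i) b.push_back(i);  // b: 10 11 12
    a.splice(a.end(), b);                          // O(1): b's nodes move into a
    for (std::list<int>::iterator it = a.begin(); it != a.end(); ++it)
        std::cout << *it << ' ';                   // prints: 0 1 2 10 11 12
    std::cout << '\n';
    return 0;
}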
>> latter cannot (you need to use iterators, which tend to produce more
>> verbose code than calling operator[]). That reason alone makes
>> std::vector more suitable for this situation. Less code to write, and
>> the code is also simpler.
>
> Damn right! Less code=good, even if less code=worse performance.
Now you'll have to explain why less code = better performance in the
case of std::deque, which you seem to blissfully ignore as an
alternative. (Note how you can use std::deque in the exact same way as
you can use std::vector, so there's no penalty with regard to code
clarity and simplicity.)
> You'll
> find plenty of compact, tight code on www.ioccc.org. We should all
> strive to produce applications that look so compact, and tight.
Why would I use the verbose std::list when std::deque beats it hands
down and is easier to use?
> My argument was: Given that (by your very words) efficiency is
> irrelevant here because the program has to only handle a few values
To adopt a saying from another context: a few CPU cycles here, a few CPU
cycles there, a few bytes here, and a few bytes there, pretty soon you'll be
talking a real pig of an app.
> entered by hand by the user, then the second most important issue to
> consider is ease of usage. Then I proceeded to demonstrate that using
> std::vector results in much simpler code than using std::list.
You are referring to indexing a vector to access its elements. That only
matters when random access is a requirement. It is not always required, and
not in this sample case.
Furthermore, once again, you're trying to cobble together an argument by
blurring the distinction between practical and theoretical factors.
Practically, in OP's original case, a std::list was the most practical
approach.
Unable to form a practical argument, you have to retreat to making a
theoretical one, such that using a vector is "easier".
Well, if you adopt a theoretical argument, you'll just have to take into
account all theoretical aspects of a std::vector, instead of cherry-picking
just the ones you like. You're going to have to consider all of
std::vector's baggage, such as excessive memory usage, excessive copying,
and iterator invalidation in analogous situations where a std::list would
not invalidate any of the existing iterators.
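To make the invalidation point concrete, a small sketch (the vector
half is only commented, since dereferencing the stale pointer would be
undefined behavior):
#include <iostream>
#include <list>
#include <vector>

int main()
{
    std::vector<int> v(1, 42);
    int* pv = &v[0];
    for (int i = 0; i < 1000; ++i)
        v.push_back(i);        // may reallocate: pv is now dangling
    // *pv would be undefined behavior here -- the buffer has almost
    // certainly moved.

    std::list<int> l(1, 42);
    int* pl = &l.front();
    for (int i = 0; i < 1000; ++i)
        l.push_back(i);        // list nodes never move
    std::cout << *pl << '\n';  // still valid: prints 42
    return 0;
}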
But of course, you're always quick to close your eyes, and shut your ears,
when reality comes calling.
There are many situations which are quite the opposite, of course. In the
real world, one does not always add or remove elements from one end of the
container. Of course, insertion or deletion in the middle of a vector is
O(n), while with a std::list the comparable operation carries constant
complexity.
But, in singing the praises of std::vector, its fan club always fails to
consider these things.
This, of course, is the theoretical world, and this is a valid theoretical
point. What usually happens, at this point, is that the vector fan club
decides that they don't want to talk theory, at this point, and revert back
to some practical argument based on OP's original scenario. Each time
they're backed into a corner, they jump into some other one.
> Not only did you refuse to answer my question (ie. which one of the
> two is simpler and easier to write), but you dismissed the whole
> question with a ridiculous and irrelevant counter-argument.
I already answered it: the iterator one is simpler and easier to write, for
me. It is the most natural fit with many STL iterator-based classes and
algorithms. Furthermore, when using the C++0x syntax, iterating over a
vector or a list is the same:
for (auto b = myvec.begin(), e=myvec.end(); b != e; ++b)
or even:
for (auto &elem : myvec)
This gives one an option of coding for an opaque container, which may be
implemented as either a std::list or a std::vector, without changing the
iterating code. This is possible only with an iterator-based approach. And
that's why I believe that it's easier. Unfortunately, junior apprentices
tend to prefer array indexing, until they are sufficiently comfortable with
the more powerful iterators.
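A minimal sketch of that point, assuming a C++0x/C++11 compiler for the
range-for syntax: the same function template compiles unchanged whether
the caller hands it a list or a vector:
#include <iostream>
#include <list>
#include <vector>

template <typename Container>
void print_all(const Container& values)
{
    for (const int& elem : values)   // identical iteration code either way
        std::cout << elem << ' ';
    std::cout << '\n';
}

int main()
{
    std::vector<int> v(3, 1);
    std::list<int> l(3, 2);
    print_all(v);   // 1 1 1
    print_all(l);   // 2 2 2
    return 0;
}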
>>> What does
>>> that have anything to do with this?
>>
>> Right. When one wants to evaluate the performance of a specific
>> algorithm
>
> I did not talk about performance. I talked about ease of usage. Your
> argument is irrelevant and ridiculous.
You talk about either performance or ease of usage, depending on which one
is easier to shoehorn into your argument.
>>> And exactly what does the end()
>>> function "evaluate"?
>>
>> That's an easy one: every iteration of the loop.
>
> The end() function evaluates every iteration of the loop? What does
> that even mean? How can the end() function evaluate "every iteration of
> the loop"? I don't think you even know what you are talking about.
You know perfectly well what I'm talking about. You're just posturing, due
to a lack of any better argument.
>>> And even if calling end() was a couple of clock
>>> cycles slower than doing something else, exactly why does it matter
>>> here?
>>
>> Feh. Accuracy -- what an outmoded concept.
>
> Right, don't answer my question. I repeat: Given that there is only a
> dozen or so values in the list, entered by hand by the user (your very
> words), why does it matter whether calling the end() function may
> perhaps be a few clock cycles slower than doing something else?
Because it does matter. A small domain is not a valid excuse for a sloppy
implementation, in my book. You should try striving to produce the best,
most efficient code, notwithstanding the situation, every time. It's a good
habit to have, and brings benefits in the long run.
>>> Maybe you could answer by showing how you would write that first loop
>>> above.
>>
>> for(std::list<int>::iterator b = values.begin(), e=values.end(); b != e;
>> ++b)
>
> And that, in your opinion, is better than this:
>
> for(size_t i = 0; i < values.size(); ++i)
Yes. Because, in the real world, a previously defined typedef would likely
be used; the above was for descriptive purposes only. In practice, actual,
well-maintained code usually reads:
for (container_t::iterator b = values.begin(), e=values.end(); b != e; ++b)
And, this is "better" because even without c++0x, this is unlikely require a
rewrite if the underlying container type is changed. Unlike with your
version, which is tightly bound to a std::vector implementation for the
underlying container.
There is a reason for the extensive iterator support in STL. As the saying
goes: penny-wise, pound-foolish. In your world, "better" is just a synonym
for "fewer characters". That is a foolish definition.
I think I'm not denying any overhead.
1. In the OP's use case he already knows the size in advance
which makes memory usage very efficient. The overhead is
constant: O(1)
2. In case the size is not known in advance, the memory overhead
for std::vector -- assuming a doubling growth strategy -- for a
vector of size pow(2,k), with k being an integer, is at most
pow(2,k+1) * sizeof(T) + O(1)
(That's including the peak memory usage during growth)
Without this _temporary_ usage peak you'll have an overhead
of at most
pow(2,k) * sizeof(T) + O(1)
And this is just the worst case. On average it should be
about half of that:
pow(2,k-1) * sizeof(T) + O(1)
3. The memory overhead for the list of size n is
n * PNA + O(1)
where PNA is the per node overhead, more specifically the
amount of memory used by a node minus sizeof(T). Any
guesses on PNA for typical implementations?
Sam, what's your experience w.r.t. "PNA"? Can you suggest a way to
measure/check it? I don't know what the granularity of typical memory
allocators is. Due to alignment constraints I'm guessing that on a
typical system with a good allocator the amount of memory that is
reserved when allocating N bytes is not less than
((N+7+2*sizeof(void*))/8)*8, so, 8 byte alignment plus the size of two
pointers (one that is used by the allocator and one for the XORed
next/prev pointers). On average this is what? 12 to 20 bytes of
overhead? 12 for sizeof(void*)==4 and 20 for sizeof(void*)==8? Is this
realistic or too optimistic?
Assuming the above "optimistic" numbers are correct, and further
assuming a 32-bit system with sizeof(void*)==sizeof(int)==4, we have an
average overhead of about 50% for vector<int> and an overhead of 300%
for list<int>.
Obviously there is an implementation-dependent threshold S so that, if
sizeof(T)<=S, std::vector consumes no more memory than std::list in
case we don't know the size in advance. If we do know the size in
advance, std::vector has constant memory overhead.
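One way to check the per-node overhead empirically: a rough sketch,
assuming a C++11 library with allocator_traits support. Note it counts
what the containers request, not what the underlying malloc actually
reserves, so it gives a lower bound on the real overhead:
#include <cstddef>
#include <iostream>
#include <list>
#include <vector>

static std::size_t bytes_requested = 0;

template <typename T>
struct counting_allocator {
    typedef T value_type;
    counting_allocator() {}
    template <typename U>
    counting_allocator(const counting_allocator<U>&) {}
    T* allocate(std::size_t n) {
        bytes_requested += n * sizeof(T);   // tally every request; for the
                                            // list, T is the node type
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&)
{ return true; }
template <typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&)
{ return false; }

int main()
{
    const std::size_t n = 1000;
    bytes_requested = 0;
    {
        std::list<int, counting_allocator<int> > l;
        for (std::size_t i = 0; i < n; ++i) l.push_back(int(i));
    }
    std::cout << "list:   " << bytes_requested / n << " bytes per element\n";
    bytes_requested = 0;
    {
        std::vector<int, counting_allocator<int> > v;
        for (std::size_t i = 0; i < n; ++i) v.push_back(int(i));
    }
    std::cout << "vector: " << bytes_requested / n
              << " bytes per element (includes discarded growth buffers)\n";
    return 0;
}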
> That's ok. I'm sure that on some bright day, in the future, you'll figure it
> out.
Please watch your signal-to-noise ratio.
Cheers,
SG
> Sam wrote:
>> Juha Nieminen writes:
>>
>>> Sam wrote:
>>>> All these self-appointed experts around here are desperately trying to
>>>> have it both ways, in order to save face. As with any other issue of
>>>> this kind, it can be approached from a practical, or a theoretical
>>>> viewpoint. Practically, in the OP's case, std::list is a win.
>>>
>>> Then, oh great guru of C++ programming, please explain to us mere
>>> mortals the exact reasons why std::list "is a win", because I can't
>>> see it.
>>
>> Elsewhere in the thread, you will find concrete data showing faster
>> results in std::list's case.
>
> With how many elements?
The same number in OP's test case.
> Sure, std::list may be faster when we have 10 elements. However, it
> quickly becomes slower when the amount of elements grows even slightly.
You changing a practical point into a theoretical one is noted. Now, include
the fact that vector iterators are invalidated after any insert, and
std::vector's worse performance for inserts and deletes in the middle of the
container, compared to std::list; include that in your theoretical analysis,
in order for it to be actually valid, and then we can have an intelligent
theoretical discussion.
But, you refuse to do this, because it shatters your bubble.
>>> std::vector is easier to use than std::list because the former can be
>>> accessed using a simple indexing syntax (namely operator[]) while the
>>
>> Splendid. However, when random access is not required, this offers no
>> value added functionality.
>
> However, when splicing is not required, std::list offers no value added.
Indeed. After adding an element to a list, the feature that all iterators
or pointers to existing members of the list remain valid is not a "value
added" in a real world. No sirree.
Keep cherry-picking your arguments, and ignoring ones that burst your
bubble.
>
>>> latter cannot (you need to use iterators, which tend to produce more
>>> verbose code than calling operator[]). That reason alone makes
>>> std::vector more suitable for this situation. Less code to write, and
>>> the code is also simpler.
>>
>> Damn right! Less code=good, even if less code=worse performance.
>
> Now you'll have to explain why less code = better performance in the
> case of std::deque, which you seem to blissfully ignore as an
> alternative.
You are desperately trying to change the topic. You are defending the merits
of a std::vector, not std::deque, in the practical case. It looks like you are
now begrudgingly abandoning your original argument, and are giving up, but
you want to save face arguing that a std::deque would be even better than
std::list, in the original case. That may or may not be, however even were
that to be the case, it has no relation to the merits of std::vector.
> Why would I use the verbose std::list when std::deque beats it hands
> down and is easier to use?
Because you went down with std::vector's flaming ship. :-)
The original post had no test cases. It only had a program. The
original poster did not specify how many numbers would be entered.
>> Sure, std::list may be faster when we have 10 elements. However, it
>> quickly becomes slower when the amount of elements grows even slightly.
>
> You changing a practical point into a theoretical one is noted.
Exactly how is it theoretical that std::vector quickly becomes faster
than std::list when the number of elements grows even a bit? (You don't
even need to get to 50 elements before std::vector beats std::list
in speed.) That's not theory, that's practice. You can go and test it
yourself.
> Now,
> include the fact that vector iterators are invalidated after any insert,
Your own arguments defeat you: The original poster didn't use any
iterators.
You talk about what the original poster strictly wanted in his post
when it's more convenient to you, but quickly switch to metafeatures
when it suits you better. Then you accuse me of "theoretical stuff" when
I argue that std::vector is more efficient.
> and std::vector's worse performance for inserts and deletes in the
> middle of the container, compared to std::list;
The original poster wasn't inserting anything in the middle.
> include that in your
> theoretical analysis, in order for it to be actually valid, and then we
> can have an intelligent theoretical discussion.
I'll do that when you include in your analysis the additional memory
requirements of std::list, the increased amount of allocations, the fact
that elements in a list are not contiguous in memory (which means that
they cannot be passed to something expecting a C-style array) and the
O(n) access of an element (compared to the O(1) access in std::vector).
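To illustrate that last point, a small hypothetical sketch: the vector
indexes in constant time, while reaching the i-th list element requires
std::advance, which walks i links:
#include <iostream>
#include <iterator>
#include <list>
#include <vector>

int main()
{
    std::vector<int> v(100, 7);
    std::list<int> l(100, 7);
    int a = v[42];                        // O(1): direct indexing
    std::list<int>::iterator it = l.begin();
    std::advance(it, 42);                 // O(n): walks 42 nodes
    std::cout << a << ' ' << *it << '\n';
    return 0;
}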
Why are you so eager to defend std::list on the grounds that inserting
in the middle is O(1), but blissfully ignore the fact that accessing a
random element is O(n)? And don't answer with the "the OP didn't require
random access" crap. The OP didn't require insertion in the middle either.
It's you who is cherry-picking features which are most convenient to
your idiotic argument.
> But, you refuse to do this, because it shatters your bubble.
And you refuse to take into account the additional memory usage and
linear-time random access, because "it shatters your bubble".
>>>> std::vector is easier to use than std::list because the former can be
>>>> accessed using a simple indexing syntax (namely operator[]) while the
>>>
>>> Splendid. However, when random access is not required, this offers no
>>> value added functionality.
>>
>> However, when splicing is not required, std::list offers no value
>> added.
>
> Indeed. After adding an element to a list, the feature that all
> iterators or pointers to existing members of the list remain valid is
> not a "value added" in a real world. No sirree.
Why is that a "value added" but O(1) access is not?
> Keep cherry-picking your arguments, and ignoring ones that burst your
> bubble.
Look who's talking. Please explain why constant-time random access is
not an added value, while inserting in the middle in constant time is.
Of course you won't answer to this question with anything coherent.
>> Now you'll have to explain why less code = better performance in the
>> case of std::deque, which you seem to blissfully ignore as an
>> alternative.
>
> You are desperately trying to change the topic. You are defending the
> merits of a std::vector, not std::deque, in the practical case.
I'm not changing the topic. Your argument was that std::list is the
best choice. I disagree with that argument. std::list is *not* the best
choice in this case, nor in most cases.
std::deque is a practical demonstration of why std::list is not the
best choice.
> It looks
> like you are now begrudgingly abandoning your original argument, and are
> giving up, but you want to save face arguing that a std::deque would be
> even better than std::list, in the original case. That may or may not
> be, however even were that to be the case, it has no relation to the
> merits of std::vector.
It looks like you are trying to evade answering my questions. Now that I
have shown you that std::list is *not* the most efficient solution, you
are trying to evade admitting your mistake.
>> Why would I use the verbose std::list when std::deque beats it hands
>> down and is easier to use?
>
> Because you went down with std::vector's flaming ship. :-)
I don't even understand what that means.
Sorry, that should be ((N+7+sizeof(void*))/8)*8 (for 8 byte alignment
and a pointer as part of the allocator's data structures)
Sam, you keep talking about people ignoring facts. Maybe your
communication skills are part of the "problem". Just a thought I
wanted to share.
Cheers,
SG
> > However, when splicing is not required, std::list offers no value added.
>
> Indeed. After adding an element to a list, the feature that all iterators
> or pointers to existing members of the list remain valid is not a "value
> added" in a real world. No sirree.
The context is adding elements to a container and iterating through
it. Nothing more, nothing less. This is what the OP tries to do. This
is where you can use std::vector as well as std::list. You're applying
different standards here. You're pointing out advantages of std::list
that are of no significance in this context. And every time someone
does the same for std::vector, you point to the OP's
requirements.
Cheers,
SG
Hi all,
I'm replying to the original post to recall it.
Take into account not what the above code _does_, but what it is _meant
to do_:
- collect n numbers from cin.
- format the collected numbers by inserting ", " between all of them
and appending "." at the end.
Two reasonable issues to take into account:
- ease of use
- performance
Since we are collecting data from cin and since it's not reasonable to
blame the computer for the slow typing of its user, I modified the
test case to collect data from a hypothetical super-typist - in other
words, from an already filled out stream.
Here is my benchmark/self-profiler or whatever you want to call it. It
might be screwed up; feel free to fix it if so, and to teach me
something new about correct measuring - I'm not kidding.
Please scroll down till the end (click "read more" just in case...)
because I've something more to add once I'll have shown my program's
output.
-------
#include <iostream>
#include <sstream>
#include <list>
#include <vector>
#include <ctime>
using namespace std;
int oss_wins;
int list_wins;
int vector_wins;
int oss_losses;
int list_losses;
int vector_losses;
void test(const string& s) {
clock_t osstream_total = 0;
clock_t list_total = 0;
clock_t vector_total = 0;
static clock_t osstream_start = 0;
static clock_t list_start = 0;
static clock_t vector_start = 0;
static stringstream in;
static stringstream out;
static const int max_loops = 100000;
for (int loop = 0; loop < max_loops; ++loop) {
/* oss */ {
in.clear();
in.str("");
in << s;
out.str("");
osstream_start = clock();
ostringstream osstream;
int v = 0;
int n = 0;
in >> n;
--n;
for (int i = 0; i < n; ++i) {
in >> v;
osstream << v << ", ";
}
in >> v;
osstream << v << ".";
out << osstream.str() << endl;
osstream_total += clock() - osstream_start;
}
/* list */ {
in.clear();
in.str("");
in << s;
out.str("");
list_start = clock();
list<int> ilist;
int v = 0;
int n = 0;
in >> n;
for (int i = 0; i < n; ++i) {
in >> v;
ilist.push_back(v);
}
list<int>::iterator iter = ilist.begin();
/* the following is _NOT_ cheating, we know
upfront the size, we can use it as a fair help to list
in order to correctly format the last element */
for (int i = 0, e = ilist.size()-1; i < e; ++i, ++iter) {
out << *iter << ", ";
}
out << ilist.back() << "." << endl;
list_total += clock() - list_start;
}
/* vector */ {
in.clear();
in.str("");
in << s;
out.str("");
vector_start = clock();
vector<int> ivector;
int v = 0;
int n = 0;
in >> n;
/* the following is _NOT_ cheating, we know
upfront the required size, we can use it */
ivector.reserve(n);
for (int i = 0; i < n; ++i) {
in >> v;
ivector.push_back(v);
}
for (int i = 0, e = ivector.size()-1; i < e; ++i) {
out << ivector[i] << ", ";
}
out << ivector.back() << "." << endl;
vector_total += clock() - vector_start;
}
}
cout << out.str().substr(0, 10) << " [...]" << endl;
double oss_avg = double(osstream_total) / max_loops / CLOCKS_PER_SEC * 1000;
double list_avg = double(list_total) / max_loops / CLOCKS_PER_SEC * 1000;
double vector_avg = double(vector_total) / max_loops / CLOCKS_PER_SEC * 1000;
if(oss_avg < list_avg && oss_avg < vector_avg) {
cout << "osstream wins" << endl;
++oss_wins;
} else if(list_avg < oss_avg && list_avg < vector_avg) {
cout << "list wins" << endl;
++list_wins;
} else if(vector_avg < list_avg && vector_avg < oss_avg) {
cout << "vector wins" << endl;
++vector_wins;
} else {
cout << "no absolute winner" << endl;
}
if(oss_avg > list_avg && oss_avg > vector_avg) {
cout << "osstream last classified" << endl;
++oss_losses;
} else if(list_avg > oss_avg && list_avg > vector_avg) {
cout << "list last classified" << endl;
++list_losses;
} else if(vector_avg > list_avg && vector_avg > oss_avg) {
cout << "vector last classified" << endl;
++vector_losses;
} else {
cout << "no absolute last classified" << endl;
}
cout << "osstream: " << oss_avg << "ms" << endl;
cout << " list: " << list_avg << "ms" << endl;
cout << " vector: " << vector_avg << "ms" << endl << endl;
}
int main() {
stringstream ss_small, ss_large;
ss_small << 10 << " ";
for(int i = 0; i < 10; ++i) {
ss_small << i << " ";
}
ss_large << 40 << " ";
for(int i = 0; i < 40; ++i) {
ss_large << i << " ";
}
for (int i = 0; i < 5; ++i) {
cout << "Run: " << i << endl;
cout << "10 elements:" << endl;
test(ss_small.str());
cout << "40 elements:" << endl;
test(ss_large.str());
}
cout << "--- totals --- " << endl;
cout << "osstream: " << oss_wins << " win(s), " << oss_losses << "
last position(s)" << endl;
cout << " list: " << list_wins << " win(s), " << list_losses <<
" last position(s)" << endl;
cout << " vector: " << vector_wins << " win(s), " <<
vector_losses << " last position(s)" << endl;
return 0;
}
/*
Run: 0
10 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.01391ms
list: 0.01502ms
vector: 0.01402ms
40 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.04616ms
list: 0.06029ms
vector: 0.04696ms
Run: 1
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.01323ms
list: 0.01583ms
vector: 0.0123ms
40 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.04637ms
list: 0.06021ms
vector: 0.04663ms
Run: 2
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.01392ms
list: 0.01455ms
vector: 0.013ms
40 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.04295ms
list: 0.0583ms
vector: 0.04947ms
Run: 3
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.015ms
list: 0.01503ms
vector: 0.01162ms
40 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.04757ms
list: 0.0588ms
vector: 0.04486ms
Run: 4
10 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.01283ms
list: 0.01622ms
vector: 0.01312ms
40 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.04626ms
list: 0.05759ms
vector: 0.04618ms
--- totals ---
osstream: 5 win(s), 0 last position(s)
list: 0 win(s), 10 last position(s)
vector: 5 win(s), 0 last position(s)
Process returned 0 (0x0) execution time : 104.950 s
*/
-------
Compiled without any kind of optimization with GCC 3.4.5.
If it came down to performance, I would choose either vector or
ostringstream, but here the performance goes to hell because a human
being is meant to insert the data.
Hence my preference falls decisively onto ostringstream, because I
need to loop just once.
As somebody else already mentioned, for this very case we don't really
need a "so-called" container, we can plainly use a stream. But
honestly, in this case a stream is just a char container - with some
value added ;-)
I think I'll have no need to defend my preference.
Have good time,
Francesco
--
Francesco S. Carta, hobbyist
http://fscode.altervista.org
std::list uses more memory and more allocations than std::vector does.
Thus using it is counter-productive if you don't require the few
advantages it has. Your suggestion of using std::list is a bad one. It
will result in inefficient memory-hog programs.
>> entered by hand by the user, then the second most important issue to
>> consider is ease of usage. Then I proceeded to demonstrate that using
>> std::vector results in much simpler code than using std::list.
>
> You are referring to indexing a vector to access its elements. That only
> matters when random access is a requirement. It is not always required,
> and not in this sample case.
And then you try to argue how std::list is so much better because you
can insert in the middle in constant time, as if that had been a
requirement in this case.
> Furthermore, once again, you're trying to cobble together an argument by
> blurring the distinction between practical and theoretical factors.
> Practically, in OP's original case, a std::list was the most practical
> approach.
I find it hilarious how you keep repeating that, yet show zero proof.
In the original poster's case the most practical approach is the
*simplest* approach, and std::vector is way simpler to use than std::list.
You didn't even disagree that iterating through the elements is
simpler with std::vector than it is with std::list, yet you refuse to
acknowledge that std::vector is the most practical approach exactly
because it's simpler to use.
> Unable to form a practical argument, you have to retreat to making a
> theoretical one, such that using a vector is "easier".
Unable to form a practical argument, you have to retreat to making a
theoretical one, such as insertion in the middle being faster.
Which feature is more practical in this case: Ease of usage, or an
unused feature? The easier syntax with std::vector is, in fact,
something which would actually be used, unlike your ridiculous argument
that inserting in the middle is faster.
> Well, if you adopt a theoretical argument, you'll just have to take into
> account all theoretical aspects of a std::vector, instead of
> cherry-picking just the ones you like.
Good thing that you don't have to do the same with std::list.
> You're going to have to consider
> all of std::vector's baggage, such as excessive memory usage, excessive
> copying, and iterator invalidation in analogous situations where a
> std::list would not invalidate any of the existing iterators.
You are going to have to consider all the std::list's baggage, such as
excessive memory usage, slow allocation of a large number of elements, and
slow random access.
Practical test: Allocate 100 integers with std::vector and std::list,
and measure which one consumes more memory and is slower.
Also consider that just *traversing* the elements will often be
slower with std::list than with std::vector because of cache issues
(traversing a list often causes more frequent cache misses than
traversing a vector, especially if the list elements happen to be at
random locations in memory, something which does not happen with vectors
nor deques).
> But of course, you're always quick to close your eyes, and shut your
> ears, when reality comes calling.
You are the one who is refusing to run his own test using more than 10
elements. Talk about deliberately closing your eyes from reality.
> But, in singing the praises of std::vector, its fan club always fails to
> consider these things.
And you fail to consider the drawbacks of std::list.
> This, of course, is the theoretical world, and this is a valid
> theoretical point.
Allocating 100 elements is a purely theoretical thing to do in your
world, it seems. I suppose allocating 1000 elements is completely beyond
the realm of reality then (not to talk about allocating 1 million elements).
> What usually happens, at this point, is that the
> vector fan club decides that they don't want to talk theory, at this
> point, and revert back to some practical argument based on OP's original
> scenario. Each time they're backed into a corner, they jump into some
> other one.
I find it hilarious how it seems like you are talking about yourself.
You are accusing others of your own flaws.
>> Not only did you refuse to answer my question (ie. which one of the
>> two is simpler and easier to write), but you dismissed the whole
>> question with a ridiculous and irrelevant counter-argument.
>
> I already answered it: the iterator one is simpler and easier to write,
> for me.
Oh, so you are a liar too, it seems. You don't really think that the
loop using std::list iterators is simpler than the loop using an
integral counter, but of course you can't retract from your argument
now, so if doing so requires lying, that's a small price to pay.
> It is the most natural fit with many STL iterator-based classes
> and algorithms. Furthermore, when using the C++0x syntax, iterating over
> a vector or a list is the same:
And you don't consider having to rely on a future not-yet-published
standard to be resorting to theoretical arguments?
Theory (in this type of conversation) means something which is not
practical, or something which cannot be put into practice. Resorting to
a yet-to-be-published standard is highly theoretical, rather obviously,
because you can't yet use the feature in practice.
> This gives one an option of coding for an opaque container, which may be
> implemented as either a std::list or a std::vector, without changing the
> iterating code.
Good thing you don't have to resort to "theoretical stuff", but you
can keep with practical things relevant to the original problem.
>> I did not talk about performance. I talked about ease of usage. Your
>> argument is irrelevant and ridiculous.
>
> You talk about either performance or ease of usage, depending on which
> one is easier to shoehorn into your argument.
Ease of usage is and has always been better for std::vector. I don't
have to choose that argument depending on the situation. That argument
is true regardless.
*In addition* to that, std::vector is more efficient when the amount
of elements grows even slightly larger. Your std::list loses in both
counts in that case.
>>>> And exactly what does the end()
>>>> function "evaluate"?
>>>
>>> That's an easy one: every iteration of the loop.
>>
>> The end() function evaluates every iteration of the loop? What does
>> that even mean? How can the end() function evaluate "every iteration of
>> the loop"? I don't think you even know what you are talking about.
>
> You know perfectly well what I'm talking about. You're just posturing,
> due to a lack of any better argument.
Your argument was that when measuring performance, end() should not be
called in the conditional because it evaluates something. I asked you
what is it that it evaluates. You answered with some weird "every
iteration of the loop" which doesn't make any sense.
My point was that end() is not a heavy function. It doesn't evaluate
anything at all. ("Evaluate" would imply that it performs some algorithm
or some calculations, which is not the case.) All it does is to return
an iterator in constant time, which probably takes a couple of clock
cycles. Its speed is largely irrelevant.
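For the record, here are the two loop shapes being argued about, as a
small sketch; end() is O(1) in either case, and both loops are correct
as long as the body doesn't invalidate iterators:
#include <iostream>
#include <list>

int sum_calling_end(const std::list<int>& l)
{
    int total = 0;
    for (std::list<int>::const_iterator b = l.begin(); b != l.end(); ++b)
        total += *b;                 // l.end() re-evaluated every iteration
    return total;
}

int sum_cached_end(const std::list<int>& l)
{
    int total = 0;
    for (std::list<int>::const_iterator b = l.begin(), e = l.end(); b != e; ++b)
        total += *b;                 // l.end() evaluated once, up front
    return total;
}

int main()
{
    std::list<int> l(12, 1);
    std::cout << sum_calling_end(l) << ' ' << sum_cached_end(l) << '\n'; // 12 12
    return 0;
}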
>>>> And even if calling end() was a couple of clock
>>>> cycles slower than doing something else, exactly why does it matter
>>>> here?
>>>
>>> Feh. Accuracy -- what an outmoded concept.
>>
>> Right, don't answer my question. I repeat: Given that there is only a
>> dozen or so values in the list, entered by hand by the user (your very
>> words), why does it matter whether calling the end() function may
>> perhaps be a few clock cycles slower than doing something else?
>
> Because it does matter.
That's not an answer to my question.
"Why does it matter?" "Because it does matter." That's not an answer.
It's just a childish "Why? Because!"
> A small domain is not a valid excuse for a
> sloppy implementation, in my book.
You want to optimize calling end() outside the loop. Well, how about
we optimize the speed testing program and put the std::vector instance
outside all the loops, so that at each iteration the same instance gets
reused over and over? See what the speed measurements tell then.
Or would that be "cheating"? Well, my answer to that would be your own
words: "A small domain is not a valid excuse for a sloppy
implementation, in my book."
If we are talking about micro-optimization, then we should use
micro-optimization with *all* testcases, not just with your pet container.
> You should try striving to produce
> the best, most efficient code, notwithstanding the situation, every
> time. It's a good habit to have, and brings benefits in the long run.
Which is why you should use std::vector rather than std::list, unless
the few advantages of std::list outweigh its disadvantages compared to
std::vector. Also std::deque should be considered as an alternative (and
as I have shown, it *is* the best choice in this particular case,
beating your pet std::list hands down.)
Of course, this general statement is false. Please use proper
qualifications.
As I tried to point out, this is only true _on average_ if at least one
of the following two conditions is met: (a) you know the size in
advance, or (b) sizeof(T)/2 is smaller than the per-list-node memory
overhead -- assuming 50% extra capacity on average.
Cheers,
SG
[ ... ]
> In the OP's use case, Sam took it mean a small number of ints, ~10.
> Sam did correctly show that inserting ~10 ints into a std::list is
> faster than std::vector for whatever platform, compiler, etc., he was
> using.
Well no, he didn't really even do that. The shortcomings in his
testing mean that all he did was raise the possibility that it
_could_ be the case. But despite having a multitude of problems
pointed out very explicitly and clearly, he has not posted results
from a test that makes any attempt at taking those problems into
account. What we're left with is a claim but no data that really
supports it.
Just for fun, I'm going to post an even more comprehensive test
program. This one uses container sizes from 10 to 90 (in steps of
10), but weights the number of iterations toward the smaller
containers -- the number of iterations for any given size is
'iterations/size'. This emphasizes smaller sizes, but takes larger
sizes into account as well. This one does the test for list, vector
and deque, and uses all three strategies I previously outlined
(clear(), delete/new, accumulate inside the loop) for dealing with creating
an empty container for each iteration of the outer loop.
#include <vector>
#include <list>
#include <time.h>
#include <iostream>
#include <numeric>
#include <deque>
#include <string>
const size_t iterations = 3000000;
template <class T>
struct test1 {
clock_t operator()(size_t size) {
T coll;
clock_t start = clock();
for (int j=0; j<iterations/size; j++) {
coll.clear();
for (int i=0; i<size; i++)
coll.push_back(i);
}
clock_t total_time = clock()-start;
int total = std::accumulate(coll.begin(), coll.end(), 0);
return total_time;
}
};
template <class T>
struct test2 {
clock_t operator()(size_t size) {
T *coll = NULL;
clock_t start = clock();
for (size_t j=0; j<iterations/size; j++) {
delete coll;
coll = new T;
for (size_t i=0; i<size; i++)
coll->push_back(i);
}
clock_t total_time = clock() - start;
int total=std::accumulate(coll->begin(), coll->end(), 0);
delete coll;
return total_time;
}
};
template <class T>
struct test3 {
clock_t operator()(size_t size) {
clock_t start = clock();
for (size_t j=0; j<iterations/size; j++) {
T coll;
for (size_t i=0; i<size; i++)
coll.push_back(i);
int total = std::accumulate(coll.begin(), coll.end(), 0);
}
return clock()-start;
}
};
void show_time(std::string const &caption, clock_t t) {
std::cout << caption << "\t" << t << std::endl;
}
int main() {
clock_t times[3] = {0};
char const *labels[3] = {"list:", "vector:", "deque:"};
const size_t start = 10, stop=100, step=10;
for (size_t i=start; i<stop; i+=step) {
times[0] += test1<std::list<int> >()(i);
times[1] += test1<std::vector<int> >()(i);
times[2] += test1<std::deque<int> >()(i);
}
std::cout << "Times using clear():\n";
for (int i=0; i<3; i++)
show_time(labels[i], times[i]);
std::fill_n(times, 3, clock_t(0));
for (size_t i=start; i<stop; i+=step) {
times[0] += test2<std::list<int> >()(i);
times[1] += test2<std::vector<int> >()(i);
times[2] += test2<std::deque<int> >()(i);
}
std::cout << "Times using delete/new:\n";
for (int i=0; i<3; i++)
show_time(labels[i], times[i]);
std::fill_n(times, 3, clock_t(0));
for (size_t i=start; i<stop; i+=step) {
times[0] += test3<std::list<int> >()(i);
times[1] += test3<std::vector<int> >()(i);
times[2] += test3<std::deque<int> >()(i);
}
std::cout << "Times using accumulate inside loop:\n";
for (int i=0; i<3; i++)
show_time(labels[i], times[i]);
return (0);
}
Here are results from VC++ 9:
Times using clear():
list: 4104
vector: 123
deque: 1420
Times using delete/new:
list: 4572
vector: 1981
deque: 1730
Times using accumulate inside loop:
list: 4462
vector: 1935
deque: 1637
Just for fun, I also compiled it with Microsoft's 64-bit compiler,
and got these results:
Times using clear():
list: 2183
vector: 126
deque: 889
Times using delete/new:
list: 2371
vector: 1405
deque: 1060
Times using accumulate inside loop:
list: 2355
vector: 1280
deque: 1076
In both cases, the following compiler flags were used:
/O2b2 /GL /EHsc
/O2b2 turns on full optimization with automatic inlining
/GL turns on "whole-program" optimization
/EHsc enables exception handling
Conclusions:
1) vector gets the single fastest result (by a wide margin).
2) deque is the most dependably fast (wins the most often).
3) list does a lot better with the 64-bit compiler.
3a) but it's still consistently the slowest, by ~1.7:1 at best.
--
Later,
Jerry.
What I really meant was that "std::list uses more memory *per element*
than std::vector". It is true that if you don't know the amount of
elements in advance and simply keep pushing back new elements, in some
cases the total memory usage may surpass that of the equivalent
std::list. (Another disadvantage of std::vector, which might present
itself sometimes, is that it can cause bad memory fragmentation, making
the overall memory usage of the program considerably higher than the
amount of memory it actually needs.)
If the memory usage caused by the extra capacity reserved inside the
vector becomes a problem, you could always transfer all the contents
into a new vector which has the proper size. (Doesn't help the memory
fragmentation problem, though.)
Or you could use std::deque for optimal memory usage (in terms of
memory allocation and memory fragmentation) and speed in a situation
like this.
std::list might in some cases be a viable alternative if you use some
specialized fast memory allocator and the additional space required per
element is not a problem.
Not to be mean, but that's a completely unfair and unrealistic test.
Turn on basic optimizations and please post the numbers again.
Generally, most STL implementations rely heavily on inlining, which
you did not enable, which will greatly skew the result compared to a
real program, aka one compiled with optimizations.
One last possible condition: (c) If there is a point at which you are
"done" inserting elements, and if the code can identify that point,
then you can shrink the vector's capacity to its size, a la copy-swap.
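For reference, the copy-and-swap shrink mentioned above can be written
as a one-liner (a sketch; the standard doesn't strictly guarantee the
temporary's capacity will equal its size, but common implementations
behave that way, and C++0x adds a shrink_to_fit() member for this):

std::vector<int> v;
// ... v grows during insertion, accumulating excess capacity ...
std::vector<int>(v).swap(v); // the temporary copy is sized to fit;
                             // swapping it in discards the excess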
Also, wouldn't the overhead of a vector be 50% extra capacity at
*worst*, not "at average"? Well, I guess during a reallocation you might
have about 100% memory overhead (x length for the original internal
array, plus 2x length for the new array), which immediately drops back
to 50% (i.e. the half-empty 2x-length new array) as soon as push_back
returns.
[ ... ]
> And speaking of "extra memory consumption", I guess that all the
> additional memory std::vector needs to allocate on account of
> future growth -- I suppose that all gets stored in IAM (imaginary
> access memory), and doesn't count. You are only concerned about
> "extra memory consumption" in std::list's case, but in
> std::vector's case it's perfectly fine, because it doesn't count.
Change that from "Imaginary Access Memory" to "virtual memory", and
you've pretty much nailed it.
Most modern OSes (definitely including Windows and Linux) provide and
use virtual memory. This means when you allocate memory, most of what
you're allocating is really just address space and (possibly) some
room in the system paging file. Physical memory to "back" that
virtual memory will only be allocated when/if you read from or write
to a particular page of memory.
Memory is normally allocated in 4K (x86, POWER/PowerPC) or 8K (SPARC,
Alpha) pages. The maximum waste for a vector<int> on an x86
processor, for example, would be 4K-sizeof(int). If you have a vector
of a million objects and your vector has a growth factor of 2 (though
see my previous post -- it's usually smaller), when it resizes to
hold 2 million objects, it's still only going to allocate one more
page of physical memory.
If you want to get technical, it's also going to allocate some space
for page tables, but it still comes down to one fact: the allocation
is still in the range of kilobytes, not megabytes.
--
Later,
Jerry.
[ ... ]
> As I tried to point out this is only true _at average_ if at least one
> of the following two conditions are met: (a) you know the size in
> advance or (b) sizeof(T)/2 is smaller than the per-list-node memory
> overhead -- assuming 50% extra capacity at average.
In reality, vector will almost _never_ use 50% extra memory. First of
all, several years ago Andrew Koenig posted this:
http://groups.google.com/group/comp.lang.c++.moderated/msg/ba558b4924758e2e
Andrew is sufficiently well known in the C++ community that I'd
expect essentially all current libraries make use of what he showed
(the growth factor should be less than ~1.6). Certainly the library I
normally use (Dinkumware) uses a smaller factor (1.5, to be exact).
It gets even better than that though. A few years ago, Kai Uwe Bux
did an analysis applying Benford's law to sizes of vectors.
http://groups.google.com/group/comp.lang.c++/msg/fdd89c5606365865
This shows that with a growth factor of 1.6, you expect to see only
about 20% of the space wasted, not the ~30% a more naive analysis
would suggest.
Even with a growth factor of two, we expect to see about 72% usage.
With a growth factor of 1.5, expect about 82% usage (<18% waste).
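If you're curious what your own library does, it's easy enough to
observe directly (a minimal sketch; the exact capacity sequence will
vary by implementation):

#include <iostream>
#include <vector>

// print the capacity each time a push_back triggers a reallocation,
// which reveals the library's growth factor
int main() {
    std::vector<int> v;
    std::vector<int>::size_type last_cap = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_cap) {
            last_cap = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << last_cap << "\n";
        }
    }
    return 0;
}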
For list, in a typical case where an int and a pointer are the same
size, a list<int> will have an overhead of roughly 3:1 (i.e. the
actual data in each node that we care about will be only one third of
each node). On a 64-bit implementation, it gets even worse -- in this
case, an int often remains 32 bits, but a pointer is extended out to
64 bits, so a list<int> has 5:1 overhead!
--
Later,
Jerry.
You're right to call me on that: I just asked for one or more fixes
and I didn't expect any hugs & caresses ;-)
Honestly, I believed that the test was more fair _without_
optimizations, thank you for correcting this misconception of mine.
I have no flag called "basic optimizations", maybe you meant -O1?
Well, in any case, I compiled it with -O3, here are the results:
-------
Run: 0
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.01222ms
list: 0.0128ms
vector: 0.01133ms
40 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.04073ms
list: 0.05241ms
vector: 0.04014ms
Run: 1
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.01223ms
list: 0.01321ms
vector: 0.00982ms
40 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.04135ms
list: 0.05207ms
vector: 0.04195ms
Run: 2
10 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.01191ms
list: 0.01352ms
vector: 0.01093ms
40 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.04448ms
list: 0.05206ms
vector: 0.03977ms
Run: 3
10 elements:
0, 1, 2, 3 [...]
vector wins
osstream last classified
osstream: 0.01272ms
list: 0.01251ms
vector: 0.01153ms
40 elements:
0, 1, 2, 3 [...]
osstream wins
list last classified
osstream: 0.04205ms
list: 0.04985ms
vector: 0.04328ms
Run: 4
10 elements:
0, 1, 2, 3 [...]
vector wins
osstream last classified
osstream: 0.01303ms
list: 0.01301ms
vector: 0.01021ms
40 elements:
0, 1, 2, 3 [...]
vector wins
list last classified
osstream: 0.04179ms
list: 0.05255ms
vector: 0.04076ms
--- totals ---
osstream: 2 win(s), 2 last position(s)
list: 0 win(s), 8 last position(s)
vector: 8 win(s), 0 last position(s)
Process returned 0 (0x0) execution time : 92.142 s
-------
Uh... vector is the winner... I wasn't unfair with it, keeping
optimizations out.
OK, I still prefer ostringstream for this specific case, since it is
by far the easier to use - as I said, I have to loop only once.
Thanks a lot for your reply and for your correction.
Cheers,
> Uh... vector is the winner... I wasn't unfair with it, keeping
> optimizations out.
Ooops.. I meant to say: "I _was_ unfair with it", sorry.
That's just the size of the node. You also have to take into account
that each node is allocated separately and the memory allocator will
invariably require some ancillary data in each allocated block.
For example, on a 32-bit Linux system, allocating a struct of 12 bytes
(one int and two pointers) consumes 16 bytes of memory (coincidentally
the smallest possible overhead for any allocated block, so it happens
to be the optimal case).
So in a completely optimal situation in a std::vector (ie. the
capacity of the vector matches its size) each int will take only 4
bytes, while even in the most optimal case for a std::list (like in the
case of an int element) each element will take 16 bytes of memory.
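Concretely, the node being described would look something like this
(illustrative only -- real library node layouts and names differ):

struct ListNode {       // hypothetical std::list<int> node
    ListNode *prev;
    ListNode *next;
    int value;          // the 4 bytes we actually care about
};
// On a 32-bit system sizeof(ListNode) is 12, typically rounded up to
// a 16-byte allocation: a 4:1 ratio against the 4-byte payload, before
// counting any further allocator bookkeeping.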
[snip]
> --- totals ---
> osstream: 2 win(s), 2 last position(s)
> list: 0 win(s), 8 last position(s)
> vector: 8 win(s), 0 last position(s)
>
> Process returned 0 (0x0) execution time : 92.142 s
> -------
Compiling with just -O doesn't change much:
-------
[snip]
Run: 2
10 elements:
0, 1, 2, 3 [...]
list wins
osstream last classified
osstream: 0.01361ms
list: 0.01324ms
vector: 0.01352ms
[snip]
--- totals ---
osstream: 2 win(s), 2 last position(s)
list: 1 win(s), 8 last position(s)
vector: 7 win(s), 0 last position(s)
-------
List wins once in the 10 elements contest, the other results are more
or less the same.
I did? I don't think so. I'm merely being paranoid and anal, trying to
cover all of my bases, to make sure you cannot contort another use
case which I do not cover.
> > In the loop case, it would be far more efficient to hoist the
> > std::vector outside of the loop and reuse it. The only memory
> > allocations would be on the first iteration, and it would be \much\
> > faster than std::list.
>
> And if my grandma had wheels, she'd be a wagon.
I'm sorry. Is that meant to be a technical argument in favor of
std::list?
> > In the overlapping lifetime case, then you have many lists living in
> > the program at the same time. If you used std::list over std::vector
> > in a nontrivial case (aka lots of containers), then you would see
> > worse performance because of more cache misses due to less locality of
> > std::list and more memory required of std::list. The extra memory
> > consumption plus the lack of locality would kill performance.
>
> Oh goody! We now have to expand the original test scenario to include exotic
> things like cache misses, and such.
Cache misses are not exotic. Unless you're doing work on small
embedded systems, your machine will have many different levels of
caching, processor caches, caches for main memory, caches on the hard
drive, software caching aka virtual memory, etc. This is not exotic.
This is quite par for the course. Caching is greatly responsible for
the huge speeds we have today. I assure you, if you took out all
caching from modern machines, you would get at least 1000 times
slower, probably more. Ignore caching effects at your own risk, and at
the risk to your end users.
> And speaking of "extra memory consumption", I guess that all the additional
> memory std::vector needs to allocate on account of future growth -- I
> suppose that all gets stored in IAM (imaginary access memory), and doesn't
> count. You are only concerned about "extra memory consumption" in
> std::list's case, but in std::vector's case it's perfectly fine, because it
> doesn't count.
As described else-thread by SG, vector does not always take up less
memory. However, I think in the average use case, you can make it take
up less memory, either because: you know the size beforehand, or there
is a definite spot at which inserting will end, after which one could
reduce its capacity to its size, a la copy-swap.
Also, you entirely ignored the second aspect of caching, which is even
more critical, locality. A vector will have much higher locality,
meaning a much higher cache hit rate, meaning much faster programs for
large programs when caching plays a significant role.
> > Your program basically tests creating a container in a tight loop, so
> > the containers' lifetimes do not overlap. Thus your testing is
> > worthless.
>
> My test programs run instantiate and destruct instances of std::vector and
> std::list, repeatedly, in a tight loop, in order to expand any difference in
> the performance of those containers, within the scope of the original
> domain, with a minimum of unrelated noise.
I attempted to show that there are two kinds of use cases: allocating
a single container in a loop, serially, so that the containers'
lifetimes will not overlap, or creating many containers so that their
lifetimes will overlap. I proceeded to show that vector is better in
both cases. That is, either you allocate a small number of containers
so their lifetimes overlap, in which case the number of insertions
must be small so performance doesn't matter, or the number of
insertions must be large in which case vector beats list.
Your particular use case is contrived. It is a micro-benchmark. Any
real code for performance would immediately hoist the container out of
the loop and reuse it, at which point vector would outperform list in
any test, with the list declared in or out of the loop.
I'm sure you will pick and choose arguments to respond to, so I'm not
sure why I'm replying. For your and my benefit though, here's a
recap:
1- For allocating a container in a loop serially, just hoist the
container out of the loop (see the sketch after this list). Vector wins.
2- For allocating containers so their lifetimes overlap:
2a- and for a small number of insertions, insertion performance of the
container does not matter, so the container type does not matter.
However, if the number of accesses is high, then vector wins as it has
much faster accesses.
2b- and for a large number of insertions to a relatively smaller number
of containers, vector wins on insertion performance, it wins on access
performance due to locality, and it probably wins on memory
consumption because you can probably reserve up front or copy-swap at
the end.
2c- and for a large number of insertions to an equally large number of
containers, aka ~10 inserts to each container, vector wins on access
performance due to locality, and it probably wins on insertion
performance again due to caching effects. If you can make the vector's
capacity match its size, either by reserve up front or copy-swap at
the end, then this further bolsters my claim. Without that, this claim
is less likely to be true, but probably still is true due to the
locality effects.
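For point 1, the hoisting looks something like this (a sketch;
process(), passes and n are placeholders, not anything from the
original program):

std::vector<int> coll;                 // constructed once, outside the loop
for (int pass = 0; pass < passes; ++pass) {
    coll.clear();                      // size -> 0, capacity retained
    for (int i = 0; i < n; ++i)
        coll.push_back(i);
    process(coll);                     // hypothetical consumer
}
// only the first pass allocates; every later pass reuses the buffer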
std::list is rarely better than std::vector for performance. Even if
the element is expensive to copy, but you're only doing removals or
insertions at the end, std::vector<T *> will outperform std::list.
std::list is only better for performance when you need splicing or lots
of operations on arbitrary points in the list.
[ ... ]
> What I really meant was that "std::list uses more memory *per
> element* than std::vector". It is true that if you don't know the
> amount of elements in advance and simply keep pushing back new
> elements, in some cases the total memory usage may surpass that of
> the equivalent std::list.
In theory this is true.
In fact, it's only likely to arise when the items being stored are
relatively large. In the case being discussed here, the "item" is an
int. In a typical implementation, ints and pointers are the same
size. Each node contains one stored item plus two pointers (std:list
is a doubly-linked list) so twice as much memory is wasted as used.
To use triple the memory with a vector, you need to have a growth
factor of 2 -- but it was shown long ago that the growth factor
should be smaller than that, and most libraries have used a smaller
factor for a long time (e.g. Dinkumware's been using a growth factor
of 1.5 since the 16-bit days).
As I also noted in a previous post, a virtual memory system will
normally limit the actual memory usage to no more than one page
beyond what's been written. Even if you're concerned about a system
that's extremely short of physical memory, the worst case is during a
growth operation, during which the absolute worst case would be
needing four pages of physical memory (which only arises if a single
item crosses a page boundary).
It should also be noted that most free store managers won't even try
to allocate an object as small as a single int. Most round all
allocations up to the next larger power of 2. In a typical case, the
minimum allocation is 16 or 32 bytes. This often adds even MORE
overhead to lists of small items.
> (Another disadvantage of std::vector which might present
> itself sometimes, is that it can cause bad memory fragmentation, causing
> the overall memory usage of the program to be considerably higher than
> what the program is actually using.)
This is why you use a smaller growth factor -- it helps ensure that
discarded memory can be re-used.
[ ... ]
> std::list might in some cases be a viable alternative if you use
> some specialized fast memory allocator and the additional space
> required per element is not a problem.
There _are_ cases where std::list makes sense -- but they're pretty
rare. The standard containers are value oriented, and assume that
(for example) copying is cheap (e.g. virtually everything is passed
by value, not reference).
Most of the cases that favor std::list over vector will work even
better with something else that passes by reference instead. Of
course, that's assuming a rather naive compiler -- with enough care,
copy elision, return value optimization, etc., can eliminate most of
the problems. Then again, the same optimizations can frequently
eliminate most of the problems with copying in a vector as well...
--
Later,
Jerry.
[ ... ]
> std::list is rarely better than std::vector for performance. Even
> if the element is expensive to copy, but you're only doing removals
> or insertions at the end, std::vector<T *> will outperform
> std::list.
Or a vector of some sort of reference-counted pointer to T.
> std::list is only better for performance when you need splicing or lots
> of operations on arbitrary points in the list.
Operations on arbitrary points in the list only help under limited
circumstances. For example, if I have a collection of 10 items, and
want to insert a new item in the fifth spot, for list I do an O(N)
walk through the list to get to the fifth spot, then an O(1)
insertion. With vector, getting to the fifth spot is O(1), and the
insertion is O(N).
As such, it only works out in favor of list IF you've already stored
the point where the insertion needs to take place. In theory the two
come out even if you're doing something like an insertion sort, where
you have to scan through the items to find the right spot for the
insertion (or deletion).
In reality, walking through items with a vector usually has lower
constants, so it'll usually be faster even in this case. Worse, if
you have items in a defined order, you can usually find the right
spot with a binary search, and get O(lg N) to find the right spot.
You're stuck with walking the items IFF you need to do something like
'insert X after Y', so there's a required order, but it's not sorted.
--
Later,
Jerry.
Yes. You are correct. It depends. I should have phrased it as
"std::list is only better if [X], though not necessarily better." If
you already have an iterator to the position you need to insert after,
you don't have to walk there. Sometimes this could be the case.
Depends on the algorithm. std::list's usefulness depends very much on
the specific requirements.
> Sam wrote:
>>>> Elsewhere in the thread, you will find concrete data showing faster
>>>> results in std::list's case.
>>>
>>> With how many elements?
>>
>> The same number in OP's test case.
>
> The original post had no test cases. It only had a program. The
> original poster did not specify how many numbers would be entered.
Oh, I see now -- the OP was entering tens of thousands of numbers on
std::cin. That explains all the hubbub.
>>> Sure, std::list may be faster when we have 10 elements. However, it
>>> quickly becomes slower when the amount of elements grows even slightly.
>>
>> You changing a practical point into a theoretical one is noted.
>
> Exactly how is it theoretical that std::vector becomes quickly faster
> than std::list when the amount of elements grows even a bit (you don't
It becomes theoretical as soon as you leave the boundaries of the original
scenario.
>> Now,
>> include the fact that vector iterates are invalidated after any insert,
>
> Your own arguments defeat you: The original poster didn't use any
> iterators.
But once you venture out into the theoretical world, all theoretical
consequences of using std::vector come into play.
> when it suits you better. Then you accuse me of "theoretical stuff" when
> I argue that std::vector is more efficient.
Only when you ignore half of the consequences of using std::vector.
>> include that in your
>> theoretical analysis, in order for it to be actually valid, and then we
>> can have an intelligent theoretical discussion.
>
> I'll do that when you include in your analysis the additional memory
> requirements of std::list, the increased amount of allocations, the fact
… the additional overhead of reserving memory for unused elements of
std::vector, the impact from all the additional invocations of copy
constructors and destructors, as the container grows in size.
> that elements in a list are not contiguous in memory (which means that
> they cannot be passed to something expecting a C-style array) and the
> O(n) access of an element (compared to the O(1) access in std::vector).
And what is the complexity of removing an element from the middle of the
list, versus the middle of a vector?
> Why are you so eager to defend std::list on the grounds that inserting
> in the middle is O(1), but blissfully ignore the fact that accessing a
> random element is O(n)?
Because you wouldn't use a std::list when random access is required.
Really, it's that simple.
> And don't answer with the "the OP didn't require
> random access" crap. The OP didn't require insertion in the middle either.
Once more, you are cherry-picking from either the practical or the
theoretical side of the argument, as you see fit.
>> But, you refuse to do this, because it shatters your bubble.
>
> And you refuse to take into account the additional memory usage and
> linear-time random access, because "it shatters your bubble".
Are you talking about yourself, and your blissful ignorance of the
additional memory requirements of std::vector?
>>
>> Indeed. After adding an element to a vector, the feature that all
>> iterators or pointers to existing members of the vectors remain valid is
>> not a "value added" in a real world. No sirree.
>
> Why is that a "value added" but O(1) access is not?
Because you forgot that it's the vector fan club who began adamantly
claiming that std::vector solves world hunger, brings world peace, and is a
cure for the common cold.
As you are dimly beginning to realize, what really matters is the actual,
individual requirements of each individual case. Now, you've stumbled the
closest you've been to the real answer so far, let's see if you've got what
it takes to bring it home.
But as long as anyone makes the absurd, blanket statement that using a
vector is always the right solution, I'll just maintain the opposite
position, using their own arguments against them, until they are forced to
backtrack.
>>> Now you'll have to explain why less code = better performance in the
>>> case of std::deque, which you seem to blissfully ignore as an
>>> alternative.
>>
>> You are desperately trying to change the topic. You are defending the
>> merits of a std::vector, not std::deque, in the practical case.
>
> I'm not changing the topic.
Would you mind looking back through the thread and see who was the first one
to bring up std::deque?
I'll wait.
> Your argument was that std::list is the
> best choice. I disagree with that argument. std::list is *not* the best
> choice in this case, nor in most cases.
>
> std::deque is a practical demonstration of why std::list is not the
> best choice.
Sorry to confuse you with facts, but std::deque suffers the same limitations
as std::vector, in terms of the complexity of insertion or deletion in the
middle of the container, as well as the same memory usage issue.
>> It looks
>> like you now begrudgingly abandoning your original argument, and are
>> giving up, but you want to save face arguing that a std::deque would be
>> even better than std::list, in the original case. That may or may not
>> be, however even were that to be the case, it has no relation to the
>> merits of std::vector.
>
> It looks like you are trying to evade answering my questions. When I
Pop quiz: who was the first one who began bringing up other, unrelated
containers, in the discussion concerning the merits of vector vs list?
So, the OP's case involves a small number of integers?
Thus no matter what kind of container, the OP will not see any
measurable difference for his program, correct?
However, if he were to try to reuse his program, such as by piping the
output of one program to another, then std::vector would outperform,
right?
Thus he should just use std::vector, right?
> Sam wrote:
>> Juha Nieminen writes:
>>
>>> My argument was: Given that (by your very words) efficiency is
>>> irrelevant here because the program has to only handle a few values
>>
>> To adopt a saying from another context: a few CPU cycles here, a few CPU
>> cycles there, a few bytes here, and a few bytes there, pretty soon
>> you'll be talking a real pig of an app.
>
> std::list uses more memory and more allocations than std::vector does.
Very funny. You're a real comedian, you should be writing for SNL.
You've got your blinders on, and can't see all the additional memory
std::vector allocates, in order to reserve room for the growth of a vector.
There are plenty of situations where std::vector uses more memory than
std::list would, with the same number of elements.
Your blanket, absolute statement is evidence that you do not fully
understand the subject matter at hand.
>> You are referring to indexing a vector to access its elements. That only
>> matters when random access is a requirement. It is not always required,
>> and not in this sample case.
>
> And then you try to argue how std::list is so much better because you
> can insert in the middle in constant time, as if that had been a
> requirement in this case.
I am merely pointing out situations where std::list is a better choice, in
response to illogical claims about the superiority of std::vector in all
possible situations.
You cannot dispute that, so you have to limit your response to pointing out
other instances where std::vector works better than std::list.
> You didn't even disagree that iterating through the elements is
> simpler with std::vector than it is with std::list, yet you refuse to
I most certainly disagreed. You're in error.
>> You're going to have to consider
>> all of std::vector's baggage, such as excessive memory usage, excessive
>> copying, and iterator invalidation in analogous situations where a
>> std::list would not invalidate any of the existing iterators.
>
> You are going to have to consider all the std::list's baggage, such as
> excessive memory usage, slow allocation of a high amount of elements and
> slow random access.
You have me confused with someone else. I am not the one who began the
argument with absolute claims of the kind that I quoted up top.
>> But, in singing the praises of std::vector, its fan club always fails to
>> consider these things.
>
> And you fail to consider the drawbacks of std::list.
Wrong, as usual. Unlike other participants of this thread, I never made an
absolute statement that one container always has better results in all
possible situations. I merely pointed out that within the confines of OP's
test case, std::list comes out on top. If the OP was trying to do something
else, such as reading the input data from a file, where the potential number
of elements might be quite large; or if the OP wanted to sort the elements,
for example, that would've called for a different approach.
>>> Not only did you refuse to answer my question (ie. which one of the
>>> two is simpler and easier to write), but you dismissed the whole
>>> question with a ridiculous and irrelevant counter-argument.
>>
>> I already answered it: the iterator one is simpler and easier to write,
>> for me.
>
> Oh, so you are a liar too, it seems. You don't really think that the
It's always very convenient to have someone who can read my thoughts, and
explain them to everyone else. It's a nice perk to have.
It's almost as good as trying to argue regarding what I really think with
someone who believes they know that better than me.
> loop using std::list iterators is simpler than the loop using an
> integral counter, but of course you can't retract from your argument
You could not iterate over a std::list with an index counter even if you
wanted to. You probably meant std::vector, here.
I just wanted to confirm that I understood you correctly, before I point you
to a public CVS repository with my code that iterates over a vector using
iterators, for you to read, and weep.
>> It is the most natural fit with many STL iterator-based classes
>> and algorithms. Furthermore, when using the C++0x syntax, iterating over
>> a vector or a list is the same:
>
> And you don't consider having to rely on a future not-yet-published
> standard to be resorting to theoretical arguments?
Of course it's theoretical, which is fine since you, yourself, started a
theoretical discussion.
> Ease of usage is and has always been better for std::vector. I don't
Again, an absolute claim that betrays your lack of experience in the subject
matter at hand.
> have to choose that argument depending on the situation. That argument
> is true regardless.
No, it's not true. Absolute, unqualified statements of this kind are
rarely true.
> *In addition* to that, std::vector is more efficient when the amount
> of elements grows even slightly larger.
Only if you discard from your consideration the additional penalty and
overhead that std::vector already incurred, up to this point.
But, of course, I don't expect you to understand that.
> My point was that end() is not a heavy function. It doesn't evaluate
> anything at all. ("Evaluate" would imply that it performs some algorithm
> or some calculations, which is not the case.)
Ok -- computing the ending iterator for a container is no longer an
"algorithm". It's a magical process. Hurrah!
> All it does is to return
> an iterator in constant time, which probably takes a couple of clock
> cycles. Its speed is largely irrelevant.
Well, it also probably takes "a couple of clock cycles" for std::list too,
but, suddenly, it's a big deal to you.
>> You should try striving to produce
>> the best, most efficient code, notwithstanding the situation, every
>> time. It's a good habit to have, and brings benefits in the long run.
>
> Which is why you should use std::vector rather than std::list, unless
> the few advantages of std::list overweight its disadvantages compared to
> std::vector.
There are just as many advantages to std::list as there are to std::vector,
but junior apprentices rarely understand that, until they get more
experienced.
>> > In the loop case, it would be far more efficient to hoist the
>> > std::vector outside of the loop and reuse it. The only memory
>> > allocations would be on the first iteration, and it would be \much\
>> > faster than std::list.
>>
>> And if my grandma had wheels, she'd be a wagon.
>
> I'm sorry. Is that meant to be a technical argument in favor of
> std::list?
It's a technical counterpoint to a theoretical, pie in the sky, claim.
HTH.
>> Oh goody! We now have to expand the original test scenario to include exotic
>> things like cache misses, and such.
>
> Cache misses are not exotic.
In the context of the original discussion, it certainly is. All it was
was a simple test program that created a small container. Trying to
justify the merits of one particular implementation by going off on a
tangent and talking about things like cache misses is rather exotic, in
my view.
> Also, you entirely ignored the second aspect of caching, which is even
> more critical, locality. A vector will have much higher locality,
It's merely more pie in the sky. Locality also depends, of course, on what
gets done with the contents of the container.
> Your particular use case is contrived. It is a micro-benchmark. Any
It is a valid benchmark that clearly illustrates the differences in the end
result between using one, versus another, container in the original
situation.
> real code for performance would immediately hoist the container out of
> the loop and reuse it, at which point vector would outperform list in
> any test, with the list declared in or out of the loop.
And, again, if my grandma had wheels, she'd be a wagon.
> I'm sure you will pick and choose arguments to respond to, so I'm not
Quite funny -- since all you're doing here is picking and choosing various
theoretical and practical arguments, mixing them together in an
incomprehensible mish-mash involving cache misses and compiler optimizations.
> std::list is rarely better than std::vector for performance. Even if
It is better in the original use case. Furthermore "rarely better" is an
absolute enough statement that can be disproven merely by enumerating
several situations where it is false, which would then invalidate it, and
there are quite a few practical use cases where std::list would fare much
better than std::vector.
In case you missed it Sam, I'd like to know your reply to this simple
argument: If it's a small number of items, performance doesn't matter,
but if it's a large number of elements, then std::vector will
outperform std::list. So, for a small number of elements the choice
does not matter, so he should use vector to make his code reusable and
extensible, because as we all know code is frequently reused: by
copy-paste, as a library, by piping the output of one program into his,
etc.
This was, and is, the source of contention for one issue, that you
claim performance is a reason to use list, but that's only true for a
very small number of inserts, your interpretation of his use case.
However, in this use case, the performance bonus of std::list over
std::vector is not measurable. Thus your argument is "Use std::list
because it performs faster in an entirely unmeasurable way in your use
case." (Note that I said "unmeasurable in his use case". If you change
it to a micro-benchmark like you did, then yes you can measure it. You
cannot measure the difference for his program for a small input set.
You can however measure the difference for a larger input set, in
which case std::vector clearly wins.)
Then there's the teaching aspect as well. (You disagree, but) vector
is generally better than std::list for performance, so it is good
advice to tell him to use vector over list by default.
[...]
> (Another disadvantage of std::vector which might present
> itself sometimes, is that it can cause bad memory fragmentation, causing
> the overall memory usage of the program to be considerably higher than
> what the program is actually using.)
Or vice versa. std::list typically allocates a lot of little
blocks (although custom allocators can avoid this); std::vector
will end up with a single, large block. A single, large block
will generally cause less fragmentation than a lot of little
blocks. But of course, it depends on the actual use. If you
insert everything in the list in one go, with no other uses of
dynamic memory during the insertions, it's possible that all of
the little blocks end up contiguous. But it's not the sort of
thing I'd count on.
Note that similar considerations hold between std::map and
maintaining a sorted std::vector< std::pair >. In one recent
application, we switched from std::map to the vector solution
because of problems with memory fragmentation. In general, node
based containers like std::map or std::list increase memory
fragmentation, and in most cases, total memory use and execution
times. But that's only very, very generally; each application
will be particular, and you really should use the logically most
appropriate container for the job until you've got some actual
measurements to go on. Thus, std::map for a map, and
std::vector for a vector, std::list if you make a lot of
insertions and deletions in the middle, etc.
--
James Kanze
So what would you expect to be the execution-time cost of "evaluating"
an inline function which looks something like one of these (examples
from three implementations of the standard library I happen to have to
hand)?
iterator end() { return (iterator(_Myhead, this)); }
iterator end() { return this->_M_node._M_data; }
iterator end () { return __node; }
>and that evaluating it on every iteration always produces the same result.
Don't you think the compiler might make this deduction too?
--
Richard Herring
To you, may be not. But in latency-sensitive environments, it most certainly
does.
But just keep thinking that it doesn't matter, don't let me discourage you
otherwise. I consider stuff like that to be excellent job security.
> but if it's a large number of elements, then std::vector will
> outperform std::list.
Only in some situations involving a large number of elements. In others, a
std::vector cannot be used for other reasons, so it's not even an option.
> So, for a small number of elements the choice
> does not matter, so he should use vector to make his code reusable and
> extensible, because as we all know code is frequently reused, such as
If one places code reusability at a premium, one should be using iterators,
rather than any std::vector-specific methods, or overloads. As I showed,
doing so allows one to substitute the underlying implementation, without
changing the code.
A completely different issue.
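For what it's worth, that iterator-first style is easy to illustrate
(sum() is an illustrative name, not anything from the earlier posts);
the same function accepts a std::vector, std::list, or std::deque range
without modification:

template <class InputIt>
int sum(InputIt first, InputIt last) {
    int total = 0;
    for (; first != last; ++first)  // nothing here names the container
        total += *first;
    return total;
}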
> This was, and is, the source of contention for one issue, that you
> claim performance is a reason to use list, but that's only true for a
> very small number of inserts, your interpretation of his use case.
Not just "interpretation". I showed the actual numbers.
> However, in this use case, the performance bonus of std::list over
> std::vector is not measurable.
Sorry, I measured it. Just because some people don't think twice about
pissing away a small amount of resources, that's not a valid reason to
do so in my book.
> You can however measure the difference for a larger input set, in
> which case std::vector clearly wins.)
No, not "clearly". There are a number of situations where the
contortions one must go through, in order to use a vector as the underlying
container, more than negate any inherent vector-specific optimizations.
Sorry, facts disagree.
> In message <cone.1253978073...@commodore.email-scan.com>,
> Sam <s...@email-scan.com> writes
>>Juha Nieminen writes:
>>
>>> If efficiency is irrelevant, then you have to consider ease of usage
>>> as the second most important feature. Which one is, in your opinion,
>>> easier to use, std::list or std::vector? Suppose, that you want to, for
>>> example, iterate through the elements in the container:
>>> for(std::list<int>::iterator iter = values.begin();
>>> iter != values.end(); ++iter)
>>> ...
>>> vs.
>>> for(size_t i = 0; i < values.size(); ++i)
>>> ...
>>> Take your pick.
>>
>>Yes. We all should take lessons on efficiencies from someone who fails
>>to realize that std::list::end() is an invariant,
>
> So what would you expect to be the execution-time cost of "evaluating"
> an inline function which looks something like one of these (examples
> from three implementations of the standard library I happen to have to
> hand)?
>
> iterator end() { return (iterator(_Myhead, this)); }
>
> iterator end() { return this->_M_node._M_data; }
>
> iterator end () { return __node; }
The cost is that in many situations the compiler will not be able to prove
to itself that the ending iterator value remains constant for the duration
of the iteration, so it has to emit the code to load the ending iterator
value on each iteration of the loop.
Explicitly evaluating end() once, at the beginning of the loop, is usually
sufficient for the compiler to emit corresponding code that loads the value
just once, into a register, avoiding the need to load it every time.
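In code, the hoisting being described is simply this (a sketch, for
some vector v; do_something() is a placeholder):

for (std::vector<int>::const_iterator it = v.begin(), e = v.end();
     it != e; ++it) {   // e is computed once and can live in a register
    do_something(*it);
}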
>>and that evaluating it on every iteration always produces the same result.
>
> Don't you think the compiler might make this deduction too?
It's surprisingly hard, and can only be done in very limited situations.
If there's even one non-inlined function call in the body of the loop, there
are only a very limited number of cases where the compiler will know that
the function call cannot possibly modify the container using some other
pointer or reference, elsewhere, and it is safe to optimize out the ending
iterator evaluation on every iteration. Otherwise, each function call can
potentially modify the container, and since each iteration compares against
the value of end() that exists at that time, it cannot be optimized out.
If latency really is so important, you might just possibly want to
consider whether locality of reference has any impact on cache
performance.
--
Richard Herring
...for an appropriate choice of "many"...
>the compiler will not be able to prove to itself that the ending
>iterator value remains constant for the duration of the iteration, so
>it has to emit the code to load the ending iterator value on each
>iteration of the loop.
>
>Explicitly evaluating end() once, at the beginning of the loop, is
>usually sufficient for the compiler to emit corresponding code that
>loads the value just once, into a register, avoiding the need to load
>it every time.
>
>>>and that evaluating it on every iteration always produces the same result.
>> Don't you think the compiler might make this deduction too?
>
>It's surprisingly hard, and can only be done in very limited situations.
>
>If there's even one non-inlined function call in the body of the loop,
>there are only a very limited number of cases where the compiler will
>know that the function call cannot possibly modify the container using
>some other pointer or reference, elsewhere, and it is safe to optimize
>out the ending iterator evaluation on every iteration. Otherwise, each
>function call can potentially modify the container, and since each
>iteration compares against the value of end() that exists at that time,
>it cannot be optimized out.
So the saving is something like one load, or similar, per iteration,
_if_ the loop body is complex enough to contain a non-inline function
call. What's that as a fraction of the overall execution time?
--
Richard Herring
Unlike you, I actually *measure* things, rather than relying on
theoretical handwaving.
For example, I measured the memory usage of a program which does this:
std::vector<int> v;
for(int i = 0; i < 1000000; ++i) v.push_back(i);
The amount of memory taken by the program was 4.77 MB. Then I changed
the std::vector to std::list and ran the test again. The memory taken by
the program was 16 MB.
>>> You are referring to indexing a vector to access its elements. That only
>>> matters when random access is a requirement. It is not always required,
>>> and not in this sample case.
>>
>> And then you try to argue how std::list is so much better because you
>> can insert in the middle in constant time, as if that had been a
>> requirement in this case.
>
> I am merely pointing out situations where std::list is a better choice,
> in response to illogical claims about the superiority of std::vector in
> all possible situations.
Exactly who claimed that std::vector is superior in all possible
situations? The only thing that I, or anyone else I have seen, have
claimed is that std::vector is superior in *this* particular situation.
>> You didn't even disagree that iterating through the elements is
>> simpler with std::vector than it is with std::list, yet you refuse to
>
> I most certainly disagreed. You're in error.
Well, as I already mentioned in another post, I don't believe you
really think that traversing a vector is not simpler than traversing a
list, but you are only claiming it for the sake of argument and because
you can't retreat from your position anymore.
Using iterators might be more *generic*, but that's not the same thing
as being *simpler*. Those are two radically different things. Your only
argument pro using iterators to traverse the container is that it's more
generic.
In this particular case the original poster will benefit from
simplicity more than from genericity.
>>> You're going to have to consider
>>> all of std::vector's baggage, such as excessive memory usage, excessive
>>> copying, and iterator invalidation in analogous situations where a
>>> std::list would not invalidate any of the existing iterators.
>>
>> You are going to have to consider all the std::list's baggage, such as
>> excessive memory usage, slow allocation of a high amount of elements and
>> slow random access.
>
> You have me confused with someone else. I am not the one who began the
> argument with absolute claims of the kind that I quoted up top.
You started the argument that std::list is some kind of pure win in
this situation. I disagree. Your arguments on why std::list is better
are flawed.
When measuring all possible factors in this particular situation,
std::deque comes out as a clear winner in all counts (including speed,
which was your original argument, and std::deque wins by a huge margin).
However, you didn't even consider std::deque as an alternative. My guess
is that you don't even understand how std::deque works, which is why it
didn't occur to you to suggest using it.
Clearly std::list is *not* the best solution in this particular situation.
(Of course the point is moot anyways. If the task is to ask the user
some numbers and then do something with them, it doesn't really matter
which container you use.)
>>> But, in singing the praises of std::vector, its fan club always fails to
>>> consider these things.
>>
>> And you fail to consider the drawbacks of std::list.
>
> Wrong, as usual. Unlike other participants of this thread, I never made
> an absolute statement that one container always has better results in
> all possible situations.
And who might these "other participants" be? I don't remember seeing
anybody claiming that std::vector (or any other container) is best in
*all possible situations*. The only claim has been that std::vector is
better in *this particular situation*. Nobody has said anything about
all possible situations (on the contrary, other hypothetical situations
where std::vector might not be the best have been addressed in this
thread more than once, by people other than just yourself).
> I merely pointed out that within the confines
> of OP's test case, std::list comes out on top.
Which is clearly a wrong statement. std::deque beats your std::list
hands down in all possible categories ("within the confines of the OP's
test case", and far beyond).
Now, be honest and confess the true reason why you didn't suggest
using std::deque, and instead suggested using std::list. (My guess: The
possibility didn't even cross your mind.)
> If the OP was trying to
> do something else, such as reading the input data from a file, where the
> potential number of elements might be quite large; or if the OP wanted
> to sort the elements, for example, that would've called for a different
> approach.
Just out of curiosity: Why wouldn't std::list be feasible for sorting
the elements? std::list does have an O(n log n) sorting function. (And if
the elements were very large and heavy to copy, then it could even beat
std::sort applied to a std::vector.)
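(For the record, that's the member function -- std::sort requires
random-access iterators, so it cannot be applied to a list. A sketch:)

std::list<int> values;
// ... fill values ...
values.sort();   // O(n log n); relinks nodes instead of copying elements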
And you refuse to acknowledge that you don't need tens of thousands of
numbers for std::vector to become faster than std::list. You just need
something like 30.
Besides, if extreme speed is of essence, std::deque would be the right
choice, not std::list (as I demonstrated), thus your solution is wrong.
Moreover, if the input is expected to have something like 10 or 20
numbers at most, std::vector would beat both if you use a proper
reserve() call. With just some "reserve(20)" I don't think std::vector
would ever become slower than std::list. (std::deque might beat it by a
small margin after more than 20 elements are pushed back.)
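The reserve() variant being suggested is just this (a sketch):

std::vector<int> values;
values.reserve(20);   // one up-front allocation; no reallocation happens
                      // until the 21st push_back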
>>>> Sure, std::list may be faster when we have 10 elements. However, it
>>>> quickly becomes slower when the amount of elements grows even slightly.
>>>
>>> You changing a practical point into a theoretical one is noted.
>>
>> Exactly how is it theoretical that std::vector becomes quickly faster
>> than std::list when the amount of elements grows even a bit (you don't
>
> It becomes theoretical as soon as you leave the boundaries of the
> original scenario.
Your requirement for extreme speed in the original scenario is
completely theoretical. What good do a few clock cycles do if the user
is entering numbers by hand?
> … the additional overhead of reserving memory for unused elements of
> std::vector, the impact from all the additional invocations of copy
> constructors and destructors, as the container grows in size.
Yes, completely relevant in the original scenario.
> And what is the complexity of removing an element from the middle of the
> list, versus the middle of a vector?
O(n) for both. If you want to prove me wrong, write a function which
takes a std::list as parameter and removes the element at the exact
middle faster than O(n).
And if you don't specify that the *order* of the elements must be
maintained (as you never did), then removing the middle element from a
vector becomes O(1) (by assigning the last element to it and then
shrinking the vector by one). Removing it from a std::list is still O(n).
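That O(1) removal is the classic swap-and-pop; a minimal sketch
(unordered_erase is an illustrative name):

void unordered_erase(std::vector<int> &v, std::size_t i) {
    v[i] = v.back();   // overwrite the doomed element with the last one
    v.pop_back();      // O(1), but element order is not preserved
}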
>> Why are you so eager to defend std::list on the grounds that inserting
>> in the middle is O(1), but blissfully ignore the fact that accessing a
>> random element is O(n)?
>
> Because you wouldn't use a std::list when random access is required.
So you are implying that the original poster required insertion in the
middle of the container?
>>> But, you refuse to do this, because it shatters your bubble.
>>
>> And you refuse to take into account the additional memory usage and
>> linear-time random access, because "it shatters your bubble".
>
> Are you talking about yourself, and your blissful ignorance of the
> additional memory requirements of std::vector?
Unlike you, I have actually *measured* in practice the memory usage of
std::vector vs. std::list.
>>> Indeed. After adding an element to a vector, the feature that all
>>> iterators or pointers to existing members of the vectors remain valid is
>>> not a "value added" in a real world. No sirree.
>>
>> Why is that a "value added" but O(1) access is not?
>
> Because you forgot that it's the vector fan club who began adamantly
> claiming that std::vector solves world hunger, brings world peace, and
> is a cure for the common cold.
No, it's you who is claiming that some people here are doing that,
even though they aren't. All they did was say that std::vector is the
best choice in *this particular case*.
(Even if we start nitpicking about efficiency, like you did, then
std::list would *not* be the correct answer, as std::deque clearly beats
it. Thus your argument is incorrect even in that case.)
> But as long as anyone makes the absurd, blanket statement that using a
> vector is always the right solution
Could you please quote me from the post where I said that using a
vector is *always* the right solution? Or anyone else, for that matter.
You seem to have some kind of obsession about that subject.
>>>> Now you'll have to explain why less code = better performance in the
>>>> case of std::deque, which you seem to blissfully ignore as an
>>>> alternative.
>>>
>>> You are desperately trying to change the topic. You are defending the
>>> merits of a std::vector, not std::deque, in the practical case.
>>
>> I'm not changing the topic.
>
> Would you mind looking back through the thread and see who was the first
> one to bring up std::deque?
>
> I'll wait.
The topic was "what is the best data container for this particular
situation". Exactly how is suggesting std::deque "changing the subject"?
Hey, I could nitpick about *you* changing the topic. Who was the first
one to bring up inserting in the middle of a container? I'll wait.
>> Your argument was that std::list is the
>> best choice. I disagree with that argument. std::list is *not* the best
>> choice in this case, nor in most cases.
>>
>> std::deque is a practical demonstration of why std::list is not the
>> best choice.
>
> Sorry to confuse you with facts, but std::deque suffers the same
> limitations as std::vector, in terms of the complexity of insertion or
> deletion in the middle of the container, as well as the same memory
> usage issue.
See? Look who is changing the topic. Your original argument was that
std::list is the fastest (if you suffer from dementia, please refer back
to your own posts). Clearly that's not true.
So now that you have been proven wrong, you change the subject by
trying to come up with some argument about insertion in the middle. If
that's not a change in topic, I don't know what is.
Besides, as I said, inserting and deleting from the middle of a list
is O(n). If you want to prove me wrong, give the function which takes a
std::list and removes the mid element faster than O(n).
And std::list would be the worst possible choice in that case.
> No, not "clearly". There are a number of situations where the
> contortions one must go through, in order to use a vector as the
> underlying container, more than negate any inherent vector-specific
> optimizations.
And what might those be? And how are they relevant to the original post?
[ ... ]
> But, by the same measuring stick you demand others abide by, if
> there's /any/ doubt as to the validity of the benchmark, the
> results are null and void. You stated this position yourself. This
> has to be a world record for backpedaling.
Nonsense! You're trying to conflate two completely separate issues.
Issue number 1: you must know _what_ you're measuring. Without this,
your measurement is devoid of meaning.
Issue number 2: you'd _like_ what you measure to be as similar to a
real workload as possible. OTOH, it's virtually never possible to
measure a real workload without at least some modification -- so you
get as close as you reasonably can, and (by strong preference) have
at least some idea of how the two differ.
The first is necessity. The second is preference.
--
Later,
Jerry.
You are the one picking and choosing data to fit your arguments, often
in an inconsistent manner. If it's all about a user entering numbers
at a terminal, there is no latency to speak of. So, try to be
consistent: either we're discussing the OP's use case exactly, as you
sometimes claim, or we're discussing general practice, as you just
claimed.
If the program is not being used at an interactive terminal, such as
piping the output of one program to it, and the number of ints is ~30
or more, then std::vector will noticeably outperform.
I don't know of any real use cases which require std::list over
std::vector because of this "magical latency". std::list only
outperforms std::vector by a unmeasurable amount when the number of
ints is roughly < 30. No performance matters when you're doing 30
ints, not even transferring over a shitty modem. However, as soon as
the use case morphs ever so slightly, like 30 ints or more, then
std::vector wins, as repeatedly mentioned.
> > but if it's a large number of elements, then std::vector will
> > outperform std::list.
>
> Only in some situations involving a large number of elements. In others, a
> std::vector cannot be used for other reasons, so it's not even an option.
Again, stay consistent. Either we're addressing the user's original
use case on performance grounds, or we're talking about general
practice. Frankly, your arguments are inane. "Oh, it doesn't fit this
use case which I'm keeping in my bag, so we should just use
std::list". std::vector is better for the OP's use case. Give me
another use case, and we can talk about whether std::list or
std::vector is better.
> > So, for a small number of elements the choice
> > does not matter, so he should use vector to make his code reusable and
> > extensible, because as we all know code is frequently reused, such as
>
> If one places code reusability at a premium, one should be using iterators,
> rather than any std::vector-specific methods, or overloads. As I showed,
> doing so allows one to substitute the underlying implementation, without
> changing the code.
>
> A completely different issue.
It is inane to say "Oh, but your algorithm must be as general as
possible". There is little to nothing to be gained by templatizing the
algorithm in this case. I cannot think of a situation where I would
take his original algorithm and replace vector with list, thus there
is no value in templatizing it, but there is a cost associated with
templatizing it, so I will not.
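For reference, the "iterator-generic" style being argued over looks
roughly like this (a sketch; the function name is illustrative):

#include <algorithm>
#include <iostream>
#include <iterator>

// Print a range without naming the container type anywhere.
template <typename InputIterator>
void print_values(InputIterator first, InputIterator last)
{
    typedef typename std::iterator_traits<InputIterator>::value_type
        value_type;
    std::copy(first, last,
              std::ostream_iterator<value_type>(std::cout, ", "));
}
// The caller may hold a std::vector<int>, a std::list<int>, or a plain
// array; the function compiles and behaves identically for each.

Whether that generality is worth the (small) cost in readability is
exactly the disagreement here.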
Code reusability is not the same thing as ease of refactoring, though
related. Sometimes, one writes these things called "libraries" or
"stand-alone program utilities" (the Unix way: a bunch of stand-alone
programs like cat, sed, grep, etc., where everything is a file).
His program might be one such thing. One could reuse his program
without ever having to recompile it, and if it's using list over
vector, it will be much slower for anything but the trivially small
use cases, and it will never be measurably faster.
>
> > This was, and is, the source of contention for one issue, that you
> > claim performance is a reason to use list, but that's only true for a
> > very small number of inserts, your interpretation of his use case.
>
> Not just "interpretation". I showed the actual numbers.
Please consider actually reading what others post. I said "your
interpretation of his use case is a small number of inserts". Measured
execution times of programs do not come into it. Your
reply is a non sequitur.
> > However, in this use case, the performance bonus of std::list over
> > std::vector is not measurable.
>
> Sorry, I measured it. Just because some don't think twice about pissing
> away a small amount of resources, that's not a valid reason to do so in my
> book.
Thank you for snipping the pre-emptive reply I gave. I shall reproduce
it here.
Joshua Maurice wrote:
> This was, and is, the source of contention for one issue, that you
> claim performance is a reason to use list, but that's only true for a
> very small number of inserts, your interpretation of his use case.
> However, in this use case, the performance bonus of std::list over
> std::vector is not measurable. Thus your argument is "Use std::list
> because it performs faster in an entirely unmeasurable way in your use
> case." (Note that I said "unmeasurable in his use case". If you change
> it to a micro-benchmark like you did, then yes you can measure it. You
> cannot measure the difference for his program for a small input set.
> You can however measure the difference for a larger input set, in
> which case std::vector clearly wins.)
You did not measure the difference in his use case. For his use case,
the difference in speed between std::list and std::vector is entirely
unmeasurable for ~10 elements. The various other costs of process
spawning, program start up and death, waiting for the user to type
things, etc., will entirely dwarf the time spent in list vs vector.
You cannot accurately measure it. However, for larger inputs, the
difference is measurable, and std::vector wins. You really need to go
review Big O analysis of algorithms, and the justifications behind it.
Do not optimize for the small case, as the small case's performance
does not matter. Instead, optimize for the asymptotic case, e.g. the
case with input sizes roughly > 30. This is commonly accepted standard
programming procedure.
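To see the asymptotic behavior rather than argue about it, one can time
the two containers directly. A rough sketch (clock() granularity is
coarse; a real measurement would repeat runs, vary N, and account for
allocator behavior):

#include <cstddef>
#include <ctime>
#include <iostream>
#include <list>
#include <vector>

int main()
{
    const std::size_t N = 10000000; // large enough to dominate startup

    std::clock_t t0 = std::clock();
    std::vector<int> v;
    for (std::size_t i = 0; i < N; ++i) v.push_back(static_cast<int>(i));

    std::clock_t t1 = std::clock();
    std::list<int> l;
    for (std::size_t i = 0; i < N; ++i) l.push_back(static_cast<int>(i));

    std::clock_t t2 = std::clock();
    std::cout << "vector: " << (t1 - t0) << " clocks, "
              << "list: " << (t2 - t1) << " clocks\n";
    return 0;
}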
Sam Wrote:
> Joshua Maurice wrote:
> > You can however measure the difference for a larger input set, in
> > which case std::vector clearly wins.)
>
> No, not "clearly". There are a number of situations where the
> contortions one must go through, in order to use a vector as the underlying
> container, more than negate any inherent vector-specific optimizations.
>
> Sorry, facts disagree.
You, sir, are an idiot or a troll. I shall wash my hands of this thread,
and of you forever.
Good day.
>> No, not "clearly". There are a number of situations where the
>> contortions one must go through, in order to use a vector as the
>> underlying container, more than negate any inherent vector-specific
>> optimizations.
>
> And what might those be?
Where existing iterators to container members must remain valid after an
insert (or a delete) from the container.
Duh.
> And how are they relevant to the original post?
That's going to be your homework assignment for today.
> On Sep 28, 4:05 am, Sam <s...@email-scan.com> wrote:
>>
>> > In case you missed it Sam, I'd like to know your reply to this simple
>> > argument: If it's a small number of items, performance doesn't matter,
>>
>> To you, may be not. But in latency-sensitive environments, it most certainly
>> does.
>>
>> But just keep thinking that it doesn't matter, don't let me discourage you
>> otherwise. I consider stuff like that to be excellent job security.
>
> You are the one picking and choosing data to fit your arguments, often
> in an inconsistent manner. If it's all about a user entering numbers
> at a terminal, there is no latency to speak of.
You asked for my "reply to this simple" theoretical argument, and you got
one. Suddenly, you want to switch from this theoretical situation to a
practical one, involving terminal entry. This is the only way you can
continue to flounder -- by cherry-picking bits from all over the place, and
discarding all other, inconvenient facts.
> So, try to be
> consistent: either we're discussing the OP's use case exactly, as you
> sometimes claim, or we're discussing general practice, as you just
> claimed.
Hilarious -- you can't even make up your mind, yourself!
>
> If the program is not being used at an interactive terminal, such as
> piping the output of one program to it, and the number of ints is ~30
> or more, then std::vector will noticeably outperform std::list.
Furthermore, as I already stated previously in the thread, just because the
penalty of one's favorite security blanket is negligible, it does not
mandate its use.
Your position is so untenable, that you keep going back and revisiting
arguments that have been answered.
> I don't know of any real use cases which require std::list over
> std::vector because of this "magical latency".
Well, just because you don't, it doesn't mean that they do not exist.
> std::list only
> outperforms std::vector by an unmeasurable amount when the number of
This "unmeasurable amount" is very much measurable. And, many people work
very hard, and get paid a lot of money, to squeeze every last bit of latency
they can, out of their realtime apps. The extra 20 or so milliseconds of
latency that I showed often means being either in the black, or in the
red, at the end of the day.
>> > but if it's a large number of elements, then std::vector will
>> > outperform std::list.
>>
>> Only in some situations involving a large number of elements. In others, a
>> std::vector cannot be used for other reasons, so it's not even an option.
>
> Again, stay consistent. Either we're addressing the user's original
> use case on performance grounds, or we're talking about general
> practice.
Here's a pop quiz. Who's arguing the merits of the containers in question by
arguing what happens when you put a lot more objects in the container than
in OP's case?
Ain't me.
> Frankly, your arguments are inane. "Oh, it doesn't fit this
> use case which I'm keeping in my bag, so we should just use
> std::list".
> std::vector is better for the OP's use case. Give me
Facts disagree. I posted hard data that proves otherwise. Numbers don't lie.
In OP's case, std::list is faster. And, it's not my fault that the hard data
pierces the bubble you've wrapped yourself in.
>> > So, for a small number of elements the choice
>> > does not matter, so he should use vector to make his code reusable and
>> > extensible, because as we all know code is frequently reused, such as
>>
>> If one places code reusability at a premium, one should be using iterators,
>> rather than any std::vector-specific methods, or overloads. As I showed,
>> doing so allows one to substitute the underlying implementation, without
>> changing the code.
>>
>> A completely different issue.
>
> It is inane to say "Oh, but your algorithm must be as general as
> possible".
Newsflash: a well-designed algorithm implementation that will not require
any changes in the event that the underlying container implementation
changes is "inane".
Just another new thing one learns every day, around here.
> There is little to nothing to be gained by templatizing the
> algorithm in this case. I cannot think of a situation where I would
> take his original algorithm and replace vector with list, thus there
> is no value in templatizing it, but there is a cost associated with
> templatizing it, so I will not.
Just because you can't think of one, it doesn't prevent it from ever
happening, and from having to deal with the consequences. I think you just
don't have enough real world experience to understand the value of deploying
an implementation that will require a lot less testing, when something like
this happens.
… Whoa. It turns out that I'm quite familiar with your so-called "unified
enterprise data integration platform" product, and its overall quality. I
must say, that this is quite an eye opener, and puts your arguments, and
your overall position, in a new light.
>> > This was, and is, the source of contention for one issue, that you
>> > claim performance is a reason to use list, but that's only true for a
>> > very small number of inserts, your interpretation of his use case.
>>
>> Not just "interpretation". I showed the actual numbers.
>
> Please consider actually reading what others post. I said "your
> interpretation of his use case is a small number of inserts". Measured
> execution times of programs do not come into it.
I see. Measuring the execution time of programs with alternative
implementations will not, I repeat, will NOT, tell anyone anything about
which implementation is the more efficient one.
Brilliant.
>> cannot measure the difference for his program for a small input set.
>> You can however measure the difference for a larger input set, in
>> which case std::vector clearly wins.)
>
> You did not measure the difference in his use case. For his use case,
> the difference in speed between std::list and std::vector is entirely
> unmeasurable for ~10 elements.
I measured it. Repeatedly. And easily. The fact that you don't seem to place
much value in anything that does not involve some "unified enterprise data
integration platform" does not invalidate my results.
> The various other costs of process
> spawning, program start up and death, waiting for the user to type
> things, etc., will entirely dwarf the time spent in list vs vector.
> You cannot accurately measure it.
It may be a shocking revelation to some, but fork() and exec() will take the
same amount of time for a simple program that uses either std::list or a
std::vector. Therefore, subtracting the fixed startup overhead from the
sampled execution times will only serve to increase the proportional
performance penalty of std::vector compared to std::list. Ask an eighth
grader for help, if you can't figure out how this works.
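To spell out the arithmetic with invented numbers: suppose startup (fork,
exec, runtime initialization) costs a fixed 10 ms, and the container work
costs 2 ms with one container versus 3 ms with the other. The measured
totals, 12 ms versus 13 ms, differ by about 8%; subtract the shared 10 ms
and the remaining 2 ms versus 3 ms differ by 50%. Removing a common
constant can only widen the proportional gap. (These figures are made up
purely to illustrate the effect, not taken from any posted measurement.)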
> However, for larger inputs, the
> difference is measurable,
Really? Would you be so kind as to refresh my memory: who is it, again,
who always fails to make up his mind whether "we're addressing the user's
original use case on performance grounds, or we're talking about general
practice"?
Physician, heal thyself.
>> Sorry, facts disagree.
>
> You, sir, are an idiot or a troll. I shall wash my hands of this thread,
> and of you forever.
Although I can assure you that I share a reciprocal opinion of you, I will
not call you an idiot or a troll. I'll let the facts present themselves, and
paint an equivalent picture, as the inescapable conclusion.
> In message <cone.1254136305....@commodore.email-scan.com>,
> Sam <s...@email-scan.com> writes
>
>>If there's even one non-inlined function call in the body of the loop,
>>there are only a very limited amount of cases where the compiler will
>>know that the function call cannot possibly modify the container using
>>some other pointer or reference, elsewhere, and it is safe to optimize
>>out the ending iterator evaluation on every iteration. Otherwise, each
>>function call can potentially modify the container, and since each
>>iteration compares against the value of end() that exists at that time,
>>it cannot be optimized out.
>
> So the saving is something like one load, or similar, per iteration,
> _if_ the loop body is complex enough to contain a non-inline function
> call. What's that as a fraction of the overall execution time?
Whatever it is, I wouldn't consider it something to be ignored. Always
producing the best possible code is a good habit to have. As I mentioned
elsewhere, in certain situations, a difference of a few milliseconds in
latency of production code means whether you come out in the black, or in
the red, at the end of the day.
But, I suppose, if you're not doing anything particularly important, and you
don't care, you shouldn't. If you forget the whole thing, but this suddenly
becomes important enough for someone to care about it, it won't be you.
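To make the end() point concrete, here are the two loop shapes under
discussion (a sketch; process() is a stand-in for any call the compiler
cannot see through):

#include <vector>

void process(int); // imagine this is defined in another translation unit

void reload_end(std::vector<int>& v)
{
    // v.end() may be re-evaluated every iteration: the compiler cannot
    // always prove that process() never modifies v through another alias.
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it)
        process(*it);
}

void cache_end(std::vector<int>& v)
{
    // Hoists the evaluation, but silently assumes the body never
    // invalidates iterators -- the "hidden assumption" noted later in
    // the thread.
    for (std::vector<int>::iterator it = v.begin(), e = v.end();
         it != e; ++it)
        process(*it);
}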
> Sam wrote:
>>> std::list uses more memory and more allocations than std::vector does.
>>
>> Very funny. You're a real comedian, you should be writing for SNL.
>>
>> You've got your blinders on, and can't see all the additional memory
>> std::vector allocates, in order to reserve room for the growth of a
>> vector. There are plenty of situations where std::vector uses more
>> memory than std::list would, with the same number of elements.
>>
>> Your blanket, absolute statement is evidence that you do not fully
>> understand the subject matter at hand.
>
> Unlike you, I actually *measure* things, rather than relying on
> theoretical handwaving.
>
> For example, I measured the memory usage of a program which does this:
Ok, so you've agreed to measure a theoretical use case.
> std::vector<int> v;
> for(int i = 0; i < 1000000; ++i) v.push_back(i);
>
> The amount of memory taken by the program was 4.77 MB. Then I changed
> the std::vector to std::list and ran the test again. The memory taken by
> the program was 16 MB.
Well, I decided to measure an equally valid theoretical use case, only
slightly different from yours. Even more list elements, and even larger
objects, not just a puny int.
std::list<foo> v;
for(int i = 0; i < 1100000; ++i) v.push_back(foo());
ps -l showed:
0 S 500 3660 3587 11 80 0 - 45904 hrtime pts/0 00:00:00 t
45904 pages = 179 megabytes of RAM.
After replacing the std::list with a std::vector:
0 S 500 3648 3587 58 80 0 - 68476 hrtime pts/0 00:00:01 t
68476 pages = 267 megabytes of RAM.
This is just as valid a measurement as yours. Sorry to have to confuse you
with facts.
Your homework assignment is to figure out how big the foo class is. It's not
very big at all.
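Back-of-the-envelope arithmetic makes both sets of numbers plausible, for
what it's worth. A std::list<int> node carries two pointers plus the int,
rounded up by the allocator -- roughly 16 bytes per node on a 32-bit
system, so a million ints land near 16 MB, against ~4 MB of contiguous
payload (plus growth slack) for the vector. Make the element large, and
the fixed per-node overhead becomes proportionally small, while the
vector's geometric-growth slack now scales with the element size. Neither
measurement contradicts the other; they sit at opposite ends of the
element-size axis. (The byte counts here are typical figures, not taken
from either posted test.)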
>> I am merely pointing out situations where std::list is a better choice,
>> in response to illogical claims about the superiority of std::vector in
>> all possible situations.
>
> Exactly who claimed that std::vector is superior in all possible
> situations? The only thing that I, or anyone else I have seen, have
> claimed is that std::vector is superior in *this* particular situation.
This particular situation: .
Hint: you need a magnifying glass to see it.
> Using iterators might be more *generic*, but that's not the same thing
> as being *simpler*.
Damn right. If the underlying container changes, it's easier to rewrite a
bunch of code, than just to have it Just Work™.
> When measuring all possible factors in this particular situation,
> std::deque comes out as a clear winner in all counts (including speed,
> which was your original argument, and std::deque wins by a huge margin).
No, it does not. deque shares many negative factors with vector.
> Clearly std::list is *not* the best solution in this particular situation.
If by "this" you mean your test case, where your element count is just below
the reserve capacity of a vector that uses logarithmic expansion, then it
is.
If by *this* you mean OP's original situation, it is not the best solution.
Nor it would be when your real world case doesn't happen to have the same
stars aligned so nicely, when your container's size will jump all over that
place.
>> If the OP was trying to
>> do something else, such as reading the input data from a file, where the
>> potential number of elements might be quite large; or if the OP wanted
>> to sort the elements, for example, that would've called for a different
>> approach.
>
> Just out of curiosity: Why wouldn't std::list be feasible for sorting
> the elements? std::list does have a O(n log n) sorting function. (And if
> the elements were very large and heavy to copy, then it could even beat
> std::sort applied to a std::vector.)
"sort" not necessary = std::sort().
>> And what is the complexity of removing an element from the middle of the
>> list, versus the middle of a vector?
>
> O(n) for both.
Hahahahahaha.
std::list<obj> theList;
// …
std::list<obj>::iterator p;
// …
theList.erase(p); // Warning, O(n)!!!!!
Very funny.
> If you want to prove me wrong, write a function which
> takes a std::list as parameter and removes the element at the exact
> middle faster than O(n).
What you fail to grok is that when you have a member of a container that
you want removed, you already have its iterator; you don't need to search
for it.
Thank you for playing.
> And if you don't specify that the *order* of the elements must be
> maintained (as you never did), then removing the middle element from a
> vector becomes O(1) (by assigning the last element to it and then
> shrinking the vector by one). Removing it from a std::list is still O(n).
You're just grasping at straws here, desperately trying to create a
contrived situation that fits your preconceived notions. This is something
that will often occur inside the ivory halls of academia, where brownie
points are awarded for completing a purely theoretical exercise that bears
no relevance to anything that will occur in practice.
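Contrived or not, the order-destroying erase Juha describes is the usual
"swap-and-pop" idiom. A minimal sketch (the function name is illustrative):

#include <vector>

// O(1) erase from a vector when element order need not be preserved.
template <typename T>
void unordered_erase(std::vector<T>& v,
                     typename std::vector<T>::iterator it)
{
    *it = v.back(); // overwrite the victim with the last element
    v.pop_back();   // drop the now-duplicated last element
}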
>> additional memory requirements of std::vector?
>
> Unlike you, I have actually *measured* in practice the memory usage of
> std::vector vs. std::list.
… only in a cherry-picked case.
>> But as long as anyone makes the absurd, blanket statement that using a
>> vector is always the right solution
>
> Could you please quote me from the post where I said that using a
> vector is *always* the right solution? Or anyone else, for that matter.
Well, how about the "Unlike you, I have actually *measured* in practice the
memory usage of std::vector vs. std::list" part, referring to your argument
that std::vector always wins?
>> I'll wait.
>
> The topic was "what is the best data container for this particular
> situation". Exactly how is suggesting std::deque "changing the subject"?
>
> Hey, I could nitpick about *you* changing the topic. Who was the first
> one to bring up inserting in the middle of a container? I'll wait.
Who was the first one to start yammering about huge lists and vectors, to
start off this entire thread?
I'll wait.
>>> std::deque is a practical demonstration of why std::list is not the
>>> best choice.
>>
>> Sorry to confuse you with facts, but std::deque suffers the same
>> limitations as std::vector, in terms of the complexity of insertion or
>> deletion in the middle of the container, as well as the same memory
>> usage issue.
>
> See? Look who is changing the topic.
The individual who first brought up deque. Hint: it wasn't me.
> Your original argument was that
> std::list is the fastest (if you suffer from dementia, please refer back
> to your own posts). Clearly that's not true.
Clearly it was true, /in OP's original use case/. Nothing you've said
disputes it.
"fastest" != "always fastest". Elementary, my dear Watson, elementary.
> Besides, as I said, inserting and deleting from the middle of a list
> is O(n).
No, it's not. See 23.2.2.3:
Complexity: Insertion of a single element into a list takes constant time
and exactly one call to the copy constructor of T.
…
Complexity: Erasing a single element is a constant time operation with a
single call to the destructor of T.
No qualification about inserting into/deleting from the beginning, the end,
or into the middle. It's always constant. Sorry to confuse you with facts.
> If you want to prove me wrong, give a function which takes a
> std::list and removes the middle element faster than O(n).
list.erase(q);
HTH.
Who said something about "having a member"?
> Thank you for playing.
*yawn*
You simply phrased your question badly, Sam. It's not obvious whether an
iterator to the middle element is already known. I do think you're
smart enough to grok what was going on. Juha even explained how he
interpreted the question by saying "write a function which takes a
std::list as parameter and removes the element at the exact middle
faster than O(n)". But instead of saying something like
"Yes, O(n) if you don't know the iterator. But I was assuming
that it is known. In that case std::list makes a difference."
you try to make it look like you're the only sane person here.
-1 for weak communication skills.
Cheers,
SG
cf. "you need a magnifying glass to see it."
> Always producing the best possible code is a good habit to have.
Indeed, and "best" is about more than speed. If you subsequently
"substitute the underlying implementation, without changing the code" by
changing the container type, the hidden assumption that end() is
invariant may come back to bite you.
> As I mentioned elsewhere, in certain situations, a difference of a few
> milliseconds in latency of production code means whether you come out
> in the black, or in the red, at the end of the day.
In that case you should also be worrying about the relative cost of
_incrementing_ different kinds of iterator. How does an increment
compare with an indirection on your target architecture? ;-)
And, as I mentioned elsewhere, you should really be much more concerned
about maximizing locality of reference. If the next element of your
sequence isn't already in the cache, all those micro-savings on iterator
manipulation are pointless.
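Sketching what each step of a traversal typically compiles down to (this
is illustrative; actual code generation varies by compiler and target):

#include <list>
#include <vector>

void touch(int&); // opaque use of the element, defined elsewhere

void traverse(std::vector<int>& v, std::list<int>& l)
{
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it)
        touch(*it); // ++it is pointer arithmetic; elements are contiguous,
                    // so the next one is usually already in cache
    for (std::list<int>::iterator it = l.begin(); it != l.end(); ++it)
        touch(*it); // ++it chases a node pointer: a dependent load, and a
                    // potential cache miss for every single element
}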
> But, I suppose, if you're not doing anything particularly important,
> and you don't care, you shouldn't.
'We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil.'
> If you forget the whole thing, but this suddenly becomes important
> enough for someone to care about it, it won't be you.
--
Richard Herring