Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

string_view problem

94 views
Skip to first unread message

Ralf Goertz

unread,
Jun 25, 2019, 4:05:19 AM6/25/19
to
Hi,

I wrote a small program to create all distinct circular arrangements of
n_i objects from category i. If you have, say two A's and two B's, there
are two distinctive ways to put them in a circular order:

AABB ABAB

The other four (linear) arrangements ABBA(1), BAAB(1), BABA(2), BBAA(1)
are only circular permutations of the two above (with the number in
parenthesis indicating of which). It is surprisingly difficult to come
up with a formula¹ for the number of such arrangements, at least in the
case where the greatest common divisor of all the n_i is greater than 1.
My idea for the program was to go over all linear arrangements (with the
first element fixed) and store those which were no circular permutations
of any of the previously stored arrangements. To test that I created a
string called tester which is basically the current linear permutation
twice. Let the test permutation be BAAB. Then tester would be BAABBAAB.
Now I can try to find all previously stored permutations within tester:

tester.find(1,i)

which is not string::npos if i=AABB. (I need to start only at position 1
because there can't be a match at 0. Likewise, I don't need the last
character of tester.) The complete program goes like this:


#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <string>
#include <string_view>

using namespace std;


int main(int argc, char *argv[]) {
string permuter;
char c='A';
for (auto i=1;i<argc;++i) {
auto n=stoi(argv[i]);
permuter.insert(permuter.end(),n,c++);
}
if (!permuter.size()) {
cout<<0<<endl;
return 0;
}
vector<string> result;
size_t bin=1;
do {
string const tester=permuter+permuter.substr(0,permuter.size()-1); //*
bool found(false);
for (auto &i:result) {
if (tester.find(i,1)!=string::npos) {
found=true;
break;
}
}
if (!found) {
result.push_back(permuter);
if (result.size()==bin) {
cout<<bin<<endl;
bin<<=1;
}
}
} while (next_permutation(permuter.begin()+1,permuter.end()));
if (result.size()<1000) copy(result.begin(),result.end(),ostream_iterator<string>(cout,"\n"));
cout<<result.size()<<endl;
}

In an attempt to optimize I tried (for the first time) to use
string_view instead of string on the line marked with the asterisk. I
tried it with the arguments "6 6" and was impressed by the fact that the
program now needed only 1.5 seconds instead of the 2.5 it needed with
string. But then I realised that the size of the result vector was 6709
and not the correct 2704 it was with string. I figured out that the
problem probably was the fact that the string_view was initialised from
a temporary. I tried to compile with "g++ -pedantic -Wall" but there was
no diagnostic message at all. Why? If that is not the problem then what
is?

As an aside: What find algorithm does basic_string::find() use? Does it
do something more sophisticated than the naïve way?

¹Clarke, L. E. and Singer, James: On circular permutations, The American
Mathematical Monthly (65), 1958, 609--610.

Ralf Goertz

unread,
Jun 25, 2019, 4:25:47 AM6/25/19
to
Am Tue, 25 Jun 2019 10:05:10 +0200
schrieb Ralf Goertz <m...@myprovider.invalid>:

>
> tried it with the arguments "6 6" and was impressed by the fact that

Oops, make that "9 9"

Alf P. Steinbach

unread,
Jun 25, 2019, 11:40:14 AM6/25/19
to
In Windows 10 with Visual C++ 2019 and g++ 8.2 following code
consistently produces garbage output:


#include <cppx-core/all.hpp> // <url:
https://github.com/alf-p-steinbach/cppx-core>

using namespace std;

auto main() -> int
{
string permuter = "Blah12345678901234567890"; // More than short
string buffer.
string_view const tester = permuter + permuter.substr( 0,
permuter.size() - 1 );
cout << tester << endl;
}


However, since it's UB it's not /guaranteed/ to produce garbage.

The `basic_string::find` is just a naïve direct find algorithm. More
advanced text search algorithms were added as freestanding functions in
C++17. See overlead (5) of `std::search` at <url:
https://en.cppreference.com/w/cpp/algorithm/search>. But this newfangled
stuff isn't necessarily available with any particular compiler yet.


Cheers!,

- Alf

James Kuyper

unread,
Jun 25, 2019, 9:59:00 PM6/25/19
to
On 6/25/19 4:05 AM, Ralf Goertz wrote:
...
> As an aside: What find algorithm does basic_string::find() use? Does it
> do something more sophisticated than the naïve way?

As it normally is, the standard is silent on the algorithm used,
defining only the results. There's many different overloads for find(),
and several variant of find, but they are all specified in a fashion
similar, in this regard, to the first one:

"Effects: Determines the lowest position xpos, if possible, such that
both of the following conditions
obtain:
— pos <= xpos and xpos + str.size() <= size();
— traits::eq(at(xpos+I), str.at(I)) for all elements I of the string
controlled by str." (21.4.7.2p1).

The algorithm used is chosen by the implementor, and therefore depends
upon which implementation you are using.

Ralf Goertz

unread,
Jun 26, 2019, 4:42:03 AM6/26/19
to
Am Tue, 25 Jun 2019 17:39:58 +0200
schrieb "Alf P. Steinbach" <alf.p.stein...@gmail.com>:

> In Windows 10 with Visual C++ 2019 and g++ 8.2 following code
> consistently produces garbage output:
>
>
> #include <cppx-core/all.hpp> // <url:
> https://github.com/alf-p-steinbach/cppx-core>
>
> using namespace std;
>
> auto main() -> int
> {
> string permuter = "Blah12345678901234567890"; // More than
> short string buffer.
> string_view const tester = permuter + permuter.substr( 0,
> permuter.size() - 1 );
> cout << tester << endl;
> }
>
>
> However, since it's UB it's not /guaranteed/ to produce garbage.

Okay, but shouldn't the compiler warn about that? I mean it's fairly
obvious (at least after having read how string_view is constructed).

> The `basic_string::find` is just a naïve direct find algorithm. More
> advanced text search algorithms were added as freestanding functions
> in C++17. See overlead (5) of `std::search` at <url:
> https://en.cppreference.com/w/cpp/algorithm/search>. But this
> newfangled stuff isn't necessarily available with any particular
> compiler yet.

Thanks, I changed to std::boyer_moore_searcher. That indeed sped up the
program considerably (after interchanging corpus and pattern). Now I
could also use vector<char> instead of string which also had a very
noticable effect. Together, the "9 9" case is 1 second compared to the
2.5 seconds before.

Juha Nieminen

unread,
Jun 26, 2019, 8:12:06 AM6/26/19
to
Ralf Goertz <m...@myprovider.invalid> wrote:
> string const tester=permuter+permuter.substr(0,permuter.size()-1); //*

If you are aiming for the utmost extreme speed and optimization, lines
like this should *immediately* ring alarm bells in your head.

std::string uses dynamic memory allocation, and lines like that are
doing them in droves. This is especially problematic when you are
just dealing with byte arrays of fixed sizes (with the size being
determined before the algorithm even begins), which usually means
you can get away with just one or a handful of allocations, and no
dynamic allocations of any kind during the execution of the algorithm
(other than, possibly, adding the result to some dynamic data
container, which usually isn't a problem).

Dynamic memory allocation is notoriously inefficient for many reasons,
and using pre-allocated non-size-changing buffers over and over is
often the best approach. Doing dynamic allocations in the inner loops
of your computationally heavy algorithms is going to kill the efficiency.
(There are situations where it simply can't be avoided, but if you can
avoid it, you should.)

Ralf Goertz

unread,
Jun 26, 2019, 8:42:48 AM6/26/19
to
Am Wed, 26 Jun 2019 12:11:57 -0000 (UTC)
schrieb Juha Nieminen <nos...@thanks.invalid>:

> Ralf Goertz <m...@myprovider.invalid> wrote:
> > string const
> > tester=permuter+permuter.substr(0,permuter.size()-1); //*
>
> Dynamic memory allocation is notoriously inefficient for many reasons,
> and using pre-allocated non-size-changing buffers over and over is
> often the best approach. Doing dynamic allocations in the inner loops
> of your computationally heavy algorithms is going to kill the
> efficiency. (There are situations where it simply can't be avoided,
> but if you can avoid it, you should.)

That's what I did now. (See also my answer to Alf.) The relevant changes
are the following:

size_t const size=permuter.size();
vector<char> tester(size*2-2);
do {
bool found(false);
boyer_moore_searcher b(permuter.begin(),permuter.end());
for (int i=0;i<result.size();++i) {
copy(result[i].begin()+1,result[i].end(), tester.begin());
copy(result[i].begin() ,result[i].end()-1,tester.begin()+size-1);
if (b(tester.begin(),tester.end()).first!=tester.end()) {

Ralf Goertz

unread,
Jun 27, 2019, 2:43:52 AM6/27/19
to
Am Wed, 26 Jun 2019 12:11:57 -0000 (UTC)
schrieb Juha Nieminen <nos...@thanks.invalid>:

> Ralf Goertz <m...@myprovider.invalid> wrote:
> > string const
> > tester=permuter+permuter.substr(0,permuter.size()-1); //*
>
> If you are aiming for the utmost extreme speed and optimization, lines
> like this should *immediately* ring alarm bells in your head.

I thought a little about that. While I can see your point, I remember
having heard (I am no professional programer, so I never properly learnt
C++) that every variable should be declared where it is first used.
Which somewhat conflicts with your statement:

> Doing dynamic allocations in the inner loops of your computationally
> heavy algorithms is going to kill the efficiency. (There are
> situations where it simply can't be avoided, but if you can avoid it,
> you should.)

I think I also heard that compilers usually generate code that optimises
away all but the first allocation. In my case the size of permuter
doesn't change within the loop so the memory allocated in the first run
could simply be reused in the subsequent runs, no?

Juha Nieminen

unread,
Jun 27, 2019, 3:57:32 AM6/27/19
to
Ralf Goertz <m...@myprovider.invalid> wrote:
> Am Wed, 26 Jun 2019 12:11:57 -0000 (UTC)
> schrieb Juha Nieminen <nos...@thanks.invalid>:
>
>> Ralf Goertz <m...@myprovider.invalid> wrote:
>> > string const
>> > tester=permuter+permuter.substr(0,permuter.size()-1); //*
>>
>> If you are aiming for the utmost extreme speed and optimization, lines
>> like this should *immediately* ring alarm bells in your head.
>
> I thought a little about that. While I can see your point, I remember
> having heard (I am no professional programer, so I never properly learnt
> C++) that every variable should be declared where it is first used.
> Which somewhat conflicts with your statement:

That's a principle relating to syntax, not relating to where objects
should be created. In old C variables needed to be declared at the
beginning of the function, even if they were only needed later in
the code. In other words, you had to write like:

void foo()
{
int i, j;

// lots of code here...

i = 5; j = 10;
// etc...
}

In C++, and in modern C, it's recommended for clarity to do it like:

void foo()
{
// lots of code here...

int i = 5, j = 10;
// etc...
}

Obviously the advise is not saying that if you have some buffer of fixed
size for calculations, and this buffer is used inside the innermost loop
of a heavy algorithm, you should create the buffer again and again inside
that inner loop.

> I think I also heard that compilers usually generate code that optimises
> away all but the first allocation. In my case the size of permuter
> doesn't change within the loop so the memory allocated in the first run
> could simply be reused in the subsequent runs, no?

You are not only creating the 'tester' string in the inner loop, but you
are also creating temporary strings. These strings get destroyed at the
end of the scope where they are created. The compiler isn't going
to optimize that away.

Ralf Goertz

unread,
Jun 27, 2019, 4:37:29 AM6/27/19
to
Am Thu, 27 Jun 2019 07:57:22 -0000 (UTC)
schrieb Juha Nieminen <nos...@thanks.invalid>:

> You are not only creating the 'tester' string in the inner loop, but
> you are also creating temporary strings. These strings get destroyed
> at the end of the scope where they are created. The compiler isn't
> going to optimize that away.

So the following would be okay and the allocation is done just once?
Also, will it be be initialised as the constructor says although this is
not needed at all?

size_t const size=permuter.size();
do {
bool found(false);
boyer_moore_searcher b(permuter.begin(),permuter.end());
for (int i=0;i<result.size();++i) {
string tester(size*2-2,' '); //or vector<char>
copy(result[i].begin()+1,result[i].end(),tester.begin());
copy(result[i].begin() ,result[i].end()-1,tester.begin()+size-1);
if (b(tester.begin(),tester.end()).first!=tester.end()) {
found=true;
break;
}
}

} while (next_permutation(permuter.begin()+1,permuter.end()));

James Kuyper

unread,
Jun 27, 2019, 7:27:12 AM6/27/19
to
On 6/27/19 3:57 AM, Juha Nieminen wrote:
> Ralf Goertz <m...@myprovider.invalid> wrote:
...
>> I thought a little about that. While I can see your point, I remember
>> having heard (I am no professional programer, so I never properly learnt
>> C++) that every variable should be declared where it is first used.

That's not always possible, if the scope for the variable needs to be
larger than the scope in which the first use occurs.
The key requirement is that it be declared no later than first use, and
preferably as late as possible, consistent with the way it will be used.

>> Which somewhat conflicts with your statement:
>
> That's a principle relating to syntax, not relating to where objects
> should be created. In old C variables needed to be declared at the
> beginning of the function, even if they were only needed later in
> the code. ...

That may have been true in the very earliest days of C, but it hasn't
been true since at least K&R C, which allowed you to declare objects at
the start of any compound statement, not just at the start of the
outermost compound statement of a function:

int func(in)
int in;
{
if(in)
{
int total;
...

C++ added the ability to intermingle declarations and statements, which
was later adopted by C99 as well. However, the freedom to place
declarations in positions other than the start of the function dates
back at least to K&R C.

...
> Obviously the advise is not saying that if you have some buffer of fixed
> size for calculations, and this buffer is used inside the innermost loop
> of a heavy algorithm, you should create the buffer again and again inside
> that inner loop.

Actually, I would not hesitate to define a buffer as a C style array
inside a loop, if there's no need for the contents of that buffer to
carry over from one pass of the loop to another. I would normally expect
the compiler to simply reuse the same piece of memory during each pass
through the loop, at no extra cost.

This is particularly true if there's a need to zero initialize the
buffer between passes, which can be achieved by simply providing a {0}
initializer for the buffer, rather than having to call memset().

In C++, I would hesitate to define a standard container in that
location, because I couldn't rely on the compiler to figure out how to
optimize away the constructor and destructor calls that nominally must
occur at the start and end of each pass through the loop.

Tim Rentsch

unread,
Jun 27, 2019, 8:49:48 AM6/27/19
to
Ralf Goertz <m...@myprovider.invalid> writes:

> I wrote a small program to create all distinct circular arrangements of
> n_i objects from category i. If you have, say two A's and two B's, there
> are two distinctive ways to put them in a circular order:
>
> AABB ABAB
>
> The other four (linear) arrangements ABBA(1), BAAB(1), BABA(2), BBAA(1)
> are only circular permutations of the two above (with the number in
> parenthesis indicating of which). It is surprisingly difficult to come
> up with a formula[1] for the number of such arrangements, at least in the
> case where the greatest common divisor of all the n_i is greater than 1.
> My idea for the program was to go over all linear arrangements (with the
> first element fixed) and store those which were no circular permutations
> of any of the previously stored arrangements. [...]

This algorithm does a lot more work than is needed. The program
can be much faster (I ran a 14 14 input in about 0.65 seconds).

Hint: it isn't necessary to store any previous results to check
if next_permutation() gives a string none of whose rotations have
been seen before.

Manfred

unread,
Jun 27, 2019, 10:15:55 AM6/27/19
to
On 6/27/2019 1:26 PM, James Kuyper wrote:
> On 6/27/19 3:57 AM, Juha Nieminen wrote:
>> Ralf Goertz <m...@myprovider.invalid> wrote:
> ...
>>> I thought a little about that. While I can see your point, I remember
>>> having heard (I am no professional programer, so I never properly learnt
>>> C++) that every variable should be declared where it is first used.
>
> That's not always possible, if the scope for the variable needs to be
> larger than the scope in which the first use occurs.
> The key requirement is that it be declared no later than first use, and
> preferably as late as possible, consistent with the way it will be used.
>
The "preferably" part is subject to debate, and I don't think it should
be stated as a general recommendation.
That is probably what you meant with "consistent with the way it will be
used", but it doesn't sound that clear.
Some experienced programmers consider the habit of intermingling
declarations and statements as sloppy practice, and this is not just a
matter of being old fashioned.

As an example, when I have to implement an algorithm of some sort, I
take care to identify the data elements that compose the state of the
logic involved.
Managing a component's state requires proper discipline, so I consider
such state data best declared at the top of the context where the
algorithm is defined (which may well be a nested scope), so as to make
clear that it belongs to a whole which is key to the functioning of the
following procedure.

Ralf Goertz

unread,
Jun 27, 2019, 10:52:41 AM6/27/19
to
Am Thu, 27 Jun 2019 05:49:37 -0700
schrieb Tim Rentsch <tr.1...@z991.linuxsc.com>:

> Ralf Goertz <m...@myprovider.invalid> writes:
>
> > I wrote a small program to create all distinct circular
> > arrangements of n_i objects from category i. If you have, say two
> > A's and two B's, there are two distinctive ways to put them in a
> > circular order:
> >
> > AABB ABAB
> >
> > The other four (linear) arrangements ABBA(1), BAAB(1), BABA(2),
> > BBAA(1) are only circular permutations of the two above (with the
> > number in parenthesis indicating of which). It is surprisingly
> > difficult to come up with a formula[1] for the number of such
> > arrangements, at least in the case where the greatest common
> > divisor of all the n_i is greater than 1. My idea for the program
> > was to go over all linear arrangements (with the first element
> > fixed) and store those which were no circular permutations of any
> > of the previously stored arrangements. [...]
>
> This algorithm does a lot more work than is needed. The program
> can be much faster (I ran a 14 14 input in about 0.65 seconds).

What number of permutations did you get?

> Hint: it isn't necessary to store any previous results to check
> if next_permutation() gives a string none of whose rotations have
> been seen before.

I am surprised to hear that. How else do I know that when I get to the
permutation ABBA in the example above, it is the same as AABB so that I
must discard it? The example also shows that the equivalence classes are
in general not of the same size which is the very reason the formula is
rather complicated. (The reference I gave corrected an erroneous
previous attempt by another author.) 15 years ago I thought a lot about
circular permutations and I still don't think I have an intuitive
understanding of them. At that time I lacked the ability (and the
computer power) to enumerate them. 14 14 wouldn't have sufficed anyway.
The goal was to prove that for certain inputs there where permutations
without neighbouring elements of the same category. The smallest
relevant example (which is [probably] the only such example where there
is no such permutation) is 3 1 1. I ended up proving the theorem for
infinitely many (but not all) inputs with a completely different
approach. But the idea of coming to grips with the problem by looking at
the actual permutations somehow stuck.

So I am not only surprised that there is a much simpler algorithm but
also very impressed that you found it with apparent ease. Regrettably,
your hint doesn't help me enough. So could you please elaborate?



James Kuyper

unread,
Jun 27, 2019, 12:00:03 PM6/27/19
to
On Thursday, June 27, 2019 at 10:15:55 AM UTC-4, Manfred wrote:
> On 6/27/2019 1:26 PM, James Kuyper wrote:
> > On 6/27/19 3:57 AM, Juha Nieminen wrote:
> >> Ralf Goertz <m...@myprovider.invalid> wrote:
> > ...
> >>> I thought a little about that. While I can see your point, I remember
> >>> having heard (I am no professional programer, so I never properly learnt
> >>> C++) that every variable should be declared where it is first used.
> >
> > That's not always possible, if the scope for the variable needs to be
> > larger than the scope in which the first use occurs.
> > The key requirement is that it be declared no later than first use, and
> > preferably as late as possible, consistent with the way it will be used.
> >
> The "preferably" part is subject to debate, and I don't think it should
> be stated as a general recommendation.

I do.

> That is probably what you meant with "consistent with the way it will be
> used",

I don't think so.
The main thing I meant by that comment is that each variable must be
declared in a location that gives it sufficient scope to be referred to
by name in all of the parts of the code that need to refer to it by
name. That requirement sometimes constrains how close a declaration can
be to the point of first use. That isn't the issue you're talking about.

> Some experienced programmers consider the habit of intermingling
> declarations and statements as sloppy practice, and this is not just a
> matter of being old fashioned.

I know that they do, and I disagree. I think that declaring an
identifier as close as possible to the point of first use makes it
easier to find and understand, and minimizes the opportunities for
conflict with other identifiers. Some people feel that a different
location, such as the top of the function, would make it easier to find,
but I haven't seen a reasonable explanation for that feeling.

Tim Rentsch

unread,
Jun 27, 2019, 12:57:34 PM6/27/19
to
Manfred <non...@add.invalid> writes:

> On 6/27/2019 1:26 PM, James Kuyper wrote:
>
>> On 6/27/19 3:57 AM, Juha Nieminen wrote:
>>
>>> Ralf Goertz <m...@myprovider.invalid> wrote:
>>
>> ...
>>
>>>> I thought a little about that. While I can see your point, I
>>>> remember having heard (I am no professional programer, so I never
>>>> properly learnt C++) that every variable should be declared where
>>>> it is first used.
>>
>> That's not always possible, if the scope for the variable needs to
>> be larger than the scope in which the first use occurs.

>> The key requirement is that it be declared no later than first use,
>> and preferably as late as possible, consistent with the way it will
>> be used.
>
> The "preferably" part is subject to debate, and I don't think it
> should be stated as a general recommendation. [...]
> Some experienced programmers consider the habit of intermingling
> declarations and statements as sloppy practice, and this is not
> just a matter of being old fashioned.
>
> As an example, when I have to implement an algorithm of some sort,
> I take care to identify the data elements that compose the state
> of the logic involved.
> Managing a component's state requires proper discipline, so I
> consider such state data best declared at the top of the context
> where the algorithm is defined (which may well be a nested scope),
> so as to make clear that it belongs to a whole which is key to the
> functioning of the following procedure.

I agree with your sentiments. The policy of declaring variables
"preferably as late as possible" is overly simplistic, and over
sold.

Tim Rentsch

unread,
Jun 27, 2019, 1:16:28 PM6/27/19
to
Ralf Goertz <m...@myprovider.invalid> writes:

> Am Thu, 27 Jun 2019 05:49:37 -0700
> schrieb Tim Rentsch <tr.1...@z991.linuxsc.com>:
>
>> Ralf Goertz <m...@myprovider.invalid> writes:
>>
>>> I wrote a small program to create all distinct circular
>>> arrangements of n_i objects from category i. If you have, say two
>>> A's and two B's, there are two distinctive ways to put them in a
>>> circular order:
>>>
>>> AABB ABAB
>>>
>>> The other four (linear) arrangements ABBA(1), BAAB(1), BABA(2),
>>> BBAA(1) are only circular permutations of the two above (with the
>>> number in parenthesis indicating of which). It is surprisingly
>>> difficult to come up with a formula[1] for the number of such
>>> arrangements, at least in the case where the greatest common
>>> divisor of all the n_i is greater than 1. My idea for the program
>>> was to go over all linear arrangements (with the first element
>>> fixed) and store those which were no circular permutations of any
>>> of the previously stored arrangements. [...]
>>
>> This algorithm does a lot more work than is needed. [...]
>> Hint: it isn't necessary to store any previous results to check
>> if next_permutation() gives a string none of whose rotations have
>> been seen before.
>
> I am surprised to hear that. How else do I know that when I get
> to the permutation ABBA in the example above, it is the same as
> AABB so that I must discard it? [...] Regrettably, your hint
> doesn't help me enough. So could you please elaborate?

Consider looking at ABBA. It has four rotations, which is to say
the original plus three non-zero rotations:

ABBA
BBAA
BAAB
AABB

The last of these comes lexicographically before ABBA. Therefore
we must have seen AABB earlier, and ABBA is a duplicate.

(The "earlier" in the above statement relies on next_permutation()
giving its results in lexicographic order, but if what we're doing
is counting them the order doesn't matter as long as we eventually
get every permutation - we will count the "smallest" rotations,
and skip the rest, in whatever order they might appear.)

Ralf Goertz

unread,
Jun 27, 2019, 2:10:39 PM6/27/19
to
Am Thu, 27 Jun 2019 10:16:17 -0700
schrieb Tim Rentsch <tr.1...@z991.linuxsc.com>:

> Consider looking at ABBA. It has four rotations, which is to say the
> original plus three non-zero rotations:
>
> ABBA
> BBAA
> BAAB
> AABB
>
> The last of these comes lexicographically before ABBA. Therefore
> we must have seen AABB earlier, and ABBA is a duplicate.
>
> (The "earlier" in the above statement relies on next_permutation()
> giving its results in lexicographic order, but if what we're doing
> is counting them the order doesn't matter as long as we eventually
> get every permutation - we will count the "smallest" rotations,
> and skip the rest, in whatever order they might appear.)

Great, thanks! I still need 2.7 seconds for 14 14, but that was not
doable before. Maybe you've got some insight as to why I still seem to
be much slower than you (assuming our hardware is comparable).

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <string>

using namespace std;

template<typename T> bool has_smaller_permutation(T const &v) {
static T rotator(v.size(),' ');
for (auto i=1;i<v.size();++i) {
copy(v.begin(),v.begin()+i,copy(v.begin()+i,v.end(),rotator.begin()));
if (rotator<v) return true;
}
return false;
}

int main(int argc, char *argv[]) {
vector<char> permuter;
//string permuter;
char c='A';
for (auto i=1;i<argc;++i) {
auto n=stoi(argv[i]);
permuter.insert(permuter.end(),n,c++);
}
if (!permuter.size()) {
cout<<0<<endl;
return 0;
}
vector<string> result;
size_t sum=0;
do {
if (!has_smaller_permutation(permuter)) {
//copy(permuter.begin(),permuter.end(),ostream_iterator<char>(cout,"")); cout<<'\n';
++sum;
}
} while (next_permutation(permuter.begin()+1,permuter.end()));
cout<<sum<<endl;
}

Keith Thompson

unread,
Jun 27, 2019, 3:07:04 PM6/27/19
to
James Kuyper <james...@alumni.caltech.edu> writes:
> On 6/27/19 3:57 AM, Juha Nieminen wrote:
[...]
>> That's a principle relating to syntax, not relating to where objects
>> should be created. In old C variables needed to be declared at the
>> beginning of the function, even if they were only needed later in
>> the code. ...
>
> That may have been true in the very earliest days of C, but it hasn't
> been true since at least K&R C, which allowed you to declare objects at
> the start of any compound statement, not just at the start of the
> outermost compound statement of a function:
>
> int func(in)
> int in;
> {
> if(in)
> {
> int total;
> ...
>
> C++ added the ability to intermingle declarations and statements, which
> was later adopted by C99 as well. However, the freedom to place
> declarations in positions other than the start of the function dates
> back at least to K&R C.
[...]

Yes. Historical note: In the 1975 C Reference Manual, declarations
are allowed only at the top of a function definition. Compound
statements cannot contain declarations. K&R1, 1978, redefined
compound statements to allow zero or more declarations followed by
zero or more statements.

https://www.bell-labs.com/usr/dmr/www/cman.pdf

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

unread,
Jun 27, 2019, 8:17:33 PM6/27/19
to
On 6/27/19 3:06 PM, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
...
>> That may have been true in the very earliest days of C, but it hasn't
>> been true since at least K&R C, which allowed you to declare objects at
>> the start of any compound statement, not just at the start of the
>> outermost compound statement of a function:
>>
>> int func(in)
>> int in;
>> {
>> if(in)
>> {
>> int total;
>> ...
>>
>> C++ added the ability to intermingle declarations and statements, which
>> was later adopted by C99 as well. However, the freedom to place
>> declarations in positions other than the start of the function dates
>> back at least to K&R C.
> [...]
>
> Yes. Historical note: In the 1975 C Reference Manual, declarations
> are allowed only at the top of a function definition. Compound
> statements cannot contain declarations. K&R1, 1978, redefined
> compound statements to allow zero or more declarations followed by
> zero or more statements.
>
> https://www.bell-labs.com/usr/dmr/www/cman.pdf

In addition to allowing declarations in compound-statements, K&R C also
made a confusing step forward toward the modern syntax rules for a
function definitions.

Looking over cman.pdf, I see that it specified the following grammar
productions in section 10.1 "External function definitions". and
duplicated in Appendix 1 "Syntax Summary", section 4:

function-definition:
type-specifier opt function-declarator function-body

function-body:
type-decl-list function-statement

function-statement:
{ declaration-list opt statement-list }

In K&R C, under section 18 "Syntax summary", you can find exactly that
same grammar in sub-section 4. However, in the main text, in section
10.1 "External function definitions", the middle rule in that chain is
different:

function-body:
declaration-list compound-statement

Alf P. Steinbach

unread,
Jun 27, 2019, 9:31:34 PM6/27/19
to
I cooked up this code, you can check if it's faaaster or slooower:

(The $use_std macro invocation expands to `using std::string, ...;`, and
ditto for the $use_cppx invocation. The library just reduces verbosity.)


#include <cppx-core/all.hpp> // <url:
https://github.com/alf-p-steinbach/cppx-core>
using namespace std::literals;
$use_std( string, string_view, next_permutation, cout, endl );
$use_cppx( string_repeat::operator*, Range );

auto is_first_of_rotations( const string& s, string& rotations_buffer )
-> bool
{
const int n = s.length();
rotations_buffer = s;
rotations_buffer += s;
for( const int i : Range( 1, n - 1 ) ) {
if( string_view( &rotations_buffer[i], n ) < s ) {
return false;
}
}
return true;
}

auto main() -> int
{
const int k = 14;
string s = k*"A"s + k*"B"s;
auto rotations_buffer = string( 2*s.length(), '.' );
int count = 0;
do
{
if( is_first_of_rotations( s, rotations_buffer ) ) {
//cout << s << endl;
++count;
}
} while( next_permutation( $items_of( s ) ) );
cout << count << " circular permutations of " << k << "A+" << k <<
"B." << endl;
}


Cheers!,

- Alf

Juha Nieminen

unread,
Jun 28, 2019, 2:44:05 AM6/28/19
to
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
> (The $use_std macro invocation expands to `using std::string, ...;`, and
> ditto for the $use_cppx invocation. The library just reduces verbosity.)

When you are posting here, why do you insist in using such non-standard code
that only makes it harder for somebody to test it? Or even understand it?

Are you, perhaps, naive enough to think that if you keep using your pet
library in your usenet posts, it will become popular?

This newsgroup is about *standard* C++. How about we keep all code
standard as well, and preferably not needlessly dependent on some
third-party libraries, especially when the point of the code has
absolutely nothing to do with them? Is that too much to ask?

Tim Rentsch

unread,
Jun 28, 2019, 2:56:11 AM6/28/19
to
Ralf Goertz <m...@myprovider.invalid> writes:

> Am Thu, 27 Jun 2019 10:16:17 -0700
> schrieb Tim Rentsch <tr.1...@z991.linuxsc.com>:
>
>> Consider looking at ABBA. It has four rotations, which is to say the
>> original plus three non-zero rotations:
>>
>> ABBA
>> BBAA
>> BAAB
>> AABB
>>
>> The last of these comes lexicographically before ABBA. Therefore
>> we must have seen AABB earlier, and ABBA is a duplicate.
>>
>> (The "earlier" in the above statement relies on next_permutation()
>> giving its results in lexicographic order, but if what we're doing
>> is counting them the order doesn't matter as long as we eventually
>> get every permutation - we will count the "smallest" rotations,
>> and skip the rest, in whatever order they might appear.)
>
> Great, thanks! I still need 2.7 seconds for 14 14, but that was not
> doable before. Maybe you've got some insight as to why I still seem to
> be much slower than you (assuming our hardware is comparable).

Briefly:

(1) My code had only one string, twice as long as the string
being permuted, with next_permutation() being done on the first
half, then copying the first half into the second half (using
memcpy()) to set up the rotations checks.

(2) Except for initializing and next_permutation(), my code did
everything with 'const char *', not strings or string_views.
Here are the functions that do the rotations check:

int
is_first( const char *s, unsigned k ){
unsigned i;
for( i = 1; i < k; i++ ){
if( is_before( s+i, s, k ) ) return 0;
}
return 1;
}

int
is_before( const char *a, const char *b, unsigned k ){
return
k == 0 || *a > *b ? 0 :
*a < *b ? 1 :
is_before( a+1, b+1, k-1 );
}

(3) Optimization level -O3 gave maybe a 5% improvement over -O2.

(4) What seemed to work best was "inlining" is_before() into the
body of is_first(), but not is_first() into its caller. I don't
know why, but since it seemed to help that's what I did.

Alf P. Steinbach

unread,
Jun 28, 2019, 3:55:07 AM6/28/19
to
You're asking that the relevant library code be duplicated in each
posting here.

That would be idiotic, if you think about it.

There's even a bunch of acronyms designed to help programmers avoid the
ungood practice of duplicating code.

One of them is "DRY": Don't Repeat Yourself.

I'm not going to add more code to postings when that common code is much
better referred to on GitHub.


Cheers!,

- Alf

Geoff

unread,
Jun 28, 2019, 4:14:29 PM6/28/19
to
On Fri, 28 Jun 2019 09:54:54 +0200, "Alf P. Steinbach"
<alf.p.stein...@gmail.com> wrote:

>On 28.06.2019 08:43, Juha Nieminen wrote:
>> Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
>>> (The $use_std macro invocation expands to `using std::string, ...;`, and
>>> ditto for the $use_cppx invocation. The library just reduces verbosity.)
>>
>> When you are posting here, why do you insist in using such non-standard code
>> that only makes it harder for somebody to test it? Or even understand it?
>>
>> Are you, perhaps, naive enough to think that if you keep using your pet
>> library in your usenet posts, it will become popular?
>>
>> This newsgroup is about *standard* C++. How about we keep all code
>> standard as well, and preferably not needlessly dependent on some
>> third-party libraries, especially when the point of the code has
>> absolutely nothing to do with them? Is that too much to ask?
>
>You're asking that the relevant library code be duplicated in each
>posting here.
>

He's not asking that at all. He's asking you to stop posting code
that's dependent on your non-standard library.

>That would be idiotic, if you think about it.
>

The OP's code was fairly standard C++, your code is not.
That's idiotic.

>There's even a bunch of acronyms designed to help programmers avoid the
>ungood practice of duplicating code.
>
>One of them is "DRY": Don't Repeat Yourself.
>
>I'm not going to add more code to postings when that common code is much
>better referred to on GitHub.
>

There's another one - GCIGC: Garbage Code Is Garbage Code.

The OP would be better off ignoring your posts.

Alf P. Steinbach

unread,
Jun 28, 2019, 7:01:29 PM6/28/19
to
I seem to remember killfiling you years ago.

Anyway, plink.

Mr Flibble

unread,
Jun 28, 2019, 7:16:11 PM6/28/19
to
So, Alf, how many people telling you that what you are doing is fucking
egregious will it take for you to stop posting fucking egregious code?

Please when posting snippets that are answers to questions:
1) DO NOT use your personal non-standard fucktarded library that flings
dollar sign shit all over the walls of the room;
2) KEEP your use of auto return type deduction to the cases that require it;
3) DO NOT write auto main() ... (see (2)) as it makes you look like a
trolling cunt fuckwomble who just wants to irritate everyone.

/Flibble

--
"Snakes didn't evolve, instead talking snakes with legs changed into
snakes." - Rick C. Hodgin

“You won’t burn in hell. But be nice anyway.” – Ricky Gervais

“I see Atheists are fighting and killing each other again, over who
doesn’t believe in any God the most. Oh, no..wait.. that never happens.” –
Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

Melzzzzz

unread,
Jun 28, 2019, 7:18:06 PM6/28/19
to
As much as I don't like style I am always for free speach :P
>


--
press any key to continue or any other to quit...
U ničemu ja ne uživam kao u svom statusu INVALIDA -- Zli Zec
Na divljem zapadu i nije bilo tako puno nasilja, upravo zato jer su svi
bili naoruzani. -- Mladen Gogala

Alf P. Steinbach

unread,
Jun 28, 2019, 7:40:10 PM6/28/19
to
There's nothing non-standard about the library, other than purely
formally some macro names that you can trivially transform to formally
conforming.

You have been so informed numerous times.

You keep forgetting.


> that flings
> dollar sign shit all over the walls of the room;

Reiterating what you've been told several times before:

You can trivially replace each macro name's dollar sign with prefix
`CPPX_`, and just uppercase the rest, to avoid the /emotional/ affront
to you.

You can even do that in your head, no editing effort involved, since
examples posted in clc++ are seldom posted for their output but for
understanding things.

When you don't that means you're lazy, or trolling, or both.


> 2) KEEP your use of auto return type deduction to the cases that require
> it;
> 3) DO NOT write auto main() ... (see (2)) as it makes you look like a
> trolling cunt fuckwomble who just wants to irritate everyone.

Well, the swearing and strong reaction to modern C++ syntax says it all.


Cheers & hth.,

- Alf

Juha Nieminen

unread,
Jul 1, 2019, 3:26:39 AM7/1/19
to
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
> On 28.06.2019 08:43, Juha Nieminen wrote:
>> Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
>>> (The $use_std macro invocation expands to `using std::string, ...;`, and
>>> ditto for the $use_cppx invocation. The library just reduces verbosity.)
>>
>> When you are posting here, why do you insist in using such non-standard code
>> that only makes it harder for somebody to test it? Or even understand it?
>>
>> Are you, perhaps, naive enough to think that if you keep using your pet
>> library in your usenet posts, it will become popular?
>>
>> This newsgroup is about *standard* C++. How about we keep all code
>> standard as well, and preferably not needlessly dependent on some
>> third-party libraries, especially when the point of the code has
>> absolutely nothing to do with them? Is that too much to ask?
>
> You're asking that the relevant library code be duplicated in each
> posting here.
>
> That would be idiotic, if you think about it.

This newsgroup is about standard C++. Unless the thread is about a
particular third-party library, keep all third-party libraries out
of the code, unless absolutely necessary.

Your pet personal library is *far* for necessary. It's completely
superfluous and only makes it harder for anybody to do anything
with any code you post, for no good reason.

Even if you have to repeat a few lines of code in more than one post,
so what? Who cares? Answers can share code, if they are relevant to
the topic at hand.

For some reason you are enamored with this pet personal library of
yours, which, frankly, feels lie what a very newbie programmer would
do. I also get the feeling that you, indeed, have a very naive notion
that if you keep using it in this newsgroup, it might get adopted and
popular. I highly doubt that will ever happen, so just stop it. It's
only annoying.
0 new messages