Shorthand ~~ pointers

Rick C. Hodgin

Dec 20, 2018, 10:18:37 AM
Note: This is a follow-up to a prior thread entitled "Shorthand ~> pointers"

On 12/19/2018 9:31 PM, Rick C. Hodgin wrote:
> I've had the idea to introduce the ~> symbol for shorthand pointers.
>
> These are pointers which access uniquely named child member objects
> without having to traverse the child member hierarchy to get there:
>
> // Define a structure hierarchy example
> struct SGrandChild
> {
>     s32 gcInt;
> };
>
> struct SChild
> {
>     SGrandChild gc;
> };
>
> struct SExample
> {
>     SChild* child;
> };
>
> // Use in code:
> SExample* e = generate_new_e();
> e->child->gc.gcInt = 5; // Traditional access
> e~>gcInt = 5;           // Shorthand access
>
> It would only work for uniquely named members, but could also be
> used from a relative point where, from that point down the rest
> of the hierarchy it is then unique.

Based on some follow-up suggestions, I'm leaning toward this syntax:

e~~gcInt = 5;

In that way, the ~~ indicates that there are members in between that
are not named (but are assumed to form a distinct chain). It also
differentiates more clearly between the ~ and - symbols in certain
fonts and on certain monitors, and conveys the general idea that "this
is the only gcInt among e's progeny being referenced here."

Having two symbols would make it a notably clearer delineator. Whatever
font is used here in Thunderbird makes it very clear. I like seeing
that degree of clarity in a font.
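
For comparison, here is a minimal sketch of how the same access can be written in standard C++ today, which has no ~~ operator. It assumes s32 is a 32-bit signed integer, and gcIntOf is a hypothetical helper name used only for illustration:

```cpp
#include <cstdint>

using s32 = std::int32_t;  // assumption: s32 is a 32-bit signed integer

struct SGrandChild { s32 gcInt; };
struct SChild      { SGrandChild gc; };
struct SExample    { SChild* child; };

// Hypothetical helper: names the full member path once and hands back a
// reference, so callers do not have to repeat e->child->gc.gcInt.
inline s32& gcIntOf(SExample* e)
{
    return e->child->gc.gcInt;
}
```

With this, gcIntOf(e) = 5; stores through the whole chain, at the cost of writing one helper per member.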

--
Rick C. Hodgin

Ben Bacarisse

Dec 20, 2018, 12:00:29 PM
"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> Note: This is a follow-up to a prior thread entitled "Shorthand ~>
> pointers"

It's also what is called multi-posted. Cross posting is almost always
better in these cases. The replies you've got on comp.lang.c apply here
too, but by multi-posting people have to keep replying twice or leave it
to others who are unaware that a particular point has been made.

--
Ben.

Rick C. Hodgin

Dec 20, 2018, 12:03:03 PM
I updated it here to include both groups in my reply.

--
Rick C. Hodgin

Rick C. Hodgin

Dec 20, 2018, 12:03:54 PM
On Thursday, December 20, 2018 at 11:44:34 AM UTC-5, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> <snip>
>> Based on some follow-up suggestions, I'm leaning toward this syntax:
>>
>> e~~gcInt = 5;
>
> Without further changes to the standard, this will break existing code
> because ~~ can already occur in C, and must be processed as two tokens.
> Obviously you could change the lexing and/or parsing rules or you could
> just accept that this is a "breaking" change. Very little code will
> have ~~ in it.

Where is the e~~gcInt legal today? I believe ~~ would require an
operator before it to be a legal syntax, such as x = ~~y; Or to
use it in something like x = a + ~~b;, etc.

I do not see where the syntax I propose would break any existing
legal syntax in C or C++ code.

--
Rick C. Hodgin

Ben Bacarisse

Dec 20, 2018, 12:17:38 PM
"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Thursday, December 20, 2018 at 11:44:34 AM UTC-5, Ben Bacarisse wrote:
>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>> <snip>
>>> Based on some follow-up suggestions, I'm leaning toward this syntax:
>>>
>>> e~~gcInt = 5;
>>
>> Without further changes to the standard, this will break existing code
>> because ~~ can already occur in C, and must be processed as two tokens.
>> Obviously you could change the lexing and/or parsing rules or you could
>> just accept that this is a "breaking" change. Very little code will
>> have ~~ in it.
>
> Where is the e~~gcInt legal today? I believe ~~ would require an
> operator before it to be a legal syntax, such as x = ~~y; Or to
> use it in something like x = a + ~~b;, etc.
>
> I do not see where the syntax I propose would break any existing
> legal syntax in C or C++ code.

I thought you were writing a C compiler? C's rules call for punctuation
symbols to be made from the longest possible sequence. I.e. if ~~
becomes a punctuator, ~~ will be taken to be one token not two. Of course,
as I said, you can change /more/ things to make this all work. You
could invent a new unary operator ~~x with the meaning of ~(~x), or you
could make the lex and/or parse context sensitive.

C++ might be prepared to be more forgiving since it already has a
special case for >>. But then C++'s rules for accessing member names are
likely to mean that the committee would not touch this with a barge
pole.
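
To make the longest-sequence rule concrete, here is a rough sketch of a longest-match tokeniser. It is not taken from any real compiler; tokenize and its punctuator list are hypothetical names, and it handles only identifiers, numbers and punctuators:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split `src` into tokens, always taking the longest
// punctuator from `puncts` that matches at the current position.
std::vector<std::string> tokenize(const std::string& src,
                                  const std::vector<std::string>& puncts)
{
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            // Identifier or number: consume the whole run of word characters.
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_'))
                ++j;
            out.push_back(src.substr(i, j - i));
            i = j;
            continue;
        }
        // Longest match over the punctuator list.
        std::string best;
        for (const std::string& p : puncts)
            if (p.size() > best.size() && src.compare(i, p.size(), p) == 0)
                best = p;
        if (best.empty()) best = std::string(1, src[i]);  // unknown character: emit as-is
        out.push_back(best);
        i += best.size();
    }
    return out;
}
```

With the punctuators {"~", "=", ";"} the input x=~~y; yields x = ~ ~ y ; whereas adding "~~" to the list yields x = ~~ y ; for the same input.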

--
Ben.

Rick C. Hodgin

Dec 20, 2018, 12:32:16 PM
On 12/20/2018 12:17 PM, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
>> On Thursday, December 20, 2018 at 11:44:34 AM UTC-5, Ben Bacarisse wrote:
>>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>>> <snip>
>>>> Based on some follow-up suggestions, I'm leaning toward this syntax:
>>>>
>>>> e~~gcInt = 5;
>>>
>>> Without further changes to the standard, this will break existing code
>>> because ~~ can already occur in C, and must be processed as two tokens.
>>> Obviously you could change the lexing and/or parsing rules or you could
>>> just accept that this is a "breaking" change. Very little code will
>>> have ~~ in it.
>>
>> Where is the e~~gcInt legal today? I believe ~~ would require an
>> operator before it to be a legal syntax, such as x = ~~y; Or to
>> use it in something like x = a + ~~b;, etc.
>>
>> I do not see where the syntax I propose would break any existing
>> legal syntax in C or C++ code.
>
> I thought you were writing a C compiler? C's rules call for punctuation
> symbols to be made from the longest possible sequence. I.e. if ~~
> becomes a punctuator, ~~ will be taken to be one token not two. Of course,
> as I said, you can change /more/ things to make this all work. You
> could invent a new unary operator ~~x with the meaning of ~(~x), or you
> could make the lex and/or parse context sensitive.

Again, I ask you, where would e~~gcInt be legal today?

> C++ might be prepared to be more forgiving since it already has a
> special case for >>. But then C++'s rules for accessing member names are
> likely to mean that the committee would not touch this with a barge
> pole.

--
Rick C. Hodgin

james...@alumni.caltech.edu

Dec 20, 2018, 12:39:34 PM
On Thursday, December 20, 2018 at 12:32:16 PM UTC-5, Rick C. Hodgin wrote:
> On 12/20/2018 12:17 PM, Ben Bacarisse wrote:
...
> > I thought you were writing a C compiler? C's rules call for punctuation
> > symbols to be made from the longest possible sequence. I.e. if ~~
> > becomes a punctuator, ~~ will be taken to be one token not two. Of course,
> > as I said, you can change /more/ things to make this all work. You
> > could invent a new unary operator ~~x with the meaning of ~(~x), or you
> > could make the lex and/or parse context sensitive.
>
> Again, I ask you, where would e~~gcInt be legal today?

It wouldn't - but e=~~gcInt would be. If ~~ were added as a valid
token then, because of the maximal munch rule, e=~~gcInt would
necessarily be parsed as using that token rather than as two
consecutive ~ tokens, and the result would be a syntax error.
I seem to recall that you disdain the maximal munch rule; but you'd have
a very hard time convincing either the C or the C++ committees to accept
dropping it.
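
Maximal munch is already visible in the language with + and ++. A small sketch (munch_demo is just an illustrative name):

```cpp
// a+++b is tokenised as a ++ + b, because the lexer takes the longest
// token it can (++) before looking at what follows. It is never read
// as a + ++b.
int munch_demo(int& a, int& b)
{
    return a+++b;
}
```

Calling it with a = 1 and b = 10 returns 11 and leaves a == 2 and b == 10, which is what (a++) + b does.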

Rick C. Hodgin

Dec 20, 2018, 12:44:39 PM
~~ would be a separate token from ~(~x) as it is today. It would be
like + and ++.

In addition, because it's used in a different syntax, it should not be
any kind of issue. The C++ committee already accepted >> and << which
are in far greater use than ~~ (at least in my observation over the
years).

--
Rick C. Hodgin

Ben Bacarisse

Dec 20, 2018, 1:17:29 PM
Yes, but /why/ are you asking? We both know the answer to /that/
question (I hope), but you seemed unaware that adding a new punctuator
that can already occur in other contexts would require other changes to
the language. If you were also aware of that, then a simple, "yes I
know" would have done.

--
Ben.

Rick C. Hodgin

Dec 20, 2018, 1:38:37 PM
On 12/20/2018 12:57 PM, James Kuyper wrote:
> On Thursday, December 20, 2018 at 12:44:37 PM UTC-5, Rick C. Hodgin wrote:
>> ~~ would be a separate token from ~(~x) as it is today. It would be
>> like + and ++.
>
> What does "as it is today" refer to? "~~" is NOT a separate token today. It parses as two consecutive "~" tokens.

I am not aware of that operator as two consecutive ~~ characters. I
did a search for it for C or C++ and didn't find it. Only the single
~ character.

What does ~~ do today?

And if the syntax it uses today is different than the syntax that I
propose, such that there is no legal use of ~~ in the syntax I propose
today, then why would it matter? It would be parsed as two different
uses of the same operator, like << and << for one use to bit shift,
one use to direct / pipe as with cout.

--
Rick C. Hodgin

Rick C. Hodgin

Dec 20, 2018, 1:47:40 PM
I don't see it listed:

https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B

And when I do a test using:

unsigned int i, j;
j = 5;
i = ~~j;

It appears to be doing:

i = ~(~j);

Which is the same as:

i = j;
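
A compile-time version of the same test, as a small sketch (complement_twice is a made-up name):

```cpp
// Applies bitwise complement twice; the two ~ are separate unary operators.
constexpr unsigned int complement_twice(unsigned int j)
{
    return ~~j;
}

static_assert(complement_twice(5u) == 5u, "~~j equals j for unsigned int");
static_assert(~5u != 5u, "a single ~ does change the value");
```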

--
Rick C. Hodgin

Bart

Dec 20, 2018, 2:00:02 PM
!! is often used, but !! is two ! tokens one after the other.

Your proposal would either mean processing ~~ as one new brand-new token,
or introducing a new binary operator that consists of two ~ tokens.

In the first case, that means dismissing the current use of ~~x. Even
though it is very rare (and not useful), it effectively means you are
removing it from the language, or need special rules to still allow it
(eg. requiring white space or parentheses).

In the second case, it means you can write "a~ ~b" as well as "a~~b",
which you probably don't want.

(BTW here's another case of ambiguity:

a ~~ b ~~ c

assuming successive ~~ ops are allowed.)

Rick C. Hodgin

Dec 20, 2018, 2:04:07 PM
On 12/20/2018 1:59 PM, Bart wrote:
> On 20/12/2018 18:39, Rick C. Hodgin wrote:
>> On 12/20/2018 12:57 PM, James Kuyper wrote:
>>> On Thursday, December 20, 2018 at 12:44:37 PM UTC-5, Rick C. Hodgin wrote:
>>>> ~~ would be a separate token from ~(~x) as it is today.  It would be
>>>> like + and ++.
>>>
>>> What does "as it is today" refer to? "~~" is NOT a separate token today.
>>> It parses as two consecutive "~" tokens.
>>
>> I am not aware of that operator as two consecutive ~~ characters.  I
>> did a search for it for C or C++ and didn't find it.  Only the single
>> ~ character.
>>
>> What does ~~ do today?
>>
>> And if the syntax it uses today is different than the syntax that I
>> propose, such that there is no legal use of ~~ in the syntax I propose
>> today, then why would it matter?  It would be parsed as two different
>> uses of the same operator, like << and << for one use to bit shift,
>> one use to direct / pipe as with cout.
>>
>
> !! is often used, but !! is two ! tokens one after the other.

I've seen that before and I've seen it's been explained here, but I
still don't know what it does. I've seen it in Linux source code.

> Your proposal would either mean processing ~~ as one new brand-new token, or
> introducing a new binary operator that consists of two ~ tokens.

I still don't see where ~~ is an operator. I'd like to see a project
that uses that functionality. I find Javascript references to it
online, but none for C or C++.

Show me where it's used in SQLite, or Blender or some open source
software of significance.

> In the first case, that means dismissing the current use of ~~x. Even though
> it is very rare (and not useful), it effectively means you are removing it
> from the language, or need special rules to still allow it (eg. requiring
> white space or parentheses).
>
> In the second case, it means you can write "a~   ~b" as well as "a~~b", which
> you probably don't want.

The syntax I propose would be exactly:

[ptr name][~~][member name]

No whitespaces in there, and it would be distinct from other valid
syntaxes that may use ~~ as an operator in C or C++.

> (BTW here's another case of ambiguity:
>
>     a ~~ b ~~ c
>
> assuming successive ~~ ops are allowed.)

No ambiguity if there is a->i->j->b->k->l->c... it would take a~~b as
the first run, bypassing i and j, and then b~~c as the second run,
bypassing k and l.

--
Rick C. Hodgin

Bart

Dec 20, 2018, 2:04:51 PM
You might see it used like this:

#define M(x) ~x
~M(5);

The result after preprocessing is ~~5. Although it is a no-op, it would
be expected to work as a no-op, using two ~ tokens with no intervening
space.
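
A compilable form of that case, as a sketch (twice_complemented is just an illustrative name):

```cpp
#define M(x) ~x

// After macro expansion the initialiser is the token sequence ~ ~ 5,
// which evaluates to ~(~5), i.e. 5.
constexpr int twice_complemented = ~M(5);
static_assert(twice_complemented == 5, "two complements cancel out");
```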

Rick C. Hodgin

Dec 20, 2018, 2:07:39 PM
On 12/20/2018 2:04 PM, Bart wrote:
> You might see it used like this:
>
>     #define M(x) ~x
>     ~M(5);
>
> The result after preprocessing is ~~5. Although it is a no-op, it would be
> expected to work as a no-op, using two ~ tokens with no intervening space.

It would still be a separate and distinct syntax. It would preprocess
out as:

~M(5);
~~5;
5;

It would not be the same syntax as:

ptr~~member;

Even though it might use the same ~~ symbol, it's used differently, like
the way << and >> are used in C++ differently than when they're used as
i = j << 2; A completely different meaning from cout << 2;

--
Rick C. Hodgin

Rick C. Hodgin

Dec 20, 2018, 2:09:27 PM
On 12/20/2018 1:48 PM, Rick C. Hodgin wrote:
> On 12/20/2018 1:39 PM, Rick C. Hodgin wrote:
>> On 12/20/2018 12:57 PM, James Kuyper wrote:
>>> On Thursday, December 20, 2018 at 12:44:37 PM UTC-5, Rick C. Hodgin wrote:
>>>> ~~ would be a separate token from ~(~x) as it is today.  It would be
>>>> like + and ++.
>>>
>>> What does "as it is today" refer to? "~~" is NOT a separate token today.
>>> It parses as two consecutive "~" tokens.
>>
>> I am not aware of that operator as two consecutive ~~ characters.  I
>> did a search for it for C or C++ and didn't find it.  Only the single
>> ~ character.
>>
>> What does ~~ do today?
>>
>> And if the syntax it uses today is different than the syntax that I
>> propose, such that there is no legal use of ~~ in the syntax I propose
>> today, then why would it matter?  It would be parsed as two different
>> uses of the same operator, like << and << for one use to bit shift,
>> one use to direct / pipe as with cout.
>
> I don't see it listed:
>
> https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B

I don't find any references to ~~ here either:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

I searched for "tilde" and found one reference, and ~ and found 14
references, but none for ~~.

I'd like to know what this operator does so I can add support for
it in CAlive.

--
Rick C. Hodgin

Bart

Dec 20, 2018, 2:34:14 PM
On 20/12/2018 19:08, Rick C. Hodgin wrote:
> On 12/20/2018 2:04 PM, Bart wrote:
>> You might see it used like this:
>>
>>      #define M(x) ~x
>>      ~M(5);
>>
>> The result after preprocessing is ~~5. Although it is a no-op, it
>> would be expected to work as a no-op, using two ~ tokens with no
>> intervening space.
>
> It would still be a separate and distinct syntax.  It would preprocess
> out as:
>
>     ~M(5);
>     ~~5;
>     5;

No, it would preprocess as ~~5 not 5. Try ~~"A"; or ~~3.4; and you will
see that ~~ does make it out of the preprocessor. (Just use -E option on
most compilers.)

And at this point, a normal C parser expects to see two successive ~ tokens.

I think my example is legitimate, but it is hard to search for. It's not
hard however to imagine a constant macro K defined as an expression that
starts with ~, and for ~K to appear in code. Or even for another
constant L defined as ~K, and for ~L (requiring ~~~) to be used somewhere.

>
> It would not be the same syntax as:
>
>     ptr~~member;
>
> Even though it might use the same ~~ symbol, it's used differently, like
> the way << and >> are used in C++ differently than when they're used as
> i = j << 2;  A completely different meaning from cout << 2;

But "<<" is always the same single token. It's not like it's a "<" token
followed by another "<" in one context, and a single "<<" in another.

This is not a big deal, but in C, you /will/ be messing about with how
"~" is normally treated.

james...@alumni.caltech.edu

Dec 20, 2018, 2:35:43 PM
Ah! So you are familiar with ~~ - you just didn't realize it. Adding ~~
as a token of its own would prevent ~~ from being parsed as two
separate ~ operators.

Rick C. Hodgin

Dec 20, 2018, 2:37:58 PM
On 12/20/2018 2:29 PM, James Kuyper wrote:
> On Thursday, December 20, 2018 at 1:38:36 PM UTC-5, Rick C. Hodgin wrote:
>> What does ~~ do today?
> "~~x" does the same thing that "~(~x)" does, an expression you've
> already used, so I assume that you're familiar with it.

That's what I thought.

>> And if the syntax it uses today is different than the syntax that I
>> propose, such that there is no legal use of ~~ in the syntax I propose
>> today, then why would it matter? It would be parsed as two different
>> uses of the same operator, like << and << for one use to bit shift,
>> one use to direct / pipe as with cout.
>
> No, the maximal munch rule precludes such a parse - if there's a "~~"
> token, "~~" must be parsed as that token, it cannot be parsed as two
> consecutive "~" tokens. Maximal munch is quite fundamental to C, and
> lots of existing code would no longer parse as legal C if you made any
> significant change to that rule. Identifying tokens is completed in
> translation phase 7; parsing the sequences of tokens doesn't start until
> translation phase 8 - the approach you imply would require merging
> together translation phases 7 and 8, with far-reaching consequences for
> just about everything.
>
> Note: there's a perfectly simple way around this problem that doesn't
> require any massive re-write of the fundamental principles governing the
> parsing of C or C++, and a clue to that way can be found in the
> preceding paragraph - but I'll let you figure it out, if you care (which
> I doubt).

I don't see how the maximal munch rule would have any issue whatsoever
here, because absorbing ~~ in one stroke would still require there to
be an operator based on its surrounding syntax. If you tried to do
"~~;" it would fail, for example. It's looking for some value there to
operate on after identifying the operator.

With "ptr~~member" it would be able to recognize ~~ as a single thing,
and then identify how it's used there by examining the left and right
to determine what's going on.

There doesn't seem to be any conflict whatsoever. It seems like there's
a huge mountain being made out of a non-existent mole hill.

--
Rick C. Hodgin

Rick C. Hodgin

Dec 20, 2018, 2:47:45 PM
On 12/20/2018 2:34 PM, Bart wrote:
> On 20/12/2018 19:08, Rick C. Hodgin wrote:
>> On 12/20/2018 2:04 PM, Bart wrote:
>>> You might see it used like this:
>>>
>>>      #define M(x) ~x
>>>      ~M(5);
>>>
>>> The result after preprocessing is ~~5. Although it is a no-op, it would be
>>> expected to work as a no-op, using two ~ tokens with no intervening space.
>>
>> It would still be a separate and distinct syntax.  It would preprocess
>> out as:
>>
>>      ~M(5);
>>      ~~5;
>>      5;
>
> No, it would preprocess as ~~5 not 5.

I think in the case of the constant, it would go ahead and perform the
operation on it at compile-time. But, I'll grant you the premise. :-)

> Try ~~"A"; or ~~3.4; and you will see
> that ~~ does make it out of the preprocessor. (Just use -E option on most
> compilers.)
>
> And at this point, a normal C parser expects to see two successive ~ tokens.
>
> I think my example is legitimate, but it is hard to search for. It's not hard
> however to imagine a constant macro K defined as an expression that starts
> with ~, and for ~K to appear in code. Or even for another constant L defined
> as ~K, and for ~L (requiring ~~~) to be used somewhere.
>
>> It would not be the same syntax as:
>>
>>      ptr~~member;
>>
>> Even though it might use the same ~~ symbol, it's used differently, like
>> the way << and >> are used in C++ differently than when they're used as
>> i = j << 2;  A completely different meaning from cout << 2;
>
> But "<<" is always the same single token. It's not like it's a "<" token
> followed by another "<" in one context, and a single "<<" in another.
>
> This is not a big deal, but in C, you /will/ be messing about with how "~" is
> normally treated.

Of course. It's an add-on. The parser would now need to look for
the context because more than one use of ~~ exists.

FWIW, ~~ in CAlive today is a syntax error. :-) It may not be that
way tomorrow ... if I ever figure out exactly what ~~ is supposed to
do, and why it's allowed to exist that way.

--
Rick C. Hodgin

Bart

Dec 20, 2018, 3:10:40 PM
On 20/12/2018 19:48, Rick C. Hodgin wrote:
> On 12/20/2018 2:34 PM, Bart wrote:
>> On 20/12/2018 19:08, Rick C. Hodgin wrote:
>>> On 12/20/2018 2:04 PM, Bart wrote:
>>>> You might see it used like this:
>>>>
>>>>      #define M(x) ~x
>>>>      ~M(5);
>>>>
>>>> The result after preprocessing is ~~5. Although it is a no-op, it
>>>> would be expected to work as a no-op, using two ~ tokens with no
>>>> intervening space.
>>>
>>> It would still be a separate and distinct syntax.  It would preprocess
>>> out as:
>>>
>>>      ~M(5);
>>>      ~~5;
>>>      5;
>>
>> No, it would preprocess as ~~5 not 5.
>
> I think in the case of the constant, it would go ahead and perform the
> operation on it at compile-time.  But, I'll grant you the premise. :-)

Yes, it would (I think it has to). But I'm talking about the output of
the preprocessor and tokeniser, before it gets to parsing the tokens.

Then it is expected that "~~" is two tokens. Although unlikely to be
seen like that in actual code, my example shows how that can occur via
macros.

So you can't simply remove ~~ from the language, or replace them with a
single incompatible token.


> Of course.  It's an add-on.  The parser would now need to look for
> the context because more than one use of ~~ exists.

It's more complicated than that. You want ~~ to mean ~ ~ in one context,
and ~~ (a separate token) in another. But you might not know the context
until after ~/~ or ~~ has already been processed.

If you want a syntax with fewer incompatibilities with C's lexical
grammar, try a..c or a.(c).

(The latter matches how road numbers are sometimes designated on road
signs in UK: a direction name marked (A5) means the route will
eventually use the A5, but not yet. A little like how you get from a to
c, but via b.)

Rick C. Hodgin

Dec 20, 2018, 3:30:07 PM
On 12/20/2018 3:10 PM, Bart wrote:
> So you can't simply remove ~~ from the language, or replace them with a
> single incompatible token.

This is the part I'm not understanding at all.

>> Of course.  It's an add-on.  The parser would now need to look for
>> the context because more than one use of ~~ exists.
>
> It's more complicated than that. You want ~~ to mean ~ ~ in one context, and
> ~~ (a separate token) in another. But you might not know the context until
> after ~/~ or ~~ has already been processed.

Correct. That's not an issue for the way CAlive parses things. The
determination of what an operator is comes from context, and not just
tokens.

> If you want a syntax with fewer incompatibilities with C's lexical grammar,
> try a..c or a.(c).
>
> (The latter matches how road numbers are sometimes designated on road signs
> in UK: a direction name marked (A5) means the route will eventually use the
> A5, but not yet. A little like how you get from a to c, but via b.)

I really like the ~~ sequence. It even looks like a spring, like it's
reaching in there across a distance, which conveys the whole meaning.

For CAlive it's not an issue. For C and C++, whoever wants to implement
this can add whatever they'd like to add for the token.

--
Rick C. Hodgin

Ben Bacarisse

Dec 20, 2018, 3:32:52 PM
Bart <b...@freeuk.com> writes:
On adding ~~ as a new token:

> You might see it used like this:
>
> #define M(x) ~x
> ~M(5);
>
> The result after preprocessing is ~~5. Although it is a no-op, it
> would be expected to work as a no-op, using two ~ tokens with no
> intervening space.

This example is going to confuse matters because the result must be
treated like ~ ~ 5 even if a ~~ token gets added to the language. (Try
it with -- or ++ if you want confirmation.)

Pre-processing happens (conceptually) at the token level, not the string
level. The token list <~><M><(><5><)><;> has <M><(><5><)> replaced by
<~><x> and then <~><5> to give <~><~><5><;>
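
The suggested check with -- can be sketched like this (NEG and negated_twice are illustrative names):

```cpp
#define NEG(x) -x

// Expands to the tokens - - 5, i.e. -(-5). If the two minus signs had been
// glued into the single -- token, this would read --5, which does not
// compile because 5 is not a modifiable lvalue.
constexpr int negated_twice = -NEG(5);
static_assert(negated_twice == 5, "the two - tokens stay separate");
```

That this compiles at all shows the expansion is not re-lexed as text, even though -- is already a token in the language.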

--
Ben.

Mr Flibble

Dec 20, 2018, 3:45:49 PM
neoGFX has a snake operator, ~~~~, as described here:

https://neogfx.io/wiki/index.php/neoGFX_Event_System

Rick's idea of trying to use ~ to skip dereferences in a pointer chain is
fucktarded tho; it is a rare use-case and goes against the grain of one of
C++'s precepts, namely encapsulation.

/Flibble

--
“You won’t burn in hell. But be nice anyway.” – Ricky Gervais

“I see Atheists are fighting and killing each other again, over who
doesn’t believe in any God the most. Oh, no..wait.. that never happens.” –
Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

Robert Wessel

Dec 20, 2018, 3:55:32 PM
OTOH, in (more recent versions of) C++, ">>" is sometimes treated as
two ">" tokens. That deals with a glitch in template syntax where
several template blocks ending simultaneously required (somewhat)
unnatural whitespace between the closing ">" tokens. For example:

vector<pair<int, int> > v;

Pre-C++0x, if you omitted the space between the two ">", you'd get a
syntax error* when it got parsed as a ">>" token.


*Usually one of the annoying ones about faulty block parallelism being
detected a considerable distance from the actual problem.
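
A short sketch of both readings of ">>" side by side, assuming a C++11 or later compiler (make_pairs and shifted_right are illustrative names):

```cpp
#include <utility>
#include <vector>

// Accepted since C++11: the closing >> of the nested template argument
// lists is treated as two > tokens here.
std::vector<std::pair<int, int>> make_pairs()
{
    std::vector<std::pair<int, int>> v;
    v.push_back(std::make_pair(1, 2));
    return v;
}

// The same two characters still form the right-shift token in an expression.
constexpr int shifted_right = 16 >> 2;
```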

Keith Thompson

Dec 20, 2018, 4:22:14 PM
Bart <b...@freeuk.com> writes:
[...]
> You might see it used like this:
>
> #define M(x) ~x
> ~M(5);
>
> The result after preprocessing is ~~5. Although it is a no-op, it would
> be expected to work as a no-op, using two ~ tokens with no intervening
> space.

The output of the preprocessor is logically a sequence of tokens, not
text. Many preprocessors have an option to show the output as text,
but that text is generally constructed so that it will parse as the
correct token sequence.

Macro expansion doesn't paste tokens together unless you tell it to
using the ## operator.

The result of ~M(5) in your example is the 3-token sequence ~, ~, 5 --
and it would be even if there were a ~~ token. A preprocessor that
shows its output as text would probably emit "~ ~5".

An example:

#define LT(x) <x
int a = 1<<3;
int b = 1<LT(3);

The last line expands to the equivalent of:
int b = 1< <3;
which is a syntax error.
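
For contrast, a sketch of what happens when the two tokens are pasted explicitly with ## (PASTE and shifted are illustrative names):

```cpp
#define PASTE(a, b) a##b

// PASTE(<, <) explicitly joins two < tokens into the single << token,
// so the initialiser is 1 << 3.
constexpr int shifted = 1 PASTE(<, <) 3;
static_assert(shifted == 8, "pasted tokens form a shift operator");
```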

There are plenty of problems with the proposed ~~ operator, but this
isn't one of them.

--
Keith Thompson (The_Other_Keith) k...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Bart

Dec 20, 2018, 4:59:00 PM
On 20/12/2018 21:21, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
> [...]
>> You might see it used like this:
>>
>> #define M(x) ~x
>> ~M(5);
>>
>> The result after preprocessing is ~~5. Although it is a no-op, it would
>> be expected to work as a no-op, using two ~ tokens with no intervening
>> space.
>
> The output of the preprocessor is logically a sequence of tokens, not
> text. Many preprocessors have an option to show the output as text,
> but that text is generally constructed so that it will parse as the
> correct token sequence.
>
> Macro expansion doesn't paste tokens together unless you tell it to
> using the ## operator.

I'm not suggesting that. I was showing how sometimes two consecutive ~
operators can occur in C code, even if no one would ever write them
directly as ~~ or ~ ~.

Therefore support for a double ~ operation is still needed, even if any
compiler will turn it into a no-op.

David Brown

Dec 20, 2018, 5:03:15 PM
On 20/12/2018 20:05, Rick C. Hodgin wrote:
> On 12/20/2018 1:59 PM, Bart wrote:
>> On 20/12/2018 18:39, Rick C. Hodgin wrote:
>>> On 12/20/2018 12:57 PM, James Kuyper wrote:
>>>> On Thursday, December 20, 2018 at 12:44:37 PM UTC-5, Rick C. Hodgin
>>>> wrote:
>>>>> ~~ would be a separate token from ~(~x) as it is today.  It would be
>>>>> like + and ++.
>>>>
>>>> What does "as it is today" refer to? "~~" is NOT a separate token
>>>> today. It parses as two consecutive "~" tokens.
>>>
>>> I am not aware of that operator as two consecutive ~~ characters.  I
>>> did a search for it for C or C++ and didn't find it.  Only the single
>>> ~ character.
>>>
>>> What does ~~ do today?
>>>
>>> And if the syntax it uses today is different than the syntax that I
>>> propose, such that there is no legal use of ~~ in the syntax I propose
>>> today, then why would it matter?  It would be parsed as two different
>>> uses of the same operator, like << and << for one use to bit shift,
>>> one use to direct / pipe as with cout.
>>>
>>
>> !! is often used, but !! is two ! tokens one after the other.
>
> I've seen that before and I've seen it's been explained here, but I
> still don't know what it does.  I've seen it in Linux source code.
>

!! is two ! tokens after each other. !x turns 0 into 1, and non-zero
into 0. So applying that twice, means !!x turns 0 into 0, and non-zero
into 1. So in usage, !!x means "normalise x as a 0/1 boolean value".
It is not often necessary in C99, where you have proper booleans, and
was more common in C90 code. I've used it a few times myself.
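
A small sketch of that idiom (to_bool01 and bit_set are illustrative names, not from any particular code base):

```cpp
// Normalise any integer to 0 or 1 using two consecutive ! operators.
constexpr int to_bool01(int x)
{
    return !!x;
}

// Typical use: report whether a flag bit is set as exactly 0 or 1,
// rather than as the raw masked value.
constexpr int bit_set(unsigned int flags, unsigned int mask)
{
    return !!(flags & mask);
}

static_assert(to_bool01(0) == 0, "zero stays zero");
static_assert(to_bool01(42) == 1, "non-zero becomes one");
static_assert(bit_set(0x10u, 0x10u) == 1, "masked value 0x10 is reported as 1");
```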

>> Your proposal would either mean processing ~~ as one new brand-new
>> token, or introducing a new binary operator that consists of two ~
>> tokens.
>
> I still don't see where ~~ is an operator.  I'd like to see a project
> that uses that functionality.  I find Javascript references to it
> online, but none for C or C++.

~~ is not an operator - it is two operators, after each other.

The issue here is not that someone might intentionally use ~~ (though it
might turn up due to macros). Nor will you find any code that looks
like "a~~b", as it would be a syntax error (unless "a" is a macro of
some sort).

The issue is the way C defines the rules for parsing and lexical
analysis. If you want a set of characters to be treated as a single
token - such as "~~" - then it will /always/ be treated as that token.
That is fine for your new use, for "a~~b". And the old "~" operator
will usually work as before - "x = ~y".

However, if someone were to write "x = ~~y;", then in normal C this will
be treated as "x = ~(~y);" because there is no "~~" token. With your
extension, it would now be parsed as "x" "assignment operator"
"shorthand pointer operator" "y". That would be meaningless, but code
that used to be valid C is now invalid.

Does this actually matter? Probably not, in this case - you are
unlikely ever to come across C code like "x = ~~y;", even allowing for
odd macros. I don't think it should stop you using this operator. (I
don't like the operator suggestion for other reasons, but that is beside
the point here.)

C++ had to deal with this situation when trying to allow ">>" to be the
same as "> >" for closing two template declarations. It was
surprisingly difficult, and broke the way tokenising worked, but it
neatened the language for users.
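
A short sketch of the ">>" case, assuming a C++11 or later compiler
(make_grid is a made-up helper name):

```cpp
#include <cstddef>
#include <vector>

// Since C++11 the ">>" below closes two template argument lists.
// In C++03 it was lexed as the right-shift operator and had to be
// written "> >". make_grid is a hypothetical helper name.
std::vector<std::vector<int>> make_grid(std::size_t rows, std::size_t cols)
{
    return std::vector<std::vector<int>>(rows, std::vector<int>(cols, 0));
}
```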

If your CAlive language has a different kind of parsing and tokenising,
then only you can tell if it is a problem or not. In these groups, we
can only tell you how it relates to C and C++.

>
> Show me where it's used in SQLite, or Blender or some open source
> software of significance.
>

Almost certainly never, which is why I would say it is only a
theoretical issue and should not stop you. But you ought to understand
the issue so that you can make an informed decision.

Richard Damon

Dec 21, 2018, 6:39:00 AM

But cout << 2 is NOT different than i = j << 2; both become effectively
calls to operator<<() (the latter case to the implicitly defined
fundamental version for built in types, just like all the other operator
functions). The fact that cout << 2 doesn't act like a shift is just a
matter of library definition, and not a core language issue (just like
making list += element be an append operation).


Rick C. Hodgin

Dec 21, 2018, 8:08:29 AM

On 12/21/2018 6:38 AM, Richard Damon wrote:
> On 12/20/18 2:08 PM, Rick C. Hodgin wrote:
>> Even though it might use the same ~~ symbol, it's used differently, like
>> the way << and >> are used in C++ differently than when they're used as
>> i = j << 2;  A completely different meaning from cout << 2;
>
> But cout << 2 is NOT different than i = j << 2, both become effectively
> calls to operator<<() (the latter case to the implicitly defined
> fundamental version for built in types, just like all the other operator
> functions).

I would be surprised to learn that cout << 2 and j << 2 call the same
operator<<() function. If they do, I view that as a fundamental flaw in
the language.

> The fact that cout << 2 doesn't act like a shift is just a
> matter of library definition, and not a core language issue (just like
> making list += element be an append operation).

It's more or less the same in CAlive. CAlive identifies the template
usage of the token at compile-time. It calls the appropriate internal
handler at that time, which then emits whatever is required for the
operation.

Nonetheless, they remain different. The call for cout << 2 is different
than the call for j << 2 in all ways. The fact that the same symbol is
used is just a choice made by the C++ designers. CAlive does not support
cout or cin, by the way. I think they are inappropriate additions to the
base C language, features that should've been handled differently.

--
Rick C. Hodgin

David Brown

Dec 21, 2018, 8:31:25 AM

On 21/12/18 14:09, Rick C. Hodgin wrote:
> On 12/21/2018 6:38 AM, Richard Damon wrote:
>> On 12/20/18 2:08 PM, Rick C. Hodgin wrote:
>>> Even though it might use the same ~~ symbol, it's used differently, like
>>> the way << and >> are used in C++ differently than when they're used as
>>> i = j << 2; A completely different meaning from cout << 2;
>>
>> But cout << 2 is NOT different than i = j << 2, both become effectively
>> calls to operator<<() (the latter case to the implicitly defined
>> fundamental version for built in types, just like all the other operator
>> functions).
>
> I would be surprised to learn that cout << 2 and j << 2 call the same
> operator<<() function. If they do, I view that as a fundamental flaw in
> the language.

They do not call the same function - they use the same operator name.
The function "operator<<" is overloaded - you have the same name, but
different functions depending on the type of the arguments.

This is a key feature in C++. It means that you can make, for example,
a matrix class - and then write "M1 + M2" just as you would write "x + y".
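
A minimal sketch of that idea, using a two-element vector as a
stand-in for a full matrix class (Vec2 and add_ints are made-up names
for this example):

```cpp
// A tiny stand-in for a matrix class; Vec2 is a hypothetical type.
struct Vec2
{
    int x;
    int y;
};

// Overload of operator+ selected when both operands are Vec2.
Vec2 operator+(const Vec2& a, const Vec2& b)
{
    return Vec2{a.x + b.x, a.y + b.y};
}

// Built-in addition on ints uses the same operator spelling.
int add_ints(int a, int b)
{
    return a + b;
}
```

The compiler picks the function from the operand types, so
"Vec2{1, 2} + Vec2{3, 4}" and "1 + 2" are written the same way.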

The only difference between "cout << 2" and "j << 2" is that the
"operator<<" function acting on streams is defined in a library, while
the function operating on integers is built into the language.

This is a fundamental /strength/ of the C++ language.

(Whether or not the streams library is well designed, and whether or not
the choice of the << and >> operators is well conceived, is a different
matter. Opinions vary on that one.)

>
>> The fact that cout << 2 doesn't act like a shift is just a
>> matter of library definition, and not a core language issue (just like
>> making list += element be an append operation).
>
> It's more or less the same in CAlive. CAlive identifies the template
> usage of the token at compile-time. It calls the appropriate internal
> handler at that time, which then emits whatever is required for the
> operation.
>
> Nonetheless, they remain different. The call for cout << 2 is different
> than the call for j << 2 in all ways.

The implementation of the functions is different. The syntax and
mechanism for calling them is the same (except that for some cases, such
as integers, the function is built into the language).

> The fact that the same symbol is
> used is just a choice made by the C++ designers. CAlive does not support
> cout or cin, by the way. I think they are inappropriate additions to the
> base C language, features that should've been handled differently.
>

"cout" and its use of << is not built into the C++ language, it is part
of the standard library.


Richard Damon

Dec 21, 2018, 9:13:48 PM

(removing clc as this isn't relevant to C)
As David said, they are different overloads of the same base function,
the first something like:

ostream& operator<<(ostream&, long long)
and the second something like
int operator<<(int, int)
(now the operator functions on built in types may not actually exist as
actual functions, but largely act as if they did).
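
As a rough sketch of the two cases side by side (insert_two and
shift_two are made-up names, and std::ostringstream stands in for
cout so the result can be inspected):

```cpp
#include <sstream>
#include <string>

// Stream insertion: resolves to a standard library overload of operator<<.
std::string insert_two()
{
    std::ostringstream os;
    os << 2;             // operator syntax
    os.operator<<(3);    // the same member function, called by name
    return os.str();
}

// Bit shift: the built-in meaning of << on integers.
int shift_two(int j)
{
    return j << 2;
}
```

Both functions use the same operator syntax; only the overload chosen
from the operand types differs.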

In the LANGUAGE, they mean exactly the same thing. The library has
defined different operations for those functions. When inserters (and
extractors) were first being developed, a lot of people didn't like
them, and they were often cited as one of the examples of bad operator
overloading.

The key here is that the parse tree for cout << 2 is basically the same
as for j << 2, so the language doesn't need to make a distinction.

The problem with ~~ is that this is NOT true. Unless you TOTALLY change
how the language is parsed, the token sequence of ~~ will mean something
different than ~ ~, so this says that without some heroic wording,
trying to add a ~~ operator will change the meaning of some existing
valid and working code, which gives it a serious black mark as a
possible extension.

Because of this, I would suggest that ~> is a much better symbol, as no
valid code could have that combination. Also, I don't think the
confusion with -> is significant, as the operator is basically an
extension to it.

I can see there are some situations where such an operator could be
useful, but again, without some careful wording, many cases where it
could be used get excluded. If it follows pointers, then any recursive
structure (like a tree or list) that includes a pointer to another
object of the same type is ruled out, since any member name that could
be reached directly could also be reached indirectly through such a
pointer. I personally would often find that this makes the operator
unusable.
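
A small sketch of the ambiguity with a linked list (Node and the two
accessor functions are made-up names; the shorthand spelling in the
comment is the proposed syntax, not valid C++):

```cpp
// A singly linked list node: "value" is reachable from a Node* both
// directly and through any number of "next" hops.
struct Node
{
    int   value;
    Node* next;
};

// Two of the many paths by which a shorthand such as head~>value
// (hypothetical syntax) could reach a member called "value".
int direct_value(const Node* head)
{
    return head->value;
}

int indirect_value(const Node* head)
{
    return head->next->value;
}
```

Unless the rules say which path wins, "value" is not uniquely named
below head.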

You also mentioned that you plan to remove the distinction between . and
->, so that you access pointed-to values the same way as member values.
One thing this will interfere with is smart pointer classes. These
classes need to distinguish between the use of -> to access the members
of the object they point to, and the use of . to access the API of the
smart pointer.
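
For example, with std::unique_ptr (a sketch; Widget and the two
function names are made up for illustration):

```cpp
#include <memory>

struct Widget   // hypothetical pointee type
{
    int size;
};

// "->" reaches into the Widget that the smart pointer owns.
int pointee_member(const std::unique_ptr<Widget>& p)
{
    return p->size;      // member of the pointed-to object
}

// "." calls the smart pointer's own API.
bool pointer_is_empty_after_reset(std::unique_ptr<Widget>& p)
{
    p.reset();           // member of unique_ptr itself: destroys the Widget
    return p.get() == nullptr;
}
```

If . and -> meant the same thing, a pointee that had its own member
named reset or get would make these accesses ambiguous.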

Rick C. Hodgin

Dec 21, 2018, 11:15:44 PM

On 12/21/2018 9:13 PM, Richard Damon wrote:
> As David said, they are different overloads of the same base function,
> the first something like:

I appreciate you conveying this information.

> ostream& operator<<(ostream&, long long)
> and the second something like
> int operator<<(int, int)
> (now the operator functions on built in types may not actually exist as
> actual functions, but largely act as if they did).

That's what I would expect it to be.

> The problem with ~~ is that this is NOT true. Unless you TOTALLY change
> how the language is parsed, the token sequence of ~~ will mean something
> different than ~ ~, so this says that without some heroic wording,
> trying to add a ~~ operator will change the meaning of some existing
> valid and working code, which gives it a serious black mark as a
> possible extension.

That's an issue for C and C++. Whoever wants to implement this fea-
ture to those languages can use whatever symbols they want to. As
others have indicated, the >> sequence can be parsed as > > when it's
used in a certain syntax.

> Because of this, I would suggest that ~> is a much better symbol, as no
> valid code could have that combination. Also, I don't think the
> confusion with -> is significant, as the operator is basically an
> extension to it.

Suggest that to those who want to implement it in C and C++. It will
be perfectly acceptable in CAlive, and without jumping through any
hoops.

> You also mentioned that you plan to remove the distinction between . and
> ->, so you access pointed values the same as member values, one thing
> that this will interfere with is smart pointer classes. These classes
> need to distinguish between the use of -> to access the members of the
> object they point to, and the use of . to access the API of the smart
> pointer.

I am unaware of any situation where the distinguishing factors of the
symbol used (being . or ->) make any difference because it is the un-
derlying type that is the thing. It's separate and distinguishing, such
that -> used on a member value, or . used on a pointer value, indicate
that each one is referring to a member. The compiler can resolve it,
and deal with it wholly. Related operators will be as you indicate
above, with the different function overloads.

--
Rick C. Hodgin
