Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Creating a constexpr list

39 views
Skip to first unread message

rep_movsd

unread,
Jul 17, 2017, 9:44:01 AM7/17/17
to
I'm writing a library that does compile time parsing, creating a list.
The only way I am able to get it to work right now is by using a std::array of my Node structure as a "statically allocated pool" and using integer indexes as "pointers" into that array.

This works great, but I am forced to fix the size of the array.

Here is what I have:

#include <iostream>
using namespace std;

struct Symbol
{
constexpr Symbol(): pBeg(), pEnd() {}
constexpr Symbol(const char *pBeg, const char *pEnd): pBeg(pBeg), pEnd(pEnd) {}
constexpr int len() { return pEnd - pBeg; }

string getText() const { return string(pBeg, pEnd); }


const char *pBeg;
const char *pEnd;
};


extern void ParseError(const char *s) {cerr << "Parse Error:" << s << endl;}

constexpr bool isAlpha(char ch)
{
return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

// Raises compiletime error if no more characters left to parse
constexpr const char *checkEOS(const char *pszText)
{
bool eos = !*pszText;
if(eos) ParseError("Unexpected end of stream");

return pszText;
}

// Takes a string literal and returns pointer to first non-whitespace character
constexpr const char *eatSpace(const char *pszText)
{
while(*pszText == ' ' || *pszText == '\n' || *pszText == '\r' || *pszText == '\t')
{
++pszText;
}
return pszText;
}

// Takes a string literal text and tries to consume [a-z]+
constexpr const Symbol eatAlpha(const char *pszText)
{
// Ensure not EOS
checkEOS(pszText);

Symbol sym;
sym.pBeg = pszText;
sym.pEnd = pszText;
while(isAlpha(*sym.pEnd)) sym.pEnd++;

// Ensure at least 1 character is consumed
bool empty_tag = sym.pBeg == sym.pEnd;
if(empty_tag) ParseError("Expecting an identifier");

return sym;
}

struct Node
{
Symbol tag; // pointer to tag name range
const Node &child;

constexpr Node(Symbol tag, const Node child):tag(tag), child(child){}

};

constexpr const Symbol NullSym;
constexpr const Node NullNode{NullSym, NullNode};


constexpr Node parse(const char* text)
{
if(text)
{
text = eatSpace(text);
if(isAlpha(*text))
{
Symbol symTag = eatAlpha(text);
return Node(symTag, parse(symTag.pEnd));
}
}
return NullNode;
}

void dumpNode(const Node &n, int indent = 0)
{
if(&n.child != &NullNode)
{
cerr << n.tag.getText() << endl;
dumpNode(n.child, indent + 1);
}
}


int main()
{
constexpr Node node = parse("attr battr cattr");
dumpNode(node);
}

When compiling:

$ g++ --std=c++14 -DSPT_DEBU main4.cpp
main4.cpp:72:48: error: 'Node{Symbol{0, 0}, child}' is not a constant expression
constexpr const Node NullNode{NullSym, NullNode};
^
main4.cpp: In function 'int main()':
main4.cpp:101:49: in constexpr expansion of 'parse(((const char*)"attr battr cattr"))'
main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
main4.cpp:101:49: error: constexpr call flows off the end of the function
constexpr Node node = parse("attr battr cattr");
^

Is what I am trying even possible?

I'm a bit hazy about what the lifetime of constexpr values are when you nest them through recursive calls

I'm pretty sure this should be possible. After all someone wrote a constexpr C compiler.

All I need is some sort of dynamic data structure like a list, that doesnt have to be fixed in size at compile time.

Thanks in advance.

Öö Tiib

unread,
Jul 17, 2017, 1:30:42 PM7/17/17
to
On Monday, 17 July 2017 16:44:01 UTC+3, rep_movsd wrote:
> I'm writing a library that does compile time parsing, creating a list.
> The only way I am able to get it to work right now is by using a std::array
> of my Node structure as a "statically allocated pool" and using
> integer indexes as "pointers" into that array.
>
> This works great, but I am forced to fix the size of the array.

Might be worthy exercise but further posted code does not quite reflect what
you wrote above.
You plan to pass a copy of NullNode to its own constructor? Creating such
constexpr temporaries and taking references to those won't work.

>
>
> constexpr Node parse(const char* text)
> {
> if(text)
> {
> text = eatSpace(text);
> if(isAlpha(*text))
> {
> Symbol symTag = eatAlpha(text);
> return Node(symTag, parse(symTag.pEnd));

Again same thing as on previous case but this time recursive return values
passed by value and then taken references. :D

> }
> }
> return NullNode;
> }
>
> void dumpNode(const Node &n, int indent = 0)
> {
> if(&n.child != &NullNode)
> {
> cerr << n.tag.getText() << endl;
> dumpNode(n.child, indent + 1);
> }
> }
>
>
> int main()
> {
> constexpr Node node = parse("attr battr cattr");
> dumpNode(node);
> }
>
> When compiling:
>
> $ g++ --std=c++14 -DSPT_DEBU main4.cpp
> main4.cpp:72:48: error: 'Node{Symbol{0, 0}, child}' is not a constant expression
> constexpr const Node NullNode{NullSym, NullNode};
> ^
> main4.cpp: In function 'int main()':
> main4.cpp:101:49: in constexpr expansion of 'parse(((const char*)"attr battr cattr"))'
> main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
> main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
> main4.cpp:83:44: in constexpr expansion of 'parse(symTag.Symbol::pEnd)'
> main4.cpp:101:49: error: constexpr call flows off the end of the function
> constexpr Node node = parse("attr battr cattr");
> ^
>
> Is what I am trying even possible?
>
> I'm a bit hazy about what the lifetime of constexpr values are when you
> nest them through recursive calls

Do not try to build your list into some sort of references taken to
by-value returns of functions or by-value passed function parameters.
That is not meant to work.

>
> I'm pretty sure this should be possible. After all someone wrote a constexpr
> C compiler.
>
> All I need is some sort of dynamic data structure like a list, that
> doesnt have to be fixed in size at compile time.
>
> Thanks in advance.

Sure it is possible but it is strange how you try it.

First thing get rid of pointer algebra. Length of string literal is
known to compiler so pass around the reference to it like that:

#include <iostream>

template<size_t N> constexpr
size_t len(char const (&literal)[N])
{
return N;
}

template<size_t N> constexpr
char nth_char(char const (&literal)[N], size_t n)
{
return literal[n];
}

int main()
{
constexpr char what_we_parse[] = "attr battr cattr";
constexpr size_t l = len(what_we_parse);
constexpr char c = nth_char(what_we_parse, 5);
std::cout << "Len: " << l << " 5th: " << c << std::endl;
}

Try it. It will make your code lot simpler.


rep_movsd

unread,
Jul 17, 2017, 2:00:38 PM7/17/17
to
I already have a parser working, which takes in a std::array of nodes of fixed size.
As the constexpr function calls itself recursively, it adds nodes to the std:array

The only problem with that method is that I need to arbitrarily pre-select a size for the array

In actuality I need to parse a LISP cons like data structure (which is a multiway tree), so the statically allocated thing is likely to be limiting.

Öö Tiib

unread,
Jul 17, 2017, 3:22:05 PM7/17/17
to
On Monday, 17 July 2017 21:00:38 UTC+3, rep_movsd wrote:
>
> I already have a parser working, which takes in a std::array of nodes of
> fixed size. As the constexpr function calls itself recursively, it adds
> nodes to the std:array
>
> The only problem with that method is that I need to arbitrarily pre-select
> a size for the array

Now you lost me. You have input data available so why you can not count
needed size of array instead of randomly selecting something?

>
> In actuality I need to parse a LISP cons like data structure (which is
> a multiway tree), so the statically allocated thing is likely to be
> limiting.

Huh? Every constexpr variable is *very* static and immutable already at
time of compiling the program. It is *defining* feature of constexpr data
and it is no way limiting it. :D

Chris Vine

unread,
Jul 18, 2017, 6:13:31 AM7/18/17
to
On Mon, 17 Jul 2017 11:00:20 -0700 (PDT)
rep_movsd <rep....@gmail.com> wrote:
[snip]
> I already have a parser working, which takes in a std::array of nodes
> of fixed size. As the constexpr function calls itself recursively, it
> adds nodes to the std:array
>
> The only problem with that method is that I need to arbitrarily
> pre-select a size for the array
>
> In actuality I need to parse a LISP cons like data structure (which
> is a multiway tree), so the statically allocated thing is likely to
> be limiting.

This won't work in C++. constexpr expressions are not like lisp
macros: you don't have the whole language available for constructing
at compile time the output form which is to be evaluated at run time,
as you do with a lisp. Everything in a constexpr expression is a
literal of some kind - a compile-time constant or a literal type, or a
constexpr function taking and returning literal types.

This means amongst other things that you cannot do anything in a
constexpr expression which allocates memory. More generally, you
cannot return or take as parameters non-literal types, including
std::string.

As a very trivial example, a lisp macro can call the language's time
functions to embed the time of compilation in the compiled binary. This
absolutely won't work in C++ (you would use the __DATE__ and __TIME__
macros instead).

Have a look at this:
http://en.cppreference.com/w/cpp/language/constexpr

Chris Vine

unread,
Jul 18, 2017, 6:35:51 AM7/18/17
to
On Tue, 18 Jul 2017 11:13:04 +0100
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> wrote:
> On Mon, 17 Jul 2017 11:00:20 -0700 (PDT)
> rep_movsd <rep....@gmail.com> wrote:
> [snip]
> > I already have a parser working, which takes in a std::array of
> > nodes of fixed size. As the constexpr function calls itself
> > recursively, it adds nodes to the std:array
> >
> > The only problem with that method is that I need to arbitrarily
> > pre-select a size for the array
> >
> > In actuality I need to parse a LISP cons like data structure (which
> > is a multiway tree), so the statically allocated thing is likely to
> > be limiting.
>
> This won't work in C++. constexpr expressions are not like lisp
> macros: you don't have the whole language available for constructing
> at compile time the output form which is to be evaluated at run time,
> as you do with a lisp. Everything in a constexpr expression is a
> literal of some kind - a compile-time constant or a literal type, or a
> constexpr function taking and returning literal types.

On reflection I suppose it is worth adding first that you couldn't
parse a cons structure (singly-linked list composed of pairs) in a lisp
macro either, unless the list structure is a literal known at compile
time, for obvious reasons; and secondly that you could write a parser
in C++ to operate on such a compile time structure as a code generator
and then compile that - which is more or less what a lisp macro does.

0 new messages