
Does std::regex need to be so large?


Juha Nieminen

Aug 21, 2020, 10:37:22 AM
I have found std::regex to be surprisingly useful and handy
for many situations. Quite often I need to parse some data
from some input, and std::regex can be quite an easy way
to do that in many cases.

One thing I have noticed, however, is how astonishingly big the
class is (or the code that's created when using the class), at
least with gcc on Linux (I haven't tried with clang).

Just using one std::regex and matching a string with it will
increase the size of the produced object file by about 250 kilobytes,
the size of the final executable by about 100 kilobytes (even after
stripping all debug info from it), and increase the compilation time
of that one source file by several seconds (e.g. about 2.5 seconds
on the computer I'm using).

It might not sound like much, but when the compilation time of that
source file *without* the std::regex is a tenth of a second,
the resulting object file is like 8 kB and the final executable
(in this case) is about 16 kB, that's quite a huge chunk.

Does std::regex really need to be that big? Is there really no way
for them to optimize it to be smaller and faster to compile?
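
For reference, the kind of use being measured might look like the sketch below. It is not the original poster's code: the pattern, the "width=640" input format and the name parse_width are assumptions chosen for illustration. Compiling a file with and without such a function, and comparing the object and executable sizes, is one way to reproduce the observation.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: extracts the integer from input such as "width=640".
// The pattern and the function name are illustrative assumptions.
std::optional<int> parse_width(const std::string& line)
{
    // A function-local static avoids rebuilding the regex on every call.
    static const std::regex pattern(R"(width=(\d+))");

    std::smatch match;
    if (!std::regex_match(line, match, pattern))
        return std::nullopt;

    // std::stoi throws std::out_of_range if the digits do not fit in an int.
    return std::stoi(match[1].str());
}
```

Even this single use instantiates the regex compiler and matcher templates in the translation unit, which is probably where most of the extra object code and compile time come from.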

Melzzzzz

Aug 21, 2020, 11:07:56 AM
I guess that depends on how regex is implemented and that is nowhere
specified...

--
current job title: senior software engineer
skills: c++,c,rust,go,nim,haskell...

press any key to continue or any other to quit...
There is nothing I enjoy as much as my status as an INVALID -- Zli Zec
We are all witnesses - about 3 years of intensive propaganda is enough to drive a nation mad -- Zli Zec
There wasn't really that much violence in the Wild West, precisely because
everyone was armed. -- Mladen Gogala

Alf P. Steinbach

Aug 21, 2020, 11:30:27 AM
On 21.08.2020 16:37, Juha Nieminen wrote:
> I have found std::regex to be surprisingly useful and handy
> for many situations.

Yes yes.

Unfortunately they're going to deprecate it, with the aim of replacing
it with something more Unicode-aware.

> [snip]


- Alf

jacobnavia

Aug 21, 2020, 1:21:09 PM
GREAT!

Then, it won't be 250K but 500K! It will pull in all the Unicode libraries and
maybe a few others (ICU, etc). toupper, tolower, etc aren't very simple
in Chinese.

You can't stop progress.

Bonita Montero

Aug 21, 2020, 1:25:24 PM
> Then, it won't be 250K but 500K! It will pull in all the Unicode libraries and
> maybe a few others (ICU, etc). toupper, tolower, etc aren't very simple
> in Chinese.

The code-size doesn't matter.

Paavo Helde

Aug 21, 2020, 1:50:41 PM
On 21.08.2020 17:37, Juha Nieminen wrote:
> I have found std::regex to be surprisingly useful and handy
> for many situations.

Well, if it were a small trivial piece of software doing some trivial
thing only, it would not be so useful. And regex is far from trivial.


> Just using one std::regex and matching a string with it will
> increase the size of the produced object file by about 250 kilobytes,
> the size of the final executable by about 100 kilobytes (even after
> stripping all debug info from it), and increase the compilation time
> of that one source file by several seconds (eg. about 2.5 seconds
> in this computer I'm using.)

Things depend on the perspective. For example, we have got a third-party
library which basically does nothing except trivial data copy and which
takes ages to build and about 30 MB in the final executable. Everybody seems to
be happy with it; it makes our products more enterprisey, I guess. Regex is
also there in our product, but nobody has bothered to measure its size.

>
> It might not sound like much, but when the compilation time of that
> source file *without* the std::regex is a tenth of a second,
> the resulting object file is like 8 kB and the final executable
> (in this case) is about 16 kB, that's quite a huge chunk.
>
> Does std::regex really need to be that big? Is there really no way
> for them to optimize it to be smaller and faster to compile?

Google tells me there are partial regex libraries optimized for size
which take 3 kB, so I'm sure a full regex could also be done in much
less than 100 kB. However, I guess nobody cares, 100 kB is less than
0.001% of a typical desktop RAM.

C++ libraries tend to be optimized for runtime speed nowadays. If this
conflicts with size or compilation speed, the latter will suffer.

Jorgen Grahn

Aug 21, 2020, 3:15:24 PM
I don't have the answer, but I share your observations. I used it as a
convenience in one of my programs, but when the (small) binary doubled
in size and compilation time was more than doubled, I got angry and
wrote my own manual parser for whatever it was.

I thought maybe it's a quality of implementation thing which will
improve over time. I also briefly wondered if Boost.Regex would have
had the same effect. I know that POSIX regexec(3) doesn't.
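
As a point of comparison, a POSIX regcomp/regexec call from C++ could look like the following sketch. The function name is_key_value and the pattern are assumptions for illustration; <regex.h> is a POSIX header rather than part of standard C++, so this only applies on platforms that provide it.

```cpp
#include <regex.h>   // POSIX, not part of standard C++

#include <string>

// Hypothetical helper: true if `line` looks like "name=digits", judged by a
// POSIX extended regular expression. The pattern is an illustrative assumption.
bool is_key_value(const std::string& line)
{
    regex_t re;
    // REG_EXTENDED selects ERE syntax; REG_NOSUB skips capture bookkeeping.
    if (regcomp(&re, "^[A-Za-z_]+=[0-9]+$", REG_EXTENDED | REG_NOSUB) != 0)
        return false;   // the pattern failed to compile

    const bool matched = regexec(&re, line.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);       // release the storage owned by the compiled pattern
    return matched;
}
```

The matching engine here lives in the C library rather than being instantiated from templates in the calling translation unit, which may be why it has little effect on object size and compile time. The sketch recompiles the pattern on every call; real code would likely compile it once.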

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Cholo Lennon

Aug 21, 2020, 4:01:00 PM
I suffered the same problem with clang/llvm outputting WASM. I needed
the class to parse some stuff in a smart contract (running inside the
EOS blockchain), but my surprise was huge when the size of the contract
increased from 40 Kb to 300 Kb, just for using std::regex. In a
constrained RAM environment like a blockchain, that was a no no. I ended
up using methods from std::string to parse the data. The resulting code
was awful, but the size stayed around 40 Kb.
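
The post does not show the std::string-based code, so the following is only a guess at the general shape of such parsing, with a made-up "key=value" format and a hypothetical split_key_value helper.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: splits "key=value" at the first '=' using only
// std::string members. Returns std::nullopt if there is no '=' or if the
// key part is empty.
std::optional<std::pair<std::string, std::string>>
split_key_value(const std::string& line)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
        return std::nullopt;

    // substr(eq + 1) is valid even when '=' is the last character; it then
    // yields an empty value.
    return std::make_pair(line.substr(0, eq), line.substr(eq + 1));
}
```

Each additional rule (trimming, validating digits, optional fields) needs more hand-written code of this kind, which is where the "awful" part tends to come from, but nothing beyond std::string ends up in the binary.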

--
Cholo Lennon
Bs.As.
ARG

Christian Gollwitzer

Aug 21, 2020, 4:40:44 PM
On 21.08.20 at 16:37, Juha Nieminen wrote:
> Just using one std::regex and matching a string with it will
> increase the size of the produced object file by about 250 kilobytes,
> the size of the final executable by about 100 kilobytes (even after
> stripping all debug info from it), and increase the compilation time
> of that one source file by several seconds (eg. about 2.5 seconds
> in this computer I'm using.)

There should be ways to make that much smaller. For example, the PCRE
library jit-compiles the regex pattern into machine code (see here:
https://zherczeg.github.io/sljit/ ). The only reason for JIT is that a
typical program gets the pattern from the user input. If, however, there
is a static pattern only, you could run this compiler and store the
output in the executable - I doubt very much that this will increase the
code size by 100kB unless you have extremely complicated patterns. There
might still be library functions to add, like isspace() etc. but you'd
need those also for handwritten code.
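
To give a feel for how small the code for one fixed pattern can be, here is a hand-written matcher for the pattern ^[0-9]+-[0-9]+$. This is not output from PCRE or sljit, just an illustrative sketch of what a pattern known at build time can reduce to; the pattern and the name matches_range are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hand-written equivalent of the fixed pattern ^[0-9]+-[0-9]+$ (for example
// "10-20"). Illustrative only; requires C++17 for std::string_view.
bool matches_range(std::string_view s)
{
    std::size_t i = 0;

    // Consumes a run of digits starting at i; reports whether it was non-empty.
    auto digits = [&] {
        const std::size_t start = i;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])))
            ++i;
        return i > start;
    };

    if (!digits())
        return false;
    if (i == s.size() || s[i] != '-')
        return false;
    ++i;                              // skip the '-'
    return digits() && i == s.size(); // second number must end the input
}
```

As the post says, the only library dependency left is a character classification function such as isdigit, which handwritten code would need as well.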

Christian

jacobnavia

Aug 21, 2020, 4:59:48 PM
On 21/08/2020 at 19:25, Bonita Montero wrote:
> The code-size doesn't matter.
>

THAT'S IT!

What you do not understand is that it DOES matter. Code size slows down
the program because it fills the code cache with irrelevant stuff!

It takes time to load it from slow RAM, time to decode it and execute
it. And it does the same thing, essentially, as the code a C compiler
generates for a regular expression library in a few Kbytes (5636 bytes
to be accurate).

A factor of 50 for doing WHAT?


Is it really doing 50 times more?


jacobnavia

Aug 21, 2020, 5:00:39 PM
On 21/08/2020 at 19:50, Paavo Helde wrote:
> Well, if it were a small trivial piece of software doing some trivial
> thing only, it would not be so useful. And regex is far from trivial.

The C equivalent is 5636 bytes. Is this code doing 50 times more?

jacobnavia

Aug 21, 2020, 5:01:43 PM
On 21/08/2020 at 22:00, Cholo Lennon wrote:
>
> I suffered the same problem with clang/llvm outputting WASM. I needed
> the class to parse some stuff in a smart contract (running inside the
> EOS blockchain), but my surprise was huge when the size of the contract
> increased from 40 Kb to 300 Kb, just for using std::regex. In a
> constrained RAM environment like a blockchain, that was a no no. I ended
> up using methods from std::string to parse the data. The resulting code
> was awful, but the size stayed around 40 Kb.
>

The C library is 5636 bytes.

Cholo Lennon

Aug 21, 2020, 6:54:12 PM
Well, due to the way an EOS smart contract is built/used, introducing
external dependencies to EOS SDK (CDT)/C++17 (restricted, not all
standard library features are available) is not ideal (and also the
problem was solved using std::string and some library algorithms). But
yes, any regex library (especially the one from standard library) would
have been better.

Lynn McGuire

Aug 21, 2020, 7:08:11 PM
Wow, that will be interesting. UTF-8 or UTF-16 ?

Thanks,
Lynn

bol...@nowhere.co.uk

Aug 22, 2020, 4:18:15 AM
Your problem is you're compiling to web assembly. Serious programs do not
run in a browser.

Bonita Montero

Aug 22, 2020, 4:42:25 AM
> What you do not understand is that it DOES matter. Code size slows down
> the program because it fills the code cache with irrelevant stuff!

That's actually not relevant because i-caches are large enough to hold
the hot-spots.

Öö Tiib

Aug 22, 2020, 10:20:07 AM
You mean PC computer games? A big part of the tools that people use to
do their everyday work have a front end in a web browser.

bol...@nowhere.co.uk

Aug 22, 2020, 2:06:44 PM
On Sat, 22 Aug 2020 07:19:56 -0700 (PDT)
Öö Tiib <oot...@hot.ee> wrote:
>On Saturday, 22 August 2020 11:18:15 UTC+3, bol...@nowhere.co.uk wrote:
>> On Fri, 21 Aug 2020 19:53:59 -0300
>> Cholo Lennon <cholo...@hotmail.com> wrote:
>> >On 8/21/20 6:01 PM, jacobnavia wrote:
>> >> On 21/08/2020 at 22:00, Cholo Lennon wrote:
>> >>>
>> >>> I suffered the same problem with clang/llvm outputting WASM. I needed
>> >>> the class to parse some stuff in a smart contract (running inside the
>> >>> EOS blockchain), but my surprise was huge when the size of the
>> >>> contract increased from 40 Kb to 300 Kb, just for using std::regex. In
>> >>> a constrained RAM environment like a blockchain, that was a no no. I
>> >>> ended up using methods from std::string to parse the data. The
>> >>> resulting code was awful, but the size stayed around 40 Kb.
>> >>>
>> >>
>> >> The C library is 5636 bytes.
>> >
>> >Well, due to the way an EOS smart contract is built/used, introducing
>> >external dependencies to EOS SDK (CDT)/C++17 (restricted, not all
>> >standard library features are available) is not the ideal (and also the
>> >problem was solved using std::string and some library algorithms). But
>> >yes, any regex library (especially the one from standard library) would
>> >have been better.
>>
>> Your problem is your compiling to web assembly. Serious programs do not
>> run in a browser.
>
>You mean PC computer games? Big part of tools that people use to

Yes, computer games that might have been state of the art 15 years ago
can now run in a browser. BFD.

>do their everyday work have front end in web browser.

Front end yes, but you're writing a blockchain processor it would seem and
running that in a browser is a fuckwitted idea.

jacobnavia

Aug 22, 2020, 3:43:52 PM
Sure, but in bloated code the "hot" spots are lukewarm, since they are
diluted in all kinds of rubbish the processor has to do. In general,
bulky code does things that are irrelevant to the task at hand, which
makes it quite slow.

Bulky code is diluted. The algorithm is a few percentage points of all
the work the machine has to do.

To be clear, suppose that you write a function:

int add(int a,int b) { return a+b;}

Here, we have the overhead of establishing a function frame and
destroying it, of passing the arguments and retrieving the result, which
is far bigger than the single instruction needed to actually do the
addition!

Of course, you will argue that the function will be inlined, but that is an
EXAMPLE of how things go wrong. In a convoluted regexp library most of
the time will be spent doing things other than matching.

Cholo Lennon

Aug 22, 2020, 9:49:12 PM
On 8/22/20 5:18 AM, bol...@nowhere.co.uk wrote:
> On Fri, 21 Aug 2020 19:53:59 -0300
> Cholo Lennon <cholo...@hotmail.com> wrote:
>> On 8/21/20 6:01 PM, jacobnavia wrote:
>>> On 21/08/2020 at 22:00, Cholo Lennon wrote:
>>>>
>>>> I suffered the same problem with clang/llvm outputting WASM. I needed
>>>> the class to parse some stuff in a smart contract (running inside the
>>>> EOS blockchain), but my surprise was huge when the size of the
>>>> contract increased from 40 Kb to 300 Kb, just for using std::regex. In
>>>> a constrained RAM environment like a blockchain, that was a no no. I
>>>> ended up using methods from std::string to parse the data. The
>>>> resulting code was awful, but the size stayed around 40 Kb.
>>>>
>>>
>>> The C library is 5636 bytes.
>>
>> Well, due to the way an EOS smart contract is built/used, introducing
>> external dependencies to EOS SDK (CDT)/C++17 (restricted, not all
>> standard library features are available) is not the ideal (and also the
>> problem was solved using std::string and some library algorithms). But
>> yes, any regex library (especially the one from standard library) would
>> have been better.
>
> Your problem is your compiling to web assembly. Serious programs do not
> run in a browser.
>

I am sorry, but I have to say that your bias and ignorance are huge... I
didn't say my program runs in a browser... this blockchain technology
(EOS, https://eos.io/ https://github.com/EOSIO) is built entirely in
C++17... the blockchain nodes are coded in native C++. Each node runs a
WASM virtual machine that executes smart contracts also coded in C++.
And, BTW, there are a lot of good applications running inside a browser,
even an operating system like Linux.

bol...@nowhere.co.uk

Aug 24, 2020, 4:36:24 AM
So what exactly is this "node", hardware or software and why does it run web
assembly inside a VM? That's just ridiculous. Blockchain calculations are
highly complex and need the fastest speed possible - ie a proper optimised
binary. This sounds like some mickey mouse toy setup.

>And, BTW, there are a lot of good applications running inside a browser,
>even an operating system like Linux.

At about 1/100 the speed of a native program.

Bonita Montero

Aug 24, 2020, 5:28:08 AM
> Sure, but in bloated code the "hot" spots are luke warm, ...

Source ?

Chris M. Thomasson

Aug 24, 2020, 5:59:39 AM
A tight personal parser can be a good thing. No Baggage.

Juha Nieminen

Aug 24, 2020, 6:47:16 AM
bol...@nowhere.co.uk wrote:
>>And, BTW, there are a lot of good applications running inside a browser,
>>even an operating system like Linux.
>
> At about 1/100 the speed of a native program.

You seem to still be living in the 90's. For quite a long time now most
browsers JIT-compile Javascript, making it approach native program
speeds. (It might not be exactly as fast as a highly-optimized
compiled C++ program, mainly due to the limitations of Javascript
itself, but it's not that far off either.)

In fact, Javascript can be *so* low-level bare-metal and
accurate-to-the-clock-cycle that the Spectre and Meltdown exploits,
which require an accurate clock cycle count between two instructions,
could be exploited in Javascript (which was the main reason why
those exploits were so extraordinarily and unusually serious, as you
could get exploited by simply visiting a web page, without the exploit
even having to rely on any sort of browser bug or security hole.)

And of course WebGL is javascript-based, and can run hardware-accelerated
3D games at almost native speed. Just make a search for such games and be
amazed at how fast they run inside the browser.

Your "at about 1/100 the speed of a native program" would need some serious
proof.

Juha Nieminen

Aug 24, 2020, 6:49:19 AM
Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
> A tight personal parser can be a good thing. No Baggage.

Writing a parser for even relatively simple input formats using C++98
(without any third-party libraries) can be quite laborious, time-consuming
and error-prone. Depending on the particular input format and how complex
the parsing needs to be, it can be even more laborious to make the parsing
even moderately efficient.

I know. I have experience.

bolta...@nuttyella.co.uk

Aug 24, 2020, 7:05:27 AM
On Mon, 24 Aug 2020 10:47:02 +0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>bol...@nowhere.co.uk wrote:
>>>And, BTW, there are a lot of good applications running inside a browser,
>>>even an operating system like Linux.
>>
>> At about 1/100 the speed of a native program.
>
>You seem to sill be living in the 90's. For quite a long time now most
>browsers JIT-compile Javascript, making it approach native program
>speeds. (It might not be exactly as fast as a highly-optimized
>compiled C++ program, mainly due to the limitations of Javascript
>itself, but it's not that far off either.)

Next you'll be telling me node.js is a superb server side system because of
its speed and efficiency. Note I have worked somewhere that used it - it isn't.

>In fact, Javascript can be *so* low-level bare-metal and
>accurate-to-the-clock-cycle that the Spectre and Meltdown exploits,

Any flaw in javascript is simply a flaw in the API presented to it via the
C++ code the browser is written in.

>which require an accurate clock cycle count between two instructions,
>could be exploited in Javascript (which was the main reason why
>those exploits were so extraordinarily and unusually serious, as you
>could get exploited by simply visiting a web page, without the exploit
>even having to rely on any sort of browser bug or security hole.)

The fact that the javascript API allows this unnecessary low
level activity shows the massive feature creep browsers have incurred.

>And of course WebGL is javascript-based, and can run hardware-accelerated
>3D games at almost native speed. Just make a search for such games and be
>amazed at how fast they run inside the browser.

I've played plenty of those games. Amazed isn't a word that occurred to me
tbh when seeing some basic texture mapped 3D shapes being manipulated on a
3GHz multi core machine with a dedicated GPU.


Cholo Lennon

Aug 24, 2020, 11:36:01 AM
On 8/24/20 5:36 AM, bol...@nowhere.co.uk wrote:
> On Sat, 22 Aug 2020 22:49:01 -0300
> Cholo Lennon <cholo...@hotmail.com> wrote:
>> On 8/22/20 5:18 AM, bol...@nowhere.co.uk wrote:
>>> Your problem is your compiling to web assembly. Serious programs do not
>>> run in a browser.
>>>
>>
>> I am sorry, but I have to say that your bias and ignorance is huge... I
>> didn't say my program runs in a browser... this blockchain technology
>> (EOS, https://eos.io/ https://github.com/EOSIO) is built entirely in
>> C++17... the blockchain nodes are coded in native C++. Each node runs a
>> WASM virtual machine that executes smart contracts also coded in C++.
>
> So what exactly is this "node", hardware or software and why does it run web
> assembly inside a VM? Thats just ridiculous. Blockchain calculations are
> highly complex and need the fastest speed possible - ie a proper optimised
> binary. This sounds like some mickey mouse toy setup.
>

Ridiculous? Mickey Mouse toy setup? C'mon, just read about it before
giving an opinion, you look like a troll. EOS is the fastest
blockchain *in the world* (why? because of the protocol used, because of
C++ and because of its highly performant WASM VM - 6/12x over other
implementations)

https://www.blocktivity.info/

EOS produces and validates blocks every 500 ms; in contrast, Ethereum
does the same every 15 seconds, and Bitcoin every 10 minutes! (Yes, when
you pay for something using bitcoins, you have to wait on average 10 minutes
for the transaction confirmation).
The VISA payment system manages almost 2000 transactions per second (TPS);
EOS is able to reach 5000 TPS now (it has an average of 3000). In
contrast, Ethereum and Bitcoin (which are not prepared for real time
transactions) manage 15 TPS and 7 TPS respectively. Ethereum is
re-thinking its architecture to solve this problem and Bitcoin has a
"patch", the "Lightning network" (a kind of parallel network of
"unverified" transactions).

EOS implements the Delegated Proof of Stake protocol (*) where a group
of selected nodes (21) are the only ones allowed to produce blocks.
Those 21 are voted by users and other producers. All smart contracts (in
all blockchains) run inside a virtual machine. Why? Because smart
contracts are "restricted/constrained" code, you can't do whatever you
want in a smart contract. In the case of EOS, the blockchain controls the
CPU, memory and security of the smart contract. Blocks are produced at
native level by a producer node, but the transactions are validated at
WASM level by a smart contract written by users (there are also system
smart contracts but those are part of the blockchain core). EOS is like
a distributed operating system. Actually the user smart contracts are
known as "dApps" (distributed apps).

EOS is not a toy, it's a serious platform.

(*) Not to be confused with the Proof of Work protocol (used by Ethereum and
Bitcoin among others), which requires a lot of CPU/energy (and, in order
to be productive, specialized hardware) to solve a crypto problem in
order to validate a block.

bol...@nuttyella.co.uk

Aug 24, 2020, 12:12:30 PM
On Mon, 24 Aug 2020 12:35:44 -0300
Cholo Lennon <cholo...@hotmail.com> wrote:
>On 8/24/20 5:36 AM, bol...@nowhere.co.uk wrote:
>> On Sat, 22 Aug 2020 22:49:01 -0300
>> Cholo Lennon <cholo...@hotmail.com> wrote:
>>> On 8/22/20 5:18 AM, bol...@nowhere.co.uk wrote:
>>>> Your problem is your compiling to web assembly. Serious programs do not
>>>> run in a browser.
>>>>
>>>
>>> I am sorry, but I have to say that your bias and ignorance is huge... I
>>> didn't say my program runs in a browser... this blockchain technology
>>> (EOS, https://eos.io/ https://github.com/EOSIO) is built entirely in
>>> C++17... the blockchain nodes are coded in native C++. Each node runs a
>>> WASM virtual machine that executes smart contracts also coded in C++.
>>
>> So what exactly is this "node", hardware or software and why does it run web
>> assembly inside a VM? Thats just ridiculous. Blockchain calculations are
>> highly complex and need the fastest speed possible - ie a proper optimised
>> binary. This sounds like some mickey mouse toy setup.
>>
>
>Ridiculous? Micky mouse toy setup? C'mon, just read about it before
>emitting an opinion, you look like a troll. EOS is the fastest
>blockchain *in the world* (why? because of the protocol used, because of
>C++ and because of its high performant WASM VM - 6/12x over other
>implementations)

TL;DR

Software that requires speed doesn't use web assembly. Its speed is no doubt
despite them using it, not because of it. They've obviously just written some
highly efficient algorithms.

Not that I care about bitcoin in the slightest, it's just another form of
financial speculation for the gullible and a currency for criminals.


Cholo Lennon

Aug 24, 2020, 12:20:19 PM
Troll detected

Chris M. Thomasson

Aug 24, 2020, 3:09:18 PM
Love WebGL. Here is some of my work:

http://funwithfractals.atspace.cc/ct_gfield_test/3d_user/ct_wormhole_exp.html

Vir Campestris

Aug 24, 2020, 4:54:08 PM
Well the guys at Microsoft told me that they compile with optimisation
for size. Because "A page fault can ruin your day".

I've got 32 GB on my dev machine, and 16 on this thing. If the code is
bigger I'll have less free to use as disc cache. Given that our compiled
code base is 144G I'm not going to fit it all into RAM any time soon.

Andy

Richard

Aug 24, 2020, 11:46:13 PM
[Please do not mail me a copy of your followup]

Cholo Lennon <cholo...@hotmail.com> spake the secret code
<rhp957$1jke$1...@gioia.aioe.org> thusly:

>I suffered the same problem with clang/llvm outputting WASM. I needed
>the class to parse some stuff in a smart contract (running inside the
>EOS blockchain), but my surprise was huge when the size of the contract
>increased from 40 Kb to 300 Kb, just for using std::regex. In a
>constrained RAM environment like a blockchain, that was a no no. I ended
>up using methods from std::string to parse the data. The resulting code
>was awful, but the size stayed around 40 Kb.

You may want to look at the compile-time regex library:
<https://github.com/hanickadot/compile-time-regular-expressions>
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

Bonita Montero

Aug 25, 2020, 1:21:26 AM
> Well the guys at Microsoft told me that they compile with optimisation
> for size. Because "A page fault can ruin your day".

We're talking about I-cache hit-rates and not about page-faults.

> I've got 32Gb on my dev machine, and 16 on this thing. If the code is
> bigger I'll have less free to use as disc cache. Given that our compiled
> code base is 144G I'm not going to fit it all into RAM any time soon.

144G code in memory - LOL. Even Oracle isn't that large.

Christian Gollwitzer

Aug 25, 2020, 2:23:57 AM
On 25.08.20 at 05:46, Richard wrote:
>
> You may want to look at the compile-time regex library:
> <https://github.com/hanickadot/compile-time-regular-expressions>

Wow, this looks really good. Compiling is slow, but the generated code
is very compact (the links to godbolt result in something like 100
instructions for one of the demo functions).

Christian



Richard

Aug 25, 2020, 2:29:51 AM
[Please do not mail me a copy of your followup]

Christian Gollwitzer <auri...@gmx.de> spake the secret code
<ri2apk$mip$1...@dont-email.me> thusly:
Yes, it's quite neat. Compile time regex requires a modern compiler as
it has a dependency on some compiler extensions that have been proposed for
standardization but not yet adopted. If that makes it infeasible to
use (e.g. in the embedded scenario where you may be reliant on an
externally supplied toolchain), then you might want to look at the
boost.spirit parsing library. Matching a regex is essentially a
parsing problem. With boost.spirit, you can create parsers that are
very efficient in terms of code space and runtime. Boost.spirit v2
has been around a long time and supports older compilers. V3 assumes
C++11 IIRC and requires more modern compilers for some features.

Juha Nieminen

Aug 25, 2020, 2:56:51 AM
bolta...@nuttyella.co.uk wrote:
>>> At about 1/100 the speed of a native program.
>>
>>You seem to sill be living in the 90's. For quite a long time now most
>>browsers JIT-compile Javascript, making it approach native program
>>speeds. (It might not be exactly as fast as a highly-optimized
>>compiled C++ program, mainly due to the limitations of Javascript
>>itself, but it's not that far off either.)
>
> Next you'll be telling me node.js is a superb server side system because of
> its speed and efficiency. Note I have worked somewhere that used it - it isn't.

I notice that you didn't actually refute what I said.

>>In fact, Javascript can be *so* low-level bare-metal and
>>accurate-to-the-clock-cycle that the Spectre and Meltdown exploits,
>
> Any flaw in javscript is simply a flaw in the API presented to it via the
> C++ code the browser is written in.
>
>>which require an accurate clock cycle count between two instructions,
>>could be exploited in Javascript (which was the main reason why
>>those exploits were so extraordinarily and unusually serious, as you
>>could get exploited by simply visiting a web page, without the exploit
>>even having to rely on any sort of browser bug or security hole.)
>
> The fact that the javascript API allows this shows the unnecessary low
> level activity shows the massive feature creep browsers have incurred.

And you are still not refuting what I said.

So, is a program running in a browser "1/100 the speed of a native program"
or not? Or are you just going to dodge some more?

bol...@nuttyella.co.uk

Aug 25, 2020, 4:39:38 AM
How tediously predictable and pathetic. Anyone who has an opinion that differs
from the hipster groupthink is a troll. Whatever.

bol...@nuttyella.co.uk

Aug 25, 2020, 4:42:14 AM
On Mon, 24 Aug 2020 12:09:06 -0700
"Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
Very pretty. Here's an arcade game that did something similar in 1983 using
a 1.5 MHz 6809 CPU:

https://en.wikipedia.org/wiki/Star_Wars_(1983_video_game)

bol...@nuttyella.co.uk

Aug 25, 2020, 4:44:35 AM
On Tue, 25 Aug 2020 06:56:41 +0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>bolta...@nuttyella.co.uk wrote:
>> The fact that the javascript API allows this shows the unnecessary low
>> level activity shows the massive feature creep browsers have incurred.
>
>And you are still not refuting what I said.
>
>So, is a program running in a browser "1/100 the speed of a native program"
>or not? Or are you just going to dodge some more?

It was exaggeration to make a point, as you well know. But my experience of
node.js in a production environment compared to a C/C++ equivalent is that
the latter runs at approx 4-5x the speed. I don't see why a browser would be
any different.

Öö Tiib

Aug 25, 2020, 12:30:01 PM
Yes, the math hasn't changed, but the performance has. A few lines on a
320 x 200 pixel, 16 color screen at 6 frames per second. You now get 100
times more lines on a 6 x 6 times bigger screen that is 8 times more
colorful at a 10 times faster rate. So 100 * 6 * 6 * 8 * 10 ... 288 000
times more processing is needed. But as GPUs are in everything now, the
WebGL of Chris can just run in a browser window on one of the screens attached
to my laptop and it does not affect the productivity of anything in any
manner.

Cholo Lennon

Aug 25, 2020, 8:50:14 PM
On 8/25/20 12:46 AM, Richard wrote:
> [Please do not mail me a copy of your followup]
>
> Cholo Lennon <cholo...@hotmail.com> spake the secret code
> <rhp957$1jke$1...@gioia.aioe.org> thusly:
>
>> I suffered the same problem with clang/llvm outputting WASM. I needed
>> the class to parse some stuff in a smart contract (running inside the
>> EOS blockchain), but my surprise was huge when the size of the contract
>> increased from 40 Kb to 300 Kb, just for using std::regex. In a
>> constrained RAM environment like a blockchain, that was a no no. I ended
>> up using methods from std::string to parse the data. The resulting code
>> was awful, but the size stayed around 40 Kb.
>
> You may want to look at the compile-time regex library:
> <https://github.com/hanickadot/compile-time-regular-expressions>
>

Nice library, thanks :-) I'll be testing it.

bol...@nuttyella.co.uk

Aug 26, 2020, 3:53:57 AM
On Tue, 25 Aug 2020 09:29:49 -0700 (PDT)
Öö Tiib <oot...@hot.ee> wrote:
>On Tuesday, 25 August 2020 11:42:14 UTC+3, bol...@nuttyella.co.uk wrote:
>> On Mon, 24 Aug 2020 12:09:06 -0700
>> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>> >Love WebGL. Here is some of my work:
>> >
>>
>>http://funwithfractals.atspace.cc/ct_gfield_test/3d_user/ct_wormhole_exp.html
>>
>> Very pretty. Here's an arcade game that did something similar in 1983 using
>> a 1.5Mhz 6809 CPU:
>>
>> https://en.wikipedia.org/wiki/Star_Wars_(1983_video_game)
>
>Yes, the math hasn't changed but the performance. Few lines on 320 x 200
>pixels and 16 colors screen 6 frames per second. You now get 100

It was vector graphics, not raster. And I assume you mean 60 fps.

>times more lines on 6 x 6 times bigger screen that are 8 times more
>colorful at 10 times faster rate. So 100 * 6 * 6 * 8 * 10 ... 288 000
>times more processing needed. But as in everything there are GPUs that

Do you really think the graphics in his example were 288K times better?


Chris M. Thomasson

Aug 26, 2020, 4:52:23 PM
;^) Try to get the following running at 60 fps on a 1.5 MHz 6809 CPU:

https://www.shadertoy.com/view/XtscDl

Ian Collins

Aug 26, 2020, 6:11:35 PM
Nice!

--
Ian.

bol...@nuttyella.co.uk

Aug 27, 2020, 4:14:31 AM
You're going to have to do better than that. I've seen similar stuff in the
90s assembly demo scene running on 486s and less. Here's the '93 winner:

https://www.youtube.com/watch?v=rFv7mHTf0nA

How about somebody post something that is actually impressive for a 3 GHz,
4/8 core modern PC with a dedicated GPU that is written in wasm.

Chris M. Thomasson

Aug 27, 2020, 4:18:51 AM
On 8/27/2020 1:14 AM, bol...@nuttyella.co.uk wrote:
> On Wed, 26 Aug 2020 13:52:12 -0700
> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>> On 8/26/2020 12:53 AM, bol...@nuttyella.co.uk wrote:
>>> Do you really think the graphics in his example were 288K times better?
>>>
>>>
>>
>> ;^) Try to get the following running at 60 fps on a 1.5 MHz 6809 CPU:
>>
>> https://www.shadertoy.com/view/XtscDl
>
> You're going to have to do better than that. I've seen similar stuff in the
> 90s assembly demo scene running on 486s and less. Here's the '93 winner:

Well the demo scene are total pros! I bow down to their skills. They are
great!

Chris M. Thomasson

Aug 27, 2020, 4:20:57 AM
On 8/27/2020 1:14 AM, bol...@nuttyella.co.uk wrote:
Fwiw, here is a favorite of mine:

https://youtu.be/Gm0NWMdcbmU

Chris M. Thomasson

Aug 27, 2020, 4:22:38 AM
On 8/27/2020 1:14 AM, bol...@nuttyella.co.uk wrote:
Here is another fav:

https://youtu.be/T_U3Zdv8to8

bol...@nuttyella.co.uk

Aug 27, 2020, 4:38:37 AM
>Here is another fav:
>
>https://youtu.be/T_U3Zdv8to8

To fit something like those in 64K (even though I assume it doesn't include
the DLLs required) is just staggering. These guys really are programming
geniuses.

The music from the 2nd one sounds familiar.

Chris M. Thomasson

Aug 27, 2020, 4:53:51 AM
On 8/27/2020 1:38 AM, bol...@nuttyella.co.uk wrote:
> On Thu, 27 Aug 2020 01:20:49 -0700
> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>> Fwiw, here is a favorite of mine:
>>
>> https://youtu.be/Gm0NWMdcbmU
>
>> Here is another fav:
>>
>> https://youtu.be/T_U3Zdv8to8
>
> To fit something like those in 64K (even though I assume it doesn't include
> the DLLs required) is just staggering. These guys really are programming
> geniuses.

Indeed! Big time.

Chris M. Thomasson

Aug 27, 2020, 5:43:32 AM
Fwiw, check this out, a little experiment:

https://youtu.be/P_lAP4IiYyE

bol...@nuttyella.co.uk

Aug 27, 2020, 6:24:10 AM
On Thu, 27 Aug 2020 02:43:20 -0700
"Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>On 8/27/2020 1:53 AM, Chris M. Thomasson wrote:
>> On 8/27/2020 1:38 AM, bol...@nuttyella.co.uk wrote:
>>> On Thu, 27 Aug 2020 01:20:49 -0700
>>> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>>>> Fwiw, here is a favorite of mine:
>>>>
>>>> https://youtu.be/Gm0NWMdcbmU
>>>
>>>> Here is another fav:
>>>>
>>>> https://youtu.be/T_U3Zdv8to8
>>>
>>> To fit something like those in 64K (even though I assume it doesn't
>>> include
>>> the DLLs required) is just staggering. These guys really are programming
>>> geniuses.
>>
>> Indeed! Big time.
>
>Fwiw, check this out, a little experiment:
>
>https://youtu.be/P_lAP4IiYyE

Not really in the same league tbh :)

Juha Nieminen

Aug 27, 2020, 12:15:31 PM
bol...@nuttyella.co.uk wrote:
> On Wed, 26 Aug 2020 13:52:12 -0700
> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>>On 8/26/2020 12:53 AM, bol...@nuttyella.co.uk wrote:
>>> Do you really think the graphics in his example were 288K times better?
>>>
>>>
>>
>>;^) Try to get the following running at 60 fps on a 1.5 MHz 6809 CPU:
>>
>>https://www.shadertoy.com/view/XtscDl
>
> You're going to have to do better than that. I've seen similar stuff in the
> 90s assembly demo scene running on 486s and less. Here's the '93 winner:
>
> https://www.youtube.com/watch?v=rFv7mHTf0nA

Running at 320x200 resolution, 256 colors.

Let's see the same 486 run the same effects in 1920x1080 32-bit color,
60 Hz.

Of course you can run almost anything if you lower the resolution
and color depth enough, use untextured polygons, and so on.
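
To put rough numbers on that, here is a back-of-the-envelope sketch that only counts raw framebuffer bytes (it says nothing about the cost of the effects themselves). The helper name frame_bytes is made up for this example.

```cpp
#include <cstdint>

// Bytes needed for one frame at a given resolution and color depth.
constexpr std::uint64_t frame_bytes(std::uint64_t width,
                                    std::uint64_t height,
                                    std::uint64_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

// 320x200 with 256 colors: one byte per pixel.
constexpr std::uint64_t demo_frame = frame_bytes(320, 200, 1);
// 1920x1080 with 32-bit color: four bytes per pixel.
constexpr std::uint64_t modern_frame = frame_bytes(1920, 1080, 4);
// Bytes per second for the 1920x1080 mode at 60 Hz.
constexpr std::uint64_t modern_per_second = modern_frame * 60;

static_assert(demo_frame == 64000, "320 * 200 * 1");
static_assert(modern_frame == 8294400, "1920 * 1080 * 4");
static_assert(modern_per_second == 497664000, "8294400 * 60");
```

That is about 130 times more bytes per frame, before the frame rate is even taken into account.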

Chris M. Thomasson

Aug 27, 2020, 2:07:25 PM
Yeah. I can do better! :^)

Chris M. Thomasson

Aug 27, 2020, 2:39:00 PM
Here is another one of my favorites:

https://youtu.be/hIItHzm9hzc

Vir Campestris

Aug 27, 2020, 5:05:30 PM
On 25/08/2020 06:21, Bonita Montero wrote:
>> Well the guys at Microsoft told me that they compile with optimisation
>> for size. Because "A page fault can ruin your day".
>
> We're talking about I-cache hit-rates and not about page-faults.

Doesn't matter. Make the code smaller and it will fit in RAM and not be
paged to disk. Or fit in the L2 cache, and not in main store. Or the L1
cache.

Andy

Bonita Montero

Aug 27, 2020, 11:42:03 PM
> Doesn't matter. Make the code smaller and it will fit in RAM and not be
> paged to disk. Or fit in the L2 cache, and not in main store. Or the L1
> cache.

You usually get I-cache hit-rates > 95% because the I-cache covers the
hot-spots, even with large code.

bol...@nuttyella.co.uk

Aug 28, 2020, 5:02:56 AM
On Thu, 27 Aug 2020 16:15:12 +0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>bol...@nuttyella.co.uk wrote:
>> On Wed, 26 Aug 2020 13:52:12 -0700
>> "Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
>>>On 8/26/2020 12:53 AM, bol...@nuttyella.co.uk wrote:
>>>> Do you really think the graphics in his example were 288K times better?
>>>>
>>>>
>>>
>>>;^) Try to get the following running at 60 fps on a 1.5 MHz 6809 CPU:
>>>
>>>https://www.shadertoy.com/view/XtscDl
>>
>> You're going to have to do better than that. I've seen similar stuff in the
>> 90s assembly demo scene running on 486s and less. Here's the '93 winner:
>>
>> https://www.youtube.com/watch?v=rFv7mHTf0nA
>
>Running at 320x200 resolution, 256 colors.
>
>Let's see the same 486 run the same effects in 1920x1080 32-bit color,
>60 Hz.

You think a 486 couldn't do 60 Hz? Anyway, so what? The point still stands
that similar graphics could be done 25 years ago with far less powerful
machines. No one has yet disproved my point that web assembly is just another
piss poor cycle sucking VM that sacrifices performance on the altar of
portability.

>Of course you can run almost anything if you lower the resolution
>and color depth enough, use untextured polygons, and so on.

His example wasn't of anything that involved polygons.

Juha Nieminen

Aug 28, 2020, 12:14:54 PM
bol...@nuttyella.co.uk wrote:
>>Running at 320x200 resolution, 256 colors.
>>
>>Let's see the same 486 run the same effects in 1920x1080 32-bit color,
>>60 Hz.
>
> You think a 486 couldn't do 60 Hz?

Not at 1920x1080 32-bit color, no.

Maybe at 0.1 Hz. If even that.

> Anyway, so what? The point still stands
> that similar graphics

You have a strange definition of "similar".

> could be done 25 years ago with far less powerful
> machines.

Not at the same resolution, same color depth, same framerate.

Sure, if you lower the resolution enough, almost anything can run
almost anything. I can create a fully path-traced scene using global
illumination, dynamic shadows and volumetric lighting on a NES...
at 1x1 resolution 2 colors. Clearly the NES is as powerful at
graphics as modern PCs.

bol...@nuttyella.co.uk

Aug 29, 2020, 10:51:20 AM
On Fri, 28 Aug 2020 16:14:39 +0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>bol...@nuttyella.co.uk wrote:
>>>Running at 320x200 resolution, 256 colors.
>>>
>>>Let's see the same 486 run the same effects in 1920x1080 32-bit color,
>>>60 Hz.
>>
>> You think a 486 couldn't do 60 Hz?
>
>Not at 1920x1080 32-bit color, no.
>
>Maybe at 0.1 Hz. If even that.

You'd be surprised what a 486 could do when programmed properly. I had
640x400, 16-bit full motion video running on mine back in the 90s with a
humble Matrox Millennium video card. And plenty of early set-top TV boxes
actually ran a CPU core much less powerful, albeit with hardware decoding.

>> Anyway, so what? The point still stands
>> that similar graphics
>
>You have a strange definition of "similar".

Similar type of demo.

>Sure, if you lower the resolution enough, almost anything can run
>almost anything. I can create a fully path-traced scene using global
>illumination, dynamic shadows and volumetric lighting on a NES...
>at 1x1 resolution 2 colors. Clearly the NES is as powerful at
>graphics as modern PCs.

Now you're just being stupid. My point was that that rather humble demo of how
good web assembly supposedly is was nothing of the sort.

Chris M. Thomasson

Sep 2, 2020, 11:44:29 PM
Check this out:

https://youtu.be/BUbMJJyj7AU

https://youtu.be/P_lAP4IiYyE

My spider attractor...