Any Unicode Experts?

FR

Aug 6, 2021, 10:40:53 AM

ASCII is dead as it should be. It goes back to the telegraph era
(i.e. Morse code). The first 32 bytes (except EOL, LF, and CR)
of ASCII are transmission control codes that are no longer relevant.

ASCII has been incorporated first into ISO-8859-1 and now into
UTF-8, which is (or should be) the global standard.

But what happened to the first 32 bytes of UTF-8? Obviously
LF and CR are still there, but have the former control codes
been replaced by other, and more meaningful, code points?

As far as I know, the first 32 bytes (except CR and LF) are
just dead space. Is this correct?

FR

Aug 6, 2021, 11:01:56 AM

On Fri, 06 Aug 2021 14:51:25 +0000, Lew Pitcher wrote:

> On Fri, 06 Aug 2021 14:39:50 +0000, FR wrote:
>
> [opinions elided]

>
>> But what happened to the first 32 bytes of UTF-8? Obviously
>> LF and CR are still there, but have the former control codes
>> been replaced by other, and more meaningful, code points?
>

> See https://www.unicode.org/charts/PDF/U0000.pdf
> Also https://www.unicode.org/charts/PDF/U0080.pdf

>
>> As far as I know, the first 32 bytes (except CR and LF) are
>> just dead space. Is this correct?
>

> Not in the least.
>

A comprehensive answer is here:

https://www.aivosto.com/articles/control-characters.html

To preserve the integrity of conversions and interchange,
the control codes remain.

However, the semantics of the control codes are reserved
to applications.

So the control codes are in a kind of limbo. They have
only an application-specific meaning but in the absence
of an application they are to be interpreted according
to ISO/IEC 6429:1992. (Unicode 9.0 p. 822)

Jeff-Relf.Me

Aug 6, 2021, 11:15:44 AM

Fabian Russell:
> As far as I know, the first 32 [ code points ] (except CR and LF)

> are just dead space. Is this correct ?

No, US-ASCII is a subset of the set of Unicode code points.

Unicode code points fit in 21 bits ( U+0000 to U+10FFFF ), not 32.

typedef wchar_t wchar ; typedef char32_t Uchar ;

// UTF-16 requires a "Surrogate Pair" ( 2 wchars, assuming a 16 bit wchar_t as on Windows )
// to encode one UTF-32 (Uchar) code point above U+FFFF.

Uchar _i32, i32 = U'⛏'; wchar pCh[2];

#define i32_to_pCh( Ch, pCh ) ( pCh[1] = 0, Ch >= 0x10000 ? ( *pCh = 0xd800 | Ch - 0x10000 >> 10, pCh[1] = 0xdc00 + ( Ch & 0x03FF ) ) : ( *pCh = Ch ) )

i32_to_pCh( i32, pCh );

// Only a leading ( high ) surrogate, 0xD800 to 0xDBFF, starts a pair.
int SurrogatePair = ( *pCh & 0xFC00 ) == 0xD800 ;

_i32 = !SurrogatePair ? *pCh : 0x10000 + ( *pCh - 0xD800 << 10 ) + pCh[1] - 0xDC00 ;
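
For readers who want to try the surrogate-pair arithmetic outside of macros, here is a minimal sketch in plain C. The function names `utf16_encode` and `utf16_decode` are made up for this illustration, and the decoder assumes well-formed input; it is not anyone's production code.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one code point as UTF-16.
   Returns the number of 16-bit units written (1 or 2). */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Valid scalar values only: up to U+10FFFF, surrogates excluded. */
    assert(cp <= 0x10FFFF && (cp < 0xD800 || cp > 0xDFFF));
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20-bit offset */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}

/* Decode one code point from UTF-16 (assumes well-formed input). */
uint32_t utf16_decode(const uint16_t in[2])
{
    if ((in[0] & 0xFC00) != 0xD800)              /* not a high surrogate */
        return in[0];
    return 0x10000 + (((uint32_t)in[0] - 0xD800) << 10)
                   + ((uint32_t)in[1] - 0xDC00);
}
```

Note that U+26CF (the pick character above) is below U+10000 and so takes a single unit; something like U+1F600 is needed to exercise the two-unit path.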

R.Wieser

Aug 6, 2021, 11:36:53 AM

Jeff,

>> As far as I know, the first 32 [ code points ] (except CR and LF)
>> are just dead space. Is this correct ?
>
> No, US-ASCII is a subset of the set of Unicode code points.

Try to understand the question before you answer it.

@Fabian Russell:

> As far as I know, the first 32 [ code points ] (except CR and LF)
> are just dead space. Is this correct ?

No. All of those have definite meanings. Just think of the BACKSPACE
(0x08), TAB (0x09), ESCape (0x1B) and BELL (0x07) characters. One you
might never have heard of is ctrl-Z (0x1A), which, for old-style textfiles,
means that the file ends (even when it may contain some more
characters/garbage).

For a full list you could take a peek here : http://asciiset.com/

Granted, quite a few of the control-characters there you will never use.
They are still there if you need them though.
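
As a quick reference for the codes mentioned above, the following sketch maps each control code to its usual abbreviation and back. The names `control_name` and `control_code` are hypothetical helpers invented for this example; the abbreviations themselves are the standard ASCII ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standard abbreviations for the control codes 0x00..0x1F. */
static const char *const C0_NAMES[32] = {
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US"
};

/* Return the abbreviation for a control code, or NULL for any other byte. */
const char *control_name(unsigned char c)
{
    if (c < 0x20)  return C0_NAMES[c];
    if (c == 0x7F) return "DEL";
    return NULL;
}

/* Reverse lookup: return the code for an abbreviation such as "ESC", or -1. */
int control_code(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "DEL") == 0) return 0x7F;
    for (int i = 0; i < 32; i++)
        if (strcmp(name, C0_NAMES[i]) == 0) return i;
    return -1;
}
```

With this table, ctrl-Z (0x1A) comes out as SUB and 0x07 as BEL.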

Regards,
Rudy Wieser

P.s.
Did you also think of the DEL (0x7F) character ? Just as the BACKSPACE
character is used to delete the character left of the cursor, the DEL
character is meant to delete the character under the cursor.

FR

Aug 6, 2021, 11:49:51 AM

On Fri, 06 Aug 2021 17:36:37 +0200, R.Wieser wrote:

>
>> As far as I know, the first 32 [ code points ] (except CR and LF)
>> are just dead space. Is this correct ?
>
> No. All of those have definite meanings.
>

According to the link that I cited, in Unicode the meanings,
outside of an application, are not always the same as the original:

"Unicode specifies semantics for the following control characters:

ASCII control characters:

HT and SP are considered whitespace.

LF, VT, FF and CR are considered whitespace, and also mandatory line breaks
in the line breaking algorithm.

FS, GS, RS and US are considered separators in the bi-directional algorithm."

Thus if I encounter an HT control code in some UTF-8 file
then I should process that code as whitespace.

But a specific application may choose to use the HT control
code for some other purpose.
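
A minimal sketch of the default classification described in the quoted list, for the case where no application-specific meaning applies. The function names are invented for this example, and it covers only the characters the quote names, not the full Unicode property tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Whitespace per the quoted list: HT, LF, VT, FF, CR (0x09..0x0D) and SP. */
bool is_listed_whitespace(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp == 0x20 || (cp >= 0x09 && cp <= 0x0D);
}

/* Mandatory line breaks per the quoted list: LF, VT, FF, CR. */
bool is_listed_line_break(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp >= 0x0A && cp <= 0x0D;
}

/* Bidi separators per the quoted list: FS, GS, RS, US (0x1C..0x1F). */
bool is_listed_separator(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp >= 0x1C && cp <= 0x1F;
}
```

So an HT in a UTF-8 file counts as whitespace but not as a line break, which matches the reading above.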

R.Wieser

Aug 6, 2021, 12:48:54 PM

FR,

> According to the link that I cited

..grumble ... Jeff-Relf changed the subject and with it made it look, in my
chronological list, as if it was a new thread. Only now I notice your "Any
Unicode Experts?" post.

Alas, although I do know a bit about ASCII, I never bothered to know much
about Unicode.

> HT and SP are considered whitespace.

...

> Thus if I encounter an HT control code in some UTF-8
> file then I should process that code as whitespace.

That fully depends on what you mean by "processed".

For "scanning the string" purposes ? Yes, you can handle both in the same
way (just like you may do with almost all of the other control
characters, possibly even including CR and LF [1] ).

For *displaying* purposes ? Nope, both have a different effect.

[1] if you have access to a C{something} compiler you could take a peek at
how fscanf and printf deal with those control characters. IIRC fscanf for
a value will ignore CR and LF (which is not always what you want...)
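
The scanning behaviour in footnote [1] can be checked with a few lines of C. This sketch uses `sscanf` rather than `fscanf` so that it needs no input file; both follow the same conversion rules. The wrapper name `parse_two_ints` is made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Parse two integers from text; returns sscanf's count of converted values. */
int parse_two_ints(const char *text, int *a, int *b)
{
    assert(text != NULL && a != NULL && b != NULL);
    /* "%d" skips any leading whitespace first, and in the C locale
       that means HT, LF, VT, FF and CR as well as the space. */
    return sscanf(text, "%d%d", a, b);
}
```

Calling it on "12\r\n\t 34" converts both numbers, so the CR, LF and HT between them are silently dropped.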

Regards,
Rudy Wieser

FR

Aug 6, 2021, 1:58:22 PM

On Fri, 06 Aug 2021 17:25:11 +0000, Charlie Gibbs wrote:

> On 2021-08-06, FR <f...@random.info> wrote:
>
>> On Fri, 06 Aug 2021 17:36:37 +0200, R.Wieser wrote:
>>
>

> To sum up, there is whitespace and there is whitespace. The
> differences between the various characters may be subtle, but
> that doesn't mean that you should treat them all the same.
> Each has its own special use.
>

Whitespace can be independent of format. When doing a search,
for example, a whitespace char is simply skipped over.

Formatting is also something that is independent of a particular
character set. Formatting is achieved by an application such
as a word processor, typesetter, etc. that would use its own
internal codes and not that of the charset.

Although the old ASCII format codes are still present in
Unicode they, aside from EOL, should not be interpreted as such
when processing a UTF-8 text file, for example. In Unicode
their interpretation is ambiguous. Anyone composing with UTF-8
would never use them.

Unicode does have many unambiguous formatting characters such as
U+2028, line separator, and U+2029, paragraph separator, but, not
being an expert in Unicode, I wouldn't know how or where to use them.

Unicode is supposed to provide the symbols but not the formatting
instructions. Some languages might require special codes for a
proper construction, but this is not the same as formatting.

Rich

Aug 6, 2021, 2:10:51 PM

The first 128 Unicode code points (U+0000 to U+007F) are identical to
the ASCII code points that predated Unicode. All of ASCII exists as the
first 128 Unicode code points.
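
To make that concrete for UTF-8 in particular, here is a small encoder sketch. The name `utf8_encode` is just an illustrative helper, not a library function. Every code point below U+0080, control codes included, comes out as the same single byte it had in ASCII.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one code point as UTF-8.
   Returns the number of bytes written (1 to 4). */
int utf8_encode(uint32_t cp, unsigned char out[4])
{
    /* Valid scalar values only: up to U+10FFFF, surrogates excluded. */
    assert(cp <= 0x10FFFF && (cp < 0xD800 || cp > 0xDFFF));
    if (cp < 0x80) {                     /* ASCII, control codes included */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, U+0017 (ETB) is the single byte 0x17, while U+2028 (line separator) needs the three bytes E2 80 A8.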

Eli the Bearded

Aug 6, 2021, 4:27:58 PM

(Follow-up to comp.os.linux.advocacy ignored.)

In comp.os.linux.misc, FR <f...@random.info> wrote:
> ASCII is dead as it should be. It goes back to the telegraph era
> (i.e. Morse code). The first 32 bytes (except EOL, LF, and CR)
> of ASCII are transmission control codes that are no longer relevant.

...

> As far as I know, the first 32 bytes (except CR and LF) are
> just dead space. Is this correct?

No, absolutely not.

Null and horizontal tab get a lot of use still. Vertical tab and
form feed get a little. Arguably escape and backspace get used (eg
terminal control sequences). Whatever the status of the rest vis-a-vis
newly composed documents, their meanings remain unchanged for historical
documents.

Unicode, covering Linear A (U+10600 to U+1077F, in use from 1800 to 1450
BC) to Signwriting (U+1D800 to U+1DAAF, a generalized notation for
writing down sign languages) aims to include modern and historical
content. Just because you should not expect to generate new content
with start of heading (SOH) or end transmission block (ETB), doesn't
mean those characters are dead space.

Elijah
------
has a soft spot for ␗ U+0017

The Natural Philosopher

Aug 7, 2021, 4:37:02 AM

On 06/08/2021 21:27, Eli the Bearded wrote:
> (Follow-up to comp.os.linux.advocacy ignored.)
>
> In comp.os.linux.misc, FR <f...@random.info> wrote:
>> ASCII is dead as it should be. It goes back to the telegraph era
>> (i.e. Morse code). The first 32 bytes (except EOL, LF, and CR)
>> of ASCII are transmission control codes that are no longer relevant.
> ...
>> As far as I know, the first 32 bytes (except CR and LF) are
>> just dead space. Is this correct?
>
> No, absolutely not.
>
> Null and horizontal tab get a lot of use still. Vertical tab and
> form feed get a little. Arguably escape and backspace get used (eg
> terminal control sequences). Whatever the status of the rest vis-a-vis
> newly composed documents, their meanings remain unchanged for historical
> documents.
>

Ctrl-Z and Ctrl-C and Ctrl-V have a lot of use too

> Unicode, covering Linear A (U+10600 to U+1077F, in use from 1800 to 1450
> BC) to Signwriting (U+1D800 to U+1DAAF, a generalized notation for
> writing down sign languages) aims to include modern and historical
> content. Just because you should not expect to generate new content
> with start of heading (SOH) or end transmission block (ETB), doesn't
> mean those characters are dead space.
>

Indeed not. They are handy as with all 'out of band' text characters as
control characters on communications streams that could conceivably be
useful again sometime in the future

> Elijah
> ------
> has a soft spot for ␗ U+0017
>

--
Future generations will wonder in bemused amazement that the early
twenty-first century’s developed world went into hysterical panic over a
globally average temperature increase of a few tenths of a degree, and,
on the basis of gross exaggerations of highly uncertain computer
projections combined into implausible chains of inference, proceeded to
contemplate a rollback of the industrial age.

Richard Lindzen

Eli the Bearded

Aug 8, 2021, 12:32:59 AM

In comp.os.linux.misc, The Natural Philosopher <t...@invalid.invalid> wrote:
> On 06/08/2021 21:27, Eli the Bearded wrote:
> > Null and horizontal tab get a lot of use still. Vertical tab and
> > form feed get a little. Arguably escape and backspace get used (eg
> > terminal control sequences). Whatever the status of the rest vis-a-vis
> > newly composed documents, their meanings remain unchanged for historical
> > documents.
> Ctrl-Z and Ctrl-C and Ctrl-V have a lot of use too

I type those, and others like ctrl-W, ctrl-U, ctrl-F, ctrl-B, ctrl-D,
ctrl-R, etc, reasonably often, but I don't encounter them in files,
unlike the ones I named. (I have checked-in files at github with embedded
terminal control sequences.) So those _feel_ different. The stty
settings are just defaults, and can be changed. There's nothing special
about the characters. The ones I use in vi, not in stty, are not
changeable, but are based on mnemonics instead of ASCII meaning. I use
ctrl-F for "forward page" not for ACK.

The stty ones:
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z;
rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O;
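
The `^C`, `^\` and `^?` spellings in that listing are caret notation: the control code with bit 0x40 flipped. A rough sketch of the mapping in both directions follows; `caret_notation` and `ctrl_key` are hypothetical names for this example, and the code does not query the terminal itself.

```c
#include <assert.h>
#include <stddef.h>

/* Write the caret notation for a control code into out, e.g. "^C" or "^?".
   Returns 1 for a control code; returns 0 (and an empty string) otherwise. */
int caret_notation(unsigned char c, char out[3])
{
    assert(out != NULL);
    out[0] = '\0';
    if (c >= 0x20 && c != 0x7F)
        return 0;
    out[0] = '^';
    out[1] = (char)(c ^ 0x40);    /* 0x03 -> 'C', 0x1C -> '\\', 0x7F -> '?' */
    out[2] = '\0';
    return 1;
}

/* The byte a terminal sends for Ctrl plus a letter: keep the low five bits. */
unsigned char ctrl_key(char letter)
{
    return (unsigned char)(letter & 0x1F);
}
```

So ctrl-F sends 0x06, which ASCII calls ACK, even though vi treats it as "forward page".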

Of those that have settings, only discard do I not use. Stop, I use, but
usually by accident and then I have to start again. I'd like that one to
default to undef, but I log in to too many computers per day to bother
trying to manually set it. Reprint, escape next (lnext), and quit I use
fairly rarely, but when I need them there's no substitute.

Oh, wait, you're thinking of Windows short cuts. I don't think they are
meant to map to a character _ever_. Those are just key press events
caught by the OS / program, same as command keys in Macs.

Ctrl-C for Copy I think exists because Command-C for copy first existed.
Neither has anything to do with the ASCII control character ETX.

> Indeed not. They are handy as with all 'out of band' text characters as
> control characters on communications streams that could conceivably be
> useful again sometime in the future

I like tab separated values more than comma separated values, because
with rare exceptions[*], tabs in text can be converted to spaces without
loss of meaning, so I can have a separator that never needs quoting. But
using ASCII RS (ctrl-^) for its intended purpose as a record separator
just feels obsolete and wrong to me, even if it is even less likely to
appear in the sort of things I put into TSV files.

[*] And the sorts of things that do need raw tabs, like Makefiles, I
don't put in TSV files.
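
For comparison, this is roughly what using the separators for their intended purpose looks like: RS (0x1E) between records and US (0x1F) between fields, so that commas and tabs inside a field need no quoting. The helper `split_on` is invented for this sketch and splits the buffer in place.

```c
#include <assert.h>
#include <stddef.h>

#define RS '\x1E'   /* record separator: between records */
#define US '\x1F'   /* unit separator: between fields    */

/* Split text in place on sep, overwriting each separator with '\0'.
   Stores up to max part pointers; returns the number of parts found. */
size_t split_on(char *text, char sep, char *parts[], size_t max)
{
    assert(text != NULL && parts != NULL);
    size_t n = 0;
    char *start = text;
    for (char *p = text; ; p++) {
        if (*p == sep || *p == '\0') {
            int last = (*p == '\0');
            *p = '\0';
            if (n < max)
                parts[n] = start;
            n++;
            if (last)
                break;
            start = p + 1;
        }
    }
    return n;
}
```

A caller would first split the data on RS to get records, then split each record on US to get its fields.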

Elijah
------
in two minutes of testing could not get ^O (discard) to do anything

rbowman

Aug 8, 2021, 12:50:13 AM

On 08/07/2021 10:32 PM, Eli the Bearded wrote:
> Ctrl-C for Copy I think exists because Command-C for copy first existed.
> Neither has anything to do with the ASCII control character ETX.

I'm probably wrong but I think WordStar had an influence.

Rich

Aug 8, 2021, 1:27:04 PM

Except with WordStar, cut and copy were two character sequences (^K^C
^K^X for copy and cut, and ^K^V for paste). The second character was
the same, but they were not single key presses.

I saw an explanation at some point that alluded to the choice of X for
cut, C for copy, and V for paste as trying to mimic the red-ink markup
that copy editors would make on paper manuscripts for changes they
wanted made. They would cross out (loosely an X) stuff to delete
(i.e., cut), and use a caret (loosely a V) to indicate where to insert
something. So cut becoming "X" and "paste/insert" becoming "V", at
least given that explanation, made some sense. Copy then becoming "C"
was simply mnemonic. Whether the author was simply "back constructing
a meaning" for the mappings I do not know.

Carlos E. R.

Aug 8, 2021, 2:28:28 PM

It is possible. The "user interface" of the original WordStar was
designed for touch typists.

I could not understand why, when they created other WordStar versions
(say, WordStar 2000?), they changed the key bindings.

The use of ^C, ^X, ^V alone (or equivalents using the insert/delete key)
came later, with the CUA design used in other software and in Windows.

CUA: <https://en.wikipedia.org/wiki/IBM_Common_User_Access>

--
Cheers,
Carlos E.R.

jak

Aug 8, 2021, 2:32:16 PM

I don't think so. These are the old commands for WordStar:
https://sfwriter.com/wordstar-command-summary.pdf
As I remember it, the first Ctrl-X/Ctrl-C/Ctrl-V shortcuts I saw came with
the first full-screen programs in MS-DOS 3.1: explorer.exe and qbasic.exe.
Explorer was very similar to the Windows version but in semi-graphics
(video in text mode, with windows drawn using extended-ASCII characters).
These programs finally allowed the use of the mouse, which made the old
shortcuts (Shift-Del/Ctrl-Ins/Shift-Ins) awkward to use with the left hand
(while keeping the right hand on the mouse). The old shortcuts still work
today, though, often even in programs that suppress the new ones.

cheers

rbowman

Aug 8, 2021, 8:01:01 PM

Brief had those key bindings but by '85 it wasn't clear who was copying
who. That was the only programming editor I ever paid money for. Borland
bought it and buried it.

I'm not a power user of the VS editor but 35 years later it seems to
lack the features of Brief.

Jeff-Relf.Me

Aug 8, 2021, 9:11:41 PM

Bowman:

> I'm not a power user of the VS editor but
> 35 years later it seems to lack the features of Brief.

My macros & extensions to Visual Studio 2019: http://Jeff-Relf.Me/Macros.HTM

In Visual Studio, I'm editing a small, recently opened text file.
AutoRecover is turned off.

For that, it's consuming 1.8 gigabytes of RAM, and growing rapidly;
occasionally, it drops back down to "just" 1.2 gigabytes.

Just now, it disappeared from the task manager altogether,
showing up again only after ReStarting the task manager.

Every time I debug my app, memory usage goes up,
briefly topping 2 gigabytes, _after_ exiting the app/debugger.
Hmmm... it's a memory leak, apparently.

I ReStarted Visual Studio, now it's consuming 186 megabytes,
no more memory leaking.

"while(1);" borks Visual Studio, apparently.

rbowman

Aug 9, 2021, 12:19:19 AM

It would tend to do that. Even a more complex while statement that
doesn't have any natural points where it will block will try for 100% of
the cpu. At least now it only ties up one core.

jak

Aug 9, 2021, 1:29:00 AM

You are making me dig up the past ... At that time I was starting to
develop for DOS and Windows, but I came from the Unix and Xenix world,
so the only editor I knew was vi, and I got a copy that ran under DOS.
A powerful copy weighing 17 KB. I stopped using it only many years
later, when Multiedit (an amazing editor) came on the market.

rbowman

Aug 9, 2021, 9:56:25 AM

Back when a 1 GB hard drive was huge, the choice between Vim and Emacs
was a no-brainer. iirc Vim was 1.6 MB, Emacs was 25 MB. It looks like
gVim has porked up to 2.7.

Jack Strangio

Aug 9, 2021, 9:58:48 AM

Rich <ri...@example.invalid> writes:
> In comp.os.linux.misc rbowman <bow...@montana.com> wrote:
>
> Except with WordStar, cut and copy were two character sequences (^K^C
> ^K^X for copy and cut, and ^K^V for paste).

Not quite:

^K^C was indeed copy the previously marked block.

But ^K^V is to *move* the marked block of text to the cursor location.
*Paste* is generally more of a copy procedure, whether or not the block
of text has actually been previously cut.

And ^K^X is to exit the WordStar/'joe' program.

Being imprinted by WordStar nearly 40 years ago, I still use the
same keystrokes whether I am using true WordStar on CP/M, or using the
'jstar' alter-ego of the 'joe' text-editor in Linux.

Regards.

Jack

--
"My mother says I don't know what good, clean fun is.
She's right. I don't know what good it is."

- Laugh-In, 1968

jak

Aug 10, 2021, 4:33:36 AM

On 09/08/2021 15:56, rbowman wrote:
> Back when a 1 GB hard drive was huge, the choice between Vim and Emacs
> was a no-brainer. iirc Vim was 1.6 MB, Emacs was 25 MB. It looks like
> gVim has porked up to 2.7.

No. Further back. When a hard drive was huge if greater than 30MB.

The Natural Philosopher

Aug 10, 2021, 7:49:53 AM

haha.
when the PC running a serial program to the PDP 11 had more RAM than the
PDP!

And some twat ran IIRC SED all over the source files and turned multiple
spaces into tabs to reduce the space...and ruined all my C string
constants!!!

Happy dayz.

It was quicker to grab the files over the serial link, edit them with
wordstar on the PC, and upload them and then compile them.

just used 'vi' for quick hacks...

--
It is the folly of too many to mistake the echo of a London coffee-house
for the voice of the kingdom.

Jonathan Swift

jak

Aug 10, 2021, 9:26:34 AM

On 10/08/2021 13:49, The Natural Philosopher wrote:
> On 10/08/2021 09:33, jak wrote:
>> On 09/08/2021 15:56, rbowman wrote:
>>> Back when a 1 GB hard drive was huge, the choice between Vim and
>>> Emacs was a no-brainer. iirc Vim was 1.6 MB, Emacs was 25 MB. It
>>> looks like gVim has porked up to 2.7.
>>
>> No. Further back. When a hard drive was huge if greater than 30MB.
>
> haha.
> when the PC running a serial program to the PDP 11 had more RAM than the
> PDP!
>
> And some twat ran IIRC SED all over the source files and turned multiple
> spaces into tabs to reduce the space...and ruined all my C string
> constants!!!
>
> Happy dayz.
>
> It was quicker to grab the files over the serial link, edit them with
> wordstar on the PC, and upload them and then compile them.
>
> just used 'vi' for quick hacks...
>

It should be remembered more often that, when it comes to modifying
source code, 'Vi' can be a great ally when it is renamed 'Ed'.

vallor

Aug 10, 2021, 9:44:01 AM

Thank gosh!

Sensible people on the usenets!

--
-v

jak

Aug 10, 2021, 12:02:35 PM

sad C coders:

while(1);

better:

#define ever (;;)

for ever;

XDD

Jeff-Relf.Me

Aug 10, 2021, 12:28:43 PM

You (jak) replied ( to me ):
> > "while(1);" borks Visual Studio.

>
> #define ever (;;)
> for ever;

while(malloc(666));

jak

Aug 10, 2021, 12:39:59 PM

With your version, VS will probably just survive a few seconds. ;^)

rbowman

Aug 10, 2021, 11:12:36 PM

What am I ever going to do with all this disk space? There is a
corollary of Moorse's Law that pertains to executable and data bloat.

The Natural Philosopher

Aug 11, 2021, 7:25:31 AM

That has actually returned to haunt me. My NAS that has all the data is
a Linux box with lots of spinning rust, but my user devices are laptops,
Pis, stupidPhones and a desktop PC, all equipped with SSDs that are
almost entirely read-only devices. And virtually empty.

But I can't get a 30GB drive.

--
There is something fascinating about science. One gets such wholesale
returns of conjecture out of such a trifling investment of fact.

Mark Twain

Charlie Gibbs

Aug 11, 2021, 12:52:51 PM

s/Moorse's/Parkinson's/

--
/~\  Charlie Gibbs                  |  They don't understand Microsoft
\ /  <cgi...@kltpzyxm.invalid>      |  has stolen their car and parked
 X   I'm really at ac.dekanfrus     |  a taxi in their driveway.
/ \  if you read it the right way.  |    -- Mayayana

jak

Aug 11, 2021, 2:19:06 PM

On 11/08/2021 18:52, Charlie Gibbs wrote:
> s/Moorse's/Parkinson's/

Moore's

Charlie Gibbs

Aug 12, 2021, 12:38:10 PM

True, but I figured I'd used up my nitpicking allowance.

jak

Aug 12, 2021, 2:32:16 PM

On 12/08/2021 18:37, Charlie Gibbs wrote:
> On 2021-08-11, jak <nos...@please.ty> wrote:
>
>> On 11/08/2021 18:52, Charlie Gibbs wrote:
>>
>>> s/Moorse's/Parkinson's/
>>
>> Moore's
>
> True, but I figured I'd used up my nitpicking allowance.
>

Do not worry. The culprit is rbowman and his bad habit at the fishing.
XDD

rbowman

Aug 12, 2021, 10:40:47 PM

Man, you don't want to know about the rest of my bad habits...

Stéphane CARPENTIER

Aug 13, 2021, 1:42:37 PM

On 10-08-2021, Andreas Kohlbach <a...@spamfence.net> wrote:

> On Mon, 9 Aug 2021 07:56:15 -0600, rbowman wrote:
>>
>> Back when a 1 GB hard drive was huge, the choice between Vim and Emacs
>> was a no-brainer. iirc Vim was 1.6 MB, Emacs was 25 MB. It looks like
>> gVim has porked up to 2.7.
>

> The one on my first PC (1995) had 1300 MB. In late 1997 I split it between
> Windows 95 (850 MB) and installed Linux (rest including swap space). No
> space for a GUI. Maybe I could have installed Xfce or something. But the
> CD, which came as a bonus with a magazine, only had KDE, and I was a bloody
> n00b not even knowing what KDE is.

I don't remember KDE being available so soon. At that time the most
beautiful and heaviest WM I had heard about was Enlightenment. For those
who didn't have a powerful computer, there was fvwm, which was very
light.

--
If you have time to waste:
https://scarpet42.gitlab.io

Soviet_Mario

Aug 16, 2021, 10:06:46 AM

On 09/08/21 06:19, rbowman wrote:

It is not self-evident to me why an infinite loop should be
expected to eat up RAM ...
CPU time, maybe yes, but why high RAM consumption?

Recursive procedure calls do consume RAM (even though many
compilers don't let automatic memory grow without limit:
the stack has a maximum size, and an overflow kills the
process badly, as the stack runs into read-only data,
"code", or areas belonging to other processes, causing
a segfault, etc.).

To produce leaks like that, I think (this is just an
opinion, not a truth) conditions such as these would be needed:

an infinite recursion of calls to procedures that have a
very small (or negligible) stack footprint (e.g. no
parameters passed, maybe not even a return value) and
that internally perform some heavy dynamic allocation (e.g.
in specialized constructors or the like).

also, routines that work on objects encapsulating arrays
and take them by value (not by pointer or reference)
will, when called in nested fashion, need to copy large
amounts of data into dynamic RAM.

Or even some badly written or generated "destructors" that
don't properly clean up dynamically allocated objects (e.g.
bad string allocators in programs that manipulate a lot of
read-write strings). Or both.
A memory leak can have a lot of fathers and mothers. But,
IMHO, a simple infinite loop by itself is not one of them.

If I think of something like this

String A = "";
while (1)
    A += "x";

the memory usage would explode, sure, depending on where
the concatenated string A is placed ....
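
A bounded version of that loop, sketched in C, shows where the growth comes from. The function name `grow_string` and the doubling strategy are assumptions made for the example, not a claim about how any particular string class or Visual Studio behaves. The point is that the memory is consumed by the allocation inside the loop, not by the looping itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Append count copies of 'x' to a heap string, doubling the buffer as needed.
   Returns the final capacity in bytes, or 0 if an allocation failed. */
size_t grow_string(size_t count)
{
    size_t len = 0, cap = 16;
    char *s = malloc(cap);
    if (s == NULL)
        return 0;
    s[0] = '\0';
    for (size_t i = 0; i < count; i++) {  /* with while (1) this never stops */
        if (len + 2 > cap) {              /* room for 'x' plus the terminator */
            char *bigger = realloc(s, cap * 2);
            if (bigger == NULL) {
                free(s);
                return 0;
            }
            s = bigger;
            cap *= 2;
        }
        s[len++] = 'x';
        s[len] = '\0';
    }
    assert(len == count);
    free(s);
    return cap;
}
```

The capacity tracks the number of iterations, so an unbounded loop around the append exhausts memory, while an empty `while(1);` only burns CPU.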

--
1) Resist, resist, resist.
2) If everyone pays taxes, the taxes are paid by everyone
Soviet_Mario - (aka Gatto_Vizzato)