
How much memory without using malloc?


Peabody

May 16, 2018, 2:06:34 PM
I have extremely limited experience in C. In the few programs I've written,
I've simply declared variables and arrays without requesting any additional
memory. But I think the largest was a 2048-character array. Now I'm looking
at more like 500,000 characters. At what point do you have to use something
like malloc to get enough memory? There must be some limit as to what
Windows will let you use without a malloc.

If it matters, this is all Windows console, 32-bit, LCC compiler.



bartc

May 16, 2018, 2:33:23 PM
Do you mean lccwin32?

I tried this program:

#include <stdio.h>

char memory[1500000000];

int main(void) {
printf("%zu\n", sizeof(memory));
}

and it worked using 1.5GB of static memory without malloc. But 2.0GB (or
2.0GiB, whichever one is 2e9 bytes), crashed.

But the data must be declared outside of a function. Inside a function,
stack memory is very limited unless you declare data as static.

Of course, the machine needs to have that spare RAM available, and it's
the same whether you use malloc, or this static allocation.

--
bartc

Ben Bacarisse

May 16, 2018, 3:16:58 PM
Peabody <waybackNO...@yahoo.com> writes:

> I have extremely limited experience in C. In the few programs I've written,
> I've simply declared variables and arrays without requesting any additional
> memory. But I think the largest was a 2048-character array. Now I'm looking
> at more like 500,000 characters. At what point do you have to use something
> like malloc to get enough memory? There must be some limit as to what
> Windows will let you use without a malloc.

Try it and see! That's the simplest way.

Using malloc is not hard. The only trouble is that you should check the
result, but for programs that can get all space they need at the start
that's simple.

You will usually find that the most limited space is stack space. That's
for declared objects with what C calls automatic storage duration --
basically things declared in a function (even main). But even there, I
doubt the 500,000 bytes will cause a problem on a modern machine.

--
Ben.

Jorgen Grahn

May 16, 2018, 3:58:41 PM
On Wed, 2018-05-16, Peabody wrote:
> I have extremely limited experience in C. In the few programs I've
> written, I've simply declared variables and arrays without
> requesting any additional memory. But I think the largest was a
> 2048-character array. Now I'm looking at more like 500,000
> characters. At what point do you have to use something like malloc
> to get enough memory?

It's not about the actual limit, but about whether there /is/ a known
limit. How do you know, in this case, that you need 500k, and never
ever 600k?

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Peabody

May 16, 2018, 5:10:50 PM
bartc says...

>> If it matters, this is all Windows console, 32-bit, LCC
>> compiler.

> Do you mean lccwin32?

Yes.

>I tried this program:
>
> #include <stdio.h>
>
> char memory[1500000000];
>
> int main(void) {
> printf("%zu\n", sizeof(memory));

> and it worked using 1.5GB of static memory without
> malloc. But 2.0GB (or 2.0GiB, whichever one is 2e9
> bytes), crashed.

Ok, well I was thinking a program would be limited to a lot
less than that. So it looks like I will be ok.

> But the data must be declared outside of a function.
> Inside a function, stack memory is very limited unless
> you declare data as static.

I think last time I declared it in Main. But as I said
before, it was only 2K. By the way, does it need to be
"unsigned char" or is "char" automatically unsigned. Seems
like it would be. Or does it depend on the compiler?

Thanks for your help.

Peabody

May 16, 2018, 5:16:29 PM
Ben Bacarisse says...

> Using malloc is not hard. The only trouble is that you
> should check the result, but for programs that can get
> all space they need at the start that's simple.

So you say, but you're fluent in C. For the rest of
humanity, I suspect malloc suffers from the same issue that
plagues all normal people.

Of course I'm speaking of pointers.

bartc

May 16, 2018, 5:25:10 PM
On 16/05/2018 22:10, Peabody wrote:

> By the way, does it need to be
> "unsigned char" or is "char" automatically unsigned. Seems
> like it would be. Or does it depend on the compiler?

On Windows, most compilers make 'char' signed (out of 7 compilers I
have, only one makes 'char' unsigned). Some may have an option to change
it (not lccwin32).

But it depends on what you want to do with it. If you specifically need
it unsigned or signed, then declare unsigned char or signed char. If it
doesn't matter, just use 'char'.

If it needs to be unsigned but 'unsigned char' is too long-winded, then
try this:

typedef unsigned char byte;

byte memory[500000000];

Just use 'byte' (or whatever short name you choose) in place of
'unsigned char'.

--
bartc

Peabody

May 16, 2018, 5:34:56 PM
Jorgen Grahn says...

> It's not about the actual limit, but about whether there
> /is/ a known limit. How do you know, in this case, that
> you need 500k, and never ever 600k?

In fact, I could need as much as 2048K. That's the maximum
firmware size of some models of the STM32F microcontrollers.

I'm trying to "help" on a project involving an electronics
device which uses an F303CC, and the problem is providing
firmware updates in encrypted form. As of now, the device
has a 12MB serial flash memory attached, and behaves as a
removable drive when it's plugged into Windows USB. So to
update, you just transfer the new file to the "drive", then
reboot into the custom bootloader and select that file to
flash.

But that file is now Intel hex, and I don't think it's a
good idea to encrypt a file with many repetitions of
"[CRLF]:10" at predictable intervals - seems like it might
make it easier to decrypt. So I was thinking of converting
the .hex file to raw .bin and then encrypting that. But
that leaves me looking for a format for the .bin file.
There would still need to be multiple segments, and each
would need a segment type, starting address, and length.
It's the length that's the problem because I'll have to go
back and fill that field only after reaching the end of it
in the .hex file. So that means keeping all of the binary
image in memory until it's finalized, and then encrypting
that.

Well, that probably doesn't make any sense. But it seems I
may also have an endianness issue with the length field.

Maybe encrypting a .hex file wouldn't be so bad after all.

Keith Thompson

May 16, 2018, 5:45:41 PM
Peabody <waybackNO...@yahoo.com> writes:
[...]
> I think last time I declared it in Main.

"main", not "Main". C is case-sensitive. It's a good idea to get into
the habit of being precise.

> But as I said
> before, it was only 2K. By the way, does it need to be
> "unsigned char" or is "char" automatically unsigned. Seems
> like it would be. Or does it depend on the compiler?

It depends on the compiler.

"char", "signed char", and "unsigned char" are all distinct types.
Plain "char" has the same size and representation as either "signed
char" or "unsigned char"; the choice is implementation-defined.
Don't write code that assumes one or the other.

Use "char" when you want to store character data. Use "signed char"
when you want a very small integer type (not a common need). Use
"unsigned char" when you want to store raw data, usually in arrays.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

May 16, 2018, 7:16:02 PM
bartc <b...@freeuk.com> writes:
[...]
> If it needs to be unsigned but 'unsigned char' is too long-winded, then
> try this:
>
> typedef unsigned char byte;
>
> byte memory[500000000];
>
> Just use 'byte' (or whatever short name you choose) in place of
> 'unsigned char'.

"unsigned char" is not too long-winded. Any C programmer will know
exactly what it means. Any C programmer seeing "byte" will guess
that it's *probably* unsigned char, but will have to track down
the declaration to be sure.

Keith Thompson

May 16, 2018, 7:17:41 PM
Peabody <waybackNO...@yahoo.com> writes:
[...]
> But that file is now Intel hex, and I don't think it's a
> good idea to encrypt a file with many repetitions of
> "[CRLF]:10" at predictable intervals - seems like it might
> make it easier to decrypt.
[...]

Not really a topic for this newsgroup, but any decent modern
encryption algorithm shouldn't have this kind of problem.

bartc

May 16, 2018, 7:45:39 PM
On 17/05/2018 00:15, Keith Thompson wrote:
> bartc <b...@freeuk.com> writes:
> [...]
>> If it needs to be unsigned but 'unsigned char' is too long-winded, then
>> try this:
>>
>> typedef unsigned char byte;
>>
>> byte memory[500000000];
>>
>> Just use 'byte' (or whatever short name you choose) in place of
>> 'unsigned char'.
>
> "unsigned char" is not too long-winded.

That's your opinion. Mine is that:

unsigned char* Strfn(const unsigned char* s, const unsigned char* t)
{
unsigned char* r;
....
}

(with a few consts thrown in, as people like using them)
looks a lot more long-winded and a nuisance to type than:

char* Strfn(char* s, char* t) {
char* r;
....
}

I think many will agree with me otherwise stdint.h would have defined
unsigned_int32_t rather than uint32_t.

> Any C programmer will know
> exactly what it means. Any C programmer seeing "byte" will guess
> that it's *probably* unsigned char, but will have to track down
> the declaration to be sure.

They will have to anyway when people use typedefs, and they tend to do
that a lot. Tiny C uses 300 of them. Sqlite about 2000. While windows.h
seems to consist of little else. Plus all the macros you also need to
look up to find out what they do.

Really, using a 'byte' typedef would be the least of anyone's problems
when trying to read other people's code.

But by all means continue writing 'const unsigned long long int' in your
own code if you think it is better.

--
bartc

Ben Bacarisse

May 16, 2018, 7:54:45 PM
As a first approximation, you replace:

int main(void)
{
char array[1000];
/* code that uses array[0] to array [999] */
}

with

#include <stdlib.h>

int main(void)
{
char *array = malloc(1000 * sizeof *data);
if (array != NULL) {
/* code that uses array[0] to array [999] */
}
}

The magic "1000 * sizeof *data" is the way to ask for room for 1000 of
whatever data points to (in this case characters). The reason this is a
good idea is that it works for any type. If you need an array of a
million ints you write

int *many = malloc(1000000 * sizeof *many);

instead.

I'm saying this only because learning new stuff is usually a good thing.
My original post suggested that you probably won't need to do this for a
0.5MB array, particularly if you declare the array outside of main (or
inside with the static keyword).

--
Ben.

Keith Thompson

May 16, 2018, 8:09:52 PM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
[...]
> #include <stdlib.h>
>
> int main(void)
> {
> char *array = malloc(1000 * sizeof *data);

You meant
char *array = malloc(1000 * sizeof *array);
or
char *data = malloc(1000 * sizeof *data);

(Naming a pointer object "array" could be confusing.)

[...]

Ian Collins

May 16, 2018, 9:27:37 PM
On 17/05/18 11:45, bartc wrote:
> On 17/05/2018 00:15, Keith Thompson wrote:
>> bartc <b...@freeuk.com> writes:
>> [...]
>>> If it needs to be unsigned but 'unsigned char' is too long-winded, then
>>> try this:
>>>
>>> typedef unsigned char byte;
>>>
>>> byte memory[500000000];
>>>
>>> Just use 'byte' (or whatever short name you choose) in place of
>>> 'unsigned char).
>>
>> "unsigned char" is not too long-winded.
>
> That's your opinion. Mine is that:
>
> unsigned char* Strfn(const unsigned char* s, const unsigned char* t)

Which is why we use uint8_t.

--
Ian

Peabody

May 16, 2018, 11:51:55 PM
Keith Thompson says...

>> But that file is now Intel hex, and I don't think it's
>> a good idea to encrypt a file with many repetitions of
>> "[CRLF]:10" at predictable intervals - seems like it
>> might make it easier to decrypt. [...]

> Not really a topic for this newsgroup, but any decent
> modern encryption algorithm shouldn't have this kind of
> problem.

I was thinking RC4, which is not exactly modern, but still
widely used despite its shortcomings. Well since hex file
parsing is already implemented in the custom bootloader,
I'll just stick with encrypting the hex file. However, I
just got home from my local open source hardware meetup, and
the guys there say there's little point in bothering with
encryption. For chips like the STM32, cloners just decap
the chip and read out the current contents.


already...@yahoo.com

May 17, 2018, 4:55:17 AM
On Thursday, May 17, 2018 at 12:45:41 AM UTC+3, Keith Thompson wrote:
>
> Use "signed char" when you want a very small integer type (not a common need).
>

IMHO, using "signed char" for small integers makes your intentions less clear than int8_t. int_fast8_t probably expresses intentions even better, but it looks ugly.
I regard "signed char" as a legacy type whose only purpose by now should be interfacing with old code and libraries.

Ben Bacarisse

May 17, 2018, 7:28:41 AM
Keith Thompson <ks...@mib.org> writes:

> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> [...]
>> #include <stdlib.h>
>>
>> int main(void)
>> {
>> char *array = malloc(1000 * sizeof *data);
>
> You meant
> char *array = malloc(1000 * sizeof *array);
> or
> char *data = malloc(1000 * sizeof *data);

Yes, thanks.

--
Ben.

Scott Lurndal

May 17, 2018, 9:11:25 AM
Peabody <waybackNO...@yahoo.com> writes:
>Jorgen Grahn says...
>
> > It's not about the actual limit, but about whether there
> > /is/ a known limit. How do you know, in this case, that
> > you need 500k, and never ever 600k?
>
>In fact, I could need as much as 2048K. That's the maximum
>firmware size of some models of the STM32F microcontrollers.
>
>I'm trying to "help" on a project involving an electronics
>device which uses an F303CC, and the problem is providing
>firmware updates in encrypted form. As of now, the device
>has a 12MB serial flash memory attached, and behaves as a
>removable drive when it's plugged into Windows USB. So to
>update, you just transfer the new file to the "drive", then
>reboot into the custom bootloader and select that file to
>flash.
>
>But that file is now Intel hex, and I don't think it's a
>good idea to encrypt a file with many repetitions of
>"[CRLF]:10" at predictable intervals - seems like it might
>make it easier to decrypt.

If you use a good modern encryption algorithm, that
won't be a problem.

Mr. Man-wai Chang

May 17, 2018, 9:48:45 AM
You need operating system API calls to really monitor memory usage...
it's not really just a C issue. C programs run inside an operating system.

--
@~@ Remain silent! Drink, Blink, Stretch! Live long and prosper!!
/ v \ Simplicity is Beauty!
/( _ )\ May the Force and farces be with you!
^ ^ (x86_64 Ubuntu 9.10) Linux 2.6.39.3
No borrowing! No fraud! No gambling! No compensated dating! No fighting! No robbery! No suicide! No praying to gods! Please consider Comprehensive Social Security Assistance
(CSSA):
http://www.swd.gov.hk/tc/index/site_pubsvc/page_socsecu/sub_addressesa

Malcolm McLean

May 17, 2018, 11:14:19 AM
The stack (where local variables and arrays go) is designed for holding
small amounts of data. For anything over about 1K, consider whether it
is too large for the stack.
The alternatives are malloc() and free(), or declaring a large static pool
of memory. The advantage of malloc() is that it is scalable to the
maximum amount of memory in your system, and you can free the memory
for reuse when you are done with it. The disadvantages are that it
can return null, and some implementations "soft crash" by paging out
large allocations to disk - usually this isn't a viable solution
as it then takes too long to execute.
A static buffer you are guaranteed to have, but you've got to specify
a size. It's not so friendly to other programs if you choose a very
large size to handle a rare exceptional case - whether that matters
or not just depends. It may be that the computer is primarily running
your program and isn't a general-purpose machine for web browsing and
clerical work. It then doesn't matter if other programs have constrained
resources. However it may be that your program is just one amongst
many, so excessive resource usage will make it unpopular.

If the static buffer exceeds some memory limit, the program won't load
at all. Again, that might be what you want, or it might be a disaster.

David Brown

May 17, 2018, 2:24:04 PM
I agree, except when you need greater portability. "signed char" is
always the smallest signed integer type supported by the platform. It
is the same (baring seriously weird theoretical implementations) as
"int_least8_t". "int_fast8_t" may well be bigger than a "signed char" -
on some platforms it will be as big as 32 bits.

In many situations where you need a very small integer type (as Keith
says, these are not that common), then very often int8_t expresses your
needs accurately - typically you /do/ want to be sure it is 8-bit to fit
in with file formats, network structures, etc.

David Brown

May 17, 2018, 2:34:13 PM
On 16/05/18 23:34, Peabody wrote:
> Jorgen Grahn says...
>
> > It's not about the actual limit, but about whether there
> > /is/ a known limit. How do you know, in this case, that
> > you need 500k, and never ever 600k?
>
> In fact, I could need as much as 2048K. That's the maximum
> firmware size of some models of the STM32F microcontrollers.
>

Just to be clear here, are you writing the C code for the
microcontroller or for the PC?

If you are writing for a microcontroller, you should /strongly/ prefer
static allocation (either at file scope, or as a "static" within a
function) because it gives you a much clearer picture of the memory
requirements of your project. Your linker will catch errors, and your
map file will show detailed memory information.

(If you want to know more here, just ask - but there is no point in my
giving details if it is not helpful.)

> I'm trying to "help" on a project involving an electronics
> device which uses an F303CC, and the problem is providing
> firmware updates in encrypted form. As of now, the device
> has a 12MB serial flash memory attached, and behaves as a
> removable drive when it's plugged into Windows USB. So to
> update, you just transfer the new file to the "drive", then
> reboot into the custom bootloader and select that file to
> flash.
>
> But that file is now Intel hex, and I don't think it's a
> good idea to encrypt a file with many repetitions of
> "[CRLF]:10" at predictable intervals - seems like it might
> make it easier to decrypt. So I was thinking of converting
> the .hex file to raw .bin and then encrypting that. But
> that leaves me looking for a format for the .bin file.

Intel hex format files have much lower entropy than a raw binary file
would have. Yes, that /can/ make it easier to decrypt - but if you pick
a decent algorithm there will be no problem. As long as it is cheaper
and easier for the bad guys to go around to your house and threaten you
into telling them the encryption key, your encryption algorithm is
strong enough!

However, it is a lot more efficient to use a binary format - it will be
about a third of the size of the Intel Hex format.

Raw binary images for microcontroller firmware don't usually have much
of a format as such - they are just memory images. You have the
interrupt vectors and other device-specific features at certain fixed
addresses in the image - the rest is just the program code. But it can
be worth adding a few extra bits and pieces in a structure at a fixed
spot - typically I include the length of the image, a CRC check, a
marker or identifier for the program type or card type (so that your
update system doesn't put a program for a radio into your CD player, or
vice versa), and perhaps a copyright string and version information.

> There would still need to be multiple segments, and each
> would need a segment type, starting address, and length.

Nope. You generally don't need that, unless the program is split into
many parts in memory. In a microcontroller like an STM32F, all the
flash is contiguous. (You don't need any ram segments in the image.)

> It's the length that's the problem because I'll have to go
> back and fill that field only after reaching the end of it
> in the .hex file. So that means keeping all of the binary
> image in memory until it's finalized, and then encrypting
> that.
>
> Well, that probably doesn't make any sense. But it seems I
> may also have an endianness issue with the length field.

Go little-endian all the way.

David Brown

May 17, 2018, 2:36:54 PM
On 17/05/18 05:51, Peabody wrote:
> Keith Thompson says...
>
> >> But that file is now Intel hex, and I don't think it's
> >> a good idea to encrypt a file with many repetitions of
> >> "[CRLF]:10" at predictable intervals - seems like it
> >> might make it easier to decrypt. [...]
>
> > Not really a topic for this newsgroup, but any decent
> > modern encryption algorithm shouldn't have this kind of
> > problem.
>
> I was thinking RC4, which is not exactly modern, but still
> widely used despite its shortcomings.

DES or 3DES is widely implemented on embedded systems, and very common.
It is sufficient for most uses.

> Well since hex file
> parsing is already implemented in the custom bootloader,
> I'll just stick with encryping the hex file. However, I
> just got home from my local open source hardware meetup, and
> the guys there say there's little point in bothering with
> encryption. For chips like the STM32, cloners just decap
> the chip and read out the current contents.
>

They are right. It is seldom worth the bother of encrypting the code.
Give some thought as to who might want access to the binary image, and
what they might be able to do with it. If the answer is "not many" and
"not much", then don't bother complicating matters.

Peabody

May 17, 2018, 7:36:44 PM
David Brown says...

> Just to be clear here, are you writing the C code for
> the microcontroller or for the PC?

I'm doing a demonstration program for the PC - both an
encryption program and a decryption program. Just to make
sure I understand the algorithm properly.

>> There would still need to be multiple segments, and
>> each would need a segment type, starting address, and
>> length.

> Nope. You generally don't need that, unless the program
> is split into many parts in memory. In a
> microcontroller like an STM32F, all the flash is
> contiguous.

Well, the processor in question has 256K of flash, and the
Intel Hex format only allows 64K using the normal "00"
record type lines (two bytes for the address). A sample hex
file I have for that chip has record type "04" lines every
64K, which establish the new upper word of the address for
up to the following 64K. In addition, I think it should be
possible to have the code not be contiguous. And there's
also a record type "05" line that I think is the address to
jump to when flashing is completed.

So for all these reasons, I think even the "raw" version of
the file should have record type, address, and length fields
at the beginning of each contiguous block of code. So the
structure of the raw file would look like the hex file
except the record type 00 data wouldn't be divided into
lines, and of course everything would be in binary instead
of ascii hex representation.

As I said before, the length field complicates things
because I won't know what to put in there until I've come to
the end of the block. Hence the need to keep it all, or at
least each block, in memory. But I guess in theory 64K is
the largest block I would really need if I write each block
to disk once it's encrypted.

I may go ahead and finish this, but since the custom
bootloader for this project can already parse Intel Hex
files, I think it's just easier to encrypt the hex file if
that's not insecure, although, as you suggest, the raw
version is more efficient, and, well, more elegant.

David Brown

May 18, 2018, 4:28:48 AM
On 18/05/18 01:36, Peabody wrote:
> David Brown says...
>
> > Just to be clear here, are you writing the C code for
> > the microcontroller or for the PC?
>
> I'm doing a demonstration program for the PC - both an
> encryption program and a decryption program. Just to make
> sure I understand the algorithm properly.
>
> >> There would still need to be multiple segments, and
> >> each would need a segment type, starting address, and
> >> length.
>
> > Nope. You generally don't need that, unless the program
> > is split into many parts in memory. In a
> > microcontroller like an STM32F, all the flash is
> > contiguous.
>
> Well, the processor in question has 256K of flash, and the
> Intel Hex format only allows 64K using the normal "00"
> record type lines (two bytes for the address). A sample hex
> file I have for that chip has record type "04" lines every
> 64K, which establish the new upper word of the address for
> up to the following 64K. In addition, I think it should be
> possible to have the code not be contiguous. And there's
> also a record type "05" line that I think is the address to
> jump to when flashing is completed.

It is certainly common for hex files to have multiple sections. The 64K
limit in the Intel Hex format is one reason; others include different
output sections from the linker (for vectors, code, read-only data,
etc.). But the flash memory on the device is contiguous and can be
treated as a single block. When converting the hex file to a binary
file, fill in the gaps with 0xff (for marginally faster programming) or
0x00 (it feels more natural if you are looking at the binary). The 05
record for program start address is usually completely ignored - the
microcontroller uses the reset vector address for program startup.

You might want to talk to the folk writing the code on the
microcontroller about this kind of thing. It can be covered quickly
person to person (preferably with a whiteboard at hand).

>
> So for all these reasons, I think even the "raw" version of
> the file should have record type, address, and length fields
> at the beginning of each contiguous block of code. So the
> structure of the raw file would look like the hex file
> except the record type 00 data wouldn't be divided into
> lines, and of course everything would be in binary instead
> of ascii hex representation.
>
> As I said before, the length field complicates things
> because I won't know what to put in there until I've come to
> the end of the block. Hence the need to keep it all, or at
> least each block, in memory. But I guess in theory 64K is
> the largest block I would really need if I write each block
> to disk once it's encrypted.

Put space for the program record (program type, length, maybe other
things as I mentioned in another post) at a fixed address near the
beginning of the binary - this must be included in the microcontroller
program side of the system. Your PC code reads in the hex file (or bin
file) output from the microcontroller development tools, then fills in
the blank record. Maybe also add a 32-bit CRC at the end of the image.

Alternatively, if you don't want to make changes to the microcontroller
side of the coding, then your program record is only ever sent to the
bootloader/updater on the device, and never programmed into the flash.

Do not encrypt your program record - that leads to insanity.
(Preferably, don't encrypt anything at all - you don't want that extra
complication when you are just learning this stuff.)


And don't write the PC program in C, unless you are a very experienced C
programmer and know no other programming languages. C (on a PC) has no
problem with large lumps of data, statically or dynamically allocated.
But it is a hopeless language for something like parsing Intel Hex files
or other string manipulation. You'll spend far more time trying to
figure out your buffers, memory allocations, data structures, etc., than
actually solving the problem. Use a language that has direct support
for strings, lists, pattern matching, etc. It will be a fraction of the
development effort.

Usually you shouldn't even bother with hex file parsing - just use
objcopy, which will be part of the development tools for the
microcontroller. It will convert your file to a nice raw binary image.

Steve Carroll

unread,
May 18, 2018, 11:12:58 AM5/18/18
to
Just some names Steven Petruzzellis has used
"Evil" John *
"Evil" Snit *
Big Crotch on a Small Fish
Cornelius Munshower
CSMA Moderator
Edward Stanfield
Fretwiz *
Hitman Hero
Measles
Petruzzellis Kids
Sigmond
Slaveen
Smit
Steve C *
Steve Camoll *
Steve Carroll <noone@xxxxxxxxxxx> *
Steve Carroll <stevecarroll@xxxxxxxxxxx> *
Steve Carroll <trollkiller@xxxxxx> *
Steve Carroll's Dog *
Steve Carrolll *
Steve Carrroll *
Yevette Owens
Yobo_Obyo


So, yeah, I buy into my own fiction, fully knowing it would never be discovered by others, because it makes me reconsider my program, improving it. Larry Washington's posts arein fact all unfair. There's zero dispute that as soon as any released 'plonkee' does one thing to wound the poor crybaby's feelings that they'll be blocked again.

My view is much more sophisticated.

Larry Washington doesn't have any idea what he is sniveling about. Sandman makes things up. Here is a list of names Jonas Eklundh has admitted he attributes to Larry Washington "Cactus Pete", "Donald", "Donald Miller", "Horace McSwain", "Hymen", "meat", "Mike Weaver", "Modena IV Drid", "Omar Murad Asfour", "Rhino Plastee", "Soapy", "SopwithCamel", "Sunny Day", "Takuya Saitoh", "The Letter Q", "tmelmosfire", "zevon".

--
Do not click this link!
https://redd.it/6sfhq6
Jonas Eklundh

Nick Bowler

May 18, 2018, 3:59:59 PM
On Thu, 17 May 2018 00:54:37 +0100, Ben Bacarisse wrote:
> The magic "1000 * sizeof *data" is the way to ask for room for 1000 of
> whatever data points to (in this case characters). The reason this is a
> good idea is that it works for any type. If you need an array of a
> million ints you write
>
> int *many = malloc(1000000 * sizeof *many);
>
> instead.

Well, except this will silently do the wrong thing if the multiplication
result doesn't fit in a size_t. The above will work for a sufficiently
large size_t and a sufficiently small int but it certainly doesn't work
in general for "any type".

Moreover depending on the type of 1000000 and the rank of size_t the
multiplication might have a signed result, and without any checks that
could potentially overflow (=> undefined behaviour).

So in the general case one usually needs to write something like:

  if (n > SIZE_MAX / sizeof *many) {
      /* result won't fit in size_t, fail */
  }

  many = malloc(n * sizeof *many);
  if (!many) {
      /* malloc error, fail */
  }

Alternately one can use calloc, which doesn't require this multiplication.
But in the real world many implementations exist which have the same bug
in their calloc implementation, so your mileage may vary...

supe...@casperkitty.com

unread,
May 18, 2018, 4:55:03 PM5/18/18
to
On Friday, May 18, 2018 at 2:59:59 PM UTC-5, Nick Bowler wrote:
> Alternately one can use calloc, which doesn't require this multiplication.
> But in the real world many implementations exist which have the same bug
> in their calloc implementation, so your mileage may vary...

There are also many real-world implementations where malloc() or calloc()
might allocate address space without immediately committing physical memory.
This may improve physical memory utilization in cases where programs use
malloc() to acquire a huge block of memory but only use a tiny part of it,
but may cause programs to arbitrarily fail the first time they access a
particular piece of an allocated region if the amount of available RAM has
fallen since it was allocated.

IMHO, it would be helpful to categorize implementations based upon what they
promise with regard to the behavior of malloc() and related functions. The
idea that functions should return null when out of memory is sometimes useful
on platforms that can uphold such a guarantee, but treating it as requirement
would make it impossible to implement C on some platforms.

Peabody

unread,
May 18, 2018, 5:33:13 PM5/18/18
to
David Brown says...

> And don't write the PC program in C, unless you are a very
> experienced C programmer and know no other programming
> languages. C (on a PC) has no problem with large lumps
> of data, statically or dynamically allocated. But it is
> a hopeless language for something like parsing Intel Hex
> files or other string manipulation. You'll spend far
> more time trying to figure out your buffers, memory
> allocations, data structures, etc., than actually
> solving the problem. Use a language that has direct
> support for strings, lists, pattern matching, etc. It
> will be a fraction of the development effort.

Too late. Well, I'm obviously not experienced in C. But
all my programming experience in the past has been with
assembler - for 6502, for x86(DOS), and MSP430. That's
if you don't count Fortran and Basic. But last year I
needed to modify the source of a TI program called BSLDEMO,
which is the Windows side of some MSP430 parts' BSL setup.
So I had to learn enough about C to do that and recompile.

Then I had to write my own program to do the BSL thing for
certain other MSP430 parts which use a custom bootloader. I
could have written it in VBScript, except that VBScript
doesn't do COM ports. So really the only option I had was
C. And I had some very helpful ANs from Silabs about
talking to COM ports in C that made it possible.

It's not pretty code, but I managed to parse not only Intel
Hex but also TI-TEXT files, then send the binary to the MCU.
But those were tiny files, unlike the ones for STM32F's
here.

But I have to agree that the logic of the task at hand was
not the big stumbling block. It was the C notation and
jargon. And it's all so terse that it's not clear what's
going on.

Where is Pascal when we really need it? :-)

supe...@casperkitty.com

unread,
May 18, 2018, 6:10:56 PM5/18/18
to
On Friday, May 18, 2018 at 4:33:13 PM UTC-5, Peabody wrote:
> Where is Pascal when we really need it? :-)

It got displaced by C for a couple of reasons:

1. While many Pascal implementations offered enough features to make
the language usable for practical purposes, they did so inconsistently,
because nothing in the design of the language suggested how they should
work. The design of the C language, by contrast, often offered some
strong clues.

For example, in Turbo Pascal, the syntax to write 0x1A01 to the 16 bits
at segment 0xB800 offset 160 is "MemW[$B800:160] := $1A01;", but there
is nothing in Pascal itself that would suggest the identifier MemW, nor
the use of two numbers separated by a colon, as a means of performing
such an access. In an 8088 implementation of C configured to use 32-bit
pointers by default, the syntax is "*((unsigned*)0xB80000A0) = 0x1A01;".
The design and history of the language suggested that conversion
from an integer type to a same-size pointer should be representation-
preserving, and that 8088 implementations should use 16-bit "int".

2. There existed a document that looked like a standard for C, even though
it made no attempt to specify everything necessary to make something be
a *good* implementation which is suitable for any particular purpose.

3. Simple compilers could more easily yield decent performance when given
code which used pointer-marching techniques than when given code that
used array indexing.

Those factors pretty much killed off Pascal even though it had a number of
features that good languages should have but C doesn't, such as a proper
pass-by-reference facility.

Melzzzzz

unread,
May 18, 2018, 6:12:56 PM5/18/18
to
On 2018-05-18, supe...@casperkitty.com <supe...@casperkitty.com> wrote:
> Those factors pretty much killed off Pascal even though it had a number of
> features that good languages should have but C doesn't, such as a proper
> pass-by-reference facility.

C'mon, pass by reference ;)
C in its spirit does not need that.



--
press any key to continue or any other to quit...

bartc

unread,
May 18, 2018, 6:22:19 PM5/18/18
to
On 18/05/2018 22:33, Peabody wrote:
> David Brown says...
>
> > And don't write the PC program in C, unless you are a very
> > experienced C programmer and know no other programming
> > languages. C (on a PC) has no problem with large lumps
> > of data, statically or dynamically allocated. But it is
> > a hopeless language for something like parsing Intel Hex
> > files or other string manipulation. You'll spend far
> > more time trying to figure out your buffers, memory
> > allocations, data structures, etc., than actually
> > solving the problem. Use a language that has direct
> > support for strings, lists, pattern matching, etc. It
> > will be a fraction of the development effort.
>
> Too late. Well, I'm obviously not experienced in C. But
> all my programming experience in the past has been with
> assembler - for 6502, for x86(DOS), and MSP430. That's
> if you don't count Fortran and Basic. But last year I
> needed to modify the source of a TI program called BSLDEMO,
> which is the Windows side of some MSP430 parts' BSL setup.
> So I had to learn enough about C to do that and recompile.
>
> Then I had to write my own program to do the BSL thing for
> certain other MSP430 parts which use a custom bootloader. I
> could have written it in VBScript, except that VBScript
> doesn't do COM ports.

I think VBscript (if that's the same as Visual Basic) should be able to
talk to DLL libraries. And it's possible to compile C code into a DLL
library.

So you can do the bulk of the programming in the easy language and only
use C when essential.


--
bartc

supe...@casperkitty.com

unread,
May 18, 2018, 6:24:21 PM5/18/18
to
On Friday, May 18, 2018 at 5:12:56 PM UTC-5, Melzzzzz wrote:
> On 2018-05-18, supe...@casperkitty.com <supe...@casperkitty.com> wrote:
> > Those factors pretty much killed off Pascal even though it had a number of
> > features that good languages should have but C doesn't, such as a proper
> > pass-by-reference facility.
>
> Common, pass by reference ;)
> C in his spirit does not needs that.

And as a consequence, given something like:

void test(void)
{
    int x;

    getValue(&x);
    do
        doSomethingWith(x);
    while (--x);
}

a compiler that can't see inside getValue() or doSomethingWith() must
store x to memory before each call to doSomethingWith and reload it
after, even on platforms which would guarantee that some registers
will remain undisturbed across subroutine calls.

In a language with proper pass-by-reference semantics, the fact that
getValue() had received x by reference, rather than receiving a pointer to
x, would allow the compiler to assume that anything that was going to be
done with that reference would be done before the function returned.

bartc

unread,
May 18, 2018, 8:32:48 PM5/18/18
to
On 18/05/2018 23:10, supe...@casperkitty.com wrote:
> On Friday, May 18, 2018 at 4:33:13 PM UTC-5, Peabody wrote:
>> Where is Pascal when we really need it? :-)
>
> It got displaced by C for a couple of reasons:
>
> 1. While many Pascal implementations offered enough features to make
> the language usable for practical purposes, they did so inconsistently,
> because nothing in the design of the language suggested how they should
> work. The design of the C language, by contrast, often offered some
> strong clues.
>
> For example, in Turbo Pascal, the syntax to write 0x1A01 to the 32 bits
> at segment 0xB800 offset 160 is "MemW[$B800:160] := $1A01;", but there
> is nothing in Pascal itself that would suggest the identifier MemW, nor
> the use of two numbers separated by a colon, as a means of performing
> such an access. In an 8088 implementations of C configured to use 32-bit
> pointers by default, the syntax is "*((unsigned*)0xB80000A0) = 0x1A01;".
> The design and historical of the language suggested that conversion
> from an integer type to a same-size pointer should be representation-
> preserving, and that 8088 implementations should use 16-bit "int".

The OP apparently doesn't like C's syntax.

What's hard to understand is why it is not possible to have a language that
does exactly what C does, but with an alternative syntax. There are a
few different ones of which C's terse syntax with braces and lots of
punctuation is one.

(Although I still doubt it would endear itself to me as there are other
matters besides syntax.)

--
bartc

Melzzzzz

unread,
May 18, 2018, 8:43:08 PM5/18/18
to
You have nimi, it generates C code and has different syntax ;)

Keith Thompson

unread,
May 18, 2018, 10:12:43 PM5/18/18
to
Melzzzzz <Melz...@zzzzz.com> writes:
> On 2018-05-19, bartc <b...@freeuk.com> wrote:
[...]
>> The OP apparently doesn't like C's syntax.
>>
>> What's hard to understand is why it is not possible have a language that
>> does exactly what C does, but with an alternative syntax. There are a
>> few different ones of which C's terse syntax with braces and lots of
>> punctuation is one.
>>
>> (Although I still doubt it would endear itself to me as there are other
>> matters besides syntax.)
>>
> You have nimi, it generates C code and has different syntax ;)

Do you mean Nim?

Melzzzzz

unread,
May 18, 2018, 10:34:13 PM5/18/18
to
On 2018-05-19, Keith Thompson <ks...@mib.org> wrote:
> Melzzzzz <Melz...@zzzzz.com> writes:
>> On 2018-05-19, bartc <b...@freeuk.com> wrote:
> [...]
>>> The OP apparently doesn't like C's syntax.
>>>
>>> What's hard to understand is why it is not possible have a language that
>>> does exactly what C does, but with an alternative syntax. There are a
>>> few different ones of which C's terse syntax with braces and lots of
>>> punctuation is one.
>>>
>>> (Although I still doubt it would endear itself to me as there are other
>>> matters besides syntax.)
>>>
>> You have nimi, it generates C code and has different syntax ;)
>
> Do you mean Nim?

Yes, Nim. It was a typo ;p
Nim has advantage that you can compile into C and then carry C code
where it is needed.

Tim Rentsch

unread,
May 19, 2018, 12:32:35 AM5/19/18
to
Keith Thompson <ks...@mib.org> writes:

> bartc <b...@freeuk.com> writes:
> [...]
>
>> If it needs to be unsigned but 'unsigned char' is too long-winded, then
>> try this:
>>
>> typedef unsigned char byte;
>>
>> byte memory[500000000];
>>
>> Just use 'byte' (or whatever short name you choose) in place of
>> 'unsigned char'.
>
> "unsigned char" is not too long-winded. [...]

Obviously some people don't agree with that opinion.

Tim Rentsch

unread,
May 19, 2018, 2:52:33 AM5/19/18
to
bartc <b...@freeuk.com> writes:

> On 18/05/2018 23:10, supe...@casperkitty.com wrote:
>
>> On Friday, May 18, 2018 at 4:33:13 PM UTC-5, Peabody wrote:
>>
>>> Where is Pascal when we really need it? :-)
>>
>> It got displaced by C for a couple of reasons:
>>
>> 1. While many Pascal implementations offered enough features to make
>> the language usable for practical purposes, they did so inconsistently,
>> because nothing in the design of the language suggested how they should
>> work. The design of the C language, by contrast, often offered some
>> strong clues.
>>
>> For example, in Turbo Pascal, the syntax to write 0x1A01 to the 32 bits
>> at segment 0xB800 offset 160 is "MemW[$B800:160] := $1A01;", but there
>> is nothing in Pascal itself that would suggest the identifier MemW, nor
>> the use of two numbers separated by a colon, as a means of performing
>> such an access. In an 8088 implementations of C configured to use 32-bit
>> pointers by default, the syntax is "*((unsigned*)0xB80000A0) = 0x1A01;".
>> The design and historical of the language suggested that conversion
>> from an integer type to a same-size pointer should be representation-
>> preserving, and that 8088 implementations should use 16-bit "int".
>
> The OP apparently doesn't like C's syntax.
>
> What's hard to understand is why it is not possible have a language
> that does exactly what C does, but with an alternative syntax.

It isn't hard to understand at all. It certainly is possible
to do, but no one has done it in a way that you like, because
your ideas are so screwy.

bartc

unread,
May 19, 2018, 7:15:14 AM5/19/18
to
On 19/05/2018 07:52, Tim Rentsch wrote:
> bartc <b...@freeuk.com> writes:

>> The OP apparently doesn't like C's syntax.
>>
>> What's hard to understand is why it is not possible have a language
>> that does exactly what C does, but with an alternative syntax.
>
> It isn't hard to understand at all. It certainly is possible
> to do, but no one has done it in a way that you like, because
> your ideas are so screwy.

Really? There might be reasons why so many 'easy' languages eschew
C-style syntax and go for something which is easier on the eye (Python,
Ruby, Lua, ...)

Think how much more productive programmers could be, how much less
error-prone their code could be, how much more enjoyable their work
might be. But no, the idea is too preposterous...




David Brown

unread,
May 19, 2018, 12:01:02 PM5/19/18
to
On 18/05/18 23:33, Peabody wrote:
> David Brown says...
>
> > And don't write the PC program in C, unless you are a very
> > experienced C programmer and know no other programming
> > languages. C (on a PC) has no problem with large lumps
> > of data, statically or dynamically allocated. But it is
> > a hopeless language for something like parsing Intel Hex
> > files or other string manipulation. You'll spend far
> > more time trying to figure out your buffers, memory
> > allocations, data structures, etc., than actually
> > solving the problem. Use a language that has direct
> > support for strings, lists, pattern matching, etc. It
> > will be a fraction of the development effort.
>
> Too late. Well, I'm obviously not experienced in C. But
> all my programming experience in the past has been with
> assembler - for 6502, for x86(DOS), and MSP430. That's
> if you don't count Fortran and Basic. But last year I
> needed to modify the source of a TI program called BSLDEMO,
> which is the Windows side of some MSP430 parts' BSL setup.
> So I had to learn enough about C to do that and recompile.
>

There is plenty of Python code for working with the BSL too (it is the
main language used by the open source msp430 development tools, which
are far better than the stuff TI wrote themselves). But as you say, it
is too late now - and anyway, I doubt if you want to learn yet another
language at the moment.

> Then I had to write my own program to do the BSL thing for
> certain other MSP430 parts which use a custom bootloader. I
> could have written it in VBScript, except that VBScript
> doesn't do COM ports. So really the only option I had was
> C. And I had some very helpful ANs from Silabs about
> talking to COM ports in C that made it possible.
>
> It's not pretty code, but I managed to parse not only Intel
> Hex but also TI-TEXT files, then send the binary to the MCU.
> But those were tiny files, unlike the ones for STM32F's
> here.

Well, as has been pointed out, C doesn't care about the sizes here - you
can happily use 2 MB arrays in C on a PC without a second thought. (It
would be different in C on the msp430 or the STM32F, of course.)

>
> But I have to agree that the logic of the task at hand was
> not the big stumbling block. It was the C notation and
> jargon. And it's all so terse that it's not clear what's
> going on.
>
> Where is Pascal when we really need it? :-)
>

Delphi is alive and well for Windows. And you can work with comms ports
from Delphi without problem. I don't think it is significantly better
than C for this sort of thing, but of course that is a matter of
familiarity. And if you want a nice GUI, Delphi is a very
convenient way to make one.

David Brown

unread,
May 19, 2018, 12:04:48 PM5/19/18
to
On 19/05/18 00:10, supe...@casperkitty.com wrote:
> On Friday, May 18, 2018 at 4:33:13 PM UTC-5, Peabody wrote:
>> Where is Pascal when we really need it? :-)
>
> It got displaced by C for a couple of reasons:
>

>
> Those factors pretty much killed off Pascal even though it had a number of
> features that good languages should have but C doesn't, such as a proper
> pass-by-reference facility.
>

Of all the things which Pascal has but which C does not,
pass-by-reference (rather than passing a pointer in C) is not one that
would spring to mind. I'd rather pick the good enumerated types, the
ranged integer types, arrays using types or ranges for indices, and some
of the syntax like "with" blocks. (There are many things that C does
better than Pascal, of course.)

Richard Bos

unread,
May 19, 2018, 12:12:05 PM5/19/18
to
David Brown <david...@hesbynett.no> wrote:

> Of all the things which Pascal has but which C does not,
> pass-by-reference (rather than passing a pointer in C) is not one that
> would spring to mind. I'd rather pick the good enumerated types, the
> ranged integer types, arrays using types or ranges for indices, and some
> of the syntax like "with" blocks. (There are many things that C does
> better than Pascal, of course.)

Arrays using ranges for indices require run-time checking. Not something
I'd need to see added to C. Ditto for ranged integer types in general.
The only way you could introduce both to C is to make them compile-time
checkable (and making a provable constant assignment, e.g., a constraint
violation), but leave run-time overflows UB. I'm not sure that would be
a sufficient improvement to warrant the addition.

As for with... yeuch. I can see the attraction, but I _have_ seen the
mess it makes far too often.

Better enumerated types, though... yes. They'd only be useable if you
cannot assign anything to them which isn't the same type, though. So no
cross-assigning integers and enums.

Richard

David Brown

unread,
May 19, 2018, 12:32:45 PM5/19/18
to
On 19/05/18 18:11, Richard Bos wrote:
> David Brown <david...@hesbynett.no> wrote:
>
>> Of all the things which Pascal has but which C does not,
>> pass-by-reference (rather than passing a pointer in C) is not one that
>> would spring to mind. I'd rather pick the good enumerated types, the
>> ranged integer types, arrays using types or ranges for indices, and some
>> of the syntax like "with" blocks. (There are many things that C does
>> better than Pascal, of course.)
>
> Arrays using ranges for indices require run-time checking. Not something
> I'd need to see added to C. Ditto for ranged integer types in general.

No, they don't require run-time checking. They only require run-time
checking if you want run-time checking - which is costly (and I agree
not something I would usually want in C or Pascal), but can be very
helpful in debugging and testing.

> The only way you could introduce both to C is to make them compile-time
> checkable (and making a provable constant assignment, e.g., a constraint
> violation), but leave run-time overflows UB. I'm not sure that would be
> a sufficient improvement to warrant the addition.

In Pascal, you can write something like:

type
    days_of_week = (Sunday, Monday, Tuesday, Wednesday, Thursday,
                    Friday, Saturday);
    weekdays = Monday .. Friday;

    workday_hours = array[weekdays] of 0 .. 24;

var
    my_workday_hours : workday_hours;

Accessing elements of "my_workday_hours" is /exactly/ as efficient as
accessing an array in C. Doing so via a pointer is marginally less so,
because it requires an offset (since the index does not start at 0) that
probably can't be optimised away at compile time.

Checking for ranges can often be done at compile time - thus
"my_workday_hours[Sunday]" will give a compile-time error. Whether
ranges are checked at run-time, or are undefined behaviour, is a matter
of implementation flags. Pascal compilers usually support both choices.

In C, for comparison, you don't have a way to express this sort of
thing, and don't have a way to get compile-time checking of errors that
could be checked at compile-time. (Some compilers can do so to some
extent.)


>
> As for with... yeuch. I can see the attraction, but I _have_ seen the
> mess it makes far too often.

It certainly has its pros and cons, and is open to abuse as much as use.
Having a nice feature is no guarantee that people will use it to write
nice code.

>
> Better enumerated types, though... yes. They'd only be useable if you
> cannot assign anything to them which isn't the same type, though. So no
> cross-assigning integers and enums.
>

That is the case for Pascal enumerated types. They are independent
types, rather than just a collection of int-compatible constants.

> Richard
>

bartc

unread,
May 19, 2018, 2:03:15 PM5/19/18
to
On 19/05/2018 17:32, David Brown wrote:

> In Pascal, you can write something like:
>
> type
>     days_of_week = (Sunday, Monday, Tuesday, Wednesday, Thursday,
> Friday, Saturday);
>     weekdays = Monday ... Friday;
>
>     workday_hours = array[weekdays] of 0 .. 24;
>
> var
>     my_workday_hours : workday_hours;
>
> Accessing elements of "my_workday_hours" is /exactly/ as efficient as
> accessing an array in C.  Doing so via a pointer is marginally less so,
> because it requires an offset (since the index does not start at 0) that
> probably can't be optimised away at compile time.

In both cases (direct array and via a pointer), at worst there will be a
constant offset when the lower bound is not zero, but that can usually
be incorporated into the address mode of the instruction that accesses
the element (depends on instruction set).

And a compiler could eliminate that by having the address of the array
pointing to an imaginary 0th element rather than the first element.

--
bartc

supe...@casperkitty.com

unread,
May 19, 2018, 2:17:36 PM5/19/18
to
On Saturday, May 19, 2018 at 11:32:45 AM UTC-5, David Brown wrote:
> On 19/05/18 18:11, Richard Bos wrote:
> > Arrays using ranges for indices require run-time checking. Not something
> > I'd need to see added to C. Ditto for ranged integer types in general.
>
> No, the don't require run-time checking. They only require run-time
> checking if you want run-time checking - which is costly (and I agree
> not something I would usually want in C or Pascal), but can be very
> helpful in debugging and testing.

If a program will receive input from potentially-untrustworthy sources,
and if abnormal termination would be an acceptable consequence when given
invalid input, optimal code that relies upon run-time array checking may
be more efficient than optimal code with "manual" range checking, since
a compiler generating the former could adjust such code to fit other
optimizations like loop unrolling. Further, unless a program is speed
critical, the extra security benefits of performing range checks on all
values that might be accessed from outside threads ("legally" or not)
may justify the cost, especially if a compiler--unlike a programmer--can
recognize that something like:

arr[x]++;
...
arr[x]++;

where there are no apparent intervening operations on x, but its address
would be exposed to other threads, may be treated as either:

temp = x; /* Address of 'temp' not exposed to outside code */
boundscheck(temp);
arr[temp]++;
...
temp = x;
boundscheck(temp);
arr[temp]++;

or

temp = x; /* Address of 'temp' not exposed to outside code */
boundscheck(temp);
arr[temp]++;
... no intervening operations on temp
arr[temp]++;

at the compiler's convenience. Someone writing code manually would have
no way of knowing whether the cost of keeping 'temp' in the intervening
code would be more or less than the cost of re-loading and re-checking
the value.

> > The only way you could introduce both to C is to make them compile-time
> > checkable (and making a provable constant assignment, e.g., a constraint
> > violation), but leave run-time overflows UB. I'm not sure that would be
> > a sufficient improvement to warrant the addition.
>
> In Pascal, you can write something like:
>
> type
> days_of_week = (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday,
> Saturday);
> weekdays = Monday ... Friday;
>
> workday_hours = array[weekdays] of 0 .. 24;
>
> var
> my_workday_hours : workday_hours;
>
> Accessing elements of "my_workday_hours" is /exactly/ as efficient as
> accessing an array in C. Doing so via a pointer is marginally less so,
> because it requires an offset (since the index does not start at 0) that
> probably can't be optimised away at compile time.

Medium-complexity compilers would probably have a better shot at
optimizing accesses to a non-zero-based array than they would with code
in C that adjusts indices on every access. The fact that an array isn't
zero-based would give a strong hint to a compiler that, in contexts where
the address of the pointer itself isn't exposed to outside code, it
would likely benefit from using an adjusted pointer. A sufficiently
sophisticated C compiler might happen to notice that array "foo" is
always accessed with a subscript of "i-42", but only if it spends a fair
amount of time looking for a lot of similar potential optimizations that
don't pan out.

> Checking for ranges can often be done at compile time - thus
> "my_workday_hours[Sunday]" will give a compile-time error. Whether
> ranges are checked at run-time, or are undefined behaviour, is a matter
> of implementation flags. Pascal compilers usually support both choices.
>
> In C, for comparison, you don't have a way to express this sort of
> thing, and don't have a way to get compile-time checking of errors that
> could be checked at compile-time. (Some compilers can do so to some
> extent.)

> > Better enumerated types, though... yes. They'd only be useable if you
> > cannot assign anything to them which isn't the same type, though. So no
> > cross-assigning integers and enums.
> >
>
> That is the case for Pascal enumerated types. They are independent
> types, rather than just a collection of int-compatible constants.

A useful common extension is to allow conversions between enumerated
types and integer types using the syntax newType(valueToBeConverted).
Whether a compiler would trap an attempt to compute weekdays(0) or
weekdays(123) would typically be an implementation setting.

BTW, there are two different kinds of unsigned semantics which are
useful for different purposes; both Pascal and C would benefit from
having explicit types of each kind. Unsigned types are sometimes
used to represent numbers, and sometimes to represent members of an
abstract algebraic ring of integers congruent mod 2**N. Having
values that can be treated as a ring is useful, but it precludes the
possibility of useful overflow/bounds checks. If a 16-bit value is
being used as an algebraic ring (e.g. for checksum computation),
having to jump through hoops to avoid overflow trapping is not helpful.
On the other hand, if it's being used to store a quantity, trapping
on an attempt to store a value outside the range 0-65535 may be very
useful. If I were designing a language, I'd include both kinds of
values.

David Brown

unread,
May 20, 2018, 6:09:16 AM5/20/18
to
On 19/05/18 20:03, bartc wrote:
> On 19/05/2018 17:32, David Brown wrote:
>
>> In Pascal, you can write something like:
>>
>> type
>>      days_of_week = (Sunday, Monday, Tuesday, Wednesday, Thursday,
>> Friday, Saturday);
>>      weekdays = Monday ... Friday;
>>
>>      workday_hours = array[weekdays] of 0 .. 24;
>>
>> var
>>      my_workday_hours : workday_hours;
>>
>> Accessing elements of "my_workday_hours" is /exactly/ as efficient as
>> accessing an array in C.  Doing so via a pointer is marginally less
>> so, because it requires an offset (since the index does not start at
>> 0) that probably can't be optimised away at compile time.
>
> In both cases (direct array and via a pointer), at worst there will be a
> constant offset when the lower bound is not zero, but that can usually
> be incorporated into the address mode of the instruction that accesses
> the element (depends on instruction set).
>

Yes. Such arrays may have a slight extra cost compared to 0-based
arrays, but it is at most a very slight extra cost.

> And a compiler could eliminate that by having the address of the array
> pointing to an imaginary 0th element rather than the first element.
>

Possibly - that may involve other complications since the pointer no
longer points at the object itself. Certainly it is a conceivable
optimisation, and this is all a matter of implementation details.

Spiros Bousbouras

unread,
May 20, 2018, 3:05:35 PM5/20/18
to
On Fri, 18 May 2018 10:28:38 +0200
David Brown <david...@hesbynett.no> wrote:
> On 18/05/18 01:36, Peabody wrote:
> > David Brown says...
> >
> > > Just to be clear here, are you writing the C code for
> > > the microcontroller or for the PC?
> >
> > I'm doing a demonstration program for the PC - both an
> > encryption program and a decryption program. Just to make
> > sure I understand the algorithm properly.

[...]

> And don't write the PC program in C, unless are a very experienced C
> programmer and know no other programming languages. C (on a PC) has no
> problem with large lumps of data, statically or dynamically allocated.
> But it is a hopeless language for something like parsing Intel Hex files
> or other string manipulation. You'll spend far more time trying to
> figure out your buffers, memory allocations, data structures, etc., than
> actually solving the problem. Use a language that has direct support
> for strings, lists, pattern matching, etc. It will be a fraction of the
> development effort.

I very much disagree with this. I had a look at en.wikipedia.org/wiki/Intel_HEX
and the format is easy to parse; it would make a pleasant exercise. The OP
stated in <20180518-2...@Peabody.us.newsgroupdirect.com> that he has
experience with several other languages, so this particular project would be an
excellent way to get more experience in C. Now if someone has limited programming
experience *in general* then yes, this would perhaps be too ambitious.
There are already libraries in C for "strings, lists, pattern matching", but
implementing any of those on one's own is a rewarding exercise. And I'm sure
there are libraries for parsing the HEX format; the wikipedia article gives a
link to one.

In particular, the statement that C "is a hopeless language for" [..] "string
manipulation" is absurd. The standard library is not much good, but there are
plenty of other libraries and, again, writing one's own is a pleasant and
instructive exercise. One can experiment with different data structures,
methods of allocation, etc. Not many languages provide this level of fine
control, and writing code while having that level of fine control is a
satisfying experience. Perhaps it takes a certain kind of brain to find this
satisfying, but then we don't know whether the OP has this kind of brain.
He has said that he has experience with assembler, and that is close to C.

> Usually you shouldn't even bother with hex file parsing - just use
> objcopy, which will be part of the development tools for the
> microcontroller. It will convert your file to a nice raw binary image.

--
These whole systems of dark planets, those trillions of square kilometres of blank
paper, represented the Mind's future; the spaces it would fill in its life to come.
If it had one.
"Consider Phlebas"

Malcolm McLean

unread,
May 21, 2018, 6:00:06 AM5/21/18
to
On Sunday, May 20, 2018 at 8:05:35 PM UTC+1, Spiros Bousbouras wrote:
>
> In particular , the statement that C "is a hopeless language for" [..] "string
> manipulation" is absurd.
>
When you work with DNA data, strings become very long, and the simple algorithms
for searching and matching become far too slow.
In fact it can be other languages which are hopeless for implementing
something like a suffix tree. C can do it reasonably well and with space
efficiency. I haven't tried it in, say, Perl or Matlab, but I think you'd
resort to linking a C subroutine.

David Brown

unread,
May 21, 2018, 8:53:43 AM5/21/18
to
My understanding is that he is using C because he has no choice of
language for handling a required task - not because he wants to learn C.

For someone wanting to /learn/ general C programming, I agree that an
Intel Hex format parser might be a fine project. But if you are looking
at it from the other viewpoint - you need an Intel Hex format parser and
want to pick a language, then C would be a crazy choice (ignoring any
other of a myriad of reasons for or against choosing any particular
language).

> Now if someone has limited programming
> experience *in general* then yes , this would be perhaps too ambitious.
> There are already libraries in C for "strings, lists, pattern matching" but
> implementing any of those on one's own is a rewarding exercise. And I'm sure
> there are libraries for parsing the HEX format ; the wikipedia article gives a
> link to one.
>
> In particular , the statement that C "is a hopeless language for" [..] "string
> manipulation" is absurd.

I'll accept that it is an exaggeration. And I agree that there are
libraries to aid you here. But for languages that have a good string
handling along with native lists and other high-level structures, this
is all just a few lines of standard language code. In C, you are
researching libraries, finding out how to use them, etc. C is a
language in which you /can/ do practically anything - that does not mean
it is a good choice of language for all tasks.

supe...@casperkitty.com

unread,
May 21, 2018, 11:12:41 AM5/21/18
to
On Sunday, May 20, 2018 at 2:05:35 PM UTC-5, Spiros Bousbouras wrote:
> In particular , the statement that C "is a hopeless language for" [..] "string
> manipulation" is absurd. The standard library is not much good but there are
> plenty of other libraries and again , writing one's own is a pleasant and
> instructive exercise. One can experiment with different data structures ,
> methods of allocation , etc. Not many languages provide this level of fine
> control and writing the code while having that level of fine control is a
> satisfying experience. Perhaps it takes a certain kind of brain to find this
> satisfying but then we don't know if the OP has or hasn't this kind of brain.
> He has said that he has experience with assembler and that is close to C.

Doing most kinds of string processing without being able to retrieve the
lengths of strings in O(1) time is pretty much hopeless. If one ignores
everything in <string.h> but memmove, memcpy, and maybe memcmp it's
possible to write useful string-processing libraries from scratch, but
on most implementations there's no nice way to make string literals
interact smoothly with other kinds of strings. As of C11, the best
approach I've found, which also works back to C89, would be to create
a macro which takes an identifier name and a string literal, and creates
a static identifier with that name that encodes that string. For
example, one could have a macro

SHORTLIT(name, dat)

which will accept a literal string of up to 63 characters and encode it
with a single-byte length prefix in the first byte, a second macro

MEDLIT(name, dat)

which will accept a string of up to 4095 characters and encode it with a
two-byte length prefix (using some bits of the first byte to indicate that
the prefix is two bytes), other macros for LONG or HUGE strings, and
macros for short, medium, long, or huge length-checked buffers that would
then be initialized with an INITSTRING() macro prior to use (the macro
would store the buffer size in the header).

Unfortunately, while it would be far more convenient to say:

concatLiteral(mybuff, "This is my message");
or
concats(mybuff, SHORTLITVALUE("This is my message"));

than to say

STR64(thisIsMyMessage, "This is my message");
...
concats(mybuff, thisIsMyMessage);

the lack of static compound literals in C means that there's no nice way to
avoid having to define separately-named objects for every literal string.

Malcolm McLean

unread,
May 21, 2018, 11:27:25 AM5/21/18
to
On Monday, May 21, 2018 at 4:12:41 PM UTC+1, supe...@casperkitty.com wrote:
>
> Doing most kinds of string processing without being able to retrieve the
> lengths of strings in O(1) time is pretty much hopeless. If one ignores
> everything in <string.h> but memmove, memcpy, and maybe memcmp it's
> possible to write useful string-processing libraries from scratch, but
> on most implementations there's no nice way to make string literals
> interact smoothly with other kinds of strings.
>
Everyone at some stage says "I can do better than C's asciiz strings"
and writes

typedef struct
{
    char *data;
    size_t length;
} String;

In reality adding a length parameter doesn't buy you much. Theoretically
it transforms a lot of operations from O(N) to O(constant), at least
in one argument. But you've also got to look at the factor, simply
stepping over a char is a very fast operation. And most strings in most
applications are short. And mostly you are applying only a few operations
to each string.

Really what matters is higher-level operations, not shaving time off
reimplementations of strcpy() and strcat().

Then, as you say, being able to pass a literal as a string argument
can be a huge advantage.

bartc

unread,
May 21, 2018, 11:59:41 AM5/21/18
to
Some libraries can be useful. The problem is that, when posting short
pieces of code in a forum like this, the simplest bit of string
processing using the library will require that library, either added to
the code or via a link.

Then using only the standard string functions is handy.

But when you do use a higher level library, then using counted strings
/is/ useful. For one thing, it allows you to refer to a substring of
another string, something not possible when strings must be zero-terminated.

It also allows you to have embedded zeros within a string, so that a
string object can represent the contents of any file, even an executable
(I use this feature actually), or just any memory block.

And when the strings /can/ be long, then operations can be far more
efficient:

------------------------------------
#include <stdio.h>
#include <string.h>

char string[1000001];

int main(void) {
    int i;
    string[0] = 0;

    for (i = 0; i < 1000000; ++i) {
        strcat(string, "A");
    }

    printf("strlen(string) = %zu\n", strlen(string));
}
------------------------------------

This took 95 seconds using gcc -O3.

Using an interpreted version (see sig), and choosing a non-optimised
100% HLL version of an interpreter expressed in C and compiled with Tiny
C, it took 0.08 seconds, and half of that is process startup/terminate
overheads.

Needless to say, that version uses counted strings. It also has to
allocate space as it goes along, i.e. have an expanding buffer, something
I haven't bothered the C version with as it's got enough to do.

--
bartc

proc start=
s:=""
to 1 million do
s +:= ' '
od

println s.len
end

Malcolm McLean

unread,
May 21, 2018, 12:16:51 PM5/21/18
to
On Monday, May 21, 2018 at 4:59:41 PM UTC+1, Bart wrote:
>
> And when the strings /can/ be long, then operations can be far more
> efficient:
>
> ------------------------------------
> #include <stdio.h>
> #include <string.h>
>
> char string[1000001];
>
> int main(void) {
> int i;
> string[0]=0;
>
> for (i=0; i<1000000; ++i) {
> strcat(string, "A");
> }
>
> printf("strlen(string) = %d\n",strlen(string));
> }
> ------------------------------------
>
> This took 95 seconds using gcc-O3.
>
> Using an interpreted version (see sig), and choosing a non-optimised
> 100% HLL version of an interpreter expressed in C and compiled with Tiny
> C, it took 0.08 seconds, and half of that is process startup/terminate
> overheads.
>
Yes, but if you're pushing single characters onto the back of a string,
it's just as easy to do

for(i=0;i<1000000;i++)
string[i] = 'A';

string[i] = 0;

A higher-level language must allow a "reserve" function to avoid O(N)
reallocations, and the caller must use it; even then it will be about a
factor of ten slower, because the length field is updated and checked
against capacity on every call to the concatenate function.

supe...@casperkitty.com

unread,
May 21, 2018, 12:40:16 PM5/21/18
to
The performance of strcpy() isn't the problem, since that function is
going to have to look at every byte of the source string whether or not
the length is known in advance.

Bigger problems are things like substring extraction (which requires
ensuring that there aren't any zero bytes in the space before the part
of the string of interest), and the lack of any practical means of
providing buffer safety or smooth buffer management unless one manually
keeps track of the length of everything.

A good general-purpose string library could accommodate a number of string
formats, identifiable via prefix byte, and supply a couple functions which
given a pointer to a string's header, could fill in one of the following
structures:

struct readableString {
    uint8_t typeMarker[1];
    char *dat;
    size_t length;
};

or

struct writableString {
    uint8_t typeMarker[1];
    char *dat;
    size_t length;
    RESIZER_INFO *block_control;
};

(the choice being based upon whether code would be writing to the string).
Functions which expect to read from a string would be able to accept
either of the above, a length-prefixed constant string, or a length-prefixed
fixed-size buffer. Those that write to a string could either accept a
length-prefixed writable buffer or a struct writableString.

Such a design would work well even in the one situation where asciiZ
strings would have a significant performance advantage over other length-
prefixed forms: passing the tail of a string. Unlike asciiz strings,
however, the design would not be limited to passing just the tail. Any
substring could be passed to any function expecting a prefixed string
by constructing a readableString struct where the dat pointer and length
identify the content of interest and passing the address of that struct.
The header byte would identify the object as a readableString, so any
function that uses the "convert prefixed string to readableString" function
would be able to use it just like any other readable string.

The only downsides to functions that use such strings would be that they
would not be able to accept pointers to asciiz strings without first
converting them to a readableString struct, a process which would in turn
require strlen(), and that buffers which would be passed as "destination"
strings would need to have their headers initialized first (so the code
receiving a pointer to the space would know how big it was). Unfortunately
there's no way to initialize just part of a structure in a declaration.
If code (likely via macro) were to declare an automatic object like:

struct { uint8_t header[2]; char dat[2050];} myBuff[1] =
{ 4, 128}; // Indicates empty buffer of size 2048

a compiler would be required to zero out all 2050 bytes of "dat" even if
none of their initial values would ever be observed. If the macro would
instead generate:

struct { uint8_t header[2]; char dat[2050];} myBuff[1];
myBuff[0].header[0] = 4; myBuff[0].header[1] = 128;

it would be unsuitable for objects of static duration.

Such downsides aren't trivial, but some slight language improvements could
eliminate them.

bartc

unread,
May 21, 2018, 1:27:57 PM5/21/18
to
On 21/05/2018 17:16, Malcolm McLean wrote:
> On Monday, May 21, 2018 at 4:59:41 PM UTC+1, Bart wrote:
>>
>> And when the strings /can/ be long, then operations can be far more
>> efficient:
>>
>> ------------------------------------
>> #include <stdio.h>
>> #include <string.h>
>>
>> char string[1000001];
>>
>> int main(void) {
>> int i;
>> string[0]=0;
>>
>> for (i=0; i<1000000; ++i) {
>> strcat(string, "A");
>> }
>>
>> printf("strlen(string) = %d\n",strlen(string));
>> }
>> ------------------------------------
>>
>> This took 95 seconds using gcc-O3.
>>
>> Using an interpreted version (see sig), and choosing a non-optimised
>> 100% HLL version of an interpreter expressed in C and compiled with Tiny
>> C, it took 0.08 seconds, and half of that is process startup/terminate
>> overheads.
>>
> Yes, but if you're pushing single characters onto the back of a string,
> it's just as easy to do
>
> for(i=0;i<1000000;i++)
> string[i] = 'A';


Real uses won't fit into such a tidy loop, where you know the final
length of the string among other things, and always know the current
length (otherwise this could be done with memset()).

Imagine needing to implement this function:

void appendchar(char* s, char c) { // assume adequate capacity
...
}

without any other information about s. The simplest implementation
involves scanning the string from the beginning.

> A higher level language must allow a "reserve" function to avoid O(N)
> reallocations, caller must use it, and it will be about a factor of ten
> slower because the length field is updated and checked against capacity
> on every call to the concatenate function.

Ten times slower than what?

My dynamic example does exactly that, yet was 10,000 times faster than
optimised C code (using my usual interpreter), despite the language used
normally being 20 times slower.

Counted strings work!

--
bartc

supe...@casperkitty.com

unread,
May 21, 2018, 1:57:33 PM5/21/18
to
On Monday, May 21, 2018 at 12:27:57 PM UTC-5, Bart wrote:
> Real uses won't fit into that such a tidy loop, where you know the final
> length of the string among other things, and always know the current
> length (otherwise this could be done with memset().)
>
> Imagine needing to implement this function:
>
> void appendchar(char* s, char c) { // assume adequate capacity
> ...
> }
>
> without any other information about s. The simplest implementation
> involves scanning the string from the beginning.

Or, slightly more generally, a function that will need to append some
amount of text not known in advance, to a buffer which might or might
not be large enough or resizable, and which must fail in deterministic
fashion if the buffer isn't large enough and isn't resizable.

Using the kind of prefixed strings I advocate, it would be fairly simple.

// Every string-type structure would have a field "str" of array type.

#define appendFormattedLong(dest, n) appendFormattedLong_((dest)->str, n)

void appendFormattedLong_(STRING *dest, uint64_t n)
{
    char contentToAppend[64];
    int len = sprintf(contentToAppend, "%llu", (unsigned long long)n);

    readableString src = makeReadableString(contentToAppend, len);

    concatString(dest, &src);
}

This function would work equally well with fixed-sized buffers that are
statically allocated or embedded in other structures, or with dynamically-
allocated resizable buffers. All of the complexity would be encapsulated
in makeReadableString (which would simply populate some fields of a
structure and return it), and setStringLength, so functions like the
above wouldn't need to know or care about kinds or sizes of buffers
supplied by client code, and any code which needed to pass around strings
without manipulating the internals itself could just pass around strings.

Scott Lurndal

unread,
May 21, 2018, 2:30:00 PM5/21/18
to
"Equally well" in this case means performs poorly.

Strings stored as length-field + storage have been around
for fifty years or more, and are nothing new. Your proposal
to store the length as variable-length ASCII is, however, new. And a really
bad idea.

A real string has four bounds: Start of storage, end of storage,
start of string within storage, end of string within storage.

See for example the vax MOVC5 or the Burroughs MVS/CPS/HSH instructions.

bartc

unread,
May 21, 2018, 2:32:07 PM5/21/18
to
On 21/05/2018 17:40, supe...@casperkitty.com wrote:

> A good general-purpose string library could accommodate a number of string
> formats, identifiable via prefix byte, and supply a couple functions which
> given a pointer to a string's header, could fill in one of the following
> structures:
>
> struct readableString {
> uint8_t typeMarker[1];
> char *dat;
> size_t length;
> }
>
> or
>
> struct writableString {
> uint8_t typeMarker[1];
> char *dat;
> size_t length;
> RESIZER_INFO *block_control;
> }
>
> (the choice being based upon whether code would be writing to the string).

The choices would be between strings that are always a fixed length (and
which can still be writeable, or mutable, as they usually are in C), and
strings that can grow, or sometimes shrink.

Note that on a 64-bit machine, that prefix byte would occupy an 8-byte
slot, so the first takes 24 bytes and the second 32 bytes; not really
much to choose between them.

(I could create descriptors for both within 16 bytes, even with 64-bit
pointers, but it would limit the length of a single string to 4GB.
Longer strings can just about be accommodated with 16 bytes, but gets
more fiddly and less efficient.)

> Functions which expect to read from a string would be able to accept
> either of the above, a length-prefixed constant string, or a length-prefixed
> fixed-size buffer.

So two kinds of string descriptor, and two kinds of pointers to actual
strings which use a prefix byte(s) inline with the actual data?

That sounds too much, and would make it a headache to write functions
that take such a mix of parameters, as the second two require the string
pointer and the string length to be accessed in a different way.

It might be OK, if this is part of a string type in a dynamic language,
where there other overheads already.

But to make life simpler, I think there should be one kind of descriptor
with .dat and .length always accessed the same way.


--
bartc

supe...@casperkitty.com

unread,
May 21, 2018, 3:08:52 PM5/21/18
to
On Monday, May 21, 2018 at 1:30:00 PM UTC-5, Scott Lurndal wrote:
> "Equally well" in this case means performs poorly.

Better than zero-terminated strings in many cases, and better than other
formats in the only case where zero-terminated strings beats them.

> Strings stored as length-field + storage have been around
> for fifty years or more, and are nothing new. Your proposal
> to store the length as variable length ASCII is, however new. And a really
> bad idea.

What do you mean "ASCII"? It wouldn't be stored as a sequence of digits.
While processing a variable-length encoding is a bit less efficient than
processing a length stored in a single word, it avoids the need to either
set a hard maximum string length or waste a lot of space on small strings
that are packed into structures [if a structure is supposed to hold a 3-
character string, adding a 4-byte length would double the space required].

> A real string has four bounds: Start of storage, end of storage,
> start of string within storage, end of string within storage.

For readable strings, will it matter how long the storage beyond the end
of the string is? For writable strings, will the start of storage matter
in cases where it couldn't be derived from the start of the string?

Approaches where functions that expect strings must always be given
pointers to managed string objects [like the Windows BSTR] can be workable,
but mean that it's not possible for objects to encapsulate string values
within their own bit patterns, even when the amount of space required is
statically known.

bartc

unread,
May 21, 2018, 3:09:16 PM5/21/18
to
On 21/05/2018 18:57, supe...@casperkitty.com wrote:

> Using the kind of prefixed strings I advocate, it would be fairly simple.
>
> // Every string-type structure would have a field "str" of array type.
>
> #define appendFormattedLong(dest, n) appendString_((dest)->str, n)
>
> void appendFormattedLong_(STRING *dest, uint64_t n)
> {
> char contentToAppend[64];
> unsigned len = sprintf(contentToAppend, "%llu", n);
>
> readableString src = makeReadableString(contentToAppend, len);
>
> concatString(dest, &src);
> }
>
> This function would work equally well with fixed-sized buffers that are
> statically allocated or embedded in other structures, or with dynamically-
> allocated resizable buffers. All of the complexity would be encapsulated
> in makeReadableString (which would simply populate some fields of a
> structure and return it), and setStringLength, so functions like the
> above wouldn't need to know or care about kinds or sizes of buffers
> supplied by client code, and any code which needed to pass around strings
> without manipulating the internals itself could just pass around strings.

So that's using the library, which is the problem with C. Everyone uses
their own string library, making it harder to share code.

And for posting small programs here, which happen to make use of some
string ops, it's not really practical to use any kind of library.

--
bartc

supe...@casperkitty.com

unread,
May 21, 2018, 3:56:19 PM5/21/18
to
On Monday, May 21, 2018 at 1:32:07 PM UTC-5, Bart wrote:
> The choices would be between strings that are always a fixed length (and
> which can still be writeable, or mutable, as they usually are in C), and
> strings that can grow, or sometimes reduce.

It would be possible to always use the latter form at the cost of an extra
pointer store, but since string values are read much more often than they
are written, eliminating that store for the cases where code isn't going to
be writing to a string would probably be a useful optimization.

Further, distinguishing the actions of acquiring a read-only descriptor vs.
a writable descriptor would allow more safety-checks (an attempt to pass
a constant string to a function that's going to write it should trap whether
or not the function tries to change the length) and would also allow an
implementation to transparently support a "copy string" function that uses
shared copy-on-write buffers.

> Note that on a 64-bit machine, that prefix byte would occupy an 8-byte
> slot, so the first takes 24 bytes and the second 32 bytes; not really
> much to choose between them.

That plus the fact that acquiring a read descriptor and then attempting to
modify the string should give an error [I meant to include a "const"
qualifier on the "dat" pointer].

> (I could create descriptors for both within 16 bytes, even with 64-bit
> pointers, but it would limit the length of a single string to 4GB.
> Longer strings can just about be accommodated with 16 bytes, but gets
> more fiddly and less efficient.)

The point is that code which wants to hold onto a bunch of short string
values which it isn't actively using shouldn't have to hold onto a bunch
of 16-byte string descriptors, and code which wants to relocate a block
of storage containing a bunch of string buffers that aren't actively
being used should simply be able to move all the bytes of those buffers,
without having to update a whole bunch of string descriptors.

For constant strings up to 64 bytes the overhead would be one byte. For
string buffers up to 64 bytes, it could be one byte if thread-safety is
not required, or else two bytes. Each additional byte of header (or two
bytes if thread-safety is needed) would scale the allowable length by a
factor of 64.

> > Functions which expect to read from a string would be able to accept
> > either of the above, a length-prefixed constant string, or a length-prefixed
> > fixed-size buffer.
>
> So two kinds of string descriptor, and two kinds of pointers to actual
> strings which use a prefix byte(s) inline with the actual data?

Basically. Eliminating one of the kinds of actual strings would increase
the storage requirement for constant strings. Eliminating one kind of
string descriptor would mean losing some protection against certain kinds
of programming mistakes.

> That sounds too much, and would make it a headache to write functions
> that take such a mix of parameters, as the second two require the string
> pointer and the string length to be accessed in a different way.

A function that wants to e.g. output all the bytes of a string would be
something like:

#define outputStringAsBytes(st) outputStringAsBytes_((st)->str)
void outputStringAsBytes_(STRING *p)
{
    readableString str = getReadableString(p);
    for (int i = 0; i < str.length; i++)
        printf("%d\n", str.dat[i]);
}

No need for the function to know or care about what kind of string it's
given. A function to append one character to a string without using the
"concat" method would look something like:

#define appendOneByte(st,ch) appendOneByte_((st)->str, ch)
void appendOneByte_(STRING *p, int ch)
{
    writableString str = getWritableString(p);
    setStringLength(&str, str.length + 1);
    str.dat[str.length - 1] = ch;
}

All of the code that had to worry about whether these functions were being
given a pointer to "prefixed data" strings or string descriptors, would be
in the "getReadableString", "getWritableString", and "setStringLength"
functions. User code could handle all such usage scenarios automatically
without having to know or care about them.

supe...@casperkitty.com

unread,
May 21, 2018, 4:02:38 PM5/21/18
to
On Monday, May 21, 2018 at 2:09:16 PM UTC-5, Bart wrote:
> So that's using the library, which is the problem with C. Everyone uses
> their own string library, making it harder to share code.

Everyone uses their own string library because the one that's defined by
the Standard is grossly unsuitable for most purposes.

Any single library would require making some design trade-offs (e.g. should
it allocate an extra zero byte past the end of each string even when the
library itself would know how big strings are without it), but having
a means of taking a pointer to a string and getting the address and size of
its text, or taking a pointer and length and producing a valid string
descriptor, should make most kinds of data interchange with other kinds
of string libraries fairly straightforward, especially if one is willing to
waste a byte on most strings to accommodate a trailing zero.

bartc

unread,
May 21, 2018, 5:47:23 PM5/21/18
to
One advantage of a counted string is that strings can contain zeros. So,
while not having zero-termination can cause a few problems with
interfacing to functions that expect C-style strings, once you're over
that you can move on.

Here's an example (not C code but a language with proper string types):

s := readstrfile("c:/tdm/bin/gcc.exe")
println s.len
writestrfile("test.exe", s)

system("test.exe --version")

This reads a binary file into a single string, prints the string length,
writes it out again and executes that new file. Output is:

833536
test.exe (tdm64-1) 5.1.0
Copyright (C) 2015 Free Software Foundation, Inc.
....

Actually, with suitable string types and equivalent functions, the same
could be done in C (not sure about managing the memory used by the strings).
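
A C approximation of the read side, using an explicit length so embedded zero
bytes survive (the function name is mine):

```c
#include <stdio.h>
#include <stdlib.h>

// Read an entire (possibly binary) file into a counted buffer.
// Returns the buffer and sets *len; caller frees. NULL on error.
char *read_whole_file(const char *path, size_t *len) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long n = ftell(f);
    rewind(f);
    if (n < 0) { fclose(f); return NULL; }
    char *buf = malloc(n > 0 ? (size_t)n : 1);
    if (buf && fread(buf, 1, (size_t)n, f) != (size_t)n) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) *len = (size_t)n;   // length tracked explicitly, zeros and all
    return buf;
}
```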

(BTW here's how such a non-zero-terminated string might be printed via
printf (not C, the printf is a foreign function call):

s := "abcdef"
printf("%.*s\n", s.len, &s)

This however won't work well if the string data is actually binary, and
will stop printing prematurely if there is an embedded zero. But then,
normal C strings could also contain binary data that it would be unwise
to print out.)

--
bartc

Ian Collins

unread,
May 21, 2018, 6:12:18 PM5/21/18
to
On 22/05/18 03:27, Malcolm McLean wrote:
> On Monday, May 21, 2018 at 4:12:41 PM UTC+1, supe...@casperkitty.com wrote:
>>
>> Doing most kinds of string processing without being able to retrieve the
>> lengths of strings in O(1) time is pretty much hopeless. If one ignores
>> everything in <string.h> but memmove, memcpy, and maybe memcmp it's
>> possible to write useful string-processing libraries from scratch, but
>> on most implementations there's no nice way to make string literals
>> interact smoothly with other kinds of strings.
>>
> Everyone at some stage says "I can do better than C's asciiz strings"
> and writes
>
> typedef struct
> {
> char *data;
> size_t length;
> } String;
>
> In reality adding a length parameter doesn't buy you much. Theoretically
> it transforms a lot of operations from O(N) to O(constant), at least
> in one argument. But you've also got to look at the factor, simply
> stepping over a char is a very fast operation. And most strings in most
> applications are short. And mostly you are applying only a few operations
> to each string.

It also brings in overhead, and scope for obscure bugs in maintaining
the correct value in the count.

C's lack of encapsulation makes this kind of "improvement" either unsafe
or reliant on opaque types. Simplicity is C's strongest attribute!

> Really what matters is higher-level operations, not shaving time off
> reimplementations of strcpy() and strcat().

True.

> Then, as you say, being able to pass a literal as a string argument
> can be a huge advantage.

Also true!

--
Ian.

supe...@casperkitty.com

unread,
May 21, 2018, 7:08:21 PM5/21/18
to
On Monday, May 21, 2018 at 5:12:18 PM UTC-5, Ian Collins wrote:
> It also brings in an overhead and scope for obscure bugs in maintaining
> the correct value in the count.
>
> C's lack of encapsulation makes this kind of "improvement" ether unsafe
> or requires the use of opaque types. Simplicity is C's strongest attribute!

Opaque types have many advantages, and their primary disadvantages are
due to deficiencies in the language which could have been corrected fairly
easily (e.g. it should be possible to declare a type of object whose address
may be implicitly converted to a pointer to a different type with which it
shares a common initial sequence, thus allowing a structure to accept
pointers to a wide range of structures without having to forego type-checking
altogether).

Although long C-style strings tend to be rather rare in code that uses the
zero terminator to determine length, that's not because there's no need for
longer strings. Instead it's because any code wanting to manipulate longer
strings will have to use some other representation. While the sets of
built-in operations in Java and .NET have some unfortunate omissions, they
can still handle multi-megabyte strings much more efficiently than C would
be able to handle strings whose length was tracked solely by the location
of the first zero byte.

Scott Lurndal

unread,
May 22, 2018, 9:29:00 AM5/22/18
to
supe...@casperkitty.com writes:
>On Monday, May 21, 2018 at 1:30:00 PM UTC-5, Scott Lurndal wrote:
>> "Equally well" in this case means performs poorly.
>
>Better than zero-terminated strings in many cases, and better than other
>formats in the only case where zero-terminated strings beats them.
>
>> Strings stored as length-field + storage have been around
>> for fifty years or more, and are nothing new. Your proposal
>> to store the length as variable length ASCII is, however new. And a really
>> bad idea.
>
>What do you mean "ASCII"?

The output of the sprintf (deprecated) that you used to format
the length at the beginning of the string, of course (which you
conveniently elided from your reply).

supe...@casperkitty.com

unread,
May 22, 2018, 10:34:34 AM5/22/18
to
Sorry you misunderstood what the sprintf was for. Its purpose was not to
format the length, but rather to supply a variable amount of text. If
one were to do equivalent code using C strings:

void appendFormattedNumber(char *dest, long long n)
{
    char dataToAppend[64];
    sprintf(dataToAppend, "%lld", n);
    strcat(dest, dataToAppend);
}

the caller would have to know in advance how much data the
appendFormattedNumber might add to the string and allocate data for it; if
the function adds more data than the caller expects, the wheels fall off.
Trying to add any kind of length checking to such a function would make it
much more complicated, and totally negate any purpose to using strcat.

My code used the return value from sprintf, which reports the number of
characters formatted, to avoid having to call strlen on the result. Sorry
if the purpose wasn't clear.

John Bode

unread,
May 22, 2018, 12:05:07 PM5/22/18
to
On Wednesday, May 16, 2018 at 4:16:29 PM UTC-5, Peabody wrote:
> Ben Bacarisse says...
>
> > Using malloc is not hard. The only trouble is that you
> > should check the result, but for programs that can get
> > all space they need at the start that's simple.
>
> So you say, but you're fluent in C. For the rest of
> humanity, I suspect malloc suffers from the same issue that
> plagues all normal people.
>
> Of course I'm speaking of pointers.

Pointers really aren't that hard to understand or use. It does take some practice to get
comfortable with them, but they're not the boojums most people make them out to be.

For your case, you only need to take baby steps into pointer world:

#include <stdlib.h>

#define ARRSIZE 50000

int main( void )
{
    char *arr = malloc( sizeof *arr * ARRSIZE );
    if ( arr )
    {
        // arr[i] = some_value, some_thing = arr[j], etc.
        free( arr );
    }
    return 0;
}

For pointer declaration syntax, the basic rules are

// Declare a pointer to T
T *p;

// Declare a pointer to const T - you can update the pointer (set it to point to a different
// object), but not the thing being pointed to:
const T *p;
T const *p;

// Declare a const pointer to T - you can update the thing being pointed to, but
// not the pointer (you can't change it to point to a different object):
T * const p;

// Declare an array of pointers to T
T *ap[N];

// Declare a pointer to an array of T
T (*pa)[M];

// Declare a function returning a pointer to T:
T *f();

// Declare a pointer to a function returning T
T (*f)();

Remember that the * operator is part of the *declarator*, not the type specifier. The
declaration

T* a, b;

is parsed as

T (*a), b;

Only a is declared as a pointer to T.

The array subscript operator [] is defined in terms of pointer arithmetic. a[i] is *defined*
as *(a + i) - given the starting address a, compute the address of the i'th *object* following
a and dereference the result. Thus,

*p == *(p + 0) == p[0]
(*pa)[i] == (*(pa + 0))[i] == (pa[0])[i] == pa[0][i]

Arrays are not pointers, but array *expressions* are converted to pointer expressions as
necessary for subscripting.

Keith Thompson

unread,
May 22, 2018, 12:19:59 PM5/22/18
to
John Bode <jfbod...@gmail.com> writes:
[...]
> // Declare a function returning a pointer to T:
> T *f();
>
> // Declare a pointer to a function returning T
> T (*f)();

Better:

T *f(void);
T (*f)(void);

The empty parentheses indicate an old-style function declaration, an
obsolescent feature.

[...]

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

bartc

unread,
May 22, 2018, 2:24:49 PM5/22/18
to
On 22/05/2018 17:04, John Bode wrote:
> On Wednesday, May 16, 2018 at 4:16:29 PM UTC-5, Peabody wrote:
>> Ben Bacarisse says...
>>
>> > Using malloc is not hard. The only trouble is that you
>> > should check the result, but for programs that can get
>> > all space they need at the start that's simple.
>>
>> So you say, but you're fluent in C. For the rest of
>> humanity, I suspect malloc suffers from the same issue that
>> plagues all normal people.
>>
>> Of course I'm speaking of pointers.
>
> Pointers really aren't that hard to understand or use. It does take some practice to get
> comfortable with them, but they're not the boojums most people make them out to be.
>
> For your case, you only need to take baby steps into pointer world:
<snip>

These are some big 'baby' steps (enough to put people off if they think
there's even more).

Simple pointers don't need to involve the use of 'const', or pointers to
arrays, or pointers to functions. (Pointers to arrays are hardly ever
used anyway).

--
bartc

Scott Lurndal

unread,
May 22, 2018, 2:57:22 PM5/22/18
to
r...@zedat.fu-berlin.de (Stefan Ram) writes:
>Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>>Using malloc is not hard.
>
> I expect many programs with more than 1000 LOCs out there
> that from time to time use "malloc" to contain bugs like
> using memory without checking for NULL or not freeing
> allocated memory when it should be freed or freeing a wrong
> pointer or using memory already freed or accessing allocated
> memory beyond its bounds at least along some execution
> paths. And when the average programmer can't do it right,
> then I call it "hard".
>

So, what _do_ you know about the "average programmer"? And
how did you obtain that knowledge?

John Bode

unread,
May 22, 2018, 3:45:13 PM5/22/18
to
On Tuesday, May 22, 2018 at 1:24:49 PM UTC-5, Bart wrote:
> On 22/05/2018 17:04, John Bode wrote:
> > On Wednesday, May 16, 2018 at 4:16:29 PM UTC-5, Peabody wrote:
> >> Ben Bacarisse says...
> >>
> >> > Using malloc is not hard. The only trouble is that you
> >> > should check the result, but for programs that can get
> >> > all space they need at the start that's simple.
> >>
> >> So you say, but you're fluent in C. For the rest of
> >> humanity, I suspect malloc suffers from the same issue that
> >> plagues all normal people.
> >>
> >> Of course I'm speaking of pointers.
> >
> > Pointers really aren't that hard to understand or use. It does take some practice to get
> > comfortable with them, but they're not the boojums most people make them out to be.
> >
> > For your case, you only need to take baby steps into pointer world:
> <snip>
>
> These are some big 'baby' steps (enough to put people off if they think
> there's even more).
>

For the stated problem, yes, all you need is a simple pointer. I added the other stuff
as some (what I hoped would be) useful information.

If you want to program in C, you *have* to understand pointer syntax. *Have* to. It's not
optional. You cannot write useful C code that doesn't use pointers somewhere in some
form, and it's to your advantage to take the time to learn how pointer syntax actually
works, not just for simple pointer types, but for complex pointer-to-array, pointer-to-
function, pointer-to-function-returning-pointer-to-array-of-pointers-to-function types as
well. There's no shortcut around it. It's like trying to learn calculus without internalizing
trig identities.

I had this silly notion that presenting a comprehensive list of examples up front
just might be useful for people learning the concept. Why am I wrong?

If you don't want to deal with pointers *at all*, then don't bother learning C. There are
plenty of languages out there that will probably work just as well for you, maybe even
better depending on what you're trying to do.

> Simple pointers don't need to involve the use of 'const',

Sometimes they do. Sometimes I want to make sure a function doesn't accidentally modify
something through a pointer parameter. Sometimes I want to make sure a pointer can't
accidentally be changed to point to a different object.

> or pointers to arrays, or pointers to functions. (Pointers to arrays are hardly ever
> used anyway).

I use 'em all the damned time because, guess what, *they're useful*. If you're using any
number of GUI frameworks or pthreads, you're using function pointers. If you're passing
multi-dimensional arrays around, you're dealing with pointers to arrays. Not using these
constructs is a bit like never taking your car out of third gear, or only using major and
minor chords in a song. You just limit yourself needlessly.

bartc

unread,
May 22, 2018, 4:02:37 PM5/22/18
to
On 22/05/2018 20:45, John Bode wrote:
> On Tuesday, May 22, 2018 at 1:24:49 PM UTC-5, Bart wrote:

>> Simple pointers don't need to involve the use of 'const',
>
> Sometimes they do.

If you take a working program and remove all the consts, then probably
it will continue to work. So they are an embellishment.

>
> I use 'em all the damned time because, guess what, *they're useful*. If you're using any
> number of GUI frameworks or pthreads, you're using function pointers.

Not in beginner programs. Although if you just need to pass a function
to some library routine, then you probably don't need to actually declare
or use function pointers.

> If you're passing
> multi-dimensional arrays around, you're dealing with pointers to arrays.

That doesn't happen too often either. But even then, the common C idiom
is not to use a pointer to array of T for the first (or only) dimension,
but a pointer to T.

--
bartc

Keith Thompson

unread,
May 22, 2018, 4:22:57 PM5/22/18
to
bartc <b...@freeuk.com> writes:
> On 22/05/2018 20:45, John Bode wrote:
>> On Tuesday, May 22, 2018 at 1:24:49 PM UTC-5, Bart wrote:
>
>>> Simple pointers don't need to involve the use of 'const',
>>
>> Sometimes they do.
>
> If you take a working program and remove all the consts, then probably
> it will continue to work. So they are an embellishment.

If you remove all the consts including those in declarations of library
functions, then yes, it will probably continue to work -- until you
modify it and introduce a bug because the compiler won't catch attempts
to modify objects that should be read-only.

>> I use 'em all the damned time because, guess what, *they're
>> useful*. If you're using any number of GUI frameworks or pthreads,
>> you're using function pointers.
>
> Not in beginner programs. Although if you just need to pass a function
> to some library routine, then probably don't need to actually declare or
> use function pointers.

If you pass a function to some library routine, then you are using a
function pointer. (Strictly speaking, an ordinary function call
involves the use of a function pointer, though you don't really need to
be aware of that.)

>> If you're passing
>> multi-dimensional arrays around, you're dealing with pointers to arrays.
>
> That doesn't happen too often either. But even then, the common C idiom
> is not to use a pointer to array of T for the first (or only) dimension,
> but a pointer to T.

If you're dealing with a true multidimensional array (i.e., an array
of arrays), then you're using pointers to arrays, if only implicitly.
(You might be able to get away with pretending that this isn't
the case.)

luser droog

unread,
May 22, 2018, 7:59:22 PM5/22/18
to
On Tuesday, May 22, 2018 at 2:45:13 PM UTC-5, John Bode wrote:

> If you want to program in C, you *have* to understand pointer syntax. *Have* to. It's not
> optional. You cannot write useful C code that doesn't use pointers somewhere in some
> form, and it's to your advantage to take the time to learn how pointer syntax actually
> works, not just for simple pointer types, but for complex pointer-to-array, pointer-to-
> function, pointer-to-function-returning-pointer-to-array-of-pointers-to-function types as
> well. There's no shortcut around it. It's like trying to learn calculus without internalizing
> trig identities.

Agree. I'm reminded of this:
https://codegolf.stackexchange.com/questions/8727/rpn-calculator-without-pointers

David Brown

unread,
May 23, 2018, 2:53:39 AM5/23/18
to
On 22/05/18 22:02, bartc wrote:
> On 22/05/2018 20:45, John Bode wrote:
>> On Tuesday, May 22, 2018 at 1:24:49 PM UTC-5, Bart wrote:
>
>>> Simple pointers don't need to involve the use of 'const',
>>
>> Sometimes they do.
>
> If you take a working program and remove all the consts, then probably
> it will continue to work. So they are an embellishment.
>

"const" is not a feature to let you do something you could not do
before. It is a feature to /stop/ you doing something you could do before.

For a language (or library, tool, whatever) to support writing correct
programs, there are two primary aims:

1. Make it as easy as possible to write correct code.

2. Make it as hard as possible to write incorrect code.



In comparison to many other languages, C is seen as weak on both these
points - it's a language that gives the developer a lot of freedom, and
demands a lot of responsibility. But these aims are not lost on C, and
each generation of the language since its inception has improved on the
points.

"const" is designed for point 2 here. Yes, you can write a program
without "const" anywhere, and you can remove "const" from any program
without changing its effect ("#define const" at the start of each C file
should even avoid library interfacing issues. Yes, I know this is
undefined behaviour, but I would expect it to work in most cases).

But avoiding "const" makes it easier to have bugs in the code, and it
makes it harder for the programmer and reader to reason about code.

As a side effect, "const" can also improve the efficiency of generated
code in many cases (especially for programmers who understand "static").

The concept of "const" is /so/ important and influential in helping
write clear and correct code that some modern programming languages make
everything "const" by default - you need to explicitly mark data as
"variable" or "mutable".

So a C programmer should learn to appreciate "const" and use it widely,
from their first program. Some use it everywhere they can, even for
small local variables, while others think that is a bit verbose.
Certainly you should always use it for pointers whenever you can.

bartc

unread,
May 23, 2018, 6:09:16 AM5/23/18
to
On 23/05/2018 07:53, David Brown wrote:
> On 22/05/18 22:02, bartc wrote:

>> If you take a working program and remove all the consts, then probably
>> it will continue to work. So they are an embellishment.

> For a language (or library, tool, whatever) to support writing correct
> programs, there are two primary aims:
>
> 1. Make it as easy as possible to write correct code.

> 2. Make it as hard as possible to write incorrect code.

I can't see how it makes it easier. First, you still have to design,
write, develop, and debug your application.

That's enough work by itself. But now you're saying that the extra
effort and headache of adding 'consts' throughout the code and getting
it to still compile (because const-checking issues will propagate all
over the program, even if no actual data structures are going to be
written to), all that is supposed to make it easier?

Some of us are experienced enough that we don't let data structures be
written to when they shouldn't be. (In any case many are too complex to
protect merely by adding 'const'; and if you add too many 'consts', then
you will be stuck trying to figure out how to modify them when they do
need to be updated or created).

> The concept of "const" is /so/ important and influential in helping
> write clear and correct code that some modern programming languages make
> everything "const" by default - you need to explicitly mark data as
> "variable" or "mutable".

Yes, some languages even go so far as to get rid of variables altogether
so that you're left scratching your head as to how to achieve the most
trivial operations.

While other modernish languages (Python for one) take the opposite route
where /everything/ can be modified [by rebinding names], even functions
and modules.

> So a C programmer should learn to appreciate "const" and use it widely,
> from their first program. Some use it everywhere they can, even for
> small local variables, while others think that is a bit verbose.
> Certainly you should always use it for pointers whenever you can.

C's const is a poor substitute for proper control over read-only data
structures. It's too fine-grained, it's easy to get it wrong, it's
possible to still have holes where data structures can be written to.

It sucks at emulating named constants. It gives a false sense of
security. It can be heavily over-used (see some of Stefan Ram's posted
code).

But most importantly it introduces so much clutter that readability can
be affected to the extent that genuine bugs may be harder to spot.

And, it has no place in beginners' code. Many might want to learn C
/because/ it can be dangerous to use!

--
bartc

Malcolm McLean

unread,
May 23, 2018, 6:59:34 AM5/23/18
to
On Tuesday, May 22, 2018 at 8:45:13 PM UTC+1, John Bode wrote:
>
> If you want to program in C, you *have* to understand pointer syntax.
> *Have* to. It's not optional.
>
When C was new it was not unheard of for programmers to be told, "You
can use C as long as you don't use pointers".

Ian Collins

unread,
May 23, 2018, 7:11:35 AM5/23/18
to
On 23/05/18 22:09, bartc wrote:
> On 23/05/2018 07:53, David Brown wrote:
>> On 22/05/18 22:02, bartc wrote:
>
>>> If you take a working program and remove all the consts, then probably
>>> it will continue to work. So they are an embellishment.
>
>> For a language (or library, tool, whatever) to support writing correct
>> programs, there are two primary aims:
>>
>> 1. Make it as easy as possible to write correct code.
>
>> 2. Make it as hard as possible to write incorrect code.
>
> I can't see how it makes it easier. First, you still have to design,
> write, develop, and debug your application.
>
> That's enough work by itself. But now you're saying that the extra
> effort and headache of adding 'consts' throughout the code and getting
> it to still compile...

Competent programmers (even though you might) don't go "adding 'consts'
throughout the code"; we add them where appropriate when writing the code.

--
Ian.

David Brown

unread,
May 23, 2018, 7:21:14 AM5/23/18
to
On 23/05/18 12:09, bartc wrote:
> On 23/05/2018 07:53, David Brown wrote:
>> On 22/05/18 22:02, bartc wrote:
>
>>> If you take a working program and remove all the consts, then probably
>>> it will continue to work. So they are an embellishment.
>
>> For a language (or library, tool, whatever) to support writing correct
>> programs, there are two primary aims:
>>
>> 1. Make it as easy as possible to write correct code.
>
>> 2. Make it as hard as possible to write incorrect code.
>
> I can't see how it makes it easier. First, you still have to design,
> write, develop, and debug your application.

You can't see how /what/ makes /what/ easier? I said "const" makes it
harder to write incorrect code - not that it makes it easier to write
correct code. (I think it does to some extent, by making your
intentions clearer, but the primary point as I see it is that it helps
make some types of bugs into compiler errors.)

Do you understand that my two points above are different?

>
> That's enough work by itself. But now you're saying that the extra
> effort and headache of adding 'consts' throughout the code and getting
> it to still compile (because const-checking issues will propagate all
> over the program, even if no actual data structures are going to be
> written to), all that is supposed to make it easier?

No - read what I wrote.

And if you think "const" is something you go through and add to the code
later, you have /totally/ missed the point.

When you need to define an object (other than dynamically allocated
objects), ask yourself if its value is going to vary or if it is going
to keep the same value throughout its lifetime (program lifetime for
file-scope objects, block lifetime for local objects). If it is going
to keep the same value, define it as a "const". (And if it is
file-scope, you usually also want it to be static, as for most
file-scope objects and functions.)

When you have a pointer to something - either as a variable, or a
parameter - ask yourself if you are going to use that pointer to change
the object(s) pointed at. If not, make it a pointer-to-const.

It is /that/ simple.

Using "const" does not make it easier to write the code in the first
place. But it makes it a very much harder to write code that breaks
your rules here - you can't change the value of objects whose values
should not be changed. You have to go out of your way in order to write
code that even attempts to do so.

(Let me note that I think compilers should be harsher about catching
such attempts, with more warnings or errors by default instead of
requiring extra flags. But that is a weakness in the implementations
here, not a failing of the language.)

>
> Some of us are experienced enough that we don't let data structures be
> written to when they shouldn't be. (In any case many are too complex to
> protect merely by adding 'const'; and if you add too many 'consts', then
> you will be stuck trying to figure out how to modify them when they do
> need to be updated or created).

Some of us are experienced enough to know we are not flawless, and like
to use the help we get. Using "const" does not guarantee bug-free code,
but it is certainly a useful step towards it. Arrogance about not
needing such features is a step backwards.

>
>> The concept of "const" is /so/ important and influential in helping
>> write clear and correct code that some modern programming languages make
>> everything "const" by default - you need to explicitly mark data as
>> "variable" or "mutable".
>
> Yes, some languages even go so far as to get rid of variables altogether
> so that you left scratching your head as to how to achieve the most
> trivial operations.

By that, I take it you mean /you/ personally can't get your head around
functional programming languages? Functional programming languages
require thinking in a somewhat different way, and usually appeal more to
mathematically minded people. They make some kinds of tasks far, far
easier than imperative languages (like C, and like your languages) - but
some tasks are definitely harder.

>
> While other modernish languages (Python for one) take the opposite route
> where /everything/ can be modified [by rebinding names], even functions
> and modules.

They do indeed. There is space for a wide range of ideas in programming
- there is no single perfect language for all purposes.

>
>> So a C programmer should learn to appreciate "const" and use it widely,
>> from their first program. Some use it everywhere they can, even for
>> small local variables, while others think that is a bit verbose.
>> Certainly you should always use it for pointers whenever you can.
>
> C's const is a poor substitute for proper control over read-only data
> structures. It's too fine-grained, it's easy to get it wrong, it's
> possible to still have holes where data structures can be written to.
>

It is not perfect, by any means - but it is still very useful. You do,
of course, have to know how to use it.

> It sucks at emulating named constants.

It does a reasonable job in some cases, and fails in others. As I have
said many times, C++ const is very much better here than C const.

> It gives a false sense of
> security.

If you go out of your way to write bad code, C lets you. But you have
to make a bit of an effort. And it helps if you have proper development
tools and know how to use them. (And let me say again that I'd rather
see more compiler complaints about mixups with const, enabled by default
rather than requiring specific flags.)

> It can be heavily over-used (see some of Stefan Ram's posted
> code).

I'd rather not look at Stefan's code, if it's all the same to you.
People have different styles in their coding - I think Stefan's is
extraordinarily unclear, idiosyncratic and overly complicated. He can
write it the way he wants, of course, but don't take it as an example of
how to use "const".

>
> But most importantly it introduces so much clutter that readability can
> be affected to the extent that genuine bugs may be harder to spot.

Total nonsense. "const" is short and to-the-point, and adds extra
information to the code.

>
> And, it has no place in beginners' code. Many might want to learn C
> /because/ it can be dangerous to use!
>

"const" is an absolute must for beginners. The same goes for a many
features of C that you are determined to hate, fear and avoid (like
"static", "extern", proper declarations, initialisations, small scopes,
pointers, arrays, standard types, multiple files, and probably many
other things).

Malcolm McLean

unread,
May 23, 2018, 7:43:08 AM5/23/18
to
There's such a thing as const poisoning.

const doesn't actually add much to most functions, for example declaring
printf() to take a const char * for the format parameter is unlikely
to reveal any bugs. The snag with const is that once it is introduced
in one place, it "const"s everything it touches. So you have to be
very rigorous in "const correctness", otherwise you get problems
later down the line.
Another answer, which is used in Baby X, is to almost completely exclude
const. One reason for this is that Baby X makes extensive use of
callbacks with context pointers, sometimes called "closures". The context
pointer has to be non-const void *, even though quite often it won't
actually be written to. You simply can't make code const correct in
such circumstances.


Ben Bacarisse

unread,
May 23, 2018, 9:22:26 AM5/23/18
to
Being "not unheard of" is a very low bar, but I never heard it. What,
when you heard it, did you take using pointers to mean? Unless it means
something like declaring your own pointers in your own code it would be
an absurd thing to say. You can't use C at all without using pointers
in the more general sense of the phrase.

--
Ben.

Scott Lurndal

unread,
May 23, 2018, 9:40:07 AM5/23/18
to
Actually, I was programming in C when it was new, and then, and in
the subsequent almost 40 years of C and C++ programming, I've never
heard anyone ever say "you can use C as long as you don't use pointers".

I see a lot of misguided C++ evangelists making such claims, but
never about C.

Malcolm McLean

unread,
May 23, 2018, 9:42:43 AM5/23/18
to
I heard it in the context of a blog-style (this was before the days of
blogs) complaint against unreasonable managers who made unreasonable
demands, one example of which was that C be written without pointers.

I'd guess that they'd have to make an exception for an array being passed
to a subroutine, but if you use [] syntax that's a hidden pointer.

I used Fortran 77 for my PhD work. It doesn't have pointers.

John Bode

unread,
May 23, 2018, 9:54:58 AM5/23/18
to
On Wednesday, May 23, 2018 at 5:09:16 AM UTC-5, Bart wrote:
> On 23/05/2018 07:53, David Brown wrote:
> > On 22/05/18 22:02, bartc wrote:
>
> >> If you take a working program and remove all the consts, then probably
> >> it will continue to work. So they are an embellishment.
>
> > For a language (or library, tool, whatever) to support writing correct
> > programs, there are two primary aims:
> >
> > 1. Make it as easy as possible to write correct code.
>
> > 2. Make it as hard as possible to write incorrect code.
>
> I can't see how it makes it easier. First, you still have to design,
> write, develop, and debug your application.
>

He didn't say that. You snipped this rather important point:

Keith Thompson

unread,
May 23, 2018, 12:18:59 PM5/23/18
to
Malcolm McLean <malcolm.ar...@gmail.com> writes:
[...]
> There's such a thing as const poisoning.
>
> const doesn't actually add much to most functions, for example declaring
> printf() to take a const char * for the format parameter is unlikely
> to reveal any bugs.
[...]

The "const" on printf's format parameter prevents (some) bugs in the
implementation of printf itself, not in calling code. It might not be
likely that you'd accidentally modify the format string, but it's worth
checking. A caller can pass either a "char*" argument or a "const
char*" argument.

If the "const" weren't there, you wouldn't be able to call printf with a
const char* argument.

(This would all be easier if "const" were the default, but it wasn't
practical to do that in a language based on pre-ANSI C, which didn't
have a way to mark things as read-only.)