
portable code


raffamaiden

Dec 16, 2010, 1:45:38 PM
Hi all. I'm writing a program which will write some variables to an
output file. I do something like

int a =5;
fwrite(&a, sizeof(int), 1, my_file_ptr);

This will write an int to the file pointed to by my_file_ptr. But I know
that the C standard does not specify the exact size in bytes of its
primitive types. As far as I know, it only specifies that an int is a
signed integer type, and different implementations/operating systems
can have different sizes for an int.

So this means that my program will write a 32-bit integer with one
implementation and a 16-bit integer with another. This would
also mean that a file generated by the program running on
one implementation will not be readable on another implementation,
unless the program also knows on which implementation the instance
that generated the file was running.
Is that right? I do not want such behavior. How can I solve this?
--
comp.lang.c.moderated - moderation address: cl...@plethora.net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.

Jasen Betts

Dec 24, 2010, 2:12:40 PM
On 2010-12-16, raffamaiden <raffa...@gmail.com> wrote:
> Hi all. I'm writing a program wich will write some variables to an
> output file. I do something like
>
> int a =5;
> fwrite(&a, sizeof(int), 1, my_file_ptr);
>
> This will write an int to the file pointed by my_file_ptr. But i know
> that the c standard does not specify the exact size in bytes for its
> primitive type, as far as i know it only specifies that and int is an
> integer type that rapresents a number with a sign, but different
> implementations\operating systems can have different size for an int.
>
> So this mean that my program will write a 32 bit integer with one
> implementation and a 16 bit with another implementation. This would
> also mean that the file generated by the program that is running in
> one implementation will not be readable in another implementation,
> unless the program knows also in which implementation the instance
> that generated the file was running.
> That is right? I do not want such a behavior. How can i solve this?

Read and write the file one byte at a time,

or use a textual representation:

fprintf(my_file_ptr,"%d ",a);

fscanf(read_file,"%d",&a);
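The fprintf()/fscanf() pair above can be wrapped with error checks. This is only a minimal sketch: the helper names save_int_text() and load_int_text() are made up for illustration, and it assumes the value read back fits in the reader's int (a number written on a 32-bit-int system may not fit a 16-bit int).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one int as decimal text followed by a newline.
   Returns 0 on success, -1 on an output error. */
int save_int_text(FILE *fp, int value)
{
    assert(fp != NULL);
    return fprintf(fp, "%d\n", value) < 0 ? -1 : 0;
}

/* Hypothetical helper: read one int previously written by save_int_text().
   Returns 0 on success, -1 if no integer could be read (bad input or EOF). */
int load_int_text(FILE *fp, int *value)
{
    assert(fp != NULL && value != NULL);
    return fscanf(fp, "%d", value) == 1 ? 0 : -1;
}
```

Because the file holds digits rather than raw memory, neither the size of int nor the byte order of the writing machine leaks into the format.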

--
⚂⚃ 100% natural

Manohar

Dec 24, 2010, 2:15:52 PM
Hi raffamaiden,

I can think of two approaches:
1. Use either short or long. They will be 16 or 32 bits respectively
on any system (not sure, need confirmation ;-)).
2. Convert it to a string and store that. By doing this, the file will be
readable when opened as a text file. But if you are keen on performance,
this is not a good approach.

Regards,
Manohar


Dag-Erling Smørgrav

Dec 24, 2010, 2:16:43 PM
raffamaiden <raffa...@gmail.com> writes:
> That is right? I do not want such a behavior. How can i solve this?

Store your data in text form rather than binary. It takes more work to
parse, but it's far more portable and robust, and allows you to inspect
and even modify the result with a regular text editor.
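As a rough illustration of the extra parsing work, here is one possible reader for a one-number-per-line text file. The function name read_int_line() and the 64-byte line buffer are assumptions made for this sketch, not anything from the thread; it uses strtol() so that malformed or out-of-range input is rejected instead of silently accepted.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse one line of text as a decimal int.
   Returns 0 on success, -1 on EOF, malformed input, or a value that does
   not fit in int on this platform. */
int read_int_line(FILE *fp, int *out)
{
    char line[64];
    char *end;
    long v;

    assert(fp != NULL && out != NULL);
    if (fgets(line, sizeof line, fp) == NULL)
        return -1;
    errno = 0;
    v = strtol(line, &end, 10);
    if (end == line || errno == ERANGE)   /* no digits, or overflowed long */
        return -1;
    if (*end != '\n' && *end != '\0')     /* trailing junk after the number */
        return -1;
    if (v < INT_MIN || v > INT_MAX)       /* too big for this platform's int */
        return -1;
    *out = (int)v;
    return 0;
}
```

The range check is where a text format shows its robustness: a reader with a smaller int detects the problem instead of misinterpreting the bytes.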

DES
--
Dag-Erling Smørgrav - d...@des.no

Hans-Bernhard Bröker

Dec 24, 2010, 2:10:39 PM
On 16.12.2010 19:45, raffamaiden wrote:
> Hi all. I'm writing a program wich will write some variables to an
> output file. I do something like
>
> int a =5;
> fwrite(&a, sizeof(int), 1, my_file_ptr);

Don't do that.

> also mean that the file generated by the program that is running in
> one implementation will not be readable in another implementation,
> unless the program knows also in which implementation the instance
> that generated the file was running.

Yes --- which is exactly why you shouldn't do that.

> That is right? I do not want such a behavior. How can i solve this?

By far the easiest solution is human-readable text output. In a
nutshell: use fprintf() instead of fwrite().

It can be done with fwrite(), but it takes quite a lot of extra work.
For starters, you won't be writing anything bigger than one byte
directly. You have to break it down to individual bytes yourself.
Otherwise you have no control over the relation between numerical values
and their representation in the file.
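Breaking a value down into bytes might look like the sketch below. The names put_u32_be() and write_u32_be() are invented for illustration, the most-significant-byte-first order is an arbitrary choice for this example, and the code assumes the file is a sequence of 8-bit octets. unsigned long is used because the standard guarantees it at least 32 bits.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: store the low 32 bits of v in buf, most significant
   byte first. The layout is fixed by these shifts, not by the host CPU. */
void put_u32_be(unsigned char buf[4], unsigned long v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: write v to fp as exactly four bytes.
   Returns 0 on success, -1 if fwrite() wrote fewer than four bytes. */
int write_u32_be(FILE *fp, unsigned long v)
{
    unsigned char buf[4];

    assert(fp != NULL);
    put_u32_be(buf, v);
    return fwrite(buf, 1, sizeof buf, fp) == sizeof buf ? 0 : -1;
}
```

Only arrays of unsigned char ever reach fwrite(), so the relation between the number and the bytes in the file is the same on every implementation.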

Keith Thompson

Dec 24, 2010, 2:12:09 PM
raffamaiden <raffa...@gmail.com> writes:
> Hi all. I'm writing a program wich will write some variables to an
> output file. I do something like
>
> int a =5;
> fwrite(&a, sizeof(int), 1, my_file_ptr);
>
> This will write an int to the file pointed by my_file_ptr. But i know
> that the c standard does not specify the exact size in bytes for its
> primitive type, as far as i know it only specifies that and int is an
> integer type that rapresents a number with a sign, but different
> implementations\operating systems can have different size for an int.
>
> So this mean that my program will write a 32 bit integer with one
> implementation and a 16 bit with another implementation. This would
> also mean that the file generated by the program that is running in
> one implementation will not be readable in another implementation,
> unless the program knows also in which implementation the instance
> that generated the file was running.
> That is right?

Yes. In addition to size, you have to worry about byte ordering and the
method used to represent signed integers (2's-complement is nearly
universal these days, but other representations are possible and have been
used).
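One way to keep the host's signed representation out of the file is to define the file format as 32-bit 2's-complement and convert explicitly. The following is a sketch under that assumption; encode_s32() and decode_s32() are hypothetical names, and the encoder deliberately accepts only the range that every conforming long can hold.

```c
#include <assert.h>

/* Map a signed value in [-2147483647, 2147483647] to a 32-bit
   2's-complement bit pattern. Conversion to an unsigned type is defined
   as reduction modulo 2^N, whatever the host's signed format is. */
unsigned long encode_s32(long v)
{
    assert(v >= -2147483647L && v <= 2147483647L);
    return (unsigned long)v & 0xFFFFFFFFUL;
}

/* Inverse of encode_s32(). It avoids converting an out-of-range unsigned
   value to a signed type, which would be implementation-defined. */
long decode_s32(unsigned long bits)
{
    bits &= 0xFFFFFFFFUL;
    if (bits <= 0x7FFFFFFFUL)
        return (long)bits;
    /* Negative case: bits holds v + 2^32, so v = -(0xFFFFFFFF - bits) - 1. */
    return -(long)(0xFFFFFFFFUL - bits) - 1L;
}
```

The resulting unsigned pattern can then be split into bytes in whatever order the file format specifies.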

> I do not want such a behavior. How can i solve this?

Start by deciding exactly what behavior you *do* want.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Francis Glassborow

Jan 2, 2011, 3:35:23 PM
On 24/12/2010 19:15, Manohar wrote:
> Hi raffamaiden,
>
> I can think of two approaches:
> 1. Use either short or long. They will be 16 or 32 bits respectively
> on any(not sure, need confirmation ;-)) system.

The C Standard does not require this. Effectively short must be at least
16 bits (and there are systems where it is 32, and I think there was a
system where it was 24, two 12-bit words). Long must be at least 32-bits
and there were certainly systems where it was 64-bits.
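A program can ask the implementation what it actually provides instead of assuming. The sketch below uses only <limits.h>; the function names are made up for illustration, and fits_s16() assumes the file field in question is a 16-bit 2's-complement value.

```c
#include <assert.h>
#include <limits.h>

/* Bits in the object representation of each type on this implementation.
   The standard guarantees only minimum ranges, so these results differ
   between platforms. */
unsigned bits_of_short(void) { return (unsigned)(sizeof(short) * CHAR_BIT); }
unsigned bits_of_int(void)   { return (unsigned)(sizeof(int) * CHAR_BIT); }
unsigned bits_of_long(void)  { return (unsigned)(sizeof(long) * CHAR_BIT); }

/* Hypothetical guard: returns 1 if v fits a 16-bit 2's-complement file
   field, so a writer can refuse values a 16-bit reader could not hold. */
int fits_s16(long v)
{
    assert(SHRT_MAX >= 32767);   /* the guaranteed minimum for short */
    return v >= -32768L && v <= 32767L;
}
```

Checking the value against the field width, rather than relying on the width of short or long, keeps the file format independent of the compiler.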

> 2. Convert it to string and store. By doing this, the file will be
> readable when opened as text file. But if you are keen on performance
> then this is not good approach.

Not likely to be very significant. In general, only use raw data in
temporary files (ones written and later read by the same process). There
are exceptions, but fewer than the times when it is abused by people who
are fanatical about performance and then find that they have to spend
hundreds of hours porting code because the data files are not portable.


--
Note that robinton.demon.co.uk addresses are no longer valid.

Barry Schwarz

Jan 2, 2011, 3:35:53 PM
On Fri, 24 Dec 2010 13:15:52 -0600 (CST), Manohar <sman...@gmail.com>
wrote:

>Hi raffamaiden,
>
>I can think of two approaches:
>1. Use either short or long. They will be 16 or 32 bits respectively
>on any(not sure, need confirmation ;-)) system.

Not necessarily. A short must be at least 16 bits and a long must be
at least 32 bits but either or both can be more. And this does
nothing to solve the problem of reading the data on another system
with a different representation.

>2. Convert it to string and store. By doing this, the file will be
>readable when opened as text file. But if you are keen on performance
>then this is not good approach.

While obviously consuming more resources than simply reading or
writing the binary values, the cost of converting to text should be
relatively small compared to the overall cost of doing the I/O.


--
Remove del for email

Jeremy Hall

Jan 20, 2011, 12:59:39 PM
By the way, C99 does specify the sizes of basic types.
#include <stdint.h> (or <inttypes.h>)
then use int32_t or int16_t or uint64_t etc as desired.
GCC and others support the C99 standard, sadly Microsoft Visual C does
not.

Jeremy

If you are reading and writing the int

Hans-Bernhard Bröker

Jan 24, 2011, 4:46:06 PM
On 20.01.2011 18:59, Jeremy Hall wrote:
> By the way, C99 does specify the sizes of basic types.

No, it doesn't. It offers more basic types than C90 did, and some of
the new ones that C99 defined have specified sizes, or specified minimal
sizes. But that doesn't mean the existing basic types like int, long
etc. suddenly had their sizes specified.

Keith Thompson

Jan 24, 2011, 4:46:37 PM
Jeremy Hall <gcc....@gmail.com> writes:
> By the way, C99 does specify the sizes of basic types.
> #include <stdint.h> (or <inttypes.h>)
> then use int32_t or int16_t or uint64_t etc as desired.
> GCC and others support the C99 standard, sadly Microsoft Visual C does
> not.

But it doesn't require int32_t to exist. It's defined only if there's a
predefined type that meets its requirements (2's-complement, no padding
bits, exactly 32 bits).

On most modern systems it will exist -- and if you happen to encounter a
system where it doesn't, you'll find out about it very quickly when your
code fails to compile. (As opposed to code that just assumes int is
32 bits, which can fail quietly when that assumption is violated.)

If you don't actually need *exactly* 32 bits, you can use int_least32_t
or int_fast32_t, which are required to exist (they can be bigger than 32
bits).

Similar considerations apply to the uint*_t types, though of course
the 2's-complement requirement doesn't apply.
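A sketch of how these points might be used in practice, assuming a C99 <stdint.h>: the type name sample_t and both function names are hypothetical. The presence of int32_t is detected through its limit macro INT32_MAX, which is defined only when the type exists.

```c
#include <assert.h>
#include <stdint.h>

/* int_least32_t always exists in C99, although it may be wider than
   32 bits. sample_t is a hypothetical application type. */
typedef int_least32_t sample_t;

/* Returns 1 if this implementation provides the exact-width int32_t. */
int have_exact_int32(void)
{
#ifdef INT32_MAX
    return 1;
#else
    return 0;
#endif
}

/* Reduce v to a 32-bit 2's-complement pattern for a 4-octet file field.
   Precondition: v lies in the 32-bit range even if sample_t is wider. */
uint_least32_t to_wire32(sample_t v)
{
    assert(v >= -INT32_C(2147483647) - 1 && v <= INT32_C(2147483647));
    return (uint_least32_t)v & UINT32_C(0xFFFFFFFF);
}
```

Code written this way still compiles on a system without int32_t, at the cost of an explicit range check where the exact-width type would have enforced the size.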

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

rang...@gmail.com

Apr 4, 2012, 7:05:26 PM
You can say

write((i >> 24) & 0xFF)
write((i >> 16) & 0xFF)
write((i >> 8) & 0xFF)
write(i & 0xFF)

This is because the integer of 32 bits you want has 4 bytes. so when you read it back simply...

int out = 0;
out += (read() >> 24);
out += (read() >> 16);
out += (read() >> 8);
out += (read());

This will always work regardless of the endianness or integer representation of your system, which means the code will be more portable. Also, you never have to worry about whether the compiler provides int32_t and so on, because you control the 4 bytes yourself. As a time-saving measure, do the work in a buffer of unsigned char first and then write the whole buffer to the file in one call.

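If you have just 2 bytes, the same approach applies. It also works with 8 bytes and 64-bit values such as uint64_t. The 4-byte example is for unsigned int, which holds 4 bytes on most current systems (the standard itself guarantees only 16 bits for unsigned int; unsigned long is guaranteed at least 32).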

Dag-Erling Smørgrav

Apr 23, 2012, 9:31:46 AM
rang...@gmail.com writes:
> You can say
>
> write((i >> 24) & 0xFF)
> write((i >> 16) & 0xFF)
> write((i >> 8) & 0xFF)
> write(i & 0xFF)

There is no write() in C, but there is one in POSIX that doesn't match
this usage:

ssize_t write(int fd, const void *buf, size_t len);

Let's assume for now that this is in fact a private function based on
a global FILE *f:

void write(unsigned char i)
{
    if (fwrite(&i, 1, 1, f) != 1) {
        fprintf(stderr, "fwrite() failed\n");
        exit(EXIT_FAILURE);
    }
}

(actually, we should assume that this function is called funwrite() and
that write() is the following macro:

#define write(i) funwrite(i);

but I'm being charitable)

> int out = 0;
> out += (read() >> 24);
> out += (read() >> 16);
> out += (read() >> 8);
> out += (read());

There is no read() in C, but there is one in POSIX that doesn't match
this usage:

ssize_t read(int fd, void *buf, size_t len);

Let's assume for now that this is in fact a private function based on a
global FILE *f:

unsigned char read(void)
{
    unsigned char i;
    if (fread(&i, 1, 1, f) != 1) {
        fprintf(stderr, "fread() failed\n");
        exit(EXIT_FAILURE);
    }
    return i;
}

then the following:

unsigned int i = 0x01020304;
write((i >> 24) & 0xFF);
write((i >> 16) & 0xFF);
write((i >> 8) & 0xFF);
write(i & 0xFF);
rewind(f);
i = 0;
i += (read() >> 24);
i += (read() >> 16);
i += (read() >> 8);
i += (read());
printf("i = 0x%x\n", i);

will print

i = 0x4

The correct answer is "store your data in a textual representation".

DES
--
Dag-Erling Smørgrav - d...@des.no

Bharat

Apr 23, 2012, 9:32:31 AM
See
http://code.google.com/p/protobuf-c/
http://en.wikipedia.org/wiki/Type-length-value
This gives you portability at the cost of a slight CPU overhead.
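To give an idea of what a type-length-value record involves, here is a minimal encoder. The layout (1-byte type, 2-byte big-endian length, then the payload) and the name tlv_encode() are assumptions chosen for this sketch; they are not the protobuf wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical TLV layout: 1-byte type, 2-byte big-endian length, payload.
   Writes the record into out (capacity cap) and returns its total size,
   or 0 if the arguments do not fit the layout or the buffer. */
size_t tlv_encode(unsigned char *out, size_t cap, unsigned type,
                  const unsigned char *val, size_t len)
{
    assert(out != NULL && (val != NULL || len == 0));
    if (type > 0xFF || len > 0xFFFF)      /* fields too small for these */
        return 0;
    if (cap < 3 || cap - 3 < len)         /* header plus payload must fit */
        return 0;
    out[0] = (unsigned char)type;
    out[1] = (unsigned char)((len >> 8) & 0xFF);
    out[2] = (unsigned char)(len & 0xFF);
    if (len > 0)
        memcpy(out + 3, val, len);
    return len + 3;
}
```

A reader that does not recognize a type can use the length to skip the record, which is what lets such formats evolve without breaking old programs.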

echo ma

Apr 23, 2012, 9:32:46 AM
You can use int32_t from <stdint.h>, which is part of C99. Use int32_t both in the source code and in the file format documentation, so other developers know the size of your integers.
Another thing to be aware of is byte order. Intel x86 and IA-64 are usually little-endian, while IBM PowerPC and SPARC are usually big-endian, so the memory layout is completely different when you assign 5 to an integer variable on these CPUs. You should use a uniform byte order that is independent of the CPU, for example by using htons() and htonl(). Or you can use a text representation, which is more portable and easier for humans to read.

Sorry for my poor English. I hope this helps.
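A sketch of the htonl() approach follows. Note that htonl() and ntohl() come from POSIX (<arpa/inet.h>), not from ISO C, and that uint32_t is optional in C99, so this example assumes a POSIX system that provides both. The function names write_u32_net() and read_u32_net() are made up for illustration.

```c
#include <arpa/inet.h>   /* htonl(), ntohl(): POSIX, not ISO C */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write a 32-bit value in network (big-endian) order.
   Returns 0 on success, -1 on a write error. */
int write_u32_net(FILE *fp, uint32_t host_value)
{
    uint32_t wire = htonl(host_value);

    assert(fp != NULL);
    return fwrite(&wire, sizeof wire, 1, fp) == 1 ? 0 : -1;
}

/* Hypothetical helper: read a value written by write_u32_net().
   Returns 0 on success, -1 on a short read. */
int read_u32_net(FILE *fp, uint32_t *host_value)
{
    uint32_t wire;

    assert(fp != NULL && host_value != NULL);
    if (fread(&wire, sizeof wire, 1, fp) != 1)
        return -1;
    *host_value = ntohl(wire);
    return 0;
}
```

On a big-endian host htonl() does nothing, and on a little-endian host it swaps the bytes, so the file contents are the same either way.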

Jorgen Grahn

Apr 30, 2012, 10:59:31 PM
On Mon, 2012-04-23, Dag-Erling Smørgrav wrote:
> rang...@gmail.com writes:
>> You can say
>>
>> write((i >> 24) & 0xFF)
>> write((i >> 16) & 0xFF)
>> write((i >> 8) & 0xFF)
>> write(i & 0xFF)
>
> There is no write() in C, but there is one in POSIX that doesn't match
> this usage:

I am pretty sure rangsynth just used "write" as a generic "write an
octet to somewhere" function. The details are irrelevant here.

[snip irrelevant]

> but I'm being charitable)
>
>> int out = 0;
>> out += (read() >> 24);
>> out += (read() >> 16);
>> out += (read() >> 8);
>> out += (read());
>
> There is no read() in C, but there is one in POSIX that doesn't match
> this usage:

[snip irrelevant]

> then the following:
>
> unsigned int i = 0x01020304;
> write((i >> 24) & 0xFF);
> write((i >> 16) & 0xFF);
> write((i >> 8) & 0xFF);
> write(i & 0xFF);
> rewind(f);
> i = 0;
> i += (read() >> 24);
> i += (read() >> 16);
> i += (read() >> 8);
> i += (read());
> printf("i = 0x%x\n", i);
>
> will print
>
> i = 0x4

If it does, it's because you introduced a bug somewhere. Why? I feel
I'm missing some point here; can you please explain it more clearly?

> The correct answer is "store your data in a textual representation".

I prefer that too, but you don't always get to choose your requirements.
Especially not file formats and protocols!

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Bharat

Apr 30, 2012, 11:00:01 PM
On Friday, 17 December 2010 00:15:38 UTC+5:30, raffamaiden wrote:
See usage of TLV
http://en.wikipedia.org/wiki/Type-length-value

See http://code.google.com/p/protobuf/

Dag-Erling Smørgrav

May 4, 2012, 6:29:59 PM
Jorgen Grahn <grahn...@snipabacken.se> writes:
> Dag-Erling Smørgrav <d...@des.no> writes:
> > There is no write() in C, but there is one in POSIX that doesn't match
> > this usage:
> I am pretty sure rangsynth just used "write" as a generic "write an
> octet to somewhere" function. The details are irrelevant here.

The details are not irrelevant. When you post code that operates on the
value returned by a function, the semantics of that function and the
range and representation of its return value are highly relevant.

> > then the following:
> >
> > unsigned int i = 0x01020304;
> > write((i >> 24) & 0xFF);
> > write((i >> 16) & 0xFF);
> > write((i >> 8) & 0xFF);
> > write(i & 0xFF);
> > rewind(f);
> > i = 0;
> > i += (read() >> 24);
> > i += (read() >> 16);
> > i += (read() >> 8);
> > i += (read());
> > printf("i = 0x%x\n", i);
> >
> > will print
> >
> > i = 0x4
>
> If it does, it's because you introduced a bug somewhere. Why? I feel
> I'm missing some point here; can you please explain it more clearly?

I didn't introduce a bug anywhere. On the contrary - I was pointing out
a bug in rangsynth's code. I'm sure you'll see it too if you look
closely.

DES
--
Dag-Erling Smørgrav - d...@des.no

Jiří Zárevúcky

May 4, 2012, 6:30:14 PM
On Monday, 23 April 2012 15:31:46 UTC+2, Dag-Erling Smørgrav wrote:
> then the following:
>
> unsigned int i = 0x01020304;
> write((i >> 24) & 0xFF);
> write((i >> 16) & 0xFF);
> write((i >> 8) & 0xFF);
> write(i & 0xFF);
> rewind(f);
> i = 0;
> i += (read() >> 24);
> i += (read() >> 16);
> i += (read() >> 8);
> i += (read());
> printf("i = 0x%x\n", i);
>
> will print
>
> i = 0x4

The snippet has the bit shifts for the reads in the wrong direction.
Correct would be:

i += (read() << 24);
i += (read() << 16);
i += (read() << 8);
i += (read());

I suspect it's a simple copy-paste error of the original author.
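One further detail applies to the left shifts: if read() returns unsigned char, the value is promoted to int before shifting, so shifting a byte of 0x80 or more left by 24 can overflow a 32-bit int (and any shift by 16 or more is invalid where int has 16 bits). A sketch that widens each byte first, with the hypothetical name read_u32_be() and a most-significant-byte-first layout matching the writes in the thread:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read four bytes, most significant first, and
   rebuild the value. Returns 0 on success, -1 on a short read. */
int read_u32_be(FILE *fp, unsigned long *out)
{
    unsigned char buf[4];

    assert(fp != NULL && out != NULL);
    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;
    /* Convert each byte to unsigned long BEFORE shifting, so the shift is
       done in a type with at least 32 value bits and no sign bit. */
    *out = ((unsigned long)buf[0] << 24)
         | ((unsigned long)buf[1] << 16)
         | ((unsigned long)buf[2] << 8)
         |  (unsigned long)buf[3];
    return 0;
}
```

With the casts in place the result depends only on the four bytes in the file, not on the width of int on the reading machine.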

Jorgen Grahn

May 9, 2012, 2:06:07 AM
On Fri, 2012-05-04, Dag-Erling Smørgrav wrote:
> Jorgen Grahn <grahn...@snipabacken.se> writes:
>> Dag-Erling Smørgrav <d...@des.no> writes:
>> > There is no write() in C, but there is one in POSIX that doesn't match
>> > this usage:
>> I am pretty sure rangsynth just used "write" as a generic "write an
>> octet to somewhere" function. The details are irrelevant here.
>
> The details are not irrelevant. When you post code that operates on the
> value returned by a function, the semantics of that function and the
> range and representation of its return value are highly relevant.

I agree that the types are important in this case, e.g. that you
shift and mask unsigned values.

>> > then the following:
>> >
>> > unsigned int i = 0x01020304;
>> > write((i >> 24) & 0xFF);
>> > write((i >> 16) & 0xFF);
>> > write((i >> 8) & 0xFF);
>> > write(i & 0xFF);
>> > rewind(f);
>> > i = 0;
>> > i += (read() >> 24);
>> > i += (read() >> 16);
>> > i += (read() >> 8);
>> > i += (read());
>> > printf("i = 0x%x\n", i);
>> >
>> > will print
>> >
>> > i = 0x4
>>
>> If it does, it's because you introduced a bug somewhere. Why? I feel
>> I'm missing some point here; can you please explain it more clearly?
>
> I didn't introduce a bug anywhere. On the contrary - I was pointing out
> a bug in rangsynth's code. I'm sure you'll see it too if you look
> closely.

There are less indirect ways of saying "there's a bug in that code" ...

Combined with your advice to print the number as text instead
(snipped), I didn't understand what point you were trying to make.

For a while it seemed you were saying "this is too hard to implement,
so you should switch to another output format". I tend to dislike
binary data formats too. They have many problems, but this (portably
reading/writing unsigned integers) isn't one of them.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .