Writing Bit Fields to a file

john

Nov 26, 2009, 4:55:42 PM
Hello friends,

I'm doing i/o with the standard C functions, fopen, fwrite etc.

I want to write a file that is not an exact number of bytes, but finishes
with an incomplete byte (only 1 nybble).

I don't want to pad out the rest of the byte with zero bits, but just to
finish the file with only 1 nybble in the last byte.

I had hoped to do this with fwrite and a 4 bit bitfield integer, but this
always writes complete bytes.

Can anyone help?

Thx

john

Nov 26, 2009, 5:38:23 PM
Thanks for your interest.

I'm mainly working on 64-bit Windows but I'd really like a portable
solution that will work on any o/s, hence wishing to stay in standard C.


Stefan Ram wrote:


> john <jo...@nospam.com> writes:
>>I want to write a file that is not an exact number of bytes, but
>>finishes with an incomplete byte (only 1 nybble).
>

> To do this, you need a file system (operating system) that supports
> such files, and then you might be able to call the corresponding
> system operations from a C implementation.
>
> But to learn the details, you would need to ask in the
> newsgroup for this operating system.
>
> Or, you might develop or use a custom file format that contains the
> length in nybbles or bits as an additional datum.

Flash Gordon

Nov 26, 2009, 5:41:26 PM
Stefan Ram wrote:
> john <jo...@nospam.com> writes:
>> I want to write a file that is not an exact number of bytes
>
> "A binary stream is an ordered sequence of characters"
>
> ISO/IEC 9899:1999 (E), 7.19.2, #3

You forgot to include the quote defining characters.

In any case, most file systems and operating systems don't have the
concept of files which are not whole numbers of bytes long. I've
certainly never come across or heard of such a system. I suppose with a
punch card you could use scissors to remove part of one of the columns,
but I doubt that the card readers were designed to cope with such a
mutilated card.
--
Flash Gordon

bartc

Nov 26, 2009, 6:09:25 PM

"john" <jo...@nospam.com> wrote in message news:hemtgu$n03$1...@aioe.org...

> Hello friends,
>
> I'm doing i/o with the standard C functions, fopen, fwrite etc.
>
> I want to write a file that is not an exact number of bytes, but finishes
> with an incomplete byte (only 1 nybble).
>
> I had hoped to do this with fwrite and a 4 bit bitfield integer, but this
> always writes complete bytes.
>
> Can anyone help?

If I write a zero-byte file to my Windows OS, it is reported as having size
zero.

If I write a one-byte file, it will have size 1 byte.

What would you expect the reported size to be, if you managed to write a
one-bit file?

Presumably you want to superimpose a bit-oriented file-system on top of the
(likely) byte-oriented one provided by the OS. You'll have your work cut out
then.

> I don't want to pad out the rest of the byte with zero bits, but just to
> finish the file with only 1 nybble in the last byte.

Since the padding bits are outside the file itself, it doesn't matter what
they are filled with. And since the hard drive might (for example) only
write out sectors of 512 bytes at a time, those last 4 bits after writing
511 and a half bytes have to be set to something which is either zero or
one, as getting the read/write head to generate anything else (or have the
current turned off at that instant) is not really practical.

--
Bartc

jacob navia

Nov 26, 2009, 6:12:33 PM
john wrote:

Hi John

Today, I bought a hard disk of 1.5 terabytes for... 89 euros.

TERABYTES, i.e. 1500 GB (not 1024+512, just 1000+500).

And now you want to write half a byte to the disk and still
be able to avoid writing those 4 bits?

4 bits?

I do not know how many bits that disk can hold but I doubt that
you will ever find an OS that writes files at the bit level.

I wonder WHY you want to avoid writing those 4 bits?

But anyway, there is a solution and it is perfectly portable everywhere:

Write each bit in a byte. All you write to disk are bytes that are either 1 or 0.
This will take eight times the space it normally would, but you will be
able to write at the bit level and NOT a SINGLE bit more!
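
A minimal sketch of this one-bit-per-byte idea in standard C, for illustration only. It assumes 8-bit bytes and bits packed most significant bit first; the function names write_bits_unpacked and read_bits_unpacked are made up for this example and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes nbits bits from a packed buffer (most significant bit first),
   one bit per output byte, so every byte in the file is 0 or 1.
   Returns the number of bits actually written. */
size_t write_bits_unpacked(FILE *fp, const unsigned char *packed, size_t nbits)
{
    size_t i;
    assert(fp != NULL && packed != NULL);
    for (i = 0; i < nbits; i++) {
        int bit = (packed[i / 8] >> (7 - i % 8)) & 1;
        if (fputc(bit, fp) == EOF)
            break;
    }
    return i;
}

/* Reads back up to maxbits bits and packs them MSB first into out.
   Returns the number of bits read; the file length in bytes is the
   bit count, so no separate length field is needed. */
size_t read_bits_unpacked(FILE *fp, unsigned char *out, size_t maxbits)
{
    size_t i;
    int c;
    assert(fp != NULL && out != NULL);
    for (i = 0; i < maxbits && (c = fgetc(fp)) != EOF; i++) {
        if (i % 8 == 0)
            out[i / 8] = 0;           /* start each packed byte from zero */
        out[i / 8] |= (unsigned char)((c & 1) << (7 - i % 8));
    }
    return i;
}
```

With this layout a 12-bit value occupies 12 bytes on disk, and the size reported by the OS is exactly the number of bits stored.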

Keith Thompson

Nov 26, 2009, 6:29:15 PM

As others have said, C doesn't support partial byte reads and writes
to files, and most operating systems don't either. Every file is
a whole number of bytes.

It's possible that some OS might support such files, but I've never
heard of one that does.

It's (probably) perfectly reasonable to want to store a bit array
of some arbitrary length, but the OS won't do it for you.

In a typical implementation, physical disk files occupy some whole
number of blocks (the size of a block varies but is typically some
small power of 2), and the size in bytes is maintained by the file
system as metadata (information stored outside the file itself,
such as name, modification time, permissions, and so forth).
If the OS won't let you store a size in bits, you'll just have to
do it yourself. Define a file format in which the size in bits
(or perhaps just the number of used bits in the last byte) is
stored in the file itself. All software that reads or writes the
file then needs to understand the format.
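
One possible shape for such a format, as a rough sketch rather than a recommendation: an 8-octet big-endian bit count followed by the bits packed MSB first, with the unused low bits of the last byte written as zero. The layout and the names bitfile_write and bitfile_read are assumptions made for this example; 8-bit bytes are assumed as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout: 8-octet big-endian bit count, then the payload
   packed MSB first. Returns 0 on success, -1 on I/O error. */
int bitfile_write(FILE *fp, const unsigned char *data, unsigned long long nbits)
{
    size_t nbytes = (size_t)(nbits / 8 + (nbits % 8 != 0));
    unsigned used = (unsigned)(nbits % 8);   /* bits used in the last byte */
    unsigned char last;
    int i;
    assert(fp != NULL && (data != NULL || nbits == 0));
    for (i = 7; i >= 0; i--)                 /* header, big-endian */
        if (fputc((int)((nbits >> (8 * i)) & 0xFF), fp) == EOF)
            return -1;
    if (nbytes == 0)
        return 0;
    if (nbytes > 1 && fwrite(data, 1, nbytes - 1, fp) != nbytes - 1)
        return -1;
    /* Zero the padding bits so the file contents are deterministic. */
    last = data[nbytes - 1];
    if (used != 0)
        last &= (unsigned char)(0xFF << (8 - used));
    return fputc(last, fp) == EOF ? -1 : 0;
}

/* Reads the header and the payload into buf (capacity cap bytes).
   Returns 0 and stores the bit count in *nbits on success; returns -1
   on a truncated file or a payload that does not fit in buf. */
int bitfile_read(FILE *fp, unsigned char *buf, size_t cap,
                 unsigned long long *nbits)
{
    unsigned long long n = 0, need;
    int i, c;
    assert(fp != NULL && buf != NULL && nbits != NULL);
    for (i = 0; i < 8; i++) {
        if ((c = fgetc(fp)) == EOF)
            return -1;
        n = (n << 8) | (unsigned)c;
    }
    need = n / 8 + (n % 8 != 0);
    if (need > cap)
        return -1;
    if (fread(buf, 1, (size_t)need, fp) != (size_t)need)
        return -1;
    *nbits = n;
    return 0;
}
```

Storing the full bit count costs a few more bytes than storing only the number of used bits in the last byte, but it lets a reader learn the length before reading the payload.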

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Flash Gordon

Nov 26, 2009, 6:48:48 PM
john wrote:
>
>
> Stefan Ram wrote:
>> john <jo...@nospam.com> writes:
>>> I want to write a file that is not an exact number of bytes, but
>>> finishes with an incomplete byte (only 1 nybble).
>> To do this, you need a file system (operating system) that supports
>> such files, and then you might be able to call the corresponding
>> system operations from a C implementation.
>>
>> But to learn the details, you would need to ask in the
>> newsgroup for this operating system.
>>
>> Or, you might develop or use a custom file format that contains the
>> length in nybbles or bits as an additional datum.

> Thanks for your interest.
>
> I'm mainly working on 64-bit Windows but I'd really like a portable
> solution that will work on any o/s, hence wishing to stay in standard
> C.

Please don't top post. Your response should be under the text you are
replying to, so when reading it is possible to read from the top down
rather than having to start part way down, read down, then go back up to
the top.

As to your problem, no version of Windows or DOS that I'm aware of
supports files ending with an incomplete byte. So your problem as stated
cannot be solved either in standard C or with implementation specific
extensions. So as far as I can see your only option is the last one
Stefan suggested, namely putting length information in the file and not
worrying about making the actual file really end with half a byte.

Of course, it is entirely possible you are trying to solve the wrong
problem. Depending on what your real problem is there might be other
possible solutions.
--
Flash Gordon

robert...@yahoo.com

Nov 27, 2009, 2:35:07 AM
On Nov 26, 5:12 pm, jacob navia <ja...@nospam.org> wrote:
> john wrote:
>
> > Hello friends,
>
> > I'm doing i/o with the standard C functions, fopen, fwrite etc.
>
> > I want to write a file that is not an exact number of bytes, but finishes
> > with an incomplete byte (only 1 nybble).
>
> > I don't want to pad out the rest of the byte with zero bits, but just to
> > finish the file with only 1 nybble in the last byte.
>
> > I had hoped to do this with fwrite and a 4 bit bitfield integer, but this
> > always writes complete bytes.
>
> > Can anyone help?
>
> > Thx
>
> Hi John
>
> Today, I bought a hard disk of 1.5 terabytes for... 89 euros.
>
> TERABYTES, i.e. 1500 GB (not 1024+512, just 1000+500).
>
> And now you want to write half a byte to the disk and still
> be able to avoid writing those 4 bits?
>
> 4 bits?
>
> I do not know how many bits that disk can hold but I doubt that
> you will ever find an OS that writes files at the bit level.
>
> I wonder WHY you want to avoid writing those 4 bits?


If your data naturally is bit oriented, it would be entirely
reasonable to want to write it to a file where you did not need to
create some sort of padding scheme to deal with some larger unit of
storage. “Saving” a few bits on the disk is unlikely to be an issue.

There is precedent. CP/M used to commonly store data in 128 byte
sectors, but not store an exact length of the file, thus applications
had to invent their own scheme since whatever they wrote ended up
padded out to a 128 byte boundary. That's where the MS-DOS "control-
z" came from, and that was certainly a huge PITA.

That being said, I know of no OSs that provide such a thing, and don't
really see enough demand to want to see it added. OTOH, a nice set of
wrapper functions around the usual stream functions ought to be easy
enough to develop.

The common padding scheme for bit strings is to pad with the bits
0.1.1.1.1... with enough ones to fill out to the next byte (or word,
or whatever) boundary. To undo it, you just scan backwards for the
first zero bit, and truncate there. Obviously that will add an extra
storage unit if the stream naturally fills the last one. The inverse
(1.0.0.0.0…) is also common.
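
A small sketch of the 0.1.1.1... scheme in C, under the assumptions of 8-bit bytes and bits packed MSB first. The names pad_bits and unpad_bits are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Pads a bit string of nbits bits (packed MSB first in buf) in place with
   a single 0 bit followed by 1 bits up to the next byte boundary.
   buf must have room for nbits / 8 + 1 bytes.
   Returns the padded length in bytes. */
size_t pad_bits(unsigned char *buf, size_t nbits)
{
    size_t last = nbits / 8;                 /* byte that gets the 0 marker */
    unsigned used = (unsigned)(nbits % 8);   /* data bits already in it */
    assert(buf != NULL);
    if (used == 0)
        buf[last] = 0x7F;   /* data ended on a boundary: one extra byte */
    else
        buf[last] = (unsigned char)((buf[last] & (0xFF00u >> used))
                                    | (0x7Fu >> used));
    return last + 1;
}

/* Scans the last byte backwards for the first 0 bit and returns the
   number of data bits before it, or (size_t)-1 if the buffer is empty
   or the last byte contains no 0 bit (not validly padded). */
size_t unpad_bits(const unsigned char *buf, size_t nbytes)
{
    unsigned char b;
    int p;
    assert(buf != NULL);
    if (nbytes == 0)
        return (size_t)-1;
    b = buf[nbytes - 1];
    for (p = 7; p >= 0; p--)                 /* p counts from the MSB */
        if (((b >> (7 - p)) & 1) == 0)
            return (nbytes - 1) * 8 + (size_t)p;
    return (size_t)-1;
}
```

For instance, 12 data bits ending in the nybble 1100 become the byte 1100 0111, while 16 data bits gain a whole extra byte 0111 1111, which is the extra storage unit mentioned above.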

David Thompson

Dec 17, 2009, 4:07:50 AM
On Thu, 26 Nov 2009 23:35:07 -0800 (PST), "robert...@yahoo.com"
<robert...@yahoo.com> wrote:

> On Nov 26, 5:12 pm, jacob navia <ja...@nospam.org> wrote:

> > john wrote:
<snip>


> > > I want to write a file that is not an exact number of bytes, but finishes
> > > with an incomplete byte (only 1 nybble).

<snip>


> > I wonder WHY you want to avoid writing those 4 bits?
>
>
> If your data naturally is bit oriented, it would be entirely
> reasonable to want to write it to a file where you did not need to
> create some sort of padding scheme to deal with some larger unit of
> storage. "Saving" a few bits on the disk is unlikely to be an issue.
>
> There is precedent. CP/M used to commonly store data in 128 byte
> sectors, but not store an exact length of the file, thus applications
> had to invent their own scheme since whatever they wrote ended up
> padded out to a 128 byte boundary. That's where the MS-DOS "control-
> z" came from, and that was certainly a huge PITA.
>

RT-11 also counted file length only in blocks of 512 bytes (octets). I think RSX-11
might have done the same for unstructured files, but I'm not sure.

DOS/360 and OS/360, and I believe at least most of the other
early mainframe OSes, allocated most files in fixed-length records
(most commonly 80 = punched card or 120 or 132 +1 = printer line).
That was a limitation on data formatting more generally, not just file
storage, although they are closely related.

> That being said, I know of no OSs that provide such a thing, and don't
> really see enough demand to want to see it added. OTOH, a nice set of
> wrapper functions around the usual stream functions ought to be easy
> enough to develop.
>

Multics did. Of course Multices are pretty thin on the ground now.

> The common padding scheme for bit strings is to pad with the bits
> 0.1.1.1.1... with enough ones to fill out to the next byte (or word,
> or whatever) boundary. To undo it, you just scan backwards for the
> first zero bit, and truncate there. Obviously that will add an extra
> storage unit if the stream naturally fills the last one. The inverse
> (1.0.0.0.0…) is also common.

A variant used in common cryptographic hashes (MD5, SHA-1 and
the SHA-2 family) is 100... plus an integer (64 bits or 128 bits) giving
(redundantly) the count of valid bits. In practice most of the values
people actually hash are octet multiples, but the specs don't require
it. (OTOH block _ciphers_ as commonly used require octets and pad to
blocks, of 8 or 16 octets for current algorithms.)
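
As an illustration of the hash variant, here is a sketch of the padding step as SHA-1 and SHA-256 define it: a 1 bit, then 0 bits, then the bit count as a 64-bit big-endian integer, filling out to a multiple of 64 octets. MD5 stores the count little-endian, and SHA-384/SHA-512 use 128-octet blocks with a 128-bit count, so this sketch does not cover them. The function name sha_style_pad is invented here, and the whole message is assumed to fit in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pads a message of nbits bits (packed MSB first in msg) in the style of
   SHA-1 and SHA-256. out must have room for nbits / 8 + 72 bytes.
   Returns the padded length in bytes, always a multiple of 64. */
size_t sha_style_pad(const unsigned char *msg, unsigned long long nbits,
                     unsigned char *out)
{
    size_t full = (size_t)(nbits / 8);       /* complete data bytes */
    unsigned used = (unsigned)(nbits % 8);   /* data bits in the next byte */
    size_t len;
    int i;
    assert(msg != NULL && out != NULL);
    memcpy(out, msg, full);
    /* Remaining data bits (if any) followed immediately by the 1 bit. */
    if (used == 0)
        out[full] = 0x80;
    else
        out[full] = (unsigned char)((msg[full] & (0xFF00u >> used))
                                    | (0x80u >> used));
    len = full + 1;
    while (len % 64 != 56)                   /* leave 8 bytes for the count */
        out[len++] = 0;
    for (i = 0; i < 8; i++)                  /* bit count, big-endian */
        out[len++] = (unsigned char)((nbits >> (8 * (7 - i))) & 0xFF);
    return len;
}
```

Because the count is stored explicitly, a reader does not need to scan for the marker bit; the marker and the count are redundant with each other, as noted above.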

ASN.1 BER/DER encoding of bitstring uses a prefix octet containing
the count of unused/pad bits (at the bottom of the rightmost octet).
In practice many of the things encoded as bitstrings are themselves
DER or BER encodings (necessarily octet multiple). E.g. an X.509
certificate contains a public-key field defined as a bitstring whose
content varies depending on the public-key algorithm, and for popular
algorithms like RSA and DSA it is a DER encoding. This means the 'pad'
count is always zero, making it easy(ier) for people to misunderstand
and misparse it.
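
To make the prefix octet concrete, here is a sketch of a DER BIT STRING encoder in C. It is deliberately limited to the short length form (at most 127 content octets) and assumes 8-bit bytes; the function name der_bitstring is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encodes nbits bits (packed MSB first in bits) as a DER BIT STRING:
   tag 0x03, length, one octet giving the number of unused bits in the
   last octet, then the data with those unused bits set to zero.
   Returns the encoded length, or 0 if the content exceeds the short
   length form or does not fit in cap bytes. */
size_t der_bitstring(const unsigned char *bits, size_t nbits,
                     unsigned char *out, size_t cap)
{
    size_t nbytes = nbits / 8 + (nbits % 8 != 0);
    unsigned unused = (unsigned)((8 - nbits % 8) % 8);
    size_t content = nbytes + 1;             /* prefix octet + data */
    assert(out != NULL && (bits != NULL || nbits == 0));
    if (content > 127 || cap < content + 2)
        return 0;
    out[0] = 0x03;                           /* BIT STRING tag */
    out[1] = (unsigned char)content;         /* short-form length */
    out[2] = (unsigned char)unused;          /* count of pad bits */
    if (nbytes > 0) {
        memcpy(out + 3, bits, nbytes);
        /* DER requires the pad bits to be zero. */
        out[2 + nbytes] &= (unsigned char)(0xFF << unused);
    }
    return content + 2;
}
```

A 44-bit string with hex digits 0A3B5F291CD encodes as 03 07 04 0A 3B 5F 29 1C D0, where the 04 is the pad count. An octet-multiple payload such as an embedded DER structure always gets 00 in that position.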