change stream element-type on the fly?

R. Matthew Emerson

Mar 23, 1999, 8:00:00 AM

It would be handy to be able to change the element-type of
a stream on-the-fly.

I wrote code to read PPM files. One variation consists of an ASCII
header (magic, width, height, max-sample-value) and then width*height
samples, stored as raw bytes.

It would be cool to use READ to read the ASCII header, and then switch
the element-type of the stream to (unsigned-byte 8) and use
READ-SEQUENCE to slurp in the raw data all at once. As it is now,
I see no alternative but to use the moral equivalent of C's atoi()
and friends.
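
For readers following along, the byte-at-a-time approach described above might look roughly like the sketch below. It is not code from the original post: READ-ASCII-INTEGER and READ-RAW-PPM are hypothetical names, and the sketch assumes a raw ("P6") file with three samples per pixel, a maximum sample value below 256, and no '#' comment lines in the header.

```lisp
;; Hypothetical helper (not from the thread): read one whitespace-delimited
;; decimal integer from a binary stream of (unsigned-byte 8).
;; Assumes a plain ASCII header with no '#' comment lines.
(defun read-ascii-integer (stream)
  (let ((byte (read-byte stream)))
    ;; Skip leading whitespace: space, tab, LF, CR.
    (loop while (member byte '(32 9 10 13))
          do (setf byte (read-byte stream)))
    ;; Accumulate digits (ASCII 48..57).  The single delimiter byte that
    ;; ends the number is consumed along the way.  Returns 0 if no digit
    ;; is found.
    (loop with value = 0
          while (and byte (<= 48 byte 57))
          do (setf value (+ (* value 10) (- byte 48))
                   byte (read-byte stream nil nil))
          finally (return value))))

;; Hypothetical reader for a raw PPM file, under the assumptions above.
(defun read-raw-ppm (pathname)
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (let ((magic (list (read-byte in) (read-byte in))))
      (unless (equal magic '(80 54))    ; the bytes for "P6"
        (error "Not a raw PPM file: ~S" pathname)))
    (let* ((width (read-ascii-integer in))
           (height (read-ascii-integer in))
           (max-sample (read-ascii-integer in))
           (samples (make-array (* 3 width height)
                                :element-type '(unsigned-byte 8))))
      ;; Slurp all the raw samples in one call.
      (read-sequence samples in)
      (values samples width height max-sample))))
```

The stream stays binary throughout, so READ-SEQUENCE can still be used for the bulk of the data; only the few header tokens are parsed by hand.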

-matt

Steve Gonedes

Mar 23, 1999, 8:00:00 AM

r...@nightfly.apk.net (R. Matthew Emerson) writes:

< It would be handy to be able to change the element-type of
< a stream on-the-fly.

You can always reopen the stream and reset the file position.

(with-open-file (input "file.txt" :direction :input :element-type 'character)
  (read-line input)
  (with-open-file (input2 input :direction :input
                                :element-type '(unsigned-byte 8))
    (file-position input2 (file-position input))
    (read-byte input2)))

< I wrote code to read PPM files. One variation consists of an ASCII
< header (magic, width, height, max-sample-value) and then width*height
< samples, stored as raw bytes.

What are PPM files?

Howard R. Stearns

Mar 23, 1999, 8:00:00 AM

R. Matthew Emerson wrote:
>
> It would be handy to be able to change the element-type of
> a stream on-the-fly.
>

> I wrote code to read PPM files. One variation consists of an ASCII
> header (magic, width, height, max-sample-value) and then width*height
> samples, stored as raw bytes.
>

> It would be cool to use READ to read the ASCII header, and then switch
> the element-type of the stream to (unsigned-byte 8) and use
> READ-SEQUENCE to slurp in the raw data all at once. As it is now,
> I see no alternative but to use the moral equivalent of C's atoi()
> and friends.
>
> -matt

I think it's legit for an implementation to not want to change the
element-type on the fly: even if the stream is an instance of some
element-type-specific class and we used change-class to change the class
on the fly, it might not be easy to figure out what to do with various
internal buffers.

However, before giving up on multiple element types, what would happen
if you opened once under one element-type and kept careful note of
file-position before closing, and then opened again under a new
element-type and immediately set the file-position to an appropriate
place? (More generally, you might even have both streams open
simultaneously, but this can get pretty hairy if there's output
involved.) Note that the file-position might be in different units for
the two element-types. (Let's not debate the ANSI spec for
file-position right now unless it's absolutely necessary for someone's
immediate problem. I promise I'll bring it up in the future...)

Erik Naggum

Mar 24, 1999, 8:00:00 AM

* r...@nightfly.apk.net (R. Matthew Emerson)

| It would be handy to be able to change the element-type of a stream
| on-the-fly.

chuck the C mind-set and re-evaluate the problem.

| I wrote code to read PPM files. One variation consists of an ASCII
| header (magic, width, height, max-sample-value) and then width*height
| samples, stored as raw bytes.

this isn't all that uncommon.

| It would be cool to use READ to read the ASCII header, and then switch
| the element-type of the stream to (unsigned-byte 8) and use READ-SEQUENCE
| to slurp in the raw data all at once. As it is now, I see no alternative
| but to use the moral equivalent of C's atoi() and friends.

how about slurping a major portion of the file into a specialized vector
of (unsigned-byte 8), and writing a character stream class that can eat
out of such a vector as its input buffer, with an accessible buffer index
you could use to retrieve individual bytes?

if you need to read lines of data, or, say, until a double newline,
search for the terminator as bytes with SEARCH, and use (map 'string
#'code-char ...) of a subsequence of the full vector/buffer. (use
displaced arrays to cut the copying costs.) use this form directly in
WITH-INPUT-FROM-STRING.
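
A minimal sketch of the SEARCH / CODE-CHAR / WITH-INPUT-FROM-STRING idea might look like the following. The function names BYTES-TO-STRING and SPLIT-AT-BLANK-LINE are hypothetical, and the sketch assumes a header that ends in a double newline (bytes 10 10) and contains only ASCII, so that CODE-CHAR maps each byte to the intended character.

```lisp
;; Hypothetical helper: convert BUFFER[start, end) to a string.
;; The displaced array gives a view onto the bytes without copying them;
;; only the resulting string is freshly allocated.
(defun bytes-to-string (buffer start end)
  (map 'string #'code-char
       (make-array (- end start)
                   :element-type '(unsigned-byte 8)
                   :displaced-to buffer
                   :displaced-index-offset start)))

;; Hypothetical helper: find the blank line that ends the header.
;; Returns the header as a string and the index of the first body byte.
(defun split-at-blank-line (buffer)
  (let ((end (search #(10 10) buffer)))   ; two consecutive LF bytes
    (unless end
      (error "No blank line found in buffer"))
    (values (bytes-to-string buffer 0 end)
            (+ end 2))))

;; Usage: parse the textual header with READ, then index the raw bytes.
;; *READ-EVAL* is bound to NIL because the data is untrusted.
(defun example-parse (buffer)
  (multiple-value-bind (header body-start) (split-at-blank-line buffer)
    (let ((*read-eval* nil))
      (with-input-from-string (s header)
        (let* ((width (read s))
               (height (read s)))
          (values width height body-start))))))
```

The buffer stays a vector of bytes the whole time; only the header slice is ever turned into characters, and the returned index says where the raw data begins.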

remember: bytes never _were_ characters. just because two different
objects have the same machine representation doesn't mean they are the
same. the whole notion of _type_ is about communicating intent. C has
never understood this, and gives you a short-cut that most people don't
realize _is_ a short-cut. a lot of really bad design follows from this.

#:Erik

Kent M Pitman

Mar 24, 1999, 8:00:00 AM

Steve Gonedes <sgon...@worldnet.att.net> writes:

> r...@nightfly.apk.net (R. Matthew Emerson) writes:
>

> < It would be handy to be able to change the element-type of
> < a stream on-the-fly.
>

> You can always reopen the stream and reset the file position.

This requires knowing that the byte sizes of the two streams are
compatible. On the Lisp Machine, there was a string-char mode for
reading files which was random-access and a character mode that
was not because it used prefix escaping for some characters. Going
from either to the other was painful and not subject to a simple
math computation without knowledge of whether and how many escapes had
been used already on the stream.

> (with-open-file (input "file.txt" :direction :input :element-type 'character)
>   (read-line input)
>   (with-open-file (input2 input :direction :input
>                                 :element-type '(unsigned-byte 8))
>     (file-position input2 (file-position input))
>     (read-byte input2)))

This would fail not only in the implementation I mentioned but also
in an implementation using two-byte characters. PROBABLY in most
implementations base-char is an 8-bit thing, though even that is not
guaranteed. It's quite a lot more likely that 8-bit is not compatible
with CHARACTER in implementations supporting Unicode, etc.

It is not inconceivable that there is a Common Lisp using base-char
that is 7-bit or 9-bit or some other odd size. Such implementations
used to be prevalent. Many may have been lost in recent years.

> < I wrote code to read PPM files. One variation consists of an ASCII
> < header (magic, width, height, max-sample-value) and then width*height
> < samples, stored as raw bytes.
>

> What are PPM files?

A graphics file format: Portable PixMap Utilities.
Related to PBM, PGM, and PNM.
Bitmap, up to 24-bit, uncompressed, by Jef Poskanzer for Unix and Intel PC.
Intermediate format used in converting other formats to more familiar ones.
Some interesting filter tools commonly available at least for Unix and
Intel PC, maybe others.

See "Encyclopedia of Graphics File Formats" from O'Reilly for more details.
A very useful reference (though its description of GIF decoding seemed
buggy when I tried it).

Another useful related reference is "The File Formats Handbook", though
it happens not to show PPM in the index. It does have a number of obscure
formats not in the Encyclopedia, as well as many that are not graphics
related. (This one had what seemed like a more correct description of
GIF decoding, too. Useful to have a second reference for obscure things
like that in case you're up against a wall trying to understand the
main one you've chosen to use.)