questions on QP, non-text attachments and munpack

Ned Freed

Feb 9, 1996, 3:00:00 AM

> o Is my mailer correct in believing that it should be able to use QP
> for non-text attachments? After all, QP IS supposed to just be an
> encoding.

It is perfectly correct. If it makes sense for you to use QP for non-text
material you are free to do so.

The system I use makes this decision based on which encoding is more efficient.
(Not that either one is all that efficient...) There are plenty of cases where
QP is more efficient than base64 and I use it when it is, regardless of content
type.
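As a rough illustration of choosing an encoding by efficiency rather than by content type, here is a minimal Python sketch. The function names are hypothetical and the size formulas are approximations (they ignore the rule that trailing whitespace on a line must be encoded); this is not the code of the system described above.

```python
import math

def qp_size(data: bytes) -> int:
    """Approximate quoted-printable size for non-text data.

    Assumption: printable ASCII other than '=' is sent literally (1 char);
    everything else, including CR and LF, becomes =XX (3 chars).
    """
    body = sum(1 if (32 <= b <= 126 and b != 61) else 3 for b in data)
    # Roughly one soft line break ("=" CR LF) per 75 encoded characters.
    return body + 3 * (body // 75)

def base64_size(data: bytes) -> int:
    """Base64 size: 4 output chars per 3 input bytes, CRLF per 76-char line."""
    body = 4 * math.ceil(len(data) / 3)
    return body + 2 * math.ceil(body / 76)

def pick_encoding(data: bytes) -> str:
    """Pick whichever encoding is estimated to be shorter."""
    return "quoted-printable" if qp_size(data) <= base64_size(data) else "base64"
```

Mostly-ASCII material such as PostScript tends to come out shorter in QP under this estimate, while data that uses all byte values evenly favors base64.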

> o Is munpack wrong in stripping CR's when dealing with non-text
> attachments?

It depends on what CRs you are referring to. There are three sorts in QP:

(1) Those that appear as =0D. These should only appear in the content of
types other than text -- the rules for text require that CR only be
used as part of a CRLF sequence and that such sequences be represented
as a line break in the encoded output. When they do appear in non-text
material, however, they must not be stripped.

(2) Those that appear as part of a =<CR><LF> sequence. These "soft breaks"
should always be removed regardless of the type of content involved.

(3) Those that appear as a <CR><LF> sequence that isn't preceded by =. These
"hard breaks" are only supposed to be used in MIME text. They should not
appear in non-text material. When they appear in text material, however,
they must not be discarded.
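The three cases above can be sketched as a small decoder. This is a minimal Python illustration, not munpack's code or any other implementation mentioned in this thread; the function name and the `keep_hard_breaks` flag are assumptions made for the example.

```python
import re

def decode_qp(encoded: bytes, is_text: bool, keep_hard_breaks: bool = True) -> bytes:
    """Decode quoted-printable, treating the three kinds of CR separately."""
    # Case (2): soft breaks ("=" CR LF) are always removed.
    data = encoded.replace(b"=\r\n", b"")
    # Case (3): hard breaks are kept for text. For non-text material they
    # should not be present at all; a decoder may keep or drop them.
    if not is_text and not keep_hard_breaks:
        data = data.replace(b"\r\n", b"")
    # Case (1): =XX escapes are decoded last, so a CR that arrived as =0D
    # can never be mistaken for part of a line break and is never stripped.
    return re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)
```

For example, `decode_qp(b"a=0Db=\r\nc", is_text=False)` keeps the encoded CR and drops only the soft break. Conversion of CRLF to a local newline convention for text is a separate step and is not shown here.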

My guess is that the case you're referring to is (3), and you did say that the
material isn't textual in form. In this case there aren't supposed to be any
hard breaks present. And if there are, MUNPACK could strip them if it wanted to.
It is neither "correct" nor "incorrect" to do so.

I could go either way on whether or not it is a good idea for MUNPACK to behave
this way. On one hand, it would be nice to accommodate improper usage of hard
line breaks when encoding non-text material. On the other hand, I can see an
argument for ignoring such breaks as possibly having originated in mishandling
of the encoded material at some point.

For what it's worth, I believe my decoder would turn hard breaks into CRLF
sequences in non-text material. In other words, I didn't do it the way
MUNPACK does, but I cannot claim that my approach is clearly the better
one. (In fact, now that I know what MUNPACK does, I'm going to rethink my
choice!)

Ned

Ned Freed

Feb 11, 1996, 3:00:00 AM

> No, the program is encoding CR's as =0D. The problem is that munpack is
> stripping those out after decoding them. My query to the group was: is
> munpack wrong in doing so? You and Ned appear to agree with my feeling that
> munpack is indeed wrong.

I agree that it is wrong to do this with non-text material. It appears that
munpack assumes that all use of QP is associated with text. There are plenty
of cases where this won't be true, however -- I find QP to be quite useful
in handling application/postscript, for example, and PostScript is definitely
*not* textual in form.

Ned

John Gardiner Myers

Feb 12, 1996, 3:00:00 AM

Ned Freed <N...@INNOSOFT.COM> writes:
> It appears that
> munpack assumes that all use of QP is associated with text.

Munpack does have this assumption as a heuristic. This heuristic is
made necessary by the design requirement that munpack not require user
configuration of which types are what.

--
_.John G. Myers Internet: jg...@CMU.EDU
LoseNet: ...!seismo!ihnp4!wiscvm.wisc.edu!give!up

John Gardiner Myers

Feb 13, 1996, 3:00:00 AM

han...@pegasus.att.com writes:
> But munpack uses a different heuristic for base64 encoding: it appears to
> check to see if the content type is text and passes a flag to the base64
> decoder that says whether or not CR's should be suppressed. This heuristic
> works well.

That is because the heuristic is to use the choice of quoted-printable
vs. base64 to intuit whether or not the object is textual. Few
textual non-text/* objects are encoded in base64. All text/* objects
are by the spec textual.

Ned Freed

Feb 14, 1996, 3:00:00 AM

> I then asked this group who was right: munpack or the software which encoded
> the body part with quoted-printable. The consensus is that the q-p software
> is right and munpack is wrong.

> We've had to advise the customer to stop using munpack because of this. (We
> may even have to put out a general customer advisory on this issue.)

I agree, albeit reluctantly, with this assessment. Speaking as someone with a
MIME implementation that might very well elect to use Q-P for non-text data
should it prove to be more efficient, I'm going to have to consider pointing
people at a minimal MIME agent other than MUNPACK in the future.

Ned

Keith Moore

Feb 15, 1996, 3:00:00 AM

> Yes, all text/* objects are by the spec textual. And non-text/* objects may
> be either textual or non-textual.

Are there any non-{text,message,multipart}/* objects that require that
CRLF be mapped to and from the local newline convention?

Reading the IANA list, I didn't see any, but I'm not familiar with all of
them. Some of them (e.g. application/pdf) appear to be tolerant of
changes to the newline convention, while others (application/PostScript)
may be intolerant of such changes, depending on the content. But I don't
know of any which *require* CRLF->local-convention to be usable.

If this is the case, the "do CRLF->local conversion only for text/*"
would seem to be the right heuristic.
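To make the difference between the two heuristics concrete, here is a minimal Python sketch. The first function models the encoding-based guess as it has been described in this thread (it is not taken from munpack's source); the second models the "text/* only" rule. All function names are hypothetical.

```python
def textual_by_encoding(content_type: str, encoding: str) -> bool:
    """Encoding-based guess, as described in this thread: anything sent
    as quoted-printable is assumed textual; otherwise check the type."""
    if encoding.strip().lower() == "quoted-printable":
        return True
    return textual_by_type(content_type)

def textual_by_type(content_type: str) -> bool:
    """Type-based rule: only text/* is treated as textual."""
    return content_type.strip().lower().startswith("text/")

def to_local_newlines(decoded: bytes, content_type: str,
                      local_newline: bytes = b"\n") -> bytes:
    """Apply CRLF -> local conversion only when the type-based rule says so."""
    if textual_by_type(content_type):
        return decoded.replace(b"\r\n", local_newline)
    return decoded
```

The two rules disagree exactly on the case raised in this thread: application/postscript sent as quoted-printable is textual under the first and not under the second.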

Keith
--------
Take the pledge! "I do not limit my speech to satisfy the whims of Congress."

Ned Freed

Feb 15, 1996, 3:00:00 AM

> > Yes, all text/* objects are by the spec textual. And non-text/* objects may
> > be either textual or non-textual.

> Are there any non-{text,message,multipart}/* objects that require that
> CRLF be mapped to and from the local newline convention?

> Reading the IANA list, I didn't see any, but I'm not familiar with all of
> them. Some of them (e.g. application/pdf) appear to be tolerant of
> changes to the newline convention, while others (application/PostScript)
> may be intolerant of such changes, depending on the content. But I don't
> know of any which *require* CRLF->local-convention to be usable.

> If this is the case, the "do CRLF->local conversion only for text/*"
> would seem to be the right heuristic.

For what it's worth, this is the approach I use. However, my life is further
complicated by the fact that newline conversion is far from the only
local-convention in some of the environments I support. In particular, I have
to allow for the testing and setting of fairly complex file attributes based on
file type, along with some fairly peculiar format conversions.

The net result is that I have a table that specifies all of this stuff. And
while at present there are no exceptions based on newline conversion in there,
there are a bunch of others having to do with various other sorts of
conversions I have to do.
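A table of this kind might look roughly like the following Python sketch. Every entry, field name, and attribute label here is hypothetical and chosen only for illustration; the actual table is not shown in this thread.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Handling:
    convert_newlines: bool = False   # CRLF <-> local newline convention
    file_attributes: str = "binary"  # hypothetical local file attribute label

# Hypothetical entries: exact types first, then a "major/*" wildcard.
TABLE = {
    "text/*": Handling(convert_newlines=True, file_attributes="text"),
    "application/postscript": Handling(file_attributes="stream"),
}
DEFAULT = Handling()

def lookup(content_type: str) -> Handling:
    """Find handling for a type: exact match, then wildcard, then default."""
    ct = content_type.strip().lower()
    major = ct.split("/", 1)[0]
    return TABLE.get(ct) or TABLE.get(major + "/*") or DEFAULT
```

In this sketch only the text/* wildcard turns on newline conversion, so there are no per-type exceptions for newlines, while other per-type entries carry the remaining conversions.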

Ned