Saving text/plain attachments

anton

Jan 17, 2012, 4:14:20 AM

Hi,

I'm using mutt 1.5.20 here (from Ubuntu 10.04 LTS). Recently I received a
mail with two attachments of type "text/plain". If I press "v" on the
message (while in the index), I see the following:

I 1 <no description> [multipa/alternativ, 7bit, 7.2K]
I 2 ├─><no description> [text/plain, 7bit, iso-8859-1, 0.7K]
I 3 └─><no description> [text/html, 7bit, iso-8859-1, 6.3K]
A 4 file1.txt [text/plain, 8bit, us-ascii, 2.3K]
A 5 file2.txt [text/plain, 8bit, us-ascii, 2.3K]

It turns out that both file1.txt and file2.txt contain non-ASCII
characters, encoded as Latin-1. So I suppose the "us-ascii" part
is misleading (the message was sent by the Thunderbird MUA). When viewing
the message in the pager, these characters are replaced with the "?"
character.

So far so good. The problem is that if I save the attachments (using
's'), the saved files are also translated, so that non-ASCII characters
are again replaced by the "?" character.
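
A minimal Python sketch of the effect described above. This is not mutt's code; it only assumes that the attachment bytes are decoded using the declared charset, with every undecodable byte replaced by "?", before being written out. The function name lossy_convert is made up for illustration.

```python
def lossy_convert(raw: bytes, declared: str = "us-ascii", target: str = "utf-8") -> bytes:
    """Decode raw as `declared`, turn undecodable bytes into '?', re-encode as `target`."""
    # errors="replace" maps each invalid byte to U+FFFD (the replacement character).
    text = raw.decode(declared, errors="replace")
    # Mimic the "?" substitution observed in the pager and in the saved files.
    text = text.replace("\ufffd", "?")
    return text.encode(target, errors="replace")


original = b"caf\xe9 cr\xe8me\n"   # Latin-1 bytes, mislabelled as us-ascii
saved = lossy_convert(original)
print(saved)                       # the accented letters have become '?'
```

Once the substitution has happened, the original bytes cannot be recovered from the saved file, which is why a verbatim save matters here.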

If I manually change the mime type of the attachment (using ctrl-e and
setting "application/octet-stream", for instance) the files are
correctly saved on disk.

So my question is: is there any way to tell mutt "ok, the MIME
information may be wrong. In any case, I want you to save the attachment
as-is to disk, i.e. without any translation etc."?

thanks,
aitor

Jorgen Grahn

Jan 17, 2012, 11:06:49 AM

On Tue, 2012-01-17, anton wrote:
> Hi,
>
> I'm using mutt 1.5.20 here (from Ubuntu 10.04 LTS). Recently I received a
> mail with two attachments of type "text/plain". If I press "v" on the
> message (while in the index), I see the following:
>
> I 1 <no description> [multipa/alternativ, 7bit, 7.2K]
> I 2 ├─><no description> [text/plain, 7bit, iso-8859-1, 0.7K]
> I 3 └─><no description> [text/html, 7bit, iso-8859-1, 6.3K]
> A 4 file1.txt [text/plain, 8bit, us-ascii, 2.3K]
> A 5 file2.txt [text/plain, 8bit, us-ascii, 2.3K]
>
> It turns out that both file1.txt and file2.txt contain non-ASCII
> characters, encoded as Latin-1. So I suppose the "us-ascii" part
> is misleading (the message was sent by the Thunderbird MUA).

I don't know for sure how MIME defines us-ascii, but I'm pretty sure
that's a broken mail you're looking at.

> When viewing
> the message in the pager, these characters are replaced with the "?"
> character.
>
> So far so good. The problem is that if I save the attachments (using
> 's'), the saved files are also translated, so that non-ASCII characters
> are again replaced by the "?" character.

That part clearly sounds like a bug.

I assume mutt's goal is "take this attachment and save it in the
user's favorite encoding". At the point where it writes that '?', it
*knows* that the MIME information lied to it and that it can never
perform that translation correctly. It would be better to display an
error message and not save anything.

It's different if the MIME information is consistent, but the
attachment still cannot be converted because the mail contains
characters which aren't present in "the user's favorite encoding".
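
The two cases above can be told apart mechanically. A minimal Python sketch, not mutt's implementation; the names CharsetMismatch and convert_strict are hypothetical:

```python
class CharsetMismatch(Exception):
    """The bytes are not valid in the charset the MIME header declared."""


def convert_strict(raw: bytes, declared: str, target: str = "utf-8") -> bytes:
    """Convert raw from `declared` to `target`, refusing instead of writing '?'."""
    try:
        text = raw.decode(declared)
    except UnicodeDecodeError as err:
        # Case 1: the MIME information lied; report it and save nothing.
        raise CharsetMismatch(
            f"byte 0x{raw[err.start]:02x} at offset {err.start} "
            f"is not valid {declared}"
        ) from err
    # Case 2: the MIME information is consistent, but `target` may still lack
    # some characters; that surfaces separately as UnicodeEncodeError.
    return text.encode(target)
```

With this split, a caller could show an error for the first case and decide separately how to handle the second.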

> If I manually change the mime type of the attachment (using ctrl-e and
> setting "application/octet-stream", for instance) the files are
> correctly saved on disk.
>
> So my question is: is there any way to tell mutt "ok, the MIME
> information may be wrong. In any case, I want you to save the attachment
> as-is to disk, i.e. without any translation etc."?

Hope someone else can answer here.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

anton

Jan 19, 2012, 3:54:31 AM

Hi!

On 2012-01-17, Jorgen Grahn <grahn...@snipabacken.se> wrote:
> On Tue, 2012-01-17, anton wrote:
>>
>> So far so good. The problem is that if I save the attachments (using
>> 's'), the saved files are also translated, so that non-ASCII characters
>> are again replaced by the "?" character.
>
> That part clearly sounds like a bug.

I'll try with the latest mutt version, and if the problem persists I'd
like to report the bug.

Thank you for the help,

--
anton

Gary Johnson

Jan 19, 2012, 7:15:56 PM

anton wrote:
> Hi!
>
> On 2012-01-17, Jorgen Grahn wrote:
> > On Tue, 2012-01-17, anton wrote:
> >>
> >> So far so good. The problem is that if I save the attachments (using
> >> 's'), the saved files are also translated, so that non-ASCII characters
> >> are again replaced by the "?" character.
> >
> > That part clearly sounds like a bug.
>
> I'll try with the latest mutt version, and if the problem persists I'd
> like to report the bug.

It does seem like a bug, although it could be argued that mutt is simply
decoding characters outside the range of the stated charset to
characters within that range.

To work around it, you could try putting the following in your .muttrc:

charset-hook ^us-ascii$ windows-1252

The windows-1252 charset is latin1 with M$ extensions.
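
To illustrate the idea behind the hook (an illustration only, not how mutt implements it): treat the declared charset as an alias for a superset that can decode the 8-bit bytes. In this Python sketch, HOOKS and decode_attachment are hypothetical names, and "cp1252" is Python's codec name for windows-1252.

```python
# Hypothetical stand-in for the charset-hook line: declared name -> charset to use.
HOOKS = {"us-ascii": "cp1252"}


def decode_attachment(raw: bytes, declared: str) -> str:
    """Decode raw using the hooked charset if one is configured for `declared`."""
    effective = HOOKS.get(declared.lower(), declared)
    return raw.decode(effective)


# 0xE9 is e-acute in both Latin-1 and windows-1252; 0x80 is the euro sign
# only in windows-1252 (one of the extensions mentioned above).
print(decode_attachment(b"caf\xe9 costs 5 \x80\n", "us-ascii"))
```

Genuine 7-bit ASCII text decodes identically under windows-1252, so correctly labelled messages should be unaffected by such a mapping.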

--
Gary Johnson

Jorgen Grahn

Jan 20, 2012, 7:35:59 AM

On Fri, 2012-01-20, Gary Johnson wrote:
> anton wrote:
>> Hi!
>>
>> On 2012-01-17, Jorgen Grahn wrote:
>> > On Tue, 2012-01-17, anton wrote:
>> >>
>> >> So far so good. The problem is that if I save the attachments (using
>> >> 's'), the saved files are also translated, so that non-ASCII characters
>> >> are again replaced by the "?" character.
>> >
>> > That part clearly sounds like a bug.
>>
>> I'll try with the latest mutt version, and if the problem persists I'd
>> like to report the bug.
>
> It does seem like a bug, although it could be argued that mutt is simply
> decoding characters outside the range of the stated charset to
> characters within that range.

If you look at the low level, sure. But did you read my analysis upthread?
Mutt sees that the MIME info is inconsistent, but continues anyway
without warnings, and eventually presents known-broken data to the user.

At least that's what it seems like to me.

anton

Jan 20, 2012, 9:19:52 AM

Hi,

On 2012-01-20, Gary Johnson <gary...@eskimo.com> wrote:
>
> It does seem like a bug, although it could be argued that mutt is simply
> decoding characters outside the range of the stated charset to
> characters within that range.

I understand the decoding problems when viewing a (malformed) text
attachment in the pager. However, I think that when saving the
attachment to disk, mutt shouldn't do any decoding/translation at all,
and should just write a verbatim copy of the attachment to disk. Or, at
least, have a command to do so. I find it odd that if someone sends a
malformed text file as an attachment, there is no direct way to save
an exact copy of the attached file.
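
Outside mutt, a verbatim copy can be sketched with Python's standard email module, assuming the complete raw message has first been written to a file. The helper name extract_attachment is made up; get_payload(decode=True) only undoes the transfer encoding (base64, quoted-printable) and performs no charset conversion.

```python
import email


def extract_attachment(raw_message: bytes, filename: str) -> bytes:
    """Return the named attachment's bytes exactly as sent, ignoring its charset."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_filename() == filename:
            # decode=True reverses the Content-Transfer-Encoding only;
            # the declared charset is never consulted.
            return part.get_payload(decode=True)
    raise KeyError(filename)
```

A caller would read the saved message in binary mode, pass the bytes in, and write the result to disk, also in binary mode.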

And there is also the problem pointed out by Jorgen Grahn: in the
decoding phase, mutt knows that some error occurred (and thus created the
'?' characters) but does nothing (like warning the user).

> To work around it, you could try putting the following in your .muttrc:
>
> charset-hook ^us-ascii$ windows-1252
>
> The windows-1252 charset is latin1 with M$ extensions.

thanks for the tip!

best,

--
anton

Jorgen Grahn

Jan 20, 2012, 10:38:48 AM

On Fri, 2012-01-20, anton wrote:
> Hi,
>
> On 2012-01-20, Gary Johnson <gary...@eskimo.com> wrote:
>>
>> It does seem like a bug, although it could be argued that mutt is simply
>> decoding characters outside the range of the stated charset to
>> characters within that range.
>
> I understand the decoding problems when viewing a (malformed) text
> attachment in the pager. However, I think that when saving the
> attachment to disk, mutt shouldn't do any decoding/translation at all,
> and should just write a verbatim copy of the attachment to disk. Or, at
> least, have a command to do so.

The latter alternative, please. Attachments come in all kinds of funny
encodings, but users want their text files on disk to have the
system-default one (usually UTF-8 these days). Note that you lose the
metadata about encoding when you save.

(Not that saving attachments to disk is something I frequently do.)

Kai Burghardt

Oct 10, 2013, 8:29:40 PM

Hi,

On 2012-01-17, Jorgen Grahn <grahn...@snipabacken.se> wrote:
> On Tue, 2012-01-17, anton wrote:
>> So far so good. The problem is that if I save the attachments (using
>> 's'), the saved files are also translated, so that non-ASCII characters
>> are again replaced by the "?" character.
>
> That part clearly sounds like a bug.

Nononooo, just to ensure that it really might be a "bug": What external
viewers are you using? My environment shows Chinese characters as
question marks too.
--
Sincerely yours
Kai Burghardt

Jorgen Grahn

Oct 12, 2013, 1:30:08 AM

I wrote that almost two years ago ...

Since you're asking /me/: I didn't use a viewer at all. I was just
commenting on anton's statement. I assumed he meant he had confirmed
that the files had ASCII ? in them, though.

% LANG=C od -a foo

should be a foolproof way to do that.
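
A rough Python sketch of the same kind of check, for anyone who prefers counts over a dump: it tallies literal "?" bytes (0x3F) and bytes with the high bit set. The function name byte_report is made up for illustration.

```python
def byte_report(data: bytes) -> dict:
    """Count literal '?' bytes and non-ASCII bytes in a saved attachment."""
    return {
        "question_marks": data.count(b"?"),              # real ASCII 0x3F bytes
        "non_ascii": sum(1 for b in data if b >= 0x80),  # bytes outside 7-bit ASCII
    }
```

A file that mutt has already translated would show question marks and no non-ASCII bytes; an untouched Latin-1 file would show the opposite.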

blmblm.m...@gmail.com

Oct 12, 2013, 12:43:16 PM

In article <slrnl5hniv.2...@frailea.sa.invalid>,
Jorgen Grahn <grahn...@snipabacken.se> wrote:
> On Fri, 2013-10-11, Kai Burghardt wrote:
> > Hi,
> >
> > On 2012-01-17, Jorgen Grahn <grahn...@snipabacken.se> wrote:

[ snip ]

> > Nononooo, just to ensure that it really might be a "bug": What external
> > viewers are you using? My environment shows Chinese characters as
> > question marks too.
>
> I wrote that almost two years ago ...
>

The same person has replied to several quite old posts. One can only
suppose that Google's posting interface .... Oh, that's interesting!
based on headers he/she seems to be posting from news.albasani.net.
What a long retention period ....

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

Jorgen Grahn

Oct 12, 2013, 2:54:11 PM

On Sat, 2013-10-12, blmblm myrealbox.com wrote:
> In article <slrnl5hniv.2...@frailea.sa.invalid>,
> Jorgen Grahn <grahn...@snipabacken.se> wrote:
>> On Fri, 2013-10-11, Kai Burghardt wrote:
>> > Hi,
>> >
>> > On 2012-01-17, Jorgen Grahn <grahn...@snipabacken.se> wrote:
>
> [ snip ]
>
>> > Nononooo, just to ensure that it really might be a "bug": What external
>> > viewers are you using? My environment shows Chinese characters as
>> > question marks too.
>>
>> I wrote that almost two years ago ...
>>
>
> The same person has replied to several quite old posts. One can only
> suppose that Google's posting interface .... Oh, that's interesting!
> based on headers he/she seems to be posting from news.albasani.net.
> What a long retention period ....

Well, comp.mail.mutt /is/ a rather important group ;-)