Re: Rmail and the raw-text coding system

Stefan Monnier

Jan 14, 2011, 4:00:40 PM
to mark.lil...@hp.com, emacs...@gnu.org
> It then decodes the BABYL message part:

> (unless (and coding-system
>              (coding-system-p coding-system))
>   (setq coding-system
>         ;; Emacs 21.1 and later writes RMAIL files in emacs-mule, but
>         ;; earlier versions did that with the current buffer's encoding.
>         ;; So we want to favor detection of emacs-mule (whose normal
>         ;; priority is quite low), but still allow detection of other
>         ;; encodings if emacs-mule won't fit.  The call to
>         ;; detect-coding-with-priority below achieves that.
>         (car (detect-coding-with-priority
>               from to
>               '((coding-category-emacs-mule . emacs-mule))))))
> (unless (memq coding-system
>               '(undecided undecided-unix))
>   (set-buffer-modified-p t) ; avoid locking when decoding
>   (let ((buffer-undo-list t))
>     (decode-coding-region from to coding-system))
>   (setq coding-system last-coding-system-used))
> (set-buffer-modified-p modifiedp)
> (setq buffer-file-coding-system nil)
> (setq save-buffer-coding-system
>       (or coding-system 'undecided))))

> This process leaves the buffer as a unibyte buffer.

The question for me is why did it choose raw-text here (which results
indeed in a unibyte buffer)? It should have been emacs-mule.
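
One way to look into this, as a minimal sketch only: run the same kind of
detection by hand on the message region and see what comes back. The helper
name `my-babyl-detect-coding` is hypothetical and not part of rmail.el; it
relies on `with-coding-priority` from mule-util and on `detect-coding-region`,
and it assumes FROM and TO delimit the region that follows the BABYL header.

```elisp
;; Hypothetical diagnostic helper; not part of rmail.el.
(require 'mule-util)   ; provides `with-coding-priority'

(defun my-babyl-detect-coding (from to)
  "Return the list of coding systems detected between FROM and TO.
emacs-mule is moved to the front of the priority list for the
duration of the call, so it is preferred whenever it fits."
  (with-coding-priority '(emacs-mule)
    ;; Without the optional HIGHEST argument, the result is a list
    ;; of candidate coding systems ordered by priority.
    (detect-coding-region from to)))
```

Evaluating `(my-babyl-detect-coding from to)` in the buffer holding the file
shows which candidates the detector considers, and checking
`enable-multibyte-characters` afterwards tells whether the buffer is unibyte.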


Stefan


Mark Lillibridge

Jan 14, 2011, 7:06:53 PM
to Stefan Monnier, emacs...@gnu.org

Stefan wrote:

> I (Mark) wrote:
> > It then decodes the BABYL message part:
>
> > ...

>
> > This process leaves the buffer as a unibyte buffer.
>
> The question for me is why did it choose raw-text here (which results
> indeed in a unibyte buffer)? It should have been emacs-mule.

I assume it did so because the buffer contained "invalid" code
points. Remember that loading a file as raw-text and then converting the
buffer to multibyte can (I believe) produce a buffer containing
essentially arbitrary bytes, the only constraint being that there are no
unaccompanied continuation bytes. Presumably, emacs-mule can consider
the result invalid.

- Mark


Stefan Monnier

Jan 17, 2011, 2:19:10 PM
to mark.lil...@hp.com, emacs...@gnu.org
>> > I assume it did so because the buffer contained "invalid" code
>> > points.
>> That would mean that the BABYL file is corrupted. Is it?

> Not as far as I can tell. Weird characters are displayed for some
> messages, but that is normal with Rmail 22 as it doesn't understand
> MIME. I believe the use of raw-text does not lose data.

The BABYL file is supposed to use the emacs-mule encoding. So if it
contains invalid emacs-mule byte sequences, it presumably means
it's corrupted. Of course, maybe they are valid sequences which
Emacs23/24 rejects by mistake, or maybe there's yet something else
going on.

But AFAIK BABYL files use a single encoding for the whole file, and
since around Emacs-21.x that single encoding is supposed to be
emacs-mule (and I seem to remember that, in that case, the BABYL file
is supposed to contain an annotation at the very beginning saying it's
using emacs-mule).


Stefan
