
html mail filter in gnus


Kin Cho

Sep 17, 2002, 3:02:48 PM
Hi,

I'm getting HTML mail quite often now and it's getting annoying.
Does anyone have a Gnus filter to strip out the HTML tags and
leave the plain text behind?

-kin

James Cozine

Sep 17, 2002, 6:16:07 PM
Kin Cho <k...@neoscale.com> writes:

,----[ C-h v mm-discouraged-alternatives RET ]
| mm-discouraged-alternatives's value is
| ("text/html" "text/richtext")
|
|
| Documentation:
| List of MIME types that are discouraged when viewing multipart/alternative.
| Viewing agents are supposed to view the last possible part of a message,
| as that is supposed to be the richest. However, users may prefer other
| types instead, and this list says what types are most unwanted. If,
| for instance, text/html parts are very unwanted, and text/richtext are
| somewhat unwanted, then the value of this variable should be set
| to:
|
| ("text/html" "text/richtext")
|
| You can customize this variable.
|
| Defined in `mm-decode'.
`----

-jc
--
Debug is human, de-fix divine.

Ichimusai

Sep 18, 2002, 2:38:48 AM
James Cozine <jmco...@yahoo.com> writes:

> Kin Cho <k...@neoscale.com> writes:
>
> > Hi,
> >
> > I'm getting html mail quite often now and it's getting annoying.
> > Does anyone have a gnus filter to filter out the html tags and
> > leave the plain text behind?
> >
> > -kin
>
> ,----[ C-h v mm-discouraged-alternatives RET ]
> | mm-discouraged-alternatives's value is
> | ("text/html" "text/richtext")

This works best if there is a plain text part to the message. However, I
receive lots of mails which do not have a plain text part at all, and
that is annoying.

W3 catches some of them and renders them nicely enough, but if MS
Exchange has converted the mail into HTML, or if it was sent by
Outlook or Outlook Express, the HTML is so bad that W3 gives up and
the message is shown as HTML source instead. It's ugly. It got worse
since I upgraded W3.

If everything else fails I use this:

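;; Remove HTML tags from a buffer
(defun wash-ugly-html ()
  "Remove ugly HTML tags from the current buffer."
  (interactive)
  ;; Article buffers are normally read-only.
  (setq buffer-read-only nil)
  (save-excursion
    ;; Strip tags.  Excluding `@' spares <user@host> addresses and
    ;; the message IDs in the References: header.
    (goto-char (point-min))
    (while (re-search-forward "<[^<@>]*>" nil t)
      (replace-match "" nil nil))
    (goto-char (point-min))
    (while (re-search-forward "&gt;" nil t)
      (replace-match ">" nil nil))
    (goto-char (point-min))
    (while (re-search-forward "&lt;" nil t)
      (replace-match "<" nil nil))
    ;; Drop any remaining entities, matching one entity at a time.
    (goto-char (point-min))
    (while (re-search-forward "&[#A-Za-z0-9]+;" nil t)
      (replace-match "" nil nil))))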

Bind it to a key of your liking and use it when all else fails. Not
the most elegant solution, but it works. It leaves a few tags in
sometimes, since it is made to leave the References: line in the
header and email addresses alone, but it makes things a lot more
readable.
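
A minimal sketch of such a binding, assuming wash-ugly-html is already defined in your init file. The key C-c h is an arbitrary choice for illustration, not something from the post. The binding goes into the article buffer's keymap because the function edits the current buffer.

```elisp
;; Assumption: C-c h is free in your setup; pick any key you like.
;; `gnus-article-mode-map' only exists once gnus-art has been loaded,
;; so the binding is deferred until then.
(eval-after-load "gnus-art"
  '(define-key gnus-article-mode-map (kbd "C-c h") 'wash-ugly-html))
```

With this in place, select the article buffer and press C-c h to wash the message being displayed.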

--
// AA#769 ICQ: 1645566 http://www.ichimusai.org/
\X/ ASCII ribbon campaign - No HTML, RTF or MS Word in mail
Morality is doing what is right, no matter what you're told.
Religion is doing what you're told, no matter what is right.
-- Jerry Sturdivant, alt.atheism

Kai Großjohann

Sep 18, 2002, 4:38:02 AM
Kin Cho <k...@neoscale.com> writes:

I use w3m together with emacs-w3m or w3m_el or whatever this package
is called. That seems to do a decent job of displaying the HTML in
an unobtrusive way.

kai
--
~/.signature is: umop 3p!sdn (Frank Nobis)

Jonas Steverud

Sep 18, 2002, 4:57:09 AM
Kin Cho <k...@neoscale.com> writes:

I've added the following during the pGnus days:

(add-to-list 'mm-inline-media-tests '("text/html" nil (lambda (h) nil)))
(add-to-list 'mm-discouraged-alternatives "text/html")
(add-to-list 'mm-discouraged-alternatives "text/richtext")
(add-to-list 'mm-discouraged-alternatives "text/enriched")
(setq mm-automatic-display (remove "text/html" mm-automatic-display))
(setq mm-automatic-display (remove "text/richtext" mm-automatic-display))
(setq mm-automatic-display (remove "text/enriched" mm-automatic-display))

It doesn't handle all the possible broken ways HTML can get to you,
but it helps a bit.

(Improvements that don't involve W3 or similar are welcome.)

--
( www.dtek.chalmers.se/~d4jonas/ ! Wei Wu Wei )
( Meaning of U2 Lyrics, Roleplaying ! To Do Without Do )

D. Goel

Sep 18, 2002, 4:07:32 PM


this shows you the lynxed mail.. this got into my .gnus from an
earlier post in g.e.gnus --->


;; function to call to handle text/html attachments
(defun my:gnus-html2text (handle)
  (let (text)
    (with-temp-buffer
      (mm-insert-part handle)
      (save-window-excursion
        (my:html2text-region (point-min) (point-max))
        (setq text (buffer-string))))
    (mm-insert-inline handle text)))

(defun my:html2text-region (min max)
  "Replace the HTML region from MIN to MAX with the output of lynx -dump."
  (interactive "r")
  (let ((file "~/tmp/email.html"))
    (unwind-protect
        (progn
          (write-region min max file)
          (delete-region min max)
          ;; Run "lynx -dump FILE" and insert the rendered text.
          (insert (shell-command-to-string
                   (concat "lynx -dump "
                           (shell-quote-argument
                            (expand-file-name file))))))
      ;;(delete-file file)
      )))

(And of course, after that, to see the actual html mail in lynx
itself, you can always type "lynx ~/tmp/email.html" on a term once
gnus has put it there.. in fact

alias browseemail='lynx ~/tmp/email.html'
or
alias browseemail='w3m ~/tmp/email.html'
)

Adrian Kubala

Sep 18, 2002, 4:22:12 PM
Kin Cho <k...@neoscale.com> writes:

If you use Oort, you can do something like:
(setq mm-text-html-renderer 'lynx)
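
A slightly more defensive variant of the same setting, as a sketch only: it assumes a Gnus that already has mm-text-html-renderer (Oort or later) and switches to lynx only when a lynx binary is actually found on the PATH. The executable-find check is an addition for illustration and is not part of the suggestion above.

```elisp
;; Sketch: use lynx as the HTML renderer only if it is installed.
;; Otherwise the variable keeps whatever value it already has.
(require 'mm-decode)
(when (executable-find "lynx")
  (setq mm-text-html-renderer 'lynx))
```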

Kin Cho

Sep 19, 2002, 4:28:27 PM
D. Goel <de...@glue.umd.edu> writes:

Hi,

This approach seems to best fit my need. How do you hook
my:gnus-html2text into gnus so that it is run automatically on
html mail?

Thanks.

-kin

D. Goel

Sep 19, 2002, 4:58:01 PM
Kin Cho <k...@neoscale.com> writes:

> This approach seems to best fit my need. How do you hook
> my:gnus-html2text into gnus so that it is run automatically on
> html mail?

oh, forgot to paste that part.. wow, never realized my .gnus has
become so bizarre by now--->

I guess this (dotgnus-remassoc) removes any previous text/html
bindings that mm-inline-media-tests may have. Dunno if this step is
needed. Wasn't there a simpler function for this dotgnus-remassoc?
remove* perhaps?


(defun dotgnus-remassoc (elt list)
  "Return a copy of LIST without the entries whose car is `equal' to ELT."
  (let ((result '()))
    (dolist (e list)
      (unless (and (consp e)
                   (equal elt (car e)))
        (setq result (cons e result))))
    (nreverse result)))

(setq
 ;; use lynx -dump to view inline HTML
 mm-inline-media-tests
 (cons '("text/html" my:gnus-html2text
         (lambda (handle) (fboundp 'my:gnus-html2text)))
       (dotgnus-remassoc "text/html" mm-inline-media-tests)))
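
On the question of a simpler function: one possible sketch uses cl-remove from cl-lib (the modern name for the remove* mentioned above) in place of a hand-written helper. It assumes every element of mm-inline-media-tests is a list, so that car is safe to use as the key, and that my:gnus-html2text is defined as in the earlier post.

```elisp
;; Sketch: register the handler without a custom remassoc helper.
;; Assumes all entries of `mm-inline-media-tests' are lists.
(require 'cl-lib)
(require 'mm-decode)

(setq mm-inline-media-tests
      (cons '("text/html" my:gnus-html2text
              (lambda (handle) (fboundp 'my:gnus-html2text)))
            ;; Non-destructively drop any existing "text/html" entries.
            (cl-remove "text/html" mm-inline-media-tests
                       :key #'car :test #'equal)))
```

cl-remove returns a new list and leaves the original value untouched, which matches what the hand-written helper does.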


DG http://24.197.159.102/~deego/
--

D. Goel

Sep 19, 2002, 5:00:43 PM
D. Goel <de...@glue.umd.edu> writes:
>
> (setq ;; use lynx -dump to view inline HTML mm-inline-media-tests
> (cons '("text/html" my:gnus-html2text (lambda (handle) (fboundp
> 'my:gnus-html2text))) (dotgnus-remassoc "text/html"
> mm-inline-media-tests)) )
>
>

Sorry for the ugly formatting --->

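(setq
 ;; use lynx -dump to view inline HTML
 mm-inline-media-tests
 (cons '("text/html" my:gnus-html2text
         (lambda (handle) (fboundp 'my:gnus-html2text)))
       (dotgnus-remassoc "text/html" mm-inline-media-tests)))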

Nils Goesche

Sep 20, 2002, 11:17:49 AM
D. Goel <de...@glue.umd.edu> writes:

[howto use lynx for HTML MIME parts]

And after a few changes, it works fine for me, too. Thanks!

Here is everything together:


(defun my:gnus-html2text (handle)
  (let (text)
    (with-temp-buffer
      (mm-with-unibyte-buffer
        (mm-insert-part handle)
        (save-window-excursion
          (my:html2text-region (point-min) (point-max))
          (setq text (buffer-string)))))
    (mm-insert-inline handle text)))

(defun my:html2text-region (min max)
  "Replace the HTML region from MIN to MAX with the output of lynx -dump."
  (interactive "r")
  (let ((file "/tmp/email.html"))
    (unwind-protect
        (progn
          (write-region min max file)
          (delete-region min max)
          ;; Run "lynx -dump FILE" and insert the rendered text.
          (insert (shell-command-to-string
                   (concat "lynx -dump "
                           (shell-quote-argument
                            (expand-file-name file))))))
      ;; Always remove the temporary file, even on error.
      (delete-file file))))

(setq mm-inline-media-tests
      (cons '("text/html" my:gnus-html2text
              (lambda (handle)
                (fboundp 'my:gnus-html2text)))
            ;; Drop any existing text/html entry before adding ours.
            (let ((old (assoc "text/html" mm-inline-media-tests)))
              (if old
                  (delete old mm-inline-media-tests)
                mm-inline-media-tests))))

Regards,
--
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x0655CFA0

D. Goel

Sep 20, 2002, 12:45:45 PM


> (let ((file "/tmp/email.html"))

Wanna write out your personal email to /tmp? :)


DG http://24.197.159.102/~deego/
--

Nils Goesche

Sep 20, 2002, 1:59:01 PM
D. Goel <de...@glue.umd.edu> writes:

> > (let ((file "/tmp/email.html"))
>
> Wanna write out your personal email to /tmp? :)

Actually, it was /tmp/nils/email.html, with nils set to 700, but I
thought I'd better remove the ``nils'' part before posting :-)

D. Goel

Sep 24, 2002, 6:06:44 PM

(See Robin Socha's mail in a similar thread pointing to the page:
http://my.gnus.org/Lisp/1012919677 )

[1] I was missing the 'credits' line in my .gnus---apologies for that.
I now see that the code was written by Mark Thomas.

[2] The code has perhaps been improved further on that page.

[3] In any case, the code may now have been rendered obsolete (?) ---
the page says so (by gnus-art.el?)

DG http://24.197.159.102/~deego/
--
