Is There a Way To Remove HTML from Messages?

croy

Jul 16, 2011, 9:09:43 PM
Thunderbird has had the ability to remove attachments from
messages for quite some time (sweet), but how about HTML?

--
croy

Beauregard T. Shagnasty

Jul 16, 2011, 9:26:14 PM
croy wrote:

> Thunderbird has had the ability to remove attachments from
> messages for quite some time (sweet), but how about HTML?

Physically, no. It was sent to you that way.

However, you can choose to View -> Message Body As -> Plain Text

--
-bts
-Four wheels carry the body; two wheels move the soul

Jonathan Kamens

Jul 16, 2011, 11:25:08 PM
croy <ha...@spam.invalid.net> writes:
>Thunderbird has had the ability to remove attachments from
>messages for quite some time (sweet), but how about HTML?

Not yet.

However, if/when my patch in
https://bugzilla.mozilla.org/show_bug.cgi?id=602718 is
approved and integrated into Thunderbird, you'll be able to do
View > Message Body As > All Body Parts and then right-click
on the HTML attachment in the attachments pane and delete it.

It won't be terribly straightforward, since the attachment
will be named something ugly like "Part 1.1.2" and won't say
what its MIME type is, so you'll have to open the attachments
until you find the right one, but it will be possible, albeit
hard.

Which is more than can be said for most email clients, where
it's totally impossible. :-)

Poutnik

Jul 17, 2011, 4:02:05 AM
In article <_rednXSNn6qrp7_T...@mozilla.org>,
a.non...@example.invalid says...

>
> croy wrote:
>
> > Thunderbird has had the ability to remove attachments from
> > messages for quite some time (sweet), but how about HTML?
>
> Physically, no. It was sent to you that way.
>
> However, you can choose to View -> Message Body As -> Plain Text

He may have been thinking of the case where the email body is sent
in two alternative versions, plain text and HTML, and wanting
to keep just the plain one.

--
Poutnik

Greywolf

Jul 17, 2011, 8:16:01 AM


And if the message is entirely in HTML, you'll delete the whole thing.

Sweet?

Wolf K.

Jonathan Kamens

Jul 18, 2011, 9:34:06 AM
Greywolf <wek...@sympatico.ca> writes:
>And if the message is entirely in HTML, you'll delete the whole thing.

If the message is entirely in HTML, then why *would* you
delete the whole thing?

I was talking about someone wanting to remove the HTML part
of a multipart/alternative message. It seems clear to me
that's what the OP was talking about, and what I was saying
in my posting is that once my patch is released, that will be
possible.

Yes, I suppose someone could remove the HTML part of an
HTML-only message if they wanted to do that. But if they did,
it would be them doing it knowingly, and it's their choice.
There's nothing in my code that is going to remove
attachments from people's messages without their knowledge or
consent. GIGO.

Jay Garcia

Jul 18, 2011, 10:23:28 AM
On 18.07.2011 08:34, Jonathan Kamens wrote:

--- Original Message ---

A little late to the party here .. but .. If your preference is to "view
=> message body as => plain text", then what good would this patch be?
And for what reason would a user want to remove a portion of a message
anyway?


--
*Jay Garcia - Netscape Champion*
www.ufaq.org
Netscape - Firefox - SeaMonkey - Thunderbird

Greywolf

Jul 18, 2011, 10:47:40 AM
On 18/07/2011 9:34 AM, Jonathan Kamens wrote:
> Greywolf<wek...@sympatico.ca> writes:
>> And if the message is entirely in HTML, you'll delete the whole thing.
>
> If the message is entirely in HTML, then why *would* you
> delete the whole thing?

In context: "removing HTML" from an HTML message would delete it.
Conversion to plain text would preserve it.

[...]

Wolf K.

Ron Hunter

Jul 18, 2011, 1:41:29 PM

That makes more sense. I would like to have a 'convert to plain text'
feature so that I could save things to text files, rather than the html
'junk' that goes along.

Bob Henson

Jul 18, 2011, 1:54:20 PM
On 17/07/2011 02:09, croy wrote:
> Thunderbird has had the ability to remove attachments from
> messages for quite some time (sweet), but how about HTML?
>
If you use Spampal, it has a plug-in that will strip HTML, or merely
remove that which is dangerous. Thunderbird recognises Spampal too, so
it's quite handy.

--
http://www.galen.org.uk


Forced to choose between two evils - pick the one you haven't tried before!

Jay Garcia

Jul 18, 2011, 2:09:13 PM
On 18.07.2011 12:54, Bob Henson wrote:

--- Original Message ---

> On 17/07/2011 02:09, croy wrote:
>> Thunderbird has had the ability to remove attachments from
>> messages for quite some time (sweet), but how about HTML?
>>
> If you use Spampal, it has a plug-in that will strip HTML, or merely
> remove that which is dangerous. Thunderbird recognises Spampal too, so
> it's quite handy.
>

Wouldn't that be the same as view => message body as => plain text ?? If
so then why install an addon? The end result I believe would be the
same. Stripping bad stuff from an html message OR rendering in plain
text would disable anything harmful I would think.

Jay Garcia

Jul 18, 2011, 2:10:13 PM
On 18.07.2011 12:41, Ron Hunter wrote:

--- Original Message ---

Removing html *tags* renders the message in plain text, doesn't delete
the message.

Bob Henson

Jul 18, 2011, 3:16:17 PM
On 18/07/2011 19:09, Jay Garcia wrote:
> On 18.07.2011 12:54, Bob Henson wrote:
>
> --- Original Message ---
>
>> On 17/07/2011 02:09, croy wrote:
>>> Thunderbird has had the ability to remove attachments from
>>> messages for quite some time (sweet), but how about HTML?
>>>
>> If you use Spampal, it has a plug-in that will strip HTML, or merely
>> remove that which is dangerous. Thunderbird recognises Spampal too, so
>> it's quite handy.
>>
>
> Wouldn't that be the same as view => message body as => plain text ?? If
> so then why install an addon? The end result I believe would be the
> same. Stripping bad stuff from an html message OR rendering in plain
> text would disable anything harmful I would think.
>

That seems sensible to me - but I threw it in as an alternative, since
the OP actually mentioned removing it.

--
http://www.galen.org.uk


You know you're old when you don't mind where your partner goes out to,
so long as they don't ask you along too.

Jay Garcia

Jul 18, 2011, 3:47:08 PM
On 18.07.2011 14:16, Bob Henson wrote:

--- Original Message ---

> On 18/07/2011 19:09, Jay Garcia wrote:
>> On 18.07.2011 12:54, Bob Henson wrote:
>>
>> --- Original Message ---
>>
>>> On 17/07/2011 02:09, croy wrote:
>>>> Thunderbird has had the ability to remove attachments from
>>>> messages for quite some time (sweet), but how about HTML?
>>>>
>>> If you use Spampal, it has a plug-in that will strip HTML, or merely
>>> remove that which is dangerous. Thunderbird recognises Spampal too, so
>>> it's quite handy.
>>>
>>
>> Wouldn't that be the same as view => message body as => plain text ?? If
>> so then why install an addon? The end result I believe would be the
>> same. Stripping bad stuff from an html message OR rendering in plain
>> text would disable anything harmful I would think.
>>
>
> That seems sensible to me - but I threw it in as an alternative, since
> the OP actually mentioned removing it.
>

I think it's possible that the OP, when mentioning removing html,
actually meant stripping the *tags* which basically renders the html
formatting to text.

Tarkus

Jul 18, 2011, 4:15:44 PM
On 7/18/2011 10:54 AM, Bob Henson wrote:
> On 17/07/2011 02:09, croy wrote:
>> > Thunderbird has had the ability to remove attachments from
>> > messages for quite some time (sweet), but how about HTML?
>> >
> If you use Spampal, it has a plug-in that will strip HTML, or merely
> remove that which is dangerous. Thunderbird recognises Spampal too, so
> it's quite handy.

What kind of HTML is dangerous in Thunderbird?

Jonathan Kamens

Jul 18, 2011, 10:35:32 PM
Jay Garcia <J...@JayNOSPAMGarcia.com> writes:
>A little late to the party here .. but .. If your preference is to "view
>=> message body as => plain text", then what good would this patch be?
>And for what reason would a user want to remove a portion of a message
>anyway?

Save disk space and memory when storing and processing
messages. Some people think it's silly to have two copies of
the same content in every email message. They've got a
point... if they have no need for both the text and HTML,
there's no need for them to save both.

Ralph Fox

Jul 19, 2011, 4:27:46 AM
On Mon, 18 Jul 2011 13:10:13 -0500, in message <icSdnSRulO-e6rnT...@mozilla.org>
Jay Garcia wrote:

> Removing html *tags* renders the message in plain text, doesn't delete
> the message.

I have seen some seriously unreadable plain text emails which were
converted from HTML that way -- headings, paragraphs, and table cells,
all jammed together into one long paragraph with no space characters
between different paragraphs or different table cells.

I received another one yesterday :-(


--
Kind regards
Ralph

Ron Hunter

Jul 19, 2011, 4:31:51 AM

It is a terribly wasteful method of serving both plain text and HTML
users. Still, it makes both types of user happy.

croy

Jul 20, 2011, 9:58:58 PM
On Sun, 17 Jul 2011 10:02:05 +0200, Poutnik <m...@privacy.net>
wrote:

Exactly my desire.

--
croy

croy

Jul 20, 2011, 10:00:54 PM

Deleting all but the headers would be my preference. I can
then request the sender to send again, in plain-text.

--
croy

croy

Jul 20, 2011, 10:02:51 PM
On Mon, 18 Jul 2011 13:34:06 +0000 (UTC),
j...@kamens.brookline.ma.us (Jonathan Kamens) wrote:

>Greywolf <wek...@sympatico.ca> writes:
>>And if the message is entirely in HTML, you'll delete the whole thing.
>
>If the message is entirely in HTML, then why *would* you
>delete the whole thing?
>
>I was talking about someone wanting to remove the HTML part
>of a multipart/alternative message. It seems clear to me
>that's what the OP was talking about, and what I was saying
>in my posting is that once my patch is released, that will be
>possible.

Yeah!

>Yes, I suppose someone could remove the HTML part of an
>HTML-only message if they wanted to do that. But if they did,
>it would be them doing it knowingly, and it's their choice.
>There's nothing in my code that is going to remove
>attachments from people's messages without their knowledge or
>consent. GIGO.

If it would leave the headers alone, that would be fine by
me.

--
croy

croy

Jul 20, 2011, 10:03:49 PM

So that I can save a 1kb message, instead of a 12kb message.

--
croy

croy

Jul 20, 2011, 10:04:13 PM

Exactly!

--
croy

croy

Jul 20, 2011, 10:06:10 PM
On Mon, 18 Jul 2011 13:10:13 -0500, Jay Garcia
<J...@JayNOSPAMGarcia.com> wrote:

>On 18.07.2011 12:41, Ron Hunter wrote:
>
> --- Original Message ---
>
>> On 7/18/2011 9:47 AM, Greywolf wrote:
>>> On 18/07/2011 9:34 AM, Jonathan Kamens wrote:
>>>> Greywolf<wek...@sympatico.ca> writes:
>>>>> And if the message is entirely in HTML, you'll delete the whole thing.
>>>>
>>>> If the message is entirely in HTML, then why *would* you
>>>> delete the whole thing?
>>>
>>> In context: "removing HTML" from an HTML message would delete it.
>>> Conversion to plain text would preserve it.
>>>
>>> [...]
>>>
>>> Wolf K.
>>
>> That makes more sense. I would like to have a 'convert to plain text'
>> feature so that I could save things to text files, rather than the html
>> 'junk' that goes along.
>>
>
>Removing html *tags* renders the message in plain text, doesn't delete
>the message.

But if the message contains both to start with, removing the
tags leaves the second part a mess.

--
croy
As if I have any idea what I'm talking about....

croy

Jul 20, 2011, 10:10:52 PM
On Mon, 18 Jul 2011 14:47:08 -0500, Jay Garcia
<J...@JayNOSPAMGarcia.com> wrote:

>On 18.07.2011 14:16, Bob Henson wrote:
>
> --- Original Message ---
>
>> On 18/07/2011 19:09, Jay Garcia wrote:
>>> On 18.07.2011 12:54, Bob Henson wrote:
>>>
>>> --- Original Message ---
>>>
>>>> On 17/07/2011 02:09, croy wrote:
>>>>> Thunderbird has had the ability to remove attachments from
>>>>> messages for quite some time (sweet), but how about HTML?
>>>>>
>>>> If you use Spampal, it has a plug-in that will strip HTML, or merely
>>>> remove that which is dangerous. Thunderbird recognises Spampal too, so
>>>> it's quite handy.
>>>>
>>>
>>> Wouldn't that be the same as view => message body as => plain text ?? If
>>> so then why install an addon? The end result I believe would be the
>>> same. Stripping bad stuff from an html message OR rendering in plain
>>> text would disable anything harmful I would think.
>>>
>>
>> That seems sensible to me - but I threw it in as an alternative, since
>> the OP actually mentioned removing it.
>>
>
>I think it's possible that the OP, when mentioning removing html,
>actually meant stripping the *tags* which basically renders the html
>formatting to text.

Actually, whether or not the message has a plain-text
section, I'd like the HTML section to be deleted.

If I determine from the headers that it is something
important to me, and there's no message body left, then I
would contact the sender and ask them to send again, this
time in plain-text. If they can't manage that, then I'll do
without the message body.

--
croy

gla...@linuxuser.iam

Jul 21, 2011, 12:13:31 AM
On 07/18/2011 06:10 PM, Jay Garcia wrote:
> On 18.07.2011 12:41, Ron Hunter wrote:
<>

>> That makes more sense. I would like to have a 'convert to plain text'
>> feature so that I could save things to text files, rather than the html
>> 'junk' that goes along.
>>
>
> Removing html *tags* renders the message in plain text, doesn't delete
> the message.

excuse my entering on your post. i could not find croy's original post,
possibly because i did not dl enough of this 'news group', and your post
was at top of thread.

i do not know if there are any available for ms os, but in unix and linux,
what croy wishes to do is handled by email 'pre filters'.

pre filters have ability to strip out what you want removed and will
remove or convert html.

being that you and ron are 'ms heads' [not meant derogatorily], i thought
i would mention here as you two may be aware of such for ms os.


peace out.

tc,hago.

g
.

walking the walk.

long live tux.


Ron Hunter

Jul 21, 2011, 4:51:27 AM
I am certainly not familiar enough with either Linux or Apple OSs to
give much help to those users, as you say. I don't know of any such pre
filters, but that is not to say one doesn't exist. There are a LOT of
extensions I have never seen.
I just store the email as I receive it, and unless it is something
business oriented, like orders, receipts, or other things I need to
save, I just delete the message when I have read it. No storage
worries. Frankly, with HDs running into terabyte sizes for under $100,
worrying about a few thousand bytes for each message is just a bit
obsessive. Relax, it's just bytes, and 90% of the messages don't have
any lasting value.


Ralph Fox

Jul 21, 2011, 6:28:57 AM
On Tue, 19 Jul 2011 07:27:49 -0500, in message <5sWdnW87H6PZ5bjT...@mozilla.org>
Jay Garcia wrote:

> On 19.07.2011 03:27, Ralph Fox wrote:
>
> --- Original Message ---
>

> Understand .. but .. you still received the message, that's my point, it
> wasn't deleted because the html was rendered to text.


It was unreadable, and for practical purposes, worthless.

And the discussion bears no relevance to Jonathan Kamens's patch,
which does not automatically delete HTML from incoming messages.


--
Kind regards
Ralph

Jay Garcia

Jul 21, 2011, 7:53:13 AM

--- Original Message ---

Messages viewed as plain text are saved as plain text.

Jay Garcia

Jul 21, 2011, 7:55:10 AM
On 20.07.2011 21:06, croy wrote:

--- Original Message ---

> On Mon, 18 Jul 2011 13:10:13 -0500, Jay Garcia
> <J...@JayNOSPAMGarcia.com> wrote:
>
>>On 18.07.2011 12:41, Ron Hunter wrote:
>>
>> --- Original Message ---
>>
>>> On 7/18/2011 9:47 AM, Greywolf wrote:
>>>> On 18/07/2011 9:34 AM, Jonathan Kamens wrote:
>>>>> Greywolf<wek...@sympatico.ca> writes:
>>>>>> And if the message is entirely in HTML, you'll delete the whole thing.
>>>>>
>>>>> If the message is entirely in HTML, then why *would* you
>>>>> delete the whole thing?
>>>>
>>>> In context: "removing HTML" from an HTML message would delete it.
>>>> Conversion to plain text would preserve it.
>>>>
>>>> [...]
>>>>
>>>> Wolf K.
>>>
>>> That makes more sense. I would like to have a 'convert to plain text'
>>> feature so that I could save things to text files, rather than the html
>>> 'junk' that goes along.
>>>
>>
>>Removing html *tags* renders the message in plain text, doesn't delete
>>the message.
>
> But if the message contains both to start with, removing the
> tags leaves the second part a mess.
>

If your preference is set to View => Message body as "plain text" it
shouldn't be a "mess".

Ron Hunter

Jul 21, 2011, 8:52:58 AM
I suppose a rather complex HTML message would be a bit hard to read in
plain text, but I had no trouble with one I just tested.

James Silverton

Jul 21, 2011, 9:29:12 AM
If you just want to remove HTML once in a while in Windows, you could
copy the whole message and paste with Steve Miller's PureText.

--
Jim Silverton,
Potomac, MD.

gla...@linuxuser.iam

Jul 21, 2011, 12:38:07 PM
On 07/21/2011 08:51 AM, Ron Hunter wrote:
<>

> I am certainly not familiar enough with either Linux or Apple OSs to
> give much help to those users, as you say. I don't know of any such pre
> filters, but that is not to say one doesn't exist. There are a LOT of
> extensions I have never seen.

"pre filters" are _programs_, not extensions.

they are used between isp and email client to process inbound email.

some also have spam/junk/malware trapping ability along with stripping
and converting ability.

not sure if they exist for apple os/x, tho with its similarity to
unix/linux, i would think such may be available.

<>

> Frankly, with HDs running into terabyte sizes for under $100,
> worrying about a few thousand bytes for each message is just a bit
> obsessive. Relax, it's just bytes, and 90% of the messages don't have
> any lasting value.

true, and i agree with your reasoning.

i keep old tsl emails for 2 years for search reasons. a lot faster than
trying to google search or search moz archives.


Ray_Net

Jul 21, 2011, 12:39:27 PM
Let the sender do what he wants ... that's his freedom.
Perhaps you can politely ask the sender to send plain-text mail next time.
But if I, as a sender, receive a request to RE-SEND the same mail,
I will IGNORE the request, and possibly *never* send you anything again.

Ron Hunter

Jul 21, 2011, 1:35:02 PM

If I received ONE such request, I would certainly try to fulfill it.
Then I would set that user's setting for 'plain text only', and forget
it. Your response seems a bit antagonistic toward helping a friend.

croy

Jul 21, 2011, 5:36:42 PM
On Thu, 21 Jul 2011 06:53:13 -0500, Jay Garcia
<J...@JayNOSPAMGarcia.com> wrote:

Are you talking about saving outside of Tb (as a computer
file), or within Tb, as a message in a mail folder?

--
croy

croy

Jul 21, 2011, 5:40:02 PM
On Thu, 21 Jul 2011 06:55:10 -0500, Jay Garcia
<J...@JayNOSPAMGarcia.com> wrote:

My goal is to be able to save the message, within the Tb
eMail "folder" structure, as small as possible.

--
croy

Jay Garcia

Jul 21, 2011, 6:04:27 PM

--- Original Message ---

In TB .. File => Save as => file


--
Jay Garcia - Netscape / Flock Champion
Netscape - Firefox - Flock - Thunderbird Support
UFAQ - http://www.UFAQ.org

Jay Garcia

Jul 21, 2011, 6:06:22 PM

--- Original Message ---

If your view pref is set to view => Message body as => Plain Text try
moving the message to one of your other folders or archive it and see if
the message is still formatted as html or does it retain the plain text
format.

Jonathan Kamens

Jul 21, 2011, 9:56:27 PM
Jay Garcia <J...@JayNOSPAMGarcia.com> writes:
>If your view pref is set to view => Message body as => Plain Text try
>moving the message to one of your other folders or archive it and see if
>the message is still formatted as html or does it retain the plain text
>format.

It would be nice if people who don't know what they're talking
about would refrain from offering incorrect advice.

View > Message Body As > Plain Text does nothing to alter the
actual source of the message. It just changes how it is
displayed in the Thunderbird window. That's why it's in the
"View" menu.

If you move or archive a message, it retains its original
source code unchanged, regardless of how you're viewing it
when you move it.

Perhaps you should actually try this stuff yourself before
posting useless, incorrect advice to the entire world.

Jonathan Kamens

Jul 21, 2011, 9:54:21 PM
croy <ha...@spam.invalid.net> writes:
>But if the message contains both to start with, removing the
>tags leaves the second part a mess.

People are really not getting it.

The OP isn't talking about removing tags.

He's talking about, when he receives a multipart/alternative
message that has both a text/plain part and a text/html part
which have essentially the same content, he wants to
completely remove the text/html part from the message so that
only the text/plain part is left.

My patch to enable this was approved, and it will be in
Thunderbird 8 or 9 (i.e., available within a few months).

And I was slightly wrong when I said it would be difficult
to tell which part is the HTML part... There are actually
little icons next to the attachments showing their MIME
types, so if you look closely you can see which one has the
icon for an HTML file and remove that one.
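
For anyone who wants to script this outside Thunderbird, the operation described above (dropping the text/html part of a multipart/alternative message and keeping text/plain) can be sketched with Python's standard email package. This is only an illustration under stated assumptions, not the code from the patch: the names drop_html_alternative and build_sample are made up for the example, and a container is left alone unless it also holds a text/plain part.

```python
from email.message import EmailMessage


def drop_html_alternative(msg):
    """Remove text/html parts that sit next to a text/plain alternative.

    Returns the number of parts removed. HTML-only messages are not touched.
    """
    removed = 0
    for part in msg.walk():
        if part.get_content_type() != "multipart/alternative":
            continue
        if not part.is_multipart():
            continue  # malformed message: header says multipart, body is not
        children = part.get_payload()  # list of sub-parts for a multipart
        types = [child.get_content_type() for child in children]
        if "text/plain" not in types:
            continue  # no plain-text fallback, so keep the HTML
        kept = [c for c in children if c.get_content_type() != "text/html"]
        removed += len(children) - len(kept)
        part.set_payload(kept)
    return removed


def build_sample():
    """Hypothetical two-alternative message, used only for this demo."""
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("hello")
    msg.add_alternative("<p>hello</p>", subtype="html")
    return msg


sample = build_sample()
size_before = len(sample.as_bytes())
removed_parts = drop_html_alternative(sample)
size_after = len(sample.as_bytes())
print(removed_parts, size_before, size_after)
```

Note that the sketch leaves a multipart/alternative container with a single text/plain child. That is still valid MIME, just not the smallest possible result, and an HTML part wrapped in multipart/related (HTML with inline images) is not handled here.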

Jonathan Kamens

Jul 21, 2011, 9:58:50 PM
gla...@linuxuser.iam writes:
>i do not know if there are any available for ms os, but in unix and linux,
>what croy wishes to do is handled by email 'pre filters'.

Indeed, at one point I used to have a filter in my
.procmailrc file which looked at every incoming message for
multipart/alternative with both text/plain and text/html, and
if it found one, it replaced the multipart/alternative with
just the text/plain and got rid of the text/html entirely.
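
As a rough idea of what such a filter does, here is a minimal Python sketch that replaces a multipart/alternative container with its text/plain part. It is not the original procmail recipe, which is not shown in this thread; the names collapse_alternatives, filter_message and main are invented for the example, and the sketch assumes it only acts when the container holds nothing but text/plain and text/html parts.

```python
import sys
from email import message_from_bytes

# Headers that describe the body and must follow the part we keep.
BODY_HEADERS = ("Content-Type", "Content-Transfer-Encoding", "Content-Disposition")


def collapse_alternatives(msg):
    """Replace plain+HTML multipart/alternative containers with the plain part."""
    if not msg.is_multipart():
        return
    children = msg.get_payload()
    types = [child.get_content_type() for child in children]
    if (msg.get_content_type() == "multipart/alternative"
            and "text/plain" in types and "text/html" in types
            and set(types) <= {"text/plain", "text/html"}):
        plain = children[types.index("text/plain")]
        for name in BODY_HEADERS:
            del msg[name]  # deleting a missing header is a no-op
            if plain[name] is not None:
                msg[name] = plain[name]
        msg.set_payload(plain.get_payload())  # raw, still-encoded body text
        return
    for child in children:
        collapse_alternatives(child)  # look inside multipart/mixed and friends


def filter_message(raw):
    """Take one raw message as bytes and return the filtered bytes."""
    msg = message_from_bytes(raw)
    collapse_alternatives(msg)
    return msg.as_bytes()


def main():
    """Entry point for use as a pipe: message on stdin, result on stdout."""
    sys.stdout.buffer.write(filter_message(sys.stdin.buffer.read()))
```

Nothing is run on import; main() would be called from a small script that the delivery pipeline pipes each incoming message through. Messages that are HTML-only, or whose alternatives include anything beyond text/plain and text/html, pass through with their structure intact.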

I stopped doing that when I stopped using GNU Emacs to read
my email. :-) It took me a lot of years, but I did finally
come to grips with the fact that sometimes the HTML
formatting actually has useful content in it.

Jonathan Kamens

Jul 21, 2011, 9:52:08 PM

Yes, well, that's not what the OP is talking about. He's
talking about reducing the size of his mail stored in his
mailbox folders by throwing away the HTML that he doesn't want
or need.

Jay Garcia

Jul 21, 2011, 10:16:03 PM
On 21.07.2011 20:52, Jonathan Kamens wrote:

--- Original Message ---

Yes, we understand all that but what IF the received message was
sent/received in HTML only, that's where I thought we were going with
this. You can't remove 100% of the html because you end up with ZERO.

Now, if the OP is receiving every message sent as both then we need to
find a workable solution. Perhaps your extension is the answer in that case.

Jay Garcia

Jul 21, 2011, 10:18:21 PM
On 21.07.2011 20:56, Jonathan Kamens wrote:

--- Original Message ---

If you re-read what I wrote I said TRY. Asking a user to try something
isn't incorrect advice. We appreciate your advice and expertise, just
don't patronize us for trying, ok?

Jonathan Kamens

Jul 21, 2011, 10:11:36 PM
j...@kamens.brookline.ma.us (Jonathan Kamens) writes:
>croy <ha...@spam.invalid.net> writes:
>>But if the message contains both to start with, removing the
>>tags leaves the second part a mess.
>
>People are really not getting it.
>
>The OP isn't talking about removing tags...

Heh. I just noticed I posted this in response to the OP, who,
like me, was trying to get other people to understand what he
was talking about. D'oh!

Ron Hunter

Jul 22, 2011, 3:32:27 AM

Perhaps posting a RFE for a 'store without HTML', and a 'store without
plain text' option that would strip them automatically would help,
eventually.


John H Meyers

Jul 23, 2011, 8:06:48 AM
On 7/16/2011 10:25 PM, Jonathan Kamens wrote:

> croy <ha...@spam.invalid.net> writes:
>> Thunderbird has had the ability to remove attachments from
>> messages for quite some time (sweet), but how about HTML?
>
> Not yet.
>
> However, if/when my patch in
> https://bugzilla.mozilla.org/show_bug.cgi?id=602718 is
> approved and integrated into Thunderbird, you'll be able to do
> View > Message Body As > All Body Parts and then right-click
> on the HTML attachment in the attachments pane and delete it.
>
> It won't be terribly straightforward, since the attachment
> will be named something ugly like "Part 1.1.2" and won't say
> what its MIME type is, so you'll have to open the attachments
> until you find the right one, but it will be possible, albeit
> hard.
>
> Which is more than can be said for most email clients, where
> it's totally impossible. :-)

Eudora has always had this ability, via these two features:

o Ability to edit incoming messages ("pencil edit" tool).
o "Remove formatting" tool button while editing.

This works perfectly well, by the way,
even on "HTML-only" messages,
because it filters the HTML message part,
and does not care about any plain text part
(which in fact it has already discarded).

There is, however, a drawback to this method of filtering HTML,
which is that if you have links whose visible text differs
from the actual URL, you are left with the visible text,
and no trace of the URL. This won't be any problem
with your approach to simply delete the HTML part,
provided that the message also has a plain text part,
and that any actual URLs are included in that plain text part.
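
The link problem described above can be softened by writing the URL next to the visible text while flattening. The following is a minimal sketch using Python's standard html.parser module; it says nothing about how Eudora works internally, and the class and function names (LinkKeepingExtractor, flatten_links) are invented for the example.

```python
from html.parser import HTMLParser


class LinkKeepingExtractor(HTMLParser):
    """Drop all tags, but keep each link's URL beside its visible text."""

    def __init__(self):
        super().__init__()
        self.out = []          # pieces of the flattened text
        self._href = None      # URL of the link currently open, if any
        self._link_text = []   # visible text collected inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Treat a missing or empty href as "not a real link".
            self._href = dict(attrs).get("href") or None
            self._link_text = []

    def handle_data(self, data):
        if self._href is not None:
            self._link_text.append(data)
        else:
            self.out.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._link_text).strip()
            if text and text != self._href:
                self.out.append(f"{text} <{self._href}>")
            else:
                self.out.append(self._href)  # text was empty or the URL itself
            self._href = None


def flatten_links(html):
    """Return the text of an HTML fragment with link targets preserved."""
    parser = LinkKeepingExtractor()
    parser.feed(html)
    parser.close()
    if parser._href is not None:
        parser.handle_endtag("a")  # flush a link that was never closed
    return "".join(parser.out)


print(flatten_links('See <a href="http://example.com/x">click here</a> now.'))
```

With this approach a "click here" link keeps its target in angle brackets after the text, at the cost of longer lines.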

Ultimately there is no fully satisfactory solution to the general desire
to "remove html," any more than you can still be satisfied by
food that has had all fat removed or all sugars replaced,
because if there was any really complex original flavor,
it will have been destroyed.

However, flavorless pure text (or text that's merely been colored
or indented or given a font or simple effects) will survive,
and this is the only really good candidate for "removing html."

Even then, if coloring has been integral to showing document
changes between versions, for example, then to suppress it
destroys the value of the message, so no unthinking process
can always properly discriminate when you should or should not
filter what you receive.

I did not notice whether anyone has produced an extension
that would be capable of filtering HTML down to plain text,
when an original plain text part is missing. Does Thunderbird
already contain some such partial ability, inasmuch as you can
already choose to view only "simple html"?

Outside of Thunderbird, here's one (Windows) utility
for stripping HTML and leaving as text:
<http://www.nirsoft.net/utils/htmlastext.html>

Inside of Thunderbird, I wonder whether "Attachment Extractor" add-on
might consider adding a feature to remove HTML parts as well as
attachments, but only when a plain text part also exists?
(I'd think that this would be extremely well appreciated :)

My last two sentences can't help but remind me of this:

"Outside of a dog, a book is man's best friend.
Inside of a dog it's too dark to read." [Groucho Marx]

--

Ron Hunter

Jul 23, 2011, 9:40:21 AM
The trouble with removing HTML tags, etc., from an HTML message is they
often have CONTENT in the images which would be lost if only the text
part were retained. This can make a message all but useless, especially
in the case of commercial messages.
Extracting the text part of a multipart/alternative message is pretty simple
by comparison, and retains all the information content.

John H Meyers

Jul 23, 2011, 10:30:10 AM
On 7/23/2011 8:40 AM, Ron Hunter wrote:

> The trouble with removing HTML tags, etc., from an HTML message
> is they often have CONTENT in the images which would be lost
> if only the text part were retained.

Yes, much as URLs are lost in links when only "click here"
is retained, but I've seen really smart emailers
send links to either images or "click here to view this
email on the web," which bypasses the entire issue.

One of the smartest things I've seen comes from our
email scanning appliance, which sends a daily
"Quarantine digest" to everyone who has had any messages
quarantined -- this digest comes in two completely different formats,
one being an HTML part with brief links in every item,
and a plaintext part where every item is several lines long,
every line being the full URL of one of the shorter HTML links.

You can hardly believe that it's the same report,
but either one of those "parts" is complete and fully functional,
including to any Thunderbird users who view only plain text parts.

Only one problem: If I first set "view" to "plain text" and get a note
that the message is truncated because it was larger than my threshold,
then clicking the link to download the full message is doing nothing.
I haven't time to explore this further, but it's just another indication
of unexpected unreliability, which biases me against using this program
for my real-life work -- which I never do.

--

croy

Jul 23, 2011, 1:29:52 PM

Exactly!

--
croy

croy

Jul 23, 2011, 1:40:18 PM
On Thu, 21 Jul 2011 21:16:03 -0500, Jay Garcia
<J...@JayNOSPAMGarcia.com> wrote:

>On 21.07.2011 20:52, Jonathan Kamens wrote:
>
> --- Original Message ---
>
>> Jay Garcia <J...@JayNOSPAMGarcia.com> writes:
>>>On 21.07.2011 16:36, croy wrote:
>>>> Are you talking about saving outside of Tb (as a computer
>>>> file), or within Tb, as a message in a mail folder?
>>>In TB .. File => Save as => file
>>
>> Yes, well, that's not what the OP is talking about. He's
>> talking about reducing the size of his mail stored in his
>> mailbox folders by throwing away the HTML that he doesn't want
>> or need.
>
>Yes, we understand all that but what IF the received message was
>sent/received in HTML only, that's where I thought we were going with
>this. You can't remove 100% of the html because you end up with ZERO.

But you would still have the headers, right?

>Now, if the OP is receiving every message sent as both then we need to
>find a workable solution. Perhaps your extension is the answer in that case.

I need to save every message I get, within the eMail folder
hierarchy. Whether the body is sent in both plain-text and
HTML, or just HTML, I want the HTML gone. If that leaves me
with only the headers, fine. At least I will know where it
came from, when, and what it was titled, and that is all I
need.

I'm getting the impression that you are concerned with the
legal or forensic aspects of altering eMail messages. I am
not the least bit concerned about that. It's my computer,
and I want to save the messages in a certain way. After
all, HTML or plain-text, it's *all* just text, that any
6th-grader (or forensic "expert") could hack into and alter
any way desired.

--
croy

croy

Jul 23, 2011, 1:43:41 PM
On Fri, 22 Jul 2011 01:54:21 +0000 (UTC),
j...@kamens.brookline.ma.us (Jonathan Kamens) wrote:

>croy <ha...@spam.invalid.net> writes:
>>But if the message contains both to start with, removing the
>>tags leaves the second part a mess.
>
>People are really not getting it.
>
>The OP isn't talking about removing tags.
>
>He's talking about, when he receives a multipart/alternative
>message that has both a text/plain part and a text/html part
>which have essentially the same content, he wants to
>completely remove the text/html part from the message so that
>only the text/plain part is left.
>
>My patch to enable this was approved, and it will be in
>Thunderbird 8 or 9 (i.e., available within a few months).

Yippee!

>And I was slightly wrong when I said it would be difficult
>to tell which part is the HTML part... There are actually
>little icons next to the attachments showing their MIME
>types, so if you look closely you can see which one has the
>icon for an HTML file and remove that one.

Sweet!

--
croy

Greywolf

Jul 23, 2011, 3:41:26 PM
On 23/07/2011 10:30 AM, John H Meyers wrote:
[....]

> Only one problem: If I first set "view" to "plain text" and get a note
> that the message is truncated because it was larger than my threshold,[....]

A plain text message larger than your threshold? Sounds mildly weird to
me. Just how low is your threshold? Did you set it, or is it company policy?

Wolf K.

Jay Garcia

Jul 23, 2011, 5:24:50 PM

--- Original Message ---

I never mentioned anything about legality or forensics, so where you got
that impression is a mystery.

First of all, stripping the html out of a message is not a feature or a
function of TB natively. Looks to me like that extension may be your
best solution.

A more kludgy solution would be to "edit message as new" and possibly
format it as text rather than as html. Don't know if this will work, I'm
out of town at the moment and on my laptop. I may be able to test this
in a little bit nonetheless. I'll try it by sending myself an HTML-only
message and see what happens.

Cy Burnot

Jul 26, 2011, 8:34:00 PM
Greywolf has written on 7/18/2011 10:47 AM:

> On 18/07/2011 9:34 AM, Jonathan Kamens wrote:
>> Greywolf<wek...@sympatico.ca> writes:
>>> And if the message is entirely in HTML, you'll delete the whole thing.
>>
>> If the message is entirely in HTML, then why *would* you
>> delete the whole thing?
>
> In context: "removing HTML" from an HTML message would delete it.
> Conversion to plain text would preserve it.


How about "removing HTML tags"?

Cy Burnot

Jul 26, 2011, 8:35:04 PM
Ralph Fox has written on 7/19/2011 4:27 AM:

> On Mon, 18 Jul 2011 13:10:13 -0500, in message <icSdnSRulO-e6rnT...@mozilla.org>
> Jay Garcia wrote:
>
>> Removing html *tags* renders the message in plain text, doesn't delete
>> the message.
>
> I have seen some seriously unreadable plain text emails which were
> converted from HTML that way -- headings, paragraphs, and table cells,
> all jammed together into one long paragraph with no space characters
> between different paragraphs or different table cells.
>
> I received another one yesterday :-(

Change all <br>s and <p>s to newlines. Delete other tags.
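
A minimal sketch of that rule in Python, using the standard html.parser module. Beyond <br> and <p>, it also breaks lines at headings, divs, list items and table rows, and puts a space between table cells; those extras are an assumption added here to address the "jammed together" complaint, and the names TextExtractor and html_to_text are made up for the example.

```python
from html.parser import HTMLParser

# Tags that start or end a line in the plain-text output.
BREAK_TAGS = {"br", "p", "div", "tr", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose content is not meant to be read as text.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Turn break-like tags into newlines and drop every other tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BREAK_TAGS:
            self.chunks.append("\n")
        elif tag in ("td", "th"):
            self.chunks.append(" ")  # keep adjacent table cells apart

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BREAK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return readable plain text for an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


print(html_to_text("<h1>Title</h1><p>One</p><p>Two<br>Three</p>"))
```

Character references such as &amp; are decoded by the parser, so they come out as ordinary characters. Content carried only by images or link targets is still lost, as noted elsewhere in the thread.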

Ron Hunter

Jul 26, 2011, 9:33:06 PM
Just removing the HTML tags will still leave some problems. For one,
formatting, and for another, data that is necessary to make the page
useful may be included in the external image data in those tags.
It really isn't a satisfying message display. That is, basically, what
'display as plain text' does.
Depending on the message, it may be satisfactory, or illegible.
