Newsgroups: comp.lang.awk
From: dave.gma+news...@googlemail.com.invalid (Dave Gibson)
Date: Thu, 20 Sep 2012 14:40:04 +0100
Local: Thurs, Sep 20 2012 9:40 am
Subject: MIME-encoded messages in a digest (was: Re: Deleting unwanted message headers from saved email)
Steve Hayes <hayes...@telkomsa.net> wrote:
> On Wed, 19 Sep 2012 12:56:13 +0100, dave.gma+news...@googlemail.com.invalid
> (Dave Gibson) wrote:
>>Steve Hayes <hayes...@telkomsa.net> wrote:

>>The gibberish is the message body encoded as base64 -- it's not

> I've just been checking some of the messages I've been trying to save.

> These ones are hard to read and save:

> Content-Type: text/plain; charset="utf-8"

> These are not quite as hard to read or save, but still cause some

> Content-Type: text/plain; charset=utf-8

> These ones are easy to read and save:

> Content-Type: text/plain; charset="us-ascii"

> The ones that are hardest to read and save appear to be produced

They're standard MIME encodings intended to prevent message data being
corrupted in transit. <http://tools.ietf.org/html/rfc2045>

Your mail user agent should be able to convert them to local format

Have a look at these:

<http://www.convertstring.com/EncodeDecode/Base64Decode>
<http://www.fourmilab.ch/webtools/base64/>

> Perhaps one could tell awk to delete such messages.

Anyway, assuming messages are in a digest, separated by lines containing
#v+
}
b64 { next }
!body && /^$/ {
}
body { print ; next }
/^[Cc][Oo][Nn][Tt][Ee][Nn][Tt]-[Tt][Rr][Aa][Nn][Ss][Ff][Ee][Rr]-[Ee][Nn][Cc][Oo][Dd][Ii][Nn][Gg]: [Bb][Aa][Ss][Ee]64/ {
}
{ header[++hlines] = $0 }
----script ends on previous line
#v-

> Would it also
> be able to convert "quoted printable" into something more readable?

Perl has modules for dealing with various mail formats so may well be
better suited to your requirements.

#v+
}
/^-- End --/ { qp = 0 ; body = 0 }
/^$/ { body = 1 }
!body && /^[Cc][Oo][Nn][Tt][Ee][Nn][Tt]-[Tt][Rr][Aa][Nn][Ss][Ff][Ee][Rr]-[Ee][Nn][Cc][Oo][Dd][Ii][Nn][Gg]: [Qq][Uu][Oo][Tt][Ee][Dd]-[Pp][Rr][Ii][Nn][Tt][Aa][Bb][Ll][Ee]/ {
}
body && qp && /=/ {
  s = $0
  # Brackets '[', ']' on next line contain a space and a tab
  u = sub(/=[ 	]*$/, "", s)
  t = ""
  while (match(s, /=[0-9A-F][0-9A-F]/)) {
    t = t substr(s, 1, RSTART - 1) \
        ch[hex[substr(s, RSTART + 1, 1)] * 16 + hex[substr(s, RSTART + 2, 1)]]
    s = substr(s, RSTART + RLENGTH)
  }
  $0 = t s
  if (u) {
    printf "%s", $0
    next
  }
}
{ print }
----script ends on previous line
#v-
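
Parts of the quoted-printable script above did not survive in this copy: the setup of the hex[] and ch[] lookup tables and the body of the header rule are missing. The following is only a sketch of how the whole thing might fit together, not the script as posted. The BEGIN block, the "qp = 1" action, the use of tolower() in place of the bracketed character classes, the "\t" escape in place of a literal tab, and the qp_decode wrapper name are all assumptions. LC_ALL=C is set so that printf's %c yields single bytes in awks that are locale-aware.

```sh
#!/bin/sh
# qp_decode (hypothetical name): read a digest on stdin and write it to
# stdout with quoted-printable message bodies decoded.
qp_decode() {
  LC_ALL=C awk '
  BEGIN {
    # Assumed table setup: hex[] maps a hex digit to its value,
    # ch[] maps a byte value to the corresponding character.
    for (i = 0; i < 16; i++) hex[substr("0123456789ABCDEF", i + 1, 1)] = i
    for (i = 0; i < 256; i++) ch[i] = sprintf("%c", i)
  }
  # Message separator: back to header mode, forget the encoding.
  /^-- End --/ { qp = 0 ; body = 0 }
  # First empty line ends the headers.
  /^$/ { body = 1 }
  # Assumed action: remember that this message is quoted-printable.
  !body && tolower($0) ~ /^content-transfer-encoding: quoted-printable/ {
    qp = 1
  }
  body && qp && /=/ {
    s = $0
    # A trailing "=" (plus optional blanks) is a soft line break.
    u = sub(/=[ \t]*$/, "", s)
    t = ""
    # Replace each =XX escape with the byte it stands for.
    while (match(s, /=[0-9A-F][0-9A-F]/)) {
      t = t substr(s, 1, RSTART - 1) \
          ch[hex[substr(s, RSTART + 1, 1)] * 16 + hex[substr(s, RSTART + 2, 1)]]
      s = substr(s, RSTART + RLENGTH)
    }
    $0 = t s
    # Soft break: print without a newline so the next line is joined on.
    if (u) {
      printf "%s", $0
      next
    }
  }
  { print }
  '
}
```

Something like "qp_decode < digest.txt > decoded.txt" would run it, where both file names are placeholders. Headers and separator lines pass through unchanged; only body lines of messages marked quoted-printable are rewritten.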