Yes, but there are better tools, specifically MIME-aware tools. Or even perl.
A grep for lines that are exactly 76 characters long and contain only base64 characters (letters, digits, + and /) would match the actual encoded attachments, but grabbing the MIME boundaries is trickier.
^[A-Za-z0-9+/]{76}$
This will match all the lines in the attachment except the last one, which is usually shorter and may end in = padding.
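For anyone who would rather run this outside BBEdit, here is a minimal Python sketch of the same line filter. The function name and the sample text are made up for illustration. It only drops full-length encoded lines, so the short final line and the boundary lines are left behind, and any unrelated 76-character line made purely of these characters would be dropped too.

```python
import re

# Exactly 76 characters drawn from the base64 alphabet. Uppercase is spelled
# out because Python's re module is case-sensitive by default.
FULL_B64_LINE = re.compile(r"^[A-Za-z0-9+/]{76}$")


def strip_full_base64_lines(text: str) -> str:
    """Return text with every full-length (76-character) base64 line removed."""
    kept = [line for line in text.splitlines() if not FULL_B64_LINE.match(line)]
    return "\n".join(kept)


# Hypothetical sample: one header, one full encoded line, the short last
# line of the attachment, and a boundary line.
sample = "\n".join([
    "Content-Transfer-Encoding: base64",
    "QUJD" * 19,                  # 76 characters: removed
    "QUJDRA==",                   # short, padded last line: left behind
    "--Apple-Mail-2-649664677",   # boundary line: left behind
])

result = strip_full_base64_lines(sample)
print(result)
```

The leftover short line is the reason a second cleanup step is still needed after this pass.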
> I suppose the solution is something like this: many of the text
> strings begin with the same set of three or four characters. I want
> the BBEdit script or command to search for these sets, then delete
> them and everything following until it reaches the string
> "--Apple-Mail."
That way madness lies. Either get a MIME-aware tool that will strip the MIME attachments from the mbox file, or simply strip the encoded lines and don't worry about the boundary lines (or at least deal with those in a second step).
Your attachment will look something like this:
--Apple-Mail-2-649664677
Content-Disposition: inline;
	filename*=iso-8859-1''GMT%A0Receipt.pdf
Content-Type: application/pdf;
	name="=?iso-8859-1?Q?GMT=A0Receipt.pdf?="
Content-Transfer-Encoding: base64
JVZERi0xLjMKJcTl8uXrp/Og0MTGCjQgMCBvYmoKPDwgL0xlbmd0aCA1IDAgUiAvRmlsdGVyIC9G
bGF0ZURlY29kZSA+PgpddHJlYW0KeAHNW1lz3LgRfsevQLn8QLkyIx7g5afIWtlxsutL46Qq2TzI
knVsNBrZI23if5+vG2gQIClrOONUxXYNSRBEd3994vAX/V5/0Vmj23peFZXRZV7Ym6qu5gWa9NfP
+m/6Ru8frjN9utYZ/12ffvcjhY/Ox9uDhj3X+68w6MVap/MqLdo8q8buFBFr56Y0ZalNwXeVXuo8
BUT, that --Apple-Mail line will appear multiple times in the email (in one message with a single attachment, "--Apple-Mail-" appeared 8 times), so you cannot just willy-nilly delete everything up to one of those lines.
--
Elves are wonderful. They provoke wonder. Elves are marvellous. They
cause marvels. Elves are fantastic. They create fantasies. Elves are
glamorous. They project glamour. Elves are enchanting. They weave
enchantment. Elves are terrific. They beget terror.
I agree with LuKreme that you probably don't want to start mucking with
the mbox files without a MIME-aware tool, lest you risk corrupting the
mbox. I would not even attempt this without perl and MIME::Tools (or an
equivalent).
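As a rough idea of what "MIME-aware" buys you, here is a sketch using Python's standard email and mailbox modules rather than perl and MIME::Tools (a swapped-in toolset, not the one named above). The function names and the placeholder text are my own, and treating every part that carries a filename as an attachment is an assumption that will not hold for every message. The library parses the structure and writes the boundaries back out, so nothing is deleted by pattern matching.

```python
import email
import mailbox
from email.message import EmailMessage, Message

PLACEHOLDER = "[attachment removed]\n"


def strip_attachments(msg: Message) -> int:
    """Blank out every part that carries a filename; return how many were hit."""
    removed = 0
    for part in msg.walk():
        # Multipart containers only hold other parts; leave them alone.
        if part.is_multipart():
            continue
        # Assumption: a part without a filename is ordinary body text.
        if part.get_filename() is None:
            continue
        part.set_payload(PLACEHOLDER)
        del part["Content-Transfer-Encoding"]  # payload is no longer base64
        part.set_type("text/plain")
        removed += 1
    return removed


def strip_mbox(path: str) -> int:
    """Rewrite the mbox at `path` in place. Run this on a copy, not the original."""
    total = 0
    box = mailbox.mbox(path)
    try:
        for key, msg in box.items():
            count = strip_attachments(msg)
            if count:
                box[key] = msg
                total += count
        box.flush()
    finally:
        box.close()
    return total


# Demo on an in-memory message, so no files are touched.
demo = EmailMessage()
demo["Subject"] = "Receipt"
demo.set_content("Legible body text.")
demo.add_attachment(b"\x00" * 300, maintype="application",
                    subtype="pdf", filename="receipt.pdf")
parsed = email.message_from_bytes(demo.as_bytes())
removed = strip_attachments(parsed)
cleaned = parsed.as_string()
print(removed)
```

To try it on an archive, call strip_mbox on a duplicate of the mbox file and check the result in a mail client before throwing away the original.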
It might be simpler for you to change the Prefs in Apple Mail so that
the "Keep copies of messages for offline viewing" option (under
Accounts->Advanced) is set to "All messages, but omit attachments"
Good Luck
Matt
>Hello:
>
>Just joined the group. I hope someone here can help.
>The problem: when I archive my e-mail Inbox (Apple Mail), images and
>graphics are saved as enormously long, unintelligible strings of
>alphanumeric characters. I want to keep the archived "mbox" text files
>but remove these big blocks of text, to reduce the files' sizes.
It might be a lot easier to remove the attachments before archiving.
In Mail create a new smart folder with the rule "Contains
Attachments". You could restrict it to searching the folders you
intend to archive if needed. Then go into that smart folder, select
all the emails, go to the "Messages" menu and select "Remove
Attachments". Done.
Then archive them.
The trouble is, those MIME lines are BOUNDARIES, so they exist at the beginning and end of each MIME part. Also, the message is marked as multipart. If you simply delete the content of the MIME part and the ending boundary, most programs will no longer be able to read the message properly.
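To make the boundary layout concrete, here is a small Python sketch that builds the simplest possible case: one text body plus one attachment. The boundary label is made up so the output is easy to read. Even here there are two part delimiters and one closing delimiter (the one ending in "--"), plus the boundary named in the Content-Type header. Real Apple Mail messages tend to nest multipart sections, so expect more.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "One attachment"
msg.set_content("Legible body text.")
msg.add_attachment(b"not really a PDF", maintype="application",
                   subtype="pdf", filename="receipt.pdf")

# Made-up boundary label, chosen only so the output is easy to read.
msg.set_boundary("Apple-Mail-demo-0001")

lines = msg.as_string().splitlines()

# A delimiter line opens each part; the closing delimiter has a trailing "--".
openers = [n for n, line in enumerate(lines) if line == "--Apple-Mail-demo-0001"]
closers = [n for n, line in enumerate(lines) if line == "--Apple-Mail-demo-0001--"]

print("part delimiters:", len(openers))
print("closing delimiters:", len(closers))
```

Deleting from the attachment's first line through the closing delimiter would take the "--" terminator with it, which is the breakage described above.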
I don't have specific recommendations as the tools to do this sort of manipulation on messages are 1) command-line tools or libraries 2) tricksy 3) dangerous.
I would start over by thinking about exactly what problem you're trying to solve (personally, I don't want to keep emails without keeping all their contents, but that's not to say that others might not feel differently).
Just as an example, if your actual need is an mbox of just plain-text emails without HTML, attachments, or any 'extraneous' data, then I would pipe the mbox through formail -s procmail and call a simple procmail recipe that calls the command-line web browser links (or lynx) with the -dump option. I used to do this automatically for HTML email 15 years ago or so.
$ links -dump www.google.com
_________________________________________
_________________________________________
_________________________________________
_________________________________________
_________________________________________
_________________________________________
_________________________________________
Web Images Videos Maps News Shopping Gmail more >>
iGoogle | Settings | Sign in
Google
__________________________________________________________ Advanced
[ Google Search ] [ I'm Feeling Lucky ] SearchLanguage
Tools
Advertising ProgramsBusiness SolutionsAbout Google
(c) 2011 - Privacy
There is MIMEdefang, which is a tool designed to work with sendmail (or sendmail replacements that support milters, like postfix), and also demime, which may or may not help. But as I said, these are low-level tools designed to be used by people who really REALLY know what they are doing, and I don't recommend them. And they are beyond the scope of this list.
Short answer: other than writing a perl script that you execute from within BBEdit, anything other than simply deleting the data lines is likely to screw up the mbox file. Deleting the data lines should not alter the messages other than to remove all encoded content. Be aware that some emails will ONLY be encoded content, however. It is possible you will lose the entire body of the message doing this, depending on how the messages were encoded.
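One way to find out in advance which messages are at risk is to check whether a message has any text part that is not base64-encoded. This is only a sketch in Python with the standard email module. The function name and the two sample messages are invented, and it assumes a missing Content-Transfer-Encoding header means plain 7bit text.

```python
import email
from email.message import Message


def has_plain_body(msg: Message) -> bool:
    """True if at least one text part is stored without base64 encoding."""
    for part in msg.walk():
        if part.is_multipart():
            continue
        # Assumption: no Content-Transfer-Encoding header means 7bit text.
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        if part.get_content_maintype() == "text" and cte != "base64":
            return True
    return False


# Invented sample: the whole body is base64, so stripping encoded lines
# would leave nothing readable behind.
encoded_only = email.message_from_string(
    "Subject: encoded\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHRoaXMgaXMgdGhlIHdob2xlIG1lc3NhZ2Uu\n"
)

# Invented sample: an ordinary readable message.
plain = email.message_from_string("Subject: plain\n\nHello, this is readable.\n")

print(has_plain_body(encoded_only), has_plain_body(plain))
```

Messages for which this returns False are the ones to set aside before deleting any encoded lines.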
--
Be careful what you wish for. You never know who will be listening. Or
what, for that matter.
Thanks for your reply. There's much useful information there.
I'll look into writing a Perl script to run from within BBEdit, as you suggest. I'm interested in removing only the gobbledygook text into which Mail renders attached or embedded graphics, not the (legible) content of the e-mail message itself.
I've already looked over formail's man page, and will give that option a try.
As for MIMEdefang and other MIME-aware tools, they look pretty formidable and beyond my needs (not to mention my comprehension).
Regards,
shirasagi
On Jan 4, 2011, at 5:24 PM, LuKreme wrote:
> The trouble is, those MIME lines are BOUNDARIES, so they exist at the beginning and end of each MIME part. Also, the message is marked as multipart. If you simply delete the content of the MIME part and the ending boundary, most programs will no longer be able to read the message properly.
>
> I don't have specific recommendations as the tools to do this sort of manipulation on messages are 1) command-line tools or libraries 2) tricksy 3) dangerous.
>
> I would start over by thinking about exactly what problem you're trying to solve (personally, I don't want to keep emails without keeping all their contents, but that's not to say that others might not feel differently).
>
> Just as an example, if your actual need is an mbox of just plain-text emails without HTML, attachments, or any 'extraneous' data, then I would pipe the mbox through formail -s procmail and call a simple procmail recipe that calls the command-line web browser links (or lynx) with the -dump option. I used to do this automatically for HTML email 15 years ago or so.
>
On 1/5/11 at 2:18 PM, shir...@earthlink.net (Marc Reavis) wrote:
>I'm interested in removing only the gobbledygook text into
>which Mail renders attached or embedded graphics
I was browsing an instructional video site and saw the following
snippet. Perhaps it will be helpful for you.
>Enhanced AppleScript for Extracting Email Attachments
http://www.screencastsonline.com/index_files/SCO0250-macmontage15.php
-Said