Wanted: MAIL.MAI structure definition

Ruslan R. Laishev

Jun 4, 2006, 3:05:54 AM

Hello All!

I am looking for a "legal" structure definition of the file MAIL.MAI; I'd
like to write an application to automatically repair VMS mailboxes.
I found nothing in starletsd.tlb.

TIA.

--
Cheers, Ruslan.
+---------------------pure personal opinion------------------------+
RADIUS Server for OpenVMS project - www.starlet.spb.ru/radiusvms/
ICQ: 319518233, Skype: SysMan-One, Mobile: +7 (901) 316-3222

Volker Halle

Jun 5, 2006, 3:29:25 AM

This has been discussed and answered over in the OpenVMS ITRC forum:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1032273

Volker.

Larry Kilgallen

Jun 5, 2006, 9:51:30 AM

In article <1149492565....@j55g2000cwa.googlegroups.com>, "Volker Halle" <volker...@hotmail.com> writes:
> This has been discussed and answered over in the OpenVMS ITRC forum:
>
> http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1032273

I don't see that there.

I see a couple of utility recommendations, and the comment
"You're not going to get an official definition".

There is also some sample code in C.

Hoff Hoffman

Jun 5, 2006, 11:01:42 AM

Ruslan R. Laishev wrote:

> I am looking for a "legal" structure definition of the file MAIL.MAI; I'd
> like to write an application to automatically repair VMS mailboxes.
> I found nothing in starletsd.tlb.

As Hein indicated over in the cited thread at ITRC (in addition to
the source code that was posted), there is no formal specification
available for this area -- the available specification is effectively
the source code to MAIL itself. (And the format is subject to change
without notice, and the format of the data has changed over time.)

Also see VFYMAIL and other Freeware tools. VFYMAIL is on Freeware
V4. There are other similar tools around. MAILDIR, MAILCOUNT, etc.

Ruslan R. Laishev

Jun 6, 2006, 2:11:38 AM

Hello !

Under high load, with POP3 and other applications accessing a VMS
mailbox concurrently, we frequently find that MAIL.MAI is empty while the
user's home directory still contains MAIL$****.MAI files. I'd like to embed
additional functionality in the POP3 server to check for stray
MAIL$*.MAI files and delete them. Just FYI.

Hoff Hoffman wrote:

Hoff Hoffman

Jun 6, 2006, 2:25:46 PM

Ruslan R. Laishev wrote:

> Under high load, with POP3 and other applications accessing a VMS
> mailbox concurrently, we frequently find that MAIL.MAI is empty while
> the user's home directory still contains MAIL$****.MAI files. I'd like
> to embed additional functionality in the POP3 server to check for
> stray MAIL$*.MAI files and delete them. Just FYI.

Sounds like there's a bug somewhere, and I'm not sure I'd want to
have a POP3 tool deleting files from underneath MAIL. (I'd be as
interested in why there's an error like this lurking, and what software
here is at fault.)

The code necessary to verify the component MAIL files is available on
the Freeware. It's known as VFYMAIL, IIRC.

Tom Linden

Jun 6, 2006, 2:56:57 PM

On Tue, 06 Jun 2006 11:25:46 -0700, Hoff Hoffman <hoff-rem...@hp.com>
wrote:

I understand that the layout is subject to change, but why couldn't the
definition be included in starlet. Just curious.

Tom

JF Mezei

Jun 6, 2006, 4:05:01 PM

Tom Linden wrote:
> I understand that the layout is subject to change, but why couldn't the
> definition be included in starlet. Just curious.

OK, what are the odds that HP would give permission to VMS engineers to
make even simple modifications to an application on VMS (MAIL) ?

Looks to me that the most that we, users, can expect is improvements to
DCL by Guy Peleg, with the rest of the resources dedicated to the actual
OS itself (cluster interconnects, file system, caches etc, stuff that
users don't see).

However, MAIL is the one area where huge improvements/changes are needed
in order for it to handle internet email much more gracefully. Consider
just the time stamp, which would need to be the time stamp of the
message, not the date that the message was received on that node.

So if one day engineers are unleashed and are allowed to work on stuff
like TPU, MAIL and other applications that haven't been touched since
the last century, I can see the need to make big changes to the file
format, so you might want to keep hacks to a minimum.

On the other hand, if HP makes a firm announcement that no application
on VMS will ever be updated again, then by all means, all hacks to MAIL
should be allowed and its source code released.

Hoff Hoffman

Jun 6, 2006, 4:20:41 PM

Tom Linden wrote:

> I understand that the layout is subject to change, but why couldn't the
> definition be included in starlet. Just curious.

Before any further discussion here: I am aware of no plans to make
the MAIL definitions externally visible -- the recommended approach for
most operations involves the use of the MAIL calling interface, and the use
of the available itemcodes and structures associated with that interface.

OpenVMS source code facilities (and most layered products, for that
matter) that weren't intending or intended to share their data structure
definitions outside the facility tend to have them located locally;
within the OpenVMS facility. If you have to reach into the facility
directories within an OpenVMS build, you tend to realize you're using an
unpublished and volatile interface, after all.

Nothing technically prevents relocating the definitions, but this
would obviously also include a requirement to recode and rebuild the
facility -- and the folks maintaining the facility would also have to
accept some number of folks "locking into" the definitions, too. The
relocation is easily feasible. Convincing various of the engineers
involved around the need to open the API is the challenge.

FWIW, STARLET is the "published" area. LIB is the "volatile" area;
the area where the system-specific, version-specific and semi-documented
definitions tend to reside.

In this particular corruption case, I'd still focus on figuring out
where the problem is lurking -- an approach based on having to re-verify
the mail files for corruptions seems to be somewhat less than optimal.
(And if it's MAIL that's involved, the obvious approach would include
fixing the error, and (if needed) the addition of the necessary repair
tool(s) into one of the MAIL facilities within OpenVMS itself.)

Tom Linden

Jun 6, 2006, 7:20:30 PM

On Tue, 06 Jun 2006 13:20:41 -0700, Hoff Hoffman <hoff-rem...@hp.com>
wrote:

> Tom Linden wrote:

As I said, it was just curiosity. If there is not a compelling reason to
keep something proprietary, then starlet it. There, I created a new verb.

JF Mezei

Jun 6, 2006, 7:39:19 PM

Tom Linden wrote:
> As I said, it was just curiosity. If there is not a compelling reason to
> keep something proprietary, then starlet it. There, I created a new verb.

Yes, but it begs the question: which came first: the library, or the
node ? (starlet) :-) :-) :-) :-)

To starlet or not to starlet, that is the question :-)

Hoff Hoffman

Jun 6, 2006, 7:44:00 PM

Tom Linden wrote:

> As I said, it was just curiosity. If there is not a compelling reason to
> keep something proprietary, then starlet it. There, I created a new verb.

I'd flip that over. If there's not a compelling reason to share a
particular interface -- to render the interface public, or at least to
allow some sort of sharing -- then keep the interface private.

Opening any arbitrary interface runs contrary to the typical opacity
of the internal interfaces and of application modularity and of
long-standing programming practices. And it makes the usual and
expected application compatibility far more difficult to achieve and to
maintain.

If I had the cycles to re-implement MAIL all over again, for
instance, we'd be using XML and/or a relational database, and some of
these pieces would be (far) more visible. But opening up traditional
internal APIs or database file organizations to visibility obviously
invites folks to use the APIs, and this access is most definitely a
two-edged sword. With XML or a relational database -- or a way to
export to same -- maintaining compatibility is easier, and defending
against corruptions -- whether in the code, or in something that a
programmer has hooked into the interface -- is easier.

There have been changes to the naming of various MAIL files over the
years, for instance, and there are definite limits to the current MAIL
design. If we open the database API, we effectively codify the current
design in concrete, and make it far harder (if not impossible) to
re-work these interfaces and these designs compatibly.

If there are particular requirements not met within the existing
interface(s), then these should be addressed through callable MAIL or
similar extensions. Opening up the internals of an arbitrary hunk of
code serves no purpose, and (in my experience) tends to cause problems
for everybody involved. I've ported code, for instance, that expected
modifications directly to the underlying operating system in support of
the application code -- kernel patches. Shudder.

JF Mezei

Jun 6, 2006, 8:11:26 PM

Hoff Hoffman wrote:
> instance, we'd be using XML and/or a relational database, and some of
> these pieces would be (far) more visible.

Why XML ? Why not use simple RFC822 concepts ?

And why a relational database instead of a simple indexed file ?

John Santos

Jun 6, 2006, 9:01:40 PM

JF Mezei wrote:
> Hoff Hoffman wrote:
>
>>instance, we'd be using XML and/or a relational database, and some of
>>these pieces would be (far) more visible.
>
>
>
> Why XML ? Why not use simple RFC822 concepts ?
>

If I am not mistaken, RFC822 relates to mail transport, not to mail
storage. The mail.mai structure has nothing to do with mail transport
and everything to do with mail storage.

I think XML provides a portable, extensible way to deal with various
mail headers (traditional DEC mail as well as SMTP headers) and other
information about mail (organization of local storage, key words for
lookup, threading, etc.), which would provide the basis for a much
quicker and more versatile user interface program than current mail.exe.

(I use VMS Pine as a mail program. It uses the standard mail.mai files
and callable mail, but it attempts to impose some structure, such as
threading the messages and selectively displaying RFC-style headers
and manufacturing them from the DEC mail headers if they aren't there
(i.e. for local mail or mail received over DECnet.) It is far from
ideal, and very slow, even though its searching abilities, address book,
and so forth are much better than standard VMS mail. There is huge
room for improvement here. For example, I simulate a multilevel folder
structure by naming my folders "cat1-subcat1", "cat1-subcat2", "cat2-
subcat1", "cat2-subcat1-subsubcat1", etc. which makes them sort nicely,
but there is no way, for example to display a subtree or to search it.
If I know I got mail about something that is probably in one of the
"Cat3-..." folders sometime between 1999 and 2003, I need to iteratively
select each of the cat3* folders in turn, specify a date range (Pine
will do this), and then search. Repeat until fingers fall off. Or the
easier way, just use DCL search [.mail]*.mai "something to look for"
with a big window and hope it isn't too common.)
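
The subtree selection that is missing here amounts to a prefix match on the flat folder names. A minimal sketch in Python (not VMS code), assuming the folder names are already available as a list of strings, for instance obtained through callable MAIL; the helper name `folders_in_subtree` and the case-insensitive matching are assumptions of this sketch, not part of any MAIL or Pine interface:

```python
def folders_in_subtree(folder_names, root, sep="-"):
    """Return the folders that equal `root` or sit below it in a
    simulated hierarchy such as "cat2-subcat1-subsubcat1".

    Matching is case-insensitive (an assumption of this sketch).
    """
    root_key = root.casefold()
    prefix = root_key + sep
    return sorted(
        name for name in folder_names
        if name.casefold() == root_key or name.casefold().startswith(prefix)
    )


# Folder names are made up for illustration.
folders = ["cat1-subcat1", "cat1-subcat2", "cat2-subcat1",
           "cat2-subcat1-subsubcat1", "Cat3-old", "cat30-misc"]
print(folders_in_subtree(folders, "cat2"))
print(folders_in_subtree(folders, "cat3"))
```

Requiring the separator after the root keeps "cat30-misc" out of the "cat3" subtree, which a bare cat3* wildcard would not.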

> And why a relational database instead of a simple indexed file ?

It's much easier and faster to do ad hoc searches and summary display
organization and navigation, all of which are basically database
queries, with a database than with a simple indexed file. (BTW,
the user doesn't need to know anything of the implementation details
to use this stuff, unless he wants to write his own user interface
program.)
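
As a rough illustration of the "basically database queries" point, the Cat3 search described earlier becomes a single query once the summary data sits in a table. This is only a sketch using Python's built-in sqlite3 module; the table layout, column names and sample rows are all invented for the example and say nothing about how MAIL stores anything:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    " id INTEGER PRIMARY KEY,"
    " folder TEXT, sender TEXT, subject TEXT,"
    " received TEXT)"  # ISO 8601 dates, so string comparison orders correctly
)
conn.executemany(
    "INSERT INTO message (folder, sender, subject, received)"
    " VALUES (?, ?, ?, ?)",
    [
        ("cat3-a", "alice", "disk quota report", "2001-05-17"),
        ("cat3-b", "bob", "quota exceeded", "2004-02-01"),
        ("cat1-a", "carol", "quota question", "2000-11-30"),
        ("cat3-b", "dave", "lunch", "2002-08-09"),
    ],
)


def search(conn, folder_prefix, first, last, word):
    """Ad hoc search across a folder subtree, a date range and a subject word."""
    rows = conn.execute(
        "SELECT folder, sender, subject, received FROM message"
        " WHERE folder LIKE ? AND received BETWEEN ? AND ?"
        " AND subject LIKE ?"
        " ORDER BY received",
        (folder_prefix + "%", first, last, "%" + word + "%"),
    )
    return rows.fetchall()


print(search(conn, "cat3-", "1999-01-01", "2003-12-31", "quota"))
```

The user interface only needs to build the parameters; the folder iteration and date filtering happen inside the query.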

--
John Santos
Evans Griffiths & Hart, Inc.
781-861-0670 ext 539

Ruslan R. Laishev

Jun 7, 2006, 5:15:43 AM

Tom Linden wrote:

Yes!

>
> Tom
>

--
+ WBR, OpenVMS [Sys|Net] HardWorker ............. Skype: SysMan-One +
Delta Telecom JSC, IMT-MC-450(CDMA2000) cellular operator
Russia,191119,St.Petersburg,Transportny per. 3 Cel: +7 (812) 716-3222
+http://starlet.deltatelecom.ru ............. Frying on OpenVMS only +

Ruslan R. Laishev

Jun 7, 2006, 5:15:13 AM

Hello, Hoff!

Hoff Hoffman wrote:

> Ruslan R. Laishev wrote:
>
>> Under high load, with POP3 and other applications accessing a VMS
>> mailbox concurrently, we frequently find that MAIL.MAI is empty while
>> the user's home directory still contains MAIL$****.MAI files. I'd like
>> to embed additional functionality in the POP3 server to check for
>> stray MAIL$*.MAI files and delete them. Just FYI.
>
>
> Sounds like there's a bug somewhere, and I'm not sure I'd want to have
> a POP3 tool deleting files from underneath MAIL.

Agreed. But we have ~40k VMS mailboxes on our VMS cluster; these users are our
mobile subscribers, and they have no direct access to VMS to perform any
repair actions.

> (I'd be as interested
> in why there's an error like this lurking, and what software here is at
> fault.)

Me too. Could you please give me a contact among the VMS MAIL people to discuss
the situation?

>
> The code necessary to verify the component MAIL files is available on
> the Freeware. It's known as VFYMAIL, IIRC.

Thanks.

JF Mezei

Jun 7, 2006, 5:51:25 AM

"Ruslan R. Laishev" wrote:
> Agreed. But we have ~40k VMS mailboxes on our VMS cluster; these users are our
> mobile subscribers, and they have no direct access to VMS to perform any
> repair actions.

Do you know for a fact that the stray message files that have no
pointers in MAIL.MAI are valid messages, or are they deleted messages
where the pointer was removed from mail.mai but the delete of the actual
file failed ?

(consider a case where the file is locked for some reason, the POP
server can delete the record in mail.mai but wouldn't be able to delete
the actual contents file.)
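
Finding such stray files can be sketched without touching the MAIL.MAI format at all, provided something else supplies the names of the external message files the index still references. In the Python sketch below that list is simply an input: how to obtain it (callable MAIL, VFYMAIL or similar) is deliberately left open, the function names are hypothetical, and the file names are made up. The sketch only reports candidates and deletes nothing, since it cannot tell a leftover from a failed delete apart from a valid message whose index record was lost.

```python
from fnmatch import fnmatchcase


def _base_name(file_name):
    # Drop a trailing VMS version number (";1") and normalise case.
    return file_name.split(";", 1)[0].upper()


def unreferenced_message_files(directory_listing, referenced_names,
                               pattern="MAIL$*.MAI"):
    """Return files matching `pattern` that the index does not reference."""
    referenced = {_base_name(name) for name in referenced_names}
    return sorted(
        name for name in directory_listing
        if fnmatchcase(_base_name(name), pattern)
        and _base_name(name) not in referenced
    )


# Illustrative data only; real names and the way to list them will differ.
listing = ["MAIL.MAI;1", "MAIL$0004A1B2C3D4E5F6.MAI;1",
           "MAIL$0004A1B2C3D4FFFF.MAI;1", "LOGIN.COM;3"]
referenced = ["MAIL$0004A1B2C3D4E5F6.MAI"]
print(unreferenced_message_files(listing, referenced))
```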

This is where there is a big difference between ALLIN1 and MAIL. ALLIN1
comes with procedures to repair a user's mailbox, reset its user count
etc etc. There is also a file cabinet verification that looks for stray
files as well as missing files.

dav...@alpha2.mdx.ac.uk

Jun 7, 2006, 6:08:33 AM

In article <Uzphg.9661$3i3.8801@trnddc08>, John Santos <jo...@egh.com> writes:
>JF Mezei wrote:
>> Hoff Hoffman wrote:
>>

>> And why a relational database instead of a simple indexed file ?
>

>It's much easier and faster to do ad hoc searches and summary display
>organization and navigation, all of which are basically database
>queries, with a database than with a simple indexed file. (BTW,
>the user doesn't need to know anything of the implementation details
>to use this stuff, unless he wants to write his own user interface
>program.)
>

Relational databases aren't a particularly good fit for storing and searching
MIME-encoded mail messages.

For a discussion on this subject see for instance

http://groups.google.co.uk/group/comp.mail.misc/browse_frm/thread/51f4a5e1dac42387/

David Webb
Security team leader
CCSS
Middlesex University

Ruslan R. Laishev

Jun 7, 2006, 6:49:08 AM

JF Mezei wrote:

It is no problem to implement such a repair/verification procedure, but it requires
an API and/or direct access to the MAIL.MAI file, whose structure is only lightly
"documented" and "subject to change w/o notice".

Larry Kilgallen

Jun 7, 2006, 7:13:52 AM

In article <EA2AE58D046CF8B6...@NNTP.DeltaTel.RU>, "Ruslan R. Laishev" <zzLa...@zzDeltaTelecom.RU-remove.all-zz-to-reply> writes:
>
> Tom Linden wrote:

>> I understand that the layout is subject to change, but why couldn't the
>> definition be included in starlet. Just curious.
> Yes!

NO !!! Starlet holds the definitions that remain valid for subsequent
versions of VMS.

bri...@encompasserve.org

Jun 7, 2006, 8:14:50 AM

In article <Uzphg.9661$3i3.8801@trnddc08>, John Santos <jo...@egh.com> writes:

> JF Mezei wrote:
>> Hoff Hoffman wrote:
>>
>>>instance, we'd be using XML and/or a relational database, and some of
>>>these pieces would be (far) more visible.
>>
>>
>>
>> Why XML ? Why not use simple RFC822 concepts ?
>>
>
> If I am not mistaken, RFC822 relates to mail transport, not to mail
> storage.

RFC822 is message format.
RFC821 is mail transport.

I'm no expert, but I'd expect a well specified XML format to be
easier to create and to parse than standard RFC822+MIME.

On the other hand, translating between XML and RFC822+MIME in the
SMTP gateway component could be unnecessary if the message store
were natively RFC822+MIME. And the POP3 component would be similarly
simplified.

Tom Linden

Jun 7, 2006, 8:43:20 AM

I have to withdraw and agree with Larry,
to Lib or not lib.

Tom Linden

Jun 7, 2006, 8:49:37 AM

On Tue, 06 Jun 2006 16:44:00 -0700, Hoff Hoffman <hoff-rem...@hp.com>
wrote:

> Tom Linden wrote:

I see the merit to your arguments. I am not convinced that XML is a step
forward. I have looked at embedding an XML parser in PL/I as IBM has done
and is used as part of their Websphere package. Personally, I don't think
anyone should have to write XML, if you need to use it as an interface then
let's develop tools to generate it. If I have understood correctly, mail
is organised as an ISAM file. Such files are eminently more efficient than
relational DBs.

dav...@alpha2.mdx.ac.uk

Jun 7, 2006, 11:20:27 AM

In article <$EQADg...@eisner.encompasserve.org>, bri...@encompasserve.org writes:
>In article <Uzphg.9661$3i3.8801@trnddc08>, John Santos <jo...@egh.com> writes:

>> JF Mezei wrote:
>>> Hoff Hoffman wrote:
>>>
>>>>instance, we'd be using XML and/or a relational database, and some of
>>>>these pieces would be (far) more visible.
>>>
>>>
>>>
>>> Why XML ? Why not use simple RFC822 concepts ?
>>>
>>
>> If I am not mistaken, RFC822 relates to mail transport, not to mail
>> storage.
>

>RFC822 is message format.
>RFC821 is mail transport.
>
>I'm no expert, but I'd expect a well specified XML format to be
>easier to create and to parse than standard RFC822+MIME.
>
>On the other hand, translating between XML and RFC822+MIME in the
>SMTP gateway component could be unnecessary if the message store
>were natively RFC822+MIME. And the POP3 component would be similarly
>simplified.

Any new mail store would also need to support IMAP 4 and an API which can be
used for web access and for various mail tools (including a mail-handling
facility similar to DELIVER).

Hoff Hoffman

Jun 7, 2006, 11:42:36 AM

Ruslan R. Laishev wrote:

> Agreed. But we have ~40k VMS mailboxes on our VMS cluster; these
> users are our mobile subscribers, and they have no direct access to VMS
> to perform any repair actions.

If what you want here is a near-term fix, I'd probably look to run one
of the external verification tools as a stop-gap, and to work through
the specific trigger for the error through source code reviews and such.
Something is clearly stomping on an incoming new or outgoing deleted
message, but it's not clear what.

>> (I'd be as interested in why there's an error like this lurking, and
>> what software here is at fault.)
> Me too. Could you please give me a contact among the VMS MAIL people to
> discuss the situation?

Comparatively involved problems such as this are best fielded by a
formal report into the applicable customer support center -- that
provides the most efficient scheduling and tracking. From direct
personal experience as an engineer receiving these, direct email is
secondary to the task prioritizing and scheduling provided by the formal
problem reports. The formal reports are always given priority, and that
means that the more informal reports can see arbitrary delays.

I'm guessing you're not likely going to be able to provide a concise
reproducer here, and that you can presently only detect these
corruptions some time after they arise, and that there may or may not be
a log of the activities around the corruption available once the error
has been detected. I'm further guessing we might end up seeing a
substantial volume of source code, and that's going to take some time to
learn and to review.

Hoff Hoffman

Jun 7, 2006, 11:51:30 AM

dav...@alpha2.mdx.ac.uk wrote:

> Relational databases aren't a particularly good fit for storing and searching
> MIME-encoded mail messages.

Relational databases would work just fine for this purpose -- I
regularly store vastly more information in databases, and far more
volatile and more active information.

I can't say that MySQL would or would not work for this task nor that
the solution would scale arbitrarily, but Oracle Rdb is blazingly fast,
and it's likely more than capable of handling the email traffic on the
local engineering cluster, for instance.

Hoff Hoffman

Jun 7, 2006, 11:58:33 AM

Tom Linden wrote:

> I see the merit to your arguments. I am not convinced that XML is a step
> forward.

For an arbitrary operation, XML is a massively large step forward
from a locally-defined and locally-parsed data format -- it gets the
application out of the business of processing the structures and the
format, and into the business of dealing with the data. Is it the most
efficient? No. Does it work? Yes. And XML is seriously flexible.

> I have looked at embedding an XML parser in PL/I as IBM has done
> and is used as part of their Websphere package. Personally, I don't think
> anyone should have to write XML, if you need to use it as an interface then
> let's develop tools to generate it.

libxml2 works quite nicely, and I've been (re)porting versions of it.
Versions are available at the Freeware V8 staging area -- and FWIW,
the submission deadline for Freeware V8 is rapidly approaching: 15-Jun-2006.

> If I have understood correctly, mail
> is organised as an ISAM file. Such files are eminently more efficient than
> relational DBs.

More efficient, and less flexible -- it's a trade-off. There are
serious limits within the current design, such as the inability to
generally search the mail file, or to have a message in two folders, or
to handle the information in a transactional format, etc.

JF Mezei

Jun 7, 2006, 3:44:24 PM

"Ruslan R. Laishev" wrote:
> It is no problem to implement such a repair/verification procedure, but it requires
> an API and/or direct access to the MAIL.MAI file, whose structure is only lightly
> "documented" and "subject to change w/o notice".

Consider that if VMS ever has an upgrade which includes a change in the
MAIL.MAI format, this is bound to be very explicitly spelled out in the
release notes and there may be some utility to convert existing .MAI
files to the new format.

While the engineers do (rightly) reserve the right to change that file
format, the odds of it happening without users knowing it are extremely
low. So while the engineers HAVE to warn you not to do it, I think you
are fairly safe in assuming that the current format will be
there for many years.

And when the time comes to do an upgrade of VMS on your system, you then
need to ensure that the MAIL.MAI format hasn't changed.

In the end, the goal is to provide good service to your customers TODAY.
There may be a cost in the future if the file's format changes, but not
providing good service during all those years has an even greater cost.

JF Mezei

Jun 7, 2006, 3:53:04 PM

Hoff Hoffman wrote:
> I'm guessing you're not likely going to be able to provide a concise
> reproducer here, and that you can presently only detect these
> corruptions some time after they arise, and that there may or may not be
> a log of the activities around the corruption available once the error
> has been detected. I'm further guessing we might end up seeing a
> substantial volume of source code, and that's going to take some time to
> learn and to review.

Which means that if the engineers responsible for the POP server AND
the engineer responsible for the MAIL callable interface AND the
engineer responsible for the SMTP server were involved, they might come
up with a list of all possible scenarios where a content .MAI file would
be left hanging without a pointer in the main MAIL.MAI indexed file, and
then they can each look in their code to see if their code would allow
such conditions to arise.

This is where open source shines because customers can do much of the
debugging for you, especially in cases such as the TCPIP product which
appears to be on auto-pilot without any of the original engineers left
to take care of it.

JF Mezei

Jun 7, 2006, 4:13:25 PM

Hoff Hoffman wrote:
> More efficient, and less flexible -- it's a trade-off. There are
> serious limits within the current design, such as the inability to
> generally search the mail file, or to have a message in two folders, or
> to handle the information in a transactional format, etc.

Funny, ALLIN1 does all of this and it uses ISAM files for its indexes.

and DECWINDOWS mail does offer searching capabilities.

ALLIN1 stores messages into a "central" shared set of directories. Each
mail area has its own indexed file containing an index of messages. Each
record contains the main document attributes (sender, recipients, dates
(created, sent, delivered etc), message subject, type of content etc).
It also contains a count of users having a pointer to it.

A user has a local database of his messages. It contains "local"
information such as folder, message status (read, unread, whether a read
receipt is requested, has been issued or not etc), and a pointer to the
shared area and key within that shared area's indexed file for the full
details of that document (which includes the filename where the actual
contents are stored). And a document can contain multiple attachments.

When a user deletes a document, it decrements the usage count in the
shared area. When that goes to 0, then the file is actually deleted and
the indexed records in the shared area removed. This allows a message
sent to 500 employees to be stored as 1 copy, with 500 records in user
indexes pointing to the 1 record (which points to the 1 file) in the
shared area.
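
The usage-count scheme described above can be modelled in a few lines. This is a toy Python sketch of the idea only, with in-memory dictionaries standing in for the shared area's indexed file and the per-user indexes; the class and method names are invented and do not reflect the actual ALLIN1 implementation:

```python
class SharedArea:
    """Toy model of a shared message area with per-message usage counts."""

    def __init__(self):
        self.records = {}      # key -> {"content": str, "usage": int}
        self.user_index = {}   # user name -> set of keys into self.records
        self._next_key = 1

    def deliver(self, content, recipients):
        """Store one copy and give every recipient a pointer to it."""
        key = self._next_key
        self._next_key += 1
        recipients = set(recipients)
        self.records[key] = {"content": content, "usage": len(recipients)}
        for user in recipients:
            self.user_index.setdefault(user, set()).add(key)
        return key

    def delete(self, user, key):
        """Drop one user's pointer; remove the shared copy at usage 0."""
        self.user_index[user].remove(key)   # KeyError if no such pointer
        record = self.records[key]
        record["usage"] -= 1
        if record["usage"] == 0:
            del self.records[key]


area = SharedArea()
staff = [f"user{i}" for i in range(500)]
key = area.deliver("All-hands at 10:00", staff)
print(len(area.records), area.records[key]["usage"])  # one copy, many pointers
for user in staff[:-1]:
    area.delete(user, key)
print(key in area.records)   # still referenced by the last user
area.delete(staff[-1], key)
print(key in area.records)   # last pointer gone, shared copy removed
```

A real implementation would also have to update the count and the pointer atomically, which is exactly where stray files or dangling pointers can arise if one step fails.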

The use of a central (or multiple central) areas (each area consists of
a whole bunch of directories where files are automatically evenly
distributed) allows a system manager to manage mail storage on different
disks than user disks.

While the actual ALLIN1 implementation has some missing features when
looking at internet mail, the concept is pretty sound.

The problem with mail these days is that RFC822 headers are quite
variable and new fields are added and not always consistently used. And
one really needs to keep that header "intact" because it is like a
postmark on a letter. So even if you were to parse the RFC822 header
into some fancy XML structure, you'd still need to retain the original
RFC822 header because your XML parser wouldn't be able to understand new
fields being added to RFC822.

Dave Froble

Jun 8, 2006, 1:34:10 AM

JF Mezei wrote:
> Hoff Hoffman wrote:
>> More efficient, and less flexible -- it's a trade-off. There are
>> serious limits within the current design, such as the inability to
>> generally search the mail file, or to have a message in two folders, or
>> to handle the information in a transactional format, etc.
>
> Funny, ALLIN1 does all of this and it uses ISAM files for its indexes.

Yeah, through the years we did lots of things. We also learned some
lessons. New ideas and products grew out of such.

Bottom line, a relational database is more flexible in the retrieval of
data. Note that I'm not a big fan of relational databases as a cure-all
for all purposes. But for retrieval and searching, they're pretty good.

> The problem with mail these days is that RFC822 headers are quite
> variable and new fields are added and not always consistently used. And
> one really needs to keep that header "intact" because it is like a
> postmark on a letter. So even if you were to parse the RFC822 header
> into some fancy XML structure, you'd still need to retain the original
> RFC822 header because your XML parser wouldn't be able to understand new
> fields being added to RFC822.

Wanna bet?

--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: da...@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486

Phillip Helbig---remove CLOTHES to reply

Jun 8, 2006, 4:00:19 AM

In article <Uzphg.9661$3i3.8801@trnddc08>, John Santos <jo...@egh.com>
writes:

> (I use VMS Pine as a mail program. It uses the standard mail.mai files
> and callable mail, but it attempts to impose some structure, such as
> threading the messages and selectively displaying RFC-style headers
> and manufacturing them from the DEC mail headers if they aren't there
> (i.e. for local mail or mail received over DECnet.) It is far from
> ideal, and very slow, even though its searching abilities, address book,
> and so forth are much better than standard VMS mail. There is huge
> room for improvement here. For example, I simulate a multilevel folder
> structure by naming my folders "cat1-subcat1", "cat1-subcat2", "cat2-
> subcat1", "cat2-subcat1-subsubcat1", etc. which makes them sort nicely,
> but there is no way, for example to display a subtree or to search it.
> If I know I got mail about something that is probably in one of the
> "Cat3-..." folders sometime between 1999 and 2003, I need to iteratively
> select each of the cat3* folders in turn, specify a date range (Pine
> will do this), and then search. Repeat until fingers fall off. Or the
> easier way, just use DCL search [.mail]*.mai "something to look for"
> with a big window and hope it isn't too common.)

The MLSEARCH utility (written in FORTRAN, IIRC) can do much more
flexible searching than the MAIL> SEARCH command. A while back, Hunter
Goatley modified it so that it would run on ALPHA as well.

dav...@alpha2.mdx.ac.uk

Jun 8, 2006, 7:04:51 AM

In article <6CChg.1608$Ot2....@news.cpqcorp.net>, Hoff Hoffman <hoff-rem...@hp.com> writes:

>dav...@alpha2.mdx.ac.uk wrote:
>
>> Relational databases aren't a particularly good fit for storing and searching
>> MIME-encoded mail messages.
>

> Relational databases would work just fine for this purpose -- I
>regularly store vastly more information in databases, and far more
>volatile and more active information.
>
> I can't say that MySQL would or would not work for this task nor that
>the solution would scale arbitrarily, but Oracle Rdb is blazingly fast,
>and it's likely more than capable of handling the email traffic on the
>local engineering cluster, for instance.
>

Relational databases are very good for data whose structure can naturally be put
into a tabular format. They aren't particularly good for handling objects like
mail messages which are of an arbitrary size and arbitrarily complex structure
(eg messages containing nested multipart/mixed, multipart/alternative etc
parts containing images, video, audio, text and other document formats),
many of which will certainly be base-64 or uuencoded and others of which may be
stored in an encrypted format. Also the exact format of the message body may
need to be maintained in order to not invalidate signing of the message.
The exact order of message body parts also needs to be maintained.
This means that the message body would need to be stored as an opaque object in
the database (or, as with Exchange, stored as a flat file on disk with just a
pointer to the file stored in the database).

Headers are slightly better in that the complexity is reduced. However, the
order of header lines needs to be maintained, header-line types such as
Received: lines appear multiple times, and there is no complete list of valid
header lines. The exact formats of header lines vary between implementations
(even where they are defined in the appropriate RFCs, there are many
implementations which are not strictly compliant).
Headers (as well as message bodies) may also contain encoded text to deal with
different character sets as per RFC 2047.
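As a rough illustration of the RFC 2047 point, here is a minimal Python sketch using the standard library's email.header module. The sample Subject value is made up for the example. The stored header stays in its encoded form; decoding is only done when the text is needed for display or searching.

```python
from email.header import decode_header, make_header

# An RFC 2047 "encoded-word" as it might appear in a stored Subject: line.
# The value is invented for this example.
raw_subject = "=?ISO-8859-1?Q?Caf=E9_menu?="

# decode_header() splits the value into (bytes, charset) pairs;
# make_header() joins them back into readable text.
decoded_subject = str(make_header(decode_header(raw_subject)))

print(decoded_subject)
```

Searching on Subject would presumably want the decoded text, while anything handed back to another mail tool would want the original encoded line.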

Hence once again the majority of headers are best stored as an opaque object
along with the message body or in a flat file.

There are only a very limited number of headers which it makes sense to
search on, e.g. Subject, From address, To address (and possibly CC address) and
date.
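A minimal sketch of that split, assuming Python's standard email package: only the handful of searchable fields are pulled out, and the raw message bytes are never modified. The sample message, the addresses and the index_entry() helper are all invented for the illustration.

```python
from email import message_from_bytes
from email.utils import parseaddr, parsedate_to_datetime

# Invented sample message; note the repeated Received: lines and the
# locally introduced header, neither of which is indexed.
RAW = (
    b"Received: from a.example by b.example; Thu, 8 Jun 2006 07:00:00 +0000\r\n"
    b"Received: from c.example by a.example; Thu, 8 Jun 2006 06:59:00 +0000\r\n"
    b"From: Alice <alice@example.org>\r\n"
    b"To: bob@example.org\r\n"
    b"Subject: test\r\n"
    b"Date: Thu, 8 Jun 2006 07:04:51 +0000\r\n"
    b"X-Local-Header: anything\r\n"
    b"\r\n"
    b"body text\r\n"
)

def index_entry(raw: bytes) -> dict:
    """Extract only the searchable fields; the raw bytes stay opaque."""
    msg = message_from_bytes(raw)
    return {
        "subject": msg.get("Subject", ""),
        "from": parseaddr(msg.get("From", ""))[1],
        "to": parseaddr(msg.get("To", ""))[1],
        "cc": parseaddr(msg.get("Cc", ""))[1],
        "date": parsedate_to_datetime(msg["Date"]) if msg["Date"] else None,
        "received_count": len(msg.get_all("Received", [])),
    }

entry = index_entry(RAW)
print(entry["from"], entry["subject"], entry["received_count"])
```

Everything else in the header and body is left exactly as it arrived, so signatures and unknown header lines are not disturbed.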

The only advantage a centralised relational database has for storing these
limited items, as against individual indexed files in each user's home
directory, is the possibility of storing just a single copy of a mail message
rather than separate copies for each user. However, this only really applies
when the mail message has entered the mail system as one single mail message
which says it needs to be delivered to all those users. The mail Message-ID is
supposed to uniquely identify a mail message; if that were reliably true, mail
messages which had been split up during delivery but which were really still
the same message could be stored just once. Unfortunately this is not possible,
since there are a number of systems which do not produce unique Message-IDs for
every message.
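To make the Message-ID problem concrete, here is a small Python sketch of a single-copy store. It is not something proposed in this thread: it swaps in a SHA-256 digest of the raw bytes as the key instead of the Message-ID, and the SingleCopyStore class and sample messages are hypothetical. A content digest only matches byte-identical copies, so copies that picked up different Received: lines during delivery would still be stored separately.

```python
import hashlib

class SingleCopyStore:
    """Keeps one stored copy per distinct message content (hypothetical)."""

    def __init__(self):
        self.blobs = {}  # content digest -> raw message bytes
        self.refs = {}   # user name -> list of digests filed for that user

    def deliver(self, user: str, raw: bytes) -> str:
        digest = hashlib.sha256(raw).hexdigest()
        self.blobs.setdefault(digest, raw)          # store the bytes once
        self.refs.setdefault(user, []).append(digest)
        return digest

store = SingleCopyStore()
same = b"Message-ID: <1@example.org>\r\n\r\nhello\r\n"
# A different message that reuses the same Message-ID.
clash = b"Message-ID: <1@example.org>\r\n\r\na different message\r\n"

store.deliver("alice", same)
store.deliver("bob", same)
store.deliver("bob", clash)

print(len(store.blobs))  # distinct stored copies
```

Keyed on Message-ID alone, the second message to bob would have been silently merged with the first.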

One problem with the mail.mai indexed file approach as currently implemented is
that all the folders referenced in the mail.mai file are stored in the same
directory. A better approach would be to create separate sub-directories for
each folder. This would improve the performance when a user has tens to
hundreds of thousands or millions of mail messages without requiring the
creation of separate (and more difficult to access) mail.mai files (and
should allow for reasonable searching across all the folders referenced in
the mail.mai file).

Also, given the size of mail messages in today's world, it doesn't make sense
for the mail.mai file to store small mail messages directly in itself; it is
much better just to store a pointer to the mail message in a flat file on disk.
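A rough sketch of the layout described above, in Python: one sub-directory per folder, every message in its own flat file, and an index that holds only a pointer. The directory names, the .msg file naming and the in-memory dict standing in for the indexed file are all assumptions for the illustration, not how VMS Mail actually names anything.

```python
import tempfile
from pathlib import Path

def store_message(mail_root: Path, folder: str, msg_id: int, raw: bytes) -> Path:
    """Write one message as a flat file inside its folder's own directory."""
    folder_dir = mail_root / folder
    folder_dir.mkdir(parents=True, exist_ok=True)
    path = folder_dir / f"{msg_id:08d}.msg"
    path.write_bytes(raw)
    return path

root = Path(tempfile.mkdtemp())
index = {}  # (folder, msg_id) -> relative path; stands in for the indexed file

for n, folder in enumerate(["MAIL", "NEWMAIL", "MAIL"], start=1):
    stored = store_message(root, folder, n, b"Subject: demo\r\n\r\nbody\r\n")
    index[(folder, n)] = stored.relative_to(root)

print(sorted(d.name for d in root.iterdir()))
```

No single directory has to hold every message file, and the index entry stays small regardless of message size.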

Obviously there are a lot of things which need to be done to VMS Mail to better
support standard internet mail headers etc., but I think the basic indexed-file
structure is a better approach than a relational database approach.

Hoff Hoffman

Jun 8, 2006, 8:53:48 AM

Dave Froble wrote:
> JF Mezei wrote:
>> Hoff Hoffman wrote:
>>> More efficient, and less flexible -- it's a trade-off. There are
>>> serious limits within the current design, such as the inability to
>>> generally search the mail file, or to have a message in two folders, or
>>> to handle the information in a transactional format, etc.
>>
>> Funny, ALLIN1 does all of this and it uses ISAM files for in indexes.
>
> Yeah, through the years we did lots of things. We also learned some
> lessons. New ideas and products grew out of such.
>
> Bottom line, a relational database is more flexible in the retrieval of
> data. Note that I'm not a big fan of relational databases as a cure-all
> for all purposes. But for retrieval and searching, they're pretty good.

I have to assume that there is a lack of familiarity with relational
databases here. Relational databases are very powerful, make it very easy
to add new data and new tables, and allow the programmer to provide the
end user great flexibility -- in terms of data organization, cross
linkages, action routines, transactional integrity, etc.

Could you do this with RMS? Ayup. But by the time you're done, you
are maintaining your own semi-relational database -- and there are
features of database products that your upgraded RMS would still lack --
and then you get to maintain your database, support and upgrade it, and
take on all the effort that entails.
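As a sketch of the flexibility argument (one message filed in two folders, an ad hoc search, and a transactional update), here is a small example using Python's bundled sqlite3 module as a stand-in for a relational database. SQLite is not Rdb, and the table names, columns and sample row are invented for the illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT, sender TEXT, raw BLOB);
CREATE TABLE folder  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE filing  (message_id INTEGER REFERENCES message(id),
                      folder_id  INTEGER REFERENCES folder(id),
                      PRIMARY KEY (message_id, folder_id));
""")

# One transaction: either the message and both filings are stored, or nothing is.
with db:
    db.execute("INSERT INTO message VALUES (1, 'MAIL.MAI layout', 'user@example.org', ?)",
               (b"raw message bytes",))
    db.executemany("INSERT INTO folder VALUES (?, ?)", [(1, "MAIL"), (2, "VMS")])
    db.executemany("INSERT INTO filing VALUES (1, ?)", [(1,), (2,)])

# The same message appears in two folders without being stored twice.
folders = [row[0] for row in db.execute(
    "SELECT f.name FROM folder f JOIN filing x ON x.folder_id = f.id "
    "WHERE x.message_id = 1 ORDER BY f.name")]

# A general search across all folders is a single query.
hits = db.execute("SELECT id FROM message WHERE subject LIKE '%MAIL.MAI%'").fetchall()

print(folders, hits)
```

The raw message is still held as an opaque BLOB here; only the metadata is relational.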

>> The problem with mail these days is that RFC822 headers are quite
>> variable and new fields are added and not always consistently used. And
>> one really needs to keep that header "intact" because it is like a
>> postmark on a letter. So even if you were to parse the RFC822 header
>> into some fancy XML structure, you'd still need to retain the original
>> RFC822 header because your XML parser wouldn't be able to understand new
>> fields being added to RFC822.
>
> Wanna bet?

I have to assume that there is a lack of familiarity with XML here.

Looking past the hype (not an easy task :-), XML is very powerful and
very portable, and it gets the application out of the business of
parsing the data. (There are trade-offs here too, as the XML libraries
have substantial overhead. It doesn't scale as well as a more
traditional database.)

In OpenVMS terms, XML is a portable text-based itemlist-like
construct -- and one that can be nested, provided with attributes and
tags, structurally verified, displayed, embedded, transferred, and
extended as required. I'm using XML within OpenVMS V8.3, and for all of
these reasons.

I would also propose storing the data in a database, and allowing the
external software to access the data via XML; to allow data imports and
exports using XML. This gets the other software out of the business of
processing RFC-compliant headers. (And, going full circle, if SMTP mail
were to be (re)implemented today, it is exceedingly likely that XML
would have been used as the basis of the wrappers.)

Hoff Hoffman

Jun 8, 2006, 8:58:07 AM

dav...@alpha2.mdx.ac.uk wrote:

> Relational databases are very good for structure which naturally can be put
> into a tabular format. They aren't particularly good for handling objects like
> mail messages which are of an arbitrary size and arbitrarily complex structure
> (eg messages containing nested multipart/mixed, multipart/alternative etc
> parts containing images, video, audio, text and other document formats).
> Many of which will certainly be base-64 or uuencoded and others of which may be
> stored in an encrypted format. Also the exact format of the message body may
> need to be maintained in order to not invalidate signing of the message.
> The exact order of message body parts also needs to be maintained.
> This means that the message body would need to be stored as an opaque object in
> the database (or as with exchange stored as a flat file on disk with just a
> pointer to the file stored in the database).

I maintain the OpenVMS source code control system. Storing and
retrieving MAIL messages is easy.

dav...@alpha2.mdx.ac.uk

Jun 8, 2006, 10:30:21 AM

In article <z9Vhg.1657$vY2...@news.cpqcorp.net>, Hoff Hoffman <hoff-rem...@hp.com> writes:
>dav...@alpha2.mdx.ac.uk wrote:
>
>> Relational databases are very good for structure which naturally can be put
>> into a tabular format. They aren't particularly good for handling objects like
>> mail messages which are of an arbitrary size and arbitrarily complex structure
>> (eg messages containing nested multipart/mixed, multipart/alternative etc
>> parts containing images, video, audio, text and other document formats).
>> Many of which will certainly be base-64 or uuencoded and others of which may be
>> stored in an encrypted format. Also the exact format of the message body may
>> need to be maintained in order to not invalidate signing of the message.
>> The exact order of message body parts also needs to be maintained.
>> This means that the message body would need to be stored as an opaque object in
>> the database (or as with exchange stored as a flat file on disk with just a
>> pointer to the file stored in the database).
>

> I maintain the OpenVMS source code control system. Storing and
>retrieving MAIL messages is easy.
>

In recent years pure relational databases such as Oracle have changed to allow
the storage and retrieval of opaque data such as images, so yes, you could store
and retrieve opaque mail message objects, but I don't see what advantage that
gives you.

In a previous message you also mentioned XML with the comment

"
I would also propose storing the data in a database, and allowing the
external software to access the data via XML; to allow data imports and
exports using XML. This gets the other software out of the business of
processing RFC-compliant headers. (And, going full circle, if SMTP mail
were to be (re)implemented today, it is exceedingly likely that XML
would have been used as the basis of the wrappers.)
"

SMTP mail and all the software out there which processes RFC-compliant headers
is not going to be reimplemented using XML anytime soon.
If you store mail messages you must be able to present them with all their RFC
headers, MIME structure, PGP signatures etc. intact.
This isn't a static environment: new headers are being created constantly,
sometimes pretty much standardised, such as the new header lines for
"anti-forging" techniques such as DomainKeys and SPF, and sometimes just
locally introduced headers (which should be X-???? header lines but often
aren't).

Oracle's XML support appears to offer two types of storage: CLOB (which
maintains document fidelity and appears just to be an opaque object) and O-R
storage (which maintains document object model fidelity by decomposing the XML
into the underlying O-R structures).
I would maintain that to store mail messages you would need to use CLOB
storage.

Converting SMTP mail to other formats (as for instance Exchange does) is one of
the main ways of producing software which has problems interoperating with
other SMTP compliant systems. I think that breaking up mail messages using an
XML schema to store them in O-R structures and then reconstructing them for
output to SMTP based tools or for forwarding would be prone to errors.

Ruslan R. Laishev

Jun 8, 2006, 10:37:32 AM

Hoff Hoffman wrote:

Cool! But Oracle Rdb is not free-of-charge software.

Dave Froble

Jun 8, 2006, 11:47:51 AM

Hoff Hoffman wrote:
> Dave Froble wrote:
>> JF Mezei wrote:
>>> Hoff Hoffman wrote:
>>>> More efficient, and less flexible -- it's a trade-off. There are
>>>> serious limits within the current design, such as the inability to
>>>> generally search the mail file, or to have a message in two folders, or
>>>> to handle the information in a transactional format, etc.
>>>
>>> Funny, ALLIN1 does all of this and it uses ISAM files for in indexes.
>>
>> Yeah, through the years we did lots of things. We also learned some
>> lessons. New ideas and products grew out of such.
>>
>> Bottom line, a relational database is more flexible in the retrieval of
>> data. Note that I'm not a big fan of relational databases as a
>> cure-all for all purposes. But for retrieval and searching, they're
>> pretty good.
>
>
> I have to assume that there is a lack of familiarity with relational
> databases here.

Relatively, very possible.

> Relational databases are very powerful, and very easy
> to add new data and new tables, and allow the programmer to provide the
> end user great flexibility -- in terms of data organization, cross
> linkages, action routines, transactional integrity, etc.

That's sort of what I thought I was trying to write.

However, relational databases have significantly higher overhead than,
for example, something like RMS. Perhaps this impact has been softened
as hardware has gotten faster, but, overhead is overhead, and if you
weren't using the advances in hardware for overhead, those designs that
have less overhead would also gain from the faster hardware.

> Could you do this with RMS? Ayup. But by the time you're done, you
> are maintaining your own semi-relational database

Goes back to what I wrote about having learned some lessons through the
years. One would have a real hard time convincing me that, for any
general capability, custom code is better than, or even as good as, a
reusable tool designed for that capability. We progressed from
addressing cylinder and track on disk to file systems and then to
database systems. Lessons learned and applied.

> -- and there are
> features of database products that your upgraded RMS would still lack --
> and then you get to maintain your database, support and upgrade it, and
> all the effort that entails?

Yep! In most cases why do so, when it's not a core part of what you're
trying to do? Use the tools appropriate to the job.

>>> The problem with mail these days is that RFC822 headers are quite
>>> variable and new fields are added and not always consistently used. And
>>> one really needs to keep that header "intact" because it is like a
>>> postmark on a letter. So even if you were to parse the RFC822 header
>>> into some fancy XML structure, you'd still need to retain the original
>>> RFC822 header because your XML parser wouldn't be able to understand new
>>> fields being added to RFC822.
>>
>> Wanna bet?
>
>
> I have to assume that there is a lack of familiarity with XML here.

Again, possibly so, but I can envision some designs that could 'learn'
about previously unknown types of data and make some reasonable guesses
for adding such without need for re-programming. Definitely not a 100%
solution, but could handle minor variations.

Remember, those things that 'cannot be done' remain so only until the
first time they are 'done'.

> Looking past the hype (not an easy task :-), XML is very powerful and
> very portable, and it gets the application out of the business of
> parsing the data. (There are trade-offs here too, as the XML libraries
> have substantial overhead. It doesn't scale as well as a more
> traditional database.)
>
> In OpenVMS terms, XML is a portable text-based itemlist-like construct
> -- and one that can be nested, provided with attributes and tags,
> structurally verified, displayed, embedded, transferred, and extended as
> required. I'm using XML within OpenVMS V8.3, and for all of these reasons.
>
> I would also propose storing the data in a database, and allowing the
> external software to access the data via XML; to allow data imports and
> exports using XML. This gets the other software out of the business of
> processing RFC-compliant headers. (And, going full circle, if SMTP mail
> were to be (re)implemented today, it is exceedingly likely that XML
> would have been used as the basis of the wrappers.)

Paul Sture

Jun 8, 2006, 12:59:57 PM

Ruslan R. Laishev wrote:

>
> Cool! But ! OracleRDB is not free-of-charge software.
>

It is indeed a shame that since the sell-off to Oracle, the Rdb run time
license no longer comes with VMS. :-(

Hoff Hoffman

Jun 8, 2006, 2:40:17 PM

dav...@alpha2.mdx.ac.uk wrote:

> In recent years pure relational databases such as Oracle have changed to allow
> the storage and retrieval of opaque data such as images so yes you could store
> and retrieve opaque mail message objects but I don't see what advantage that
> gives you.

Other than resolving the "existing mail data store is rather more
limited than I would prefer" matter?

And commercial databases have been capable of arbitrary storage for
eons. This includes opaque objects.

> In a previous message you also mentioned XML with the comment
>
> "
> I would also propose storing the data in a database, and allowing the
> external software to access the data via XML; to allow data imports and
> exports using XML. This gets the other software out of the business of
> processing RFC-compliant headers. (And, going full circle, if SMTP mail
> were to be (re)implemented today, it is exceedingly likely that XML
> would have been used as the basis of the wrappers.)
> "
>
> SMTP mail and all the software out there which processes RFC-compliant headers
> is not going to be reimplemented using XML anytime soon.

Hence my "if SMTP mail were to be (re)implemented..." comment.

The existing SMTP RFC scheme is a very old design, albeit a widely
accepted and functional design. If most anyone here were to redesign
the SMTP mail traffic, the result would likely be using XML. (Wanna
spot bad headers? Verify it with a schema. Want to extract data from
the header? Grab it. Want to add extensions? Have at. Want to
implement attachment encoding? Trivial.)

And I'd tend to provide the XML in parallel to the existing RFC-based
approaches -- assuming there isn't already an RFC for XML-based mail.

> If you store mail messages you must be able to present them with all their RFC
> headers, Mime structure, PGP signatures etc intact
> This isn't a static environment new headers are being created constantly -
> sometimes pretty much standardised such as the new header lines for
> "anti-forging" techniques such as DomainKeys and SPF - sometimes just locally
> introduced headers (which should be X-???? header lines but often aren't).

That software evolves is obvious. XML is very good at dealing with
this, both from generating the new information and from being able to
operate when newer information is presented to an older client.

Who's to say that providing a parallel XML interface won't be
accepted? It would certainly make a reasonable RFC, as a start --
again, if there isn't already an XML MAIL RFC around.

> Oracle's XML appears to support two types of storage CLOB (which maintains
> document fidelity and appears just to be an opaque object) and O-R storage
> which maintains document object model fidelity by decomposing the XML into
> the underlying O-R structures.
> I would maintain that to store mail messages you would need to use CLOB
> storage.

Databases store data. Data is data. Data is bytes. Bytes can be
stored. Metadata can be stored. Data structures can be reconstituted.
You can store a music file as a series of linked data records.

> Converting SMTP mail to other formats (as for instance Exchange does) is one of
> the main ways of producing software which has problems interoperating with
> other SMTP compliant systems. I think that breaking up mail messages using an
> XML schema to store them in O-R structures and then reconstructing them for
> output to SMTP based tools or for forwarding would be prone to errors.

OpenVMS and Windows Exchange Server and most any other tool convert
mail -- from its RFC wire format into the local host on-disk format --
all the time. Every mail message arriving into an OpenVMS system gets
its format converted. Converted at least twice, if you're using POP3 or
IMAP to access and read off the messages from the OpenVMS MAIL data
store -- once on the way in, and once on the way out.

[I must be being dense here not to see what the concern is, as
none of this would be difficult to code up using j-random version of
Oracle Rdb and j-random version of libxml2. Most any database itself
has in-built XML capabilities, as well.]

Hoff Hoffman

Jun 8, 2006, 2:53:53 PM

Dave Froble wrote:

> Hoff Hoffman wrote:
>> I have to assume that there is a lack of familiarity with XML here.
>
> Again, possibly so, but I can envision some designs that could 'learn'
> about previously unknown types of data and make some reasonable guesses
> for adding such without need for re-programming. Definitely not a 100%
> solution, but could handle minor variations.

XML offers a block quote mechanism, allowing a translation tool to
convert the incoming RFC-compliant SMTP headers into recognized and
structured XML, and into what amounts to quoted headers for the odd
stuff. When regurgitating the message, the block quotes can be replayed
into the data stream. Or the incoming traffic detects the pieces of
the header of interest, and block quotes the whole SMTP header for
posterity. Much like how existing mailers handle this stuff.
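A minimal sketch of the second variant (pick out the pieces of interest, and quote the whole header for posterity), using Python's standard xml.etree.ElementTree. The element names, the KNOWN set and the sample header are assumptions made for the illustration; no existing XML mail schema is implied. ElementTree escapes the quoted text rather than emitting a CDATA section, which amounts to the same thing for this purpose. The sample uses bare LF line endings because XML parsers normalise CR LF to LF, so a real translation would have to deal with that before it could replay headers byte for byte.

```python
import xml.etree.ElementTree as ET

KNOWN = {"from", "to", "subject", "date"}  # illustrative choice, not a standard

def headers_to_xml(raw_header: str) -> ET.Element:
    """Structure the recognised header lines and quote the full header as-is."""
    root = ET.Element("message")
    for line in raw_header.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.lower() in KNOWN:
            ET.SubElement(root, name.lower()).text = value.strip()
    # The whole original header, odd lines included, kept verbatim.
    ET.SubElement(root, "rfc822-header").text = raw_header
    return root

def xml_to_headers(root: ET.Element) -> str:
    """Replay the quoted header instead of rebuilding it from the parts."""
    return root.findtext("rfc822-header")

raw = ("From: Alice <alice@example.org>\n"
       "Subject: test\n"
       "X-Odd-Header: kept verbatim\n")

wire = ET.tostring(headers_to_xml(raw))  # serialise ...
parsed = ET.fromstring(wire)             # ... and read back in

print(parsed.findtext("subject"))
print(xml_to_headers(parsed) == raw)
```

Folded (continuation) header lines are simply skipped by the structured part of this sketch and survive only in the quoted copy.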

If SMTP can do "it", then an XML translation can be implemented for
"it", too.

> Remember, those things that 'cannot be done' remain so only until the
> first time they are 'done'.

Ayup. That's part of the fun of working in a development group, too.

JF Mezei

Jun 8, 2006, 8:02:33 PM

Hoff Hoffman wrote:
> all the time. Every mail message arriving into an OpenVMS system gets
> its format converted. Converted at least twice, if you're using POP3 or
> IMAP to access and read off the messages from the OpenVMS MAIL data
> store -- once on the way in, and once on the way out.

Properly RFC formatted messages do NOT get reformatted. The RFC header
is parsed so that the SMTP symbiont can properly fill out the basic VMS
Mail fields, but the RFC header remains intact. And when you use POP or
IMAP, if the RFC header is intact, it is used as is. It is only when the
RFC header is wrong or missing that a new one is synthesized from the
VMS mail envelope information.

> [I have to be being dense here as to not see what the concern is, as
> none of this would be difficult to code up using j-random version of
> Oracle Rdb and j-random version of libxml2. Most any database itself
> has in-built XML capabilities, as well.]

XML gives you no added functionality. You will still need code to parse
a dynamic RFC header into a fixed XML definition and then, whenever you
need to access the message, code to reconstitute the RFC header,
hopefully exactly as it was before.

XML is like RFID. A nice buzzword that is greatly abused and misused.
Putting RFID on a box so you can scan it as it passes near a sensor on a
conveyor belt is great. Putting RFID in a car key or passport is stupid
because anyone can scan your key/passport without you knowing about it.

dav...@alpha2.mdx.ac.uk

Jun 9, 2006, 8:40:01 AM

In article <la_hg.1672$L33...@news.cpqcorp.net>, Hoff Hoffman <hoff-rem...@hp.com> writes:
>dav...@alpha2.mdx.ac.uk wrote:
>
>> In recent years pure relational databases such as Oracle have changed to allow
>> the storage and retrieval of opaque data such as images so yes you could store
>> and retrieve opaque mail message objects but I don't see what advantage that
>> gives you.
>

> Other than resolving the "the existing mail data store is rather more
>limited than I would prefer?" matter?
>

I already pointed out one way to improve the performance of the mail store
without using a relational database: putting mail folders in separate
sub-directories.
The other problems with VMS Mail are more to do with the lack of built-in MIME
support and, for IMAP usage, the lack of a way of storing a UID in the index
along with the message pointer (which leads to PMDF, for instance, having to
provide ancillary files for this purpose).

> And commercial databases have been capable of arbitrary storage for
>eons. This includes opaque objects.
>

Object-oriented features were only added relatively recently to Oracle Classic
(Oracle 9i, possibly a bit in Oracle 8i); before that, all you had for such
types were the Long and Long Raw datatypes. So yes, if you are referring to
kludging the storage of objects with Long Raw, then it's been there for eons.
However, storage of opaque objects doesn't really provide any added
functionality.

>> In a previous message you also mentioned XML with the comment
>>
>> "
>> I would also propose storing the data in a database, and allowing the
>> external software to access the data via XML; to allow data imports and
>> exports using XML. This gets the other software out of the business of
>> processing RFC-compliant headers. (And, going full circle, if SMTP mail
>> were to be (re)implemented today, it is exceedingly likely that XML
>> would have been used as the basis of the wrappers.)
>> "
>>
>> SMTP mail and all the software out there which processes RFC-compliant headers
>> is not going to be reimplemented using XML anytime soon.
>

> Hence my "if SMTP mail were to be (re)implemented..." comment.
>
And spam would be eliminated if only everybody would re-implement SMTP :)

> The existing SMTP RFC scheme is a very old design, albeit a widely
>accepted and functional design. If most anyone here were to redesign
>the SMTP mail traffic, the result would likely be using XML. (Wanna
>spot bad headers? Verify it with a schema. Want to extract data from
>the header? Grab it. Want to add extensions? Have at. Want to
>implement attachment encoding? Trivial.)
>
> And I'd tend to provide the XML in parallel to the existing RFC-based
>approaches -- assuming there isn't already an RFC for XML-based mail.
>

There are RFCs for XML datatypes to be used as content in MIME SMTP mail
messages, but there are, as far as I am aware, no XML schemas for the
structure of an SMTP/MIME mail message.

>> If you store mail messages you must be able to present them with all their RFC
>> headers, Mime structure, PGP signatures etc intact
>> This isn't a static environment new headers are being created constantly -
>> sometimes pretty much standardised such as the new header lines for
>> "anti-forging" techniques such as DomainKeys and SPF - sometimes just locally
>> introduced headers (which should be X-???? header lines but often aren't).
>

> That software evolves is obvious. XML is very good at dealing with
>this, both from generating the new information and from being able to
>operate when newer information is presented to an older client.
>
> Who's to say that providing a parallel XML interface won't be
>accepted? It would certainly make a reasonable RFC, as a start --
>again, if there isn't already an XML MAIL RFC around.
>

>> Oracle's XML appears to support two types of storage CLOB (which maintains
>> document fidelity and appears just to be an opaque object) and O-R storage
>> which maintains document object model fidelity by decomposing the XML into
>> the underlying O-R structures.
>> I would maintain that to store mail messages you would need to use CLOB
>> storage.
>

> Databases store data. Data is data. Data is bytes. Bytes can be
>stored. Metadata can be stored. Data structures can be reconstituted.
> You can store a music file as a series of linked data records.
>

>> Converting SMTP mail to other formats (as for instance Exchange does) is one of
>> the main ways of producing software which has problems interoperating with
>> other SMTP compliant systems. I think that breaking up mail messages using an
>> XML schema to store them in O-R structures and then reconstructing them for
>> output to SMTP based tools or for forwarding would be prone to errors.
>

> OpenVMS and Windows Exchange Server and most any other tool converts
>mail -- from its RFC wire format into the local host on-disk format --

>all the time. Every mail message arriving into an OpenVMS system gets
>its format converted. Converted at least twice, if you're using POP3 or
>IMAP to access and read off the messages from the OpenVMS MAIL data
>store -- once on the way in, and once on the way out.
>

No, as JF Mezei pointed out, that is not true.

> [I have to be being dense here as to not see what the concern is, as
>none of this would be difficult to code up using j-random version of
>Oracle Rdb and j-random version of libxml2. Most any database itself
>has in-built XML capabilities, as well.]
>

Simply storing a mail message as separate rows in a database or as some other
opaque object gains you nothing over storing it as a flat file.
Analysing the message and breaking it down into components which can be stored
in separate tables and columns is inefficient and error-prone when you then
need to pull it all together again for display to existing mail tools. For
instance, if you make a single mistake (e.g. an extra space in the body of the
message) then the message's PGP signature is invalidated (or, even worse, it
may be impossible to decrypt an encrypted mail message).
So that just leaves extracting a limited number of fields which could be useful
for searching on (Subject, From: address, To: address, CC: address, Date) and
storing those whilst storing the message as either a flat file or an opaque
object.
To my mind a database is an overly complex solution for that; an indexed-file
solution with individual messages stored in flat files in multiple directories
is a simpler solution.
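To show how little it takes to break a signature, here is a tiny Python sketch. A plain SHA-256 digest stands in for the signature check, which is an assumption for the illustration: real schemes such as PGP or DomainKeys define their own canonicalisation rules and may tolerate some whitespace changes. The message text is made up.

```python
import hashlib

def digest(body: bytes) -> str:
    """Stand-in for the hash a signature would be computed over."""
    return hashlib.sha256(body).hexdigest()

original = b"Pay the invoice by Friday.\r\n"
# The same body after a decompose/reconstruct round trip that
# accidentally introduced one extra space.
rebuilt = b"Pay the  invoice by Friday.\r\n"

signature_still_valid = digest(original) == digest(rebuilt)
print(signature_still_valid)
```

Storing the body as an untouched flat file or opaque object avoids the problem because nothing is ever reconstructed.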

David Webb
security team leader
CCSS
Middlesex University
