So I wind up with a bunch of information at the top of the file that
my application (which reads the files) completely doesn't care about.
Reading the rest of the information is really easy, and I've already
taken care of it. However, my trouble comes when it seems that this
initial runtime class information is NOT a fixed size. I haven't been
able to find any information at, say, the top of the file that
contains the size of this runtime information.
Is this size contained anywhere (if so, I haven't found it yet), or is
there an easy way to skip this runtime information so I can get to
what I need?
Thanks,
~Scoots
You say you have source information so just look at the code that
writes the file. Then read it with "inverse" code that does
everything in exactly the same order with exactly the same data
structures, but reading instead of writing.
Oh, and the application does NOT read the files, so I don't have that
code to see how they solved it.
I've dug into the code (the machine does have VS2008 installed on it),
and I've traced it through the following code:
(All of this code is MFC)
COleDocument::OnSaveDocument
(creates through StgCreateDocfile, which may write some of
the information, I'm not sure)
OnSaveDocument calls SaveToStorage()
calls COleStreamFile::CreateStream with the
name "Contents". This definitely shows up in that junk at the top of
the file.
These are the only locations that I could possibly see writing
anything, as after that is the call to Serialize (which they have
overridden).
Unless the serialization is stored in memory and the Commit call in
OnSaveDocument is doing the work.
Like you, my experience with serialization is when the same
application is opening and saving so I haven't delved into this level
of detail on what exactly is going on. But since I can't do that in
this case (legally or practically), is there any other information
held in that mess that's useful?
Here is a sample of that header in ASCII (sorry, it doesn't appear to
support Unicode). Perhaps the keywords Root Entry and Contents can
help us sort it out.
ÐÏ à¡± á> þÿ
þÿÿÿþÿÿÿ ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿýÿÿÿþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿRoot
Entry ÿÿÿÿÿÿÿÿÿÿÿÿþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿRoot
Entry ÿÿÿÿÿÿÿÿ Àª7‡
£žÉ þÿÿÿContents ÿÿÿÿÿÿÿÿÿÿÿÿþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿþÿÿÿýÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
<rest of the file is useful information>
ÐÏ à¡± á> þÿ
þÿÿÿ ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿýÿÿÿþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿRoot
Entry ÿÿÿÿÿÿÿÿÿÿÿÿþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿRoot
Entry ÿÿÿÿÿÿÿÿ Ð Ð¨Ï É € Contents ÿÿÿÿÿÿÿÿÿÿÿÿ
{ ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿþÿÿÿýÿÿÿþÿÿÿ þÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
þÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
Thanks,
~Scoots
Tom
How about asking for/creating a function (you say you have all
sources) that can read up to the part you can read? Then you can
continue from that known point, in known waters. A DLL is probably
the best solution, but be aware of caveats: for example, your code must
be built against the same MFC release as the DLL.
To create such a function, you could envisage splitting the document
class into base one that only reads data and the full-blown one used
in the external app. (But this may be hard, as document classes tend
to meddle everywhere in the app, and, if one is allowed an opinion,
for dubious reasons).
HTH,
Goran.
P.S. I disagree with Joseph on the general usefulness of MFC
serialization. Sure, one can do better, but MFC does correctly what it
does (e.g. serializing MFC stuff like strings and containers,
backward-compatible object versioning, storing/retrieving of object
references). There's no forward compatibility, though (that is, old
code being able to read newer files), which is probably Joe's primary
complaint. There are other issues I've seen, too, but hey, life ain't
perfect! ;-)
I too find serialization very useful in my apps, and with some effort you
can also support forward compatibility to a degree, though it does require
you to put extra data into the stream that stores the size of the objects
you've just written, and you occasionally have to rewind the stream to write
the sizes back into the stream.
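A rough sketch of that size-prefix idea, in portable C++ rather than MFC. The in-memory Buffer, the tag values and the helper names (putU32, writeObject, readKnown) are made up for illustration; a real application would do the same thing through CArchive/CFile with a seek back to the size field.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the archive stream.
using Buffer = std::vector<std::uint8_t>;

// Append a 32-bit value in little-endian byte order.
void putU32(Buffer& b, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        b.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 32-bit little-endian value at the given position.
std::uint32_t getU32(const Buffer& b, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(b[pos + i]) << (8 * i);
    return v;
}

// Write [tag][size][payload]. The size is written as a placeholder first,
// then patched once the payload length is known (the "rewind" step).
void writeObject(Buffer& b, std::uint32_t tag, const std::string& payload) {
    putU32(b, tag);
    const std::size_t sizePos = b.size();
    putU32(b, 0);  // placeholder for the size
    b.insert(b.end(), payload.begin(), payload.end());
    const std::uint32_t size =
        static_cast<std::uint32_t>(b.size() - sizePos - 4);
    for (int i = 0; i < 4; ++i)
        b[sizePos + i] = static_cast<std::uint8_t>(size >> (8 * i));
}

// An "old" reader that only understands tag 1. Anything else is skipped
// by jumping over its stored size, which is what gives forward compatibility.
std::vector<std::string> readKnown(const Buffer& b) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos + 8 <= b.size()) {
        const std::uint32_t tag = getU32(b, pos);
        const std::uint32_t size = getU32(b, pos + 4);
        pos += 8;
        if (size > b.size() - pos) break;  // truncated record, stop
        if (tag == 1)
            out.emplace_back(b.begin() + pos, b.begin() + pos + size);
        pos += size;
    }
    return out;
}
```

With records tagged 1, 99 and 1 in the buffer, readKnown returns only the two tag-1 payloads and steps over the 99 record without knowing what is inside it.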
It's also very useful as a private format for supporting undo/redo.
Anthony Wieser
Wieser Software Ltd
Tom
Those who have access can check out the article on the techniques we developed in the 1979
time frame (the article was published in 1987), and more detail is available in our book
(1989)
http://portal.acm.org/citation.cfm?id=39309
http://www.amazon.com/Idl-Language-Implementation-Prentice-Hall-Software/dp/0134502140
The book, alas, is out of print. My XML version was an ad-hoc adaptation of the
principles described in these publications (if I had more time, and the client more money,
I would have done a binary representation driven off the DTD, but that would have been too
expensive for the project needs)
joe
To reiterate, this application saves TWO files. One is a complex file
that relies entirely on serialization of complex types and bumps into
every problem with serialization you all have mentioned so far. This
is not a stable file format, as every time they change the data
structures in the slightest, the schema changes, and we don't know if
we will have access to future versions of the source code. This is
not the file type to use, but I do have the code that writes AND reads
this file. However, the application also writes a simpler file format
used by other applications and it is much more stable. Writing it is
still triggered through serialization, which is why we get this file
header issue, but after that it is a very structured (and very easy to
read) file format. I have already handled this format and it is very
simple. It's just that the header isn't always the same, or even the
same size.
By following the reading path of the other graphic file, I've managed
to get into the MFC code that reads the COleDocument and found some
things that might be significant. The base header (the first one I
posted above) appears to be the COleDocument itself, and appears to be
a minimum of 2048 bytes. Even if the document is "empty", this gets
written. In looking at the loading algorithm for the COleDocument, it
hits LoadFromStorage(). The code is on another machine so I can't
just directly copy it here, but the LoadFromStorage (Line 703 in
oledoc1.cpp, for me) method DOES call COleStreamFile::OpenStream(...)
on the stream "Contents", which you will see in the headers I posted.
It then appears to serialize based on this stream position. There
must be a bit of automatic information in the stream, as there is no
"hokey pokey" to get the file pointer in the right position for the
serialization that follows.
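For what it's worth, the first eight bytes in the dumps above ("ÐÏ à¡± á") are the OLE compound file signature, D0 CF 11 E0 A1 B1 1A E1, which is what StgCreateDocfile produces. A compound file is organised in sectors, and a stream such as "Contents" is located through the file's directory and allocation tables rather than at a fixed offset, which would explain why the "header" changes size. Here is a minimal sketch in plain C++ that only checks the signature and reads the sector size from the header; the function names are mine, and it does not try to locate the stream (that is what the storage API is for).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Signature found in the first 8 bytes of an OLE compound file.
static const unsigned char kCfbSignature[8] =
    {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

// True if the buffer starts with the compound file signature.
bool looksLikeCompoundFile(const unsigned char* data, std::size_t len) {
    return len >= 8 && std::memcmp(data, kCfbSignature, 8) == 0;
}

// Sector size taken from the header: the 16-bit little-endian "sector shift"
// sits at byte offset 30, and the sector size is 2^shift (9 -> 512 bytes,
// 12 -> 4096 bytes). Returns 0 if the buffer does not look like a valid header.
std::uint32_t sectorSize(const unsigned char* data, std::size_t len) {
    if (len < 32 || !looksLikeCompoundFile(data, len)) return 0;
    const unsigned shift = data[30] | (static_cast<unsigned>(data[31]) << 8);
    if (shift != 9 && shift != 12) return 0;
    return 1u << shift;
}
```

A check like this is only useful for confirming what kind of file you are looking at. To actually get at the data, opening the storage and then the "Contents" stream lets the OS work out where the bytes live.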
Yes, Visual Studio's binary editor is the only reason I've been able
to get as far as I have in this file, but unfortunately I cannot copy
from the binary editor to here, otherwise I'd show the binary for the
header. It appears that this stream will get me close, but I still
haven't found what causes the file header to change (sometimes very
radically). The number of bytes following the Contents keyword is not
constant. If I delete all of the graphical elements in their
application and save, this extra header information disappears and I'm
left with the base header, and yet the very first thing that gets
serialized by their custom Serialize is a byte that is very easy to
pick out from the header information (in terms of pattern
recognition. None of those 0xFF's flying around.).
And yes, I do have their Serialize, Joseph (fortunately!). That's how
I've gotten as far as I have, and if I manually delete the header, my
code can read their file just fine. That, fortunately, was actually
fairly simple since they do NOT use any CStrings or more complex types
in the more stable of the two files. Mainly, they write out
structures, bytes, and chars, so it was a fairly simple process to
load.
The base header is fairly easy to pick out, as an empty page will
generate the COleDocument information and it always appears to be 2048
bytes. This is the first header I posted. However, this header is
sometimes significantly larger when data is present and I don't know
why. Their custom serialization does not write anything that should
cause this, and I haven't found anything in the COleDocument to
account for this. I'll keep plugging at it and post what I find.
In the meantime, the CArchive has an m_bForceFlat member variable.
The COleDocument sets this to FALSE, but I haven't found documentation
on what behavior this causes. Anyone know?
Thanks again,
~Brian
bool Decoder::OpenAndReadPreamble(CString p_csFilename,
                                  COleStreamFile* p_pfr)
{
    // The preamble is completely and utterly unimportant. This gets written
    // even if (Name removed for confidentiality) doesn't even save the file!
    // It's runtime information left over from serialization.
    // We just... plain... don't care.
    LPSTORAGE lpRootStg = NULL;

    // This is based on the COleDocument code for reading.
    BOOL bResult = FALSE;
    TRY
    {
        if (lpRootStg == NULL)
        {
            LPCOLESTR lpsz = T2COLE(p_csFilename);

            // use STGM_CONVERT if necessary
            SCODE sc;
            LPSTORAGE lpStorage = NULL;
            if (StgIsStorageFile(lpsz) == S_FALSE)
            {
                // convert existing storage file
                sc = StgCreateDocfile(lpsz, STGM_READWRITE|
                    STGM_TRANSACTED|/*STGM_SHARE_EXCLUSIVE|*/STGM_CONVERT,
                    0, &lpStorage);
                if (FAILED(sc) || lpStorage == NULL)
                    sc = StgCreateDocfile(lpsz, STGM_READ|
                        STGM_TRANSACTED|/*STGM_SHARE_EXCLUSIVE|*/STGM_CONVERT,
                        0, &lpStorage);
            }
            else
            {
                // open new storage file
                sc = StgOpenStorage(lpsz, NULL,
                    STGM_READWRITE|STGM_TRANSACTED/*|STGM_SHARE_EXCLUSIVE*/,
                    0, 0, &lpStorage);
                if (FAILED(sc) || lpStorage == NULL)
                    sc = StgOpenStorage(lpsz, NULL,
                        STGM_READ|STGM_TRANSACTED/*|STGM_SHARE_EXCLUSIVE*/,
                        0, 0, &lpStorage);
            }
            if (FAILED(sc))
                AfxThrowOleException(sc);

            ASSERT(lpStorage != NULL);
            lpRootStg = lpStorage;
        }
        ASSERT(lpRootStg != NULL);

        // open Contents stream
        CFileException fe;
        if (!p_pfr->OpenStream(lpRootStg, _T("Contents"),
                CFile::modeRead|CFile::shareExclusive, &fe) &&
            !p_pfr->CreateStream(lpRootStg, _T("Contents"),
                CFile::modeRead|CFile::shareExclusive|CFile::modeCreate, &fe))
        {
            if (fe.m_cause == CFileException::fileNotFound)
                AfxThrowArchiveException(CArchiveException::badSchema);
            else
                AfxThrowFileException(fe.m_cause, fe.m_lOsError);
        }

        // load it with CArchive (loads from Contents stream)
        CArchive loadArchive(p_pfr, CArchive::load |
            CArchive::bNoFlushOnDelete);
    }
    CATCH_ALL(e)
    {
        MessageBox(NULL, _T("Whoops"), _T("We did something bad."), MB_OK);
        return false;
    }
    END_CATCH_ALL

    return true;
}
Okay, so there is a fair amount of debugging left to do. Never mind
that the message box is completely uninformative or that I haven't
commented the code; this appears to handle the variable-size header
information automatically. I'll let you know as I test it more.
~Scoots
In my serialization, I start off with
    [Version]
    [File Length]
    [File Version Info]
followed by a variety of objects such as
    [Object type]
    [Length]
    [object values]*
where each object value is of the form
    [Field type]
    [length]
    [bytes of data]
    [padding]*
where padding is enough 0 bytes to get a DWORD alignment.
So a
    typedef struct {
        int n;
        double d;
        char x[80];
    } SomeStruct;
would be
    [SomeStructID]
    [116]
        [code for n]
        [4]
        [value of n]
        [code for d]
        [8]
        [value of d]
        [code for x]
        [80]
        [value of x]
    [SomeStructID]
    ...same format as above
Note that I give each field a code, so I could change the header file, rearrange the
fields, and the reader automatically handles the assignment because the test is
essentially
    if(fieldcode == code_for_n)
        object->n = ReadIntValue(length of field);
and so on. This "tagged binary" representation is ancient; I saw it in specifications for
files in the late 1960s, and it was obviously well-established by that point.
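To make the layout concrete, here is a small self-contained sketch of a tagged-binary writer and reader for SomeStruct. It is only an illustration under a few assumptions: the numeric codes are invented, each code and length is stored as a 4-byte value in host byte order, and int is 4 bytes. Under those assumptions the object length works out to the 116 shown above (3 fields x 8 bytes of code/length, plus 4 + 8 + 80 bytes of data).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SomeStruct { int n; double d; char x[80]; };

// Hypothetical type and field codes; any distinct values would do.
enum : std::uint32_t { SomeStructID = 1, CodeN = 10, CodeD = 11, CodeX = 12 };

using Bytes = std::vector<std::uint8_t>;

static void putU32(Bytes& b, std::uint32_t v) {
    std::uint8_t raw[4];
    std::memcpy(raw, &v, 4);  // host byte order, for simplicity
    b.insert(b.end(), raw, raw + 4);
}

static std::uint32_t getU32(const Bytes& b, std::size_t pos) {
    std::uint32_t v;
    std::memcpy(&v, &b[pos], 4);
    return v;
}

// Append [code][length][bytes][zero padding up to a 4-byte boundary].
static void putField(Bytes& b, std::uint32_t code,
                     const void* data, std::uint32_t len) {
    putU32(b, code);
    putU32(b, len);
    const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
    b.insert(b.end(), p, p + len);
    while (b.size() % 4 != 0) b.push_back(0);
}

// Produce [SomeStructID][length of fields][fields...].
Bytes encode(const SomeStruct& s) {
    Bytes fields;
    putField(fields, CodeN, &s.n, sizeof s.n);
    putField(fields, CodeD, &s.d, sizeof s.d);
    putField(fields, CodeX, s.x, sizeof s.x);
    Bytes out;
    putU32(out, SomeStructID);
    putU32(out, static_cast<std::uint32_t>(fields.size()));
    out.insert(out.end(), fields.begin(), fields.end());
    return out;
}

// Assign by field code, so field order in the file does not matter and
// unknown codes are skipped using their stored length.
bool decode(const Bytes& in, SomeStruct& s) {
    if (in.size() < 8 || getU32(in, 0) != SomeStructID) return false;
    const std::size_t end = 8 + static_cast<std::size_t>(getU32(in, 4));
    if (end > in.size()) return false;
    std::size_t pos = 8;
    while (pos + 8 <= end) {
        const std::uint32_t code = getU32(in, pos);
        const std::uint32_t len = getU32(in, pos + 4);
        pos += 8;
        if (len > end - pos) return false;  // field runs past the object
        const std::uint8_t* p = in.data() + pos;
        if (code == CodeN && len == sizeof s.n) std::memcpy(&s.n, p, len);
        else if (code == CodeD && len == sizeof s.d) std::memcpy(&s.d, p, len);
        else if (code == CodeX && len == sizeof s.x) std::memcpy(s.x, p, len);
        pos += (static_cast<std::size_t>(len) + 3) & ~std::size_t(3);
    }
    return true;
}
```

Because the reader dispatches on the field code and always advances by the stored length, reordering the fields in the struct or adding a new field code to the writer does not break an older reader.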
The IDL system we did at CMU, and the LG (Linear Graph) system that preceded it, would
automatically generate the tables that drove the readers and writers. I've often thought
about how I might build binary reader/writer code from DTDs, but haven't pursued it, since
I've already done it twice, and both times better than XML could hope for.
We even stored complex structures using the equivalent of what Microsoft calls "based
pointers" for the binary representation, and we could handle forward-pointer resolution
when reading the text representation.
Since my latest project now takes too long to read in a 2MB XML file, I will probably add
a binary reader/writer in the near future.
joe