
How to serialize an MFC template dialog class?


Elizabeta

Sep 4, 2009, 5:06:01 AM
Hello

How can I serialize an MFC template dialog class? I can't use DECLARE_SERIAL
and IMPLEMENT_SERIAL for this. I noticed that an IMPLEMENT_SERIAL_T macro
exists, but it is not documented and there is no corresponding
DECLARE_SERIAL_T.

Thanks

Giovanni Dicanio

Sep 4, 2009, 5:19:56 AM
Elizabeta wrote:

I've never seen an MFC serialization of a CDialog-derived class.
I don't know if it is possible.

What would you like to do?
Could you please clarify your goal?

If your dialog is used to enter/show some data, you may want to create a
C++ class derived from CObject that stores that data (e.g. a CPerson,
with name, surname, address, etc.), and use DECLARE_SERIAL and
IMPLEMENT_SERIAL on this non-GUI class.

However, I think that these days there are better techniques for
serializing data, like using XML, instead of MFC serialization.

If you are interested in XML, you may find the CMarkup MFC class, freely
available on CodeProject, useful:

http://www.codeproject.com/KB/cpp/markupclass.aspx


HTH,
Giovanni

Elizabeta

Sep 4, 2009, 5:53:01 AM

>>If your dialog is used to entry/show some data, you may want to create a
>>C++ class derived from CObject that stores that data (e.g. a CPerson,
>>with name, surname, address, etc.), and use DECLARE_SERIAL and
>>IMPLEMENT_SERIAL on this non-GUI class.

OK, let's say that for dialog template classes I can follow your suggestion, thanks.

But what about template classes that are not dialogs? So I want to
reformulate my question:
how do I use macros like DECLARE_SERIAL and IMPLEMENT_SERIAL with C++ template
classes derived from CObject?

Giovanni Dicanio

Sep 4, 2009, 6:19:08 AM
Elizabeta wrote:

> Ok lets say that for Dialog template classes I can do your suggestion, thanks.

You are welcome.

> But what about template classes that are not dialogs, so I want to
> reformulate my question :
> How to use macros like DECLARE_SERIAL and IMPLEMENT_SERIAL with c++ template
> classes derived from CObject ?

I'm not sure if DECLARE_SERIAL and IMPLEMENT_SERIAL macros work well
with C++ template classes.

However, you may want to learn from what MFC developers did in a similar
case. If my understanding is correct, you would like to define a C++
template class, derived from CObject, and with support for MFC
serialization.

There are some examples of such classes already defined in MFC, e.g.
CArray is an MFC template class (CArray<TYPE, ARG_TYPE>), derived from
CObject, with support for serialization.

You may want to open the <afxtempl.h> file and read the implementation of
CArray, in particular the CArray<TYPE, ARG_TYPE>::Serialize() method.
(I think that their approach is to not use the DECLARE_SERIAL and
IMPLEMENT_SERIAL macros, though.)

(However, I still think that other options like XML should be used
instead of MFC serialization, IMHO.)

Giovanni

Joseph M. Newcomer

Sep 4, 2009, 8:59:09 AM
It wouldn't make sense to serialize a dialog. Nor does it make much sense to have a
dialog class created by a template. If you need some kind of templating, it most likely
applies to data *within* the dialog, and therefore you should create a templated class to
hold your data.

I do not believe you can derive a template class from CObject, because of how the macros
work. You did not say which version of VS you are using.

Generally, MFC serialization is among the worst possible ways to implement persistent
memory of information. It is needlessly complex (in spite of the claims otherwise)
because it is exceptionally fragile under what is known as "schema migration". Change
anything in the class, such as adding or deleting a member variable, or changing its type,
and all existing files are rendered unreadable. I would suggest avoiding it entirely. I
tend to use either simple text files or XML files if I have complex structured data. I
have studiously avoided using MFC serialization for 15 years. (We had implemented the
equivalent of MFC serialization in 1977 and knew exactly what was wrong with it, and when
I saw MFC serialization, all I could do was shake my head and say "how could they DO
something this bad when we knew almost 20 years ago [I looked at this in 1996] that it
cannot work?" Our solution, by the way, was to have an XML file we wrote out, along with
the binary. If the binary file had a different version than the program could read, we
ignored the binary form of the file and read the XML form. [Yes, I helped invent XML in
1977; it looked much better than the current XML, was more powerful, and the second
generation, done ca. 1980, and known as IDL, was even better. But it was not yet time to
build steam engines, so our work was ignored and then badly reinvented by the XML
designers. We documented our work in a 1989 book.])
joe

Joseph M. Newcomer [MVP]
email: newc...@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

Goran

Sep 4, 2009, 10:06:12 AM
On Sep 4, 11:53 am, Elizabeta <Elizab...@discussions.microsoft.com>
wrote:

You really can't do that. You can only serialize a template
instantiation. So, for example, you could do:

template<typename T>
class C : public CObject
{
public:
    virtual void Serialize(CArchive& ar)
    {
        CObject::Serialize(ar);
        if (ar.IsStoring())
            ar << member;
        else
            ar >> member;
    }
    T member;
};

then

class CInt : public C<int>
{ DECLARE_SERIAL(CInt) };
IMPLEMENT_SERIAL(CInt, CObject, schema...)

class CDouble : public C<double>
{ DECLARE_SERIAL(CDouble) };
IMPLEMENT_SERIAL(CDouble, CObject, schema...)

So...
1. you must have a concrete class for XXX_SERIAL macros to even
compile
2. if you ever change template type (e.g. from int to short), you must
change the schema number and deal with that. This may be tricky for a
beginner ;-)

Also, to be able to serialize a class (e.g. a template), you don't
necessarily need to use the XXX_SERIAL macros, but then:
1. you can't use the schema number for format evolution
2. the >> and << operators don't work (consequence: you can't serialize
using CObArray).

What Joe says about schema migration being impossible is not true.
It's not easy, but not impossible. You must know the how-tos and pitfalls,
but it's doable. For example, I have code here at work that can read
stuff serialized more than a decade ago, and that has since undergone
literally thousands of changes in saved data. There are constraints,
but it is very rare that we break serialization. In fact,
serialization problems are a tiny minority compared to other issues.

HTH,
Goran.

Joseph M. Newcomer

Sep 4, 2009, 8:40:48 PM
I was overstating the case somewhat. But for someone asking about how to do
serialization, "incredibly difficult" is not that distinguishable from "impossible".
Having done this numerous times, in a variety of languages and environments over the last
46 years, the code grows in complexity each time you make a schema change, and after the
tenth major change, you end up with an unintelligible, not to mention unmaintainable,
mess. I've probably worked on a dozen projects that did this, and while we could and did
read files saved not-quite-ten years ago, you had to maintain in the comments a
specification of each of the schemata. The real problem was someone making a change that
broke the ability to read the 3.7 release (which was different from 3.6 and 3.8 but only
in trivial ways), and one of my tasks was to find and fix that problem.

In one system, we wrote a tape-copy program that took in tapes in the old schema and wrote
new tapes in the new schema. At General Motors, a schema conversion involved three days
of runtime on an IBM 360/75. Most of the delay was not computational, but in the massive
amount of data in an automobile database; even today we don't have the fantastic I/O
bandwidth of a 1970s huge mainframe, except on the highest-end multiprocessor multi-bus
servers with the super-high-end ($2500) disk controller cards. Computationally, we are
orders of magnitude faster than those old mainframes, but I/O bandwidth has *not* doubled
every 18 months! So binary schema migration is essentially the same problem it was in the
1950s, only we can do it with faster computational engines. (So fast, in fact, that
parsing XML is almost not a factor in the cost.) One client was upset at their slow
input, but it turned out that the programmer had decided that reading text (not XML, just
plain text) was going to be slower, so he added a progress bar. The progress bar update took
more time than the reading of the data. (I discovered this when I moved the reader to a
separate thread and updated the progress bar every 100 records instead of every record;
reading the text took only 2x as long as reading the binary, 30 seconds vs. 15.)

In fact, one of the serious defects of MFC is the inability to move serialization easily
into a background thread. Too much of MFC was designed with a single-thread model in
mind.

One interesting C++ solution we used was to treat each schema as a new class, keeping the
old classes for serialization. So we would read in using
CVersion36Data::Serialize if we found (using my previous example) the header was a 3.6
(well, 0x00036000) version; then we had a subroutine that took the 3.6 version and produced
a 3.7 version. We couldn't do this years ago because we didn't have enough memory to hold
both versions, but even in Win16 3.1 this was not a problem. It did mean that we had
to write an n.m-to-3.7 converter for each n.m < 3.7. Perhaps surprisingly, this was much easier than
the if(majorversion < 3) and minor-version interlaced tests that had pervaded the 1.x and
2.x code; my predecessor on the project who created version 3 threw up his hands, said
"this is impossible!", and culled through the old versions, extracting the code for each
serialization and building a new class. I got the code at about version 7, and was able to
easily create version 7.x+1, with many new features added to the file. (He later told me
that his real failure was that he did not cascade the converters; that is, if you read
version 1.n, just sequentially run the 1.n to 1.n+1, 1.n+1 to 1.n+2, etc., up to 1.m to 2.0,
etc. His thought was that this would mean the first read of an old file might be much
slower, but since there was no backward compatibility requirement, only one output writer
was required. We had somewhat fewer than 25 version converters, each quite simple, so he
said it ultimately didn't matter too much.)
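The cascaded-converter idea can be sketched like this in portable C++ (all names invented; each converter upgrades a document exactly one schema step, so only the newest writer ever needs to be maintained):

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory document; converters[v] upgrades schema v to v+1.
struct Doc {
    int version = 0;
    std::map<std::string, std::string> fields;
};

// Cascade: a file at any old version is walked forward one step at a time.
// The first read of a very old file is slower, but each converter stays
// trivially simple and is written exactly once.
void Upgrade(Doc& d, const std::vector<std::function<void(Doc&)>>& converters) {
    while (d.version < static_cast<int>(converters.size())) {
        converters[d.version](d);
        ++d.version;
    }
}
```

Contrast this with interlaced if(majorversion < 3) tests: here each schema change adds one small function instead of touching every reader.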

In our version of proto-XML, we kept the undefined fields as text and the defined fields
in binary, and had a table-driven algorithm that could convert text-to-binary+text or
binary-to-text. IDL did not support the notion of handling undefined fields, but was also
table-driven. We could read a structure in as text or binary and fix up all the pointers,
so we could save arbitrary graphs of objects-pointing-to-objects, even with cyclic lists.
I found this was a particular failure of most MFC serialization techniques. Ultimately, I
got to the point where I'd convert the files to text and nobody noticed or cared that they
were larger or read more slowly, with one exception (which we were able to fix rather
trivially).

Getting multi-schema serialization to work without massive pain requires some careful
up-front design. I have not encountered one example of this being done correctly in all
the serialization code I've had to fix.

The most serious one was the project where we had to add new fields to the structure and
had to maintain backward compatibility to the older binary executables; that's where I
explained that this was not going to happen. We solved it by writing two files, one which
was old-binary-compatible and lacked the richer structure of the new version, and another
which was the "fixup" of the structure which embodied all the new features. This was
actually pretty clean, but took two passes to write two archive files, and we had to use a
derived class of CArchive to indicate which pass we were in.

In addition, I have encountered, but not worked on, a fairly large number of projects
whose participants complained about the problems of serialization, at least the binary
serialization that is normally used.

So your experience seems anomalous.
joe

Goran

Sep 5, 2009, 6:19:29 AM
OK, so I'll share my experience and how we do it overall. It is about
a run-of-the-mill editor-type application that has been developed over
the course of a decade (and counting). I am not the original author
(Hi, Serge!), but I see no major problems serialization-wise.
Certainly there are things that could have been done better, but hey,
rarely do I think my own code is good when looking at it a couple
of years later, either! ;-)

What we do is provide only backwards compatibility. That decision
serves us well, I'd say. We use VERSIONABLE_SCHEMA for particular
classes and also have a global document version number. The canonical
Serialize is:

void CFoo::Serialize(CArchive& ar)
{
    if (ar.IsStoring())
    {
        ar << member1 << m2 << m3 << ...;
    }
    else
    {
        UINT schema = ar.GetObjectSchema();
        ar >> member1 >> m2;
        if (schema >= 2)
        {
            ar >> m3;
            if (schema >= 3)
            {
                ar >> m4;
                // etc.
            }
        }
    }
}

For one particular class, I don't think we ever went over schema 20 or so.
Clearly, the "if" nesting in the loading branch can become
deep, but I don't remember that we ever tried to alleviate it by moving code
out (e.g. ReadFromSchema3(ar)). Frankly, I think it's rather canonical and
not complicated at all.

About the global document format number: each set of changes,
before going into the wild, requires a bump in this number. That way,
the code knows whether it is too old to read a particular (newer) file.
Older files it must read anyhow.

Sometimes (very rarely, though) we remove something from the
program. For that, we just do ar << unused and ar >> unused, so no
schema change for that. Sometimes we change the data type of a particular
variable. That is handled by either the "unused" trick or a specific
schema check "in the middle". But that is rare, too.

I'd say, when the above is followed strictly, serialization really is
a breeze!

Now, of course, this is only overall approach, and things indeed do
get messy in places (I'd prefer not to talk about them ;-) ).

A pet peeve of serialization for many must be the inability to have a
schema number on a per-derived-class basis. That is, a schema change
anywhere in a hierarchy demands a schema change for all participating
classes, so that they all carry the same schema number. Go MS, huh?
Recently, someone here complained that he could not rename a class
because of serialization. Yes, that is a breaking change, but not
insurmountable, either. (I don't think we ever did this, BTW!)

Goran.

Joseph M. Newcomer

Sep 5, 2009, 11:40:34 AM
None of the serialization projects I worked on were fortunate enough to have structures
that simple. Data structures were 2-3 levels deep and very complex.
joe

Goran

Sep 7, 2009, 3:37:04 AM
On Sep 5, 5:40 pm, Joseph M. Newcomer <newco...@flounder.com> wrote:
> None of the serialization projects I worked on were fortunate enough to have structures
> that simple.  Data structures were 2-3 levels deep and very complex.

Well... That doesn't matter. You have two choices: either you treat the
"substructure" as a full-blown serializable class, in which case it
has a schema version an' all^^^, or you don't, in which case you
can (well, have to) pass the "parent" structure's schema version to the
serialization of those.

^^^Note: in this case, one is not __obliged__ to use heap allocation and
ar >> pObj / ar << pObj. Instead, ar.SerializeClass() plus obj.Serialize()
produces the same effect for "embedded" structures. AFAIK
SerializeClass has a very reasonable footprint.

Goran.

Scot T Brennecke

Sep 7, 2009, 5:02:47 AM
I find it stunning how often you declare something as "unusable" or "worthless" or "needlessly complex", when that very thing you
criticize is something I have been using successfully for years. I think you quite often judge quickly and give up too easily.

Joseph M. Newcomer

Sep 7, 2009, 11:48:48 AM
Being used for years is not the same as "being usable". In all cases where I have seen
MFC serialization used, there have either been serious problems or amazing complexity. In
neither case would I have referred to any of the solutions I have seen as the consequence
of a "usable" mechanism.

I know people who say that "synchronous sockets" are "usable". Then you start talking to
them and discover that they have huge problems with them, that they wouldn't have with
asynchronous sockets. Serialization works well in trivial cases, and small incremental
changes are easy, but complex structures are hard to maintain long-term, particularly when
there is massive schema evolution (read: "we changed the fundamental data
structures"). This is why, even before XML was a tool to use, I was using XML-like data
structures (as far back as Win16 in 1992). Or I would use tagged-binary representations,
which are XML-like. Cleaner growth mechanisms.

I look at something, look at its implications, look at how hard it is to do something with
it, and either use it or dismiss it. Then, for the things I dismiss, I see people
struggling with the same problems I anticipated, because they hadn't. So I try to
discourage others from going down the same garden path as those who have already
encountered problems. Note that sometimes I went down the same garden path, and got hit
by the same problems, but backtracked and adopted a cleaner solution. I do not offer
these opinions off-the-cuff, but based on experience, both mine and that of my clients.
joe

Goran

Sep 8, 2009, 3:15:39 AM
On Sep 7, 5:48 pm, Joseph M. Newcomer <newco...@flounder.com> wrote:
> Serialization works well in trivial cases, and small incremental
> chnages are easy, but complex structures are hard to maintain long-term, particularly when
> there is massive schema evolution (or, read, "We changed the fundamental data
> structures").

I beg to disagree about complex structures being hard to maintain long-term.
I think I've shown how that's done more or less easily.

A massive schema change is more difficult, but it applies to rewrites
only, so it's hopefully rare. Also, I'd wager that more often you will
find yourself wanting to replace a particular __piece__ of the whole
structure with another.

Anyhow, you still can migrate like so:

1. in "global" version BOOM-1 you have your usual serialization
2. in version BOOM, you drop all of what you want replaced --except
class definitions and Serialize functions--
3. upon loading, if the "global" version is BOOM-1 or less, load the obsolete
data, then "convert" it to the new data model
4. continue saving the new model
...
X. you decide that you don't need to support versions BOOM-1 and less,
and drop the corresponding classes from the code.

Goran.

Joseph M. Newcomer

Sep 8, 2009, 9:31:14 AM
Now do this with data structures that are very rich and which have been modified for ten
years. It is rare that you are lucky enough that your code is readable after ten years of
evolution. I find a lot of "It must be usable because..." and the story usually ends up
being "...and therefore I am the exception that proves it is usable". Yet I more often
find, with a lot of the things I call "unusable", that if I advise a client *against*
using whatever the feature is, and they ignore me and use it anyway, that about three
years down the line I am able to say "...but didn't I warn you this would happen?"
Sometimes I'm lucky and get paid to rewrite the code. Sometimes I'm very unlucky, and get
paid to rewrite the code. Mostly, they grimace and say "We should have listened, but now
we're stuck with this..." And sometimes they scrap the whole thing.

It is far easier to manage schema evolution with XML. That's why it is all I have used
for the last ten years, and before that, I used text files that looked a lot like a very
limited subset of XML.
joe

Goran

Sep 10, 2009, 3:03:55 AM
On Sep 8, 3:31 pm, Joseph M. Newcomer <newco...@flounder.com> wrote:
> Now do this with data structures that are very rich and which have been modified for ten
> years.

Hi!

I tried imagining how some sort of "tagged" data (I am guessing that,
similar to XML, you are trying to push for some sort of id<->value
mapping for any particular n-tuple of data, i.e. a struct or a class)
can help significantly more than e.g. serialization, and I can't see it.

Here's what I imagine (I am focusing on reading, because writing is
easier anyhow; I am also focusing on one n-tuple, because everything
else, e.g. containers or other sorts of composition, seems to be a
matter of putting bricks together):

If you have random access, e.g. the DOM of XML (I think that if you have
sequential access à la SAX of XML, things only get closer to
serialization), reading consists of getting out a value based on the
desired attribute, e.g.

member1 = value(attr1); member2 = value(attr2); ...

What this gives over serialization is random read order. That gives
the possibility to skip "unknown" data items without using my "unused"
idiom. In the light of small schema changes (which IMO in the
overwhelming majority are additions to the n-tuple), that doesn't buy
much, either. One can alleviate missing attributes (get_value(member,
attr), and nothing is done if there's no attribute), avoiding if
(schema>=x) {} nesting. But that comes at the price of having a bigger
"infrastructure"^^^.

Now... As for "major" overhauls, I fail to see how "tagged data"
can avoid my step 3 from up here ("upon loading, if "global" version
is BOOM-1 and less, load obsolete data, then "convert" to new data
model"). It does avoid step 2, and there is only one major
conceptual advantage I can see hidden there: the stored data is clearly
disconnected from the data model present in the code. But again, that
comes at the price of a bigger "infrastructure". It should be noted that
even with serialization, he who is smart does not tie the serialized
"data model" more than necessary to the rest of the code (incidentally,
for the most part that isn't my case :-( ).

^^^ Infrastructure which one doesn't need at all with
serialization - a big amount of stuff is already there, e.g. container
serialization, schema support, very easy low-level primitives à la >>
and << etc. What serialization also offers is easy serialization of
complicated data graphs (doing ar <</>> p for one value of p in
multiple places results in the correct p instance being read back). That's
something many are not aware of, and it comes in quite handy. Doable with
any other approach, sure, but again, a thing to write (and test).

?

Goran.

Joseph M. Newcomer

Sep 10, 2009, 9:55:06 AM
See below...

On Thu, 10 Sep 2009 00:03:55 -0700 (PDT), Goran <goran...@gmail.com> wrote:

>On Sep 8, 3:31 pm, Joseph M. Newcomer <newco...@flounder.com> wrote:
>> Now do this with data structures that are very rich and which have been modified for ten
>> years.
>
>Hi!
>
>I tried imagining how some soft of "tagged" data (I am guessing,
>similar to XML, you are trying to push for some sort of id<->value
>mapping for any particular n-tuple of data (a struct or a class) can
>help significantly better than e.g. serialization and I can't see it.

****
Because when you read a tag and the data structure does not support it, you know that you
have to reject the value (or do something else with it). In serialization, the assumption
is that the values are positional, and therefore bytes n to n+3 MUST be the int value
'count'; but under schema migration you actually have no guarantee, so you have to assume
that it is correct; if it isn't, you are screwed. Or, you have to be able to serialize by
using the structure that was defined in the earlier version, which means you now have a
"version management" problem with no tools to help you manage it; you have to manage the
many versions of your data structure from within your code structure, for example, by
knowing that in versions < 3 it was the count, but for versions >= 3 and < 5 it was the
offset, and versions >= 5 and < 9 it was the bit mask of valid data, and versions >= 9 it is
something else again. The code gets pretty ugly.
****


>
>Here's what I imagine (I am focusing on reading, because writing is
>easier anyhow; I am also focusing on one n-tuple because everything
>else, e.g. containers or other sorts of composition seem to be a
>matter of putting bricks together):
>
>If you have random access e.g. DOM of XML (I think, if you have
>sequential access à la SAX of XML, things only get closer to
>serialization), reading consists of getting out a value based on the
>desired attribute, e.g.
>
>member1 = value(attr1); member2 = value(attr2); ...
>
>What this gives over serialization is random read order. That gives
>possibility to skip "unknown" data items without using my "unused
>idiom". In the light of small schema changes (which IMO in
>overwhelming majority are additions to the n-tuple), that doesn't buy
>much, either. One can alleviate missing attributes (get_value(member,
>attr) and nothing is done if there's no attribute), avoiding if
>(schema>=x) {} nesting. But that comes at a price of having bigger
>"infrastructure"^^^.

****
If you believe all schema changes are small, fine, but I've found that every few years
there is a major schema redesign in far too many programs (hence the three-year watershed
where they realize they have painted themselves into a corner)
****


>
>Now... As far for "major" overhauls, I fail to see how "tagged data"
>can avoid my step 3 from up here ("upon loading, if "global" version
>is BOOM-1 and less, load obsolete data, then "convert" to new data
>model"). It does avoid step 2, and there is hidden only major
>conceptual advantage I can see: stored data is clearly disconnected
>from data model present in the code. But again, that comes at a price
>of having bigger "infrastructure". It should be noted that even with
>serialization, he who is smart does not tie serialized "data model"
>more than necessary with the rest of the code (incidentally, for the
>most part that isn't my case :-( ).

****
It doesn't help in the conversion, but it does result in simpler code to handle it. The
"bigger infrastructure", e.g., using XML, is a trivial cost to pay for the simplification.

Binary-tagged data works well in the case you describe. And it decouples reading from
structure, in that reading bytes n..n+3 does not tie the meaning to offset n of the
structure. Instead, you read a tag (which can actually be an integer local to the
individual structure: I am tag 4 of the XyZZy structure) and know that the next 4 bytes
after the tag are always the count. If the count-in-bytes is changed to a count-in-words
for some reason, the tag number changes. If the count is dropped, you can ignore the tag.
If the structure morphs to something completely different (it now contains a
std::vector<T> instead of a compile-time constant T[SOME_FIXED_SIZE]), it makes no
difference, because the type was not serialized by some mechanism that requires you
to know the type to deserialize it; instead, the types used in the serialization are
defined solely by the serialization. Only for external types do you use the external type
serialization (e.g., for embedded OLE objects), and even then there would be a tag saying
"I am an embedded chart", which would tell you to use the chart serialization to read it
in.

For complex graphs, we had a mechanism that allowed them to be expressed as text, and it
only had to be written and tested once. We then used it for another six years. XML has a
similar provision.
joe
****


>
>^^^ infrastructure that which one doesn't do at all with
>serialization - big amount of stuff is already there, e.g. container
>serialization, schema support, very easy low-level primitives à la >>
>and << etc. What serialization also offers is easy serialization of
>complicated data graphs (doing ar <</>> p for one value of p in
>multiple places results in correct p instance being read back). That's
>something many are not aware of and comes in quite handy. Doable with
>any other approach, sure, but again, a thing to write (and test).
>
>?
>
>Goran.
