
compare XML serialized file and a normal XML file


Tony Johansson

Apr 15, 2010, 1:42:34 PM
Hi!

Below I have two blocks of data. The first block comes from three Movie
objects that have been XML serialized.
The second block is just these three movie objects in an XML file.
When I look at these two blocks of data they look almost identical;
there are only some minor differences.
So my question is simply: is an XML serialized file the same as a normal
XML file?

<?xml version="1.0"?>
<ArrayOfMovie xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Movie>
<Title>My Sister's Keeper</Title>
<RunningLength>109</RunningLength>
<ProductionYear>2009</ProductionYear>
<IsbnNumber>9100120820</IsbnNumber>
<Url>http://images.filmtipset.se/posters/86432038.jpg</Url>
</Movie>
<Movie>
<Title>Fight Club</Title>
<RunningLength>139</RunningLength>
<ProductionYear>1999</ProductionYear>
<IsbnNumber>9100123064</IsbnNumber>
<Url>http://images.filmtipset.se/posters/93394736.jpg</Url>
</Movie>
<Movie>
<Title>Star Trek</Title>
<RunningLength>127</RunningLength>
<ProductionYear>2009</ProductionYear>
<IsbnNumber>9172321385</IsbnNumber>
<Url>http://images.filmtipset.se/posters/83532544.jpg</Url>
</Movie>
</ArrayOfMovie>


//This is a normal XML file consisting of three Movie objects.
<?xml version="1.0" encoding="utf-8" ?>
- <movies>
- <movie>
<title>My Sister's Keeper</title>
<runningLength>109</runningLength>
<productionYear>2009</productionYear>
<isbn>9100120820</isbn>
<url>http://images.filmtipset.se/posters/86432038.jpg</url>
</movie>
- <movie>
<title>Fight Club</title>
<runningLength>139</runningLength>
<productionYear>1999</productionYear>
<isbn>9100123064</isbn>
<url>http://images.filmtipset.se/posters/93394736.jpg</url>
</movie>
- <movie>
<title>Star Trek</title>
<runningLength>127</runningLength>
<productionYear>2009</productionYear>
<isbn>9172321385</isbn>
<url>http://images.filmtipset.se/posters/83532544.jpg</url>
</movie>
</movies>


Martin Honnen

Apr 15, 2010, 1:54:02 PM
Tony Johansson wrote:

> Below I have two blocks of data.the first block is from 3 Movie objects that
> have been XML serialized.
> The second block of data is just these three movie object in an XML file.
> I just wonder when I look at these two blocks of data they look almost
> identical.. There are some minor differences.
> So my question is simply is a XML serialized file the same as an normal XML
> file ?

Define "normal XML file".

Hopefully any XML serialization creates a well-formed XML document if
that is what you consider "normal".

With .NET's XmlSerializer you can apply certain attributes to your class
and member definitions to specify, for instance, the element or attribute
name that a class or member is mapped to; see
http://msdn.microsoft.com/en-us/library/83y7df3e(v=VS.90).aspx
That way you can get the element names you had in the sample you called
normal.


--

Martin Honnen --- MVP Data Platform Development
http://msmvps.com/blogs/martin_honnen/

Tony Johansson

Apr 15, 2010, 2:02:39 PM
"Martin Honnen" <maho...@yahoo.de> wrote in message
news:ulXsnSM3...@TK2MSFTNGP06.phx.gbl...

By a normal XML file I mean one that I create myself, for example by
using VS to create an XML file and entering the data into it.
//Tony


Martin Honnen

Apr 15, 2010, 2:17:44 PM
Tony Johansson wrote:

> With normal XML file I mean if I for example use VS create XML file and
> enter the data into the file.

I think the main differences between your samples are the letter case of
the element names and the exact terms used in them. I don't
see why e.g.
<isbn>9172321385</isbn>
is "normal" and e.g.
<IsbnNumber>9172321385</IsbnNumber>
is not "normal".
And the XmlSerializer (by default) always emits two namespace
declarations on the root element, in case the namespaces might be used
deeper in the document with attributes like xsi:type.

Tony Johansson

Apr 15, 2010, 2:35:38 PM
"Martin Honnen" <maho...@yahoo.de> wrote in message
news:ucE92fM3...@TK2MSFTNGP06.phx.gbl...

Sorry, it was a typo; it should be IsbnNumber!

//tony


Alan Meyer

Apr 15, 2010, 7:54:20 PM

On 4/15/2010 1:42 PM, Tony Johansson wrote:
> Hi!
>
> Below I have two blocks of data.the first block is from 3 Movie objects that
> have been XML serialized.
> The second block of data is just these three movie object in an XML file.

I'm not sure what you mean by "normal". Both of your XML objects
are serialized XML. "Serialized" just means that the XML is in
the form of a text stream, not a collection of nodes in some
internal format like DOM, ElementTree, or whatever.
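
To make the distinction concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module (Python is my choice for illustration; it does not appear in the original question). It parses a small serialized document into an in-memory tree and then serializes the tree back to text. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Serialized form: plain text with angle-bracket markup.
serialized = (
    "<movie><title>Fight Club</title>"
    "<runningLength>139</runningLength></movie>"
)

# Parsing turns the text into an in-memory tree of Element objects.
root = ET.fromstring(serialized)
title = root.find("title").text

# Serializing turns the tree back into a text stream.
round_tripped = ET.tostring(root, encoding="unicode")

print(title)
print(round_tripped)
```

The tree in the middle is an implementation detail of the library; only the text at either end is the standardized XML form.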

Serialized XML is standardized. It delimits tags with angle
brackets, it has a specific way of representing attributes,
namespaces, comments, processing instructions, text, etc.

Other formats, such as a DOM tree, do not have any standard
representation. A DOM tree has a standard interface, but the
data is represented however the implementer wants to represent it
internally. It's almost certainly different for every DOM
implementation.

> I just wonder when I look at these two blocks of data they look almost
> identical.. There are some minor differences.
> So my question is simply is a XML serialized file the same as an normal XML
> file ?

Answer = Yes - if by "normal" you mean "serialized" :^)

Less sarcastically, if by "normal" you mean what you showed us,
the answer is still, almost, Yes. But see the comment on
hyphens below.

> //This is a normal XML file consisting of three Movie objects.
> <?xml version="1.0" encoding="utf-8" ?>
> -<movies>
> -<movie>

...

Actually, this is serialized XML with an illegal character in it,
namely the first hyphen. It looks like something you copied from
an Internet Explorer screen, not something that came straight
from a text file of legal XML. The other hyphens are not
necessarily illegal, but I bet you don't really intend for them
to be part of the XML. I bet that they're IE artifacts.

Also, this XML document does not contain three "Movie" objects.
It contains three "movie" objects. Case is significant.
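
Both points can be checked with a parser. The following is a small sketch in Python's standard xml.etree.ElementTree module (my choice of tool, not something from the original post), using shortened stand-ins for the posted data. It shows that text before the root element is rejected, that a hyphen inside the root is accepted as ordinary text, and that element names are matched case-sensitively.

```python
import xml.etree.ElementTree as ET

# A hyphen before the root element is not well-formed XML.
try:
    ET.fromstring("- <movies><movie/></movies>")
    stray_hyphen_rejected = False
except ET.ParseError:
    stray_hyphen_rejected = True

# A hyphen inside the root is legal, but it becomes part of the content.
inner = ET.fromstring("<movies>- <movie/></movies>")
inner_text = inner.text

# Element names are case-sensitive: "movie" and "Movie" are different names.
root = ET.fromstring("<movies><movie/><movie/><movie/></movies>")
lower_count = len(root.findall("movie"))
upper_count = len(root.findall("Movie"))

print(stray_hyphen_rejected, repr(inner_text), lower_count, upper_count)
```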

What you need to do is find a good beginner reference book or
tutorial on XML. There are some on the Internet.

Now having said that, I will also say that your original problem
of how to compare two XML documents is not trivial to solve. A
text comparison using a tool like "diff" or "fc" doesn't work
because two documents that are identical from an XML point of
view may differ from a text point of view due to line breaks,
spaces, character entities, single vs. double quotes, etc.
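
As a rough illustration of the problem, the sketch below uses xml.etree.ElementTree.canonicalize from Python's standard library (available from Python 3.8). The two documents are hypothetical; the id and year attributes are invented only to show the quoting and ordering differences and are not part of the posted data.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents with the same XML content but different text:
# quote style, attribute order, a character reference, and line breaks.
doc_a = '<movie id="1" year="1999"><title>Fight Club</title></movie>'
doc_b = """<movie year='1999' id='1'>
  <title>Fight&#32;Club</title>
</movie>"""

# A plain string comparison sees two different texts.
text_equal = doc_a == doc_b

# Canonical XML (C14N) normalizes quoting, attribute order and character
# references; strip_text=True also drops whitespace around text content.
canon_a = ET.canonicalize(doc_a, strip_text=True)
canon_b = ET.canonicalize(doc_b, strip_text=True)
xml_equal = canon_a == canon_b

print(text_equal, xml_equal)
```

Note that strip_text=True treats leading and trailing whitespace as insignificant, which is an assumption that suits data like this but not every document.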

I know of two approaches. One is to get a specialized XML
comparator such as Altova's "DiffDog". There are some open
source ones, but I don't know which ones actually work and/or
are currently maintained.

A second approach is to pass each of the two documents through an
indent formatter that eliminates the non-significant differences
between the docs, then pass the output of that to a textual diff
program. That works but can be harder to use.
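
One possible shape for this second approach, sketched with Python's standard library only (canonicalize needs Python 3.8, ET.indent needs 3.9). The helper name normalize and the two shortened sample documents are hypothetical stand-ins; any formatter that produces consistent output would serve the same purpose.

```python
import difflib
import xml.etree.ElementTree as ET

def normalize(xml_text):
    """Return the document as a list of consistently indented lines."""
    # Iron out quoting, entity and whitespace-only differences first.
    canonical = ET.canonicalize(xml_text, strip_text=True)
    root = ET.fromstring(canonical)
    ET.indent(root)  # one element per line, two-space indentation
    return ET.tostring(root, encoding="unicode").splitlines()

left = "<movies><movie><title>Fight Club</title><isbn>9100123064</isbn></movie></movies>"
right = """<movies>
      <movie>
  <title>Fight Club</title>
  <IsbnNumber>9100123064</IsbnNumber>
      </movie>
</movies>"""

diff = list(difflib.unified_diff(normalize(left), normalize(right), lineterm=""))

# Keep only the changed lines, skipping the "---" / "+++" header lines.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

After normalization the layout differences disappear and the diff reports only the renamed element.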

A third approach, of course, is to write your own program. But
it looks like you're not ready for that yet.
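
For what it is worth, a very small version of such a program might look like the sketch below, again in Python with the standard library. The function name same_xml is hypothetical, and the rules it applies (child order matters, surrounding whitespace does not) are assumptions that a real comparator would need to make configurable.

```python
import xml.etree.ElementTree as ET

def same_xml(a, b):
    """Compare two elements by tag, attributes, trimmed text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Assumption: leading and trailing whitespace is not significant.
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    # Assumption: children must appear in the same order.
    return all(same_xml(x, y) for x, y in zip(a, b))

compact = ET.fromstring("<movies><movie><title>Fight Club</title></movie></movies>")
pretty = ET.fromstring("""<movies>
  <movie>
    <title>Fight Club</title>
  </movie>
</movies>""")
renamed = ET.fromstring("<movies><Movie><title>Fight Club</title></Movie></movies>")

layout_only = same_xml(compact, pretty)
case_differs = same_xml(compact, renamed)
print(layout_only, case_differs)
```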

Alan
