
problem with AnyData perl module when converting XML to CSV


Todd S

Aug 28, 2012, 10:55:01 AM
I have a question about the AnyData Perl module when converting XML to CSV. Is there any way to retain all of the data columns even when one of the elements in the first record (the first child of the root of the XML tree) is empty? I think this might be a bug.

For example, below are a simple script that I wrote and some sample data that I copied and modified slightly from the examples on CPAN: http://search.cpan.org/~jzucker/AnyData-0.10/AnyData.pm

The 'example1b.xml' and 'example1b.csv' files illustrate exactly what the problem is.

Is there perhaps some option that can be set in the '$flags' argument passed to adConvert() that might correct this problem?

Thanks for any help or suggestions.

=========================================================================
#!/usr/bin/perl
# This script converts an XML file to CSV format.
use strict;
use warnings;

# Load AnyData and the XML modules its XML format relies on
use XML::Parser;
use XML::Twig;
use AnyData;

# Pick one input/output pair; example1b reproduces the problem.
#my $input_file_xml  = "example1a.xml";
#my $output_file_csv = "example1a.csv";

my $input_file_xml  = "example1b.xml";
my $output_file_csv = "example1b.csv";

#my $input_file_xml  = "example1c.xml";
#my $output_file_csv = "example1c.csv";

# Remove any CSV left over from a previous run
unlink $output_file_csv if -e $output_file_csv;

# Each <row> element is one record
my $flags = { record_tag => 'row' };

adConvert( 'XML', $input_file_xml, 'CSV', $output_file_csv, $flags );

=========================================================================
file = example1a.xml

<table>
<row row_id="1"><name>Joe</name><location>Seattle</location></row>
<row row_id="2"><name>Sue</name><location>Portland</location></row>
</table>


=========================================================================
converted file = example1a.csv
row_id,name,location
1,Joe,Seattle
2,Sue,Portland

=========================================================================
file = example1b.xml

<table>
<row row_id="1"><name></name><location>Seattle</location></row>
<row row_id="2"><name>Sue</name><location>Portland</location></row>
</table>

=========================================================================
converted file = example1b.csv

row_id,location <--- NOTE: the 'name' column was dropped completely, even though 'Sue' is in row #2
1,Seattle
2,Portland

I think the correct output should look as follows:

row_id,name,location
1,,Seattle
2,Sue,Portland

=========================================================================
file = example1c.xml

<table>
<row row_id="1"><name>Joe</name><location>Seattle</location></row>
<row row_id="2"><name></name><location>Portland</location></row>
</table>

=========================================================================
converted file = example1c.csv

row_id,name,location <-- All of the input data appears to be preserved in this example
1,Joe,Seattle
2,,Portland

=========================================================================
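A possible workaround, offered only as a minimal sketch and not as a fix inside AnyData: parse the XML with XML::Twig directly and build the CSV header from every row instead of only the first one. The function name xml_rows_to_csv and the small quoting helper are hypothetical, introduced here for illustration. The sketch assumes each record is a <row> element whose attributes and child elements are flat, as in the examples above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: convert XML made of <row> records to CSV text,
# collecting column names from all rows so that an empty element in
# the first row does not cause its column to be lost.
sub xml_rows_to_csv {
    my ($xml) = @_;
    my ( @columns, %seen, @records );

    my $twig = XML::Twig->new(
        twig_handlers => {
            row => sub {
                my ( $t, $row ) = @_;
                my %record;

                # Attributes such as row_id come first, in sorted order.
                my $atts = $row->atts || {};
                for my $name ( sort keys %$atts ) {
                    $record{$name} = $atts->{$name};
                    push @columns, $name unless $seen{$name}++;
                }

                # Child elements keep document order; empty ones still count.
                for my $child ( $row->children('#ELT') ) {
                    my $name = $child->tag;
                    $record{$name} = $child->text;
                    push @columns, $name unless $seen{$name}++;
                }

                push @records, \%record;
            },
        },
    );
    $twig->parse($xml);

    # Quote a field only when it contains a comma, quote, or line break.
    my $quote = sub {
        my $v = defined $_[0] ? $_[0] : '';
        if ( $v =~ /[",\r\n]/ ) {
            $v =~ s/"/""/g;
            $v = qq{"$v"};
        }
        return $v;
    };

    my @lines = ( join( ',', map { $quote->($_) } @columns ) );
    for my $rec (@records) {
        push @lines, join( ',', map { $quote->( $rec->{$_} ) } @columns );
    }
    return join( "\n", @lines ) . "\n";
}

# Same data as example1b.xml
my $xml = <<'XML';
<table>
<row row_id="1"><name></name><location>Seattle</location></row>
<row row_id="2"><name>Sue</name><location>Portland</location></row>
</table>
XML

print xml_rows_to_csv($xml);
```

With the example1b data this should give the header row_id,name,location followed by 1,,Seattle and 2,Sue,Portland. A column that first appears in a later row is appended at the end of the header, so the column order can differ from what AnyData would produce.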
