
Perl for splitting vcf files (palm->iPod)


Michael Robbins

Jan 2, 2003, 8:02:44 AM
Palm software outputs a vcf file that contains multiple records, with
spaces in between but my iPod won't accept that.

I must remove the spaces and break up the file into pieces.

I am not very good at Perl and I was hoping you guys could give me
some suggestions.

I plan to post the finished code on the iPod website so I was hoping
to make it more complete than what I would make for myself.


I haven't tested this, but this is kind of what I was thinking about:

$pathname="d:\\xfer\\";
$sourcefilename="Palm20021206.vcf";
$tempfilename="temp.vcf";
$begintoken="BEGIN:VCARD";
$endtoken="END:VCARD";
$nametoken="FN:";

open(SOURCE, "< $pathname$sourcefilename")
    or die "Couldn't open $sourcefilename for reading: $!";
while (<SOURCE>) {
    if (/$begintoken/ .. /$endtoken/) {
        # line falls between begin and end, inclusive
        if (/$begintoken/) {
            # start of a new card: open a fresh temp file
            open(SINK, "> $pathname$tempfilename")
                or die "Couldn't open $tempfilename for writing: $!";
            $sinkfilename="";
        } #if
        print SINK $_ or die "can't write $tempfilename: $!";
        # remember the formatted name (without the line ending) for the filename
        $sinkfilename="$1.vcf" if (/^$nametoken(.*?)\r?$/);
        if (/$endtoken/) {
            # TO DO: What if a file by that name already exists?
            # or if there is no FN?
            # John Doe1, John Doe2, ...
            close(SINK) or die "couldn't close $tempfilename: $!";
            rename("$pathname$tempfilename","$pathname$sinkfilename")
                or warn "couldn't rename $tempfilename to $sinkfilename: $!";
        } # if
    } # if
} # while (<SOURCE>)
close(SOURCE) or die "couldn't close $sourcefilename: $!";

John W. Krahn

Jan 2, 2003, 8:32:15 PM


If the records are separated by blank lines you can use paragraph mode to read each record.

#!/usr/bin/perl -w
use strict;
# vcard 2.1 - rfc2425,rfc2426

my $pathname = 'd:/xfer';
my $sourcefilename = 'Palm20021206.vcf';

$/ = ''; # set paragraph mode
open SOURCE, "< $pathname/$sourcefilename"
    or die "Couldn't open $sourcefilename for reading: $!";
while ( <SOURCE> ) {
    chomp;
    my $sinkfilename;
    if ( /^(fn[;:].+)/im ) {
        ( undef, $sinkfilename ) = split /(?<!\\):/, $1, 2;
    }
    elsif ( /^(n[;:].+)/im ) {
        ( undef, $sinkfilename ) = split /(?<!\\):/, $1, 2;
        # n: field is "lastname;firstname"
        # change to "firstname lastname"
        $sinkfilename = join ' ', reverse split /(?<!\\);/, $sinkfilename;
    }
    my $count = '';
    if ( -e "$pathname/$sinkfilename" ) {
        1 while -e "$pathname/$sinkfilename" . ++$count;
    }
    open SINK, "> $pathname/$sinkfilename$count"
        or die "Couldn't open $sinkfilename$count for writing: $!";
    print SINK "$_\n" or die "can't write $sinkfilename$count: $!";
    close SINK or die "couldn't close $sinkfilename$count: $!";
}
close SOURCE or die "couldn't close $sourcefilename: $!";

__END__

John
--
use Perl;
program
fulfillment

Benjamin Goldberg

Jan 2, 2003, 9:18:08 PM
Michael Robbins wrote:
>
> Palm software outputs a vcf file that contains multiple records, with
> spaces in between but my iPod won't accept that.
>
> I must remove the spaces and break up the file into pieces.
>
> I am not very good at Perl and I was hoping you guys could give me
> some suggestions.

My first suggestion is for you to read RFC 2426, to know the precise
format for vcards.

Then, (after you see how much there is to do to *properly* process
vcards), look on CPAN to see if anyone else has done it. The module
XML::SAXDriver::vCard looks fairly promising.

If you don't want to use that, then consider using Parse::RecDescent and
writing a grammar for vcards.

--
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);

Michael Robbins

Jan 3, 2003, 11:43:58 AM
> My first suggestion is for you to read RFC 2426, to know the precise
> format for vcards.
>
> Then, (after you see how much there is to do to *properly* process
> vcards), look on CPAN to see if anyone else has done it. The module
> XML::SAXDriver::vCard looks fairly promising.
>
> If you don't want to use that, then consider using Parse::RecDescent and
> writing a grammar for vcards.

Thank you. Those are excellent suggestions, but I don't really need
to process the vcards that thoroughly.

For this little task of splitting a large vcard file into many small
ones, all I need to know is where the card starts and where it ends.
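
A minimal sketch of that idea, assuming every card is delimited by a BEGIN:VCARD line and an END:VCARD line. The helper name split_vcards is hypothetical (it is not from any module), and the sketch only collects the cards in memory rather than writing them to files:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of cards, each one a string that
# runs from its BEGIN:VCARD line through its END:VCARD line.
sub split_vcards {
    my ($text) = @_;
    my @cards;
    my $current = '';
    my $in_card = 0;
    for my $line ( split /^/m, $text ) {    # keeps each line's newline
        $in_card = 1 if $line =~ /^BEGIN:VCARD/i;
        next unless $in_card;               # skip anything between cards
        $current .= $line;
        if ( $line =~ /^END:VCARD/i ) {
            push @cards, $current;
            $current = '';
            $in_card = 0;
        }
    }
    return @cards;
}

my $sample = "BEGIN:VCARD\nFN:John Doe\nEND:VCARD\n\n"
           . "BEGIN:VCARD\nFN:Jane Roe\nEND:VCARD\n";
my @cards = split_vcards($sample);
print scalar(@cards), " cards found\n";
```

Because it looks only at the two marker lines, this does not depend on whether the records are separated by blank lines, spaces, or nothing at all.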

Michael Robbins

Jan 6, 2003, 7:31:23 AM
> If the records are separated by blank lines you can use paragraph mode to read each record.

<snip>

Thank you very much. It works nicely, and it's much more Perl-like.

I just added

$sinkfilename =~ s/[^\w~,\- ]//g;

before the line that reads

my $count = '';

in order to avoid characters that the OS didn't like.
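
To show what that substitution keeps and drops, here is a small sketch. The wrapper name sanitize_name and the sample names are made up for illustration; only the regular expression comes from the line above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub sanitize_name {
    my ($name) = @_;
    # Keep word characters, tilde, comma, hyphen and space; drop the rest.
    $name =~ s/[^\w~,\- ]//g;
    return $name;
}

# Made-up sample names, printed before and after cleaning.
for my $raw ( 'John Doe', 'Doe, John "JD" <j/d>?', 'A*B:C' ) {
    printf "%-24s => %s\n", $raw, sanitize_name($raw);
}
```

A name made up entirely of rejected characters comes out empty, so a real script would probably want a fallback filename for that case.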
