ex:
<-- Title -->
<-- Date -->
...
info to parse
...
> While reading a file, ignoring empty lines seems pretty
> straightforward, but how can I ignore a line that starts with
> something like <-- which I use for comments.
The standard way of ignoring some lines while processing a file would
be something like:
    while (<>) {
        next if <some condition>;
        # process line
    }
So you'll just have to come up with a condition matching the lines you
want to ignore. I would probably just use a regular expression.
//Makholm
What about
perl -lne 'print unless /^\s*($|<--.*-->)/'
- HTH atul
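For anyone who would rather see that one-liner as a small script, here is a rough sketch. The helper name is_skippable and the sample lines are made up for illustration (the sample lines are modelled on the example at the top of the thread); only the regular expression comes from the one-liner, with the group made non-capturing.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line is blank (or whitespace only)
# or starts, after optional whitespace, with a <-- ... --> comment.
sub is_skippable {
    my ($line) = @_;
    return $line =~ /^\s*(?:$|<--.*-->)/ ? 1 : 0;
}

# Made-up sample input.
my @lines = (
    "<-- Title -->\n",
    "\n",
    "   \n",
    "info to parse\n",
    "  <-- Date -->\n",
);

# Keep only the lines that are neither blank nor comments.
my @kept = grep { !is_skippable($_) } @lines;
print @kept;
```

With this input only the "info to parse" line should survive the grep.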
Actually, I changed my CSV files to make it easier. Now I just want
to skip a line if it is blank or starts with a # for a comment.
I understand the next usage; I'm just trying to understand the
expressions like /^s*...
That's a regular expression, applied to the line you just
read for matching. If you don't know about regular expressions,
you should get yourself a good book about Perl (or e.g.
"Mastering Regular Expressions" by Friedl, which covers the
use of regular expressions not only in Perl but also in several
other languages and gives some theoretical background). They're
much too wide a topic to be covered by a news posting, but they
are one of the big strengths of Perl and you will find them in
nearly every Perl program you look at.
But to give you an idea here's what you would need if you want to
skip blank lines or lines that have a '#' at the start (or after
one or more white-space characters, i.e. spaces, tabs etc.):
    while ( <$f> ) {
        next if /^\s*($|#)/;
        do_something_with_the_line( $_ );
    }
The stuff in between the slashes is the regular expression,
applied to try to match the default argument '$_', which in
this case is the line you just read in from the file.
The '^' at the start says that the match has to start at
the very beginning of the line. The '\s*' matches an unspecified
number of white-space characters (between 0 and as many as
there are). The '$|#' means: either the end of the line ('$')
or a '#' character, with the '|' being the "or" operator. Thus
the whole line can be read as: next if the line just read
starts with zero or more white-space characters, followed by
the end of the line or a '#' character.
The parentheses around the '$|#' are necessary because '|' has
lower precedence than anything else in the pattern, so
/^\s*$|#/
would mean: match if the line either contains just zero or more
white-space characters (i.e. it's a blank line) or if there's a
'#' to be found anywhere within the line.
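To see the difference between the two patterns in practice, something like the following sketch can help. The sub names skip_grouped and skip_ungrouped and the sample lines are invented for this illustration; the two patterns are the ones discussed above.

```perl
use strict;
use warnings;

# Pattern with parentheses: blank line, or '#' as the first
# non-white-space character.
sub skip_grouped {
    my ($line) = @_;
    return $line =~ /^\s*($|#)/ ? 1 : 0;
}

# Pattern without parentheses: blank line, or a '#' anywhere.
sub skip_ungrouped {
    my ($line) = @_;
    return $line =~ /^\s*$|#/ ? 1 : 0;
}

# Made-up sample lines; the last one is data with a trailing '#'.
my @samples = ( "# a comment", "   ", "value = 1  # trailing note" );

for my $line (@samples) {
    printf "%-30s grouped: %d  ungrouped: %d\n",
        "'$line'", skip_grouped($line), skip_ungrouped($line);
}
```

Both versions agree on the comment line and the blank line, but only the version without parentheses would also throw away the data line, because it contains a '#' somewhere after the data.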
Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de
> while ( <$f> ) {
> next if /^\s*($|#)/;
next if /^\s*(?:$|#)/;
There is no need to capture anything.
> do_something_with_the_line( $_ );
I also prefer to use lexical variables if I need to pass anything
further to another routine (just to avoid any possible action-at-a-
distance effects).
    while ( my $line = <$f> ) {
        next if $line =~ /^\s+$/;
        next if $line =~ /^\s*(?:$|#)/;
        do_something( $line );
    }
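A self-contained way to try a loop like this without a real file is to open a filehandle on a string. This is only a sketch: the sample data, the @records array and the chomp/push step standing in for do_something() are assumptions made for the example. It uses just the second test, since any line matched by /^\s+$/ is also matched by /^\s*(?:$|#)/.

```perl
use strict;
use warnings;

# Made-up CSV-like data standing in for the real file.
my $data = "# header comment\n\nname,value\n   \n  # indented comment\nfoo,1\n";

# Open a read-only, in-memory filehandle on the string.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
while ( my $line = <$fh> ) {
    next if $line =~ /^\s*(?:$|#)/;    # blank line or '#' comment
    chomp $line;
    push @records, $line;              # stand-in for do_something( $line )
}
close $fh;

print "$_\n" for @records;
```

With this data only the two lines "name,value" and "foo,1" should end up in @records.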
<great explanation snipped for brevity>
Sinan
--
A. Sinan Unur <1u...@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
> I understand the next usage; I'm just trying to understand the
> expressions like /^s*...
For quick/in-depth reference on regular expressions have a look at these
documents provided with your Perl distribution:
perldoc perlrequick
perldoc perlretut
perldoc perlreref
where 'perldoc' is the command you type at the command prompt and
'perlre...' is the document to be displayed.
hth, Hartmut
--
------------------------------------------------
Hartmut Camphausen h.camp[bei]textix[punkt]de