xoxo,
Andy
diff -ruN bleadperl/pod/perlopentut.pod bleadpatch/pod/perlopentut.pod
--- bleadperl/pod/perlopentut.pod Wed Jun 12 19:02:45 2002
+++ bleadpatch/pod/perlopentut.pod Thu Sep 19 10:41:20 2002
@@ -5,7 +5,9 @@
=head1 DESCRIPTION
Perl has two simple, built-in ways to open files: the shell way for
-convenience, and the C way for precision. The choice is yours.
+convenience, and the C way for precision. The shell way also has 2- and
+3-argument forms, which have different semantics for handling the filename.
+The choice is yours.
=head1 Open E<agrave> la shell
@@ -36,7 +38,7 @@
The C<open> function takes two arguments: the first is a filehandle,
and the second is a single string comprising both what to open and how
to open it. C<open> returns true when it works, and when it fails,
-returns a false value and sets the special variable $! to reflect
+returns a false value and sets the special variable C<$!> to reflect
the system error. If the filehandle was previously opened, it will
be implicitly closed first.
@@ -56,6 +58,14 @@
A few things to notice. First, the leading less-than is optional.
If omitted, Perl assumes that you want to open the file for reading.
+Note also that the first example uses the C<||> logical operator, and the
+second uses C<or>, which has lower precedence. Using C<||> in the latter
+examples would effectively mean
+
+ open INFO, ( "< datafile" || die "can't open datafile: $!" );
+
+which is definitely not what you want.
+
The other important thing to notice is that, just as in the shell,
any white space before or after the filename is ignored. This is good,
because you wouldn't want these to do different things:
@@ -76,6 +86,40 @@
as well. For accessing files with naughty names, see
L<"Dispelling the Dweomer">.
+There is also a 3-argument version of C<open>, which lets you put the
+redirection special characters into their argument.
+
+ open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";
+
+Here, you don't have to worry about C<$datafile> containing characters
+that might influence the open mode, or whitespace at the beginning of
+the filename that would be absorbed in the 2-argument version. Plus,
+any reduction of unneccessary string interpolation is a good thing.
+
+=head2 Indirect Filehandles
+
+Until Perl 5.6, C<open>'s first argument had to be a filehandle. Now,
+we can pass an expression that is a reference to the actual filehandle,
+or an I<indirect filehandle>. The indirect filehandle is automatically
+created on open, and can be used anywhere a filehandle can.
+
+ open( my $in, $infile ) or die "Couldn't read $infile: $!";
+ while ( <$in> ) {
+ # do something
+ }
+ close $in;
+
+Indirect filehandles are critical to avoid namespace clashes, since
+filehandles are global. Two functions trying to open handle C<INFILE>
+will clash, but two functions opening C<my $infile> will not.
+
+Indirect filehandles also have the benefit of automatically closing when
+the variable dies, such as when a lexical variable goes out of scope.
+
+ sub firstline {
+ open( my $in, shift ) && return <$in>;
+ }
+
=head2 Pipe Opens
In C, when you want to open a file using the standard I/O library,
@@ -85,7 +129,7 @@
remains the same--just its argument differs.
If the leading character is a pipe symbol, C<open> starts up a new
-command and open a write-only filehandle leading into that command.
+command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input. For example:
@@ -98,7 +142,7 @@
command writes to its standard output show up on your handle for reading.
For example:
- open(NET, "netstat -i -n |") || die "can't fun netstat: $!";
+ open(NET, "netstat -i -n |") || die "can't open netstat: $!";
while (<NET>) { } # do something with input
close(NET) || die "can't close netstat: $!";
@@ -220,7 +264,7 @@
@ARGV = glob("*") unless @ARGV;
-You could even filter out all but plain, text files. This is a bit
+You could even filter out all but plain text files. This is a bit
silent, of course, and you might prefer to mention them on the way.
@ARGV = grep { -f && -T } @ARGV;
@@ -252,8 +296,8 @@
Yes, this also means that if you have a file named "-" (and so on) in
your directory, that they won't be processed as literal files by C<open>.
-You'll need to pass them as "./-" much as you would for the I<rm> program.
-Or you could use C<sysopen> as described below.
+You'll need to pass them as "./-", much as you would for the I<rm> program,
+or you could use C<sysopen> as described below.
One of the more interesting applications is to change files of a certain
name into pipes. For example, to autoprocess gzipped or compressed
@@ -310,7 +354,7 @@
C<O_DEFER>, C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>,
C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>. Consult your open(2)
manpage or its local equivalent for details. (Note: starting from
-Perl release 5.6 the O_LARGEFILE flag, if available, is automatically
+Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
added to the sysopen() flags because large files are the default.)
Here's how to use C<sysopen> to emulate the simple C<open> calls we had
@@ -503,7 +547,7 @@
That's because you opened a filehandle FH, and had read in seven records
from it. But what was the name of the file, not the handle?
-If you aren't running with C<strict refs>, or if you've turn them off
+If you aren't running with C<strict refs>, or if you've turned them off
temporarily, then all you have to do is this:
open($path, "< $path") || die "can't open $path: $!";
@@ -608,7 +652,7 @@
symbolic links, named pipes, Unix-domain sockets, and block and character
devices. Those are all files, too--just not I<plain> files. This isn't
the same issue as being a text file. Not all text files are plain files.
-Not all plain files are textfiles. That's why there are separate C<-f>
+Not all plain files are text files. That's why there are separate C<-f>
and C<-T> file tests.
To open a directory, you should use the C<opendir> function, then
@@ -622,8 +666,8 @@
closedir(DIR);
If you want to process directories recursively, it's better to use the
-File::Find module. For example, this prints out all files recursively,
-add adds a slash to their names if the file is a directory.
+File::Find module. For example, this prints out all files recursively
+and adds a slash to their names if the file is a directory.
@ARGV = qw(.) unless @ARGV;
use File::Find;
@@ -645,6 +689,8 @@
}
}
+=head2 Opening Named Pipes
+
Named pipes are a different matter. You pretend they're regular files,
but their opens will normally block until there is both a reader and
a writer. You can read more about them in L<perlipc/"Named Pipes">.
@@ -685,6 +731,8 @@
also some high-level modules on CPAN that can help you with these games.
Check out Term::ReadKey and Term::ReadLine.
+=head2 Opening Sockets
+
What else can you open? To open a connection using sockets, you won't use
one of Perl's two open functions. See
L<perlipc/"Sockets: Client/Server Communication"> for that. Here's an
@@ -752,11 +800,13 @@
Never use the existence of a file C<-e $file> as a locking indication,
because there is a race condition between the test for the existence of
-the file and its creation. Atomicity is critical.
+the file and its creation. It's possible for another process to create
+a file in the slice of time between your existence check and attempt to
+create the file. Atomicity is critical.
Perl's most portable locking interface is via the C<flock> function,
whose simplicity is emulated on systems that don't directly support it,
-such as SysV or WindowsNT. The underlying semantics may affect how
+such as SysV or Windows. The underlying semantics may affect how
it all works, so you should learn how C<flock> is implemented on your
system's port of Perl.
@@ -814,9 +864,8 @@
or die "can't truncate filename: $!";
# now write to FH
-Finally, due to the uncounted millions who cannot be dissuaded from
-wasting cycles on useless vanity devices called hit counters, here's
-how to increment a number in a file safely:
+Here's an example of safely incrementing a number in a file, as you
+might for a web page hit counter:
use Fcntl qw(:DEFAULT :flock);
@@ -856,7 +905,7 @@
=item *
-The three-(or more)-argument form of C<open()> is being used and the
+The three-(or more)-argument form of C<open> is being used and the
second argument contains something else in addition to the usual
C<< '<' >>, C<< '>' >>, C<< '>>' >>, C<< '|' >> and their variants,
for example:
@@ -865,7 +914,7 @@
=item *
-The two-argument form of C<binmode<open()> is being used, for example
+The two-argument form of C<binmode> is being used, for example
binmode($fh, ":encoding(utf16)");
--
'Andy Lester an...@petdance.com
Programmer/author petdance.com
Daddy parsley.org/quinn Jk'=~/.+/s;print((split//,$&)
[unpack'C*',"n2]3%+>\"34.'%&.'^%4+!o.'"])
> @@ -76,6 +86,40 @@
> as well. For accessing files with naughty names, see
> L<"Dispelling the Dweomer">.
>
> +There is also a 3-argument version of C<open>, which lets you put the
> +redirection special characters into their argument.
I think that should read "into their own argument".
> +
> + open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";
> +
> +Here, you don't have to worry about C<$datafile> containing characters
> +that might influence the open mode, or whitespace at the beginning of
> +the filename that would be absorbed in the 2-argument version. Plus,
> +any reduction of unneccessary string interpolation is a good thing.
That should be spelled 'unnecessary'.
> +
> +=head2 Indirect Filehandles
> +
> +Until Perl 5.6, C<open>'s first argument had to be a filehandle. Now,
> +we can pass an expression that is a reference to the actual filehandle,
> +or an I<indirect filehandle>. The indirect filehandle is automatically
> +created on open, and can be used anywhere a filehandle can.
This paragraph needs to be rewritten. open($f, $file), where $f can be a
ref to a glob or even a plain old string, has been supported in Perl for a
long time. The new feature in 5.6 is that $f can be uninitialized, in
which case a filehandle will be autovivified and placed in $f.
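For illustration, here is a minimal sketch of the two cases described above. The sub names and the handle name OLDSTYLE are made up for this example and do not come from the patch. It uses the 3-argument open, which also requires 5.6 or later.

```perl
use strict;
use warnings;

# Older style: $fh already holds a reference to a glob before open is called.
sub first_line_globref {
    my ($path) = @_;
    local *OLDSTYLE;                 # hypothetical handle name
    my $fh = \*OLDSTYLE;
    open($fh, "<", $path) or return undef;
    my $line = <$fh>;
    close $fh;
    return $line;
}

# 5.6 and later: $fh starts out undefined, so open creates a handle
# and stores a reference to it in $fh.
sub first_line_autoviv {
    my ($path) = @_;
    open(my $fh, "<", $path) or return undef;
    my $line = <$fh>;
    close $fh;
    return $line;
}
```

Both subs read the same way once the handle exists; the only difference is who creates it.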
> +
> + open( my $in, $infile ) or die "Couldn't read $infile: $!";
> + while ( <$in> ) {
> + # do something
> + }
> + close $in;
> +
> +Indirect filehandles are critical to avoid namespace clashes, since
> +filehandles are global. Two functions trying to open handle C<INFILE>
> +will clash, but two functions opening C<my $infile> will not.
I think it would be more accurate to say that filehandles are package
variables. Two functions trying to open handle INFILE will clash only if
the functions are in the same package.
That detail probably isn't necessary in the tutorial, though, so never
mind. :)
> +
> +Indirect filehandles also have the benefit of automatically closing when
> +the variable dies, such as when a lexical variable goes out of scope.
I don't think that Perl has a concept of a variable "dying". Could this be
written another way?
> @@ -220,7 +264,7 @@
>
> @ARGV = glob("*") unless @ARGV;
>
> -You could even filter out all but plain, text files. This is a bit
> +You could even filter out all but plain text files. This is a bit
> silent, of course, and you might prefer to mention them on the way.
>
> @ARGV = grep { -f && -T } @ARGV;
I think the original was correct in this case. "Plain" and "text" are two
separate attributes, corresponding to -f and -T.
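A small sketch of that point, assuming a hypothetical helper named plain_and_text (not part of the patch): it reports the two tests separately, so a plain file full of NUL bytes comes back as plain but not text.

```perl
use strict;
use warnings;

# Return two flags for $path: is it a plain file (-f), and does Perl's
# heuristic consider it a text file (-T)?
sub plain_and_text {
    my ($path) = @_;
    my $is_plain = -f $path ? 1 : 0;
    my $is_text  = -T $path ? 1 : 0;
    return ($is_plain, $is_text);
}
```

The grep { -f && -T } in the tutorial keeps only the names for which both flags would be true.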
> @@ -608,7 +652,7 @@
> symbolic links, named pipes, Unix-domain sockets, and block and character
> devices. Those are all files, too--just not I<plain> files. This isn't
> the same issue as being a text file. Not all text files are plain files.
> -Not all plain files are textfiles. That's why there are separate C<-f>
> +Not all plain files are text files. That's why there are separate C<-f>
> and C<-T> file tests.
This is the justification for the comma in the block above. :)
Ronald
Thanks for the clarification (and other notes).
I specifically left out "autovivified" as being vocabulary out of range
for a tutorial. Thoughts?
xoxo,
Andy
Hmm... Something like "A filehandle will be created for you and placed in
$f"?
Ronald
> +There is also a 3-argument version of C<open>, which lets you put the
> +redirection special characters into their argument.
I've often found it useful to start with the 3-argument form, and then
later add the 2- and 1-argument forms. Would that be something to
consider?
-- Johan
I'll second that. For a tutorial it is IMHO better to tell about the
more general and robust interface first and only later mention the
short forms that cater to laziness. "First get it right, then get it
fast" applies to tutorials too (though with a changed meaning)...
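As a rough illustration of why the 3-argument form is the more robust one, here is a sketch with a hypothetical helper called write_literal (the name is invented for this example). Because the mode is its own argument, the filename is used exactly as given.

```perl
use strict;
use warnings;

# Write $text to exactly the name given in $path. With the mode passed
# separately, nothing in $path is read as a mode character or trimmed
# as surrounding whitespace.
sub write_literal {
    my ($path, $text) = @_;
    open(my $out, ">", $path) or die "Can't create '$path': $!";
    print {$out} $text;
    close($out) or die "Can't close '$path': $!";
    return 1;
}
```

Called with a name that ends in a space, this creates a file whose name really ends in a space. The 2-argument form, open(OUT, "> $path"), would ignore that trailing whitespace and create a differently named file.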
Roland
--
RGie...@cpan.org