local $/;
my $data = <$fh>;
print $data;
prints $data at once. (it calls PerlIO_write once)
This code, which I'd think should do exactly the same:
local $/;
print <$fh>;
calls PerlIO_write 10 times, on each line. The only way I found to make it
equivalent to the slurp-into-var-and-then-print is to undef $\ as well:
local $\;
local $/;
print <$fh>;
My guess is that it's non-specific to the io system, but has to do with how
perl handles <$fh> in a slurp mode. I just used PerlIO_ layers to trace it.
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:st...@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
It is context. <$fh> is in scalar context in the first case, and array
context in the second case.
--
Glenn -- http://nevcal.com/
===========================
The best part about procrastination is that you are never bored,
because you have all kinds of things that you should be doing.
> On approximately 3/4/2004 6:36 PM, came the following characters from
> the keyboard of Stas Bekman:
>
>> let's say we have a file with 10 lines of text in it and we have $fh
>> opened to read this file. This code:
>>
>> local $/;
>> my $data = <$fh>;
>> print $data;
>>
>> prints $data at once. (it calls PerlIO_write once)
>>
>> This code, which I'd think should do exactly the same:
>>
>> local $/;
>> print <$fh>;
>>
>> calls PerlIO_write 10 times, on each line. The only way I found to
>> make it equivalent to the slurp-into-var-and-then-print is to undef $\
>> as well:
>>
>> local $\;
>> local $/;
>> print <$fh>;
>>
>> My guess is that it's non-specific to the io system, but has to do
>> with how perl handles <$fh> in a slurp mode. I just used PerlIO_
>> layers to trace it.
>
>
> It is context. <$fh> is in scalar context in the first case, and array
> context in the second case.
Of course, it is more than context, too.
Not quite. That calls <$fh> in a list context.
Sure. But since $/ is undef, in both cases you'd expect
the whole file contents to be passed in one go to print.
And print receives a one-element list in both cases,
so I'd still expect the number of PerlIO_write's to be
the same.
Yup, that has nothing to do with the context. At least it shouldn't have to.
This program should have everything in the first item of the array:
local $/;
my @data = <DATA>;
print join '%', @data;
__DATA__
1
2
3
and it does, as it prints:
1
2
3
whereas commenting out the first line, prints:
1
%2
%3
I'd expect perlio to behave the same. Looks like it ignores the local value of $/?
> My guess is that it's non-specific to the io system, but has to do
> with how perl handles <$fh> in a slurp mode. I just used PerlIO_
> layers to trace it.
can you post the code you used?
--
andreas
So the bug would be with print() rather than with <$fh> ?
Having had a look at this, in all cases having $/ undef causes the whole
file to be slurped in as a single PV. In the case of
local $/;
print <DATA>;
__END__
foo1
bar2
baz3
it shows the following:
(/tmp/p1:4) gv(main::DATA)
=> * GV()
(/tmp/p1:4) readline
=> * PV("foo1\12bar2\12baz3\12"\0)
(/tmp/p1:4) print
foo1
bar2
baz3
=> SV_YES
So print just prints a single 3-line string; however, once it gets as far as
PerlIOBuf_write() this function indirectly calls PerlIO_write once for
each line in the string. As to whether this is a good thing for it to do, I
have no opinion.
--
You live and learn (although usually you just live).
Well, on my system the following script:
local $/;
open my $fh, "/etc/hosts" or die "$!\n";
my $data = <$fh>;
print $data;
Calls PerlIO_write() once for each line in the file. I guess what determines
whether PerlIO_write() is called once or multiple times has to do with
how STDOUT is buffered. In fact, looking at PerlIOBuf_write(), it has code
along the lines of:
if (PerlIOBase(f)->flags & PERLIO_F_LINEBUF) {
    /* ... write one line at a time ... */
}
else {
    /* ... write as a single chunk ... */
}
Perhaps that's the phenomenon(*) that Stas was seeing?
Dave.
(*) Try typing that word correctly at 1:15am.
--
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
-- Things That Never Happen in "Star Trek" #10
Yes, I'm looking at that code too. I can't figure out who sets the
PERLIO_F_LINEBUF flag. It seems like a crlf layer would do that.
Andreas, I had a crash and I also did some updates while installing 5.8.4-tobe
before I came to work on this case again and something has changed. I can no
longer reproduce the case :( The original fh was coming from CGI.pm's file
upload method.
Most likely that was the case that I was seeing - some layer that called
Setlinebuf was pushed onto the STDOUT layers stack, causing the behaviour that
I saw.
perliol.pod has this entry:
---------------------
=item Setlinebuf
void (*Setlinebuf)(pTHX_ PerlIO *f);
Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.
----------------------
but it doesn't explain when this gets called by perl (I can't find any
occurrence of perl calling this callback), or when a layer should call it. I
set this callback in the :Apache layer as well (I don't call it) and now I think
I should replace it with PerlIOBase_noop_ok? Or will this break something?
OK, so it messes with the flags directly. So, what is this PerlIO tab entry
for:
Setlinebuf PerlIOBase_setlinebuf
Who calls it? I can't figure out whether I need to set this entry without
knowing when it gets called.
>>It seems like a crlf layer would do that.
>
> No, it diddles with the buffer. You still get a whole buffer's write()
> unless stream is line buffered.
Thank you.
PerlIOBuf_pushed() sets the stream to line buffered if it is attached to a tty.
This mimics the stdio "spec".
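For readers who want the stdio side of that rule spelled out, here is a minimal sketch in plain C. It is not PerlIO code: buffering_mode_for_fd() is a hypothetical helper name, and it only models the decision "line buffered for a terminal, fully buffered otherwise" using isatty().

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of PerlIO): choose a stdio buffering mode
 * for a descriptor, line buffered only when it refers to a terminal. */
int buffering_mode_for_fd(int fd)
{
    assert(fd >= 0);
    return isatty(fd) ? _IOLBF : _IOFBF;
}
```

A caller could apply the result with something like setvbuf(stdout, NULL, buffering_mode_for_fd(fileno(stdout)), BUFSIZ) before the first write to the stream.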
>It seems like a crlf layer would do that.
As far as I know, nothing. It has been there since 5.003_02's PerlIO API.
At the time a goal was to mimic stdio via a #define layer, and
some XS then called setlinebuf(), passing in what it thought was
a FILE *. Without this entry encapsulation broke and you got segfaults.
I can't find anything in //depot/maint-5.8/perl/... which uses it.
>I can't figure out whether I need to set this entry, without
>knowing when does it get called.
I think you can just use the PerlIOBase_setlinebuf to set the flag.
I left it as a Vtable entry as :stdio wanted an active hook
to call setlinebuf().
Oh, that's a bug -- it's a misunderstanding of what "line buffering"
means.
"Line buffered" doesn't mean "one write per line", it means "flush
everything before a newline along with the newline".
So, given "a\nb\nc", where there is no trailing newline, "a\nb\n"
should be written out as a single write, and "c" should be left in the
output buffer.
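As an illustration of that definition only (a sketch, not the PerlIO implementation), the number of bytes a line-buffered stream should flush is everything up to and including the last newline. linebuf_flush_len() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many leading bytes of buf should be flushed by
 * a line-buffered stream, i.e. everything up to and including the last
 * newline. Returns 0 when buf contains no newline at all. */
size_t linebuf_flush_len(const char *buf, size_t count)
{
    size_t n = count;

    assert(buf != NULL);
    /* Walk backwards until the byte just before position n is a newline. */
    while (n > 0 && buf[n - 1] != '\n')
        n--;
    return n;
}
```

For "a\nb\nc" (5 bytes) this gives 4, so "a\nb\n" goes out in one write and "c" stays in the buffer.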
--
Chip Salzenberg - a.k.a. - <ch...@pobox.com>
"I wanted to play hopscotch with the impenetrable mystery of existence,
but he stepped in a wormhole and had to go in early." // MST3K
PerlIOBuf_write currently does two writes and leaves "c" in the buffer.
The question is whether the efficiency gain of doing one write is worth
the extra housekeeping: remembering that the last \n is at offset N, and
then a memmove() to get the remaining fragment to the start of the buffer.
Patch welcome ;-)
The only change required is that the linebuffer code should look for
the *last* newline instead of the *first*. Everything else is the
same. How hard can it be? <- famous last words
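To make the first-versus-last difference concrete, here is a small model in C. It is a sketch under a stated assumption: the buffer is large enough to hold the whole string, so only newlines trigger flushes. Both function names are hypothetical and neither is PerlIO code.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the per-line loop: one flush for every newline copied. */
int flushes_per_newline(const char *buf, size_t count)
{
    int flushes = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < count; i++)
        if (buf[i] == '\n')
            flushes++;
    return flushes;
}

/* Model of the proposed change: a single flush at the last newline,
 * or none if the data contains no newline. */
int flushes_at_last_newline(const char *buf, size_t count)
{
    size_t i;

    assert(buf != NULL);
    for (i = count; i > 0; i--)
        if (buf[i - 1] == '\n')
            return 1;
    return 0;
}
```

With "a\nb\nc" the first model reports two flushes and the second reports one; in both cases "c" is left buffered.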
Neither did I. I thought maybe some 3rd party module uses it.
>>I can't figure out whether I need to set this entry, without
>>knowing when does it get called.
>
>
> I think you can just use the PerlIOBase_setlinebuf to set the flag.
> I left it as a Vtable entry as :stdio wanted an active hook
> to call setlinebuf().
So can the perliol.pod manpage be amended to explain what it is for, and what
the implications of having this flag set are? Then we don't have to repeat this
thread in the future. You've just explained it in another branch of this
thread ("writing everything up to and including the last \n, leaving the rest
in the buffer").
This patch is appropriate for both blead and maint-5.8, IMO.
(A possible further optimization is to call memchr() once to determine
whether there even *are* newlines in the target string, before going
through the whole thing backwards by hand. But given that the target
is probably a tty, there's likely no point.)
==== //depot/perl/perlio.c#247 - /u/projects/perl/current/perlio.c ====
@@ -3692,4 +3692,5 @@
     PerlIOBuf *b = PerlIOSelf(f, PerlIOBuf);
     const STDCHAR *buf = (const STDCHAR *) vbuf;
+    const STDCHAR *flushptr = buf;
     Size_t written = 0;
     if (!b->buf)
@@ -3702,30 +3703,24 @@
         }
     }
+    if (PerlIOBase(f)->flags & PERLIO_F_LINEBUF) {
+        flushptr = buf + count;
+        while (flushptr > buf && *(flushptr - 1) != '\n')
+            --flushptr;
+    }
     while (count > 0) {
         SSize_t avail = b->bufsiz - (b->ptr - b->buf);
         if ((SSize_t) count < avail)
             avail = count;
+        if (flushptr > buf && flushptr <= buf + avail)
+            avail = flushptr - buf;
         PerlIOBase(f)->flags |= PERLIO_F_WRBUF;
-        if (PerlIOBase(f)->flags & PERLIO_F_LINEBUF) {
-            while (avail > 0) {
-                int ch = *buf++;
-                *(b->ptr)++ = ch;
-                count--;
-                avail--;
-                written++;
-                if (ch == '\n') {
-                    PerlIO_flush(f);
-                    break;
-                }
-            }
-        }
-        else {
-            if (avail) {
-                Copy(buf, b->ptr, avail, STDCHAR);
-                count -= avail;
-                buf += avail;
-                written += avail;
-                b->ptr += avail;
-            }
+        if (avail) {
+            Copy(buf, b->ptr, avail, STDCHAR);
+            count -= avail;
+            buf += avail;
+            written += avail;
+            b->ptr += avail;
+            if (buf == flushptr)
+                PerlIO_flush(f);
         }
         if (b->ptr >= (b->buf + b->bufsiz))
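The memchr() pre-check mentioned alongside the patch might look roughly like the following. This is a standalone sketch with a hypothetical function name, using plain char instead of STDCHAR; it is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the prefix ending at the last newline,
 * with a single memchr() call up front to skip the backward scan when
 * the data contains no newline at all. */
size_t flush_len_with_precheck(const char *buf, size_t count)
{
    size_t n = count;

    assert(buf != NULL);
    if (memchr(buf, '\n', count) == NULL)
        return 0;               /* no newline: nothing to flush early */
    /* A newline is known to exist, so this loop must stop at n >= 1. */
    while (buf[n - 1] != '\n')
        n--;
    return n;
}
```

Whether the extra pass pays off depends on how often writes contain no newline, which is the trade-off the message above already points out.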
I don't know if this is relevant to the thread (I've not followed it closely)
but FYI the DBI uses PerlIO_setlinebuf on trace files.
Tim.
it's relevant to the point of where is it used ;)
> but FYI the DBI uses PerlIO_setlinebuf on trace files.
that's to avoid interleaving of log messages?
Partly, more so that a hard crash (segfault) won't leave much unwritten.
Tim.
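For comparison, the same idea in plain stdio (an analogy only; the DBI itself goes through PerlIO_setlinebuf, and open_trace_file() here is a hypothetical helper): marking a trace file line buffered means each completed line is handed to the OS when its newline is written, so little is lost if the process dies.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open a trace file in append mode and make the
 * stream line buffered. Returns NULL on failure. */
FILE *open_trace_file(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "a");
    if (fp == NULL)
        return NULL;
    /* setvbuf() has to be called before any other I/O on the stream. */
    if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

After fputs("connect ok\n", fp) the line should already be visible to another reader of the file, without an explicit fflush().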