# Does Perl have known memory leaks?

### Sean Casey

Nov 5, 1991, 10:49:07 AM
I've been having trouble running some perl scripts on a DOS machine
(the dos port of perl 4) because it runs out of memory mid-run. Thing
is, it's just iterating on a very large file. I'm not allocating new
space or doing anything that would obviously tie up memory. When I
looked on the Unix box (also perl 4), sure enough, the same script was
using well over a megabyte of memory.

Does perl have known memory leaks? Does scanning for patterns perform
mallocs in perl that aren't ever freed?

Sean

--
Sean Casey |Wind, waves, etc. are breakdowns in the face of the
se...@s.ms.uky.edu | commitment to getting from here to there. But they are the
U of KY, Lexington| conditions for sailing -- not something to be gotten rid
606-258-6000 x280 | of, but something to be danced with.''

### Tom Christiansen

Nov 5, 1991, 12:30:48 PM
From the keyboard of se...@ms.uky.edu (Sean Casey):
:Does perl have known memory leaks? Does scanning for patterns perform
:mallocs in perl that aren't ever freed?

Perl has in the past had memory leaks. It may still, especially in
a DOS implementation where Larry has much less chance to test it.
But be careful of things like this:

    while (<>) {
        local($pattern) = &bar;
    }

Putting the local in the loop would trigger behavior resembling
a memory leak.

--tom

### Sean Casey

Nov 6, 1991, 8:43:24 AM

Tom Christiansen <tch...@convex.COM> writes:

| while (<>) {
|     local($pattern) = &bar;
| }

|Putting the local in the loop would trigger behavior resembling
|a memory leak.

That's exactly what I'm doing. Hmmm I'll try moving to global
variables and see if it gets better.

### Jurgen Botz

Nov 6, 1991, 3:31:39 PM
In article <1991Nov05....@convex.com> tch...@convex.COM (Tom Christiansen) writes:
>But be careful of things like this:
>
> while (<>) {
>     local($pattern) = &bar;
> }
>
>Putting the local in the loop would trigger behavior resembling
>a memory leak.

Why?
--
Jurgen Botz                 | Internet: JB...@mtholyoke.edu
Academic Systems Consultant | Bitnet: JB...@mhc.bitnet
Mount Holyoke College       | Voice: (US) 413-538-2375 (daytime)
South Hadley, MA, USA       | Snail Mail: J. Botz, 01075-0629

### Tom Christiansen

Nov 6, 1991, 5:12:47 PM

From the keyboard of jb...@mtholyoke.edu (Jurgen Botz):
:In article <1991Nov05....@convex.com> tch...@convex.COM (Tom Christiansen) writes:
:>But be careful of things like this:
:>
:> while (<>) {
:>     local($pattern) = &bar;
:> }
:>
:>Putting the local in the loop would trigger behavior resembling
:>a memory leak.
:
:Why?

Because local() is *NOT* a declaration!! It's an executable statement
that pushes a new value on a stack for that symbol. The
previous value will be restored when the local()'s scope is exited,
which in this case is the enclosing while loop. That means

    for (1..1000) {
        local($foo);
    }

Gives you 1000 ${foo}s before you're done. If you read the man page
closely under local() it says:

Note that local() is a run-time command, and so gets
executed every time through a loop, using up more
stack storage each time until it's all released at
once when the loop is exited.

--tom
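
As a rough illustration of the stack behavior described above, here is a small model of it. This is only a sketch: `@save_stack`, `sim_local`, and `sim_exit_scope` are hypothetical names invented for the example, not interpreter internals, and the model follows the Perl 4 behavior described in this post (later Perl versions may unwind differently).

```perl
use strict;
use warnings;

# Hypothetical model of a save stack; not the real interpreter internals.
my @save_stack;
my $foo = 'global';

# Mimic local($foo): push the current value, then start with a fresh one.
sub sim_local {
    push @save_stack, $foo;
    $foo = undef;
}

# Mimic leaving the enclosing scope: pop every saved value in turn,
# so the oldest (original) value is the one left in $foo.
sub sim_exit_scope {
    $foo = pop @save_stack while @save_stack;
}

# The loop body "executes local" once per iteration.
sim_local() for 1 .. 1000;
my $depth_in_loop = scalar @save_stack;

# Only when the loop is exited is everything released at once.
sim_exit_scope();
my $depth_after = scalar @save_stack;

print "saved values held while looping: $depth_in_loop\n";
print "saved values held after exit:    $depth_after\n";
print "value of \$foo after exit:        $foo\n";
```

In this model the stack grows by one entry per iteration and is emptied only at scope exit, which is why a long-running loop can look like a leak even though nothing is permanently lost.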

### Larry Wall

Nov 6, 1991, 10:51:47 AM
In article <1991Nov5.1...@ms.uky.edu> se...@ms.uky.edu (Sean Casey) writes:
: I've been having trouble running some perl scripts on a DOS machine
: (the dos port of perl 4) because it runs out of memory mid-run. Thing
: is, it's just iterating on a very large file. I'm not allocating new
: space or doing anything that would obviously tie up memory. When I
: looked on the Unix box (also perl 4), sure enough, the same script was
: using well over a megabyte of memory.
:
: Does perl have known memory leaks? Does scanning for patterns perform
: mallocs in perl that aren't ever freed?

A couple of small leaks are fixed in 4.018. One involved foreach loops
with null lists, and the other involved local(*FILEHANDLE).

You can test for leaks by compiling with -DLEAKTEST, inserting

    warn "New allocations:\n" unless $counter++ % 100;

in the middle of your loop, and running your script with -D4096, which
makes the warn command dump out a list of newly allocated memory. The
numbers reported correspond to arguments to the New() macro or, in the
700's, the Str_new() macro.

If it's not a leak, look in the Space Efficiency section of Chapter 7.

It would be fun to write a profiler that added up the amount of memory
allocated for each statement.

Larry

### Bernie Cosell

Nov 9, 1991, 2:46:17 PM

tch...@convex.COM (Tom Christiansen) writes:

}From the keyboard of jb...@mtholyoke.edu (Jurgen Botz):
}:In article <1991Nov05....@convex.com> tch...@convex.COM (Tom Christiansen) writes:
}:>But be careful of things like this:
}:>
}:> while (<>) {
}:>     local($pattern) = &bar;
}:> }
}:>
}:>Putting the local in the loop would trigger behavior resembling
}:>a memory leak.
}:
}:Why?

}Because local() is *NOT* a declaration!! It's an executable statement
}that pushes a new value on a stack for that symbol. The
}previous value will be restored when the local()'s scope is exited,
}which in this case is the enclosing while loop.

Ah --- an interesting and subtle bit of perl's semantics. You
slightly misstate the underlying cause of the problem: it is not whether
'local' is executable or a declaration, but rather that perl has a
rather unusual notion of "scope". It is something which I find *VERY*
peculiar [in fact, I'd almost call it a bug]: most languages define
'scope' quite a bit differently from the way perl does in this case,
and so most 'normal' experience with scoping rules would lead you to
believe that the scope of the 'local' was between the {}'s and *not*
the entire loop. This notion of how scoping works is quite singular to
Perl --- I can't think of another language or programming environment
in which it works the way it does here.

/Bernie\

### Tom Christiansen

Nov 9, 1991, 5:42:42 PM
Perl's scoping of local()s is dynamic. Other languages
have dynamic scoping as well, although most compiled ones
use static scoping.

Try to remember that local() is not a declaration, and to
deeply understand what it is doing.

--tom

### Bernie Cosell

Nov 11, 1991, 10:39:01 AM
tch...@convex.COM (Tom Christiansen) writes:

}Perl's scoping of local()s is dynamic. Other languages
}have dynamic scoping as well, although most compiled ones
}use static scoping.

FOO! This is an irrelevancy. I understand dynamic vs static scoping
perfectly well, thank you very much, and it is wholly irrelevant to the
matter. The question here is where the scoping boundaries *ARE*, not
how names are associated with values.

}Try to remember that local() is not a declaration, and to
}deeply understand what it is doing.

I'm insulted by the condescension implicit in the dismissal of my
assertion that perl is a bit off in this regard as being attributable
to my lack of 'deep understanding'. What nonsense. And the stuff
about declaration versus executable status is equally nonsense. The
matter at issue is where the scope boundaries are, and whether perl
has, for whatever reason, chosen to put scope boundaries in a VERY
unusual place [one that is sufficiently unusual, I think, to be at the
least labeled an anomaly of the language, if not an actual misfeature
of the design].

We turn to the camel book, and in the entry for local() we read:

This function declares (sic) the listed variables to be local to the
enclosing block, subroutine, or {\bf eval}... This operator works
by saving the current values ... and restoring them upon exiting the
block, subroutine or eval.

Now we turn to the section on Compound Statements:

A sequence of statements may be grouped together by enclosing them in
curly brackets ({}). We will call this a BLOCK. ..

The following compound statements may be used to control flow:
...
LABEL: while (EXPR) BLOCK
...

Now, I've programmed in upwards of 100 languages over the years, and I
had no trouble reaching a 'deep understanding' of that: assuming that
perl used *anything* like the normal definitions for those words, then
you'd expect the local() to be undone at every iteration of the loop,
just as it does in every OTHER language with this general syntax.

The idea that perl gratuitously folds the compound statement to be
*INSIDE* the block, despite the clear definition that shows the block to
be inside the compound statement, and not vice versa, is just
perverse. *NO* language I know of does such a thing, precisely because
it almost hopelessly muddies the definition of 'block'.

In fact, the more I think about it, the more it is becoming crystal
clear to me that this is nothing more than a bug in perl: the handling
of local() is in error because it violates the proper and normal [not to
mention defined-by-perl] semantics of "block". Period.

/Bernie\

### Felix Lee

Nov 11, 1991, 12:25:34 PM
Re: local() statement inside a block,

Bernie Cosell writes:
>The idea that perl gratuitously folds the compound statement to be
>*INSIDE* the block, despite the clear definition that shows the block to
>be inside the compound statement, and not vice versa, is just
>perverse.

Okay. Here's a different explanation. A 'block' is a dynamic object
that takes considerable effort to create. If a loop were to create
and destroy a block on every iteration, it would be much, much slower,
so instead, loops create their block just once and repeatedly execute
it. (This is not really any different from how other languages treat
blocks, except many languages have static block objects instead.)

Now this loop optimization interacts poorly with the fact that 'local'
is an executable statement, not a declaration. In nearly every other
language, the equivalent to local is a declaration, so it happens just
once, when the block is created. But in perl, the local happens every
time the block is executed.

This is merely a flaw in implementation. It makes no difference to
the meaning of the program whether a local() in a loop consumes
megabytes of memory or not.

One way to fix this is to turn local into a declaration, but this may
break code like this:
    $x = 'local($y)';
    { eval $x; $y = 3; }
Does this matter? Perhaps not.

Another way to fix this is to note which block local variables belong
to. If you execute another local for the same variable in the same
block, you can reclaim the old version since it will never be
accessible again. Or, equivalently, you could make the local a noop
and just reuse the same slot.
--
Felix Lee fl...@cs.psu.edu

### Bob Kerns

Nov 12, 1991, 2:18:10 AM
In article <kht90...@cosell.bbn.com>, cos...@cosell.bbn.com (Bernie Cosell) writes:

> I'm insulted by the condescension implicit in the dismissal of my
> assertion that perl is a bit off in this regard as being attributable
> to my lack of 'deep understanding'.

Now, now, I don't think Tom was being particularly condescending,
although I can see how it might have felt that way. He merely
couldn't see your model, and thus misjudged where you were coming
from, and was trying to helpfully explain. Try to remember the
dangers of electronic misunderstanding.

I must say that while I could immediately see where you were
coming from, your model results in a VERY convoluted picture
of the semantics. (And yes, the semantics are already extremely
convoluted; especially the concept of a 'variable').

But we language-designers should be forgiving of perl's faults.
Remember, it's something that grew, and wasn't really designed
at all. View it as a piece of natural history, rather than a
machine. Yes, a donkey is a rather asinine machine for turning
oats into CO2 and H2O while carrying loads, but hey, *I* didn't
have to design it, and it's free, and even self-reproducing!

> Now, I've programmed in upwards of 100 languages over the years, and I
> had no trouble reaching a 'deep understanding' of that: assuming that
> perl used *anything* like the normal definitions for those words, then
> you'd expect the local() to be undone at every iteration of the loop,
> just as it does in every OTHER language with this general syntax.

Indeed, this would make it look more like the variable had a scope.

But in fact, it's hard to think of scope when you can have
CONDITIONAL and MULTIPLE localization. Scope relates to the
mapping between the syntax and the semantics, and there
really isn't a mapping here at all.

> perverse. *NO* language I know of does such a thing, precisely because
> it almost hopelessly muddies the definition of 'block'.

Quite true. Welcome to perl, which has absolutely the most
perverse semantics I've ever seen. It's essentially an overgrown
collection of hacks. What's so weird is that it is still useful
despite this. Mostly I avoid the weirdness through stylized usage;
I pretend that various pieces of the language don't exist, or
have a simpler semantics. For example, I don't write conditional
LOCAL's. I write LOCAL at the beginning of my blocks, and I
*mostly* ignore that it isn't an executable. Once in a while I
groan and move a LOCAL earlier than I'd like.

It's really like programming in assembly sometimes. I think of
LOCAL as being PUSH SP,VAR or MOVE VAR,(SP)+ (or your favorite
assembler syntax here).

> In fact, the more I think about it, the more it is becoming crystal
> clear to me that this is nothing more than a bug in perl: the handling
> of local() is in error because it violates the proper and normal [not to
> mention defined-by-perl] semantics of "block". Period.

Let's be clear here. It's a bit of mis-design of the language,
not a bug in the implementation. (It's good you didn't mention
defined-by-perl, since it is certainly the semantics defined-by-perl!)

### Larry Wall

Nov 13, 1991, 2:44:20 PM
In article <8i1H!0+...@cs.psu.edu> fl...@cs.psu.edu (Felix Lee) writes:
: Re: local() statement inside a block,
:
: Okay. Here's a different explanation. A 'block' is a dynamic object
: that takes considerable effort to create. If a loop were to create
: and destroy a block on every iteration, it would be much, much slower,
: so instead, loops create their block just once and repeatedly execute
: it. (This is not really any different from how other languages treat
: blocks, except many languages have static block objects instead.)

This is essentially correct. I guess that's part of why you don't see
more interpreters for declarative-rich languages...

But my usage of terms will certainly seem perverse to those steeped in
compiler technology. And I do mis-speak myself. The book shouldn't
have said that local() "declares" anything.

As for the term "block", I only called it that because of its
resemblance to the blocks of a compiled language. However, in Perl,
the only bit of static scoping tied to blocks is the package
declaration. (There is no scoping for subroutines or formats.) All
other scoping happens at runtime, in whichever way I deem to be most
useful and efficient. There's a heap of stuff internal to the
interpreter that gets localized just like your local variables, and
it's all dynamic. It almost has to be, if you want the compiler to run
in under 53 megabytes.

: Now this loop optimization interacts poorly with the fact that 'local'
: is an executable statement, not a declaration. In nearly every other
: language, the equivalent to local is a declaration, so it happens just
: once, when the block is created. But in perl, the local happens every
: time the block is executed.
:
: This is merely a flaw in implementation. It makes no difference to
: the meaning of the program whether a local() in a loop consumes
: megabytes of memory or not.

True, but there is a semantic distinction here too, quite aside from
the quibble about vocabulary. In the current setup

    for (1..10) {
        $huh = $foo;
        local($foo);
        ...
    }

sets $huh to the current local value of $foo. It only gets the global
value of $foo on the first time through the loop. If we made $foo
revert to the global value at the end of each iteration, $huh gets the
global value each time.

: One way to fix this is to turn local into a declaration, but this may
: break code like this:
:     $x = 'local($y)';
:     { eval $x; $y = 3; }
: Does this matter? Perhaps not.

Actually, the example above is ok, since local() is local to an eval too.
But it would definitely break things like

    local($_) = $_[0] if @_;

: Another way to fix this is to note which block local variables belong
: to. If you execute another local for the same variable in the same
: block, you can reclaim the old version since it will never be
: accessible again. Or, equivalently, you could make the local a noop
: and just reuse the same slot.

But you have to be very careful about recursion. You don't want

    sub FOO {
        local($BAR);
        ...
        &FOO;
    }

to break. Nevertheless, it would be a good way to fix it if we want to
keep the current semantics. If we want local() to revert at the end of
the block, however, it won't fly.

Making it revert at the end of the block would certainly impose more
time overhead (and less space overhead, which is what started this all
off in the first place) on those scripts that do local() within a loop
block. It would probably also impose more overhead on scripts that don't
do local().

One thing that complicates matters is that the interpreter DOESN'T
actually exit the block on each loop iteration. A loop like

    while ($cond) {
        &stuff;
    }

is optimized to something resembling

    if ($cond) {
      TOP:
        &stuff;
        last unless $cond;
        goto TOP;
    }

except that the if isn't really an if and the goto isn't really a goto.
The inner block is really a circular linked list of statements, so it
looks to the interpreter like

    &stuff;
    last unless $cond;
    &stuff;
    last unless $cond;
    &stuff;
    last unless $cond;
    &stuff;
    last unless $cond;
    &stuff;
    last unless $cond;
    ...

In order to make local() revert, I'd have to stick something in to
check if anything needs to revert, and that might slow things down
even when nothing needs to revert. Yes, compile-time analysis might
determine whether such a check was even necessary, but Perl already
does about 2 1/2 passes, and I already get enough complaints about
startup time.

We all agree on the necessity of compromise. We just can't agree on
when it's necessary to compromise.

Larry

### Bernie Cosell

Nov 14, 1991, 7:43:20 AM

lw...@netlabs.com (Larry Wall) writes:

}But my usage of terms will certainly seem perverse to those steeped in
}compiler technology. And I do mis-speak myself. The book shouldn't
}have said that local() "declares" anything.

}As for the term "block", I only called it that because of its
}resemblance to the blocks of a compiled language. However, in Perl,
}the only bit of static scoping tied to blocks is the package
}declaration....

One problem is that the "camel book" isn't nearly as circumspect about
the terminology as it needs to be if this is to be clear [or even
*inferrable*!]. The section on compound statements clearly says:

    A sequence of statements may be grouped together by enclosing them in
    curly brackets ({}). We will call this a BLOCK.
    [...]
    LABEL: while (EXPR) BLOCK
    [...]
    The {\bf while} statement repeatedly executes the block as long as
    the expression is true.

This makes it quite unequivocal to my reading what a block is, and that
a block is one component of a while statement. Ignoring the problem
with saying 'declares', still the section for local() says:

    The function declares the listed variables to be local to the
    enclosing block ... the operator works by ... restoring them upon
    exiting the block...

Now, I admit: 'exiting the block' is *nowhere* defined. But still one
has to somehow figure out what that means. For example, consider the
loop:

    $t = 0;
    while ($t == 0) {
        local($t);
        $t = 1;
    }

Should we expect that loop to run forever? What if it were written

    $t = 0;
    while ($t == 0) {
        { local($t); $t = 1; }
    }

is the local() undone *now*? Why is the interior block 'exited' each
time around, but the outer block is not?

}Making it revert at the end of the block would certainly impose more
}time overhead (and less space overhead, which is what started this all
}off in the first place) on those scripts that do local() within a loop
}block. It would probably also impose more overhead on scripts that don't
}do local().

I understand the implementation and efficiency considerations
underlying all this. And although it doesn't appear to be so, I'm
really *NOT* arguing that perl's block semantics be changed. What I'm
trying to do is highlight both that there *IS* an inconsistency [or
oversight, if you will] in the camel book, and also how subtle the new
description for 'block' is going to have to be to make the actual
situation clear.

For example: does anyone think they could suggest replacement text for
the compound statement and local() sections of the camel book that
would actually make clear what is going on?

The culprit, I think, is local() --- it claims to have some
relationship to the "enclosing block", when it does not, really. What
if the description of local() were reworded so that it no longer speaks
of *blocks* at all, but just says something like

    ...restores the value when you exit the nearest enclosing dynamic
    context.

note that you need not call out the special cases of 'dynamic context',
and clutter up the definition unnecessarily, with the litany "block,
subr or eval" every time... just calling it 'dynamic context' seems
adequate, clear enough, and closer to the truth.

But then, of course, you have to define "dynamic context"... But with
things set up like this, that now becomes an easy matter. Beyond some
explanation about what a dynamic context means, the actual definition
is just:

    A dynamic context is:
        a compound statement, including all of its interior BLOCKS
            and EXPRs
        subroutine body
        eval
        a BLOCK if not a component of a compound statement

[note: with this definition, you don't even need to call out
'subroutine' any more: it is just a special case of a "block not a
component of a compound statement"]

/Bernie\

### Larry Wall

Nov 15, 1991, 1:37:40 PM

Yeah, I've always preferred vigor over rigor. So I need other people to
keep me honest. Thanks.

On the other hand, every time you throw in a term like "dynamic context",
about half the readers say "Huh?" and put off reading the rest of the
book till their summer vacation. If I were writing a C++ book, I
wouldn't care, because that kind of people shouldn't be programming in
C++ anyway. But Perl has Pretensions to Populism, so I try to make
things intuitive for the masses. Unfortunately, this makes it a little
less definitive to the Semantically Sensible.

Ah well. Sometimes the happy medium isn't so happy.

Larry

### Felix Lee

Nov 18, 1991, 5:27:10 PM

Warned off by the man page, no one should be using local() within a loop
anyway, so we're probably free to invent whatever behavior seems
appropriate for it.

I think this

    $t = 1;
    while ($t) { local($t) = 0; }

should never terminate, and this

    $t = 5;
    while ($t) { --$t; local($t) = 0; }

should terminate after 5 iterations.

Hmm. Lexical scoping would be easier and more efficient. Would
anyone mind much if local() were suddenly lexically scoped?

Okay, how about an implementation that translates

    while ($t) { --$t; local($t) = 0; }

into something like

    if $t {
        push $t
    loop:
        --previous($t)
        $t = 0
        goto loop if previous($t)
    endloop:
        pop $t
    }

This works because you can statically determine all references to
previous($t) at compile-time.

Well, this doesn't work if you say

    local($x) = $y if $z;

but I think this can be unfolded into

    local($x) = $x;
    $x = $y if $z;
--
Felix Lee fl...@cs.psu.edu

### Tom Christiansen

Nov 18, 1991, 5:45:37 PM
From the keyboard of fl...@cs.psu.edu (Felix Lee):
:Hmm. Lexical scoping would be easier and more efficient. Would
:anyone mind much if local() were suddenly lexically scoped?

Yup, me. I have programs that do

    {
        local($some_global) = $new_value;
        &function();
    }

where &function expects to be looking at $some_global. It may be a
disgusting hack, but it's been officially sanctioned since local()'s
inception. I'd have to change my code. I'll bet others would, too.
Making users change their code is always a bad thing, even if sometimes
you have to do it.

--tom
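
The difference Tom is relying on can be sketched in runnable form. The sketch assumes a later Perl that has the lexical `my` keyword (added in Perl 5, after this thread) and modern `sub` call syntax; `call_with_local` and `call_with_my` are hypothetical names, and `function` merely stands in for Tom's `&function`.

```perl
use strict;
use warnings;

# A package (global) variable, as in Tom's example.
our $some_global = 'old';

# Stand-in for Tom's &function: it reads the package variable.
sub function { return $some_global }

# Dynamic scoping: the temporary value is visible to anything called
# from inside this scope, and is restored when the scope is left.
sub call_with_local {
    local $some_global = 'new';
    return function();
}

# Lexical scoping: this is a different variable that only the code
# textually inside this block can see; function() is unaffected.
sub call_with_my {
    my $some_global = 'new';
    return function();
}

my $seen_with_local = call_with_local();
my $seen_with_my    = call_with_my();

print "function() under local: $seen_with_local\n";
print "function() under my:    $seen_with_my\n";
print "global afterwards:      $some_global\n";
```

Code written in Tom's style depends on the first behavior, which is why silently switching local() to lexical scoping would break it.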

### Felix Lee

Nov 18, 1991, 7:31:55 PM
I wrote:
>Hmm. Lexical scoping would be easier and more efficient. Would
>anyone mind much if local() were suddenly lexically scoped?

Tom Christiansen wrote:
>Yup, me. I have programs that do

Okay. How about introducing a different keyword for lexical scoping,
that people will prefer to use. Rather like let and fluid-let in some
dialects of Scheme and Lisp.

The thing that stops me is I can't think of a word better than local.
"declare" or "dcl" perhaps.
--
Felix Lee fl...@cs.psu.edu

### John Macdonald

Nov 19, 1991, 2:02:10 PM
In article <266H&&e...@cs.psu.edu> fl...@cs.psu.edu (Felix Lee) writes:
|I wrote:
|>Hmm. Lexical scoping would be easier and more efficient. Would
|>anyone mind much if local() were suddenly lexically scoped?
|
|Tom Christiansen wrote:
|>Yup, me. I have programs that do

Me too, I'm sure...

|Okay. How about introducing a different keyword for lexical scoping,
|that people will prefer to use. Rather like let and fluid-let in some
|dialects of Scheme and Lisp.
|
|The thing that stops me is I can't think of a word better than local.
|"declare" or "dcl" perhaps.

How about "private"? Same syntax as "local", but it only hides the
original name for the code within the current block and nested blocks,
but not nested "do" blocks, or function calls. It would *not* do a
re-allocate every time around a loop (but it would do a reallocate
if a more deeply nested block redeclared the same name as private).
--
Usenet is [like] the group of people who visit the | John Macdonald
park on a Sunday afternoon. [...] luckily, most of | jmm@eci386
the people are staying on the paths and not pissing |
on the flowers - Gene Spafford