fix


Alex M

29. 11. 2011 0:44:03
to: stanford-1...@googlegroups.com
The definition of fix presented in class today was:

fix :: (a -> a) -> a
fix f = let x = f x in x

However, an alternative definition of fix (from
http://en.wikipedia.org/wiki/Fixed-point_combinator#Implementing_fixed-point_combinators)
appears to be:

fix' :: (a -> a) -> a
fix' f = f (fix' f)

The two exhibit the same behavior on, say, (const 9):

Prelude> fix (const 9)
9
Prelude> fix' (const 9)
9
Prelude>

However, if we have some f like this:

Prelude> let f x = 5 + x

then evaluating the below expression just loops forever without
running out of stack:

Prelude> fix f

whereas evaluating this expression quickly results in stack overflow:

Prelude> fix' f

Also, if we instead do:

Prelude> fix' (\x -> 5 + x)

then we don't run out of stack and just loop forever.

So I'm quite confused by all this. Also, it's not clear to me how
recursion is "implemented using the fixed-point combinator" in the
"Background: recursive bindings" slide.

Any clarification on the above would be appreciated.

Thanks,
Alex

David Terei

29. 11. 2011 5:21:54
to: stanford-1...@googlegroups.com
Good question! The answer pulls in some of the trickiness of the RTS,
so it's going to be a long but interesting answer. The high-level
summary is that fix doesn't actually loop forever, it blocks, while
fix' does loop forever, allocating stack until it eventually
overflows.

To start though, I don't get any difference in behaviour between:

fix' f
fix' (\x -> 5 + x)

Both report stack overflow for me (using 7.2.2); in fact, if I look at
the code that GHC generates, it's exactly the same in both cases.

As for fix vs fix' the STG code for them is as follows (top line
Haskell, bottom line STG):

fix f = let x = f x in x
=>
fix f = let x = f x in x

fix' f = f (fix' f)
=>
fix' f = let x = fix' f in f x

So fix is translated to STG directly, as it uses only Haskell
constructs that also exist in STG, while fix' gets translated a little
to handle laziness (remember from my lecture that 'let' = thunk
creation). Notice that fix is a self-recursive thunk, while fix' is a
recursive function that is not tail recursive (hence the stack
allocation).
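If you want to reproduce this yourself, here's a minimal sketch of a module (the module layout and type annotations are mine) that you can compile with ghc -O0 -ddump-stg to compare what the two definitions turn into:

```haskell
-- Both fixed-point definitions side by side; compile with
-- `ghc -O0 -ddump-stg Fix.hs` to compare the STG they produce.
module Main where

-- Translates directly to STG: a single self-referential thunk.
fix :: (a -> a) -> a
fix f = let x = f x in x

-- Translates to `let x = fix' f in f x`: a fresh thunk per
-- unfolding, plus a non-tail recursive call (so stack growth).
fix' :: (a -> a) -> a
fix' f = f (fix' f)

main :: IO ()
main = do
  print (fix  (const 9) :: Int)  -- 9
  print (fix' (const 9) :: Int)  -- 9
```

(-O0 matters here: with optimisation on, GHC may transform both definitions beyond recognition.)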

One thing to be careful of here is that the behaviour in GHCi is a
little different from compiled code.

If you compile the fix' function, it still overflows, so the behaviour
is the same in both GHC and GHCi.

The cool thing, though, is that if you compile fix with GHC instead of
running it in GHCi, then when it executes it throws a NonTermination
exception!

http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-OldException.html#v%3ANonTermination

It seems pretty magical, but GHC can actually detect a very small
number of cases of infinite loops and throw an exception instead. As I
covered a little in the lecture, a thunk is not just a suspended
computation; it can also be updated. While a thunk is unevaluated,
pointers to it point to an actual thunk: a piece of code with an
environment that needs to be executed (forced, in functional
terminology) to get the actual result. Once the result is computed,
rather than just return it to the original caller that forced the
thunk, the thunk itself is replaced with a pointer to the computed
value. So if any other pointers to the thunk exist (i.e., a shared
variable), the work is shared, since they now get the value and not a
thunk when they try to force it.
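You can see this update/sharing behaviour directly with Debug.Trace (my example, not from the lecture): the traced thunk is shared, so it is forced, and its message printed, only once even though the value is used twice:

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  -- x is one shared thunk; the first use forces and updates it,
  -- the second use just reads the already-computed value, so the
  -- trace message appears only once.
  let x = trace "forcing the thunk" (2 + 3 :: Int)
  print (x + x)  -- prints "forcing the thunk" once, then 10
```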

There is an extra detail here: something called 'blackholes'.
Basically, when a thunk is first forced, it is initially replaced with
a kind of sentinel value called a blackhole, which simply indicates
that the thunk is currently being forced. If any other threads come
along and try to force the thunk while it's in the blackhole state,
they block on the thunk instead of starting a duplicate computation.
(Blackholes are also needed for a complication involving garbage
collection and thunks.)

Now, what happens with the fix function when you call it with a
non-terminating function like you did is that the thunk doing the
knot-tying here ends up being forced twice: first when you enter fix,
and again after the first iteration. At the second iteration it
doesn't re-enter the thunk (which, if it did, wouldn't cause a stack
overflow but would consume the heap and so eventually all memory);
instead it blocks on the thunk, since it's in blackhole status.

This paper has a good description of what I just explained:

http://research.microsoft.com/pubs/68449/new-rts.ps.gz

in section '5.1 Space leaks and black holes'.

So in the compiled version of fix, this second attempt by the same
thread to force a blackholed thunk is detected, and a NonTermination
exception is thrown.
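You can even catch it. Here's a sketch (mine, not from the lecture; whether the exception fires rather than the thread simply blocking depends on the GHC version and on compiled vs. interpreted code, so I've wrapped it in a timeout as a safety net):

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Function (fix)
import System.Timeout (timeout)

main :: IO ()
main = do
  -- fix (+ 5) is a self-referential thunk with no value; when the
  -- RTS sees the same thread re-enter its own blackhole, it throws
  -- NonTermination, which `show`s as "<<loop>>".
  r <- try (timeout 2000000 (evaluate (fix (+ 5) :: Integer)))
         :: IO (Either SomeException (Maybe Integer))
  case r of
    Left e         -> putStrLn ("caught: " ++ show e)
    Right Nothing  -> putStrLn "no detection: thread blocked until timeout"
    Right (Just v) -> print v
```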

In GHCi, though, blackhole detection doesn't work, so the thread
blocks forever. (If you look at your CPU and memory, you'll notice
nothing is being consumed, since it's not looping but blocked.) The
reason it doesn't work in GHCi is more a bug than some fundamental
limitation. See the bug report:

http://hackage.haskell.org/trac/ghc/ticket/2786

Cheers,
David

dm-list-clas...@scs.stanford.edu

29. 11. 2011 13:01:30
to: stanford-1...@googlegroups.com
At Mon, 28 Nov 2011 21:44:03 -0800,

Alex M wrote:
>
> So I'm quite confused by all this. Also, it's not clear to me how
> recursion is "implemented using the fixed-point combinator" in the
> "Background: recursive bindings" slide.
>
> Any clarification on the above would be appreciated.

I have nothing to add to David T's great explanation of why the two
definitions of fix fail differently. I will say that I find your fix'
function easier to understand, and I would have shown it in lecture if
not for the memory issues. However, if you find:

fix' f = f (fix' f)

easier to understand, then I think it's fair to imagine that this is
how fix is defined, except that fix has some optimization you don't
care about for understanding how code works.

Now as to how fixed point combinators allow you to implement
recursion, this is also a good question. First off, what do I even
mean by implementing recursion? Well, recursion is when you have a
binding or set of bindings, and the set of symbols on the left hand
side can also be used on the right-hand side. The two examples I gave
are:

x = (1, snd y)
y = (fst x, 2)
and
fibList = 1 : 1 : zipWith (+) fibList (tail fibList)

Now, by implement recursion I meant implement something equivalent to
the above two code fragments (i.e., something the above two fragments
can be mechanically translated to), but in which the symbols to the
left of the equals sign do not appear to the right of the equals sign.

How does the fixed-point combinator help? Well, it lets you have two
names for the same value, so you can use one name to the right of the
equals sign, and another to the left. In particular, consider this
translation of the definition of x and y:

(x, y) = fix $ \ ~(x0, y0) -> let x1 = (1, snd y0)
                                  y1 = (fst x0, 2)
                              in (x1, y1)

Consider just the let expression: if you replace both x0 and x1 with
x, and replace both y0 and y1 with y, you end up with something
identical to the original recursive definition above. But now think
about what a fixed-point combinator does: it ensures the input value
of the function is the same as the output value. So that means that
indeed x0 = x1 and y0 = y1, and thus the above fix call is equivalent
to the recursive code.

Now why does this work? Well, if you think about the way fix is
implemented, it passes in a thunk that causes an extra copy of the
function to be evaluated. So in the above code, when the system goes
to evaluate "snd y0", it causes another copy of the function to be
executed. This second evaluation returns:

((1, another-recursive-call), (another-recursive-call, 2))

Forcing either of the thunks marked "another-recursive-call" would
cause yet another copy of the function to be evaluated. Fortunately,
this isn't necessary, since the code is only asking about the slots
containing 1 and 2, so the recursion terminates.
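That translation runs exactly as described; here is a self-contained version (note that the lazy pattern ~(x0, y0) is essential, since an ordinary pattern match would force the very thunk fix is still building):

```haskell
import Data.Function (fix)

-- Non-recursive bindings: neither x nor y appears on its own
-- right-hand side; fix ties the knot through the tuple.
(x, y) = fix $ \ ~(x0, y0) ->
           let x1 = (1 :: Int, snd y0)
               y1 = (fst x0, 2 :: Int)
           in (x1, y1)

main :: IO ()
main = print (x, y)  -- ((1,2),(1,2))
```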

So far we've considered cases in which the recursion happens exactly
once, or in which it happens infinitely and the program diverges.
However, in the case of functions, recursion often happens some finite
number of times. Consider the following code:

fact :: Integer -> Integer
fact = \n -> if n == 0 then 1 else n * fact (n - 1)

This is a straight-ahead recursive definition of factorial. (I've
just put the \n to the right of the = for clarity with what follows.)
Now consider this code:

ffact :: (Integer -> Integer) -> Integer -> Integer
ffact f = \n -> if n == 0 then 1 else n * f (n - 1)

What is "fix ffact"? Well, fix ffact should return a particular value
of x such that x = ffact x, right (since fix f = let x = f x in x)?
Well what happens if we expand the definition of ffact x on the right
hand side of that equation?

x = ffact x
==> x = \n -> if n == 0 then 1 else n * x (n - 1)

But wait a second, x now has the exact same (recursive) definition as
fact above, so x is the factorial function. Another way to look at
this, which I think makes it seem a lot less magical and easier to
understand, is if you imagine using your definition of fix'. Let's
expand fix' ffact:

fix' ffact
==> \n -> if n == 0 then 1 else n * (fix' ffact) (n - 1)

The key behavior of fix' is that it is self-replicating. Every time
you force the evaluation of fix' ffact, it spits out another copy of
the function ffact, where ffact's argument f is another, identical
copy of "fix' ffact". So you see that expanding, say, "fix' ffact 5"
will cause "fix' ffact" to be expanded 6 times (for n = 5, 4, 3, 2, 1,
0), and when finally n = 0, there will be no need for further
expansion and the computation can terminate.
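Putting that expansion in runnable form (using your fix' and the ffact from above):

```haskell
fix' :: (a -> a) -> a
fix' f = f (fix' f)

ffact :: (Integer -> Integer) -> Integer -> Integer
ffact f = \n -> if n == 0 then 1 else n * f (n - 1)

main :: IO ()
main = print (fix' ffact 5)  -- 120, after 6 expansions of fix' ffact
```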

Alex M

30. 11. 2011 0:09:00
to: stanford-1...@googlegroups.com
Thanks a lot David! Some comments/questions inline:

On Tue, Nov 29, 2011 at 2:21 AM, David Terei <david...@gmail.com> wrote:
> Good question! The answer pulls in some of the trickiness of the RTS
> so its going to be a long but interesting answer. High level is that
> fix doesn't actually loop forever, it blocks and fix' does loop
> forever with stack allocation so eventually overflows.
>
> To start though, I don't get any difference in behaviour between:
>
> fix' f
> fix' (\x -> 5 + x)
>
> Both report stack overflow for me (using 7.2.2),

Ah, I'm using 6.12.1. I've downloaded 7.2.2 and now I'm seeing the
same behavior with the named function as with the anonymous one.

> in fact if I look at
> the code that GHC generates it's exactly the same for both cases.

I tried doing "ghc -ddump-stg fix.hs" but all I get is:

==================== STG syntax: ====================
sat_swE =
\u srt:(0,*bitmap*) []
let {
x_swC =
\u srt:(0,*bitmap*) []
let {
sat_swF =
\u srt:(0,*bitmap*) []
let { sat_swG = \u srt:(0,*bitmap*) []
GHC.Integer.smallInteger 5;
} in GHC.Num.fromInteger
GHC.Num.$fNumInteger sat_swG;
} in GHC.Num.+ GHC.Num.$fNumInteger sat_swF x_swC;
} in x_swC;
SRT(sat_swE): [GHC.Integer.smallInteger, GHC.Num.$fNumInteger]
Main.main =
\u srt:(0,*bitmap*) [] GHC.Base.return GHC.Base.$fMonadIO sat_swE;
SRT(Main.main): [GHC.Base.$fMonadIO, sat_swE]
:Main.main =
\u srt:(0,*bitmap*) [] GHC.TopHandler.runMainIO Main.main;
SRT(:Main.main): [GHC.TopHandler.runMainIO, Main.main]

I get the same uninformative stg dump whether I use GHC 6.12.1 or 7.2.2.

The function I call it with does terminate, no? I thought the issue
is that the function I pass to fix doesn't actually have a fixed
point.

> like you did is that the thunk that is doing the
> knot tying

Sorry, the what?

> here ends up being forced twice. First when you enter fix
> and again after the first iteration. At the second iteration it
> doesn't re-enter the thunk (which if it did wouldn't cause a stack
> overflow but consume the heap and so eventually all memory) but blocks
> on the thunk since its in blackhole status.
>
> This paper has a good description of what I just explained:
>
> http://research.microsoft.com/pubs/68449/new-rts.ps.gz
>
> in section '5.1 Space leaks and black holes'.
>
> So in the compiled version of fix this second attempt to force a
> blackholed function with the same thread is detected and a
> NonTermination exception thrown.

So then what if I have a function like this:


factabs fact 0 = 1
factabs fact x = x * fact (x-1)

In this case, both fix' factabs and fix factabs terminate. Why doesn't
fix factabs run into this issue where the same thread forces a
blackhole'd thunk? And why doesn't fix' recurse forever and overflow
the stack?

Alex M

30. 11. 2011 1:59:16
to: stanford-1...@googlegroups.com
Thanks David M! A couple of comments/questions inline:

Ah that does make more sense now.

> So far we've considered cases in which the recursion happens exactly
> once, or in which it happens infinitely and the program diverges.
> However, in the case of functions, recursion often happens some finite
> number of times.  Consider the following code:
>
>        fact :: Integer -> Integer
>        fact = \n -> if n == 0 then 1 else n * fact (n - 1)
>
> This is a straight-ahead recursive definition of factorial.  (I've
> just put the \n to the right of the = for clarity with what follows.)
> Now consider this code:
>
>        ffact :: (Integer -> Integer) -> Integer -> Integer
>        ffact f = \n -> if n == 0 then 1 else n * f (n - 1)
>
> What is "fix ffact"?  Well, fix ffact should return a particular value
> of x such that x = ffact x, right (since fix f = let x = f x in x)?
> Well what happens if we expand the definition of ffact x on the right
> hand side of that equation?
>
>        x = ffact x
>    ==> x = \n -> if n == 0 then 1 else n * x (n - 1)
>
> But wait a second, x now has the exact same (recursive) definition as
> fact above, so x is the factorial function.

So does the compiler/runtime "figure this out" after one step, so that
evaluating fix ffact is O(1)?

> Another way to look at
> this, which I think makes it seem a lot less magical and easier to
> understand, is if you imagine using your definition of fix'.  Let's
> expand fix' ffact:
>
>        fix' ffact
>    ==> \n -> if n == 0 then 1 else n * (fix' ffact) (n - 1)
>
> The key behavior of fix' is that it is self-replicating.  Every time
> you force the evaluation of fix' ffact, it spits out another copy the
> function ffact, where ffact's argument f is another, identical copy of
> "fix' ffact".  So you see that expanding, say, "fix' f 5" will cause
> "fix' f" to be expanded 6 times (for n = 5, 4, 3, 2, 1, 0), and when
> finally n = 0, there will be no need for further expansion and the
> computation can terminate.

So if I understand correctly, evaluating fix' ffact n is O(n). But
what about just fix' ffact? Per my (coincidentally very similar)
question on the other sub-thread, evaluation of fix' ffact terminates.
Is this evaluation O(1) similar to the way that evaluating fix ffact
is O(1) above?

dm-list-clas...@scs.stanford.edu

30. 11. 2011 5:53:30
to: stanford-1...@googlegroups.com
At Tue, 29 Nov 2011 22:59:16 -0800,

Alex M wrote:
>
> >        ffact :: (Integer -> Integer) -> Integer -> Integer
> >        ffact f = \n -> if n == 0 then 1 else n * f (n - 1)
> >
> > What is "fix ffact"?  Well, fix ffact should return a particular value
> > of x such that x = ffact x, right (since fix f = let x = f x in x)?
> > Well what happens if we expand the definition of ffact x on the right
> > hand side of that equation?
> >
> >        x = ffact x
> >    ==> x = \n -> if n == 0 then 1 else n * x (n - 1)
> >
> > But wait a second, x now has the exact same (recursive) definition as
> > fact above, so x is the factorial function.
>
> So does the compiler/runtime "figure this out" after one step, so that
> evaluating fix ffact is O(1)?

Well, fix ffact is a lambda expression. So not much gets evaluated
until you apply another argument. So I mean code like:

let x = fix ffact in seq x x

would terminate in some fixed number of steps just like "seq fact
fact". I guess you could say it is O(1), but usually you would use
big O notation with respect to some parameter, and there's no numeric
parameter in the expression "fix ffact". "fix ffact n" would still
run in O(n) time.

> So if I understand correctly, evaluating fix' ffact n is O(n). But
> what about just fix' ffact? Per my (coincidentally very similar)
> question on the other sub-thread, evaluation of fix' ffact terminates.
> Is this evaluation O(1) similar to the way that evaluating fix ffact
> is O(1) above?

It's basically the same thing as above. Evaluating a lambda
expression doesn't do anything if you haven't supplied the arguments.

Incidentally, this is the difference between Head Normal Form and Weak
Head Normal Form. Haskell's seq operator only evaluates to WHNF,
meaning that if the thing being forced is a data type, then, like HNF,
both the constructor and any strict parameters are forced. However,
unlike HNF, with WHNF, if the thing being forced is a lambda
abstraction, then the lambda abstraction can internally contain
redexes.

In other words, the expression "\x -> (\n -> n) 0" is not in HNF. To
put it in HNF, you'd need to beta-reduce "(\n -> n) 0" to 0. But the
expression is already in WHNF. So similarly:

\n -> if n == 0 then 1 else n * (fix' ffact) (n - 1)

is in WHNF, even though it's not in HNF because "fix' ffact" is a
redex that can be beta-reduced.

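A quick way to convince yourself of the WHNF point (my example): seq on the partially applied fixed point terminates immediately, because a function is already in WHNF; only supplying the final argument does the real work.

```haskell
import Data.Function (fix)

ffact :: (Integer -> Integer) -> Integer -> Integer
ffact f n = if n == 0 then 1 else n * f (n - 1)

main :: IO ()
main = do
  let g = fix ffact
  -- seq only evaluates to WHNF; g is a function, so this
  -- terminates without unfolding the recursion at all.
  g `seq` putStrLn "fix ffact is already in WHNF"
  print (g 5)  -- 120
```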
David

David Terei

1. 12. 2011 16:59:16
to: stanford-1...@googlegroups.com

Yes, that's correct. The code I showed you was cleaned up by hand. GHC
will output a lot more information that you generally don't care
about, so it's best to load the output in an editor and go through it
by hand. There are some extra flags you can play around with that stop
GHC from dumping out as much detail, which is often what you want.

http://haskell.org/ghc/docs/7.2.1/html/users_guide/options-debugging.html

Look at section 4.18.3 for the flags.

So one thing to look at here is the types:

factabs :: (Int -> Int) -> Int -> Int

g :: Int -> Int
g = \x -> x + 5

fix :: (a -> a) -> a

if we apply fix to g then we have:

(fix g) :: Int

So you can see it's going to try to evaluate to an Int, but what's the
actual expression we've given? It's (\x -> x + 5). How the hell does
that evaluate to an Int? It doesn't, hence we get a loop / deadlock.

If we apply fix to factabs we have:

(fix factabs) :: Int -> Int

So right now this will terminate, as it's a partial application; GHC
won't be trying to force the (fix factabs) thunk yet.

The way to think about fix is that it's a function that gives you a
variable to call as the recursive function.

e.g., we can write factabs as just:

factabs :: Int -> Int
factabs 0 = 1
factabs n = n * factabs (n - 1)

but here we are doing explicit recursion, which we want to avoid. So
we use fix and change factabs to take an extra argument that is the
recursive function to call:

factabs :: (Int -> Int) -> Int -> Int
factabs fact 0 = 1
factabs fact n = n * fact (n - 1)

basically, the job of fix is to set up the first argument of factabs
to correctly point back to factabs.

fix f = let x = f x in x

fix factabs = let x = factabs x in x
fix factabs = let x = factabs (factabs x) in x

so notice that the fact argument gets set to (factabs x), where x =
(factabs x). Here is your recursion. This differs from the previous
fix (\x -> x + 5) case in that there we were trying to evaluate 'x'
to a value, while in the factabs case we use 'x' as a function and
apply it, which is the proper way to use it. Application vs.
evaluation.

Cheers,
David

Alex M

2. 12. 2011 4:48:28
to: stanford-1...@googlegroups.com

Ah, I think this is precisely the distinction between HNF and WHNF of
lambda expressions that DM was making in the other email. Since (I'm
assuming) we're trying to reduce to WHNF, we leave the Int -> Int
lambda as-is in the fix factabs case, whereas in the fix g case we
have all the arguments, so we need to evaluate a data type (Int), and
we end up blocking. So I guess fix is not a true fixed-point
combinator, where, as claimed in the Hoogle documentation, fix f
somehow magically gives you "the least fixed point of the function f,
i.e. the least defined x such that f x = x" for any f. Rather, it
looks like it only works on higher-order f's, which lets us take
advantage of the fact that lambdas in WHNF can contain redexes. In the
above example, \x -> x + 5 didn't have a fixed point. However,
\x -> x * x does have a fixed point (two of them, in fact), but
fix (\x -> x * x) still blocks for all the same reasons discussed
above. No magic there.

David Terei

2. 12. 2011 13:43:00
to: stanford-1...@googlegroups.com
Yes, sounds right. The fixed-point stuff is nice in theory, but in
practice it's always just some form of recursion under the hood.