Re: new layout proposal

1 view
Skip to first unread message

Geoffrey Irving

unread,
Sep 6, 2009, 8:34:40 PM9/6/09
to Dylan Alex Simon, duck...@googlegroups.com
The new layout proposal is good. Link for posterity:
http://github.com/girving/duck/blob/b0a59b577603de8489a026f3db3e000a50131d56/doc/layout.textile.

The main weirdness is the potential ambiguity of "p = e1; e2", though
I may be imagining it. I guess the optionalness of '=' means that the
';' always applies to an outer context? If so, I don't understand
this comment:

"If expression-leaders became mandatory, then single-definition
lets could be replaced with p = e1 ; e2 as well."

Comments about the list of complications:

3. I believe automatically closing contexts when we hit 'in' or 'else'
or the like is easy. However, if we did that and got rid of optional
leaders, wouldn't the meaning of "p = e1; e2" change to "p = { e1; e2
}"?
2. I think rule 3 is fairly clear without the parenthetical remark. I
agree that the +1 thing feels weird, so is that what motivated you to
suggest more thought? Ignoring the +1 thing it seems sufficient to me
as is.
1. After your recent commit, any expression can be "{ stmts }". Will
that change? Is it enough to cover the "optional { ; } groups" you
mention?

Geoffrey

Dylan Alex Simon

unread,
Sep 6, 2009, 8:56:33 PM9/6/09
to Geoffrey Irving, duck...@googlegroups.com
From Geoffrey Irving <irv...@naml.us>, Sun, Sep 06, 2009 at 08:34:40PM -0400:

Good. I like it too.

(Also here, since I was still playing with gh-pages:
http://dylex.github.com/duck/layout.html)

> The main weirdness is the potential ambiguity of "p = e1; e2", though
> I may be imagining it. I guess the optionalness of '=' means that the
> ';' always applies to an outer context? If so, I don't understand
> this comment:
>
> "If expression-leaders became mandatory, then single-definition
> lets could be replaced with p = e1 ; e2 as well."
>
> Comments about the list of complications:
>
> 3. I believe automatically closing contexts when we hit 'in' or 'else'
> or the like is easy. However, if we did that and got rid of optional
> leaders, wouldn't the meaning of "p = e1; e2" change to "p = { e1; e2
> }"?

Yeah you're right. I knew there was a reason we needed to keep optional
leaders. I was thinking about the outer context. With optional expression
leaders, this:

f x = p = e1 ; e2

means:

f x = p = e1
e2

Which makes sense given the theoretical relative precedence of "=" and ";".
It is also therefore invalid. Probably that's perfectly reasonable.

> 2. I think rule 3 is fairly clear without the parenthetical remark. I
> agree that the +1 thing feels weird, so is that what motivated you to
> suggest more thought? Ignoring the +1 thing it seems sufficient to me
> as is.

Originally I had "closed under normal rules", but then I realized that
wouldn't allow split reasonably-indented expressions after them like:

case x of y -> f
y

The +1 comment was more because I couldn't think off-hand of an easy way to
implement what I wanted, but I'm sure there is one that won't mess up nested
single-line things.

> 1. After your recent commit, any expression can be "{ stmts }". Will
> that change?

No, I think we should keep that regardless. I was going to have "do" turn
into "{" in layout, but then realized this other way was better.

> Is it enough to cover the "optional { ; } groups" you
> mention?

For expression leaders, yes. If we want to allow let blocks:

let
x = y
z = x

Then we need to change the parser for them too. That is, btw, the only
instance I could find where we use the single-line haskell rules:

do
let x = 1
y = 2
stuff

But of course that's no longer necessary at all with statements.

Also, I need to sort out what I want to do about guards, which may require a
similar optional {;} thing, or may just be sufficient to make "case" a
mandatory leader.

Regardless, we need to sort out semantics for (mutual) recursion of local
functions.

:-Dylan

Geoffrey Irving

unread,
Sep 6, 2009, 10:06:37 PM9/6/09
to Geoffrey Irving, duck...@googlegroups.com

Disallowing that seems fine.

>> 1. After your recent commit, any expression can be "{ stmts }".  Will
>> that change?
>
> No, I think we should keep that regardless.  I was going to have "do" turn
> into "{" in layout, but then realized this other way was better.
>
>> Is it enough to cover the "optional { ; } groups" you
>> mention?
>
> For expression leaders, yes.  If we want to allow let blocks:
>
>  let
>    x = y
>    z = x

This looks ugly to me, so let's drop let blocks in favor of statements
and leaders.

Should we keep single line lets for "f x = let y = x in x"?

> Then we need to change the parser for them too.  That is, btw, the only
> instance I could find where we use the single-line haskell rules:
>
>  do
>    let x = 1
>        y = 2
>    stuff
>
> But of course that's no longer necessary at all with statements.
>
> Also, I need to sort out what I want to do about guards, which may require a
> similar optional {;} thing, or may just be sufficient to make "case" a
> mandatory leader.
>
> Regardless, we need to sort out semantics for (mutual) recursion of local
> functions.

That is certainly tricky. It seems like some of the desirable
properties are conflicting, so try to enumerate all the potential
properties one could conceivably want and then try to figure out all
the conflicts. Properties (beware, this is a messy and ad-hoc
listing):

A. Function definitions are arbitrarily mutually recursive within a
statement block.
B. Functions can be mutually recursive if they are adjacent (not
separated by a(n expression) definition).
C. The scope of variable definitions extends downwards only.
D. The scope of variable definitions is arbitrary subject to causal
ordering. I.e., a function can use a variable declared later as long
as it is also called later.
E. Variable definitions can shadow within a statement block.
F. No variable shadowing is allowed.
G. Function definitions can shadow within a statement block.
H. No function shadowing is allowed.
G. The meaning of a function definition does not depend on where it
occurs within a statement block.
J. If a variable definition shadows an external definition,
expressions before the definition use the external version.
K. The scope of function definitions extends downwards only.
L. The scope of function definitions is arbitrary subject to causal ordering.
M. The scope of function definitions extends downwards for expressions
and other functions plus upwards through any adjacent function
definitions.
N. Variable defining expressions which occur later on are always
evaluated later on.

Clearly we want to allow some kind of mutual recursion, so we need
either A or B. If we go with A, arbitrary mutual recursion, then we
clearly can't have any function shadowing (H). Having shadowing for
variables but not for functions sounds insane, so that means no
variable shadowing either (F). Consider:

f x = g x
z = f 4
y = 3
g x = y + x

If N, this has to be disallowed since 'z' implicitly depends on 'y'.
Regardless of N or -N, if we flip the order of 'y' and 'z', the result
seems like it should be allowed but then 'f' transitively depends on
'y' which is declared later, which isn't a lot different from having
the scope of 'y' be arbitrary. It seems like D is the only natural
rule that describes this without being surprising. I think we do want
N in order to make evaluation order deterministic in the presence of
invisible exceptions. To summarize, the A version looks like

1. Any variable declared in a block (either as an expression or as a
function), has scope throughout the block. No shadowing is allowed.
2. Execution proceeds in order through the variable definitions.
2. It is an error (caught at type inference time) if a variable is
used before execution has reached the point where it is set.

This is quite flexible and fairly intuitive (at least to me).

The alternative is to allow mutual recursion only within consecutive
function definitions. I don't like that very much, so I'll stop here.

That exercise was more than a little bit silly, but maybe it'll give
you some ideas.

Geoffrey

Dylan Alex Simon

unread,
Sep 7, 2009, 8:55:26 PM9/7/09
to Geoffrey Irving, duck...@googlegroups.com

Surely there is a more compact way to represent this list. A table of some
sort perhaps. Anyway.

> Clearly we want to allow some kind of mutual recursion, so we need
> either A or B. If we go with A, arbitrary mutual recursion, then we
> clearly can't have any function shadowing (H). Having shadowing for
> variables but not for functions sounds insane, so that means no
> variable shadowing either (F). Consider:
>
> f x = g x
> z = f 4
> y = 3
> g x = y + x
>
> If N, this has to be disallowed since 'z' implicitly depends on 'y'.
> Regardless of N or -N, if we flip the order of 'y' and 'z', the result
> seems like it should be allowed but then 'f' transitively depends on
> 'y' which is declared later, which isn't a lot different from having
> the scope of 'y' be arbitrary. It seems like D is the only natural
> rule that describes this without being surprising. I think we do want
> N in order to make evaluation order deterministic in the presence of
> invisible exceptions. To summarize, the A version looks like
>
> 1. Any variable declared in a block (either as an expression or as a
> function), has scope throughout the block. No shadowing is allowed.
> 2. Execution proceeds in order through the variable definitions.
> 2. It is an error (caught at type inference time) if a variable is
> used before execution has reached the point where it is set.
>
> This is quite flexible and fairly intuitive (at least to me).

These seem to be the exact same rules as on the top-level. Are they? I
suppose regardless it would be good to be consistent between them, modulo
allowing overloads.

But I do like variable shadowing. Is it really that insane? There's only a
small number of programs that change meaning, and those programs were
confusing anyway. The issues are just J-type shadowing things, and as long as
we're evaluating top-down might as well say variable scope is downward as
well.

From the point of view of top-level evaluation this may make more sense too.

> The alternative is to allow mutual recursion only within consecutive
> function definitions. I don't like that very much, so I'll stop here.

:-Dylan

Geoffrey Irving

unread,
Sep 7, 2009, 10:03:09 PM9/7/09
to duck...@googlegroups.com, Dylan Alex Simon

Oh, absolutely there's a better way. That's why I added the comment
about it being messy and ad-hoc.

Yes, they are.

> But I do like variable shadowing.  Is it really that insane?  There's only a
> small number of programs that change meaning, and those programs were
> confusing anyway.  The issues are just J-type shadowing things, and as long as
> we're evaluating top-down might as well say variable scope is downward as
> well.

Yeah, I like variable shadowing too. When used appropriately it makes
programs easier to analyze, since definitions die faster. If we can
find rules that make it work, I'm definitely for it. As you can see
by the above horrible list, my thoughts about the various options
aren't all that clean.

Would you want function shadowing too? If so, which function would be
in scope between two definitions of the same function?

Incidentally, dynamic languages like python get mutual recursion by
having all variable definitions look upwards into scopes that grow in
the normal way. E.g., in

def f(x): return g(x)
def g(y): return y+1
f(3)

f gets hold of the reference to g because g is in scope by the time
it's called. Unfortunately, this doesn't work for us due to the
program

def g(y): return y+2
def f(x): return g(x)
f(2)
def g(y): return y+1
f(3)

In python, the different invocations of f would use different
definitions of g, which isn't an option in a statically scoped
language.

In fact, considering all subsets of lines of the above sequence might
be a useful way to think about different options. On the other hand,
maybe those complexities go away if we just allow shadowing for
variables but disallow it for functions.

Geoffrey

Dylan Alex Simon

unread,
Sep 9, 2009, 10:02:45 PM9/9/09
to Geoffrey Irving, duck...@googlegroups.com
> >> > Regardless, we need to sort out semantics for (mutual) recursion of local
> >> > functions.
> >>
> >> That is certainly tricky. ?It seems like some of the desirable

> >> properties are conflicting, so try to enumerate all the potential
> >> properties one could conceivably want and then try to figure out all
> >> the conflicts. ?Properties (beware, this is a messy and ad-hoc

> >> listing):
> >>
> >> A. Function definitions are arbitrarily mutually recursive within a
> >> statement block.
> >> B. Functions can be mutually recursive if they are adjacent (not
> >> separated by a(n expression) definition).
> >> C. The scope of variable definitions extends downwards only.
> >> D. The scope of variable definitions is arbitrary subject to causal
> >> ordering. ?I.e., a function can use a variable declared later as long

> >> as it is also called later.
> >> E. Variable definitions can shadow within a statement block.
> >> F. No variable shadowing is allowed.
> >> G. Function definitions can shadow within a statement block.
> >> H. No function shadowing is allowed.
> >> G. The meaning of a function definition does not depend on where it
> >> occurs within a statement block.
> >> J. If a variable definition shadows an external definition,
> >> expressions before the definition use the external version.
> >> K. The scope of function definitions extends downwards only.
> >> L. The scope of function definitions is arbitrary subject to causal ordering.
> >> M. The scope of function definitions extends downwards for expressions
> >> and other functions plus upwards through any adjacent function
> >> definitions.
> >> N. Variable defining expressions which occur later on are always
> >> evaluated later on.
> >>
> >> 1. Any variable declared in a block (either as an expression or as a
> >> function), has scope throughout the block. ?No shadowing is allowed.

> >> 2. Execution proceeds in order through the variable definitions.
> >> 2. It is an error (caught at type inference time) if a variable is
> >> used before execution has reached the point where it is set.
> >>
> >> This is quite flexible and fairly intuitive (at least to me).
> >
> > These seem to be the exact same rules as on the top-level. ?Are they? ?I

> > suppose regardless it would be good to be consistent between them, modulo
> > allowing overloads.
>
> Yes, they are.
>
> > But I do like variable shadowing. ?Is it really that insane? ?There's only a

> > small number of programs that change meaning, and those programs were
> > confusing anyway. ?The issues are just J-type shadowing things, and as long as

Yes, I think this is the way to go: +ACEHJ -DFGGKLM

This changes the current top-level in that functions may not reference
variables declared later. Multiple function definitions are always considered
overloads which are never allowed in local definitions.

The problem with this approach is that we have to take variable scope into
account when considering function meaning: each function has an explicit
closure scope associated with it of all the variables above it (so each
top-level function has its own Globals).

This almost seems like too much complexity, and I'd actually be willing to
entertain B instead, with shadowing but not at the top level.


Back to the layout which started all this: we have some current uses that
are broken by the new rules. Taking basic as an example:

f x y =
let g y = y * y in
x + g y

This could instead by written as:

f x y = let g y = y * y in
x + g y

f x y =
let g y = y * y in
x + g y

Or, more importantly:

f x y =
g y = y * y
x + g y

Which makes this a silly example.

A more important one, with arrows:

test_monad = assert \ [11,101,22,202] == \
[1,2] >>= \a ->
[10 * a, 100 * a] >>= \b ->
b + a

Instead we need:

test_monad = assert \ [11,101,22,202] == \
[1,2] >>= \a ->
[10 * a, 100 * a] >>= \b ->
b + a

Or just some other breaking that doesn't end lines with "->":

test_monad = assert \ [11,101,22,202] == \
[1,2] >>= \
a -> [10 * a, 100 * a] >>= \
b -> b + a

Of course, this would be better written using a monadic do. (Is there any
point of monads with effects?) There another such program in io.duck, but
otherwise everything looks fine. If you're okay with breaking these uses,
I'll push it in.

:-Dylan

Geoffrey Irving

unread,
Sep 9, 2009, 10:19:12 PM9/9/09
to Geoffrey Irving, duck...@googlegroups.com
On Wed, Sep 9, 2009 at 10:02 PM, Dylan Alex Simon<dylan-...@dylex.net> wrote:
>> In fact, considering all subsets of lines of the above sequence might
>> be a useful way to think about different options.  On the other hand,
>> maybe those complexities go away if we just allow shadowing for
>> variables but disallow it for functions.
>
> Yes, I think this is the way to go: +ACEHJ -DFGGKLM
>
> This changes the current top-level in that functions may not reference
> variables declared later.  Multiple function definitions are always considered
> overloads which are never allowed in local definitions.
>
> The problem with this approach is that we have to take variable scope into
> account when considering function meaning: each function has an explicit
> closure scope associated with it of all the variables above it (so each
> top-level function has its own Globals).

That doesn't seem too bad. If we get it right, it might even teach us
how to introduce local overloads.

> This almost seems like too much complexity, and I'd actually be willing to
> entertain B instead, with shadowing but not at the top level.
>
>
> Back to the layout which started all this: we have some current uses that
> are broken by the new rules.  Taking basic as an example:
>
>        f x y =
>          let g y = y * y in
>          x + g y

To make sure I understand, this is broken because there's essentially
an implicit "do" after the first "=", and therefore the body of the
let has to be a bit intended. All good.

> This could instead by written as:
>
>        f x y = let g y = y * y in
>          x + g y
>
>        f x y =
>          let g y = y * y in
>            x + g y
>
> Or, more importantly:
>
>        f x y =
>          g y = y * y
>          x + g y

That's perfect.

> Which makes this a silly example.

I'd say it makes it an awesome example.

> A more important one, with arrows:
>
>        test_monad = assert \ [11,101,22,202] == \
>          [1,2] >>= \a ->
>          [10 * a, 100 * a] >>= \b ->
>          b + a
>
> Instead we need:
>
>        test_monad = assert \ [11,101,22,202] == \
>          [1,2] >>= \a ->
>            [10 * a, 100 * a] >>= \b ->
>              b + a
>
> Or just some other breaking that doesn't end lines with "->":
>
>        test_monad = assert \ [11,101,22,202] == \
>          [1,2] >>= \
>          a -> [10 * a, 100 * a] >>= \
>          b -> b + a
>
> Of course, this would be better written using a monadic do.  (Is there any
> point of monads with effects?)  There another such program in io.duck, but
> otherwise everything looks fine.  If you're okay with breaking these uses,
> I'll push it in.

Beautiful. Push away.

As to the parenthesized question, I'm not sure what happens to monads
which carry around extra state or involve strange control flow like
the list monad. There may be no completely natural way to map these
into an effect system and syntax, or there might be a trick that
manages it.

Geoffrey

Reply all
Reply to author
Forward
0 new messages