Does this mean, if you *read the source code* for the interpreter, then you
can know the program semantics? So if I wanted to know the semantics for
Scheme, I could read the R5RS, but even better, I could read the source code
for MzScheme?
Q2.
"Most of these essentials relate to the semantics, or meaning, of program
elements. Such meanings reflect how program elements are interpreted as the
program executes." EoPL Preface xi.
The second sentence seems backwards to me. Does semantics reflect how
elements are interpreted, or does how the elements are interpreted reflect
the semantics?
Yes, _if_ you already know the semantics of the language that the
interpreter has been written in. If it is a metacircular interpreter
(i.e. one written in the language it is interpreting), then you have
to resort to other means for learning the language's semantics, e.g.
reading an informal English description, or some formal operational or
denotational rules, or maybe just playing around with an
implementation and learning by experimenting, how the language works.
This latter kind of inductive learning, of course, is always prone to
mistakes.
> Q2.
> "Most of these essentials relate to the semantics, or meaning, of program
> elements. Such meanings reflect how program elements are interpreted as the
> program executes." EoPL Preface xi.
>
> The second sentence seems backwards to me. Does semantics reflect how
> elements are interpreted, or does how the elements are interpreted reflect
> the semantics?
That depends. If the language was properly designed, then it probably
has an a priori semantics, and an implementation's job is simply to
execute programs according to the semantics. On the other hand,
sometimes a language has been implemented without exact
specifications, and then a semantics can be designed afterwards to
formally express what the interpreter does. This often a pretty
thankless task, though, since undesigned languages tend to be full of
horribly complicated kludges.
Lauri Alanko
l...@iki.fi
> Q2.
> "Most of these essentials relate to the semantics, or meaning, of program
> elements. Such meanings reflect how program elements are interpreted as the
> program executes." EoPL Preface xi.
>
> The second sentence seems backwards to me. Does semantics reflect how
> elements are interpreted, or does how the elements are interpreted reflect
> the semantics?
One can, in principle, define a mathematical function that assigns a
mathematical object to each phrase of a language. That's the meaning of the phrase.
Say your language looks like this:
exp = 0 | 1 | -1 | ... | (+ exp exp)
Then you could decide to say that the meaning of an exp is an Integer (Z), that
the phrase "0" denotes the number "0", etc, and that the phrase "(+ exp1 exp2)"
denotes the addition of the meaning of exp1 and the meaning of exp2.
Now write this as an interpreter and use the semantics to guide you.
-- Matthias
> Q1.
> "Programs called interpreters provide the most direct, executable expression
> of program semantics." EoPL Preface xi.
>
> Does this mean, if you *read the source code* for the interpreter, then you
> can know the program semantics?
It depends on what you mean by "know".
If you read a piece of sheet music, do you know what the music sounds
like? The situation is exactly analogous. If you have the right
training and the music is simple enough then you can "hear" the music by
reading the notes. If you know the semantics of the language the
interpreter is written in and the interpreter is simple enough then you
can "know" the semantics of the language by reading the interpreter.
But if you don't or it isn't then you can't.
E.
> > "Most of these essentials relate to the semantics, or meaning, of
> > program elements. Such meanings reflect how program elements are
> > interpreted as the program executes." EoPL Preface xi.
> One can, in principle, define a mathematical function that assigns a
> mathematical object to each phrase of a language. That's the meaning
> of the phrase.
Mathias, that's Schmidt's definition of Denotation Semantics (DS):
David Schmidt's book "DS: a methodology for language development"
states on p 3:
The DS method maps a program directly to its meaning, called its
denotation. The denotation is usually a mathematical value, such
as a number or a function. No interpreters are used, a valuation
function maps programs directly to its meaning.
I think that's a great definition of DS, and it includes your LC_v
Standard Reduction function eval_s, defined on p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps>
your paper with Matthew Flatt
Programming Languages and Lambda Calculi.
Unfortunately it seems to me that DS has been redefined to mean
specific ways of constructing the valuation function via domains
(i.e. Scott models of LC) and structural induction, the techniques
Schmidt describes in his book.
If that's what the *preface* is about, clearly the rest of EOPL is way out
of scope for me.
"Bill Richter" <ric...@math.northwestern.edu> wrote in message
news:57189ce0.04060...@posting.google.com...
> Thank you Bill for clarifying this issue.
>
> If that's what the *preface* is about, clearly the rest of EOPL is
> way out of scope for me.
Marlene, I didn't give you good advice. I took something Matthias
Felleisen (MFe) wrote & ran off in a different direction. I apologize.
Really, I'm just excited that MFe is now posting regularly to c.l.s.
I think he has a lot of good leadership to offer. We had a disastrous
thread last year about the R5RS DS, and there were 2 problems:
1) I made a bunch of dumb errors initially, and by the time I'd been
straightened out (mostly by MB and WC), folks were fed up, and
2) there was nobody on c.l.s. with MFe's expertise.
And now I'll try to answer your original question:
> Does semantics reflect how elements are interpreted, or does how
> the elements are interpreted reflect the semantics?
I say the first, for EoPL. The interpreter defines the semantics.
The meaning of a program (or an phrase), is what the interpreter does
to it. Sometimes this is expressed mathematically. Let me explain.
Here's the text from the EoPL introduction past your quote:
"Programs called interpreters provide the most direct executable
expression of the program semantics. They process a program by
directly analyzing an abstract representation of the program text.
We therefore choose interpreters as our primary vehicle for
expressing the semantics of programming language elements."
So it looks to me like EoPL is a book about interpreters, such as
DrScheme or Gambit. So much of the book should be accessible.
I think this is what's called Operational Semantics (OpS), and it's
got a different flavor than the DS I scared you off with. In fact,
the very previous paragraph in Schmidt's DS book is:
The OpS method uses an interpreter to define a language. The
meaning of a program in the language is the evaluation history that
the interpreter produces when it interprets the program. The
evaluation history is a sequence of internal interpreter
configurations.
So semantics = meaning which is expressed mathematically, but in OpS,
it's expressed by the interpreter. The "meaning" of your program is
what the interpreter is gonna do to it. That's OK, right?
Or as MFe said, the interpreter "assigns a mathematical object to each
phrase of a language". In the OpS/interpreter world, that means take
a phrase in your language, and evaluate in a given environment, with
various things stored in memory (and a continuation if you like), and
ask what the interpreter is going to print, or how the memory
locations will change, etc. That's not too mathematical, right?
BTW I enjoyed MFe's ambiguous, "That's the meaning of the phrase."
However, on the next page, EoPL says:
"Frequently our interpreters a very high level view that expresses
language semantics in a very concise way, not far from that of formal
mathematical semantics."
Maybe you'd have trouble there, and that might even involve some DS.
(I haven't read EoPL myself, but the Preface is on their web page.)
But EoPL is hard, and old, and maybe there are better books for you.
What do you want to learn? (I won't be able to help, but others can.)
(I don't mind math. That was my subject in graduate school.)
I am beginning to suspect Essentials of Programming Languages is Essential
for people who research and design languages, not for people who use
languages to build things.
> (I don't mind math. That was my subject in graduate school.)
Then let me be more specific, Marlene, and correct an error of my last
post. Schmidt defines DS to be the study of semantic functions
curly-E : Expression-of-our-Language ----> Some set
where the 2 sets and the function are mathematically defined. You
know what this means! And you know that curly-E can't be a computable
function by the Halting problem (i.e. Goedel Incompleteness).
So any kind of semantics becomes DS once you fully mathematize it. In
that sense, the OpS (as studied I think in EoPL) is also DS.
So OpS must mean the subset of DS where you concentrate on
interpreters, and not their full mathematization. A good example of
this is the R5RS DS, which as MB pointed out is usually understood by
schemers as functional programming. It's a good exercise (worked out
on Anton vS's web page) to code up the R5RS DS as a functional Scheme
program. That would be called OpS. Now to mathematize even
functional programming requires hard Lambda calculus. But it's still
the "shallow end" of the DS pool, as the real mathematization of the
R5RS DS uses domains: Scott models of the Lambda calculus, which
involves non-Hausdorff Cantor sets. That's much much harder Math!
And it's this "deep end" of the DS pool which is normally called DS.
> I am beginning to suspect Essentials of Programming Languages is
> Essential for people who research and design languages, not for
> people who use languages to build things.
Yeah, maybe! What do you want to build? I suspect EoPL is a book
about how to build interpreters. From the end of Abelson's preface:
"You'll come to see yourself as a designer of languages, rather than
only a user of languages, as a person who chooses the rules by which
languages are put together, rather than only a follower of rules that
other people have chosen."
I hope you get an answer soon to your original question.
> I am beginning to suspect Essentials of Programming Languages is Essential
> for people who research and design languages, not for people who use
> languages to build things.
In my experience, people who research and design languages are *much*
better at using them to build things. A huge part of solving a
complex problem is coming up with the language necessary to describe
the problem. If you know a bit about designing languages, you'll be
much further ahead than someone that just knows `how to program'.
As a math major you surely are aware of the power of a good formal
notation. (And as a computer scientist I am *very* frustrated by the
amazingly poor formal notations invented by mathematicians who don't
program!) You don't need to steep yourself ``non-Hausdorff Cantor
sets'' (what?) and Scott's domains to understand programming
semantics. Programming usually involves creating a `mini-language' or
extending an existing language with problem-specific features.
Knowing a bit about semantics will keep you from doing amazingly dumb
things like separating program clauses into `statements' and
`expressions' that cannot be interchanged, making the syntax depend on
runtime values, forgetting about tail recursion, etc.
~jrm
p.s. Scott was trying to adapt set theory to programming, but since
programs can operate on programs, you need the set P : P->P which is,
unfortunately, empty. However, if you add some conditions to P (like
restricting it to continuous partial functions) you can make it
satisfy the conditions necessary to apply Tarski's fixed-point theorem
and construct a non-empty set suitable for modeling programming.
Some people are very uncomfortable with mathematics that has not been
proven correct. Some people are uncomfortable unless they work
through the proof themselves. I'm pretty sure that Scott and Tarski
got it right, so I haven't bothered to memorize the details.
> Knowing a bit about semantics will keep you from doing amazingly dumb
> things like separating program clauses into `statements' and
> `expressions' ...
Gold. Can I use it in a signature?
And Marlene: Just by reading your (very good) questions in
this group makes me absolutely sure, that you will like
"Essentials of Programming Languages". Get it at the library
first, if you want to skim it first.
(the reason you get so good answers, is because you ask
the right ones)
--
Jens Axel Søgaard
> I say the first, for EoPL. The interpreter defines the semantics.
> The meaning of a program (or an phrase), is what the interpreter does
> to it. Sometimes this is expressed mathematically. Let me explain.
>
> [...]
>
> So it looks to me like EoPL is a book about interpreters [...]
Bill, have you considered actually reading the books you're talking
about before you hold forth about them? Or do you only read the
prefaces and forewords of books?
(I understand that reading books has the deleterious effect of
actually slowing down your prodigious newsgroup and mail output.)
Shriram
> Programming usually involves creating a `mini-language' or extending
> an existing language with problem-specific features. Knowing a bit
> about semantics will keep you from doing amazingly dumb things like
> separating program clauses into `statements' and `expressions' that
> cannot be interchanged, making the syntax depend on runtime values,
> forgetting about tail recursion, etc.
Cool, Joe. Sounds heavy on interpreters. Where's a place to read?
Perhaps a better book for Marlene would be Shriram Krishnamurthi's
Programming Languages: Application and Interpretation
listed right after EoPL:
<http://www.schemers.org/Documents/#all-texts>
> p.s. Scott was trying to adapt set theory to programming, but since
> programs can operate on programs, you need the set P : P->P which
> is, unfortunately, empty. However, if you add some conditions to P
> (like restricting it to continuous partial functions)
You mean P = (P->P), right? The function space of continuous maps
from P to itself is bijective with P. And so you need a topology on P
to talk about continuous maps from P to P.
> You don't need to steep yourself ``non-Hausdorff Cantor sets''
> (what?) and Scott's domains to understand programming semantics.
I think so. As you say, you just need to know that P = (P->P), you
don't have to know why. But (responding to your (what?)) here's why
Scott's domain P = P(infty) is a ``non-Hausdorff Cantor set'':
P(omega) is the power set of the natural numbers N, i.e. the set
{ S subset N } = (N -> boolean).
See, a function f : N ---> boolean defines a subset
S = f^{-1}(true) subset N
The usual topology on (N -> boolean) has "basic" open sets the
collection of subsets of (N -> boolean) of the form
O(A,B) = { S subset N : A subset S, B disjoint from S }
where A and B are finite subsets of N.
With this topology, (N -> boolean) is the well known Cantor set, which
you should know something about from fractals: the "dust"Julia sets.
Scott's P(omega) is (N -> boolean) with a different and non-Hausdorff
topology. The basic open sets of P(omega) are
O(A) = { S subset N : A subset S }
for finite subsets A subset N. We did this a year ago on c.l.s. If
you don't know what Hausdorff means, let's just say this: the real
line R is Hausdorff, and non-Hausdorff is really strange. I think
Scott was a genius to come up with his P(omega).
> A huge part of solving a
> complex problem is coming up with the language necessary to describe
> the problem.
> Programming usually involves creating a `mini-language' or
> extending an existing language with problem-specific features.
I've never thought about programming in this way. Abelson talks about
metalinguistic abstraction, which puzzled me. Is this the same as or related
to what you are saying?:
----
"We control complexity by establishing new languages for describing design,
each of which emphasizes particular ascpects of the design and deemphasizes
others." SICP
"a cluster of languages, where the pieces could be flexibly combined"
preface to EOPL
"To appreciate this point [the evaluator is ust another program] is to
change our images of ourselves as programmers. We come to see ourselves as
designers of languages, rather than only users of languages designed by
others." SICP
"Perhaps the whole distinction between programming and programming language
is a misleading idea, and future programmers will see themselves not as
writing programs in particular, but as creating new languages for each new
application." preface to EOPL
----
I thought of a way to explain my concern. A plumbers doesn't need to read
Plato to be good at plumbing. He might enjoy reading Plato. So he reads
Plato in his spare time on Sundays. If Plato is good for the plumber, why
don't we see all plumbers reading Plato?
Thank you very much, Jens Axel, for your encouragement and advice.
I own the book. It looks fun to read. I like the idea of learning by
implementing ideas in code. It's so tedious having to read English prose and
map ambiguous words and metaphors to technical ideas. (I like the R5RS.) I
am trying to decide whether I am "allowed" to move this book from the Fun
queue to the queue with Tanenbaum's Computer Networks, Lea's Concurrent
Programming in Java, etc.
> Joe Marshall <j...@ccs.neu.edu> responded to Marlene Miller:
>
>> Programming usually involves creating a `mini-language' or extending
>> an existing language with problem-specific features. Knowing a bit
>> about semantics will keep you from doing amazingly dumb things like
>> separating program clauses into `statements' and `expressions' that
>> cannot be interchanged, making the syntax depend on runtime values,
>> forgetting about tail recursion, etc.
>
> Cool, Joe. Sounds heavy on interpreters. Where's a place to read?
I haven't seen a `how to design a language' book. The Art of the
Interpreter <http://c2.com/cgi/wiki?TheArtOfTheInterpreter> is a good
paper to look at. There is a lot of good stuff at
<http://library.readscheme.org/> As for my `dumb things' list, those
are things I've encountered in various languages.
There are many languages that draw a distinction between `statements'
and `expressions'. C is one. There are statements like while, do,
if, return, for, break, etc. and there are expressions like
(a + b)*c. Semantically, there are two kinds of continuation, one for
statements and one for expressions. A statement continuation discards
its argument, but an expression continuation does not. So you cannot
use a statement where you have an expression because a statement will
not supply a return value to the continuation.
Unfortunately, most of the control-flow constructs in C are
statements. There are constructs in C for expression sequences and
conditional expressions, but they have a completely different syntax
from statement sequences and conditional sequences. The C statement
if (x > 3) {
x += 2;
y -= 3;
return TRUE;
}
else {
x -= 2;
y += 7;
return FALSE;
}
is essentially equivalent to the C expression
(x > 3)
? (x += 2,
y -= 3,
TRUE)
: (x -= 2,
y += 7,
FALSE)
but you clearly cannot just substitute one for the other! Yet you may
wish to do exactly that sort of thing if you are refactoring code.
If your syntax depends on runtime values, then you cannot effectively
compile your program. In REBOL, for example, the expression
[foo bar baz] could mean (begin (foo) (bar) (baz)) or it could mean
(foo (bar (baz))) or it could mean (begin (foo bar) (baz)) or any of
about 20 other parses. What it means depends on the current values of
foo, bar, and baz, and that can change over time.
Without tail recursion, you must pepper your language with looping
constructs. Users cannot create their own, so you must supply a wide
variety. But loops can only express primitive recursion, so there
will be some things that are extraordinarily painful to compute this
way. Users will also not be able to resort to
continuation-passing-style if they need complex control flow.
> Perhaps a better book for Marlene would be Shriram Krishnamurthi's
> Programming Languages: Application and Interpretation
> listed right after EoPL:
> <http://www.schemers.org/Documents/#all-texts>
>
>> p.s. Scott was trying to adapt set theory to programming, but since
>> programs can operate on programs, you need the set P : P->P which
>> is, unfortunately, empty. However, if you add some conditions to P
>> (like restricting it to continuous partial functions)
>
> You mean P = (P->P), right? The function space of continuous maps
> from P to itself is bijective with P. And so you need a topology on P
> to talk about continuous maps from P to P.
Yes.
>> You don't need to steep yourself ``non-Hausdorff Cantor sets''
>> (what?) and Scott's domains to understand programming semantics.
>
> I think so. As you say, you just need to know that P = (P->P), you
> don't have to know why.
Right. Scott proved that the domain is well-founded and his word is
good enough for me.
> Thank you Joe for your (always) helpful insights and explanations.
>
>> A huge part of solving a
>> complex problem is coming up with the language necessary to describe
>> the problem.
>
>> Programming usually involves creating a `mini-language' or
>> extending an existing language with problem-specific features.
>
> I've never thought about programming in this way. Abelson talks about
> metalinguistic abstraction, which puzzled me. Is this the same as or related
> to what you are saying?:
It's related. You can modify an existing language to accept a few new
constructs or you can go whole hog and write a brand new language
tailor-made for your problem. One advantage to Lisp and Scheme is
that you can do something inbetween these two extremes. Any new
language is going to need variables, definitions, primitive data,
etc. You'll probably want strings and numbers for interacting with
the rest of the world. You'll need to manage memory. Start with
Scheme or Lisp and you get all that for free.
> I thought of a way to explain my concern. A plumbers doesn't need to read
> Plato to be good at plumbing. He might enjoy reading Plato. So he reads
> Plato in his spare time on Sundays. If Plato is good for the plumber, why
> don't we see all plumbers reading Plato?
If your only aspiration were to be a plumber, then Plato may not have
a direct impact on your life. But I'm not sure this analogy is quite
the right one.
You don't *have* to understand programming semantics to be a good
programmer, but people that do understand semantics tend to be far
better programmers than people who do not. Furthermore, given the
absolutely horrible state of software in the world, it seems that the
bulk of people writing software are not good programmers.
So instead of Plato, what about fluid dynamics? A plumber doesn't
need to understand fluid dynamics to solder pipes together, but
if plumbing were like software, we'd be ankle deep in water. A
plumber with an understanding of fluid dynamics would get more work
and be able to relax on Sundays in a dry house.
> > So it looks to me like EoPL is a book about interpreters [...]
>
> Bill, have you considered actually reading the books you're talking
> about before you hold forth about them? Or do you only read the
> prefaces and forewords of books?
>
> (I understand that reading books has the deleterious effect of
> actually slowing down your prodigious newsgroup and mail output.)
:D Shriram, I don't think EoPL is on-line, unlike your book, and your
expertise here greatly exceeds mine. So please bail me out:
Would it seem that Marlene ought to read your book on schemers.org
instead of EoPL? Could you compare the goals of the 2 books?
My expertise here just has to do with MFe's response on 2004-05-27:
> Does semantics reflect how elements are interpreted, or does how
> the elements are interpreted reflect the semantics?
One can, in principle, define a mathematical function that assigns
a mathematical object to each phrase of a language. That's the
meaning of the phrase.
That's DS. The point is that any semantics becomes DS if you fully
mathematize it. I've had a lot of fun thinking about LC and R5RS DS.
But as to how to write interpreters, why learning about interpreters
or semantics makes one a better programmer: I'm a rank beginner.
> Would it seem that Marlene ought to read your book on schemers.org
> instead of EoPL? Could you compare the goals of the 2 books?
I like EoPL; I loved the first edition (which I bet you could get real
cheap used) even more. I cut my teeth on it, and it's what convinced
me to take up programming languages. So I couldn't possibly tell
someone to read my book instead of EoPL. Nor do I want to flog my
book here.
Given what Marlene has told us of her background, EoPL may indeed be a
better book than mine for her.
My book grew out of a great frustration from teaching from EoPL.
There are several things I don't like about it as a professor. I was
once asked by a correspondent (Ehud Lamm, I think) about the pedagogic
philosophy behind the text and for whom it was written. I wrote:
My long-term vision for teaching programming languages is to integrate
the "two cultures" that have evolved in its pedagogy. I was raised in
the interpreter (EoPL) culture, which meant I looked with some disdain
at the "survey of languages" courses. After a while I realized that
otherwise intelligent people used the survey approach, so I spent some
time trying to understand what they got out of it.
I still think that not doing interpreters (broadly construed) is a
mistake, and students who go through the experience of doing it come
out with a much richer perspective. But what I have realized is that
students who don't do the survey also lose something valuable.
Since a course needs one dominant philosophy, I decided to make the
interpreters dominant but use the survey to inform the interpreters.
So students program with a new set of features first (survey), then
try to distill those principles into an actual interpreter. This has
the following benefits:
- by seeing the feature in the context of a real language, they can
build something interesting with it first, so they understand that
it isn't an entirely theoretical construct, and will actually *care*
to build an interpreter for it (in my experience, a few students who
are interested in knowledge for its own sake will get excited about
the interpreter in either case, but I want to also capture the
attention of the other 90%)
- they get at least fleeting exposure to multiple languages, which is
an important educational attribute that is fast being crushed in
this era of Java's dominance (and in the process, they come to
understand why Java will not be the last word in languages)
- because they have already coded with the feature, the explanations
and discussions are much more interesting than when all they have
seen is an abstract model
- by first building a mental model for the feature through experience,
they have a much better chance of actually figuring out how the
interpreter is supposed to work
In short, many more humans work by induction than by deduction, so a
pedagogy that supports it is much more likely to succeed than one that
suppresses it. The book currently reflects this design, though the
survey parts are done better (!) in lecture than in the book; that
will change in future versions.
Separate from this vision is a goal. My goal is to not only teach
students new material, but to also change the way they solve problems;
as Marx wrote, "The philosophers have only interpreted the world in
various ways; the point, however, is to change it." I want to show
students where languages come from (the language "nebula"), why we
should regard languages as the ultimate form of abstraction, how to
recognize such an evolving abstraction, and how to turn what they
recognize into a language. The last section of the book, on
domain-specific languages, is a very, very weak step in this
direction. The homeworks I've done in the class have conveyed this
point much better. Over time, I will update the text to reflect what
the homeworks have taught.
The book is currently the sole textbook for the programming languages
course at Brown, where it is taken primarily by juniors (3rd year),
seniors (4th year) and beginning graduate (both MS and PhD) students.
It seems very accessible to smart sophomores (2nd year) too, and
indeed those are some of my most successful students. The book has
been used at some other universities as a primary or secondary text.
The book's material is worth one undergraduate course worth of credit;
for students who want graduate credit, I supplement the material in
the book with some research paper readings.
The book is still very much under development.
One common criticism I have heard of the text is that the writing is
too colloquial; it sounds too much like me standing in front of a room
and lecturing. However, I don't intend to change the voice of the
book (tighten it, of course; change it, no). I've been told this may
make it much harder to publish the book formally. Either way, I
intend to continue offering a full, free copy on the Web.
Comments welcome.
Shriram
[...]
> Separate from this vision is a goal. My goal is to not only teach
> students new material, but to also change the way they solve problems;
> as Marx wrote, "The philosophers have only interpreted the world in
> various ways; the point, however, is to change it." [...]
The change Marx had in mind was metaphysical (i.e. delusional). See
"Science, Politics, and Gnosticism," by Eric Voegelin, and "Karl Marx:
Communist as Religious Eschatologist," in the second volume of "Logic of
Action" by Murray Rothbard.
Sorry for the off-topic distraction, but I'm not one to pass up a chance
to dis Marx.
-thant
I think EoPL does a poor job on some crucial topics:
- type systems
- garbage collection
- domain-specific languages
The material on types is so caught up in mechanics (especially of type
inference) that I think it fails to provide very much insight into
types. There is little or discussion of soundness, safety, etc. This
is a pity coming from authors who are masters of the topic, but it is
in general consistent with EoPL's "look ma"-ness.
Garbage collection isn't discussed at all in any meaningful way. I
find that students are generally woefully uninformed about garbage
collection, having lots of wrong ideas in their heads. The
programming languages course at most universities is the only chance
to rectify some of these misconceptions (especially since many
colleagues on facutly actively cause the misconceptions). This is one
place where I have found having them implement collectors, very much
in the spirit of EoPL, is actually really helpful; it takes a lot of
the mystery out of GC. I think we also have a responsibility to
discuss some of the systems aspects of GC, especially provide a
meaningful comparison to manual memory management.
While the entire EoPL philosophy is built around "build your own
language" (as Abelson's preface also points out), EoPL doesn't reflect
on this practice. As such, many students can leave an EoPL course
unsure of what to do with the interpreters, eg, not knowing that
Scheme macros offer a great way to transplant what they've learned to
building their own languages. [In prehistoric times, EoPL was
actually written entirely through macros. I'm glad this practice did
not survive, but it's a pity that the book has swung entirely in the
opposite direction.]
All these points can also be read as positive statements about EoPL,
especially if you're a purist. So these should help you determine
which book is more appropriate for your studies, EoPL or PLAI.
Hopefully you will read both. But if you're going to insist on
reading only one, follow the advice in my Acknowledgments section --
read EoPL, a true classic.
Shriram
we should regard languages as the ultimate form of abstraction
That's exciting, Shiram! You must mean some higher-level version of
what you write on p 235 of your book: "Scheme's map and filter are
also abstractions". So I read HtDP to try to learn about abstraction.
And I like HtDP's sec 21.5 "Designing Abstractions from Templates":
abstracting your design template for lists leads us to the abstract
function `fold' (actually in sec 22.2). I really see how HtDP
abstraction makes for better programmers (Plato for plumbers), and I
can maybe dimly glimpse how your "ultimate abstraction" helps.
I very much liked your (short) Ch 19 "Semantics". You write:
It would be convenient to have some common language for explaining
interpreters. We already have one: math!
We call this *big step operational semantics*. It's semantics
because it ascribes meaning to programs. [...] It's operational
because we aren't compiling the entire program into a mathematical
object and using fancy math to reduce it to an answer.
Good: semantics is Math. By Schmidt's definition, you're doing DS as
well. You're defining a semantic (or valuation) function
V: Expressions -> (Environments -> Values)
It's something I tried unsuccessfully to post 2 years ago. The fact
that you're not using fancy Math (CPOs & Scott models of LC) doesn't
mean it's not DS, by Schmidt's definition. I have some comments:
1) There's some unstated mathematical induction in your definition of
V. E.g. your rule on p 173 includes
b, E'[i <- a_v] ==> b_v
but of course you'll have to apply the rule recursively to do so, and
maybe it will not halt, and in that case (V exp E) = bottom, and that
shows V is not a computable function... This kind of induction isn't
hard, but it's all you need to define V. You don't need CPOs...
2) I'm not quite sure you define your mathematical set Values. But it
sure looks to me like you don't need Scott models of LC. The subset
Procedure-Values of Values just consists (p 172) of triples
<function-name, function-body text, evaluating environment>
So you don't run into the problem MB explained to me 2 years ago:
> E [the R5RS domain of Values] is "defined" as
>
> E = .... some expressions ultimately involving E ...
you only need Scott models (non-Hausdorff Cantor sets) if you define
Procedure-Values to be a set of functions on Values, while Values
contains Procedure-Values as a subset. You avoid this problem by
giving your (I say SICP-ish) triples definition of Procedure-Values.
3) You only fail to define an R5RS-like semantic function
Expressions -> (Env Store Cont -> Answers)
by not doing continuations (which you discussed at length 2 sections
earlier, so probably you'll probably do Cont when you expand Ch 19),
and by "conflating Store and Env", to use MB's great phrase. I
haven't read enough of your book to comment on your decision.
4) So I claim that your "big step operational semantics" is also DS,
by Schmidt's definition. It seems that folks only say DS if you're
doing CPOs & Scott models, but that's a cultural distinction.
Shriram> I think EoPL does a poor job on some crucial topics:
Shriram> - type systems
Shriram> - garbage collection
Shriram> - domain-specific languages
Concurrency?
--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla
> Shriram> I think EoPL does a poor job on some crucial topics:
>
> [...]
>
> Concurrency?
Well, if we start going down this path, there's a lot more that one
could add. I was trying to limit myself to things that I do discuss
in some detail in my course/book.
I don't cover concurrency due to a peculiarity of Brown's curriculum,
which covers it very well in several other courses (or at least well
enough that I don't feel like expending precious time on the subject).
As an example of something that I *do* cover, I think it's important
to show that types are only the tip of the proof iceberg, and there
are interesting techniques for proving properties of programs; eg, I
discuss model checking.
Shriram
> 2) I'm not quite sure you define your mathematical set Values. But it
> sure looks to me like you don't need Scott models of LC. The subset
> Procedure-Values of Values just consists (p 172) of triples
> <function-name, function-body text, evaluating environment>
>
> So you don't run into the problem MB explained to me 2 years ago:
>
>> E [the R5RS domain of Values] is "defined" as
>>
>> E = .... some expressions ultimately involving E ...
>
> you only need Scott models (non-Hausdorff Cantor sets) if you define
> Procedure-Values to be a set of functions on Values, while Values
> contains Procedure-Values as a subset. You avoid this problem by
> giving your (I say SICP-ish) triples definition of Procedure-Values.
Right, but this removes first-class functions from your language.
Rather boring.
> 3) You only fail to define an R5RS-like semantic function
>
> Expressions -> (Env Store Cont -> Answers)
>
> by not doing continuations (which you discussed at length 2 sections
> earlier, so probably you'll probably do Cont when you expand Ch 19),
> and by "conflating Store and Env", to use MB's great phrase. I
> haven't read enough of your book to comment on your decision.
Remove continuations and you remove control flow. That's even less
interesting than a language with no functions.
Joe, that's identifier, not function-name, so (lambda (x) body)
evaluates in environment E to the Procedure-Value triple <x, body, E>.
That's basically SICP, which you agree has first-class functions.
> > So you don't run into the problem MB explained to me 2 years ago:
> >
> >> E [the R5RS domain of Values] is "defined" as
> >>
> >> E = .... some expressions ultimately involving E ...
> >
> > So you only need Scott models if you define Procedure-Values (a
> > subset of Values) to be a set of functions on Values.
> Right, but this removes first-class functions from your language.
> Rather boring.
What? Shriram didn't change his language (approx. Scheme) at all, but
only the definition of the set Values in his semantic function
Expressions -> (Env -> Values)
These aren't my ideas. It's Shriram's book PLAI
<http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/PDF/all.pdf>
listed on schemers.org right after EoPL.
> > 3) You only fail to define an R5RS-like semantic function
> >
> > Expressions -> (Env Store Cont -> Answers)
> >
> > by not doing continuations, and by "conflating Store and Env", to
> > use MB's great phrase.
> Remove continuations and you remove control flow. That's even less
> interesting than a language with no functions.
Sure, but Part IX Semantics is only 3 pages long, so maybe it's under
construction. Part VI Continuations is 50+ pages long. So I think we
can conclude that Shriram knows how to add continuations to his
semantics without changing everything (adding in Scott models e.g.).
I think I would learn much from reading both books.
Marlene, the plumber
> Joe Marshall <j...@ccs.neu.edu> responded to me:
>>
>> > it sure looks to me like you don't need Scott models of LC. The
>> > subset Procedure-Values of Values just consists (p 172) of triples
>> > <function-name, function-body text, evaluating environment>
>
> Joe, that's identifier, not function-name, so (lambda (x) body)
> evaluates in environment E to the Procedure-Value triple <x, body, E>.
> That's basically SICP, which you agree has first-class functions.
Die, horse, die!
Shriram is using *operational* semantics rather than *denotational*
semantics. The key difference is this: Denotational semantics
defines a function that maps programs to what they mean; operational
semantics defines a set of rules that *maintain* the meaning.
As an example, look at function application:
f, E => <i, b, E0> a, E => av b, E0 [i <-av] => bv
-----------------------------------------------------------
{f a}, E => bv
This says that *provided that* f, E is the triple <i, b, E0> *and*
a, E reduces to av, *and* b, E0[i <- av] reduces to bv, *then* (f a),
E reduces to bv.
There are a couple of unanswered questions:
1. Does there exist an f, E, i, b, E0, a, av, and bv that can
satisfy the antecedents? (In particular, it would be nice to
know if there exists non-empty sets of values for av and bv.)
2. Do the sets of values av and bv correspond to the kinds of
values we want to manipulate?
3. Does this reduction rule correspond to our intuitive notion of
function application? In particular, if I want to model the
`add three' function, does there exist an f (a program) that
will do that?
There are three approaches to answering these questions:
1. Assume a-priori that Shrirams semantics for reducing an
application correspond to the common notion of function
application.
2. *Prove* (or disprove) that Shrirams semantics do indeed mean
function application.
3. Treat the operational semantics as the rules for a meaningless
game.
If you don't wish to simply assume that the semantics work, then you
either need to prove they do or take the position that `function
application' as defined in the language is simply a complex syntactic
operation that may or may not have anything to do with mathematical
functions.
>> > So you only need Scott models if you define Procedure-Values (a
>> > subset of Values) to be a set of functions on Values.
>
>> Right, but this removes first-class functions from your language.
>> Rather boring.
Let me amend this: Without Scott domains I can only consider
Procedure-Values to be curiously formed tuples that have a complex
reduction rule. This isn't interesting.
> What? Shriram didn't change his language (approx. Scheme) at all, but
> only the definition of the set Values in his semantic function
>
> Expressions -> (Env -> Values)
>
> These aren't my ideas. It's Shriram's book PLAI
> <http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/PDF/all.pdf>
> listed on schemers.org right after EoPL.
Yes, but Shriram is discussing ``big-step operational semantics'' not
denotational semantics. Shriram's semantics will tell me that
((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
relationship between the Scheme expression `(lambda (x) (+ x 3))' and
the mathematical function `add three'.
> Yes, but Shriram is discussing ``big-step operational semantics'' not
> denotational semantics. Shriram's semantics will tell me that
> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
> the mathematical function `add three'.
Quite right, though note that it will also tell you
((lambda (x) (+ x 3)) 7) => ^10^ [circumflex-10]
You could even possibly prove that, for all numbers ^N^,
((lambda (x) (+ x 3)) N) => ^N+10^
where N => ^N^.
Joe enumerates three approaches for Making Sense of my semantics:
> 1. Assume a-priori that Shrirams semantics for reducing an
> application correspond to the common notion of function
> application.
>
> 2. *Prove* (or disprove) that Shrirams semantics do indeed mean
> function application.
>
> 3. Treat the operational semantics as the rules for a meaningless
> game.
Since I do periodically allude in that section to the relationship
between the semantics and an interpreter, we can rule out both #3 (the
semantics doesn't live cut off from the universe) and #1 (though I'm
taking great liberties, I'm at least trying to draw offer a
justification). We are therefore left with two refinements of #2:
1. Prove that the semantics captures function application.
2. Prove that the semantics faithfully reflects the interpreter, and
prove that the interpreter captures function application.
Shriram
> Shriram is using *operational* semantics rather than *denotational*
> semantics.
Let's be precise, Joe. Shriram calls it big-step OpS (p 173, PLAI).
But it's also DS by Schmidt's definition (p 3 of his DS book):
The DS method maps a program directly to its meaning, called its
denotation. The denotation is usually a mathematical value, such
as a number or a function. No interpreters are used, a valuation
function maps programs directly to its meaning.
So any mathematically defined function
V: Programs [or Expressions] ---> Some-Set
is a DS valuation function, by Schmidt's definition. So practically
any semantics is DS, once you mathematize it. (More DS talk below.)
> The key difference is this: Denotational semantics defines a
> function that maps programs to what they mean;
But Shriram did just that. He mapped programs (expressions even)
mathematically to a set, by a function (I'll give names here)
V: Expressions -> (Env -> Values)
In Schmidt's definition of DS, "meaning" just means a mathematical
value. If you don't think his subset Procedure-Values is interesting,
that's fine, but there's no mathematical point to argue about.
Let's keep going. You seemed to agree that Shriram is a good enough
semanticist that he could've expanded his semantics to a math function
W: Expressions -> (Env Store Cont -> Answers)
Now Shriram's W won't be the same as the R5RS DS semantic function
curly-E, because the target sets are different. Since Shriram has a
different definition of Values (using Procedure-Value triples to
bypass Scott models), his Store & Cont will differ from R5RS DS.
But Shriram's set Answers will be the same as R5RS DS, so we can ask a
more meaningful question. R5RS DS uses an initial
<rho_0, sigma_0, kappa_0> in Env x Store x Cont
Shriram's semantic function W will also uses such initial values, and
let's call them by the same names, even though they live in different
sets. Then for any program P, I claim that
W[[ P ]] <rho_0, sigma_0, kappa_0>
=
curly-E[[ P ]] <rho_0, sigma_0, kappa_0> in Answers
By P here I mean the expression you get by wrapping P in a let form
with "undefined"s as R5RS DS does in sec 7.2. I think this would be
easy to prove. And I'd say that was a satisfactory "proof" of your
> 2. *Prove* (or disprove) that Shriram's semantics do indeed mean
> function application.
which is to say, that's what I think your "mean" should mean.
> [...] Shriram's semantics will tell me that
> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
> the mathematical function `add three'.
Sure, but that's a matter for proving theorems about observational
equivalence (which MFe has posted about). It's not a deficiency of
Shriram's semantic functions V & W. I don't see that Shriram's
semantics is at any disadvantage here with R5RS DS.
I think folks have posted that R5RS DS isn't particularly good for
observational equivalence, and I remember WC posting that R5RS DS is
actually too strict: there are expressions that we want to say are the
same, but curly-E distinguishes them because of some inconsequential
differences in what gets stored in what locations.
Maybe we should say that DS is a subject that humans work in, so DS
means whatever the DS humans are doing. I work in the subject of
"Homotopy Theory", but the meaning of "Homotopy Theory" has changed
quite a bit since I started 25 years ago. However, let's note that
Barendregt is much in agreement with Schmidt's definition above.
Barendregt's book is called
The Lambda Calculus: its Syntax and Semantics,
and he says LC Semantics include the term model (more or less standard
reduction) as well as the much harder Scott models.
Shriram's book comes close to a precise definition. In his short
Semantics section, he writes
It would be convenient to have some common language for explaining
interpreters. We already have one: math!
[...] It's semantics because it ascribes meanings to programs.
To me, that sounds like what Schmidt calls DS, Shriram calls S! How
about this: any mathematically defined function
V: Programs [or Expressions] ---> Some-Set
is an S valuation function. [Now don't tell me the previous sentence
here was "We call this a big-step OpS. It makes no difference.]
I'll abide by Shriram's definition of S, if others will do accept it
and also reject Schmidt's definition. What about a definition of DS?
In an interesting private discussion, I think Shriram said he didn't
want to call something DS unless it made really serious and integrated
use of Scott models. Maybe that means if you only use Scott models to
solve the P = (P -> P) problem, it's not really DS. That's fine, as
long as we admit this is culture, and not Math. We can argue about
whether somebody's "really" using Scott models, just like we can argue
about whether some proof is "deep", or "trivial". The truth of our
theorems must be above such political discussions, or it's not Math.
Math isn't a `common language' if we won't use it precisely.
> Joe Marshall <j...@ccs.neu.edu> responded to me:
>
>> Shriram is using *operational* semantics rather than *denotational*
>> semantics.
>
> Let's be precise, Joe. Shriram calls it big-step OpS (p 173, PLAI).
> But it's also DS by Schmidt's definition (p 3 of his DS book):
>
> The DS method maps a program directly to its meaning, called its
> denotation. The denotation is usually a mathematical value, such
> as a number or a function. No interpreters are used, a valuation
> function maps programs directly to its meaning.
I don't see how this applies. Shriram's equations do *not* map
programs directly to their meaning. Shriram is assuming that a
mapping exists and shows how to reduce Scheme expressions while
preserving the mapping. In fact, if we restrict our Scheme to
lambda-expressions only, we can dispense with the right hand side.
We'd still have an operational semantics that reduced expressions, but
no denotation attached to it.
> So any mathematically defined function
>
> V: Programs [or Expressions] ---> Some-Set
>
> is a DS valuation function, by Schmidt's definition.
The set is supposed to be the meaning of the program. The unix `wc'
command (word count) maps program text to non-negative integers, but
it is not generally considered a DS valuation function.
> So practically any semantics is DS, once you mathematize it. (More
> DS talk below.)
Nonsense. Denotational semantics involves finding a function that
maps programs directly to their meaning. There can be other
relationships between programs and meanings. Consider a `boolean
semantics' which maps (program * set) -> boolean (true if the
program's meaning is contained within the set, false otherwise). It's
a weird sort of semantics (and not entirely useless), but not
denotational.
>> The key difference is this: Denotational semantics defines a
>> function that maps programs to what they mean;
>
> But Shriram did just that.
Take a look at pg. 171 There are constructs like this:
l => ^lv^ r => ^rv^
-----------------------------
{+ l r} => ^lv + rv^
These are called `judgements'. They are not terms in curly-E and
Shriram makes no claim that they are.
> He mapped programs (expressions even) mathematically to a set, by a
> function (I'll give names here)
>
> V: Expressions -> (Env -> Values)
Some of the rules in the operational semantics have no antecedents.
For example,
n,E => ^n^
These are taken as execution axioms.
> In Schmidt's definition of DS, "meaning" just means a mathematical
> value. If you don't think his subset Procedure-Values is interesting,
> that's fine, but there's no mathematical point to argue about.
>
> Let's keep going. You seemed to agree that Shriram is a good enough
> semanticist that he could've expanded his semantics to a math function
>
> W: Expressions -> (Env Store Cont -> Answers)
The reason Shriram didn't expand that is because he isn't trying to do
denotational semantics. On page 173 Shriram says explicitly:
``It's *operational* because ... we aren't compiling the entire
program into a mathematical object and using fancy math to reduce it
to an answer.''
> Now Shriram's W won't be the same as the R5RS DS semantic function
> curly-E, because the target sets are different.
Yes. This is why you don't try to expand the axioms.
> Since Shriram has a different definition of Values (using
> Procedure-Value triples to bypass Scott models), his Store & Cont
> will differ from R5RS DS.
>
> But Shriram's set Answers will be the same as R5RS DS,
That's a bold assertion.
> so we can ask a more meaningful question.
>
> R5RS DS uses an initial
>
> <rho_0, sigma_0, kappa_0> in Env x Store x Cont
>
> Shriram's semantic function W will also uses such initial values, and
> let's call them by the same names, even though they live in different
> sets. Then for any program P, I claim that
>
> W[[ P ]] <rho_0, sigma_0, kappa_0>
> =
> curly-E[[ P ]] <rho_0, sigma_0, kappa_0> in Answers
>
> By P here I mean the expression you get by wrapping P in a let form
> with "undefined"s as R5RS DS does in sec 7.2. I think this would be
> easy to prove.
Feel free to provide the proof. Consider in particular the issue of
self-application. Hint:
http://www1.elsevier.com/homepage/sac/opit/24/article.pdf
Meyer and de Vink make heavy use of domains, though, so you'll have to
remove them from the proof.
> And I'd say that was a satisfactory "proof" of your
>
>> 2. *Prove* (or disprove) that Shriram's semantics do indeed mean
>> function application.
>
> which is to say, that's what I think your "mean" should mean.
I see no proof.
>> [...] Shriram's semantics will tell me that
>> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
>> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
>> the mathematical function `add three'.
>
> Sure, but that's a matter for proving theorems about observational
> equivalence (which MFe has posted about).
Given that the denotational semantics *do* say that
`(lambda (x) (+ x 3))' means the mathematical function `add three',
this is a central point.
> It's not a deficiency of Shriram's semantic functions V & W. I
> don't see that Shriram's semantics is at any disadvantage here with
> R5RS DS.
I never said that Shriram's semantics were deficient, I said they were
operational rather than denotational. They approach the problem of
semantics in a different manner.
> Shriram's book comes close to a precise definition. In his short
> Semantics section, he writes
>
> It would be convenient to have some common language for explaining
> interpreters. We already have one: math!
> [...] It's semantics because it ascribes meanings to programs.
>
> To me, that sounds like what Schmidt calls DS, Shriram calls S!
Not to me. Semantics ascribes meanings to programs, but there are
many techniques for this. Denotational semantics attempts to define a
function over programs that maps them to meanings. Operational
semantics identifies the meaning of a program with the steps taken to
evaluate it. Both are mathematical.
--
~jrm
The small steps operational semantics do not define any meaning for programs
but allows you to relate a program with valid reductions of it.
Big-step and DS are different end of story.
Since you feel so keen to interpret Schmidt, you may want to have a
look at <http://citeseer.ist.psu.edu/schmidt95programming.html>.
There Schmidt says, among other things:
Unlike denotational semantics, natural semantics does not
claim that the meaning of a program is necessarily
"mathematical."
Here "natural semantics" means big-step operational semantics, as you
can readily verify by reading the paper.
Lauri Alanko
l...@iki.fi
> Big-step Ops only define meanings for terminating programs. i.e. the
> evaluation function is a partial function. DS requires the meaning
> function be total for all programs.
>
> The small steps operational semantics do not define any meaning for
> programs but allows you to relate a program with valid reductions of it.
>
> Big-step and DS are different end of story.
Let me clarify one subtle point. For programs or terms for which the
big-step operational semantics is total, one may in a rather perverse way
consider it a form of DS.
i.e. if we consider an OP-sem for simple arithmetic expressions without
non-terminating expressions. The DS for such a language would basically be
the same thing.
However, for Scheme and any non-trivial programming language they are not
the same.
Now I wrote that Shriram had mathematically defined a function
V: Expressions -> (Env -> Values)
Now maybe he didn't actually do so, and maybe that's the problem.
Shriram's short Semantics section in PLAI certainly doesn't use the
names V, Env, Values, or Procedure-Values. Is that an issue?
But I claim that Shriram could easily have done just that. That is,
whatever he wrote about judgments or antecedents, that Shriram could
easily have defined such a mathematical function V.
And then I claim that such a function V is what Schmidt calls a DS
valuation function, if we wish to say that V captures the semantics.
Now while I'm waiting for your response (or Laurie's), let me make
some quick comments (uh, 150 lines I mean :D) on your post:
Joe Marshall (prunes...@comcast.net) responded to me:
> > So any mathematically defined function
> >
> > V: Programs [or Expressions] ---> Some-Set
> >
> > is a DS valuation function, by Schmidt's definition.
>
> The set is supposed to be the meaning of the program. The unix `wc'
> command (word count) maps program text to non-negative integers, but
> it is not generally considered a DS valuation function.
Right, good point, & I tried to correct this above. We have to assert
that V captures the semantics of our language. `wc' certainly
doesn't! So how do we decide? We're trying to keep it to Math and
away from culture. I know what my criterion is: stick to programs,
which is all that Schmidt's quote refers to. For any program P,
V[[ P ]] in Some-Set
must be a mathematization of the actual interpreter output. Maybe
that even works for expressions. Shriram's (or my) value
V[[ (lambda (x) body) ]](E) in Values
strikes you as "meaningless", but Scheme prints something meaningless,
as you know. I just pasted a lambda expr (sol to Ex 22.2.3 of HtDP,
and I was quite proud of my solution) into the Interactions window of
DrScheme, hit RET, and here's my output:
> ;; fold : Y (X Y -> Y) -> ((listof X) -> Y)
(define (fold base combine)
(local ((define (abs-fun aloX)
(if (empty? aloX)
base
(combine (first aloX) (abs-fun (rest aloX))))))
abs-fun))
>
Absolutely nothing!
> > R5RS DS uses an initial
> >
> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont
> >
> > Shriram's semantic function W will also uses such initial values, and
> > let's call them by the same names, even though they live in different
> > sets. Then for any program P, I claim that
> >
> > W[[ P ]] <rho_0, sigma_0, kappa_0>
> > =
> > curly-E[[ P ]] <rho_0, sigma_0, kappa_0> in Answers
> >
> > By P here I mean the expression you get by wrapping P in a let form
> > with "undefined"s as R5RS DS does in sec 7.2. I think this would be
> > easy to prove.
>
> Feel free to provide the proof.
I'd like to, Joe, but there's a serious communication problem. I
don't know even what part of this you think is hard. I'd say this is
obvious because R5RS DS is basically playing the SICP game, just with
domains. That is, curly-E[[ (lambda (x) body) ]] is the function that
get by using the SICP rules for evaluation. I thought everyone agreed
that it made pretty good sense to think of R5RS DS as FP. Anton vS
even worked out WC's exercise, writing a DS->Scheme meta-interpreter.
> > Sure, but that's a matter for proving theorems about observational
> > equivalence (which MFe has posted about).
>
> Given that the denotational semantics *do* say that
> `(lambda (x) (+ x 3))' means the mathematical function `add three',
> this is a central point.
Maybe you're right, Joe, it sounds reasonable. My R5RS DS is pretty
rusty. But I want to stick to programs anyway if we're gonna decide
if some function is a DS valuation. And you didn't seem to
understand my point, so let me try again, with your example in mind:
We can say that Shriram's W is too strict on lambda expressions. Any
two lambda's will be distinguished by W unless they're identical.
We don't want that. We'd like our DS valuation function to identify
lambda's that are observationally equivalent.
But R5RS DS is too strict as well. I think WC posted lambda's that
are observationally equivalent, but separated by curly-E.
To me, that just shows that "the proof of the DS valuation pudding" is
in the programs, not the expressions.
> > Shriram's book comes close to a precise definition. In his short
> > Semantics section, he writes
> >
> > It would be convenient to have some common language for explaining
> > interpreters. We already have one: math!
> > [...] It's semantics because it ascribes meanings to programs.
> >
> > To me, that sounds like what Schmidt calls DS, Shriram calls S!
>
> Not to me. Semantics ascribes meanings to programs, but there are
> many techniques for this. Denotational semantics attempts to define a
> function over programs that maps them to meanings. Operational
> semantics identifies the meaning of a program with the steps taken to
> evaluate it. Both are mathematical.
Then I want an example of Ops that can't easily be turned into DS.
Let's suppose we have an mathematically defined function
omega: Programs ---> Machine-History
which takes a program to its entire evaluation history, a sequence
[state_1, state_2,... ]. Is that OpS?
Now I'll define a DS valuation function
V: Programs ---> Answers
which sends the program P to either
state_n, if omega[[ P ]] = [state_1, state_2,..., state_n ]
bottom, if omega[[ P ]] is an infinite sequence.
Now the only thing I read from your paper is that they defined all DS
valuation function to be compositional, and this function V is not
compositional. But that's a matter of definition. I claim V is a
non-compositional DS valuation function, and I think I handled
Daniel's objection. Now I'm assuming above that state_n will be the
actual program output (if it exists).
There's certainly a gain in mathematical complexity. I'd bet that
there's a computable function Next-State that computes state_{i+1}
from state_i. But V is definitely not a computable function, by the
Halting problem. We need induction to define V from Next-State.
Shriram's W will be compositional, because that's the SICP way.
I think that was just pretty illustrative. You are unable to take at
face value what people actually write, and make unwarranted
extrapolations, even though the things you assume have been explicitly
denied, as you might find out if you bothered to actually follow this
newsgroup and read what people say.
Lauri
Right. So the meaning of "(lambda (x) (+ x 3))" is
"(lambda (x) (+ x 3))". Mighty useful piece of information, that one.
(It _is_ useful for most of the practical purposes that CS folks use
calculi for. That's why op sem is so prevalent nowadays. But it does
not give any enlightenment about whether the term represents the
mathematical function that we intuitively associate it with.)
Lauri
> Now I wrote that Shriram had mathematically defined a function
> V: Expressions -> (Env -> Values)
> Now maybe he didn't actually do so, and maybe that's the problem.
> Shriram's short Semantics section in PLAI certainly doesn't use the
> names V, Env, Values, or Procedure-Values. Is that an issue?
Yes.
> But I claim that Shriram could easily have done just that. That is,
> whatever he wrote about judgments or antecedents, that Shriram could
> easily have defined such a mathematical function V.
He *could* have, but he didn't.
> And then I claim that such a function V is what Schmidt calls a DS
> valuation function, if we wish to say that V captures the semantics.
Yes.
It is non-trivial to go from a set of judgements to a valuation
function. In order to determine the meaning of a program from the
judgements, you must construct a chain of judgements that
incrementally reduce the program to axioms. Yes, you do this with
induction, but in order for the induction to work you must have a base
case. If you apply a function to itself, however, you construct an
infinite chain of judgements. You can still attribute meaning to the
infinite chain if it approaches a limit, but it isn't easy.
But Shriram isn't doing this.
> Right, good point, & I tried to correct this above. We have to assert
> that V captures the semantics of our language. `wc' certainly
> doesn't! So how do we decide? We're trying to keep it to Math and
> away from culture. I know what my criterion is: stick to programs,
> which is all that Schmidt's quote refers to. For any program P,
>
> V[[ P ]] in Some-Set
>
> must be a mathematization of the actual interpreter output. Maybe
> that even works for expressions. Shriram's (or my) value
>
> V[[ (lambda (x) body) ]](E) in Values
>
> strikes you as "meaningless", but Scheme prints something meaningless,
> as you know.
Lambda expressions are meaningful, but they mean different things in
denotational semantics and operational semantics. In denotational
semantics, lambda expressions mean mathematical functions. In
operational semantics, lambda expressions mean to construct a
closure.
>> > R5RS DS uses an initial
>> >
>> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont
>> >
>> > Shriram's semantic function W will also uses such initial values, and
>> > let's call them by the same names, even though they live in different
>> > sets. Then for any program P, I claim that
>> >
>> > W[[ P ]] <rho_0, sigma_0, kappa_0>
>> > =
>> > curly-E[[ P ]] <rho_0, sigma_0, kappa_0> in Answers
>> >
>> > By P here I mean the expression you get by wrapping P in a let form
>> > with "undefined"s as R5RS DS does in sec 7.2. I think this would be
>> > easy to prove.
>>
>> Feel free to provide the proof.
>
> I'd like to, Joe, but there's a serious communication problem. I
> don't know even what part of this you think is hard.
The hard part is establishing a one-to-one mapping between the tuples
of Operational Semantics and the domain of function values in
denotational semantics.
> Then I want an example of Ops that can't easily be turned into DS.
On the one hand, since it can be proven that operational semantics is
equivalent to denotational semantics, there can be no example. On the
other hand, you need to invoke domain theory to do it, so it's never
easy.
> Let's suppose we have an mathematically defined function
>
> omega: Programs ---> Machine-History
>
> which takes a program to its entire evaluation history, a sequence
> [state_1, state_2,... ]. Is that OpS?
Not quite. omega is defined by induction over the operational
semantics. OpS takes you from state_n to state_n+1, but no further.
> There's certainly a gain in mathematical complexity. I'd bet that
> there's a computable function Next-State that computes state_{i+1}
> from state_i. But V is definitely not a computable function, by the
> Halting problem. We need induction to define V from Next-State.
Exactly. By restricting ourselves to the state-transition function,
we avoid the computability problems associated with the denotational
approach. But this comes at a cost: we can no longer say that
the state transitions involved with function application and lambda
expressions `sum up' to `real' functions.
Good! Joe, I think we're making progress.
> > But I claim that Shriram could easily have done just that. That
> > is, whatever he wrote about judgments or antecedents, that Shriram
> > could easily have defined such a mathematical function V.
>
> He *could* have, but he didn't.
Great. Now the question we're divided on is how hard it would.
That's not gonna be real easy to settle, but let's keep working:
> > And then I claim that such a function V is what Schmidt calls a DS
> > valuation function, if we wish to say that V captures the semantics.
>
> Yes.
Great.
> It is non-trivial to go from a set of judgments to a valuation
> function.
That's what I dispute, and I think we're heading toward a resolution.
> [...] Yes, you do this with induction, [...]
Great!
> But Shriram isn't doing this.
Yeah, maybe. But if it's easy enough to pass to V, then I'm not way
off base to have misinterpreted Shriram this way. Now if it's hard,
then I wildly misinterpreted Shriram, and I owe him an apology.
> > Right, good point, & I tried to correct this above. We have to
> > assert that V captures the semantics of our language. `wc'
> > certainly doesn't! So how do we decide? We're trying to keep it
> > to Math and away from culture. I know what my criterion is: stick
> > to programs, which is all that Schmidt's quote refers to. For any
> > program P,
> >
> > V[[ P ]] in Some-Set
> >
> > must be a mathematization of the actual interpreter output.
Joe, can I get you to vote on this? You went on to my next point.
> > Maybe that even works for expressions. Shriram's (or my) value
> >
> > V[[ (lambda (x) body) ]](E) in Values
> >
> > strikes you as "meaningless", but Scheme prints something
> > meaningless, as you know.
>
> Lambda expressions are meaningful, but they mean different things in
> denotational semantics and operational semantics. In denotational
> semantics, lambda expressions mean mathematical functions.
Now here I claim you're bringing culture into Math. In the DS we've
seen, yes, you're right. There's no requirement though, in Schmidt's
definition, and you seemed to agree with me on this above.
> >> > R5RS DS uses an initial
> >> >
> >> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont
> >> >
> >> > Shriram's semantic function W will also uses such initial
> >> > values, and let's call them by the same names, even though they
> >> > live in different sets. Then for any program P, I claim that
> >> >
> >> > W[[ P ]] <rho_0, sigma_0, kappa_0>
> >> > =
> >> > curly-E[[ P ]] <rho_0, sigma_0, kappa_0> in Answers
> >> >
> >> > By P here I mean the expression you get by wrapping P in a let form
> >> > with "undefined"s as R5RS DS does in sec 7.2. I think this would be
> >> > easy to prove.
> >>
> >> Feel free to provide the proof.
> >
> > I'd like to, Joe, but there's a serious communication problem. I
> > don't know even what part of this you think is hard.
>
> The hard part is establishing a one-to-one mapping between the tuples
> of Operational Semantics and the domain of function values in
> denotational semantics.
Yeah, great. I think I can do that. I read R5RS DS quite carefully
and it really looked to me like curly-E[[ lambda expressions ]] was
just the obvious function you'd want to define. It took me a while to
decode R5RS DS, and now it's looks impenetrable again.
Our eval order plt-scheme thread will help. From now on, I'm junking
the permute/unpermute part of R5RS DS, and going with left->right eval
order. I mean, I was doing that before anyway, but I felt guilty :D
> > Then I want an example of Ops that can't easily be turned into DS.
>
> On the one hand, since it can be proven that operational semantics
> is equivalent to denotational semantics, there can be no example.
> On the other hand, you need to invoke domain theory to do it, so
> it's never easy.
I think you contradicted yourself below, Joe. That is, if domain
theory means Scott models of LC. Let's go read it:
> > Let's suppose we have an mathematically defined function
> >
> > omega: Programs ---> Machine-History
> >
> > which takes a program to its entire evaluation history, a sequence
> > [state_1, state_2,... ]. Is that OpS?
>
> Not quite. omega is defined by induction over the operational
> semantics. OpS takes you from state_n to state_n+1, but no further.
Ah, thanks. So OpS is just my Next-State function below. And sure,
you'd need induction to even define omega. Great.
> > There's certainly a gain in mathematical complexity. I'd bet that
> > there's a computable function Next-State that computes state_{i+1}
> > from state_i. But V is definitely not a computable function, by
> > the Halting problem. We need induction to define V from
> > Next-State.
>
> Exactly. By restricting ourselves to the state-transition function,
> we avoid the computability problems associated with the denotational
> approach.
But this doesn't bother me a bit! Pure mathematicians rarely have
computable functions. Any time you bring in the real line you're not
computable, because (as Tom Bushnell posted), the real line isn't a
computable set!
This is where I thought you were contradicting yourself. Because this
V doesn't seem to use Scott models, and I said it was a
non-compositional DS valuation function. Hmm, you snipped that part :)
> But this comes at a cost: we can no longer say that the state
> transitions involved with function application and lambda
> expressions `sum up' to `real' functions.
Don't quite grok. Is this what you were going after Shriram for, that
his big-step OpS might be meaningless reduction rules? If so, that's
a real good point, and the reason I always want to produce the V.
> > Let's suppose we have an mathematically defined function
> >
> > omega: Programs ---> Machine-History
> >
> > which takes a program to its entire evaluation history, a sequence
> > [state_1, state_2,... ]. Is that OpS?
> >
> > Now I'll define a DS valuation function
> >
> > V: Programs ---> Answers
> >
> > which sends the program P to either
> >
> > state_n, if omega[[ P ]] = [state_1, state_2,..., state_n ]
> >
> > bottom, if omega[[ P ]] is an infinite sequence.
>
> Right. So the meaning of "(lambda (x) (+ x 3))" is
> "(lambda (x) (+ x 3))". Mighty useful piece of information, that
> one.
But that's the answer you get, or even less, Lauri! V maps programs
to answers, mostly meaning the printed output of the interpreter.
That's how R5RS DS uses the name Answers. If your program was a
lambda expression, DrScheme gives no output at all.
> (It _is_ useful for most of the practical purposes that CS folks use
> calculi for. That's why op sem is so prevalent nowadays. But it does
> not give any enlightenment about whether the term represents the
> mathematical function that we intuitively associate it with.)
Yeah, and that must be why folks use Scott models in DS. But there's
no requirement (in Schmidt's definition at least) that the valuation
of lambda-expr is your mathematical function. It could be your
"mighty useful" (lambda (x) (+ x 3)). Don't agret that the proof
of the DS pudding is for programs? If I hand you a math function
V: Expressions -> (Env Store Cont -> Answers)
and if you're deciding if my V is really a DS valuation function,
then you're going to make your decision based on the restriction
V^p : Programs -> Answers
V^p( P ) = V(P)(rho_0, sigma_0, kappa_0)
All bets are off on the original V, unless we demand compositionality.
Lauri, you seem antagonistic, and maybe you remember how this went up
in smoke 2 years ago. But MFe is here now, and Joe & I have built up
some goodwill on plt-scheme... I'm willing to give it another try.
>> > Right, good point, & I tried to correct this above. We have to
>> > assert that V captures the semantics of our language. `wc'
>> > certainly doesn't! So how do we decide? We're trying to keep it
>> > to Math and away from culture. I know what my criterion is: stick
>> > to programs, which is all that Schmidt's quote refers to. For any
>> > program P,
>> >
>> > V[[ P ]] in Some-Set
>> >
>> > must be a mathematization of the actual interpreter output.
>
> Joe, can I get you to vote on this? You went on to my next point.
I'm not quite sure what you are asserting here, but I'd say this:
If your interpreter is faithful to your denotational semantics, then
V [[ P ]] = V [[ interpreter output ]]
Note that this is different from operational semantics:
Op (P) => interpreter output
i.e., operational semantics applied to a program reduces to the
interpreter output.
>> > Maybe that even works for expressions. Shriram's (or my) value
>> >
>> > V[[ (lambda (x) body) ]](E) in Values
>> >
>> > strikes you as "meaningless", but Scheme prints something
>> > meaningless, as you know.
>>
>> Lambda expressions are meaningful, but they mean different things in
>> denotational semantics and operational semantics. In denotational
>> semantics, lambda expressions mean mathematical functions.
>
> Now here I claim you're bringing culture into Math. In the DS we've
> seen, yes, you're right. There's no requirement though, in Schmidt's
> definition, and you seemed to agree with me on this above.
Yes, we could be using `wc'.
>> But this comes at a cost: we can no longer say that the state
>> transitions involved with function application and lambda
>> expressions `sum up' to `real' functions.
>
> Don't quite grok. Is this what you were going after Shriram for, that
> his big-step OpS might be meaningless reduction rules? If so, that's
> a real good point, and the reason I always want to produce the V.
I wasn't really `going after' Shriram; I'm sure his operational
semantics are well-founded, and I believe that he chose operational
semantics over denotational semantics in order to avoid the problems
associated with the latter and to illustrate more closely the actions
of the interpreter.
But while Shriram's semantics may faithfully model what the
interpreter does, they do not provide (nor are they intended to
provide) a reason to believe that the program *as a whole* means what
we intend (i.e., the valuation function is implied through induction,
but there is no proof that the induction is valid).
For simple expressions that makes no difference, but suppose we
consider this one:
(((lambda (f)
((lambda (D) (D D))
(lambda (x) (f (lambda () (x x))))))
(lambda (f)
(lambda (n)
(if (zero? n)
1
(* n ((f) (- n 1))))))) 10)
Operational semantics can easily show that this reduces to 3628800,
but it could not show that this fragment:
(lambda (f)
((lambda (D) (D D))
(lambda (x) (f (lambda () (x x))))))
when applied to this fragment:
(lambda (f)
(lambda (n)
(if (zero? n)
1
(* n ((f) (- n 1))))))
yields a (partial) function.
--
~jrm
Joe, I didn't really follow your post, except for your (almost!) Y_v
combinator (see below), but I think I maybe see our problem:
I think there's 2 separate issues here, and I'd like your vote on both
the mathematical issue and the "real-world modeling" issue:
****** Math ******
I claimed I could easily turn Shriram's big-step OpS into a
mathematical function
V : Expressions -> (Env -> Values)
where V[[ (lambda (x) body) ]](E) = <x, body, E>
I said I didn't need Scott models or CPO's to define V, just some
induction. Now relating V to the curly-E of R5RS DS would be real
work, because curly-E is real work, as it involves Scott models.
I think maybe you agreeing with me on this point, but you say I can't
call it a DS valuation function, on account of:
****** Real-World Modeling ******
When we make a mathematical model a real-world phenomenon, we have to
ask if it's a "good" model. And that's partly a math question, but
it's partly a real-world question. What do we mean by good?
So a DS valuation/semantic function for our language
V : Expressions -> Meaning-Set
must satisfy us mathematically, but it must also satisfy us
intuitively, in a way that I think can't itself be mathematized.
So V[[ (lambda (x) body) ]] is supposed to mathematically codify the
"meaning" of (lambda (x) body). But what is the "meaning"?
I think we can legitimately give different answers to this question,
and we must define different semantic functions accordingly.
I say there's nothing wrong with the SICP solution, as Sussman &
Steele invented Scheme. SICP "conflates" the Store with the
Environment, as Shriram does, and SICP also declares the value of
(lambda (x) body) in an environment E to be just the lambda tagged
with E, i.e. Shriram's triple <x, body, E>.
You can say SICP is old-hat, but I say we can, and ought to, define
a SICP DS semantic function, and then we'll have to say that the
"meaning" of (lambda (x) body), i.e.
V[[ (lambda (x) body) ]]
reflects the SICP biz,
and then we'll have to say something like Shriram's
V[[ (lambda (x) body) ]](E) = <x, body, E>
Now you can say, "No!" The "real" meaning of a lambda is some actual
function on Values, and that forces Scott models etc.
And I'd say, that's fine, but that's a different "semantic"
understanding of Scheme, so you need R5RS's curly-E, a different DS
semantic function. There's no right answer here. Heck, there are
folks who think that everything is a pointer in Scheme, and they also
need a DS semantic function that's different from curly-E.
****** other issues: R5RS DS has real-world-fit problems ******
I think I remember what WC posted was wrong with the R5RS DS def of
procedure values. Let's forget Cont for now, and dumb down the R5RS
semantic function to the sort of call/cc-free DS semantic function
curly-E: Expressions -> (Store Env -> Store Value)
which Cartwright & Felleisen talk about on the top of page 2 of their
"Extensible Denotational" paper on PLT.
Then curly-E[[ (lambda (x) body) ]](sigma E) = (sigma f)
where f is an honest function. But I think what WC pointed out is
that meaningless changes to (lambda (x) body) will result in different
functions f. f is a function that changes the contents of locations,
and we can modify this location-behavior in meaningless ways by making
meaningless changes to (lambda (x) body). By meaningless, I'm making
a "real-world" distinction: they don't mean anything to us Scheme
programmers. But in the DS meaning of meaning, it's not meaningless!
What that means is that the (very nice IMO) curly-E doesn't really
reflect our real-world understanding of Scheme semantics. But it's
pretty good, and it's perfect on whole programs.
Why don't I also point out that my Shriram V semantic function is
compositional by the definition in "Extensible Denotional", p 6:
[The map from syntactic domains to semantic domains] satisfies the law
of compositionality: the interpretation of a phrase is a function of
the interpretation of the sub-phrases.
Since that's SICP says, it's obviously true for the Shriram
V : Expressions -> (Env -> Values)
****** your interesting Y_v combinator ******
Your code looks pretty much like the Y_v combinator version of
(fact 10) => 3628800
but it looked odd enough I checked. Here's yours
(((lambda (f)
((lambda (D) (D D))
(lambda (x)
(f (lambda () (x x))))))
(lambda (f)
(lambda (n)
(if (zero? n)
1
(* n ((f) (- n 1))))))) 10)
and here's the usual Y_v fact biz (Y_v courtesy of TLS):
(((lambda (f)
((lambda (D) (D D))
(lambda (x)
(f (lambda (p) ((x x) p))))))
(lambda (f)
(lambda (n)
(if (zero? n)
1
(* n (f (- n 1))))))) 10)
I never saw your version. Finally I realized that your f is really
(f), so you don't quite have a version of Y_v. Pretty clever anyway!
> You can say SICP is old-hat, but I say we can, and ought to, define
> a SICP DS semantic function, and then we'll have to say that the
> "meaning" of (lambda (x) body), i.e.
> V[[ (lambda (x) body) ]]
> reflects the SICP biz,
> and then we'll have to say something like Shriram's
>
> V[[ (lambda (x) body) ]](E) = <x, body, E>
>
> Now you can say, "No!" The "real" meaning of a lambda is some actual
> function on Values, and that forces Scott models etc.
More or less. I'm not insisting that you must model a function, but
if you choose not to model a function, then you cannot make sense of this:
((lambda (x) x) 42)
Because the lambda expression means a 3-tuple and you have not
provided a semantics for applying 3-tuples to arguments.
You have 3 options at this point:
1. Haul out the Scott domains
2. Outlaw applying closures to arguments
3. Punt and use Operational semantics to rewrite
((lambda (x) x) 42) => 42 and declare victory.
> Why don't I also point out that my Shriram V semantic function is
> compositional by the definition in "Extensible Denotional", p 6:
>
> [The map from syntactic domains to semantic domains] satisfies the law
> of compositionality: the interpretation of a phrase is a function of
> the interpretation of the sub-phrases.
But it isn't compositional (yet) because you haven't defined what the
application of 3-tuples mean.
> Your code looks pretty much like the Y_v combinator version of
>
> (fact 10) => 3628800
>
> but it looked odd enough I checked. Here's yours
>
> (((lambda (f)
> ((lambda (D) (D D))
> (lambda (x)
> (f (lambda () (x x))))))
> (lambda (f)
> (lambda (n)
> (if (zero? n)
> 1
> (* n ((f) (- n 1))))))) 10)
>
> and here's the usual Y_v fact biz (Y_v courtesy of TLS):
>
> (((lambda (f)
> ((lambda (D) (D D))
> (lambda (x)
> (f (lambda (p) ((x x) p))))))
> (lambda (f)
> (lambda (n)
> (if (zero? n)
> 1
> (* n (f (- n 1))))))) 10)
>
> I never saw your version. Finally I realized that your f is really
> (f), so you don't quite have a version of Y_v. Pretty clever anyway!
My Y is curried.
--
~jrm
Its not just a "different semantic understanding". Its an
understanding that has no more layers of evaluation on top of it.
Your V yields a <x, body, E> tuple. What *is* this, I ask you? You
might say, "hmmm... well, to understand what this tuple means, I need
to show you it running on some data in my interpreter." (And in fact,
different interpreters may yield different answers to this question)
But in Denotational Semantics, once I get the element of the
"Meaning-Set" as you call it, I'm done. That's it. There's no more
running it in an interpreter. The thing I get back will be a
function; a real mathmatical function. I can't change what the
factorial or fibonacci functions *are*.
(I can't change your <x, body, E> tuple either, but I can change what
I think the 'body' inside of it means.)
Also, I suggest you go back and look at Joe's example at the end of
his last post. It sounds like you were trying to look at it and twist
it into something like the Y-combinator, and in the process you missed
the whole point that he was making: that Denotational Semantics gave
us a *meaning* for the sub-expressions that weren't just abstract
tuples with some code in it.
Instead, the meanings were "just" abstract functions that are tricky
for us humans to make sense out of. Are these functions "better" than
your tuples for the purposes of reasoning about programs? I'm a
biased (and ignorant!) party, so I'm going to be political here and
just say: It depends on the structure and goals of your analysis.
Bill, if you want to try to understand why one approach might be
better or worse than another, you should stop posting on the
newsgroup, and go write some static analysis code. REAL CODE.
Something like a CFA would be a good exercise for you; go google for
"control-flow analysis scheme", do some reading, and spend three weeks
hacking something up.
This is my problem with DS: who decides the semantics of the "Meaning-Set"?
Just math, we say. But that is just the original problem all over again.
Math is a system of symbols and axioms and rules for evaluating them, quite
like programs, really. It is just that people have internalized math quite
well over the years, and so there is immediate recognition and agreement as to
the meaning of things.
But ultimately the problem is the same: semantics comes about from how things
are used, are they interact; in programming terms: how things are interpreted.
In my view DS is unnecessarily complicated.
--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
rAYb...@STRIPCAPStelus.net The Rhythm has my soul.
> Also, I suggest you go back and look at Joe's example at the end of
> his last post. It sounds like you were trying to look at it and twist
> it into something like the Y-combinator, and in the process you missed
> the whole point that he was making: that Denotational Semantics gave
> us a *meaning* for the sub-expressions that weren't just abstract
> tuples with some code in it.
The other point is that if you attempt to perform induction over the
operational semantics with this subform you will find yourself with no
base case. It isn't the Y combinator that's the problem, it's the
self-application of F that's going to do you in.
Thanks, Joe, that clarifies your previous Y biz. I disagree:
> Because the lambda expression means a 3-tuple and you have not
> provided a semantics for applying 3-tuples to arguments.
Shriram wrote down the `semantics for applying 3-tuples to arguments'
in PLAI. It's the SICP rule, essentially. That's the reduction rule
you posted about. But yeah, we have to provide such semantics!
> > Why don't I also point out that my Shriram V semantic function is
> > compositional by the definition in "Extensible Denotational", p 6:
> >
> > [The map from syntactic domains to semantic domains] satisfies the law
> > of compositionality: the interpretation of a phrase is a function of
> > the interpretation of the sub-phrases.
>
> But it isn't compositional (yet) because you haven't defined what
> the application of 3-tuples mean.
OK, sorry for not clarifying. Would you agree it's compositional now?
Maybe I should clarify about SICP. They stress: to evaluate a
combination, first evaluate the arguments, and the 1st must return a
procedure value (one of Shriram's triple), and then make a new
environment... Sounds like compositionality to me.
> My Y is curried.
Thanks! I'll hafta think about that.
Now Felix: I'm not claiming that my (or Shriram's) semantic function
is better in some calculational way. I'm in fact clueless about the
value of DS, except for what Shriram wrote in PLAI:
It would be convenient to have some common language for explaining
interpreters. We already have one: math!
I'm just thinking of DS/math as a way to clearly talk about
interpreters. I'm not a great programmer. I got into this because I
couldn't understand the text of R5RS, which they describe as the
"informal semantics", so I read R5RS DS to clarify, and it worked, but
I had huge misunderstandings about the Math...
> Joe Marshall <prunes...@comcast.net> responds to me:
>
>> Because the lambda expression means a 3-tuple and you have not
>> provided a semantics for applying 3-tuples to arguments.
>
> Shriram wrote down the `semantics for applying 3-tuples to arguments'
> in PLAI. It's the SICP rule, essentially. That's the reduction rule
> you posted about. But yeah, we have to provide such semantics!
Unfortunately, Shriram's rule for reducing lambda expressions does not
give the semantics for applying 3-tuples. Instead, it gives us a
rewrite rule that allows us to change the application of a 3-tuple
into something else. But I can't use a rewrite rule as a definition
of a valuation function without identifying the rewrite operation as
being semantically meaningful in the valuation space.
>> > Why don't I also point out that my Shriram V semantic function is
>> > compositional by the definition in "Extensible Denotational", p 6:
>> >
>> > [The map from syntactic domains to semantic domains] satisfies the law
>> > of compositionality: the interpretation of a phrase is a function of
>> > the interpretation of the sub-phrases.
>>
>> But it isn't compositional (yet) because you haven't defined what
>> the application of 3-tuples mean.
>
> OK, sorry for not clarifying. Would you agree it's compositional now?
No.
This would be compositional:
V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])
But you aren't defining V[[ (lambda (x) x) ]], you are applying the
operational steps on the lambda expression:
OpS (OpS (lambda (x) x), OpS(3))
with the hope that
V[[ OpS (OpS (lambda (x) x), OpS(3)) ]] = V[[ (lambda (x) x) ]] (V[[3]])
But you have given no justification for asserting this.
> Maybe I should clarify about SICP. They stress: to evaluate a
> combination, first evaluate the arguments, and the 1st must return a
> procedure value (one of Shriram's triple), and then make a new
> environment... Sounds like compositionality to me.
Yes, a finite string of Operational semantics steps compose. That
isn't a valuation function.
This is a formalist view, and probably doesn't reflect the attitude of
the majority of mathematicians, although it may come naturally to a
computer scientist (I know it comes naturally to me).
I think the prevailing view is that mathematical objects exist in some
platonic realm quite independently of any formal systems, and
mathematical truths are quite independent of the derivabality of
anything (excepting, of course, those mathematical propositions that
explicitly concern derivability). We are supposed to have an intuitive
understanding of these mathematical objects and a formal system can
then be judged by how well it captures this intuition of ours.
In this context, the "meaning" that DS assigns to the Scheme program
(+ 2 2) is not simply a numeral "4", but actually "four", or just
_four_, fourness itself. What this _really_ means is then a question
for philosophers. Certainly, as Benacerraf has argued, it cannot mean
simply {{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}} (here meaning the set
which that expression denotes, not the expression itself as a
syntactic object).
In any case, even though there are philosophical problems about the
meaning of math, at least DS pushes the problem of interpretation to
the philosophers, and out from the computer scientists' shoulders. :)
Lauri Alanko
l...@iki.fi
> >> > Why don't I also point out that my Shriram V semantic function is
> >> > compositional by the definition in "Extensible Denotational", p 6:
> >>
> >> But it isn't compositional (yet) because you haven't defined what
> >> the application of 3-tuples mean.
> >
> > OK, sorry for not clarifying. Would you agree it's compositional now?
>
> No.
>
> This would be compositional:
> V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])
Ah, that looks like exactly our problem, Joe! I think I owe you an
apology for suggesting you were dragging culture into this. I see now
why you insisted that the meaning of a lambda-exp is a function.
You don't quite have the the definition of compositionality. What you
wrote is an example of compositionality, but more generally, it's that
there's a function Phi s.t.
V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])
Your Phi is Phi(f, x) = f(x), which is fine, but that's not the only
one. As I posted here on 26 Jun 2002 (some editing):
As Cartwright and Felleisen's "Extensible Denotational" paper says:
[The map from syntactic domains to semantic domains] satisfies the
law of compositionality: the interpretation of a phrase is a
function of the interpretation of the sub-phrases.
This means that for a DS valuation function
curly-E: Expressions ---> M
compositionality means e.g. there must be a function
Phi: M x M ---> M
such that for a pair of expressions (X, Y),
curly-E[[ (X Y) ]] = Phi( curly-E[[ X ]], curly-E[[ Y ]] )
I don't see any trouble defining Phi, and I think I did so 2 years
ago. You just read it off SICP. And if we unconflate Store & Env,
it's not SICP, it's R5RS, 4.1.4 Procedures:
Semantics: A lambda expression evaluates to a procedure. The
environment in effect when the lambda expression was evaluated is
remembered as part of the procedure. When the procedure is later
called with some actual arguments, the environment in which the
lambda expression was evaluated will be extended by binding the
variables in the formal argument list to fresh locations, the
corresponding actual argument values will be stored in those
locations, and the expressions in the body of the lambda expression
will be evaluated sequentially in the extended environment.
It's easy to translate that to a Phi function for a DS function
curly-E: Expressions ---> (Env x Store -> Values x Store)
just like the (very similar) SICP prose yields a Phi for Shriram's
V: Expressions -> (Env -> Values)
Thus Shriram's V (obtained from his actual big-step OpS with a little
induction) is a compositional DS semantic function.
It's exciting to think we might settle this old argument! And getting
back to Felix, my interest here is just that overly arcane math makes
for a bad `common language', per Shriram great PLAI slogan:
> > But in Denotational Semantics, once I get the element of the
> > "Meaning-Set" as you call it, I'm done. That's it. There's no
> > more running it in an interpreter. The thing I get back will be a
> > function; a real mathmatical function. I can't change what the
> > factorial or fibonacci functions *are*.
Felix is accurately expressing the usual practice in DS, I think, but
there's no actual requirement to get `a real mathmatical function'.
> Math is a system of symbols and axioms and rules for evaluating
> them, quite like programs, really. It is just that people have
> internalized math quite well over the years, and so there is
> immediate recognition and agreement as to the meaning of things.
There's an important difference, Ray! Math is much more powerful!
That is `math' that `people have internalized' includes powerful
axioms that allow you say construct functions that you can't write
programs for. I like your `immediate recognition and agreement' :D
> It's exciting to think we might settle this old argument! And getting
> back to Felix, my interest here is just that overly arcane math makes
> for a bad `common language', per Shriram great PLAI slogan:
>
> It would be convenient to have some common language for explaining
> interpreters. We already have one: math!
If only I'd known my throw-away comment would be quoted this many
times and in this context, I'd never have written it. (As I'm sure
Dave Schmidt might be regretting not paying more attention to his
introduction, either.)
Bill, I'm going to rewrite my text to say that a semantics is not
denotational unless it maps lambda to an actual mathematical function.
A mapping of lambda to tuples, or to any other kind of structure, is
not a denotational semantics. Then I'm going to quote the book ad
infinitum in response to your posts.
Shriram
It is easy to say "it is easy to see that..." Since we're such
skeptics here, it would be easier for you to simply provide the
desired function immediately, since someone is going ask for it anyway.
Now, as to why I for one am so skeptical, lessee... In Shriram's
manuscript on p. 173 we see a rule:
f,e => <i,b,e'> a,e => a_v b,e'[i<-a_v] => b_v
---------------------------------------------------
{f a},e => b_v
Now, if I try to turn this straightforwardly into a meaning function
like you suggest, I end up with something like this:
E[{f a}] e = let <i,b,e'> = E[f] e
in E[b] e'[i<-(E[a] e)]
^^^^
See the problem?
If you have a _compositional_ meaning definition derived from the
above big-step rule, I'm sure everyone would be interested in seeing
it. Especially if you won't use any recursive domains since those
would bring along the host of problems that requires all those hairy
CPOs to solve, and you certainly don't need _those_, as you have
reiterated...
Lauri Alanko
l...@iki.fi
> ric...@math.northwestern.edu (Bill Richter) writes:
>
> > It's exciting to think we might settle this old argument! And getting
> > back to Felix, my interest here is just that overly arcane math makes
> > for a bad `common language', per Shriram great PLAI slogan:
> >
> > It would be convenient to have some common language for explaining
> > interpreters. We already have one: math!
>
> If only I'd known my throw-away comment would be quoted this many
> times and in this context, I'd never have written it. (As I'm sure
> Dave Schmidt might be regretting not paying more attention to his
> introduction, either.)
>
> Bill, I'm going to rewrite my text to say that a semantics is not
> denotational unless it maps lambda to an actual mathematical function.
Uh, oh, Shriram. Be veeeeery careful!
The first problem with your throw-away sentence (no, not the first
one, this one!) is that almost anything can formally be turned into a
function. What you probably mean is that the denotation of a lambda
should be a function that, when applied to the denotation of an
argument, returns the denotation of the result.
But this stronger requirement would actually rule out some semantics
which are commonly considered denotational.
Matthias
I agree with this. I said I was biased. I didn't say in which
direction (though I suspect my choice of challenge to Bill Richter
(developing a CFA) revealed which way I'm tilted towards).
But then again, I also said I was ignorant. I've taken only one class
that tried to cover Den.Sem., so I don't consider myself an expert in
its utility. Hopefully by next summer I'll have a greater
appreciation for the utility of Scott domains. Or even Category
Theory!
IMNHO the killer app for denotational techniques is when you want to prove
rich semantic equivalences between terms in your language. Especially when
the operational techniques are just too clunky. Unfortunately, for general
programming languages their are few if any general semantic equivalences
that your user may want to prove.
But imagine giving a denotational semantics to some interesting thing like a
declarative language for representing resolution independent pictures...
(i.e. SVG, Postscript, FLASH...) The Haskell School of Expression by Hudak
has lots of good examples of good uses of denotational reasoning of this flavor.
You would rather describe a circle as the set of points equidistant from a
point rather than with a particular algorithm used to rasterize it.
Especially if you want to prove you can collapse several coordinate
transformations into one transformation matrix. In this area the
mathematical denotation of the object is the most natural way of thinking
about it.
For programs most of the time, I feel reasoning about the operational
behavior of it is unfortunately more useful/natural/intuitive. So I think
the denotational approach and Scott domains are a bit of an overkill for
most things.
For example it is nice to know that the iterative version of fib in some way
are semantically the same function as the naive exponential one. However,
most people also care that they are different in the sense that the
iterative version is more efficient. Programming requires reasoning about
correctness and operational behavior. I do not think you can realistically
ignore one or the other!
To be fair I think, the operational techniques have become more common
because people have given up on reasoning about correctness, so the
operational techniques are just useful enough to prove the weaker theorems
people are interested in these days. One day I hope, maybe people will start
worrying about corecntess and Scott domains will be in vogue again.
Okay, i think I understand now.
Bill Richter wants to give us a different kind of semantics... lets
not call it "Denotational Semantics", since that is easily confused
with the common (but clearly pointless) practice of making the range
of the valuation function be "real math stuff"...
Lets call it "A New Kind of Semantics", or NKS for short.
And this revolutionary NKS will have the ease of use of Denotational
Semantics, while providing the awesome power of reasoning in an
Operational Semantics.
After all, the V for an ideal NKS would clearly have properties like:
V[[ (lambda (x) ((lambda (y) (- x y)) 3)) ]] != V[[ (lambda (x)
(- x 3)) ]]
and in NKS, we get exactly that, since these two applications of V
yield totally different tuples! Wonderful!
</sarcasm>
Bill, please please please go try to develop a static analysis of some
sort. Yes, I read that your background is in math and not in computer
science, but that's simply an unacceptable excuse. We need people
with a math background in this area; there's too many compiler hackers
out there WITHOUT a reasonable background in math. Just write a
CFA-0, it shouldn't take more than a month. And I bet you'll find
many knowledgable folk here to help you (that is, I bet there's lots
more CFA experts reading comp.lang.scheme than NKS experts).
Until you try to apply some of the things you've learned to a concrete
problem (some problem where you have the potential for automated
testing; and posting formulae to comp.lang.scheme does not count as
"automated testing"), I fear you're never going to get an insight for
why Computer Scientists have chosen particular frameworks for
particular tasks, and why theorems aren't freely interchangable
between frameworks.
-Felix
p.s. if I misinterpreted what V[[ - ]] would do on the expressions
above, I apologize. However, please do not interpret this apology as
a request for an explanation for what your V[[ - ]] would yield nor
how it would do so.
> Joe Marshall <j...@ccs.neu.edu> responded to me:
>>
>> This would be compositional:
>> V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])
>
> Ah, that looks like exactly our problem, Joe! I think I owe you an
> apology for suggesting you were dragging culture into this. I see now
> why you insisted that the meaning of a lambda-exp is a function.
>
> You don't quite have the the definition of compositionality. What you
> wrote is an example of compositionality, but more generally, it's that
> there's a function Phi s.t.
>
> V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])
>
> Your Phi is Phi(f, x) = f(x), which is fine, but that's not the only
> one. As I posted here on 26 Jun 2002 (some editing):
>
> As Cartwright and Felleisen's "Extensible Denotational" paper says:
>
> [The map from syntactic domains to semantic domains] satisfies the
> law of compositionality: the interpretation of a phrase is a
> function of the interpretation of the sub-phrases.
>
> This means that for a DS valuation function
>
> curly-E: Expressions ---> M
>
> compositionality means e.g. there must be a function
>
> Phi: M x M ---> M
>
> such that for a pair of expressions (X, Y),
>
> curly-E[[ (X Y) ]] = Phi( curly-E[[ X ]], curly-E[[ Y ]] )
>
> I don't see any trouble defining Phi, and I think I did so 2 years
> ago.
The appropriate Phi for curly-E is not going to work for Ops, so I'll
call your putative Phi `Phi1':
So you are at this stage:
V[[ ((lambda (x) x) 3) ]] = Phi1 (V[[ (lambda (x) x) ]], V[[3]])
Now I assume you wish your semantics to have the same semantics as
R5RS, so you expect this to hold:
Phi1 (<a 3-tuple>, <the number 3>) = Phi (<a function>, <the number 3>)
Or, since Phi (x, y) is defined as x(y),
Phi1 (<a 3-tuple>, <the number 3>) = <a function> (<the number 3>)
So Phi1 must map 3-tuples to functions. That is,
Phi1 (x, y) = Phi (3-tuple->function (x), y)
But you have not demonstrated that
3-tuple->function is definable over all (or almost all) 3-tuples,
or that 3-tuple->function exists for all (or almost all) appropriate 3-tuples,
or that 3-tuple->function is unique for those.
> It's easy to translate that to a Phi function for a DS function
>
> curly-E: Expressions ---> (Env x Store -> Values x Store)
>
> just like the (very similar) SICP prose yields a Phi for Shriram's
>
> V: Expressions -> (Env -> Values)
>
> Thus Shriram's V (obtained from his actual big-step OpS with a little
> induction) is a compositional DS semantic function.
Woah, there's a lot of induction. If you expand out the Y operator
example you'll find more induction than you can shake a stick at! You
haven't show that this induction will work.
--
~jrm
> But this stronger requirement would actually rule out some semantics
> which are commonly considered denotational.
It would rule out decision-tree representations, some game-theoretic
models, and so on. But then Richter would be writing the authors of
those papers demanding to know why their purported "denotational"
semantics don't match the throw-away wording in my book, no?
Shriram
> To be fair I think, the operational techniques have become more common
> because people have given up on reasoning about correctness, so the
> operational techniques are just useful enough to prove the weaker
> theorems people are interested in these days. One day I hope, maybe
> people will start worrying about corecntess and Scott domains will be
> in vogue again.
You can reason about correctness using other models, eg, Kripke
structures and temporal logic. Some of us do this all the time.
Correctness is not out of vogue, only this technique is.
Yes, you're missing a step of proof that the Kripke structure you
extract from a program corresponds to the program's "real" (ie,
denotational) meaning. But the extraction process is usually quite
straight-forward (not much different from, and often reusing, what a
compiler does), and lots of weaker theorems have been proven about it
to give the user additional confidence.
Shriram
[a nice response]
> I think the prevailing view is that mathematical objects exist in some
> platonic realm quite independently of any formal systems, and
> mathematical truths are quite independent of the derivabality of
> anything (excepting, of course, those mathematical propositions that
> explicitly concern derivability). We are supposed to have an intuitive
> understanding of these mathematical objects and a formal system can
> then be judged by how well it captures this intuition of ours.
I don't strictly believe this myself, but have no real problem with this
view. Certainly I use my intuitive understanding of things to guide me.
My take on it is that ultimately it doesn't matter. Our attempts to refer to
such platonic objects run into the limitations of our notations and reasoning
abilities, meaning that even if there is a "true" semantics to be found, we
are not sure if we are achieving it.
Our practical semantics, then, is the result of how things interact, which
hopefully for our intuition would tend to correspond to what we believe the
"true" semantics are like.
> In this context, the "meaning" that DS assigns to the Scheme program
> (+ 2 2) is not simply a numeral "4", but actually "four", or just
> _four_, fourness itself. What this _really_ means is then a question
> for philosophers. Certainly, as Benacerraf has argued, it cannot mean
> simply {{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}} (here meaning the set
> which that expression denotes, not the expression itself as a
> syntactic object).
I see at least 6 ways to refer to the notion of "four" here. Which is right?
Are they all the different, equivalent, or just sometimes?
Is the platonic set you are attempting to refer to the same as the platonic 4?
I think it all depends on context and how things are used. E.g. how things are
interpreted.
> In any case, even though there are philosophical problems about the
> meaning of math, at least DS pushes the problem of interpretation to
> the philosophers, and out from the computer scientists' shoulders. :)
I don't know if pushing the problem to the philosophers really solves
anything; they tend to argue alot :-).
I do agree that it is useful to be able to push a problem to another standard
one, since that allows us to understand how problems can relate to each other,
and to determine if problems are equivalent. So, yes, I can see the usefulness
of DS in that respect.
> Matthias Blume <fi...@my.address.elsewhere> writes:
>
> > But this stronger requirement would actually rule out some semantics
> > which are commonly considered denotational.
>
> It would rule out decision-tree representations, some game-theoretic
> models, and so on.
Exactly.
> But then Richter would be writing the authors of
> those papers demanding to know why their purported "denotational"
> semantics don't match the throw-away wording in my book, no?
I must have missed a smiley (actually present or implied), because I
don't know what you are trying to say. (Your sentence supports the
idea of requiring denotations of lambdas to be functions HOW?)
Matthias
Having spent the last several months worrying about proofs at an
unbelievable level of pedantry and annoying detail, I find it absolutely
disturbing to "leave a step out" and claim to have a proof. Perhaps, you can
build Kripke structures directly from an operational semantics of a language
and show they are some how sound, but I suspect a DS version would be a bit
more pleasant.
I would also note that if you wanted to actually mathematically prove that
what goes on in a compiler is sound a DS with Scott models might be the most
obvious way to go about things. Of course not many people actually have the
time to mathematically prove that what their compiler does is sound.
On that related note, does anyone happen to have a pointer to a proof that
shows the CPS transformation is semantics preserving? Was such a proof done
via a DS or operational approach. I imagine such a proof could be carried
out using either technqiue, I'm just curious how it actually was carried out.
> Having spent the last several months worrying about proofs at an
> unbelievable level of pedantry and annoying detail, I find it
> absolutely disturbing to "leave a step out" and claim to have a
> proof.
I'm amused that you would be "absolutely disturbed" about something
that I didn't say. I said that the proofs are about the Kripke
structures, and that we can have "confidence" in the relationship
between the structure and the program. I chose my words carefully.
Given that "verification" is mostly about debugging, not about proving
correctness, this is not only useful, but often about as much as you
can achieve. (Did the pedantry and annoying detail you dealt with
take into account quantum effects, hardware errors, etc? If not,
should we be absolutely disturbed that you assumed these away?)
Shriram
I don't want to nitpick, but reading your original text carefully still
makes it seems ambiguous about as to what you actually claimed.
> Given that "verification" is mostly about debugging, not about proving
> correctness, this is not only useful, but often about as much as you
> can achieve. (Did the pedantry and annoying detail you dealt with
> take into account quantum effects, hardware errors, etc? If not,
> should we be absolutely disturbed that you assumed these away?)
The issue is making a precise statement about what assumptions are made and
understanding how interesting the conclusion is with respect to the
assumptions. The right assumptions will allow you to prove anything.
If you are abstracting the program semantics to prove a property about the
program, one definitely should have a proof that the abstraction is sound
with respect to the program semantics. If however, you are merely saying
that you have a proof about your putative abstraction and make no formal
claims about program correctness than fine. But in that case I do not know
what you mean by "you can reason about correctness ...." if you don't
establish the link between your abstraction and the real semantics.
Perhaps, I'm misreading "reason" as "prove" rather than "have confidence
about". In any case, I have confidence about many things that are totally
wrong.
> E[{f a}] e = let <i,b,e'> = E[f] e
> in E[b] e'[i<-(E[a] e)]
> ^^^^
> See the problem?
Lauri, I don't quite follow you, but I do see a problem! Thanks!
It's maybe impossible to show compositionality for my SICP-like
E: Expressions -> (Env -> Values)
because evaluating the 1st argument in {f a} changes the env/store.
So let's unconflate Env & Store, to get a DS semantic function
E: Expressions ---> (Env x Store -> Value x Store)
of the sort discussed on page 2 of Cartwright & Felleisen "Extensible
Denotational" paper. My E uses left->right eval order, like mzscheme.
Let's translate Shriram's OpS example as this. We'll define
E[{f a}](e, s) = (b_v, s3), if
E[f](e, s) = (<i,b,e'>, s1)
E[a](e, s1) = (a_v, s2)
l = (new s2)
E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
E[{f a}](e, s) = bottom, otherwise.
My R5RS-like `new' produces a fresh location, i.e. sent to bottom by
s2 in Store = (Location -> Value).
Was that your point, that I needed to unconflate to define E? Can you
check my stores? I stared at & fiddled with them for a long time, as
my State semantics is rusty. I've been coding just Lambda alg FP.
Now I'll answer Joe's question, and define (this different!) Phi,
Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
-> (Env x Store -> Value x Store)
I claim that
E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)
That's kinda complicated, but its easy for schemers, who are excellent
at functions that take functions as arguments etc.
Given alpha, beta in (Env x Store -> Value x Store), we define
Phi(alpha, beta)(e, s) = (b_v, s3), if
alpha(e, s) = (<i,b,e'>, s1)
beta(e, s1) = (a_v, s2)
l = (new s2)
E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
Phi(alpha, beta)(e, s) = bottom, otherwise.
As you see, it's a straightforward translation of the eval rule.
> If you have a _compositional_ meaning definition derived from the
> above big-step rule, I'm sure everyone would be interested in seeing
> it.
I think I posted my Phi 2+ years ago, and nobody said it was false, or
interesting, but only that I had violated a rule, as my Phi used E.
I said no DS author insists that Phi be defined independently of E,
and we can't mathematize such a notion, and given such a Phi depending
on E, and we then re-define E by structural induction, so this rule is
even satisfied! I didn't got a response, but folks were worn out.
As to the rest of the traffic: I think of Shriram as sortuva friend,
he tried to find me a job once, & I don't get riled at his (quite
funny!) sarcasm. PLAI (like HtDP) looks like a great book. Nice to
see some mild support from MB, who I learned quite a lot from 2+ yrs
ago. Felix: I'd be happy to know something useful about DS, perhaps
you could post something about CFA, or give a link, but what I'm
saying is really simple & really worth understanding, I think.
> Given alpha, beta in (Env x Store -> Value x Store), we define
>
> Phi(alpha, beta)(e, s) = (b_v, s3), if
> alpha(e, s) = (<i,b,e'>, s1)
> beta(e, s1) = (a_v, s2)
> l = (new s2)
> E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
>
> Phi(alpha, beta)(e, s) = bottom, otherwise.
>
> As you see, it's a straightforward translation of the eval rule.
This is not a definition. This is a constraint, an equation that may
or may not hold for some functions E. You yet need to construct some
function that satisfies this equation. Well, all right, _that_
is easy: just set Phi(alpha,beta)(e, s) = bottom. Or you can choose
any of an infinite number of other functions. But only a couple of
them are ones that we are _interested_ in (in the sense that bottom
accurately corresponds with nontermination). And you haven't shown
which ones are those. Scott has.
This is a recursive equation, and the usual way of dealing with those
is to get the least fixed point, but for that you need to have an
_ordering_ and that gets you again into the world of CPOs.
Incidentally, I don't think it's necessary to add stores (or
continuations) to this discussion. They just add complexity without
bringing any new insight.
> I think I posted my Phi 2+ years ago, and nobody said it was false, or
> interesting, but only that I had violated a rule, as my Phi used E.
Indeed. That's what makes it recursive, and that's why it's not a
definition.
> I said no DS author insists that Phi be defined independently of E,
> and we can't mathematize such a notion,
We don't _need_ to mathematize such a notion, because in math things
are by default defined independently of themselves. It's _recursive_
definitions that need to be mathematized.
I think usually the authors take for granted that the reader knows
that when making a definition you can't refer, even indirectly, to the
thing that you are defining.
In Schmidt, p. 51, the existence of the meaning functions is proven by
structural induction. That's what "compositional" means: defined with
structural induction. Your "definition" of E is recursive, so you have
to find out some other way of specifying exactly which function you
are talking about. You haven't done this.
> and given such a Phi depending on E, and we then re-define E by
> structural induction, so this rule is even satisfied!
By structural induction on what? As you have been told, you no longer
have a base case.
You have been given the omega example gazillion times. Consider. Let's
find, using your technique, the value of:
E[{{fun x => {x x}} {fun x => {x x}}}](e0)
Now, the start is easy. I'll leave out the store. e0 stands for the
empty environment.
E[{fun x => {x x}}](e0) = <x,{x x},e0>
So, by your definition (simplified without the stores), the result is now
E[{x x}](e1) (where e1 = e0[x-><x,{x x},e0>])
So, we get
E[{x}](e1) = <x,{x,x},e0>
and therefore
E[{x x}](e1) = E{{x x}](e1)
You call that a definition?
Now, as far as constraints go, that one is trivial. What all this
means is that your "definition" is satisfied by e.g. the meaning
function that assigns the value 42 to this example. But we don't want
that. We want _bottom_. And not just because we are being difficult,
but because when I say ((lambda (x) (x x)) (lambda (x) (x x))) to an
interpreter, it _doesn't terminate_!
Lauri Alanko
l...@iki.fi
> Now I'll answer Joe's question, and define (this different!) Phi,
>
> Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
>
> -> (Env x Store -> Value x Store)
Modeling the store is unnecessary to my argument (as you will run into
trouble long before you want to introduce side effects). Let's keep
it simple:
Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)
> I claim that
>
> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)
>
> That's kinda complicated, but its easy for schemers, who are excellent
> at functions that take functions as arguments etc.
So if you have a Scheme form `(f a)', you claim that the denotation
may be derived as follows:
V [[ f ]] = <a 3-tuple of args, body, and environment>
V[[ (f a) ]] = Phi( V[[ f ]], V [[ a ]])
> Given alpha, beta in (Env x Store -> Value x Store), we define
>
> Phi(alpha, beta)(e, s) = (b_v, s3), if
>
> alpha(e, s) = (<i,b,e'>, s1)
>
> beta(e, s1) = (a_v, s2)
>
> l = (new s2)
>
> E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
>
> Phi(alpha, beta)(e, s) = bottom, otherwise.
Leaving out the store,
Phi (alpha, beta) (e) = b_v if
alpha (e) = <i, b, e'>
beta (e) = a_v
E[b](e'[i->a_v]) = b_v
otherwise bottom.
Let's plug in our original equation:
V [[ (f a) ]] = V[[ b ]](e'[i -> V[[ a ]])) where
<i, b, e'> = V [[ f ]]
> As you see, it's a straightforward translation of the eval rule.
Right, but for one thing.
If you look at the semantic functions in section 7.2.3 in R5RS, you
will see that curly-E is defined over the recursive decomposition of
an expression. Because expressions are finite, the denotation of an
expression can be given non-recursively.
Now look at your valuation function. It too is defined recursively,
but the recursion is over the *body* of the function, not simply the
subform `f'. Now suppose our function f is recursive factorial.
V [[ (f a) ]] =
V[[ (if (zero? x) x (* x (fact (- x 1)))) ]](e'[i -> V[[ a ]])) where
<i, (if (zero? x) x (* x (fact (- x 1)))), e'> = V [[ f ]]
Since V is compositional, this term:
V[[ (if (zero? x) x (* x (fact (- x 1)))) ]]
must be equal to some function (I'll call it phi2) of the valuation of
subterms:
= phi2 (V [[ (zero? x) ]], V [[ x ]], V [[ (* (fact (- x 1))) ]])
Since V is compositional, the third subterm must be equal to some
function (I'll call it phi3) of the valuation of *its* subterms:
V [[ (* (fact (- x 1))) ]]
= phi3 (V [[ * ]], V [[ (fact (- x 1)) ]])
Since V is compositional, the second subterm must be equal to some
function (I'll call it phi4) of the valuation of *its* subterms:
V [[ (fact (- x 1)) ]]
= phi4 (V [[ fact ]], V [[ (- x 1) ]])
But wait a second. Phi4 defines the composition necessary for
function application, so phi4 must be the same as the phi way up near
the top. In fact, phi3 is this same function. So by back
substitution, we find that V [[ (fact a) ]] is some composition (I'll
call this one phi5) of these terms:
V [[ (fact a) ]] =
phi6 (V [[ zero? ]],
V [[ a ]],
V [[ 1 ]],
V [[ * ]],
V [[ (fact a') ]] where a' = V [[ (- a 1) ]])
That last term, V [[ (fact a') ]], can be expanded with the above
equation:
V [[ (fact a) ]] =
phi6 (V [[ zero? ]],
V [[ a ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ a' ]],
V [[ 1 ]],
V [[ * ]],
V [[ (fact a'') ]]
where a'' = V [[ (- a' 1) ]])
where a' = V [[ (- a 1) ]])
That last term V [[ (fact a'') ]], can be expanded:
V [[ (fact a) ]] =
phi6 (V [[ zero? ]],
V [[ a ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ a' ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ a''' ]],
V [[ 1 ]],
V [[ * ]],
V [[ (fact a''') ]]
where a''' = V [[ (- a'' 1) ]])
where a'' = V [[ (- a' 1) ]])
where a' = V [[ (- a 1) ]])
As you can see, this diverges. Your valuation function is not well
defined.
> I think I posted my Phi 2+ years ago, and nobody said it was false, or
> interesting, but only that I had violated a rule, as my Phi used E.
>
> I said no DS author insists that Phi be defined independently of E,
> and we can't mathematize such a notion, and given such a Phi depending
> on E, and we then re-define E by structural induction, so this rule is
> even satisfied! I didn't got a response, but folks were worn out.
It probably should have been explained more explicitly, but the
problem isn't that Phi can't use E, but that Phi's use of E must
converge. This is easily satisfiable by compositional induction over
a finite expression, but not by compositional induction over the
*value* of the expression.
> E[{x x}](e1) = E{{x x}](e1)
>
> You call that a definition?
You caught me again! Thanks. And I'm very pleased to see that the
traffic today was just Math, from you & Joe.
I should've said I have some reduction rule, and if it doesn't return
a value in finite time, then we say
E[expr](e, s) = bottom
I'm drawing a blank on my reduction rule, so I'll think about it and
get back to you, after I unstick myself. My State Semantics is rusty.
But Lauri, I take strong exception to other things you said, and I can
deal with these now. The model for my (poorly defined) function
E: Expressions ---> (Env x Store -> Value x Store)
is Felleisen & Flatt's Standard Reduction function eval_s, defined on
p. 51 in <http://www.ccs.neu.edu/course/com3357/mono.ps>.
It seems to me that all the complaints you made about my E (other than
my goof!) apply to F&F's eval_s. I'd say that their function
eval_s : LC_v Expressions ---> LC_v Values
is well-defined and satisfying compositionality, even though they
don't do the various things you mention below. Or Barendregt's
Standard Reduction function in LC, which is not I think
compositional. Can you think about that, while I fix my E?
> This is a recursive equation, and the usual way of dealing with those
> is to get the least fixed point, but for that you need to have an
> _ordering_ and that gets you again into the world of CPOs.
I say no. Are you saying F-F & Barendregt needed CPOs? It's just
induction. Keep flailing away the reduction rule until you terminate,
and if you don't, send it to bottom. Just because this function is
(as you say) the solution of a fixed point equation doesn't mean you
have to consider all possible solutions of the fixed point equation.
You just inductively define the one solution you want, and you don't
need the CPOs, which tell you the desired solution is the minimal one!
> Incidentally, I don't think it's necessary to add stores (or
> continuations) to this discussion. They just add complexity without
> bringing any new insight.
You don't need stores to understand my goof! Plus, I goofed in
another way. The SICP conflation works fine if we tack on Env:
E: Expressions ---> (Env -> Value x Env)
> > I think I posted my Phi 2+ years ago, and nobody said it was
> > false, or interesting, but only that I had violated a rule, as my
> > Phi used E.
>
> Indeed. That's what makes it recursive, and that's why it's not a
> definition.
I say no, but that's a good question. I define E first by induction,
and then afterward, I defined Phi in terms of E. I don't define E &
Phi by simultaneous recursion.
> In Schmidt, p. 51, the existence of the meaning functions is proven
> by structural induction. That's what "compositional" means: defined
> with structural induction.
I say that's false, Lauri. Can you give me an exact quote? Here's my
quote again from Cartwright and Felleisen's "Extensible Denotational":
[The map from syntactic domains to semantic domains] satisfies the
law of compositionality: the interpretation of a phrase is a
function of the interpretation of the sub-phrases.
It's just what I said: there must exist such functions like my (poorly
defined) Phi. It doesn't matter how you construct the Phi functions.
So I assert that C-F & Schmidt do not make your definition, and
furthermore, it's impossible to make a mathematical formulation of
your definition of compositional. Can you check?
> Your "definition" of E is recursive, so you have to find out some
> other way of specifying exactly which function you are talking
> about. You haven't done this.
Yeah, I sure haven't! Many apologies.
> You have been given the omega example gazillion times.
Thanks for giving it to me again.
> > I said no DS author insists that Phi be defined independently of
> > E, and we can't mathematize such a notion [...]
>
> It probably should have been explained more explicitly, but the
> problem isn't that Phi can't use E, but that Phi's use of E must
> converge. This is easily satisfiable by compositional induction
> over a finite expression, but not by compositional induction over
> the *value* of the expression.
Let's go with that, Joe. Thanks. Thats' a nice "mathematical" thing
to say. Now the burden is on me to define a converging E & Phi.
> > Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
> >
> > -> (Env x Store -> Value x Store)
>
> Modeling the store is unnecessary to my argument (as you will run
> into trouble long before you want to introduce side effects). Let's
> keep it simple:
>
> Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)
I failed to define such a Phi last night, and therefore switched back
to (Env x Store -> Value x Store). The obvious problem is that
evaluating expressions can change the store/env with side effects.
But even with functional programs, you need side effects for the
recursion to make sense, at least in the R5RS way that I'm thinking.
We need define (which in R5RS DS is really set!), which I didn't
mention yet, but it's going to set up a binding
Ident -e'--> Locations -s--> Values
fact -> (e' fact) -> <i, (if (zero? x) x (* x (fact (- x 1)))), e'>
and later we'll add the bindings (let's say a_v = 6)
i -> l -> 6
i -> l' -> 5
i -> l'' -> 4
Your diverging factorial computation didn't seem to use that binding
for fact.
> So if you have a Scheme form `(f a)', you claim that the denotation
> may be derived as follows:
>
> V [[ f ]] = <a 3-tuple of args, body, and environment>
That's not quite right, and your post yesterday had this trouble too.
You're reverting my E to
V: Expressions -> (Env -> Values)
so you needed to say
V [[ f ]](this env) = <a 3-tuple of args, body, and environment>
Maybe that's what you mean, and that's probably fine for this
factorial computation. But in general, I can't define Phi without V
being a function on all env's. Let's do it my way:
E: Expresions -> (Env x Store -> Value x Store)
I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
calling the initial argument (e, s).
> Joe Marshall <j...@ccs.neu.edu> responded to me:
>
>> > I said no DS author insists that Phi be defined independently of
>> > E, and we can't mathematize such a notion [...]
>>
>> It probably should have been explained more explicitly, but the
>> problem isn't that Phi can't use E, but that Phi's use of E must
>> converge. This is easily satisfiable by compositional induction
>> over a finite expression, but not by compositional induction over
>> the *value* of the expression.
>
> Let's go with that, Joe. Thanks. Thats' a nice "mathematical" thing
> to say. Now the burden is on me to define a converging E & Phi.
>
>> > Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
>> >
>> > -> (Env x Store -> Value x Store)
>>
>> Modeling the store is unnecessary to my argument (as you will run
>> into trouble long before you want to introduce side effects). Let's
>> keep it simple:
>>
>> Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)
>
> I failed to define such a Phi last night, and therefore switched back
> to (Env x Store -> Value x Store). The obvious problem is that
> evaluating expressions can change the store/env with side effects.
>
> But even with functional programs, you need side effects for the
> recursion to make sense, at least in the R5RS way that I'm thinking.
No, you don't. That's why I used the Y operator. You get recursion
out of self-application.
> We need define (which in R5RS DS is really set!), which I didn't
> mention yet, but it's going to set up a binding
>
> Ident -e'--> Locations -s--> Values
>
> fact -> (e' fact) -> <i, (if (zero? x) x (* x (fact (- x 1)))), e'>
>
> and later we'll add the bindings (let's say a_v = 6)
>
> i -> l -> 6
> i -> l' -> 5
> i -> l'' -> 4
>
> Your diverging factorial computation didn't seem to use that binding
> for fact.
I hope you want your factorial to work for other values than 6. But
let me point out that I'm *not* performing a factorial computation.
I'm performing an analysis of the factorial program and it is the
analysis that is diverging, not the program. The fault lies in the
analysis, which is not sufficiently powerful to model recursive
programs.
Look what happens even if you do substitute in a value:
V [[ (fact 2) ]] =
phi6 (V [[ zero? ]],
V [[ 2 ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ 1 ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ 0 ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ -1 ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ -2 ]],
V [[ 1 ]],
V [[ * ]],
phi6 (V [[ zero? ]],
V [[ -3 ]],
V [[ 1 ]],
V [[ * ]],
....
It still diverges.
Remember that phi6 is our compositional valuation function. It does
not `know' that the meaning of the IF expression in factorial is
conditional on the value of the predicate.
>> So if you have a Scheme form `(f a)', you claim that the denotation
>> may be derived as follows:
>>
>> V [[ f ]] = <a 3-tuple of args, body, and environment>
>
> That's not quite right, and your post yesterday had this trouble too.
> You're reverting my E to
>
> V: Expressions -> (Env -> Values)
>
> so you needed to say
>
> V [[ f ]](this env) = <a 3-tuple of args, body, and environment>
>
> Maybe that's what you mean, and that's probably fine for this
> factorial computation. But in general, I can't define Phi without V
> being a function on all env's. Let's do it my way:
>
> E: Expresions -> (Env x Store -> Value x Store)
>
> I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
> calling the initial argument (e, s).
That's the problem.
--
~jrm
>> This is a recursive equation, and the usual way of dealing with those
>> is to get the least fixed point, but for that you need to have an
>> _ordering_ and that gets you again into the world of CPOs.
>
> I say no. Are you saying F-F & Barendregt needed CPOs? It's just
> induction.
`Just' induction?
> Keep flailing away the reduction rule until you terminate,
> and if you don't, send it to bottom. Just because this function is
> (as you say) the solution of a fixed point equation doesn't mean you
> have to consider all possible solutions of the fixed point equation.
Actually, you do. First of all, if the fixed point does not exist,
then the expression has no meaning. Not `bottom', but no meaning
whatsoever. Every time you flail away on the induction rule, you get
a different answer and it does not approach anything.
So you need to first determine that a fixed point exists.
But you may have more than one fixed point, too. The identity
function has an infinite number of fixed points, and a lot of them are
wildly different. You wouldn't want a semantics that said that
(factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
If you have more than one fixed point, you need a mechanism by which
you can discriminate between them.
> You just inductively define the one solution you want, and you don't
> need the CPOs, which tell you the desired solution is the minimal one!
You cannot just say `I want the fixed point that gives the right
answer'! Each fixed point gives a potentially *different* answer, so
you need a mechanism to choose the one you want. And that mechanism
had better be realizable on a physical computer.
> I define E first by induction, and then afterward, I defined Phi in
> terms of E. I don't define E & Phi by simultaneous recursion.
Your definition of E is incomplete without Phi, so there is
simultaneous recursion. (Besides, you have defined E by induction,
but you haven't shown that the induction terminates).
>> In Schmidt, p. 51, the existence of the meaning functions is proven
>> by structural induction. That's what "compositional" means: defined
>> with structural induction.
>
> I say that's false, Lauri. Can you give me an exact quote? Here's my
> quote again from Cartwright and Felleisen's "Extensible Denotational":
>
> [The map from syntactic domains to semantic domains] satisfies the
> law of compositionality: the interpretation of a phrase is a
> function of the interpretation of the sub-phrases.
>
> It's just what I said: there must exist such functions like my (poorly
> defined) Phi. It doesn't matter how you construct the Phi functions.
Your phi is not a function of the interpretation of the sub-phrases
but a function of the interpretation of the *values* of the
sub-phrases. That's the problem.
--
~jrm
Joe Marshall <prunes...@comcast.net> wrote me 2 responses:
Joe, I'm glad to see you were using Y to avoid define. Let's put off
your diverging calculation until we settle some theoretical points.
> >> This is a recursive equation, and the usual way of dealing with
> >> those is to get the least fixed point, but for that you need to
> >> have an _ordering_ and that gets you again into the world of
> >> CPOs.
> >
> > I say no. Are you saying F-F & Barendregt needed CPOs? It's just
> > induction.
>
> `Just' induction?
Yes, and see below for why. But this is something I want you & Lauri
to think about. Are you saying F-F & B needed CPOs? How come F-F & B
don't run into the problems you raise, such as:
> But you may have more than one fixed point, too. The identity
> function has an infinite number of fixed points, and a lot of them
> are wildly different. You wouldn't want a semantics that said that
> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
:D Why doesn't this problem snag F-F's eval_s? Look on p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps>.
> So you need to first determine that a fixed point exists.
Yes, and I'm in arrears here.
> You cannot just say `I want the fixed point that gives the right
> answer'!
:D
> Each fixed point gives a potentially *different* answer, so you need
> a mechanism to choose the one you want. And that mechanism had
> better be realizable on a physical computer.
I think I disagree. We're going to define these fixed points by math
axioms, like induction, that's not realizable on a physical computer.
That's the value of using math as a `common language' of semantics: we
can define functions that aren't realizable on a physical computer.
> > I define E first by induction, and then afterward, I defined Phi in
> > terms of E. I don't define E & Phi by simultaneous recursion.
>
> Your definition of E is incomplete without Phi, so there is
> simultaneous recursion.
No, that's going to be false, after I successfully define E.
> (Besides, you have defined E by induction, but you haven't shown
> that the induction terminates).
Yes!
> > I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
> > calling the initial argument (e, s).
>
> That's the problem.
No, that's fine. My Phi is supposed to be a math function
Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
-> (Env x Store -> Value x Store)
So Phi(alpha, beta)(e, s) is supposed to be in Value x Store. Now one
way to define Phi would be to have another math function
Phi1: (Value x Store) x (Value x Store) ---> (Value x Store)
such that
Phi(alpha, beta)(e, s) = Phi1(alpha(e, s), beta(e, s))
But I don't have to define Phi this way, and what I did was fine: I
used alpha(e, s) to produce (e6, s6) and plugged that into beta.
If this was as a Scheme program, you wouldn't have any trouble: you
deal with functions that take functions as args all the time.
> > there must exist such functions like my (poorly defined) Phi. It
> > doesn't matter how you construct the Phi functions.
>
> Your Phi is not a function of the interpretation of the sub-phrases
> but a function of the interpretation of the *values* of the
> sub-phrases. That's the problem.
I think you're claiming my Phi is defined in terms of a Phi1. Why?
Here's how induction constructs functions like the F-F's eval_s.
Just like the math factorial function
f: N --> N = {0, 1, 2, ... }
Suppose we've constructed a function
f_n: {0,...,n} ---> N
that satisfies the factorial definition this far. Then we define
f_{n+1}: {0,...,n+1} ---> N
by restricting to f_n for the first bunch, and defining
f_{n+1}(n+1) = (* n+1 f_n(n)).
Plus, we can construct the first one f_0 by f_0(0) = 1.
By mathematical induction, there exists a function f_n for all n in N.
Then we define f(n) = f_n(n). That's the way the math folks do it.
The CPO fixed point way is fine, but you don't need it.
Scott domains exist only because of diverging calculations! If we only
worried about non-diverging calculations there would be no need for
Scott domains. You cannot understand the need for CPOs is you ignore
diverging calculations.
> This is not a definition. This is a constraint, an equation that may
> or may not hold for some functions E.
Right, Lauri. Now I'll say something about how to actually define
E: Expressions -> (Env x Store -> Value x Store)
and not just give equations for E to satisfy. I want to follow the
Flatt & Felleisen's standard reduction function eval_s: see p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps>.
I posted a solution of this sort here on 26 Jun 2002. It looks pretty
clumsy to me now, but I think it's OK. But instead of discussing it,
I'd prefer to talk about a version for HtDP, which I think should be
both simpler and better designed. I wasn't aware of HtDP back then.
HtDP's Semantics for Advanced Scheme is mostly explained in sec 38,
but sec 40.3 explains mutable structures, and sec 24 says to rewrite
lambda expressions as local exps. I think it's very elegant.
HtDP use as global environment (+ store), with automatic re-naming of
variables to avoid name collisions. I think that's a good idea, which
cuts down on the wild proliferation of environments. The global env
is seen in sec 38 as a list of define's above the computation.
So HtDP's semantics can I think be written as DS semantic function
E: Expressions -> (Env -> Value Env)
Define E standard-reduction fashion as follows. First define the set
Enhanced of "enhanced" expressions, where some of the subexpressions
are values. Values is a subset of Enhanced, and there's an inclusion
Expressions -> Enhanced
Now we'll define from sec 38 a 1-step reduction function
R: Enhanced x Env -> Enhanced x Env
R looks for the evaluation context and perform some action. If
there's no eval context, then we should be done, we should have a
value. If e.g. the eval context is a variable, then the value of the
variable is found in the global env, and substituted in. If the eval
context is (set! x V), the subexp is replaced by void, and a new env
is return, with x now bound to V. If the eval context is a local
expression, then some modification of is needed, as HtDP renames the
local variables as to not collide with variables already in the env,
substitutes the new names are substituted into the local exp, and
brings the local defines out to top-level. Now instead of one
expression we have define's followed by an expressions. So either
change this technique to handle a list of expressions, or else keep
the local-defines inside the local expression until they are reduced
to (define x# V), and then bring the define out to the global env.
I guess I don't have it all figured out.
But we should be able to define a 1-step reduction function
R: Enhanced x Env -> Enhanced x Env
and now we define
F: Enhanced x Env -> Value x Env
by applying R until we either hit bottom or the 1st coordinate is a
value. If no iterate of R ever gets a value, then we also send it to
bottom. That is, letting R^n mean the n-th iterate of R,
F(exp, e) = R^n(exp, e), if n in {0, 1, 2,...} is smallest s.t.
R^n(exp, e) in Value x Env and
R^i(exp, e) != bottom for i <= n
F(exp, e) = bottom, otherwise.
The omega problem yields F(omega, e) = bottom, because there is no n:
For all n in {0, 1, 2,...}, R^n(omega, e) is not in Value x Env.
Now we define the math function
E: Expressions -> (Env -> Value Env)
by including Expressions -> Enhanced and "un-currying".
Discussion: Perhaps my R is a simple version of the small-step OpS
that Robby Findler and Jacob ? are writing for Scheme. So one might
say, "This isn't DS, it's OpS", but I say it's both, as we can easily
pass from R to E in the way that F-F define eval_s. I say it's a
little induction, but F-F doesn't say anything at all. I contend you
don't need CPOs.
One might say, "The R OpS is much more useful than the E DS", and I'd
say, "You're the experts!" But mathematicians want functions like E.
The standard miscommunication between mathematicians and physicists
goes like this: The physicists writes down a complicated Lagrangian to
integrate, and the mathematician says, "What's the range and domain of
your Lagrangian as a function?" And that ends the discussion, but
this problem is apparently being solved in String Theory these days.
I'd prefer we didn't worry about the exact definition of R or Value
for Advanced Scheme until we settled whether it yields a DS semantic
function E. Clearly I have things to learn about R and Value. But I
think this clearly shows I'm not just writing down "a constraint, an
equation that may or may not hold for some functions E."
I'm not sure if E is compositional, because of this automatic
renaming. Even if E is compositional, it might be awkward to discuss.
So I recommend that we don't insist that compositionality be part of
the definition of a DS semantic function, as some authors do.
> That's the value of using math as a `common language' of semantics: we
> can define functions that aren't realizable on a physical computer.
Then you have not given a semantics of a *programming* language.
Shriram
> Yes, and see below for why. But this is something I want you & Lauri
> to think about. Are you saying F-F & B needed CPOs? How come F-F & B
> don't run into the problems you raise, such as:
>
>> But you may have more than one fixed point, too. The identity
>> function has an infinite number of fixed points, and a lot of them
>> are wildly different. You wouldn't want a semantics that said that
>> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
>
> :D Why doesn't this problem snag F-F's eval_s? Look on p. 51 in
> <http://www.ccs.neu.edu/course/com3357/mono.ps>.
It probably should. On page 33 (section 4.6.3) they assert that the Y
operator `can find the fixed point' of any function. But a function
can have many fixed points (like the identity function) or no fixed
points (like (lambda (x) (+ x 1)) ).
But since Felleisen and Flatt are developing an operational semantics,
perhaps they didn't feel the need to specify which (if any) fixed
point is found.
>> Each fixed point gives a potentially *different* answer, so you need
>> a mechanism to choose the one you want. And that mechanism had
>> better be realizable on a physical computer.
>
> I think I disagree. We're going to define these fixed points by math
> axioms, like induction, that's not realizable on a physical computer.
>
> That's the value of using math as a `common language' of semantics: we
> can define functions that aren't realizable on a physical computer.
Mathematically, any value whatsoever is a fixed point of the identity
function. Yet when you use a computer to invoke the Y operator on the
identity function, you get an infinite loop, i.e. `bottom'.
If you don't care if your semantics apply to realizable machines,
that's fine, but it isn't of much interest to computer scientists.
>> > I define E first by induction, and then afterward, I defined Phi in
>> > terms of E. I don't define E & Phi by simultaneous recursion.
>>
>> Your definition of E is incomplete without Phi, so there is
>> simultaneous recursion.
>
> No, that's going to be false, after I successfully define E.
I don't see how. Phi is just a name that we gave to the composition
step in E. There's always going to be a composition step, and we can
always call it Phi if we choose to do so.
The only way you can avoid mutual recursion between phi and E is to
define E non-recursively. Then Phi will exist, but it will be
independent of E and there will be no problem.
--
~jrm
Let see... do you consider a Turing machine physically realizable? I sure
have not seen a physical Turing machine with one of those fancy infinite tapes.
Many programming language semantics strictly speaking are not physically
realizable, because they assume infinite storage. Programming languages are
nice abstractions of the physical systems we acutally build.
That is, one can't go from Shriram's big-step OpS to a function
E: Expressions -> (Env -> Value Env)
without some real work, as my last post showed clearly. So it was not
reasonable for me to think that PLAI's semantics was DS.
So there's sense to saying I've got to use CPOs or Scott models. It's,
"You've gotta do something! You can't just say you're done!"
We've gotten into other topics, which I'd be happy to discuss them
further, but the original impetus is now dead: I was wrong about PLAI.
For instance, I'm now maybe starting to see the advantage of CPOs: to
make sense of the big-step rule, do structural induction over the
phrases of the language. Start small with numbers & identifiers &
induct up. I'll have to think about it. But my solution was fine
too: refine the big-step rule (which is really just an equation for E
to satisfy, as Lauri & Joe said) into a 1-step function.
2 other clarifications:
I wrote last post
> sec 24 says to rewrite lambda expressions as local exps."
That was just brainlock on my part. I forgot the rule (which MFl had
to remind me of once) of Advanced Scheme
3. A set!-expression must occur in the lexical scope of a define
that introduces the set!-expression's left-hand side.
That allows the simplified semantics of Advanced Scheme
((lambda (x-1 ... x-n) exp)
v-1 ... v-n)
= exp with all x-1 ... x-n replaced by v-1 ... v-n
The law serves as a replacement of the law of application from algebra
in the study of the foundations of computing. By convention, this law
is called the ßv (pronounced ``beta value'') axiom.
Joe wrote:
> > I say no. Are you saying F-F & Barendregt needed CPOs? It's just
> > induction.
>
> `Just' induction?
Maybe we should say more. I think the math folks are in the habit of
saying "just use induction." They don't say, "use CPOs or Scott
models", because they don't know what they are! Look at it yourself.
I really disagree, Joe, and I'm sure the authors would also.
> On page 33 (section 4.6.3) they assert that the Y operator `can find
> the fixed point' of any function. But a function can have many
> fixed points (like the identity function) or no fixed points (like
> (lambda (x) (+ x 1)) ).
You're right, but that's different, that's not in the Standard
Reduction section. You're definitely making a good point about Y.
> But since Felleisen and Flatt are developing an operational semantics,
> perhaps they didn't feel the need to specify which (if any) fixed
> point is found.
Please take another look at F-F's eval_s, on p. 51.
> If you don't care if your semantics apply to realizable machines,
> that's fine, but it isn't of much interest to computer scientists.
I think we're miscommunicating. Simplest example of what I'm saying
is that the DS function E sends all infinite loops to bottom. We
can't realize that on a machine, by the Halting Problem, as you know.
> >> > I define E first by induction, and then afterward, I defined Phi in
> >> > terms of E. I don't define E & Phi by simultaneous recursion.
> >>
> >> Your definition of E is incomplete without Phi, so there is
> >> simultaneous recursion.
> >
> > No, that's going to be false, after I successfully define E.
>
> I don't see how. Phi is just a name that we gave to the composition
> step in E. There's always going to be a composition step, and we can
> always call it Phi if we choose to do so.
>
> The only way you can avoid mutual recursion between phi and E is to
> define E non-recursively. Then Phi will exist, but it will be
> independent of E and there will be no problem.
OK, Joe, but now we have a test case. I sketched the construction of
an E for Advanced Scheme, but didn't give a Phi at all. Do you find
an error there? My point is that my E for Advanced Scheme, using a
1-step R, is vastly smarter than the E we've been arguing about :D
> > That's the value of using math as a `common language' of
> > semantics: we can define functions that aren't realizable on a
> > physical computer.
>
> Then you have not given a semantics of a *programming* language.
Shriram, I think I see your point, & I partly agree with you. First
let apologize "in person" for my wild-goose chase.
I think your PLAI big-step OpS is realizable on a physical computer,
because it amounts to things like this. Your OpS says that if 3
reductions exist, then a 4th reduction for {f a} exist. Your function
4th-reduction(1st red, 2nd red, 3rd red)
looks like a computable math function. And furthermore it's good
semantics: it expresses mathematically what we know & believe about
our PL. To rephrase this, you don't define the noncomputable function
V: Expressions -> (Env -> Value)
but your OpS translates to interesting equations for V to solve.
You'll say that
V[f](e) = <x, body, e'>, V[a](e) = a_v, & V[body](e[x<-a_v]) = b_v
implies that
V[{f a}](e) = b_v
Your "failure" to define V doesn't hurt your semantics at all: Your
(computable) implications fully expresses the semantics! You've
mathematized what SICP & the text of R5RS say, and furthermore you've
mathematized what the interpreter will tell us. Your math/semantics
fails to tell us what the interpreter won't tell us either: when a
program is in an infinite loop. But we can only "answer" this
question by invoking strong math axioms anyway.
And so you've refuted a claim I've made in this thread a few times:
"Any semantics, when fully mathematized, becomes DS."
But we like to write down functions like V mathematically, and it may
even be useful (I like standard reduction), so let me weaken my claim:
Math, the `common language' of semantics, lets us define noncomputable
functions like V, i.e. V is not realizable on a physical computer.
In response to this weaker claim, I think this claim is too strong:
> Then you have not given a semantics of a *programming* language.
You can define the semantics of a PL with a noncomputable function
like V: that's done in DS. It's just that you don't have to go the
extra noncomputable mile in order to specify the PL's semantics.
That's what I (finally!!!) learned from your PLAI semantics.
> Joe Marshall <prunes...@comcast.net> responded to me:
>>
>> > Are you saying F-F & B needed CPOs? How come F-F & B don't run
>> > into the problems you raise, such as:
>> >
>> >> But you may have more than one fixed point, too. The identity
>> >> function has an infinite number of fixed points, and a lot of them
>> >> are wildly different. You wouldn't want a semantics that said that
>> >> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
>>
>> > :D Why doesn't this problem snag F-F's eval_s? Look on p. 51 in
>> > <http://www.ccs.neu.edu/course/com3357/mono.ps>.
>>
>> It probably should.
>
> I really disagree, Joe, and I'm sure the authors would also.
Matthias is a few doors down, I'll ask.
>> On page 33 (section 4.6.3) they assert that the Y operator `can find
>> the fixed point' of any function. But a function can have many
>> fixed points (like the identity function) or no fixed points (like
>> (lambda (x) (+ x 1)) ).
>
> You're right, but that's different, that's not in the Standard
> Reduction section. You're definitely making a good point about Y.
>
>> But since Felleisen and Flatt are developing an operational semantics,
>> perhaps they didn't feel the need to specify which (if any) fixed
>> point is found.
>
> Please take another look at F-F's eval_s, on p. 51.
The standard reduction section is showing that an algorithm exists by
which you can reduce an expression, and that this algorithm preserves
the algabraic properties of the reduction equations.
In other words, it proves that `standard reduction is as good as
algabraic reduction' with regard to getting the same answers.
> OK, Joe, but now we have a test case. I sketched the construction of
> an E for Advanced Scheme, but didn't give a Phi at all. Do you find
> an error there?
I haven't even seen a well-formed `E', let alone one I can examine for
an error. Every E you've presented has E on both sides of the
equation.
--
~jrm
> Maybe we should say more. I think the math folks are in the habit of
> saying "just use induction." They don't say, "use CPOs or Scott
> models", because they don't know what they are! Look at it yourself.
Dare I throw kerosene on the fire?
It is possible to show that:
The function GOODSTEIN terminates for all positive integer values
for seed.
That the proof for the above statement is independent of the Peano
axioms.
;;; Note, this depends on a version of FLOOR that returns both a
;;; quotient and remainder. You may have to roll your own.
(define (base-bump base n)
(if (zero? n)
0
(do ((exponent 0 (+ exponent 1))
(divisor 1 next)
(next base (* next base)))
((> next n)
(call-with-values (lambda () (floor n divisor))
(lambda (quotient remainder)
(+ (* quotient (expt (+ base 1) (base-bump base exponent)))
(base-bump base remainder))))))))
(define (goodstein seed)
(do ((base 2 (+ base 1))
(n seed (- (base-bump base n) 1)))
((zero? n))))
--
~jrm
> >> > :D Why doesn't this problem snag F-F's eval_s? Look on p. 51 in
> >> > <http://www.ccs.neu.edu/course/com3357/mono.ps>.
> >>
> >> It probably should.
> >
> > I really disagree, Joe, and I'm sure the authors would also.
>
> Matthias is a few doors down, I'll ask.
Excellent, Joe! Do you teach there too?
> > Please take another look at F-F's eval_s, on p. 51.
>
> The standard reduction section is showing that an algorithm exists
> by which you can reduce an expression, and that this algorithm
> preserves the algebraic properties of the reduction equations.
>
> In other words, it proves that `standard reduction is as good as
> algabraic reduction' with regard to getting the same answers.
Yeah, but I'm just concentrating on the fact that `this algorithm',
called eval_s, is well-defined, and F-F never mentioned CPOs, Scott
models, structural induction. I think that's perfectly fine, and I
think any undergraduate Math major who'd taken a course in Point Set
Topology or Real Analysis would see why that's fine.
> > OK, Joe, but now we have a test case. I sketched the construction of
> > an E for Advanced Scheme, but didn't give a Phi at all. Do you find
> > an error there?
>
> I haven't even seen a well-formed `E', let alone one I can examine for
> an error. Every E you've presented has E on both sides of the
> equation.
Perhaps you didn't see my E for Advanced Scheme, in a response to
Lauri. E occurs for the first time here:
F(exp, e) = R^n(exp, e), if n in {0, 1, 2,...} is smallest s.t.
R^n(exp, e) in Value x Env and
R^i(exp, e) != bottom for i <= n
F(exp, e) = bottom, otherwise.
The omega problem yields F(omega, e) = bottom, because there is no
n: For all n in {0, 1, 2,...}, R^n(omega, e) is not in Value x Env.
Now we define the math function
E: Expressions -> (Env -> Value Env)
by including Expressions -> Enhanced and "un-currying" F.
R is not in fact an inductively defined function (i.e. no R on both
sides of the equation). As I said in my post, there are holes in my
treatment of R, but you should be able to see that
1) E is defined in much the same way that F-F's eval_s is, so it
doesn't need CPOs
2) Some Advanced Scheme hotshots could easily fill in the holes.
>>I think that's perfectly fine, and I think any undergraduate Math major
who'd taken a course in Point Set Topology or Real Analysis would see why
that's fine.
Uh well... yes I have. But I've been lost for quite some time now. :-)
Bill Richter:
> I think we're miscommunicating. Simplest example of what I'm saying
> is that the DS function E sends all infinite loops to bottom. We
> can't realize that on a machine, by the Halting Problem, as you know.
That's like saying we can't compute pi using a computer, because of
the Halting Problem.
What we *can* compute is an infinite sequence of approximations that
converge to pi. Similarly, we can compute an infinite sequence of
approximations that converge to bottom.
This assumes infinite computing resources, of course. If we didn't
have infinite computing resources, then the Halting Problem would be
decidable.
Will
That's hot, Joe! My favorite set theory book, by Just & Weese, says
that prior to Goedel incompleteness, no one had ever proved a theorem
in Number Theory that independent of the Peano axioms. And of course,
the Goedel sentence is pretty weird, not clearly related to any Number
Theory we know or care about. This sounds a lot more interesting.
Can you say more? I couldn't grok your code.
Where's the kerosene anyway? My point is that you can do great things
with Math, like construct DS function with or without Scott models,
but only because Math has such strong (set theory) axioms. But we
know from Goedel incompleteness that are axioms aren't complete.
Your turn!
> I haven't been following this, because it looks like a repetition of
> an apparently useless discussion from a couple of years ago,
I don't think it's useless, Will. I learned from Shriram that not all
semantics is DS. Lauri's off reading Schmidt to see if
compositionality really requires structured induction with CPOs.
Joe's gonna ask MFe if the standard reduction function requires CPOs.
My best post from a couple of years ago was something nobody responded
to, and I'll attribute that to exhaustion, as you guys had been
correcting my mistakes for months by then :D I posted 26 Jun 2002 how
to actually define a compositional semantic function for call/cc-free
Scheme without CPOs or Scott models.
I just did it again for Advanced Scheme, with some holes, but it's an
advance, as Advanced Scheme cleans up my clumsy approach, and now (I
think) I understand from Shriram that the key is using some small-step
OpS, as Robby Findler does. That is, in order to not just write down
a buncha equations for the semantic function, I needed a non-recursive
1-step evaluation function. And that strikes me as similar to the
standard Scott/CPO/structured approach: we can induct over the
complexity of expressions, or we can take teeny tiny evaluation steps.
> so I will take things completely out of context and comment upon
> them whenever it tickles my funny bone.
That's the spirit here! We're playing the telephone game: Joe said
something I didn't understand, so I semi-responded...
> Bill Richter:
> > I think we're miscommunicating. Simplest example of what I'm
> > saying is that the DS function E sends all infinite loops to
> > bottom. We can't realize that on a machine, by the Halting
> > Problem, as you know.
>
> That's like saying we can't compute pi using a computer, because of
> the Halting Problem.
Well, they're both uncomputable, and that's all I meant, in my
response to Joe:
Each fixed point gives a potentially *different* answer, so you
need a mechanism to choose the one you want. And that mechanism
had better be realizable on a physical computer.
I probably had no idea what Joe meant. I don't know what Shriram
thought I meant when he responded to me:
> That's the value of using math as a `common language' of
> semantics: we can define functions that aren't realizable on a
> physical computer.
Then you have not given a semantics of a *programming* language.
Back to you, Will:
> What we *can* compute is an infinite sequence of approximations that
> converge to pi. Similarly, we can compute an infinite sequence of
> approximations that converge to bottom.
Aha, interesting point. Does that answer Joe's original point:
And that (fixed-point-choosing) mechanism had better be
realizable on a physical computer.
That the semantic function must be a limit? But what does that rule
out? Are there any semi-pseudo-semantic function that aren't limits?
> This assumes infinite computing resources, of course. If we didn't
> have infinite computing resources, then the Halting Problem would be
> decidable.
Cool! Yeah, all functions are computable on a finite set!
But we must assume infinite computing resources. It doesn't matter if
Daniel's fancy infinite Turing tape physically exists. We're talking
about computable functions on the integers and other infinite sets.
> Joe Marshall <prunes...@comcast.net> responds to me:
>>
>> > I think the math folks are in the habit of saying "just use
>> > induction." They don't say, "use CPOs or Scott models", because
>> > they don't know what they are!
>>
>> Dare I throw kerosene on the fire?
>>
>> It is possible to show that:
>>
>> The function GOODSTEIN terminates for all positive integer values
>> for seed.
>>
>> That the proof for the above statement is independent of the Peano
>> axioms.
>
> That's hot, Joe! My favorite set theory book, by Just & Weese, says
> that prior to Goedel incompleteness, no one had ever proved a theorem
> in Number Theory that independent of the Peano axioms. And of course,
> the Goedel sentence is pretty weird, not clearly related to any Number
> Theory we know or care about. This sounds a lot more interesting.
> Can you say more? I couldn't grok your code.
Google for Goodstein sequences.
> Where's the kerosene anyway?
This was in reference to `just' induction. One of the Peano axioms is
(informally)
``If a property is possessed by 0 and also by the successor of every
natural number which possesses it, then it is possessed by all
natural numbers.''
http://www.brainyencyclopedia.com/encyclopedia/p/pe/peano_axioms.html
and this, of course, is the foundation for mathematical induction.
But mathematical induction isn't strong enough to show that Goodstein
sequences converge to zero. The program I wrote halts if Goodstein
sequences converge to zero, therefore you need something stronger than
mathematical induction if you want a comprehensive semantics.
--
~jrm
I should just like to note that the discussion has not been useless. I
for one learned a lot from the previous round, since rarely if ever
have the rationale and technical details of such an esoteric subject
as DS been spelt out in such excruciating detail. Even if this is not
useful to the participants, it may well be to the lurkers.
Lauri
> I should just like to note that the discussion has not been useless. I
> for one learned a lot from the previous round, since rarely if ever
> have the rationale and technical details of such an esoteric subject
> as DS been spelt out in such excruciating detail. Even if this is not
> useful to the participants, it may well be to the lurkers.
This is really a weakness of textbooks. I'm yet to find one that
provides even a rudimentary (technical) introduction to Why before it
dives into What -- a book that says what has been said on this thread,
for instance.
If anyone knows of a counter-example, please point me to it, because I
sure as heck would love to not have to go over this in person with
each of my students.
Shriram
> in my response to Joe:
>
> Each fixed point gives a potentially *different* answer, so you
> need a mechanism to choose the one you want. And that mechanism
> had better be realizable on a physical computer.
>
> I probably had no idea what Joe meant.
Consider this function (call it FOO)
(lambda (f)
(lambda (n)
(if (< n 1)
1
(* n (f (- n 1))))))
FOO takes a function as an argument and returns a function as a
result. The argument function must be of type (number -> number) and
the result type will be of type (number -> number). We could, if we
wanted, call FOO on the cosine function:
> ((foo cos) 3)
-1.2484405096414273
> ((foo cos) .4)
1
> ((foo cos) 1.7)
1.3002317183836305
or perhaps the square-root function:
> ((foo sqrt) 7)
17.146428199482244
Since the result of FOO is the same type as the argument to FOO, we
can call FOO on it, too.
> ((foo (foo cos)) 3)
3.2418138352088386
As many times as you want:
> ((foo (foo (foo (foo (foo (foo sqrt)))))) 6.01)
73.78031367520926
At this point, it would be useful to write a helper:
(define (nest-wrapper wrapper depth)
(lambda (k)
(if (zero? depth)
k
((nest-wrapper wrapper (- depth 1)) (wrapper k)))))
So now we can write
> (((nest-wrapper foo 6) sqrt) 6.01)
73.78031367520926
The interesting thing is that the deeper we nest the calls, the
less important the initial function is:
> (((nest-wrapper foo 20) sqrt) 2)
2
> (((nest-wrapper foo 20) cos) 2)
2
and a whole new behavior emerges:
> (((nest-wrapper foo 20) cos) 3)
6
> (((nest-wrapper foo 20) cos) 4)
24
> (((nest-wrapper foo 20) cos) 5)
120
> (((nest-wrapper foo 20) cos) 6)
720
But we only get this behavior if we nest deeply enough:
> (((nest-wrapper foo 3) cos) 6)
-118.79909959205344
Providing that the nesting is deep enough, the initial function
doesn't matter. The resulting function no longer depends on the
initial function, but derives from FOO itself. So the question
arises: what function is generated by FOO in the limit as the nesting
depth goes to infinity? As you must have noticed by now,
for any function G,
lim ((nest-wrapper foo n) g) = (lambda (x) (factorial x))
n->inf
Now if N is really, really large, then
n times
(foo (foo (foo (foo ... ))))
is almost the same function as
n+1 times
(foo (foo (foo (foo (foo ... )))))
For a function H and a value x, if H(x) = x, then x is a fixed-point
of H. So if there is an x so that (foo x) = x, then x is a
fixed-point of foo.
(foo (lambda (x) (factorial x))) = (lambda (x) (factorial x))
So the factorial function (a function from int->int) is a fixed point
of the FOO function (a function from (int->int)->(int->int) ). We
note that:
lim ((nest-wrapper foo n) g) = a fixed point of foo
n->inf
So what about functions other than FOO? This function:
(define bar (lambda (f)
(lambda (x)
(f x))))
has an infinite number of fixed-points.
lim ((nest-wrapper bar n) g) = g
n->inf
Any function whatsoever (one-arg, one-value function that is) can be
wrapped by BAR and that doesn't change it:
(bar x) = x
for all x.
If we wish to use fixed-points to model recursion, we need to specify
which fixed-point to use if there is more than one. Although the
factorial function is a fixed-point of BAR (as well as FOO), we really
can't expect the computer to compute that particular fixed-point from
BAR.
> And that (fixed-point-choosing) mechanism had better be
> realizable on a physical computer.
>
> That the semantic function must be a limit? But what does that rule
> out? Are there any semi-pseudo-semantic function that aren't limits?
Sure. Make one up. You could say that if there are multiple fixed
points that the one chosen is the lexicographically least when
expressed in English. So the fixed point of BAR would be `abs'. But
you can't build a computer that behaves this way.
Lauri Alanko:
> I should just like to note that the discussion has not been useless.
I stand corrected. I'm glad to hear the discussion has been useful.
Bill Richter:
> > That's like saying we can't compute pi using a computer, because of
> > the Halting Problem.
>
> Well, they're both uncomputable, and that's all I meant, in my
> response to Joe:
Pi is one of the computable real numbers.
> > What we *can* compute is an infinite sequence of approximations that
> > converge to pi. Similarly, we can compute an infinite sequence of
> > approximations that converge to bottom.
>
> Aha, interesting point. Does that answer Joe's original point:
>
> And that (fixed-point-choosing) mechanism had better be
> realizable on a physical computer.
Yes.
> That the semantic function must be a limit? But what does that rule
> out? Are there any semi-pseudo-semantic function that aren't limits?
That aren't limits of computable approximations? Yes. Here's one you
might recognize.
Suppose E [[ * ]] gives the denotational semantics of expressions in
a Scheme-like language, as a continuous partial function from environments
to continuous partial function from continuations to continuous partial
functions from stores to integer answers. Suppose further that r0, k0,
and s0 are some particular environment, continuation, and store. Then
the partial function from expressions to integers that is defined by
GOOD [[ e ]] = E [[ e ]] r0 k0 s0
is computable, and is the limit of a sequence of finitely computable
approximations, but
BAD [[ e ]] = 632499472 if E [[ e ]] r0 k0 s0 = bottom
= 532028816 otherwise
is not computable, and is not the limit of any sequence of computable
approximations.
Will
Joe, that's an important mistake. The Peano induction axiom is only
for statements expressed in the language of Peano arithmetic. And the
Peano language is so limited that you can't get a lot of induction
that way. So the Peano induction axiom is NOT the foundation for
mathematical induction. The foundation is the much more powerful ZFC
axioms, and it yields the statement I think was in my Algebra II text
in High School:
Given mathematical statements S(n), n = 0, 1, 2,... if S(0) is true,
and S(n) => S(n+1), then S(n) is true for all n.
You probably know all this, because CS folks generally know about
nonstandard models of (Peano) arithmetic. That is, not every element
of the model is a successor of 0. But why not? We can write an easy
inductive proof of this:
S(n): for all n (n is a successor of 0)
S(0) is true!
S(n) => S(n+1) is true, because by definition, n+1 is the
successor of n.
But that isn't a proof because we can't state "n is a successor of 0"
in Peano arithmetic. The problem is that we can't form the iterated
composites
Successor^n: N ---> N
And that's exactly the sort of induction that I want to use to show
the standard reduction function is well-defined. That is, given the
1-step reduction function R, we want to form the iterated composites
R^n: LC_v Expressions ---> LC_v Expressions
It's obvious how to form these (partial) function R^n with induction,
because by definition,
R^{n+1} = R . R^n
But you can't say that in Peano arithmetic.
> But mathematical induction isn't strong enough to show that
> Goodstein sequences converge to zero. The program I wrote halts if
> Goodstein sequences converge to zero, therefore you need something
> stronger than mathematical induction if you want a comprehensive
> semantics.
OK, you haven't established the last claim, as I showed above. The
rest is extremely interesting. Are you sure it's not just Peano
induction that's not strong enough? Why don't you start up a separate
thread about Goodstein sequences?
Anyway, I want to congratulate you on coming up with a meaningful
mathematical objection to my claim (standard reduction follows
from "just induction"). I want to buy back something I wrote earlier:
> >> > I think the math folks are in the habit of saying "just use
> >> > induction." They don't say, "use CPOs or Scott models",
> >> > because they don't know what they are!
> >>
> >> Dare I throw kerosene on the fire?
> >>
> >> It is possible to show that:
> >>
> >> The function GOODSTEIN terminates for all positive integer values
> >> for seed.
> >>
> >> That the proof for the above statement is independent of the Peano
> >> axioms.
> >
> > That's hot, Joe! My favorite set theory book, by Just & Weese,
> > says that prior to Goedel incompleteness, no one had ever proved a
> > theorem in Number Theory that independent of the Peano axioms.
I think that's a goof on my part. What Just & Weese must have meant,
and maybe they wrote it, is that Goedel was the first person to prove
that he had something independent of the Peano axioms. But number
theorists had been using complex analysis for a hundred years! It's
pretty easy to believe that they proved with complex analysis some
theorem in number theory that was independent of the Peano axioms.
They just hadn't proved independence.
> Pi is one of the computable real numbers.
Oops, I think I see my goof now, Will. Doe computable means we can
write an integer lambda expression f s.t. (f n) is the n-th digit of
the decimal expansion of pi? And isn't there a continued fraction
expansion of pi? The point I would know is that since there's a
countable number of integer lambda expressions, almost all real
numbers must not be computable real numbers.
> > > What we *can* compute is an infinite sequence of approximations
> > > that converge to pi. Similarly, we can compute an infinite
> > > sequence of approximations that converge to bottom.
I think I misunderstood you the 1st time around. Obviously any real
number is an infinite sequence of computable numbers (the finite
decimal approximations). But I think now you mean that the entire
sequence is computable, so you have a computable function g s.t.
g(n) converges to pi.
And I guess you similarly mean you have a computable function F
s.t. F(n) converges to a good DS semantic function? I didn't know
that either. That's sounds like `realizable on a physical computer.'
> > Aha, interesting point. Does that answer Joe's original point:
> >
> > And that (fixed-point-choosing) mechanism had better be
> > realizable on a physical computer.
>
> Yes.
Cool! I don't know why it's true, but I see it's meaningful.
> > That the semantic function must be a limit? But what does that rule
> > out? Are there any semi-pseudo-semantic function that aren't limits?
>
> That aren't limits of computable approximations? Yes. Here's one you
> might recognize.
>
> Suppose E [[ * ]] gives the denotational semantics of expressions in
> a Scheme-like language, as a continuous partial function from environments
> to continuous partial function from continuations to continuous partial
> functions from stores to integer answers. Suppose further that r0, k0,
> and s0 are some particular environment, continuation, and store. Then
> the partial function from expressions to integers that is defined by
>
> GOOD [[ e ]] = E [[ e ]] r0 k0 s0
>
> is computable,
Will, either I disagree strongly, or I don't understand you. If you
mean GOOD is computable in the above sense (there's a computable
function G s.t. G(n) converges to GOOD) then I didn't know that, but
I'll take your word for it. Sounds interesting. But I say the
Halting problem says GOOD is not a computable function.
Let's just check. E is a function
E: Expressions ---> (U x K x S -> Integers)
and Good is the "program" version of this function
Good: Expressions ---> Integers_bottom
obtained in R5RS DS fashion as you say. So send the Integers to 1 and
bottom to 0, and we have a (total) function
BooleanGood: Expressions ---> {0, 1}
which sends a program to 1 if and only if the program does not halt.
The Halting problem says there is no such such computable function
BooleanGood.
> and is the limit of a sequence of finitely computable
> approximations, but
>
> BAD [[ e ]] = 632499472 if E [[ e ]] r0 k0 s0 = bottom
> = 532028816 otherwise
>
> is not computable, and is not the limit of any sequence of computable
> approximations.
That's pretty interesting! I don't know what that's true.
Will, there's no question that you a ton of stuff that I don't know.
Really useful CS stuff. And the stuff I know is of no apparent use.
I'd be grateful if you'd run circles around me. You can certainly
teach me a lot. But why don't you vote on what Lauri wrote:
> In Schmidt, p. 51, the existence of the meaning functions is proven
> by structural induction. That's what "compositional" means: defined
> with structural induction.
I didn't give Lauri enough credit in my last post for exposing my
error: I just wrote down some equations for E to satisfy, without
showing giving an inductive argument relating everything back to the
base case. In fact, I could not have, because of infinite loops! I
think that's a long standing error of mine, which Lauri corrected.
But there's NO way I'll believe that Schmidt defines compositional to
mean proven by structural induction. That's not the way mathematical
definitions are phrased. And it's not what Cartwright and Felleisen
say in their "Extensible Denotational" paper. They say:
[The map from syntactic domains to semantic domains] satisfies the
law of compositionality: the interpretation of a phrase is a
function of the interpretation of the sub-phrases.
So for a DS valuation function
E: Expressions ---> M
compositionality means e.g. there must be a function
Phi: M x M ---> M
such that for a pair of expressions (X, Y),
E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] )
Now if you want to say that you don't believe that one could define
such a compositional E without Scott models, CPOs & structural
induction, fine, that's a reasonable mathematical opinion. I think
the standard reduction function gives a counterexample, and hope we
get to the "bottom" of that. But you can't just insert the phrase
"you have to define E by structural induction" into the definition.
> cesur...@verizon.net (William D Clinger) responds to me:
>>Pi is one of the computable real numbers.
> Oops, I think I see my goof now, Will. Doe computable means we can
> write an integer lambda expression f s.t. (f n) is the n-th digit of
> the decimal expansion of pi?
Almost. It means that you can make a Turing machine that writes
the digits of pi on a tape. The machine will never terminate,
but for any N you can be sure that it will at some point have written
more than N digits on the tape, if you wait long enough.
As it happens, it is possible to compute the n'th (for a fixed n)
decimal digit of pi (without calculating the other digits). This
surprising result is due to developments by the Borwein brothers
and Fabrice Bellard (author of tcc and more). Even for large n
it is possible to do the calculation with floating point.
> The point I would know is that since there's a
> countable number of integer lambda expressions, almost all real
> numbers must not be computable real numbers.
Exactly.
;;;
;;; Computation of the n'th digit of pi in any base 10
;;;
;;; Jens Axel Søgaard, 27. maj 2001, a dull sunday
;;;
;;; Reference: "Computation of the n'th digit of pi in any base in O(n^2)"
;;; by Fabrice Bellard, 1997
;;; <http://www.stud.enst.fr/~bellard/pi/pi_n2/pi_n2.html>
;;
;; MAIN
;;
; returns the n'th digit of pi in base 10
(define (pi-digit n)
(let* ((N (inexact->exact (truncate (* (+ n 20) (/ (log 10) (log 2)))))))
(letrec ((for-p
(lambda (p sum)
;; termination condition
(if (> p (* 2 N))
sum
(let* ((vmax (inexact->exact (truncate (/ (log (* 2 N)) (log p)))))
(modulus (expt p vmax)))
(letrec ((for-k
(lambda (k b A v alpha)
(if (> k N)
alpha
(let* ((v (+ (- v (multiplicity p k))
(multiplicity p (- (* 2 k) 1))))
(A (modulo (/ (* (- (* 2 k) 1)
A)
(expt p (multiplicity p (- (* 2 k) 1))))
modulus))
(b (modulo (* b (/ k (expt p (multiplicity p k))))
modulus)))
(for-k (+ k 1) b A v
(if (> v 0)
(+ alpha (* k b (inv-mod A modulus) (expt p (-
vmax v))))
alpha)))))))
(for-p (next-prime p)
(mod1 (+ sum
(/ (modulo (* (expt-mod 10 (- n 1) modulus)
(for-k 1 1 1 0 0))
modulus)
modulus))))))))))
(first-digit (for-p 3 0)))))
;;
;; Arithmetical utilities
;;
; Extended version of Euclids algoritm for finding the gcd
; returns
; (g,a,b) s.t. g=gcd(m,n) and am+bm=g
(define (gcd-ext m n)
(letrec ((helper (lambda (m n am bm an bn)
(if (> m n)
(helper (- m n) n (- am an) (- bm bn) an bn)
(if (> n m)
(helper m (- n m) am bm (- an am) (- bn bm))
(cons m (cons am (cons bm '()))))))))
(helper m n 1 0 0 1)))
;; b s.t. ab = 1 mod n ( note: requires gcd(a,n)=1 )
(define inv-mod
(lambda (a n)
(cadr (gcd-ext a n))))
;; base ^ exp mod (Abelson et al. p.51)
(define (expt-mod base exp m)
(cond ((= exp 0) 1)
((even? exp)
(remainder (square (expt-mod base (/ exp 2) m))
m))
(else
(remainder (* base (expt-mod base (- exp 1) m))
m))))
(define (square x) (* x x))
;;;
;;; PRIME RELATED (Abelson et al. p.50)
;;;
;;; TODO: Replace prime? from O(sqrt(n)) to O(log n) algorithm
(define (smallest-divisor n)
(find-divisor n 2))
(define (find-divisor n test-divisor)
(cond ((> (square test-divisor) n) n)
((divides? test-divisor n) test-divisor)
(else (find-divisor n (+ test-divisor 1)))))
(define (divides? a b)
(= (remainder b a) 0))
(define (prime? n)
(= n (smallest-divisor n)))
(define (next-prime n)
(if (= n 2)
3
(if (prime? (+ n 2))
(+ n 2)
(next-prime (+ n 2)))))
;;
;; FACTORISE
;;
; Example: (factor 45) -> ((3 2) (5 1))
(define (factor n)
(factor-from n 2))
(define (factor-from n f)
(if (= n 1)
'()
(if (divides? f n)
(cons (list f (multiplicity f n))
(factor-from (/ n (expt f (multiplicity f n))) (next-prime f)))
(factor-from n (next-prime f)))))
(define (multiplicity p n)
(if (divides? p n)
(+ 1 (multiplicity p (/ n p)))
0))
; Utilies
(define (mod1 x)
(- x (floor x)))
(define (first-digit x)
(inexact->exact (floor (* 10 (mod1 x)))))
;;
;; TEST
;;
(define (interval from to)
(if (> from to)
'()
(cons from (interval (+ 1 from) to))))
; guile> (map pi-digit (interval 1 100))
; (1 4 1 5 9 2 6 5 3 5
; 8 9 7 9 3 2 3 8 4 6
; 2 6 4 3 3 8 3 2 7 9
; 5 0 2 8 8 4 1 9 7 1
; 6 9 3 9 9 3 7 5 1 0
; 5 8 2 0 9 7 4 9 4 4
; 5 9 2 3 0 7 8 1 6 4
; 0 6 2 8 6 2 0 8 9 9
; 8 6 2 8 0 3 4 8 2 5
; 3 4 2 1 1 7 0 6 7 9)
--
Jens Axel Søgaard
Note the "partial". Here the function, instead of returning bottom,
simply is not defined on nonterminating expressions.
> > In Schmidt, p. 51, the existence of the meaning functions is proven
> > by structural induction. That's what "compositional" means: defined
> > with structural induction.
> But there's NO way I'll believe that Schmidt defines compositional to
> mean proven by structural induction.
True, Schmidt didn't say that. That was an interpretation of mine. And
when I said "means", I didn't mean "is defined to be", but rather
"amounts to in practice". Also, I should perhaps have said "recursion"
instead of "induction". Not that there is very much difference,
though.
> [The map from syntactic domains to semantic domains] satisfies the
> law of compositionality: the interpretation of a phrase is a
> function of the interpretation of the sub-phrases.
To my mind, this is exactly what structural recursion is about. I
don't quite see what the big difference is supposed to be.
The point of my comment was, in any case, simply that compositionality
is not an arbitrary restriction. In fact, it is a _license_: you are
_allowed_ to use the meanings of subphrases when defining the meaning
of a phrase, because syntax trees are finite and induction on the
height of a tree is well-founded. If it weren't for this fact, you
couldn't use E _at all_ in the definition of E.
Lauri Alanko
l...@iki.fi