Proposed Language Changes Norman Ramsey
PC Ports / Back Ends Michael Elliott
Extensions to Trestle Steve Glassman
RPC Systems Andy Hisgen
--------------------
Proposed Language Changes
Norman Ramsey
About 15 or 20 people attended the session. We broke the session into
two parts: brainstorming to make a list of proposed changes, and
discussion of the proposals. We reserved some time at the end to
address the higher-level questions of how much change we want and how
changes should be made.
Because there wasn't time to discuss all the proposals, we voted on the
list and discussed only the proposals that received the most votes.
Please remember that votes measure interest in discussing the proposal,
not support for the proposal itself. The proposals were:
Generics should be changed. (10 votes)
The language should require that checked runtime errors raise
exceptions, especially the ``out of memory'' error. (8 votes)
ABSTRACT OBJECT should indicate a type that cannot be instantiated;
only proper subtypes can be instantiated. (4 votes)
It should be possible to override the meanings of the infix
operators, e.g. by PROCEDURE "="( , ) = ... (4 votes)
Modula-3 should (or should not) have multiple inheritance. (3 votes)
Modula-3 should have an unsigned integer type WORD distinct from
INTEGER. (3 votes)
Permit formal parameters of type ANY and the type REF ANY. (3 votes)
Support FORK P(args) as syntactic sugar for the declaration of the
appropriate subtype of Thread.Closure, the allocation of the instance,
and the call to Thread.Fork. (3 votes)
ASSERT should be a statement, not a pragma, and not ignorable. (2 votes)
Define INTEGER to be 32 bits, and add LONGINTEGER. (2 votes)
Restrict the text between pragma brackets <* ... *> to be a sequence
of valid Modula-3 tokens. (2 votes)
Add a parameter mode PROCEDURE P(READONLYREF p: REF T) that prevents
P from modifying p^. (1 vote)
Make it possible to declare a VAR READONLY in an interface.
Change the names of the parameter modes, or the semantics, or both,
to support the Ada-style IN, OUT, and INOUT modes. (1 vote)
Provide a syntax so that VAR and VALUE actuals can be distinguished
at call sites. (1 vote)
I apologize if I have misstated anyone's proposal; please send
corrections to n...@princeton.edu.
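For readers unfamiliar with the FORK proposal above: Modula-3 today requires
declaring a Thread.Closure subtype, allocating an instance, and calling
Thread.Fork; the sugar would collapse those three steps into one form. A rough
analogue, sketched here in Python (the threading analogy and all names below
are mine, not part of the proposal):

```python
import threading

# Hypothetical illustration of the proposed FORK P(args) sugar.  In Modula-3
# one must declare a Thread.Closure subtype, allocate it, and call
# Thread.Fork; Python's threading.Thread bundles the closure, the
# allocation, and the fork into a single call, much as the sugar would.
results = []

def p(x, y):
    results.append(x + y)

t = threading.Thread(target=p, args=(2, 3))  # the "FORK p(2, 3)" step
t.start()
t.join()                                     # wait for the forked thread
print(results)
```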
Because time was short, discussion was limited to ten minutes per
proposal. Discussion began with generics.
Few of those present were happy with generics as now defined, but
nobody wanted to add all the nifty new generic features that have been
proposed. Discussion centered around the proposal made by Jorge
Stolfi to support ``inline'' instantiation of generics, hiding the
existence of internal generics, e.g.
GENERIC MODULE Set(Elem);
IMPORT List(Elem);
Jorge mentioned this idea in his presentation at the meeting, along
with other, more speculative ideas. Greg Nelson's concern was that
there was no way to specify what module was to be used as the
implementation of List, whereas with the current system the programmer
is forced to instantiate a separate interface, e.g.
INTERFACE IntList = List(Integer); END IntList.
MODULE IntList = List(Integer); END IntList.
MODULE IntSet = Set(Integer, IntList); END IntSet.
Most of our time was spent discussing this example, and pointing out
that it was unclear how to have two different implementations of lists
under the proposed system. Norman Ramsey suggested that the problem of
who implements List(Integer) isn't any different from the problem of
who implements Integer---it is also impossible to have two different
implementations of Integer. Jorge pointed out that his proposal
subsumes the current generics, so that it would still be possible to do
things the old way.
No consensus was reached about generics, except possibly that anything
other than what we have now would probably be hard to understand.
Discussion moved on to the question of whether the language definition
should require that checked runtime errors raise exceptions.
Discussion was spirited; I have tried to summarize the positions of
those pro and con, not to describe the back-and-forth of the session.
Sam Harbison said his experience with Ada convinced him that it was
good to have checked run-time errors be exceptions. The error
``exception not in RAISES clause'' has to be treated specially.
Eliot Moss pointed out that CLU handles every error as an exception,
that overflow was explicit, and that it was important to have different
exceptions for different conditions. Exceptions are divided into two
classes, and the implementation does not guarantee correctness after an
exception of the second class (``Failure'') has been raised. All
procedures can raise Failure, and an exception is automatically
converted to Failure if not in the RAISES clause. Examples include
parity error and insufficient memory.
Eliot also said that Modula-3 should require that checked run-time
errors raise exceptions, because otherwise programs wouldn't be
portable between implementations that chose to raise exceptions and
those that chose not to. Raising exceptions makes it easier to write
long-running servers that can log what happened and try to recover.
Someone pointed out that in the PC world it is important to be able to
write shrink-wrap applications (e.g. a word-processor) that never just
crash, but print some error message that makes sense to the user. Jim
Saxe observed that the Larch Prover, when it encounters a checked
run-time error, prints a message that includes Steve Garland's email
address. (The Larch Prover is written in CLU.) He agreed that it's
important for applications to be able to make an attempt to save the
user's work when a fatal error occurs. He also suggested that raising
an exception when array bounds are violated could help avoid the
duplicate work of checking array bounds in extension languages.
Doug Moen has developed editors for the PC and Mac, and these editors
have had a ``low on memory'' pop-up window that gives users a chance to
save their work and avoid bombs when memory is exhausted. Customers
demand this feature. The implementation pre-allocates resources for
that window at startup time; this technique could be applied generally.
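The pre-allocation technique can be sketched in a few lines; this is an
illustrative reconstruction in Python, not Doug Moen's actual implementation:

```python
import gc

# Illustrative sketch (not the actual editor code): claim an emergency
# reserve at startup; when an allocation fails, release the reserve so
# there is headroom left to warn the user and save their work.
class EmergencyReserve:
    def __init__(self, size=1 << 20):
        self.reserve = bytearray(size)   # claimed at startup, never touched

    def handle_low_memory(self, save_work):
        self.reserve = None              # give the reserved memory back
        gc.collect()                     # make it reusable immediately
        save_work()                      # now there is room to save

reserve = EmergencyReserve()
saved = []
try:
    huge = bytearray(1 << 62)            # absurd request: fails immediately
except MemoryError:
    reserve.handle_low_memory(lambda: saved.append("document"))
print(saved)
```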
The faction against requiring checked run-time errors to be exceptions
was less numerous. Norman Ramsey pointed out that other mechanisms
can be used to handle some checked run-time errors; for example, users
might register a procedure to be called when memory is exhausted.
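The handler-registration alternative can be sketched as follows; all names
here are invented for illustration and do not correspond to any actual
Modula-3 interface:

```python
# Sketch of the alternative mentioned above: instead of raising an
# exception, the allocator calls procedures registered for the
# out-of-memory case.  All names here are invented for illustration.
_oom_handlers = []

def register_oom_handler(proc):
    _oom_handlers.append(proc)

def allocate(nbytes):
    try:
        return bytearray(nbytes)
    except MemoryError:
        for proc in _oom_handlers:
            proc(nbytes)       # let each handler log, free caches, etc.
        raise                  # still fail if no handler freed memory

log = []
register_oom_handler(lambda n: log.append("out of memory requesting %d bytes" % n))
try:
    allocate(1 << 62)          # absurd request: triggers the handlers
except MemoryError:
    pass
print(log)
```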
Greg Nelson pointed out the complications induced by concurrency; when
a checked run-time error occurs, there is no guarantee that the
currently executing thread has a handler for the appropriate
exception. This situation can easily occur in practice because some
of the modules in the library fork threads. The language committee
was reluctant to require that ALL checked run-time errors raise
exceptions, because that might place an undue burden on implementors.
Greg said that ``serious fail-stop programmers'' who actually
worry about long-running servers don't use exceptions to recover from
failures. Instead of watching individual threads with exception
handlers, they use operating-system mechanisms to watch the entire
address space, starting a new address space on failure.
Discussion moved from specific topics to the general questions of how
much change we want and what mechanisms there are for making changes.
There was broad consensus in favor of keeping the language stable,
although we didn't explore in detail what ``stability'' meant.
Greg Nelson's view was that another ``12 changes'' in the next 18
months would be too much change, but that there would be changes
eventually---otherwise the language would be dead. Sam Harbison
suggested that we think about making changes approximately every two
years.
Sam Harbison suggested that no changes at all are better than a few
good changes, and that it is important to be able to point new users
to a single document that describes the language, not make them apply
deltas to the existing documents. Christian Jacobi said that small
improvements are not worth the disruption. He also said that there is
a difficult tightrope walk between promising enough stability and
getting the necessary changes done. He thought that the language
committee handled the 12 changes quite elegantly in this respect.
Dave Detlefs argued the opposite position, that Modula-3 has a small
user community, so now is a good time to make changes. Modula-3 might
need to change to compete with C++ templates. Eliot Moss pointed out
that the designers might have an inaccurate picture of the potential
impact of changes, since they and most users are at SRC where the impact
is high. Eliot also suggested that compilers with experimental
features should offer a conforming mode.
It was suggested that comp.lang.modula3 could be used as a forum for
proposing changes, but there seemed to be agreement that posting is too
easy and that the process for proposing changes should be more
arduous. Someone asked how well comp.lang.modula3 is working, and
those present were not too dissatisfied, although there were broad
complaints that there is too much quantity and the signal-to-noise
ratio is low. No one present volunteered to moderate a Modula-3
newsgroup.
--------------------
PC Ports and Back Ends
Michael Elliott
I. Concerning the port of Modula-3 to the PC
A. Modula-3 must be perceived as a viable language
1. C++ is perceived to be viable
2. Speed vs C++ is probably not an issue
a) garbage collection may enable M3 to run a little faster
b) lots of virtual functions will slow it down, though
optimization techniques hold promise.
c) How about Byte Magazine ad benchmarks?
d) Sales techniques
Has to appeal to the one or two daring guys in an
organization of 100 or so, in order to get them to try it.
B. What's the target user?
To what sort of user should a PC port be addressed
1. The professional software developer
Probably not any time soon
2. Education -- used for teaching programming
Probably the best hope of getting a large user base --
followed by popularity in industry.
C. The PC World
Expects a lot in terms of an integrated environment --
editor, compiler, debugger, profiler, etc. Current PC
debuggers are quite good and quite friendly.
D. What OS?
1. MS-DOS
A non-reentrant, non-tasking operating system will be tough.
It's definitely doable, but within what range of effort?
2. OS/2
Much easier port, but that doesn't provide the wide
availability which is the overriding concern.
E. Nothing less than a 386
The general consensus was that it would be folly to
consider architectures in the 8088 -- 80286 range, as their
lack of protected-mode programming and of a flat 32-bit
address space makes them not worth supporting. A minimum of 2Mb of
memory was deemed necessary.
F. 386 is probably not a problem
Although a lot of older machines are out there, few people
today buy less than the architecture considered workable --
a 386 with at least 2Mb of memory.
G. Bad implementation which at least works?
Some interest was shown in having some implementation, even
if quite slow, available soon -- just to establish "market
presence" in the academic world.
H. What's the soonest route to delivery of something?
1. Maybe Microsoft Windows
I. Microsoft Windows as an operating system
1. Has a memory manager allowing a large 32-bit address space
2. Is thoroughly GUI -- possible to port Trestle to Windows
3. Has crude multi-tasking, although not pre-emptive.
4. Can threads actually coexist with Windows?
J. Buying much of a port
Probably not cost-effective. Borland will sell something
for around $500,000.00, but it wouldn't be nearly a turnkey
product.
K. How about having a grad student do the port?
Consensus was against this as too difficult a task for a
grad student -- apparently grad students aren't what they
were ten years ago.
L. Toughest single problem: threads
II. Concerning non-PC backend issues
Little time is currently spent on backend issues.
A. Mips microcode
B. GNU RTL
1. A proven design for multiple architectures
2. Allows re-use of backends already produced (or to be
produced in the future) for the GNU project.
C. Bytecode
1. Pros: ease of implementation
2. Cons: difficulty of integrated environment and debugging.
D. ANDF
E. "Oberon"
Really a placeholder for any quick and dirty code generator.
F. Apple Macintosh
If something exists for a long time on Macintoshes, there is
a tendency for PC guys to ask for a similar product. It was
noted that Apple had recently made a change internally from
Object Pascal to C++.
G. Pattern matcher
Seen as a potential back end, to allow for ease in attaching
an automatically generated code generator.
H. Global Optimizer
Seen as a potential back end, since the code generator can
be automated to a greater extent.
--------------------
Extensions to Trestle
Steve Glassman
Attendance 8
Mark Manasse, Steve Glassman, Marc H. Brown, Jim Meehan (DEC SRC)
Frode Odegard (Odegard Labs)
Dave Nichols, David Goldberg (Xerox PARC)
Michel Dagenais (Ecole Polytechnique Montreal)
Executive summary:
Non-SRC'ers expressed wishes for several high-level applications
and/or VBT widgets.
SRC'ers indicated wishes to do some low-level improvements to
Trestle
Neither group promised to do anything
Minor announcement: The Trestle tutorial, along with the vbtkit and FormsVBT
packages, has been released to gatekeeper for anonymous FTP. Documentation
for the Trestle tutorial is SRC report 69. Initial documentation
for vbtkit is on gatekeeper.
Details:
Dave Nichols expressed wishes for a rich-text editor/viewer and an
embedded editor. David Goldberg requested TCL-like functionality
which was permuted into the possibility for an RPC interface to FormsVBT.
We got off on a discussion of Graphical FormsVBT editing.
Mark Manasse presented a list of things he would like a chance to
do to Trestle:
Exposing X-graphics operations (and extensions like PEX and Display
Postscript) to Trestle clients (would/could/should be non-portable)
Shared memory putImage operations for pixmaps and video
Double buffering and capture improvements (less copying of bytes)
Well structured JoinVBTs (VBT's with multiple parents for multiple
views of a single common VBT) extending over multiple screens/screen
types
Teleporting Trestle windows to other screens or servers.
Dealing with X ICCCM changes
Fill out unimplemented Trestle functionality (out of domain tracking,
etc.)
Observing Windows NT as a potential porting target
Getting at window system state (Xdefaults) cleanly
We got off into a discussion of application configuration/Look and
Feel (TM) - whether Trestle applications should or should not look
like <your favorite window LAF(TM)> windows, even if the client
might want them to (and even if some people didn't think the client
should want them to).
We got off on a discussion of editable graphics, structured graphics
and ZSplits (until, mercifully, time ran out...)
--------------------
RPC Systems
Andy Hisgen
At the RPC session, we had discussions of the approach of using SUN
RPC (as presented earlier in the day by Dave Nichols of Xerox PARC)
and of SRC's new Network Objects project. Among the issues that came
up were tuning time-outs for SUN RPC, callback RPC's (that is, RPC's
which wish to pass a procedure and an RPC binding that the callee can
call back), opaque remote references, and distributed garbage
collection. We also discussed resource limits in the implementation
of RPC (fixed-size buffer pools and the like) and how these can impact
applications which stress the resources to their limits.
--------------------
Cheers,
John
Some good concepts are an important basis for a new language. But if the
language designers can't react promptly to new developments (i.e., at least
talk about them), the language *will* be dead before you know it. This is no
vote for being opportunistic, just for being open to new suggestions.
>Sam Harbison suggested that no changes at all are better than a few
>good changes, and that it is important to be able to point new users
>to a single document that describes the language, not make them apply
>deltas to the existing documents. Christian Jacobi said that small
>improvements are not worth the disruption. He also said that there is
>a difficult tightrope walk between promising enough stability and
>getting the necessary changes done. He thought that the language
>committee handled the 12 changes quite elegantly in this respect.
>
Hmm, does this mean that language changes should be made in large chunks?!
I don't see any good in this. In fact, I would want changes to come one at
a time, especially if I have to modify my sources.
Of course, there should be an up-to-date language description at any time.
But in what way does this oppose new developments? Just change the language
report accordingly and there we are.
>Dave Detlefs argued the opposite position, that Modula-3 has a small
>user community, so now is a good time to make changes. Modula-3 might
>need to change to compete with C++ templates. Eliot Moss pointed out
>that the designers might have an inaccurate picture of the potential
>impact of changes, since they and most users are at SRC where the impact
>is high. Eliot also suggested that compilers with experimental
>features should offer a conforming mode.
>
Very right. Even more important than the small user community seems to be
the fact that there is as yet only one M3 compiler. This takes away all the
standardization overhead.
>It was suggested that comp.lang.modula3 could be used as a forum for
>proposing changes, but there seemed to be agreement that posting is too
>easy and that the process for proposing changes should be more
>arduous. Someone asked how well comp.lang.modula3 is working, and
>those present were not too dissatisfied, although there were broad
>complaints that there is too much quantity and the signal-to-noise
>ratio is low. No one present volunteered to moderate a Modula-3
>newsgroup.
>
WHAT?! Ever taken a look at comp.lang.Ada, ..., not to mention comp.lang.C++?
The C++ group comes up with 30-60 (!) articles *every day*! If you think
signal-to-noise ratio is low in comp.lang.modula3, just look into a few other
groups. Now there you'll find really idiotic postings. The modula3 group is one
of the most constructive and serious I know of. We surely don't need moderation.
And talking about quantity: IMHO, there should be much *more* discussion about
conceptual aspects of the language. Most of the time, I have the impression that
not many people are overly interested in improving M3 - either they like it and
use it or they don't (flame on - sometimes this impression extends to the
language designers themselves - flame off).
Anyway: what alternatives have you considered? Discussion about M3 in the
newsgroup is natural - that's what it's good for. So what's the problem? To
take a frequent look into the postings if there might be something valuable
in there? That shouldn't be too hard. Or do you guys think you will have all
the good ideas by yourselves? (Little post-flame here.)
---
Peter Klein E-Mail: p...@rwthi3.informatik.rwth-aachen.de
Lehrstuhl fuer Informatik III Tel.: +49/241/80-21320
Ahornstrasse 55 Fax.: +49/241/80-21329
RWTH Aachen
D-5100 Aachen
Germany
The older I get, the more conservative toward language changes I
get. Maybe that's just a consequence of seeing how C++ is evolving.
Too many of the proposals are of the form, "Language X has this
nifty feature, can we force it into C++...?"
The principal reasons for suggesting a language change (IMHO) should
be:
1) To clarify an ambiguous area of the language definition.
2) To better support an area where the language is
a) discovered to be deficient AND
b) discovered to be a widely used area of the language
(or was intended to be a widely used area)
I think M3 generics fall under (2) above. I would like to see some
hard thinking done in the area of a cleaner, simpler definition of
generics that preserves abstraction.
-- geoff
Geoff Wyant
wy...@centerline.com
Centerline Software, Inc.
(Formerly Saber Software, Inc)
10 Fawcett Street
Cambridge, Ma.
01238
I think this is correct -- that is, language change is not necessarily
a bad thing, but it has been a bad thing for C++. For instance, if
someone could figure out what multiple inheritance was supposed to
mean, and what the useful idioms are, and how to make it interact
cleanly with the rest of the language, then it might be good to add it
to Modula-3. I hear good things about mix-ins from NeWS and CLOS
programmers. However, if we don't understand it, or cannot capture
the useful idioms, or if it has some vile interaction with any existing
good thing, then perhaps we can live without it.
David Chase
Sun
>IMHO, the crucial aspect of stability is that a valid M3 program today is
>still a valid program tomorrow. Adding new features which simply extend the
>capabilities of the language (like VIRTUAL OBJECT) doesn't affect stability
>seriously. Other changes, which might require that existing programs be
>adapted, should be included carefully. But why discuss this in terms of
>every n months? If there is a consensus that a certain feature is wanted and
>needed, why not include it immediately? If someone personally dislikes this,
>well - nobody is forced to work with an up-to-date version of the language.
>Just stick with your old compiler until you think it's time to install a new
>one.
One point you seem to have missed is that language users form a community.
If I choose to stick to version X of a language definition and everybody
else moves to X+1 then I can't use or understand some of the code which
is generously donated every now and again. Changing a language definition
too often => suicide for the language. As an example, has anybody tried using
Icon? Versions of Icon are all upwardly compatible and add new features which
only Icon fanatics would bother to learn to use. The basic core of Icon is
hard enough to learn as it is.
In addition, for users who do not maintain their own computing facilities,
it can be hard to convince an overworked systems manager to reinstall a
little used (in terms of number of users) language compiler every year.
I don't know if you've read my earlier posting on the language changes
discussion, but somehow you are heading in exactly the direction at
which I flamed a bit. I agree with both of you up to the point that the
C++ way of handling changes is just as bad as the rest of the language.
Of course, every development step has to be considered carefully. Especially
in a language which claims to be ``compact'', interactions between new ideas
and existing features should be investigated thoroughly.
Now, learning from the C++ mess should not result in taking the opposite
direction. My impression of the M3 developers (this includes everyone who
feels pointed at, not just the guys at SRC) is that this thorough
investigation just isn't happening. Everyone is just saying something like
"Hey, what's the problem? We've got a nice little language here, which
could be better, but we don't know in what way, so ..."
Since you mentioned the MI thing again: I still don't see any problem with
it. If you remember my first article, I didn't say: "Gee, MI is such a
fantastic thing, we simply *must* have it", I just wanted to encourage
a discussion about the pros and cons of it. But, if anything, the
discussion almost convinced me that it actually *is* useful. The pro side
provided some IMHO good examples, where MI is a natural way of solving
a problem. The con side came up only with "MI is hard to understand" (I
still don't believe that, examples?) and "We have MI in C++, but we don't
use it" (this also proves nothing). Even more, if I haven't missed
anything in John DeTreville's Users Group Meeting Report, some users
found MI worth discussing, but nothing came from it. Now this isn't
exactly what I would call ``thorough investigation''.
My personal conclusion from this is that there should be *more* discussion
about proposed new language features than in C++. At the moment, it is
far less. Don't forget that some people consider some things essential
which you might think of as unnecessary or even bad. If you don't give
these people at least some reasons *why* it's not in M3, they'll never
use it.
Chris
--
Christopher Robin Thewalt (the...@ce.berkeley.edu)
Dept. of Civil Engineering These opinions are not necessarily
University of California, Berkeley shared by my employer...
! a problem. The con side came up only with "MI is hard to understand" (I
! still don't believe that, examples?)
It doesn't have to be. I find the Sather way of doing MI very easy to
understand. Quoting the Sather report:
"A class 'A' inherits the features of another class 'B' when the
name 'B' appears in A's feature list. We say that B is a parent class
and that A is a child. The semantics is exactly as if the features
of B were textually copied into A at the point the name appears.
Any references that appear in the inherited feature definitions will
be resolved in the new namespace. Later features in a class with
the same name as earlier ones override the earlier definitions. This
means that the order in which classes are inherited can have an
effect on the meaning. A class may inherit from many other classes.
This is called MULTIPLE INHERITANCE."
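The order-dependence in that last sentence is easy to demonstrate. Python is
not Sather, and its method resolution gives precedence to *earlier* parents
rather than later ones, but the point that reordering the parent list changes
the meaning carries over directly:

```python
# Order-dependent multiple inheritance, illustrated in Python.  (Sather
# lets later features override earlier ones; Python's MRO prefers earlier
# parents -- the opposite convention, but the same order-dependence.)
class Stack:
    def add(self, xs, x):
        return xs + [x]        # append at the end

class Queue:
    def add(self, xs, x):
        return [x] + xs        # insert at the front

class A(Stack, Queue):         # Stack listed first: its add() wins
    pass

class B(Queue, Stack):         # Queue listed first: its add() wins
    pass

print(A().add([1, 2], 3))      # [1, 2, 3]
print(B().add([1, 2], 3))      # [3, 1, 2]
```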
Depending on the language, it can be made to look more difficult. But
the concept is in that paragraph.
--
...............................................................................
Ari Huttunen They say the sky high above
90-7285944 is Caribbean blue... (Enya)
> It [multiple inheritance] was added to C++ about
> 5 years after the C++ type system was initially
> designed; again I believe this proved to warp the
> language considerably.
Oh? Can you give us some concrete examples of that warpage?
For instance, can you show a case where the existence of
multiple inheritance in C++ inconveniences programmers that
don't use it?
--
--Andrew Koenig
a...@europa.att.com
I don't think that's exactly the problem. It's that those people who
*want* to use multiple inheritance have to deal with its problems and
evolutionary vestiges.
A difficult-to-use MI also affects the way that libraries are designed
and implemented, and a programmer not directly using MI often uses
libraries. Unfortunately, despite OOP becoming so popular, there's still
an emphasis on the language rather than on really good libraries.
--------------------------------------
Personally, I don't like C++ at all--I think it's much too complicated
and difficult for what it does. The syntax is too hard for humans to
parse.
One of the languages I like best is Sather. It was amazing
to nearly completely understand it in 45 minutes. It's very simple.
Perhaps a bit too spare, but otherwise I like it a lot.
It has multiple inheritance and parameterized types and for the first
time I really understood what these did. It seems, though, that classes
can only take types as arguments, and not values. I.e. with a generic
list class you can make a list of foobars class, but not a list of
foobars of size 11.
One nice thing is that it generates code made *specific* for the
parameterized class, and not only the general case with tons of indirect
function calls.
I might actually start using it for my work were it not for the
fact that there isn't even a "do loop" construct, i.e. only a general
loop and you have to 'manually' increment a counter. It's simply a
minor syntactical convenience, but such a common case (especially
in numerical programming). (Of course, I'd really like index notation
for arrays, but that's another long story...)
: --
: --Andrew Koenig
: a...@europa.att.com
--
-Matt Kennel m...@inls1.ucsd.edu
-Institute for Nonlinear Science, University of California, San Diego
But you will have difficulty finding language
users or library developers for a brand-new
language which doesn't contain at least all
really important concepts of its direct competitors
Eiffel, C++, etc.
For instance, our group is busy with the development
of integrated software development environments,
and we have implemented many (generic) tools using
the language Modula-2 (producing some hundred
thousand lines of code). There is e.g. a graph-oriented
nonstandard DBMS which is currently extended by a
front-end for a data modelling and database programming
language.
This DBPL offers the concept of MI and we are
continuously using MI on the conceptual level as
well as on the implementation level. Up to now
we have to implement inheritance and dynamic
binding in Modula-2 using many tricks (type-casts,
procedure arrays, ...).
You may guess from the above written lines that we
are looking for another implementation language
* which has a Modula-2-like "look&feel"
* which doesn't permit a C-like style of programming
* and which supports MI.
Therefore, we would like to use Modula-3, but we will
never use a language forcing us to go on simulating MI.
To summarize: it's true that nobody will try to use a
new language without useful standard libraries, but
it's also true that nobody will spend much effort to
supply libraries for a new language which so far has
neither a large user community nor important new
concepts like MI.
Dr. Andy Schuerr,
Lehrstuhl fuer Informatik III
RWTH Aachen
D-5100 Aachen (Germany)
Big, bloated languages are made up of lots of little things...
--
James Hague
exu...@exu.ericsson.se
In article <1992Jul13....@exu.ericsson.se> exu...@exu.ericsson.se (James Hague) writes:
Big, bloated languages are made up of lots of little things...
What bloat does complex number support cost when you already have all
the operators you need for integer and floating point support? While
I might back off the exponentiation operator request, since you can
get by without it with minor pain (although I think it is silly not to
include it), the cost of supporting complex numbers on the user code
side is way too great considering the minor impact it would have on
language syntax.
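Python (to pick a language with built-in complex numbers) makes the point
concrete: the existing numeric operators carry over unchanged, so the
syntactic cost is essentially nil:

```python
# Complex numbers reuse the operators already defined for integers and
# reals, so language-level support adds almost no new syntax.
a = 3 + 4j
b = 1 - 2j
print(a + b)      # (4+2j)   -- same + as for integers
print(a * b)      # (11-2j)  -- same *
print(abs(a))     # 5.0      -- |3+4i| = sqrt(9 + 16)
print(a ** 2)     # (-7+24j) -- even exponentiation comes for free
```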
There's a couple of problems -- sorry if I didn't go into details.
(1) Define the semantics of TYPECASE (which case is chosen?)
(2) Can we add MI without slowing down SI? (This is a trick
question, because you need to worry about both current
(sleazy compilers) and future (good compilers) performance.)
(3) MI has some weird interactions with opaque supertypes.
Suppose you get something like (pardon my mangling of the
syntax):
TYPE X <: OBJECT;
TYPE Y <: OBJECT;
A = X OBJECT METHODS ... END;
B = Y OBJECT METHODS ... END;
C = A,B OBJECT METHODS ... END;
Now, what if it turns out that X and Y actually share
a common ancestor? Are the data fields and methods shared, or no?
(sharing is not consistent with encapsulation -- my conclusion is
that sharing should not occur in this case. However, for mix-ins
it appears that sharing is what you want. One resolution is to
only allow sharing to occur when it is not hidden. However, I
don't know if this can be done in a consistent and reliable
fashion; what happens if there are multiple views of the objects
involved?)
(4) How does narrow work in the presence of repeated inheritance?
(a) what is the right answer?
(b) can we implement this efficiently?
That is, suppose you have:
TYPE X <: OBJECT;
A = X OBJECT METHODS ... END;
B = A,X OBJECT METHODS ... END;
C = A,B OBJECT METHODS ... END;
Suppose I narrow an "X" to an "A", given that what I've really got
is a C. Which one? How efficient is "NARROW" now?
(5) Discover and document any remaining semantic/implementation
gotchas.
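Questions (3) and (4) can be made concrete in Python, which resolves the
diamond the CLOS way: the common ancestor is shared, so there is exactly one
copy of its state, and the narrowing test (isinstance, Python's rough
analogue of NARROW) succeeds for every view of the object. This illustrates
one possible answer, not a claim about what Modula-3 should do:

```python
# The diamond from question (3), in Python.  Python shares the common
# ancestor (one copy of Base's state in C), the way CLOS does; C++ lets
# the programmer choose via virtual vs. non-virtual bases.
class Base:
    def __init__(self):
        self.count = 0         # one field in the shared ancestor

class X(Base):
    pass

class Y(Base):
    pass

class C(X, Y):                 # diamond: C inherits Base via both X and Y
    pass

c = C()
c.count += 1
print(c.count)                             # 1: a single shared counter
print(isinstance(c, X), isinstance(c, Y))  # True True: the "NARROW"
                                           # succeeds for both views
```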
My concern about efficiency may seem misplaced, but if what you're
after is flexibility with no major concern for performance, you can
get that already from CLOS. (And yes, I am *very* interested in
Dylan, which I am told is something like Scheme+CLOS.)
Comparisons with bullet-items on the C++ Real-Soon-Now list are also
unfair. It could be that they are unimplementable, or come with
horrible interactions, or maybe there is more sizzle than steak (e.g.,
maybe the feature is "provided" but in a useless fashion).
David Chase
Sun
! One of the languages I like best is Sather. It was amazing
! to nearly completely understand it in 45 minutes. It's very simple.
! Perhaps a bit too spare, but otherwise I like it a lot.
! I might actually start using it for my work were it not for the
! fact that there isn't even a "do loop" construct, i.e. only a general
! loop and you have to 'manually' increment a counter. It's simply a
! minor syntactical convenience, but such a common case (especially
! in numerical programming). (Of course, I'd really like index notation
! for arrays, but that's another long story...)
Take a look at lib/base/do.sa. It implements some loop constructs. You
are able to say things like:
c: DO := DO::times(100); until c.is_done loop ..foobar.. end;
No, I haven't tried to use it.
(What do you mean by index notation?)
I agree. We should get rid of a few more inessential little things.
Let's see... If we eliminated the division and subtraction operators,
ah, what about multiplication?
I'm getting quite enthusiastic about this, perhaps we only need the Boolean
data type and a couple of logical operations. We can implement everything
else in terms of these. It may be a bit of trouble, but, after all, as it's
in the interest of a more parsimonious language, I'm sure to get a lot of
support.
I should remind people that I'm not a computer scientist, but a
(probably future unemployed) physicist.
: ! I might actually start using it for my work were it not for the
: ! fact that there isn't even a "do loop" construct, i.e. only a general
: ! loop and you have to 'manually' increment a counter. It's simply a
: ! minor syntactical convenience, but such a common case (especially
: ! in numerical programming). (Of course, I'd really like index notation
: ! for arrays, but that's another long story...)
:
: Take a look at lib/base/do.sa. It implements some loop constructs. You
: are able to say things like:
: c: DO := DO::times(100); until c.is_done loop ..foobar.. end;
I did. It still seems unnatural. Seems, hell. It is. Take a look at
it, and be honest. I think "do loops" were deleted from Modula-2 in
going to Oberon (an intentionally small language, like Sather), and
reinstated after practical experience in Oberon-2. A simple, very common
concept deserves a simple, idiomatic statement.
: No, I haven't tried to use it.
:
: (What do you mean by index notation?)
Implied iteration and summation over repeated indices. (a.k.a. Einstein
summation convention). It's a common notation in math and science--computer
or no.
E.g. Vector A = Matrix B times vector C:
A(i) = B(i,j)*C(j)
or more explicitly
"for all i" A(i) = "sum over j" B(i,j)*C(j).
This concept subsumes nearly all of Fortran 90's matrix & vector operations.
Explicitly parallel, too. In comp.lang.misc there was a long debate
on this a couple of months ago. Jim Giles also had the same position, and
in fact I think he even designed specific extensions to Fortran
to express this concept. (In fact, if I were to design an optimizer for
an F90 compiler, I might want to internally represent array
operations in this generalized form so I could write only one code generator.)
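As a concrete reading of the A(i) = B(i,j)*C(j) notation, here is a minimal C++ sketch with the implied iteration and summation written out as loops; the names matvec, Matrix, and Vector are illustrative, not from any proposed extension:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A minimal sketch (illustrative names) of A(i) = B(i,j)*C(j), with the
// implied "for all i" and "sum over j" made explicit.
using Vector = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;

Vector matvec(const Matrix& B, const Vector& C) {
    Vector A(B.size(), 0.0);
    for (std::size_t i = 0; i < B.size(); ++i)      // "for all i"
        for (std::size_t j = 0; j < C.size(); ++j)  // "sum over j"
            A[i] += B[i][j] * C[j];
    return A;
}
```

An index-notation extension would let the compiler derive exactly these loops from the one-line formula, which is what makes the generalized form attractive as an internal representation.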
Despite what all the proponents of OOP say, either explicitly or by
their textbook examples, I do *not* think that a class structure is the
best solution for this problem (i.e. matrix and vector and tensor classes
and overload the + and the - and the blah blah blah). There's another solution
that's cleaner and more general, at least in the (important!) domain of
array operations.
In case you're wondering: yes, I do think that iteration and arithmetic
on arrays deserve special-case attention in the language. I think possible
solutions can be elegant and clean, with strong non-computer precedents, and
need not interfere with other aspects of the language at all. Nobody's done it
really 'right' yet. (Sather with index operations might be very attractive,
though. It would have that 'killer' feature that nobody else has, and do
all the normal stuff more easily. With a good enough library and optimization
it would be, at least technically, a formidable competitor to Fortran.)
: --
: ...............................................................................
: Ari Huttunen They say the sky high above
: 90-7285944 is Caribbean blue... (Enya)
--
-------------------------------------------------------------------------------
Marc Wachowitz, m...@gandalf.ki.fht-mannheim.de, 75...@novell1.rz.fht-mannheim.de
* wonder everyday * nothing in particular * all is special *
Theoretically, you need only two counters, increment, decrement, and
compare with zero. Writing efficient programs is a bit harder though. :)
Seriously, to quote from Systems_Programming_with_Modula-3, "C.A.R. Hoare
has suggested that as a rule of thumb a language is too complicated if it
can't be described precisely and readably in fifty pages. The Modula-3
committee elevated this to a design principle.... There are so many good
ideas in programming language design that some kind of arbitrary budget
seems necessary to keep a language from getting too complicated."
Tony Lai
--
Usenet. The triumph of noise over signal.
> (1) Define the semantics of TYPECASE (which case is chosen?)
This seems pretty easy; the current rule says that the first case arm
whose type is a supertype of the case expression is chosen, which
seems to work just as well in the MI case.
> (2) Can we add MI without slowing down SI? (This is a trick
> question, because you need to worry about both current
> (sleazy compilers) and future (good compilers) performance.)
The C++ experience seems to answer yes to this question, though, as
you implicitly point out below, this experience may not carry over to
Modula-3.
> (3) MI has some weird interactions with opaque supertypes.
> Suppose you get something like (pardon my mangling of the
> syntax):
>
> TYPE X <: OBJECT;
> TYPE Y <: OBJECT;
>
> A = X OBJECT METHODS ... END;
> B = Y OBJECT METHODS ... END;
>
> C = A,B OBJECT METHODS ... END;
>
> Now, what if it turns out that X and Y actually share
> a common ancestor? Are the data fields and methods shared, or no?
> (sharing is not consistent with encapsulation -- my conclusion is
> that sharing should not occur in this case. However, for mix-ins
> it appears that sharing is what you want. One resolution is to
> only allow sharing to occur when it is not hidden. However, I
> don't know if this can be done in a consistent and reliable
> fashion; what happens if there are multiple views of the objects
> involved?)
Again, I'll argue from the viewpoint that C++ faces many of the same
problems and gives a solution that at least addresses all the
problems. As you point out, there are times when sharing is
appropriate, and times when it is not. (I tend to think that sharing
is almost always the right choice, but let's push that discussion.)
C++ lets you specify both: a supertype may be virtual or not; all
virtual instances of a type in a MI subtype lattice are collapsed into
a single node. Another principle guiding the design of MI in C++ is
that any ambiguity must be explicitly resolved by the programmer; in
the example above, if A and B had defined fields with the same name,
then a C method would have to narrow 'self' to an A or a B before
selecting that field.
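The virtual-base collapsing described above can be shown in a small C++ sketch (the struct bodies are mine, only the names Z, X, Y, C come from the example): with `virtual` inheritance both paths to Z reach one shared subobject, so the field exists exactly once in a C.

```cpp
#include <cassert>

// With virtual inheritance, the X and Y views of a C share one Z
// subobject, so the field a is unambiguous.
struct Z { int a = 0; };
struct X : virtual Z {};
struct Y : virtual Z {};
struct C : X, Y {};

int shared_field(C& c) {
    static_cast<X&>(c).a = 7;     // write through the X view
    return static_cast<Y&>(c).a;  // read through the Y view: same Z
}
```

Dropping the two `virtual` keywords gives each of X and Y its own copy of Z, and `c.a` becomes exactly the ambiguity the post describes, which the programmer must resolve by naming a path.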
So, one answer to the question is that we add some syntax allowing us
to specify whether a supertype is the-M3-equivalent-of-virtual. Then,
in the example you've given, in any scope in which it is revealed that
X and Y are both subtypes of Z, those revelations will say whether Z
is a virtual supertype or not. Let's say that Z has a field a. If
Z is a virtual supertype in both cases, then there is only one copy of
a Z in a C, and it is clear what c.a means if c is a C. If Z is not
virtual in both cases, then c.a is ambiguous; to eliminate the
ambiguity, the programmer would have to write NARROW(c, A).a or
NARROW(c, Y).a. If there is a scope in which it is revealed that
X is a subtype of Z, then there is still no problem, since c.a is
unambiguous in all the cases: if Z is not a supertype of Y, then there
is no conflict; if Z is a virtual supertype of both X and Y, then there
is no ambiguity, since c.a refers to the same field; and if Z is a
non-virtual supertype of both X and Y, then in this scope c.a must refer
to NARROW(c, X).a, since we don't know anything about Y and Z in this
scope.
> (4) How does narrow work in the presence of repeated inheritance?
> (a) what is the right answer?
> (b) can we implement this efficiently?
>
> That is, suppose you have:
>
> TYPE X <: OBJECT;
>
> A = X OBJECT METHODS ... END;
> B = A,X OBJECT METHODS ... END;
>
> C = A,B OBJECT METHODS ... END;
>
> Suppose I narrow an "X" to an "A", given that what I've really got
> is a C. Which one? How efficient is "NARROW" now?
Again, the C++ answer is: if A and X are virtual supertypes in all cases,
the right answer is clear, since there is just one subobject of each
type in a C. If A and X are not virtual supertypes, and x is a
variable whose static type is X containing a value whose allocation
type is C, then we have the following type graph:
          X (1)
          |
          |
(2) X     A
    |     |
    |     |
    A     B
     \   /
      \ /
       C
How did we get a variable with static type X and allocation type C?
We originally allocated a C and assigned to a variable of some
supertype. If we assigned it to a B variable there is no ambiguity;
we could then assign that result to an X variable and end up at (1).
If we assigned the C value directly to an A variable or an X variable,
however, there would be ambiguity, so the compiler would have to
complain. So, the C++ answer says that there is no way to get from a
C value to the X subobject labelled (2). (This ambiguity is one of
the reasons I recommend the ubiquitous use of virtual base classes
whenever I'm asked.)
Given that we're at the X labelled (1), NARROWs of the value to
any of A, B, or C are well-defined.
Perhaps the confusing example you want to bring out is if X is a
virtual supertype, but A is not. In this case you get
       X
      / \
     /   \
    +     A
    |     |
    |     |
    A     B
     \   /
      \ /
       C
Now we can assign a C value directly to an X variable, but what if we
NARROW it back to an A? How about: that's ambiguous, and therefore
illegal.
As an overall comment, note that there is a certain consistency in the
C++ approach to MI: if it isn't completely obvious what the
compiler should do, the compiler rejects the program and requires you
to be a little more explicit. This tends to prevent at least some
potential misuses of MI.
So, this little missive makes a small start at addressing what MI
would mean if we added it; it does nothing to argue that it would be a
good thing if we did. I happen to believe that it would; but I'll
leave that argument for another time. I look forward to further
debate.
Dave
Independently of the language, it _is_ more difficult if it is to remain
sensible even in complicated situations. According to the quote,
it would seem that parallel superclasses can interfere surprisingly
with each other in Sather.
However, I am more optimistic than e.g. Alan Snyder (if the quote,
"multiple inheritance is a good thing, but nobody knows how to do it right",
is correctly attributed to him). I think that MI can even be done right,
but have not studied whether and how it could be retrofitted to Modula-3.
----------------------------------------------------------------------
Markku Sakkinen (sakk...@jytko.jyu.fi)
SAKK...@FINJYU.bitnet (alternative network address)
Department of Computer Science and Information Systems
University of Jyvaskyla (a's with umlauts)
PL 35
SF-40351 Jyvaskyla (umlauts again)
Finland
----------------------------------------------------------------------
>My other sense is that the innovation should not come in the
>language, but in the accompanying libraries and tools. I would
>like to see people expending their energies on well-designed
>and innovative libraries; such things as an interesting data
>model and database integrated into the M3 environment, libraries
>for constraint-based programming, rule based systems, computational
>geometry, algorithm visualization, transaction based systems, ...
> ...
However, the existence of good (class) libraries for specialised purposes
tends to increase the demand for MI, since in many cases one has to inherit
some class from each library in order to exploit its facilities.
(This has at least been the experience of many people who have used
some generally available C++ class libraries.)
Well, if any class B inherits some other class A as _public_ (or protected),
and the designer of B has even the least doubt that somebody might later
use B in a "fork-join" (*) situation, it would be very sensible to declare
the inheritance as "virtual" (meaning a sharable superclass part).
However, besides the performance penalty, this causes two drawbacks:
1. If non-default initialisation for A is needed, every constructor
of every subclass (direct or indirect) of B must explicitly restate
that initialisation.
2. "Downcasting" a pointer of type A* to a pointer of type B* becomes
absolutely impossible (of course this is a very dangerous operation
even in the allowed situations).
(*) This means that there is some other direct subclass of A, say C,
such that neither B nor C is a subclass of the other,
and some further class inherits both B and C (directly or indirectly).
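Drawback 1 above can be seen in a small C++ sketch (the class names are illustrative): the virtual base is initialised by the most-derived class, so the initialiser a subclass writes for it is silently ignored whenever that subclass is not most-derived, and every further subclass must restate it.

```cpp
#include <cassert>

// The virtual base A is initialised only by the most-derived class:
// B's A(1) applies to a plain B, but is ignored when constructing a D.
struct A {
    explicit A(int v) : val(v) {}
    int val;
};
struct B : virtual A {
    B() : A(1) {}            // used only for a plain B
};
struct C : virtual A {
    C() : A(2) {}
};
struct D : B, C {
    D() : A(3) {}            // D must restate A's initialisation itself
};
```

This is the "every constructor of every subclass must explicitly restate that initialisation" burden: omitting `A(3)` in D would fall back to A's default constructor, which here does not exist.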
Shameless plug: see my paper, "A critique of the inheritance principles
of C++" (Computing Systems, Winter 1992), for an exhaustive discussion.
In article <23...@alice.att.com> a...@alice.UUCP () writes:
>Oh? Can you give us some concrete examples of that warpage?
>For instance, can you show a case where the existence of
>multiple inheritance in C++ inconveniences programmers that
>don't use it?
Okey-dokey, I'll bite. (Cut to the end to skip the long-winded stuff).
Right now, C++ (2.1, 3.0) requires that I read about virtual and
non-virtual base classes. Without MI, these wouldn't be there.
Right now, C++ (2.0, 2.1, 3.0) lacks the ability to cast from a base
type to a derived type in certain situations. It is, in fact, an
intellectual burden to discover and remember those situations. I
believe that this is a consequence of multiple inheritance, since
single inheritance does not require any offsets of the "this" pointer.
Right now, C++ (2.0, 2.1, 3.0) pretty much requires (but does not
enforce) virtual destructors in certain single-inheritance
situations. As you can see already, I don't recall quite when that
is, and this is a bug. I believe this weird requirement is a
consequence of multiple inheritance (same as above -- an offset "this"
pointer).
Right now, C++ (same versions) lacks run-time type checking. I don't
know why it doesn't have it, but I do know that it is more difficult
to define, implement, and explain this in the presence of multiple
inheritance. Having implemented this in the past for a language with
single inheritance, I know that it is not terribly hard in that case.
This is an example of where the addition of one feature may have
prevented/delayed the addition of another feature.
In the future, C++ may have some form of run-time type-checking. In a
language with single inheritance (like Modula-3) this can be
implemented in a way that it requires a constant amount of time to
perform any type check, and log(#cases) time to execute a typecase
statement (with arbitrary depth in the cases and the type tree). Will
C++'s versions of these statements have similar efficiency, assuming
that I don't use multiple inheritance? (I've been trying to figure
out how one might search the DAG quickly, but haven't had the time to
really make much progress.) This is an example of where the addition
of one feature may have prevented the best presentation or
implementation of another feature.
In the future, it might be nice to add garbage collection to C++. I
know it is easy to do a good garbage collection in a SI system, but
I'm not sure of what the MI interactions are. It does not simplify
the garbage collector or make it faster (the offset pointers are the
basic pain here; references cause similar problems).
I'm a little curious where all this goes with (say) persistent data,
or pickled data, or dynamically loaded libraries. I've done the
homework (other people have helped) to convince myself that I could
implement a single-inheritance type system that would provide
efficient NARROW and concurrent introduction of new types (no locking,
assumes certain consistency properties of the underlying memory
system). TYPECASE is a little harder, but I think about it from time
to time. I wouldn't be surprised if MI had bad interactions with
these features.
Note that these are features that I have used in the past (in the
Olivetti M-3 system) and wished that I had access to now in C++. I
have not experienced such a burning need for MI. This doesn't mean
that MI is never useful, but if most of the cases where it is useful
sound something like "A friend of a grandmother of an old boss of mine
was working on this window system..." then I remain skeptical. (This
is an exaggeration. Several people have told me good stories about
mix-ins. However, the C++ MI system is considerably larger than
mix-ins, and it might be possible to add mix-ins w/o all the other
restrictions.)
So in summary:
1. MI makes the language bigger, even if I confine myself to a SI
subset. (virtual base classes)
2. MI adds more rules that a SI user must follow.
(restrictions on down-casting; virtual destructors)
3. MI may have delayed other useful features.
(run-time types)
4. MI may prevent other useful features from being implemented in
their most useful fashion. Speaking for myself, I've used and
liked those features (run-time types, garbage collection,
pickled data), but MI has never done much for me.
David Chase
Sun
! I should remind people that I'm not a computer scientist, but a
! (probably future unemployed) physicist.
Warning! This text was written by a computer science student. ;-)
! : Take a look at lib/base/do.sa. It implements some loop constructs. You
! : are able to say things like:
! : c: DO := DO::times(100); until c.is_done loop ..foobar.. end;
! I did. It still seems unnatural. Seems, hell. It is. Take a look at
! it, and be honest. I think "do loops" were deleted from Modula-2 in
! going to Oberon (an intentionally small language, like Sather), and
! reinstated after practical experience in Oberon-2. A simple, very common
! concept deserves a simple, idiomatic statement.
It's very easy for me to be honest: that DO-loop looks awful. But I don't
think adding the DO-loop to Sather would solve more problems than it creates.
Should there also be for-, while-do-, do-while-, and do-until-loops? A single
loop construct makes for more readable code.
Also, in Sather a very common thing is a 'cursor' that goes through the
elements of a container class (like a list) in some order. The until-loop-end
style loop is very good for this: UNTIL we_have_gone_through_all_the_elements_
in_this_container LOOP use_the_element_in_the_container END;
! Implied iteration and summation over repeated indices. (a.k.a. Einstein
! summation convention). It's a common notation in math and science--computer
! or no.
! E.g. Vector A = Matrix B times vector C:
! A(i) = B(i,j)*C(j)
! or more explicitly
! "for all i" A(i) = "sum over j" B(i,j)*C(j).
! This concept subsumes nearly all of Fortran 90's matrix & vector operations.
! Explicitly parallel too. In comp.lang.misc there was a long debate
! on this a couple of months ago. Jim Giles also had the same position, and
! in fact I think he even designed specific extensions to Fortran
! to express this concept. (In fact if I were to design an optimizer for
! an F90 compiler, I might want to internally represent array
! operations in this generalized form so I could write only 1 code generator).
! Despite what all the proponents of OOP say, either explicitly or by
! their textbook examples, I do *not* think that a class structure is the
! best solution for this problem (i.e. matrix and vector and tensor classes
! and overload the + and the - and the blah blah blah). There's another solution
! that's cleaner and more general, at least in the (important!) domain of
! array operations.
The array operations are important to you. To others they might not be
so important. Furthermore, adding such constructs would change the language
radically. That seems a little too high a price for not having to use objects
and redefine their methods/member functions/features.
No-one forces you to use an object-oriented language against your will.
> Right now, C++ (2.1, 3.0) requires that I read about virtual and
> non-virtual base classes. Without MI, these wouldn't be there.
Virtual base classes are part of MI. A program that doesn't use MI
won't use virtual base classes either.
> Right now, C++ (2.0, 2.1, 3.0) lacks the ability to cast from a base
> type to a derived type in certain situations.
Those situations all involve virtual base classes.
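For illustration, a C++ sketch of one such situation (illustrative names): a cast from a virtual base to a derived class cannot be done with a static cast at all; `dynamic_cast`, which was added to the language after the versions discussed here, resolves it at run time, provided the base is polymorphic.

```cpp
#include <cassert>

// Down-casting from a virtual base: static_cast is ill-formed, so the
// run-time dynamic_cast is the only way back down.
struct Base { virtual ~Base() = default; int tag = 0; };
struct Derived : virtual Base { Derived() { tag = 1; } };

int downcast_tag(Base& b) {
    // static_cast<Derived&>(b) would not compile here.
    Derived* d = dynamic_cast<Derived*>(&b);
    return d ? d->tag : -1;   // -1: the object is not really a Derived
}
```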
> Right now, C++ (2.0, 2.1, 3.0) pretty much requires (but does not
> enforce) virtual destructors in certain single-inheritance
> situations.
This has nothing to do with MI. You need a virtual destructor if
you use a base class pointer to delete a derived class object.
In this respect, operator delete is much like any other member function.
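A minimal sketch of that rule (illustrative names): without the `virtual` on the base destructor, deleting through a base pointer would never run the derived destructor (formally, undefined behaviour).

```cpp
#include <cassert>

// Deleting a derived object through a base pointer needs a virtual
// destructor so that dispatch reaches ~Derived before ~Base.
static int derived_dtor_runs = 0;

struct Base { virtual ~Base() {} };
struct Derived : Base { ~Derived() override { ++derived_dtor_runs; } };

void destroy_via_base() {
    Base* p = new Derived;
    delete p;   // virtual dispatch runs ~Derived, then ~Base
}
```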
> Right now, C++ (same versions) lacks run-time type checking.
No fair complaining about parts of C++ that don't exist! :-)
Seriously, I don't see why MI should make this any harder:
it is already necessary to figure out which virtual functions
to call and any run-time type information can use the same
mechanism as virtual functions.
> In the future, it might be nice to add garbage collection to C++. I
> know it is easy to do a good garbage collection in a SI system, but
> I'm not sure of what the MI interactions are. It does not simplify
> the garbage collector or make it faster (the offset pointers are the
> basic pain here. References cause similar problems.)
I don't think it makes an awful lot of difference. Again, it doesn't
seem to be any more difficult than getting the right information into
virtual function tables.
> I'm a little curious where all this goes with (say) persistent data,
> or pickled data, or dynamically loaded libraries.
This is getting a little off the point. The original claim was that
`MI warps the language' and I asked for examples. All of the above
examples are either integral parts of MI, which therefore cannot be
said to be effects on the rest of the language, or speculations about
things that do not currently exist.
> 1. MI makes the language bigger, even if I confine myself to a SI
> subset. (virtual base classes)
Virtual base classes are part of MI.
> 2. MI adds more rules that a SI user must follow.
> (restrictions on down-casting; virtual destructors)
The first restriction applies only to programs using virtual base classes,
which are part of MI; the second is unrelated to MI.
> 3. MI may have delayed other useful features.
> (run-time types)
Every feature delays other useful features.
> 4. MI may prevent other useful features from being implemented in
> their most useful fashion.
I haven't seen any real evidence for this.
--
--Andrew Koenig
a...@europa.att.com
cha...@rbbb.Eng.Sun.COM (David Chase @ Sun Microsystems, Mt. View, Ca.)
> Right now, C++ (2.0, 2.1, 3.0) pretty much requires (but does not
> enforce) virtual destructors in certain single-inheritance
> situations. As you can see already, I don't recall quite when that
> is, and this is a bug. I believe this weird requirement is a
> consequence of multiple inheritance (same as above -- an offset "this"
> pointer).
Since you don't quite recall what the problem is I can't quite be specific,
but I see no such problem.
> Right now, C++ (same versions) lacks run-time type checking. I don't
> know why it doesn't have it, but I do know that it is more difficult
> to define, implement, and explain this in the presence of multiple
> inheritance. Having implemented this in the past for a language with
> single inheritance, I know that it is not terribly hard in that case.
> This is an example of where the addition of one feature may have
> prevented/delayed the addition of another feature.
That is pure conjecture, and wrong. C++ was designed without run-time
type information mechanisms because I disliked the effect of their use
in Simula programs. When I was convinced that having such mechanisms
was less trouble than not having them, I implemented it in two mornings
of real time, about 4 hours of solid work. The amount of code needed
(combined compiler and run-time support code) is less than 150 lines
of C++.
> In the future, C++ may have some form of run-time type-checking. In a
> language with single inheritance (like Modula-3) this can be
> implemented in a way that it requires a constant amount of time to
> perform any type check, and log(#cases) time to execute a typecase
> statement (with arbitrary depth in the cases and the type tree). Will
> C++'s versions of these statements have similar efficiency, assuming
> that I don't use multiple inheritance? (I've been trying to figure
> out how one might search the DAG quickly, but haven't had the time to
> really make much progress.) This is an example of where the addition
> of one feature may have prevented the best presentation or
> implementation of another feature.
Maybe, but that again is conjecture. The dumb and easy implementation of
a downcast involves n-1 calls of a trivial recursive function, where n is
the distance between the static type of the object and the desired static
type.
So, before I deal with your points, let me point out what kind of MI I would
support.
- A may inherit from B and C only if B and C themselves are subtypes of D. This
not only solves some of your problems, it also forces the designer to
think about common properties of B and C. This improves reusability, data
abstraction, etc. If this restriction forces you to include a class which
seems "artificial", you are probably composing two classes which should not
be composed (just because they have nothing in common).
- A should have only one copy of D's data fields. I can't think of any problem
where you would like to have two copies (one from B, one from C). This is
likely a case where an OBJECT b:B; c:C; END; should be used.
- If there are name clashes between data fields in B and C, these should be
resolved internally. Everything else will interfere with data abstraction.
- Methods from D, which are redefined differently in B and C, have to be redefined
in A. This can be checked statically.
>(1) Define the semantics of TYPECASE (which case is chosen?)
>
I agree, this is a problem. One simple solution comes to mind: for A, B, and
C, TYPECASE would only select D. This is not very satisfying, although it does
fit into the existing language. TYPECASE in general is a statement only useful in
SI systems. But this is not a fault of MI. If you were to add to some
language a feature that ruled out, let's say, pointers, would this imply that
pointers are not useful?
To me, TYPECASE is a construct for easy handling of objects with "variant RECORD"
semantics (correct me if I'm wrong). Such objects are SI by nature. My proposal for
a modified TYPECASE still allows such usage.
>(2) Can we add MI without slowing down SI? (This is a trick
> question, because you need to worry about both current
> (sleazy compilers) and future (good compilers) performance.)
>
Who knows? I'd like to take a chance. Many features that are common today were
regarded as "unimplementable" some years ago (and some still are; look at
the discussion about garbage collection in comp.lang.object). Most other OO
languages have MI, so it shouldn't be that bad.
>(3) MI has some weird interactions with opaque supertypes.
> Suppose you get something like (pardon my mangling of the
> syntax):
>
> TYPE X <: OBJECT;
> TYPE Y <: OBJECT;
>
> A = X OBJECT METHODS ... END;
> B = Y OBJECT METHODS ... END;
>
> C = A,B OBJECT METHODS ... END;
>
> Now, what if it turns out that X and Y actually share
> a common ancestor? Are the data fields and methods shared, or no?
> (sharing is not consistent with encapsulation -- my conclusion is
> that sharing should not occur in this case. However, for mix-ins
> it appears that sharing is what you want. One resolution is to
> only allow sharing to occur when it is not hidden. However, I
> don't know if this can be done in a consistent and reliable
> fashion; what happens if there are multiple views of the objects
> involved?)
>
Well, I've answered this already. I think they should be shared. Why does this
interfere with data abstraction? B and C (in my example) may act upon D's data
only through D's methods. This should preserve the integrity of D's data. B and C
offer methods to change their own data, as well as modified or complex combinations
of D's methods. So data hiding is supported (am I missing a point?).
The only problem I do see here is that A has to know that B and C both inherit
from D. But then: according to my proposal, if A may inherit from B and C, it
already knows that there *has* to be some common ancestor of B and C. A should be
aware of this, because it makes A itself a subclass of D. But this is not so
very different from SI: If A is a subclass of B and B of C, then A also knows
that it's a subclass of C. Otherwise, it can't use C's methods and the inheritance
is useless.
>(4) How does narrow work in the presence of repeated inheritance?
> (a) what is the right answer?
> (b) can we implement this efficiently?
>
> That is, suppose you have:
>
> TYPE X <: OBJECT;
>
> A = X OBJECT METHODS ... END;
> B = A,X OBJECT METHODS ... END;
>
> C = A,B OBJECT METHODS ... END;
>
> Suppose I narrow an "X" to an "A", given that what I've really got
> is a C. Which one? How efficient is "NARROW" now?
>
This is not allowed in my proposal, because there is no common superclass for
A and X. Besides, I can't see any sense in this class hierarchy.
>(5) Discover and document any remaining semantic/implementation
> gotchas.
>
Yes.
>My concern about efficiency may seem misplaced, but if what you're
>after is flexibility with no major concern for performance, you can
>get that already from CLOS. (And yes, I am *very* interested in
>Dylan, which I am told is something like Scheme+CLOS.)
>
I don't see any efficiency drawbacks from MI. Of course, there is more work
to do for the compiler. But how does it affect execution speed?
>Comparisons with bullet-items on the C++ Real-Soon-Now list are also
>unfair. It could be that they are unimplementable, or come with
>horrible interactions, or maybe there is more sizzle than steak (e.g.,
>maybe the feature is "provided" but in a useless fashion).
>
You're right. But nevertheless, M3 has to compete with C++ and others. And, as I
stated in some earlier posting, if you want people to use M3 instead of C++ (or
Sather, Eiffel, ..., all of which *have* MI), you have to give them some *really*
good reasons why M3 doesn't have MI. I'm not convinced yet. But any suggestions are
welcome.
Sorry, this is nonsense. I meant: D should only select A. If both B and C are
given, but not A, the ELSE-case is chosen.
In article <23...@alice.att.com> a...@alice.UUCP () writes:
>
>No fair complaining about parts of C++ that don't exist! :-)
No, exactly fair. Time spent designing and implementing multiple
inheritance is time not spent doing other things. This may not be
what you thought "warping the language" means, but it is good enough
for me.
>> [run-time type-checking]
>Seriously, I don't see why MI should make this any harder:
>it is already necessary to figure out which virtual functions
>to call and any run-time type information can use the same
>mechanism as virtual functions.
I think you are being a bit glib here. For a single-inheritance type
system, the virtual function tables contain a constant amount of
additional information describing the position of a type in a type
tree (the technique described in a letter in TOPLAS recently is
inferior). To determine if type A is an ancestor of type B, it
suffices to know that lr_pre_number(A) <= lr_pre_number(B) and that
lr_post_number(A) >= lr_post_number(B). See Paul Dietz, "Maintaining
Order in a Linked List", STOC 1982. For single inheritance, we're
talking about one word per VFT to a type descriptor, and each type
descriptor can contain as little as two words encoding pre- and
post-order numbers. I'm pretty sure it is not this simple for
multiple inheritance.
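The pre/post-order numbering test described above can be sketched as follows
(a minimal sketch with assumed names; in a real implementation the two numbers
would live in the type descriptor reached through the VFT, not in a Python
object):

```python
# Pre/post-order numbering for O(1) single-inheritance ancestor tests:
# A is an ancestor of B iff pre(A) <= pre(B) and post(A) >= post(B).

class Type:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)

def number(root):
    """Assign pre- and post-order numbers in one depth-first walk."""
    counter = 0
    def walk(t):
        nonlocal counter
        t.pre = counter; counter += 1      # number dropped on the way down
        for c in t.children:
            walk(c)
        t.post = counter; counter += 1     # number dropped on the way up
    walk(root)

def is_ancestor(a, b):
    # Constant-time subtype test: exactly two integer comparisons.
    return a.pre <= b.pre and a.post >= b.post
```

For example, with `Object <- A <- B` and `Object <- C`, `is_ancestor(A, B)`
holds but `is_ancestor(A, C)` does not.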
On the other hand, the TOPLAS technique seems like it could be hacked
to work for MI. The space cost goes up (O(max inheritance depth) per
VFT, I think), but the time cost stays low (assuming that the hack
works, of course). I'm not sure what happens if a new type is loaded
into a running system, since it could increase the inheritance depth,
and thus require changing the sizes of the VFTs.
The problem gets worse if you consider TYPECASE (if you choose to
implement it). Using techniques described in the Dietz paper, you can
implement typecase for single inheritance to run in O(log # cases)
time, with O(#cases) space cost. For multiple inheritance? (Again,
I'd love to be enlightened.)
>> 4. MI may prevent other useful features from being implemented in
>> their most useful fashion.
>I haven't seen any real evidence for this.
It's difficult to prove this, but I'll try a comparison with Modula-3.
The Olivetti M-3 implementation required < 10 people-years to obtain
roughly cfront functionality, PLUS garbage collection, PLUS automatic
makefile generators and dependency checkers, PLUS (an instance of) an
interpreter, PLUS automatic generation of marshalling code, PLUS a
compiler-server (cached interface files, a big win), PLUS run-time
types, PLUS a skeleton AST system used to build pretty printers and
implementation stub generators, PLUS a threads library (briefly ported
to multi-threaded Mach on a '386 before Olivetti shut the lab down --
presumably, we were insufficiently productive :-). Like C++, we
churned through a couple of versions of the language in the process.
Like C++, we targeted multiple architectures.
(Note that there are several interesting language tools in the list,
as opposed to "language features".)
Now, compare this with what has appeared for C++, a language of much
greater commercial significance. Either the crew at Olivetti (some
working part-time, some also writing papers, some working on another
continent) was phenomenally good, or else we were working on a much
easier problem. I'd like to think that we were good, but we weren't
that good. It's been two years since I last worked on Modula-3,
and presumably more than 10 people-years have been spent since then
(at various software houses) improving C++ and building tools, and
that's on top of whatever was already available in 1990. Where are the
goodies? *Something* has prevented those other features from
appearing. (To be fair, I'm certain that backward semi-compatibility
with C is the biggest obstacle.)
David Chase
Sun
>Maybe, but that again is conjecture. The dumb and easy implementation of
>a downcast involves n-1 calls of a trivial recursive function where n is
>the distance between the static type of the object and the desired static
>type.
"Trivial" is not necessarily cheap, depending upon the shape of the
type DAG and the dynamic frequency of type checks (*). It is also a
bit worrying (as a potential future user of RTTI in C++) that the best
refutation available to my inefficiency fears is "that again is
conjecture". This implies that a feature is being added to the
language without a clear handle on the costs of its use (because if
you had a cost estimate, and it was any good, you could remove my fears
and make me look like an ignoramus by revealing the efficient
algorithm, instead of just saying "conjecture". Therefore I must
conclude that no good time bound is known).
Is this good language design? (Unfair question -- given that MI has
already been added, it is almost impossible to remove. Since RTTI is
necessary, it must also be added, and let the costs fall where they
will. However, this supports the assertion "MI has warped the
language", if one believes that RTTI is more important than MI, or if
one believes that the wrong kind of MI was chosen w/o taking the needs
of an important feature like RTTI into account.)
(*) I believe Mick Jordan has reported substantial speedups in his AST
system by making all the types visible and "flattening" the TYPECASE
statements into ordinary switch statements. The last time I was near
M-3, TYPECASE and NARROW were implemented using the algorithm
Stroustrup describes, since (as noted in Stroustrup's post) that is
the easy way to do it. This is evidence that trivial is not
necessarily cheap.
David Chase
Sun
J. Eliot B. Moss, Associate Professor
Department of Computer Science
Lederle Graduate Research Center
University of Massachusetts
Amherst, MA 01003
(413) 545-4206, 545-1249 (fax); Mo...@cs.umass.edu
Take the tree of types and walk it in a depth-first manner. Using a
global counter, drop numbers off at the nodes, as follows. For an
internal node, drop one number off before visiting the children and
another afterwards; for leaves, simply drop off one number. The type
code of a type is simply the first number dropped at that type's node
in the tree.
TYPECASE consists of taking the typecode of the item being cased and seeing,
for each arm in sequence, if the typecode falls within the range allowed for
the arm. If the arm is a leaf type, then there is only one possible value; if
the arm is an internal node, then there is an upper and lower bound.
A simple implementation would consist of a series of IF tests, each testing
either equality or two bounds.
Almost any implementation must special case NIL since it is generally not
represented as a pointer to a thing that can supply a type code. I will assume
that that is done first, before any IFs or CASEs.
In SI at least, the ordering dependence among the cases can be removed by
getting rid of later cases that are covered by earlier ones (which I think
is an error anyway; I don't have my M3 book here). So we can break it all
down into ranges of
typecodes and case arms that go with them; a given arm may have several
ranges, since earlier arms may handle subtypes. Anyway, given this viewpoint,
it is easy to see that an O(arms) linear search can be turned into an
O(log(arms)) binary search or, by listing each individual type code in the
range of interest, into an O(1) indexed branch.
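Under the numbering just described, every subtree owns a contiguous typecode
range, so a TYPECASE arm reduces to one or two integer comparisons and the
whole statement becomes a search over disjoint ranges. A hedged sketch of the
O(log arms) binary-search version (names and the range layout are assumptions,
not the SRC compiler's actual code):

```python
import bisect

def make_dispatcher(ranges):
    """ranges: disjoint (lo, hi, arm_index) triples, sorted by lo.
    Each triple covers the contiguous typecodes of one arm's subtree."""
    los = [lo for lo, _, _ in ranges]
    def dispatch(typecode):
        # Find the last range starting at or before this typecode...
        i = bisect.bisect_right(los, typecode) - 1
        if i >= 0:
            lo, hi, arm = ranges[i]
            # ...and check the typecode actually falls inside it.
            if lo <= typecode <= hi:
                return arm
        return None   # no arm matched: take the ELSE case
    return dispatch
```

Listing every typecode individually instead of keeping ranges would turn this
into the O(1) indexed branch (a jump table) mentioned above.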
Now Mick can correct me, but I recall discussing this with him a bit at the
time of the MUG, so even if some details are a little off, I think I can clarify
the general point: if you know all the types involved, you can build the type
tree and know the type codes at compile time, and turn the sequential-IF
implementation of TYPECASE into a CASE statement, which could result in a
switch statement in the code output by the SRC compiler (though I am not sure
that it actually *does* result in a switch in C; the SRC compiler may insist
on deciding how to carry out the CASE that Mick builds).
This is an interesting argument in favor of the type numbering, etc., used by
the SRC system, which is different from what we've designed for the GNU system.
I believe that we test TYPECASE arms sequentially, using a different narrowing
test. We consider the depth of the two types in the tree. Let's call the
(allocated, not static) type of the object the *source* type, and the type we
are trying to narrow to the *destination* type. The destination type's depth
(distance from the root) in the type tree must be <= the source type's depth,
and the source type's supertype at the depth of the destination type must be
exactly the destination type. So we have a <= test of a variable against a
compile-time known constant, and an equality test of a variable against a
constant (we store the vector of supertypes somewhere easily accessible from
the type). The advantage of our system is that types can be added without
affecting the actual tests in any way. We think this may be important for
persistent systems, where new subtypes may be added (and code for them loaded
dynamically from the store) after a program is compiled. In such systems the
type tree is not static, and the SRC approach breaks down. We can
probably still come up with an O(log(arms)) implementation of our scheme; it
is just hard to do it O(1) in the face of a type tree growing dynamically.
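The depth-plus-supertype-vector test described above might be sketched like
this (the names are assumptions; in the real system the destination's depth
and identity are compile-time constants, and the vector is reached from the
type descriptor):

```python
# Narrow test that survives dynamically added subtypes: each type records
# its depth and the vector of its supertypes from the root. T narrows to D
# iff depth(D) <= depth(T) and T's supertype at depth(D) is exactly D.

class Type:
    def __init__(self, name, parent=None):
        self.name = name
        if parent is None:
            self.supers = [self]               # root: vector is just itself
        else:
            self.supers = parent.supers + [self]
        self.depth = len(self.supers) - 1      # distance from the root

def narrows_to(source, dest):
    # One <= comparison plus one identity test; adding new subtypes later
    # never changes the outcome of existing tests.
    return dest.depth <= source.depth and source.supers[dest.depth] is dest
```

Note that the test never inspects any type other than the two involved, which
is why loading new subtypes from a persistent store cannot invalidate it.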
All of this breaks down in the MI setting. I can't think of an O(1) narrow
test, much less an O(log(n)) TYPECASE test, that is at all space efficient and
that works in the face of dynamically growing type DAGs. Some of the folks
with the logic language project at MCC (I think Hassan Ait-Kaci was one) did
some work on making DAG narrow tests efficient, but I believe they assumed a
static DAG so you could do a bunch of precomputation.
It may turn out that the way to do it is to use a potentially expensive
algorithm to check subtypes, and to do one or more kinds of caching to save
the results. For example, each TYPECASE statement could have a fixed size
cache (organized as a hash table, of course) to say which arm to use for some
recently presented types. This may well turn out to be expected O(1) for most
programs. If the cache size is 1, this is similar to the inline method cache
idea -- if the type is the one we know about, go to the location recorded,
otherwise do the more painful technique and update the cache. (If you go
implement this and got the idea from me, don't forget to give me credit :-).
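The caching idea might be sketched as follows (a hedged sketch: the
`slow_subtype` predicate is a hypothetical stand-in for whatever expensive
DAG-walking test the MI implementation uses, and the cache here is unbounded
where a real one would be a fixed-size hash table):

```python
def make_cached_typecase(arms, slow_subtype):
    """arms: list of arm types, in source order.
    slow_subtype(t, arm): the expensive test for t being a subtype of arm."""
    cache = {}                      # recently seen type -> chosen arm index
    def dispatch(t):
        if t in cache:              # expected O(1) hit for most programs
            return cache[t]
        for i, arm in enumerate(arms):
            if slow_subtype(t, arm):
                cache[t] = i        # remember which arm this type takes
                return i
        cache[t] = None             # no arm matched: the ELSE case
        return None
    return dispatch
```

With a cache size of 1 this degenerates to the inline method-cache idea
mentioned above: check the one remembered type, and fall back to the painful
technique on a miss.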
> You're right. But nevertheless, M3 has to compete with C++ and others.
> And, as I stated in an earlier posting, if you want people to use M3
> instead of C++ (or Sather, Eiffel, ..., all of which *have* MI), you
> have to give them some *really* good reasons why M3 doesn't have MI.
> I'm not convinced yet. But any suggestions are welcome.
Some people have been so burned by MI in C++ that they consider it a
good thing to leave out of a language. So it's not obvious to me that
you have to have MI to "compete" with C++ and company.
--Mark Day
> Some people have been so burned by MI in C++ that they consider it a
> good thing to leave out of a language. So it's not obvious to me that
> you have to have MI to "compete" with C++ and company.
Whom are you thinking of, aside from Tom Cargill?
Note that at various times, Tom has also called for removing
constructors and overloaded operators from C++. Moreover, I don't
think he's actually been burned from it; he just doesn't think
it is useful.
--
--Andrew Koenig
a...@europa.att.com
Excuse me for being unclear. (I obviously wasn't having my best day when I
wrote that article anyway :).
Well, although your questions seem to be a little bit rhetorical: yes, I
actually meant *direct* subclasses. To put it in other words: I would agree
to MI if it is restricted to very simple fork/join situations. Furthermore,
I think that all data from the common superclass should be shared.
If you want to combine orthogonal classes, you actually build the Cartesian
product of these classes. If I'm not completely wrong, this is exactly what
RECORDs are good for. So, if B and C are independent, I would combine these
using A = OBJECT b: B; c: C; END.
I'm totally aware of the fact that this is a little bit awkward for the people
used to MI from other languages, because you have to copy and merge B and C's
interfaces to get an interface for A. But it *can* be done. On the other hand,
if you have a (virtual in the C++ sense) superclass D and subclasses B and C
from D, there is no way to simulate A = B, C OBJECT END with SI without losing
data hiding.
So, my proposal is sort of a compromise. It prevents MI from getting too
complicated - for the user as well as for the compiler/run time system. But I
don't think it prohibits the essential point about MI.