OO object layout - method-table pointer or tag

48 views
Skip to first unread message

James Harris

unread,
Aug 29, 2021, 2:24:52 PMAug 29
to
I am a long way from implementing OO-type objects but I have in mind a
layout for them and something Dmitry mentioned in another thread
suggested that there may be a better way.

My intended model is to have each object's data preceded by a pointer to
a table of its methods. To illustrate, let's say the source code had a
structure 'point' with two fields (x and y) and which referred to two
methods (plot and clear).

Each 'point' would have the coordinates and the two methods mentioned
such that if we had a point P then

P.x would refer to the x coordinate
P.plot() would invoke the plot method

My idea is to implement that as two parts. Each object would have a
read-write part as

pointer to method table
x
y

and the pointer therein would point to a read-only 'method table' which
had the records and some other fields, such as

number of objects pointing at this table
pointer to the type
pointer to procedure 'plot'
pointer to procedure 'clear'


Sorry if that looks a bit complicated. It's about as short as I can
think to make it. Basically, the method table would contain a count of
the number of objects which were pointing at it, a pointer to the type
definition, and a pointer to each method.

I chose it that way to make it fast to dispatch one of the procedures:
follow one pointer to get to the method table then invoke the routine
pointed to by the pointer at the designated offset. (The routine would
be supplied a pointer to the object.)

To get to base info about the type would also be possible by following
the pointer to the type.

Opinions on the scheme?

However Dmitry said: "You do not need length, because it can be deduced
from the tag. Then tag is more universal than a pointer to the table.
Pointer do not work with multiple-dispatch, for example."

Ignoring the length part (as some objects would have their own length
and some would not but that difference is immaterial here) how would a
'tag' system be laid out in memory and how would one use that to locate
a method?


--
James Harris

Dmitry A. Kazakov

unread,
Aug 29, 2021, 3:32:21 PMAug 29
to
On 2021-08-29 20:24, James Harris wrote:

> Ignoring the length part (as some objects would have their own length
> and some would not but that difference is immaterial here) how would a
> 'tag' system be laid out in memory and how would one use that to locate
> a method?

You must consider the following inheritance variants:

1. Multiple inheritance (MI)

A B
\ /
C

1.1. Java model. This is when only one parent may have an implementation
(e.g. members). Thus you have a single path where you could keep vptr
[the limited model you have in mind].

1.2. Full MI, C++ model. Any parent may have an implementation. This in
particular means that when you pass C as A you must have vptr of
A'Class, when you pass C as B, you must have another vptr of B'Class.

1.3. Idempotent inheritance:

A
/ \
B C
\ /
D

D has the single instance of A inherited over B and C.

1.4. Additive inheritance:

A
/ \
B C
\ /
D

D has two instances of A, one inherited over B another over C.

2. Multiple dispatch.

2.1. Multimethods only. A multimethod is when all arguments and results
are from the same type hierarchy. E.g. arithmetic operations are
multimethods. Multimethods can have vptr, but the table becomes
multidimensional.

2.2. Parallel inheritance. Arguments can be from different hierarchies,
but only diagonal combinations are allowed:

A1 B1
| |
A2 B2

Foo (A1, B1) - OK
Foo (A1, B2) - not OK
Foo (A2, B2) - OK

This is very common case, e.g. when you have a hierarchy of array
element types parallel to the corresponding array types.

2.3. Full multiple dispatch. Any combinations allowed.

3. Embedded vs. outside tags. If you embed the tag into the object, you
cannot have classes of scalar types.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

David Brown

unread,
Aug 29, 2021, 5:28:44 PMAug 29
to
So far, that is a straight copy of C++ (without virtual inheritance
support - and it's questionable whether virtual inheritance is useful
enough to justify its complications).

> I chose it that way to make it fast to dispatch one of the procedures:
> follow one pointer to get to the method table then invoke the routine
> pointed to by the pointer at the designated offset. (The routine would
> be supplied a pointer to the object.)
>
> To get to base info about the type would also be possible by following
> the pointer to the type.
>
> Opinions on the scheme?

It works for C++.

Bart

unread,
Aug 29, 2021, 6:04:39 PMAug 29
to
I haven't implemented this stuff in static code: classes with methods
and with inheritance, so can't speak directly from experience.

I have played with it however, and I have implemented it for dynamic code.

The first thing that strikes me, if I understood correctly, is that each
instance of a Point needs to include not only the data x,y, but also a
pointer.

With apparently, exactly the same pointer in each of millions of
instances. Which all means that if you allocate an array of such points,
you can't just initialise them with all-zeros to (0,0), but need to
specifically initialise that pointer field in each one.

So, why is that actually necessary? Is it because of inheritance, where
any instance P could be of the base class, or of a derived class which
could have its own sets of methods, and this information is not known
until runtime?

I can illustrate that with this example in dynamic code:

record point =
var x,y
proc plot(&self) =
println "POINT",self.x,self.y
end
end

record dirvec(point) = # inherit from Point
proc plot(&self) =
println "DIRVEC",self.x,self.y
end
end

proc start=
a:=point(10,20)
b:=dirvec(30,30)

a.plot()
b.plot()
end

The output here is:

POINT 10 20
DIRVEC 30 30

Each instance (A and B), contains values for its X and Y fields. But no
pointers to method tables. But they both have, because it is dynamic, is
a type tag: a index into tables of types.

That is used to search (again because its dynamic), a table of entries
for the 'plot' attribute (which can be used in a different other types
too), which matches the types of A and B.

I would have expect a static language to convert those last the lines to:

point.plot(a)
dirvec.plot(b)

I guess it depends on how A and B would be declared, and whether, if
they have those two different types, either can still be passed to a
function taking a Point type, and possibly one taking a Dirvec type.

(I think I ran into trouble trying this in static code, not just because
of methods, but because the two types instances would have the normal
fields at different offsets, and could be different types; say Point has
int32 X,Y, but Dirvec defines float64 Y,Z and inherits int32 X.

But maybe I'm just not understanding how OOP is supposed to work.)


James Harris

unread,
Aug 30, 2021, 5:47:58 AMAug 30
to
On 29/08/2021 20:32, Dmitry A. Kazakov wrote:
> On 2021-08-29 20:24, James Harris wrote:
>
>> Ignoring the length part (as some objects would have their own length
>> and some would not but that difference is immaterial here) how would a
>> 'tag' system be laid out in memory and how would one use that to
>> locate a method?
>
> You must consider the following inheritance variants:

...

Thanks for the information, Dmitry. I might come back to you about some
of that. But for this thread I was wondering how the memory might be
laid out under the tag model that you mentioned. What form does a tag
take and how does the runtime get from the tag to a method?

If it depends on which model is in use then what about single
inheritance and the Full MI model you mentioned?


--
James Harris

James Harris

unread,
Aug 30, 2021, 6:23:15 AMAug 30
to
On 29/08/2021 23:04, Bart wrote:
> On 29/08/2021 19:24, James Harris wrote:
>> I am a long way from implementing OO-type objects but I have in mind a
>> layout for them and something Dmitry mentioned in another thread
>> suggested that there may be a better way.
>>
>> My intended model is to have each object's data preceded by a pointer
>> to a table of its methods.

...

>>
>> However Dmitry said: "You do not need length, because it can be
>> deduced from the tag. Then tag is more universal than a pointer to the
>> table. Pointer do not work with multiple-dispatch, for example."

...

> I haven't implemented this stuff in static code: classes with methods
> and with inheritance, so can't speak directly from experience.
>
> I have played with it however, and I have implemented it for dynamic code.
>
> The first thing that strikes me, if I understood correctly, is that each
> instance of a Point needs to include not only the data x,y, but also a
> pointer.
>
> With apparently, exactly the same pointer in each of millions of
> instances. Which all means that if you allocate an array of such points,
> you can't just initialise them with all-zeros to (0,0), but need to
> specifically initialise that pointer field in each one.
>
> So, why is that actually necessary? Is it because of inheritance, where
> any instance P could be of the base class, or of a derived class which
> could have its own sets of methods, and this information is not known
> until runtime?

It's not because of inheritance. It's to support polymorphism.

Without OO there would be no 'method table pointer' (MTP). Instead, each
function called to act on an object would know the object's type and,
hence, where its fields are.

With OO the object could be one of a number of different types as long
as they all have the same conceptual set of functions (in the same
internal order), e.g. they both have a plot() function or they both have
an add() function. So you could have

pixel_point P
float_point F

Both types could have a plot() function. As long as it had the same
interface then the function could be invoked by

P.plot()
F.plot()

The key benefit, AISI, is that the routine which calls plot would work
with either type of point. The call wouldn't care what form the point
took; it would be polymorphic.

Of course, whether that's worth the complexity of the implementation is
debatable but it does allow a lot of source code to be simpler.

>
> I can illustrate that with this example in dynamic code:
>
>     record point =
>         var x,y
>         proc plot(&self) =
>             println "POINT",self.x,self.y
>         end
>     end
>
>     record dirvec(point) =        # inherit from Point
>         proc plot(&self) =
>             println "DIRVEC",self.x,self.y
>         end
>     end
>
>     proc start=
>         a:=point(10,20)
>         b:=dirvec(30,30)
>
>         a.plot()
>         b.plot()
>     end
>
> The output here is:
>
>     POINT 10 20
>     DIRVEC 30 30
>
> Each instance (A and B), contains values for its X and Y fields. But no
> pointers to method tables. But they both have, because it is dynamic, is
> a type tag: a index into tables of types.

So your form of a tag is an index into a table? Maybe that's how
Dmitry's 'tag' would be implemented.

I have doubts about indexing into a table rather than pointing to the
method table directly. It would probably require at least one extra
level of dereferencing to find a method.

>
> That is used to search (again because its dynamic), a table of entries
> for the 'plot' attribute (which can be used in a different other types
> too), which matches the types of A and B.
>
> I would have expect a static language to convert those last the lines to:
>
>    point.plot(a)
>    dirvec.plot(b)

Why not

a.plot()
b.plot()

?

>
> I guess it depends on how A and B would be declared, and whether, if
> they have those two different types, either can still be passed to a
> function taking a Point type, and possibly one taking a Dirvec type.
>
> (I think I ran into trouble trying this in static code, not just because
> of methods, but because the two types instances would have the normal
> fields at different offsets, and could be different types; say Point has
> int32 X,Y, but Dirvec defines float64 Y,Z and inherits int32 X.
>
> But maybe I'm just not understanding how OOP is supposed to work.)

Dmitry's reply about various different interpretations of how OOP can
work shows there is no single right answer. Indeed, OOP may not be the
best solution, at all.


--
James Harris

David Brown

unread,
Aug 30, 2021, 7:19:28 AMAug 30
to
In particular, it is for support of run-time polymorphism.

There is also compile-time polymorphism, handled by templates or other
generic coding mechanisms. Since the types are determined entirely at
compile time, there is no need for a pointer to a virtual method table
and like C++ classes without virtual methods, there is no need for the
extra pointer in the class implementation. (Obviously compile-time
polymorphism has limitations of flexibility, but it results in more
efficient code.)

I often use inheritance in C++ without virtual methods, and therefore no
virtual method table overhead in the classes (and no possibility of
run-time polymorphism).

>
> Without OO there would be no 'method table pointer' (MTP). Instead, each
> function called to act on an object would know the object's type and,
> hence, where its fields are.
>
> With OO the object could be one of a number of different types as long
> as they all have the same conceptual set of functions (in the same
> internal order), e.g. they both have a plot() function or they both have
> an add() function. So you could have
>
>   pixel_point P
>   float_point F
>
> Both types could have a plot() function. As long as it had the same
> interface then the function could be invoked by
>
>   P.plot()
>   F.plot()
>
> The key benefit, AISI, is that the routine which calls plot would work
> with either type of point. The call wouldn't care what form the point
> took; it would be polymorphic.
>
> Of course, whether that's worth the complexity of the implementation is
> debatable but it does allow a lot of source code to be simpler.
>

<https://en.wikipedia.org/wiki/Virtual_method_table>

You are describing exactly the system used in many compiled object
oriented languages.

But you should be aware that compile-time polymorphism has become the
current trend, giving more efficient results in many cases.

>>
>> I can illustrate that with this example in dynamic code:
>>
>>      record point =
>>          var x,y
>>          proc plot(&self) =
>>              println "POINT",self.x,self.y
>>          end
>>      end
>>
>>      record dirvec(point) =        # inherit from Point
>>          proc plot(&self) =
>>              println "DIRVEC",self.x,self.y
>>          end
>>      end
>>
>>      proc start=
>>          a:=point(10,20)
>>          b:=dirvec(30,30)
>>
>>          a.plot()
>>          b.plot()
>>      end
>>
>> The output here is:
>>
>>      POINT 10 20
>>      DIRVEC 30 30
>>
>> Each instance (A and B), contains values for its X and Y fields. But
>> no pointers to method tables. But they both have, because it is
>> dynamic, is a type tag: a index into tables of types.
>

That code is not dynamic polymorphism - the types are all statically
known at compile time. Dynamic polymorphism would be if you also had:

proc doubleplot(&x) =
x.plot();
x.plot();
end

proc start =
a : =point(10, 20)
b := dirvec(30, 30)
doubleplot(&a)
doubleplot(&b)
end

i.e., the function "doubleplot" is compiled without knowing whether it
will be calling the "point" version of "plot", or the "dirvec" version.

> So your form of a tag is an index into a table? Maybe that's how
> Dmitry's 'tag' would be implemented.

That's what it sounds like to me. I don't see off-hand any significant
benefits of using an index into a table, rather than a pointer directly
to the type information (primarily the virtual methods list, but perhaps
also containing run-time type information if the language supports taht).

>
> I have doubts about indexing into a table rather than pointing to the
> method table directly. It would probably require at least one extra
> level of dereferencing to find a method.
>

The extra level of indirection could be removed if you are doing
whole-program code generation, but you'd still need to scale the index
and add it to the base of the table instead of using the pointer directly.

>>
>> That is used to search (again because its dynamic), a table of entries
>> for the 'plot' attribute (which can be used in a different other types
>> too), which matches the types of A and B.
>>
>> I would have expect a static language to convert those last the lines to:
>>
>>     point.plot(a)
>>     dirvec.plot(b)
>
> Why not
>
>   a.plot()
>   b.plot()
>
> ?

Perhaps Bart meant the static language would implement "a.plot()" as
though it had been "point.plot(a)", which is exactly how optimising
compilers /do/ implement polymorphism when they know the types at
compile time. (These applies to virtual methods too - it's called
devirtualisation optimisation.)

>
>>
>> I guess it depends on how A and B would be declared, and whether, if
>> they have those two different types, either can still be passed to a
>> function taking a Point type, and possibly one taking a Dirvec type.
>>
>> (I think I ran into trouble trying this in static code, not just
>> because of methods, but because the two types instances would have the
>> normal fields at different offsets, and could be different types; say
>> Point has int32 X,Y, but Dirvec defines float64 Y,Z and inherits int32 X.
>>
>> But maybe I'm just not understanding how OOP is supposed to work.)
>
> Dmitry's reply about various different interpretations of how OOP can
> work shows there is no single right answer. Indeed, OOP may not be the
> best solution, at all.
>

Yes. Your job here is to listen to all the ideas, and figure out what
will work for /you/ and /your/ language.

Dmitry A. Kazakov

unread,
Aug 30, 2021, 7:37:01 AMAug 30
to
On 2021-08-30 11:47, James Harris wrote:
> On 29/08/2021 20:32, Dmitry A. Kazakov wrote:
>> On 2021-08-29 20:24, James Harris wrote:
>>
>>> Ignoring the length part (as some objects would have their own length
>>> and some would not but that difference is immaterial here) how would
>>> a 'tag' system be laid out in memory and how would one use that to
>>> locate a method?
>>
>> You must consider the following inheritance variants:
>
> ...
>
> Thanks for the information, Dmitry. I might come back to you about some
> of that. But for this thread I was wondering how the memory might be
> laid out under the tag model that you mentioned. What form does a tag
> take and how does the runtime get from the tag to a method?

You must also consider type tests. E.g. is A a descendant of B, unrelated?

> If it depends on which model is in use then what about single
> inheritance and the Full MI model you mentioned?

In the most general case a dispatching table belongs to the method, not
to the type. The tag would denote an N-th coordinate (plane) of the
multidimensional index in the table.

E.g. in the case of full dispatch: Print (Device, Shape), Print is a
method of two hierarchies.

Now, of course, because the same tag must be used to index a multitude
of tables, it cannot be a dense index. So, go figure.

Another issue to consider is whether tables are contiguous. Consider the
following scenario:

The type A is declared in the shared library libA. So you can put the
dispatching table there? Not so fast. Let there be the library libB,
that derives some B from A. That must change the dispatching tables for
A upon loading of libB. When unloading libB you must roll the changes back.

Bart

unread,
Aug 30, 2021, 8:08:08 AMAug 30
to
On 30/08/2021 11:23, James Harris wrote:
> On 29/08/2021 23:04, Bart wrote:

>> So, why is that actually necessary? Is it because of inheritance,
>> where any instance P could be of the base class, or of a derived class
>> which could have its own sets of methods, and this information is not
>> known until runtime?
>
> It's not because of inheritance. It's to support polymorphism.
>
> Without OO there would be no 'method table pointer' (MTP). Instead, each
> function called to act on an object would know the object's type and,
> hence, where its fields are.
>
> With OO the object could be one of a number of different types as long
> as they all have the same conceptual set of functions (in the same
> internal order), e.g. they both have a plot() function or they both have
> an add() function. So you could have
>
>   pixel_point P
>   float_point F
>
> Both types could have a plot() function. As long as it had the same
> interface then the function could be invoked by
>
>   P.plot()
>   F.plot()
>
> The key benefit, AISI, is that the routine which calls plot would work
> with either type of point. The call wouldn't care what form the point
> took; it would be polymorphic.

That example looks too simple to me; it doesn't really involve OO in my
opinion as it merely provides a convenient alternative syntax. Example
in static code:

record pixelpoint =
byte x,y
proc plot(pixelpoint &self)=
println "Pixelpoint:",self.x,self.y
end
end

record floatpoint =
real x,y
proc plot(floatpoint &self)=
println "Floatpoint:",self.x,self.y
end
end

proc start=
pixelpoint p:=(10,20)
floatpoint q:=(30.5,40.6)

p.plot()
q.plot()
println p.bytes, q.bytes
end

Output is:

Pixelpoint: 10 20
Floatpoint: 30.500000 40.600000
2 16


That p.plot() line is just an alternative to pixelpoint.plot(p), with
the transformation done at compile-time. Here, this is just
encapsulation (which I actually never use; I'm surprised it works!).

I'm not up on OO terms, but polymorphism to me would mean function and
operator overloading. So if you had two functions both called plot,
taking pixelpoint or floatpoint parameters respectively, you could say:

plot(p)
plot(f)

>> Each instance (A and B), contains values for its X and Y fields. But
>> no pointers to method tables. But they both have, because it is
>> dynamic, is a type tag: a index into tables of types.
>
> So your form of a tag is an index into a table? Maybe that's how
> Dmitry's 'tag' would be implemented.

That's just generally how it works with dynamic typing; the tag is the
key to everything. However such code usually means dynamic type dispatch
which is slow. If you have static typing at all, you expect it to be
much faster.

> I have doubts about indexing into a table rather than pointing to the
> method table directly. It would probably require at least one extra
> level of dereferencing to find a method.

Why, in your example, wouldn't the compiler know the type of P?

Do you have a more elaborate example that shows off the advantages of
OO, or whatever you are trying to do, that would be too clunky in C?

Bart

unread,
Aug 30, 2021, 8:21:27 AMAug 30
to
On 30/08/2021 12:19, David Brown wrote:
> On 30/08/2021 12:23, James Harris wrote:
>> On 29/08/2021 23:04, Bart wrote:

>>> I can illustrate that with this example in dynamic code:

> That code is not dynamic polymorphism - the types are all statically
> known at compile time. Dynamic polymorphism would be if you also had:
>
> proc doubleplot(&x) =
> x.plot();
> x.plot();
> end
>
> proc start =
> a := point(10, 20)
> b := dirvec(30, 30)
> doubleplot(&a)
> doubleplot(&b)
> end
>
> i.e., the function "doubleplot" is compiled without knowing whether it
> will be calling the "point" version of "plot", or the "dirvec" version.

This works fine (but the &s are not needed). But since this is a
dynamically typed language where you get generics and everything else
for free, that is not surprising.

I was making a point about what information needs to be stored with each
instance. Sometimes I don't even need the tag for every point value:

type point = struct
int32 x,y
end

proc start =
x:=new(array,point, 1 million)
println x.bytes
print x[12345]
end

Output is:

8000000
(0,0)

However the 'struct' type doesn't have methods. That is just because the
language decided not to allow that; it could have done, and each array
element would still need only 8 bytes (the array object itself has a tag
and knows the type of the elements).

But it still knows how to print the type! (For a custom print function,
here I'd favour an overload of 'tostring' for this point type, then I
can just do a regular 'print'.)

Dmitry A. Kazakov

unread,
Aug 30, 2021, 8:44:07 AMAug 30
to
On 2021-08-30 13:19, David Brown wrote:

> There is also compile-time polymorphism, handled by templates or other
> generic coding mechanisms.

Yes, but static polymorphism has a very limited use because you cannot
have objects from the closure of the class, only instances of specific
type ones. In practice this precludes most cases of reuse.

James Harris

unread,
Aug 30, 2021, 4:25:55 PMAug 30
to
On 30/08/2021 12:36, Dmitry A. Kazakov wrote:
> On 2021-08-30 11:47, James Harris wrote:
>> On 29/08/2021 20:32, Dmitry A. Kazakov wrote:
>>> On 2021-08-29 20:24, James Harris wrote:
>>>
>>>> Ignoring the length part (as some objects would have their own
>>>> length and some would not but that difference is immaterial here)
>>>> how would a 'tag' system be laid out in memory and how would one use
>>>> that to locate a method?
>>>
>>> You must consider the following inheritance variants:
>>
>> ...
>>
>> Thanks for the information, Dmitry. I might come back to you about
>> some of that. But for this thread I was wondering how the memory might
>> be laid out under the tag model that you mentioned. What form does a
>> tag take and how does the runtime get from the tag to a method?
>
> You must also consider type tests. E.g. is A a descendant of B, unrelated?

Interesting question...!

>
>> If it depends on which model is in use then what about single
>> inheritance and the Full MI model you mentioned?
>
> In the most general case a dispatching table belongs to the method, not
> to the type. The tag would denote an N-th coordinate (plane) of the
> multidimensional index in the table.

That sounds weird. If the dispatch table belongs to the method what
would it be indexed by?

When you talk about a tag do you mean something that might, if it cannot
be optimised any better, result in a series of three or even more
lookups executed at run time before the required method is found?

>
> E.g. in the case of full dispatch: Print (Device, Shape), Print is a
> method of two hierarchies.
>
> Now, of course, because the same tag must be used to index a multitude
> of tables, it cannot be a dense index. So, go figure.
>
> Another issue to consider is whether tables are contiguous. Consider the
> following scenario:
>
> The type A is declared in the shared library libA. So you can put the
> dispatching table there? Not so fast. Let there be the library libB,
> that derives some B from A. That must change the dispatching tables for
> A upon loading of libB. When unloading libB you must roll the changes back.

It's rare that a good idea is so complex or has so many semantically
different implementations. I do wonder whether all this complexity
should be replaced by something simpler.

My current scheme is simpler but it doesn't necessarily have
substitutability which is something I gather you believe to be
important, even from your interesting question, above.


--
James Harris

Dmitry A. Kazakov

unread,
Aug 30, 2021, 5:13:12 PMAug 30
to
By the type, of course. It is pretty straightforward. Consider

Surface Shape
/ \ / \
PDF SVG Ellipse Rectangle

Draw | PDF | SVG |
----------+-----+-----+
Ellipse | ptr | ptr |
----------+-----+-----+
Rectangle | ptr | ptr |
----------+-----+-----+

> When you talk about a tag do you mean something that might, if it cannot
> be optimised any better, result in a series of three or even more
> lookups executed at run time before the required method is found?

Yes. You need some unique type identifier, short enough to fit in a
register.

>> Another issue to consider is whether tables are contiguous. Consider
>> the following scenario:
>>
>> The type A is declared in the shared library libA. So you can put the
>> dispatching table there? Not so fast. Let there be the library libB,
>> that derives some B from A. That must change the dispatching tables
>> for A upon loading of libB. When unloading libB you must roll the
>> changes back.
>
> It's rare that a good idea is so complex or has so many semantically
> different implementations. I do wonder whether all this complexity
> should be replaced by something simpler.

It is a very complex issue. E.g. consider requirements:

1. All slots in a dispatching table are always defined, i.e. dispatch
never fail at run-time.

1.1. That would require the user to override slots the language cannot
safely inherit from the parent type, e.g. mixed type operations.

1.2. The problem with modules. Let the module X derive HTML from Surface
and the module Y derive Sector from Shape. If neither of the modules
uses/knows another, where to define Draw (HTML, Sector)?

There must be some solution for where to place methods. BTW, C++ model
when methods are put into classes is clearly inconsistent with multiple
dispatch. Where to place Draw?

2. No object of a type may exist after the type ceases to exist.

3. Dispatch never happens on partially constructed/destructed objects.
Consider:

A
|
B

In the constructor of B is called after A is constructed. Now what
happens if you dispatch to Foo from the constructor of A and Foo is
overridden by B?

C++ "resolves" that by refusing to dispatch. It will call A::Foo instead
of B::Foo.

That is no solution because programmers do need dispatching
constructors/destructors. A possible solution would be to have
constructors/destructors of class-wide objects where you could safely
dispatch from.

4. Allowing or not to re-dispatch. I.e. if you dispatch on Foo, can you
dispatch again on the same argument inside Foo? This is how C++ works
and this is greatly inefficient and also inconsistent with the types.

> My current scheme is simpler but it doesn't necessarily have
> substitutability which is something I gather you believe to be
> important, even from your interesting question, above.

Inheritance is an open issue. I do not know any language that handles it
consistently and complete. So you have a chance to invent something useful.

James Harris

unread,
Sep 3, 2021, 8:27:24 AMSep 3
to
On 30/08/2021 22:13, Dmitry A. Kazakov wrote:
> On 2021-08-30 22:25, James Harris wrote:
>> On 30/08/2021 12:36, Dmitry A. Kazakov wrote:

...

>>> In the most general case a dispatching table belongs to the method,
>>> not to the type. The tag would denote an N-th coordinate (plane) of
>>> the multidimensional index in the table.
>>
>> That sounds weird. If the dispatch table belongs to the method what
>> would it be indexed by?
>
> By the type, of course. It is pretty straightforward. Consider
>
>    Surface       Shape
>      /  \         /  \
>    PDF  SVG  Ellipse  Rectangle
>
>    Draw      | PDF | SVG |
>    ----------+-----+-----+
>    Ellipse   | ptr | ptr |
>    ----------+-----+-----+
>    Rectangle | ptr | ptr |
>    ----------+-----+-----+

That solution looks as though the shapes would have to know how to draw
themselves on different surfaces. As such I doubt it would scale well as
new shapes and new types of surface were added. But maybe it's purely
illustrative of multiple dispatch. What I take from it is that, however
it's done, the address of the drawing function (the ptr in the above
table) would need to be found from a key of three parameters

(shape type, surface type, function name)

whereas single dispatch would find a function based on two parameters
such as

(shape type, function name)

Is that correct?


>
>> When you talk about a tag do you mean something that might, if it
>> cannot be optimised any better, result in a series of three or even
>> more lookups executed at run time before the required method is found?
>
> Yes. You need some unique type identifier, short enough to fit in a
> register.

If the tag is the unique identifier you mention how would the compiled
code get from the tag to the correct routine?

My intention was for the tag to be the address of the dispatch table so
that functions (as long as they were at known offsets) could be accessed
quickly but I don't think you liked that.

I will have to come back to you on the other points of your post.


--
James Harris

Dmitry A. Kazakov

unread,
Sep 3, 2021, 9:39:26 AMSep 3
to
Yes, and the surface parameter is then class-wide, i.e. each function
must have to deal with any shape (any instance from the class = class-wide).

>>> When you talk about a tag do you mean something that might, if it
>>> cannot be optimised any better, result in a series of three or even
>>> more lookups executed at run time before the required method is found?
>>
>> Yes. You need some unique type identifier, short enough to fit in a
>> register.
>
> If the tag is the unique identifier you mention how would the compiled
> code get from the tag to the correct routine?

Via the dispatching table indexed by the tag. Tag could not be a dense
index, true, so you would use a hash or binary search instead.

> My intention was for the tag to be the address of the dispatch table so
> that functions (as long as they were at known offsets) could be accessed
> quickly but I don't think you liked that.

Hash functions are pretty quick.

Then again, if you don't repeat C++ and Java error confusing class-wide
and specific types, you would drastically reduce cases when dispatch
ever happens. Most of the code deals with statically known specific
types, ergo, no dispatch (or as some call it, "static dispatch").

Segmented dispatching tables is IMO a bigger problem unless you rebuild
affected dispatching each time you load a library or dynamically create
a new type etc.

Bart

unread,
Sep 3, 2021, 10:00:10 AMSep 3
to
On 03/09/2021 13:27, James Harris wrote:
> On 30/08/2021 22:13, Dmitry A. Kazakov wrote:
>> On 2021-08-30 22:25, James Harris wrote:
>>> On 30/08/2021 12:36, Dmitry A. Kazakov wrote:
>
> ...
>
>>>> In the most general case a dispatching table belongs to the method,
>>>> not to the type. The tag would denote an N-th coordinate (plane) of
>>>> the multidimensional index in the table.
>>>
>>> That sounds weird. If the dispatch table belongs to the method what
>>> would it be indexed by?
>>
>> By the type, of course. It is pretty straightforward. Consider
>>
>>     Surface       Shape
>>       /  \         /  \
>>     PDF  SVG  Ellipse  Rectangle
>>
>>     Draw      | PDF | SVG |
>>     ----------+-----+-----+
>>     Ellipse   | ptr | ptr |
>>     ----------+-----+-----+
>>     Rectangle | ptr | ptr |
>>     ----------+-----+-----+
>
> That solution looks as though the shapes would have to know how to draw
> themselves on different surfaces. As such I doubt it would scale well as
> new shapes and new types of surface were added. But maybe it's purely
> illustrative of multiple dispatch.

(The example reminds me of actual drawing code I needed to write in my
applications in the 1990s. The script language included functions such
as a general-purpose one called Plot(), with 8 parameters (many optional
with defaults). The remaining docs are poor but I think they were:

Dest Destination code
D Info specific to a destination
Scan Scan type
View View point
P Drawing element
Clear Whether to clear destination first
Nsegments 1, or build results as multiple hoz bands
Flags Misc

This is for displaying a 3D model, which is represented in P as a
hierarchical tree of drawing elements. There were nearly 50 leaf objects
(including Rectangle and Ellipse!) plus named collections which in a HLL
would correspond to a named function call.

Dest is a code broadly indicating one of Window, Image, Printer/Plotter,
or collections of 3D Face or Lines. Here D might refer to the actual
window, a filename to capture output (eg. EPS print format) etc.

Scan indicates how the data should be processed, with 16 variations,
including Wireframe, Hidden, Shaded, Erase, Highlight, DXF dest, Expand
to lines/faces, ...

View specifies the viewpoint from which the result will be rendered, and
whether it uses perspective projection.

Flags controls many other factors: whether to draw grids or axes, page
outlines ...

OK, the example was clearly a simple one to illustrate some aspects of
how OOP works. But I see such examples using shapes all the time and
find them somewhat annoying.

It gives the impression that OOP can magically solve all the problems
that come up in actual applications, but the reality is always a lot
more complicated.

I'm rather sceptical about what OOP can do, and I have firm views on how
far a language should get involved in matters that should be the concern
of an application.

I also think that, if you really want to use OOP seriously, then you're
quite liable to just end up reinventing C++, as you meet the same
obstacles.)


What I take from it is that, however
> it's done, the address of the drawing function (the ptr in the above
> table) would need to be found from a key of three parameters
>
>   (shape type, surface type, function name)
>
> whereas single dispatch would find a function based on two parameters
> such as
>
>   (shape type, function name)
>
> Is that correct?

The shape should describe the shape itself and nothing else. If you want
to plot it, then extra information is needed:

p.plot(surface_type)

With perhaps that parameter being optional when there is a sensible
default. So here there is single dispatch based on shape type.

Anything else would be difficult to get your head around. If there are M
shape types and N surface types, then you've got MxN methods to write,
but where are you going to write them? Functions are written one after
the other, not in a table!

You might consider introducing overloaded functions, so the plot()
method for Rectangle is split into one each for PDF and SVG. Feature creep.

(My approach as I used was single dispatch - a simple switch - using
common code to reduce complex objects to basic edges, and then using
single dispatch on the 'surface' when drawing those edges.)

James Harris

unread,
Sep 3, 2021, 10:40:46 AMSep 3
to
OK. I am now trying to understand what you mean by 'class wide'. Are you
saying that each drawing function would have to be passed the target
surface as a parameter - and that makes surface 'class wide'?

In practice, I'd imagine it's best for 'surfaces' to know how to draw on
themselves and they would export drawing primitives (such as point,
line, curve, rectangle, ellipse etc) which routines can call.

>
>>>> When you talk about a tag do you mean something that might, if it
>>>> cannot be optimised any better, result in a series of three or even
>>>> more lookups executed at run time before the required method is found?
>>>
>>> Yes. You need some unique type identifier, short enough to fit in a
>>> register.
>>
>> If the tag is the unique identifier you mention how would the compiled
>> code get from the tag to the correct routine?
>
> Via the dispatching table indexed by the tag. Tag could not be a dense
> index, true, so you would use a hash or binary search instead.

Are you thinking that to find each function the executable code would
carry out the hash or binary search at run time (if optimisation didn't
remove the need for it)?

My guess is that Python works that way: each name referred to in a
method is subjected to lookup in a series of tables (in something called
Method Resolution Order, MRO) starting with the object and ascending to
each class in the MRO in turn until a matching name is found. But that's
slow.

In compiled code, carrying out a hash lookup or a binary search just to
find which function to call would be faster than Python but still too slow.

>
>> My intention was for the tag to be the address of the dispatch table
>> so that functions (as long as they were at known offsets) could be
>> accessed quickly but I don't think you liked that.
>
> Hash functions are pretty quick.

Hash functions are not fast enough to find a call target! I was/am even
concerned about the extra cost of following a double pointer. Using a
hash table would be slower still.

>
> Then again, if you don't repeat C++ and Java error confusing class-wide
> and specific types, you would drastically reduce cases when dispatch
> ever happens. Most of the code deals with statically known specific
> types, ergo, no dispatch (or as some call it, "static dispatch").

I'm not sure I understand that but if you think C++ and Java got it
wrong do you think Ada got it right and what did it do differently?

>
> Segmented dispatching tables is IMO a bigger problem unless you rebuild
> affected dispatching each time you load a library or dynamically create
> a new type etc.
>

I presume that means when a large dispatch table reaches the limits of
its allocated memory and has to be split.


--
James Harris

James Harris

unread,
Sep 3, 2021, 10:59:19 AMSep 3
to
On 03/09/2021 14:59, Bart wrote:
> On 03/09/2021 13:27, James Harris wrote:

...

> OK, the example was clearly a simple one to illustrate some aspects of
> how OOP works. But I see such examples using shapes all the time and
> find them somewhat annoying.

I recently had to take a first look at how I'd want a GUI to work. My
answer was basically sentient widgets which knew how to operate on data
and how to draw themselves. But the drawing would be by graphics
primitives which were provided by the surfaces on which the widgets
wanted to appear. Without having implemented any of this, yet, that
seems to me to be the best approach.

>
> It gives the impression that OOP can magically solve all the problems
> that come up in actual applications, but the reality is always a lot
> more complicated.

OOP has lots wrong with it - including multiple incompatible ways of
realising it.

ADTs, by contrast, are great. Each type comes with its own functions.

Don't throw the ADTs out with the OOP!

>
> I'm rather sceptical about what OOP can do, and I have firm views on how
> far a language should get involved in matters that should be the concern
> of an application.
>
> I also think that, if you really want to use OOP seriously, then you're
> quite liable to just end up reinventing C++, as you meet the same
> obstacles.)

I don't know much about C++ but ending up with similar solutions is
unlikely as its design goals are different.
>
>
>  What I take from it is that, however
>> it's done, the address of the drawing function (the ptr in the above
>> table) would need to be found from a key of three parameters
>>
>>    (shape type, surface type, function name)
>>
>> whereas single dispatch would find a function based on two parameters
>> such as
>>
>>    (shape type, function name)
>>
>> Is that correct?
>
> The shape should describe the shape itself and nothing else. If you want
> to plot it, then extra information is needed:
>
>    p.plot(surface_type)
>
> With perhaps that parameter being optional when there is a sensible
> default. So here there is single dispatch based on shape type.
>
> Anything else would be difficult to get your head around. If there are M
> shape types and N surface types, then you've got MxN methods to write,

Indeed, that's exactly the issue. Better to have the M shapes (or
widgets) use a set of drawing primitives, and have those provided by the
N surface types, IMO.


> but where are you going to write them? Functions are written one after
> the other, not in a table!
>
> You might consider introducing overloaded functions, so the plot()
> method for Rectangle is split into one each for PDF and SVG. Feature creep.
>
> (My approach as I used was single dispatch - a simple switch - using
> common code to reduce complex objects to basic edges, and then using
> single dispatch on the 'surface' when drawing those edges.)
>

Why not say that each surface has to provide a set of primitives that
other programs can use to draw on an instance of that type of surface?


--
James Harris

Dmitry A. Kazakov

unread,
Sep 3, 2021, 11:28:08 AMSep 3
to
On 2021-09-03 16:59, James Harris wrote:

> Why not say that each surface has to provide a set of primitives that
> other programs can use to draw on an instance of that type of surface?

Because that set is open ended. OOP allows you

1. adding new primitives (new shapes)

2. using primitives with new surfaces (inheritance)

All counter proposals are static and do not scale in practice, which is
why OOP is used so widely in large projects regardless any difficulties.
There is simple no other way to do it, economically, there is no other way.

Dmitry A. Kazakov

unread,
Sep 3, 2021, 11:46:13 AMSep 3
to
Class-wide = anything from the members of the class.

> Are you
> saying that each drawing function would have to be passed the target
> surface as a parameter - and that makes surface 'class wide'?

Not only passed, it must accept any surface of any surface type. In
particular it means that it must know the type tag.

> In practice, I'd imagine it's best for 'surfaces' to know how to draw on
> themselves and they would export drawing primitives (such as point,
> line, curve, rectangle, ellipse etc) which routines can call.

I commented on that in another post.

>>>>> When you talk about a tag do you mean something that might, if it
>>>>> cannot be optimised any better, result in a series of three or even
>>>>> more lookups executed at run time before the required method is found?
>>>>
>>>> Yes. You need some unique type identifier, short enough to fit in a
>>>> register.
>>>
>>> If the tag is the unique identifier you mention how would the
>>> compiled code get from the tag to the correct routine?
>>
>> Via the dispatching table indexed by the tag. Tag could not be a dense
>> index, true, so you would use a hash or binary search instead.
>
> Are you thinking that to find each function the executable code would
> carry out the hash or binary search at run time (if optimisation didn't
> remove the need for it)?

https://en.wikipedia.org/wiki/Hash_function

Tag -> 1st Foo's parameter Hash = I1
Tag -> 2nd Foo's parameter Hash = I2
..
Tag -> Nth Foo's parameter Hash = IN

Foo_Table (I1, I2, ... IN) --> Target pointer

There exist multi-dimensional hash techniques which would allow you to
do it in less steps. I do not know how effective they are.

>> Hash functions are pretty quick.
>
> Hash functions are not fast enough to find a call target! I was/am even
> concerned about the extra cost of following a double pointer. Using a
> hash table would be slower still.

Faster than anything the user would have to write otherwise, in order to
emulate dispatch.

Then performance is never a big concern. Safety is. User emulations of
dispatch would be nested if-statements, which are not only significantly
slower O(n), but inherently error-prone and unmaintainable.

>> Then again, if you don't repeat C++ and Java error confusing
>> class-wide and specific types, you would drastically reduce cases when
>> dispatch ever happens. Most of the code deals with statically known
>> specific types, ergo, no dispatch (or as some call it, "static
>> dispatch").
>
> I'm not sure I understand that but if you think C++ and Java got it
> wrong do you think Ada got it right and what did it do differently?

If foo and baz are virtual functions of T then

void foo (T& x)
{
x.baz (); -- Dispatches
}

In Ada

procedure Foo (X : T) is
begin
X.Baz; -- Does not dispatch, the type is established to be T
end Foo;

>> Segmented dispatching tables is IMO a bigger problem unless you
>> rebuild affected dispatching each time you load a library or
>> dynamically create a new type etc.
>
> I presume that means when a large dispatch table reaches the limits of
> its allocated memory and has to be split.

Not large, but consisting of several parts, so that you must linearly
search through all of them.

James Harris

unread,
Sep 3, 2021, 12:43:16 PMSep 3
to
On 03/09/2021 16:28, Dmitry A. Kazakov wrote:
> On 2021-09-03 16:59, James Harris wrote:
>
>> Why not say that each surface has to provide a set of primitives that
>> other programs can use to draw on an instance of that type of surface?
>
> Because that set is open ended. OOP allows you
>
> 1. adding new primitives (new shapes)
>
> 2. using primitives with new surfaces (inheritance)
>
> All counter proposals are static and do not scale in practice, which is
> why OOP is used so widely in large projects regardless any difficulties.
> There is simple no other way to do it, economically, there is no other way.
>

Your example was good in showing multiple dispatch but I'm not so sure
it's a good solution in this case, let alone the only way!

The rights and wrongs of the approach are OT so all I'll say is that M*N
does not scale and that new shapes can be constructed from primitives.

Unless I am missing something the cartesian M*N would be costly and so
even economically, as you mention, I cannot agree that there is no other
way.


--
James Harris

Dmitry A. Kazakov

unread,
Sep 3, 2021, 1:27:59 PMSep 3
to
On 2021-09-03 18:43, James Harris wrote:
> On 03/09/2021 16:28, Dmitry A. Kazakov wrote:
>> On 2021-09-03 16:59, James Harris wrote:
>>
>>> Why not say that each surface has to provide a set of primitives that
>>> other programs can use to draw on an instance of that type of surface?
>>
>> Because that set is open ended. OOP allows you
>>
>> 1. adding new primitives (new shapes)
>>
>> 2. using primitives with new surfaces (inheritance)
>>
>> All counter proposals are static and do not scale in practice, which
>> is why OOP is used so widely in large projects regardless any
>> difficulties. There is simple no other way to do it, economically,
>> there is no other way.
>>
>
> Your example was good in showing multiple dispatch but I'm not so sure
> it's a good solution in this case, let alone the only way!

Of course it is. Write a program that renders a list of shapes in a
widget in a way that you would not need to modify each time another
developers team creates a new shape or supports another surface.

> The rights and wrongs of the approach are OT so all I'll say is that M*N
> does not scale and that new shapes can be constructed from primitives.

You did not thought it through. A set of primitives supported by all
shapes is already a class. This is exactly what a class of shapes is.
Any client of the class expects to find an implementation of any
primitive in any surface.

Remove the class, you will need to write a new client for each surface.

> Unless I am missing something the cartesian M*N would be costly and so
> even economically, as you mention,

And its existence is a fact of life. Shapes exist, surfaces exist,
drawing exist.

> I cannot agree that there is no other way.

Show the other way to implement and use M*N operations safely.

Bart

unread,
Sep 3, 2021, 1:45:10 PMSep 3
to
Say you have an enum type listing M different shapes, and another
listing N different surfaces:

* Create a 2D MxN array H of function pointers

* Populate the table (see below)

* To plot any shape S, call the function H[shape_tag, surface_type]


To populate the table (this is what I can do in my language; probably
it's not possible in Ada or others, it might need external
code-generation, or auxiliary, manually-created tables of available
combinations):

* Write the functions for the combinations of Shape and Surface that are
supported; name the functions according to some agreed format. (These
might be methods; that is just a detail)

* Have an initialisation routine that scans all functions in the program
looking for those that implement the plotting (it will look at the name)

* For each such function, extract the shape X and surface Y it
implements, from the name. Fill in H[X,Y] with a reference to the function

* For all table entries not yet populated, fill in a reference to an
error-handler or default-handling function. 'nil' is not recommended!


Dmitry A. Kazakov

unread,
Sep 3, 2021, 2:01:40 PMSep 3
to
On 2021-09-03 19:44, Bart wrote:
> On 03/09/2021 18:27, Dmitry A. Kazakov wrote:
>> On 2021-09-03 18:43, James Harris wrote:
>
>>
>>> Unless I am missing something the cartesian M*N would be costly and
>>> so even economically, as you mention,
>>
>> And its existence is a fact of life. Shapes exist, surfaces exist,
>> drawing exist.
>>
>>> I cannot agree that there is no other way.
>>
>> Show the other way to implement and use M*N operations safely.
>
> Say you have an enum type listing M different shapes, and another
> listing N different surfaces:
>
> * Create a 2D MxN array H of function pointers

Failed. The requirement included different teams of developers and
different times rows and columns (types) added.

Bart

unread,
Sep 3, 2021, 4:07:03 PMSep 3
to
How is that relevant? Presumably all the necessary information (method
definitions or their interfaces) will be visible to the (presumably AOT)
compiler when the application is built?

Then who cares who wrote it or when.

Anyway who are you to say how my application will be organised.

Dmitry A. Kazakov

unread,
Sep 3, 2021, 4:25:21 PMSep 3
to
Nope. An unknown number of dynamically loaded shared libraries may
contain types derived from Shape and/or Surface. The libraries are
loaded during run-time per scanning a list of directories containing DLL
plug-ins.

> Then who cares who wrote it or when.

You must care. You must scan all code to determine N and M and find all
functions for all slots. Then you must recompile everything.

> Anyway who are you to say how my application will be organised.

A provider of operations.

The point is that in real projects nobody has control or knowledge about
the types comprising dispatching tables. It is called late bindings for
a reason.

You may have your wet dreams about how you organize *your* application,
but the fact is that nobody develops software this way for longer than
three decades.

Bart

unread,
Sep 3, 2021, 4:48:42 PMSep 3
to
On 03/09/2021 21:25, Dmitry A. Kazakov wrote:
> On 2021-09-03 22:06, Bart wrote:
>> On 03/09/2021 19:01, Dmitry A. Kazakov wrote:
>>> On 2021-09-03 19:44, Bart wrote:
>>>> On 03/09/2021 18:27, Dmitry A. Kazakov wrote:
>>>>> On 2021-09-03 18:43, James Harris wrote:
>>>>
>>>>>
>>>>>> Unless I am missing something the cartesian M*N would be costly
>>>>>> and so even economically, as you mention,
>>>>>
>>>>> And its existence is a fact of life. Shapes exist, surfaces exist,
>>>>> drawing exist.
>>>>>
>>>>>> I cannot agree that there is no other way.
>>>>>
>>>>> Show the other way to implement and use M*N operations safely.
>>>>
>>>> Say you have an enum type listing M different shapes, and another
>>>> listing N different surfaces:
>>>>
>>>> * Create a 2D MxN array H of function pointers
>>>
>>> Failed. The requirement included different teams of developers and
>>> different times rows and columns (types) added.
>>
>> How is that relevant? Presumably all the necessary information (method
>> definitions or their interfaces) will be visible to the (presumably
>> AOT) compiler when the application is built?

> Nope. An unknown number of dynamically loaded shared libraries may
> contain types derived from Shape and/or Surface. The libraries are
> loaded during run-time per scanning a list of directories containing DLL
> plug-ins.

Then any approach using tables attached to class instances is going to
fail too, if every time someone makes a suggestion, you are going to
think up more unlikely scenarios to sabotage the idea.



>
>> Then who cares who wrote it or when.
>
> You must care. You must scan all code to determine N and M and find all
> functions for all slots. Then you must recompile everything.

That's what I said you have to do. a For static application which knows
all the shapes and all the surfaces.

Now you are saying you can NEVER do that because SOME applications have
shapes and surfaces defined at runtime.

Does that make sense? So I shouldn't buy a car to do shopping etc,
because I won't be able to drive from London to New York?

Anyway, when stuff like that happens at runtime, then it sounds more a
job for a dynamic language than a static one.

(In my 1990s application, modules of my script language could be
hot-loaded; commands and functionality could be added /while my
application was running/.)

>> Anyway who are you to say how my application will be organised.
>
> A provider of operations.
>
> The point is that in real projects nobody has control or knowledge about
> the types comprising dispatching tables. It is called late bindings for
> a reason.

This is not about types. It's about enumerations. What you're saying is
that I can't use a fixed enumeration like:

(Mon, Tue, Wed, Thu, Fri, Sat, Sun)

as the basis for a dispatch table, because new days of the week could be
created while the program runs.

> You may have your wet dreams about how you organize *your* application,
> but the fact is that nobody develops software this way for longer than
> three decades.

So, go back to that inheritance diagram of shapes and surfaces; you're
also saying that it's a waste of time for a compiler to even look at
that, because the real inheritance graph is not known until runtime?

In that case we might as well all pack up and go home.

Dmitry A. Kazakov

unread,
Sep 3, 2021, 5:42:34 PMSep 3
to
On 2021-09-03 22:48, Bart wrote:
> On 03/09/2021 21:25, Dmitry A. Kazakov wrote:

>> Nope. An unknown number of dynamically loaded shared libraries may
>> contain types derived from Shape and/or Surface. The libraries are
>> loaded during run-time per scanning a list of directories containing
>> DLL plug-ins.
>
> Then any approach using tables attached to class instances is going to
> fail too,

It works perfectly well with C++, Ada and dozens other OOPL. You can try
it yourself. Get a grip, you can have a type T in one library. Derive S
from T in another. Load that library at run-time. Create an object of
the type S. Pass it to a function in the first library and dispatch from
that function to a method overridden in the dynamically loaded library!

> if every time someone makes a suggestion, you are going to
> think up more unlikely scenarios to sabotage the idea.

What? Plug-ins are unlikely scenario? Do you live under a rock?

>>> Then who cares who wrote it or when.
>>
>> You must care. You must scan all code to determine N and M and find
>> all functions for all slots. Then you must recompile everything.
>
> That's what I said you have to do. a For static application which knows
> all the shapes and all the surfaces.
>
> Now you are saying you can NEVER do that because SOME applications have
> shapes and surfaces defined at runtime.

No they are defined at compile time in the components which know nothing
about each other. This is called loose coupling design principle:

https://en.wikipedia.org/wiki/Loose_coupling

>> The point is that in real projects nobody has control or knowledge
>> about the types comprising dispatching tables. It is called late
>> bindings for a reason.
>
> This is not about types. It's about enumerations.

You can treat tags of all types from some class as an enumeration.
Dispatching is equivalent to a switch statement over the tags.

One of the motivations behind OOPL was to allow alternatives of such a
monstrous switch to be located in different modules developed and
maintained independently on each other.

Operating systems used dispatch since very beginning. Even the term
dispatch is from there, I believe. Implementations of kernels written in
assembler dispatched from interrupt handlers to the driver code. And,
yes, surprise, drivers were dynamically loadable half a century ago.
OOPLs are there merely to support such software architectures.

Bart

unread,
Sep 3, 2021, 8:49:28 PMSep 3
to
On 03/09/2021 22:42, Dmitry A. Kazakov wrote:
> On 2021-09-03 22:48, Bart wrote:
>> On 03/09/2021 21:25, Dmitry A. Kazakov wrote:
>
>>> Nope. An unknown number of dynamically loaded shared libraries may
>>> contain types derived from Shape and/or Surface. The libraries are
>>> loaded during run-time per scanning a list of directories containing
>>> DLL plug-ins.
>>
>> Then any approach using tables attached to class instances is going to
>> fail too,
>
> It works perfectly well with C++, Ada and dozens other OOPL. You can try
> it yourself. Get a grip, you can have a type T in one library. Derive S
> from T in another. Load that library at run-time. Create an object of
> the type S. Pass it to a function in the first library and dispatch from
> that function to a method overridden in the dynamically loaded library!

So my H[shape,surface] table /can/ work?

You want to plot S, get X/Y as its shape tag, plus the desired surface
type, and call the handler. The difference is that building the handler
table is more dynamic than it was (if you remember, it was populated at
runtime anyway).

X and Y are normally linear enumarations; if not then H[,] becomes a
hashtable instead (with a key of 2 values).


>> if every time someone makes a suggestion, you are going to think up
>> more unlikely scenarios to sabotage the idea.
>
> What? Plug-ins are unlikely scenario? Do you live under a rock?

What do you think my hot-loaded modules were? Just crude versions of
plug-ins from 25 years ago.

You do like to bamboozle people don't you with yours seems to be to
bamboozle people with scenarios from giant, enterprise languages, as
though there's no place for anything else.

I want to create my own recipes in my own kitchen using my own methods.
You look down your nose at that because those methods aren't suitable
for industrial scale food production.

OK. But my own meals will still taste better for me!

There's a place for your languages, and there's a place for ones like mine.

>> Now you are saying you can NEVER do that because SOME applications
>> have shapes and surfaces defined at runtime.
>
> No they are defined at compile time in the components which know nothing
> about each other.

Well, I create my 1990s CAD application without any of these methods. I
seem to recall that it worked pretty well.

One problem that I recall was that it fitted easily onto a single floppy
disk. Since it was rather expensive product, it was shipped on CD to
make out was it was much bigger than it was! Even then people had gotten
used to bloated, cumbersome applications so that's what they expected.

Dmitry A. Kazakov

unread,
Sep 4, 2021, 4:27:21 AMSep 4
to
On 2021-09-04 02:49, Bart wrote:
> On 03/09/2021 22:42, Dmitry A. Kazakov wrote:
>> On 2021-09-03 22:48, Bart wrote:
>>> On 03/09/2021 21:25, Dmitry A. Kazakov wrote:
>>
>>>> Nope. An unknown number of dynamically loaded shared libraries may
>>>> contain types derived from Shape and/or Surface. The libraries are
>>>> loaded during run-time per scanning a list of directories containing
>>>> DLL plug-ins.
>>>
>>> Then any approach using tables attached to class instances is going
>>> to fail too,
>>
>> It works perfectly well with C++, Ada and dozens other OOPL. You can
>> try it yourself. Get a grip, you can have a type T in one library.
>> Derive S from T in another. Load that library at run-time. Create an
>> object of the type S. Pass it to a function in the first library and
>> dispatch from that function to a method overridden in the dynamically
>> loaded library!
>
> So my H[shape,surface] table /can/ work?

Dispatching tables work. It is the methods of creating and maintaining
them safely that is the issue.

> There's a place for your languages, and there's a place for ones like mine.

Nobody forces you to implement any feature. The grapes are sour.

James Harris

unread,
Sep 4, 2021, 1:43:36 PMSep 4
to
On 03/09/2021 16:46, Dmitry A. Kazakov wrote:
> On 2021-09-03 16:40, James Harris wrote:
>> On 03/09/2021 14:39, Dmitry A. Kazakov wrote:

...

>>> Hash functions are pretty quick.
>>
>> Hash functions are not fast enough to find a call target! I was/am
>> even concerned about the extra cost of following a double pointer.
>> Using a hash table would be slower still.
>
> Faster than anything the user would have to write otherwise, in order to
> emulate dispatch.

On the contrary, where the scheme I mentioned would have just two
dereferences to get the address of the required function ... your scheme
would have multiple dereferences, a function call, a hash calculation,
and possibly collision resolution.

>
> Then performance is never a big concern.

I'm really surprised that such a lookup would even be considered. It
would be much slower than a direct function call.


> Safety is. User emulations of
> dispatch would be nested if-statements, which are not only significantly
> slower O(n), but inherently error-prone and unmaintainable.

That's an interesting point from the point of view of performance. A
case structure would nominally be faster than sequential search but with
a potentially complex key such as your tag that case statement may well
have to do hash lookup or binary search. A simple index or a pointer
would be faster.

What is your objection to my proposed solution? To be clear, the idea is
to have the object's pointer to link directly to the method table as
follows.

object: |___ptr___|___field0___|___field1___| etc
|
V
|__type___|__reserved__|__ptr__|__ptr__|__ptr__|__ptr__|

I've shown one dimension but there could be as many as necessary for
multiple dispatch.

One area in which your scheme would do well is if there were a large,
sparse, multidimensional table of method pointers. But in that case the
routine I would dispatch to could still carry out its own lookup to find
the target function.

Having said all that, IMO the static dispatch that David mentioned would
be significantly faster than either of the above approaches, though I
gather from your point about closures that it can be impossible at
compile time to predict what the type will be.


>
>>> Then again, if you don't repeat C++ and Java error confusing
>>> class-wide and specific types, you would drastically reduce cases
>>> when dispatch ever happens. Most of the code deals with statically
>>> known specific types, ergo, no dispatch (or as some call it, "static
>>> dispatch").
>>
>> I'm not sure I understand that but if you think C++ and Java got it
>> wrong do you think Ada got it right and what did it do differently?
>
> If foo and baz are virtual functions of T then
>
>    void foo (T& x)
>    {
>       x.baz (); -- Dispatches
>    }

I didn't know what a virtual function was but AIUI it's a function which
has to be overridden in a descendant class. (?)

If that's right is the issue that Ada prohibits virtual functions?

>
> In Ada
>
>    procedure Foo (X : T) is
>    begin
>       X.Baz; -- Does not dispatch, the type is established to be T
>    end Foo;

So T.Baz has to be in T and cannot be overridden in a subclass of T?


--
James Harris

Dmitry A. Kazakov

unread,
Sep 4, 2021, 3:56:26 PMSep 4
to
On 2021-09-04 19:43, James Harris wrote:
> On 03/09/2021 16:46, Dmitry A. Kazakov wrote:
>> On 2021-09-03 16:40, James Harris wrote:
>>> On 03/09/2021 14:39, Dmitry A. Kazakov wrote:
>
> ...
>
>>>> Hash functions are pretty quick.
>>>
>>> Hash functions are not fast enough to find a call target! I was/am
>>> even concerned about the extra cost of following a double pointer.
>>> Using a hash table would be slower still.
>>
>> Faster than anything the user would have to write otherwise, in order
>> to emulate dispatch.
>
> On the contrary, where the scheme I mentioned would have just two
> dereferences to get the address of the required function ... your scheme
> would have multiple dereferences, a function call, a hash calculation,
> and possibly collision resolution.

You did not propose anything for the case of full dispatch. Certainly
specialized cases like single dispatch or multi-methods could be handled
differently.

The point was, if you do not support this in the language, the
programmer will have to implement it, which is always slower and always
error prone.

>> Then performance is never a big concern.
>
> I'm really surprised that such a lookup would even be considered. It
> would be much slower than a direct function call.

Again, what is the alternative?

GTK is written in C. C has no classes. So GTK emulates them using the
GObject library. Guess, what is faster, C++ implementation of dispatch
or the GObject kludge?

> What is your objection to my proposed solution? To be clear, the idea is
> to have the object's pointer to link directly to the method table as
> follows.
>
>   object:   |___ptr___|___field0___|___field1___| etc
>                  |
>                  V
>             |__type___|__reserved__|__ptr__|__ptr__|__ptr__|__ptr__|

Show how this is supposed to work with full multiple dispatch, with
multi-methods, with multiple inheritance, with constructors and
destructors, with scalar types.

BTW, there is a book by Bjarne Stroustrup with a detailed discussion
regarding implementation of objects and dispatching tables (virtual
tables) in C++.

https://www.stroustrup.com/arm.html

I suggest you read it first in order to get an elementary understanding
of the problematic. And keep in mind that C++ has single dispatch.

> I've shown one dimension but there could be as many as necessary for
> multiple dispatch.

No, because there will be many objects in the call.

> Having said all that, IMO the static dispatch that David mentioned would
> be significantly faster than either of the above approaches, though I
> gather from your point about closures that it can be impossible at
> compile time to predict what the type will be.

Static dispatch is just a direct call, provided you designed the
language properly.

>>>> Then again, if you don't repeat C++ and Java error confusing
>>>> class-wide and specific types, you would drastically reduce cases
>>>> when dispatch ever happens. Most of the code deals with statically
>>>> known specific types, ergo, no dispatch (or as some call it, "static
>>>> dispatch").
>>>
>>> I'm not sure I understand that but if you think C++ and Java got it
>>> wrong do you think Ada got it right and what did it do differently?
>>
>> If foo and baz are virtual functions of T then
>>
>>     void foo (T& x)
>>     {
>>        x.baz (); -- Dispatches
>>     }
>
> I didn't know what a virtual function was but AIUI it's a function which
> has to be overridden in a descendant class. (?)

"Virtual function" is C++ term for method.

> If that's right is the issue that Ada prohibits virtual functions?

No, Ada's name for virtual function is "primitive operation."

>> In Ada
>>
>>     procedure Foo (X : T) is
>>     begin
>>        X.Baz; -- Does not dispatch, the type is established to be T
>>     end Foo;
>
> So T.Baz has to be in T and cannot be overridden in a subclass of T?

X.Baz does not dispatch in Ada. In C++ X.baz () it does dispatch.

Bart

unread,
Sep 4, 2021, 4:19:05 PMSep 4
to
[Original DAK post lost]

DAK:

On 2021-09-04 02:49, Bart wrote:


>> There's a place for your languages, and there's a place for ones
like mine.

> Nobody forces you to implement any feature. The grapes are sour.


There is a certain amount of satisfaction in having, over exactly 40
years now, used almost exclusively my own languages, implementations and
tools, and all self-hosted.

One key to doing that is to be selective about what features I've added.
I've kept things simple, managable and understandable. Which is not
always easy.

Your (DAK's) role seems to be the opposite: turn everything into
something harder, less manageable, and incomprehensible!

James Harris

unread,
Sep 4, 2021, 4:44:57 PMSep 4
to
On 30/08/2021 13:08, Bart wrote:
> On 30/08/2021 11:23, James Harris wrote:

...

>>    P.plot()

...

> Why, in your example, wouldn't the compiler know the type of P?

The compiler would know if P were created and manipulated wholly by the
current function but P could be a formal parameter which could be one of
a number of types. I can think of two cases. One is inheritance

type A
|
type B

then a parameter declared to be of type A could include a function from
either type A or type B. (Or is it the other way round?)

function f(A P)

A given activation of function f could be required to use either A's or
B's version of a function.

The other is a variant

typedef C = int or float

function f(C P)

A given activation of function f could have been passed an int or a float.


>
> Do you have a more elaborate example that shows off the advantages of
> OO, or whatever you are trying to do, that would be too clunky in C?
>

I'm certainly not trying to promote OOP. I have serious doubts about the
inheritance part of it.


--
James Harris

Dmitry A. Kazakov

unread,
Sep 4, 2021, 5:01:01 PMSep 4
to
Right, different priorities. Simplicity for the compiler designer
translates into complexity for the programmer.

Bart

unread,
Sep 4, 2021, 5:35:47 PMSep 4
to
On 04/09/2021 21:44, James Harris wrote:
> On 30/08/2021 13:08, Bart wrote:
>> On 30/08/2021 11:23, James Harris wrote:
>
> ...
>
>>>    P.plot()
>
> ...
>
>> Why, in your example, wouldn't the compiler know the type of P?
>
> The compiler would know if P were created and manipulated wholly by the
> current function but P could be a formal parameter which could be one of
> a number of types. I can think of two cases. One is inheritance
>
>   type A
>     |
>   type B
>
> then a parameter declared to be of type A could include a function from
> either type A or type B. (Or is it the other way round?)
>
>   function f(A P)
>
> A given activation of function f could be required to use either A's or
> B's version of a function.

I didn't understand that at all. I guess inheritance is not for me.

I have played with chains of derived types like this:

A -> B -> C -> D
X -> Y -> Z

Then had to decide what to do about C+Y or even A+B*D, since some ops
may be inherited, some overloaded, and what do about auto coercions in
D:=A or Y:=A.

So, I don't have distinct derived types; only aliases for the same type,
to make like simpler.


> The other is a variant
>
>   typedef C = int or float
>
>   function f(C P)
>
> A given activation of function f could have been passed an int or a float.

Here it's easy to understand your intention. But possibly more
challenging to implement.

That looks a bit like a Haskell type, but if you're not implementing
that, would f be a generic function? Or would C be a variant type along
the lines of C++'s? Will there be overloads of f()?

Just getting the basics working (eg. A+B if both are of type C) would be
challenging here.

So perhaps pin down more precisely what features you're thinking of having.

Bart

unread,
Sep 4, 2021, 5:37:12 PMSep 4
to
You might want to look back at the fragments of my language that I've
posted. Such as 'print x'.

Dmitry A. Kazakov

unread,
Sep 5, 2021, 3:32:51 AMSep 5
to
I do not need it. But if you ask, print x with the sign as a ternary
number aligned at the right margin of the field of 10 character width,
and do not forget it must be superscript.

Dmitry A. Kazakov

unread,
Sep 5, 2021, 3:58:07 AMSep 5
to
On 2021-09-04 23:35, Bart wrote:

>> The other is a variant
>>
>>    typedef C = int or float
>>
>>    function f(C P)
>>
>> A given activation of function f could have been passed an int or a
>> float.
>
> Here it's easy to understand your intention. But possibly more
> challenging to implement.

This is quite easy, since no inheritance or [multi-]methods involved.
[f() is not a method]

> That looks a bit like a Haskell type,

C is a class-wide type of the class {int, float}.

> but if you're not implementing
> that, would f be a generic function?

f() is a class-wide function.

> Or would C be a variant type along
> the lines of C++'s?

One can imagine it as a variant type with the tag selecting the value.
The values of C would be like these:

(int'tag, 123)
(float'tag, 3.1415)

The representation of tags is irrelevant in this case.

> Will there be overloads of f()?

Why not? If you rather meant overriding, then no, only methods can be
overridden.

> Just getting the basics working (eg. A+B if both are of type C) would be
> challenging here.

function "+" (Left, Right : C) return C is
begin
if Left in int then -- Left.tag = int'tag
if Right in int then
return (int'tag, int (Left) + int (Right);
else
return (float'tag, float (int (Left)) + float (Right);
end if;
else
if Right in int then
return (float'tag, float (Left) + float (int (Right));
else
return (float'tag, float (Left) + float (Right);
end if;
end if;
end "+";

int (Left) is what C++ calls dynamic type cast. It checks the type tag
to be int'tag and then returns the integer value.

Here we see the an obvious problem, the above is nothing but a manual
implementation of a dispatch on the multi-method. "+" should have been a
multi-method, not a class-wide operation. Then each choice in the
cascaded ifs would be a separate overriding/inheriting body.

Bart

unread,
Sep 5, 2021, 6:24:38 AMSep 5
to
I can manage this:

a:=730

println a
fprintln "|#|", a:"x3 10 jr"

This shows:

730
| 1000001|

I can't do much about superscript; that depends on the capabilities of
the console window I'm using.

In any case I consider aspects of typeface, style, size, colour etc
outside the remit of 'print' which is mainly concerned with turning
expressions into plain text, eg. serialising.

You might as well ask whether the output of a JSON or XML converter will
render strings as superscript.

But, show us all how it's done in Ada, and in a way that will work on my PC.

(I know have an Ada compiler installed. I've added it to one of my
benchmarks here:
https://github.com/sal55/langs/blob/master/Compilertest1.md. Ada (Gnat)
is just over halfway down.)

Dmitry A. Kazakov

unread,
Sep 5, 2021, 6:54:35 AMSep 5
to
Looks like broken checksum rather than a program...

> This shows:
>
>     730
>     |   1000001|

Where is the +?

> I can't do much about superscript; that depends on the capabilities of
> the console window I'm using.

Windows console supports UTF-8. See the code page command CHCP.

Bart

unread,
Sep 5, 2021, 7:15:11 AMSep 5
to
On 05/09/2021 11:54, Dmitry A. Kazakov wrote:
> On 2021-09-05 12:24, Bart wrote:

>>
>>      730
>>      |   1000001|
>
> Where is the +?


Oops, I forgot. Try:

println a : "+x3 10 jr"

I've dispensed with the format which added the |...| delimiters.

If you don't like the way those attributes are specified, then please
post how it's done in Ada.

If you like a more long-winded style, then I could use this (dynamic code):

a:=730

println a:fmt(dosign:"+", width:10, base:3, just:"right")

....

function fmt(width=0, dosign="", just="", base=10)=
s:=dosign
if width then s+:=tostr(width) fi
if base<>10 then s+:="x"+tostr(base) fi
if just then s+:="j"+leftstr(just) fi
return s
end



>> I can't do much about superscript; that depends on the capabilities of
>> the console window I'm using.
>
> Windows console supports UTF-8. See the code page command CHCP.

It might do, but it doesn't belong in the serialisation options of a
single print item. If superscript can be turned on and off by string
escape sequences, then my example becomes:

println sson, a:"x3...", ssoff

But this is no longer within the language.


James Harris

unread,
Sep 9, 2021, 1:25:06 PMSep 9
to
On 04/09/2021 22:35, Bart wrote:
> On 04/09/2021 21:44, James Harris wrote:
>> On 30/08/2021 13:08, Bart wrote:
>>> On 30/08/2021 11:23, James Harris wrote:
>>
>> ...
>>
>>>>    P.plot()
>>
>> ...
>>
>>> Why, in your example, wouldn't the compiler know the type of P?
>>
>> The compiler would know if P were created and manipulated wholly by
>> the current function but P could be a formal parameter which could be
>> one of a number of types. I can think of two cases. One is inheritance
>>
>>    type A
>>      |
>>    type B
>>
>> then a parameter declared to be of type A could include a function
>> from either type A or type B. (Or is it the other way round?)
>>
>>    function f(A P)
>>
>> A given activation of function f could be required to use either A's
>> or B's version of a function.
>
> I didn't understand that at all. I guess inheritance is not for me.

Inheritance is maybe not for /anyone/! It encourages composition by tall
hierarchies rather than by flattening and simplifying but that's another
matter. The model I had in mind for that example is a simple one:

An object of type A has fields Aa, Ab and Ac

An object of type B has all of A's fields and some of its own.

As long as both types of object have A's fields at the same offsets then
code which expects an A can still work on A's fields and doesn't care
whether the object is really of type A or of type B.

...

>> The other is a variant
>>
>>    typedef C = int or float
>>
>>    function f(C P)
>>
>> A given activation of function f could have been passed an int or a
>> float.
>
> Here it's easy to understand your intention. But possibly more
> challenging to implement.

Could function f not just use a switch? As in

switch P.tag
case int
/* int processing */
case float
/* float processing */
end switch

(That switch could either be explicit in the source code or implicit and
generated by the compiler.)

>
> That looks a bit like a Haskell type, but if you're not implementing
> that, would f be a generic function? Or would C be a variant type along
> the lines of C++'s? Will there be overloads of f()?

To answer your questions: in the example, I had in mind C being a
variant type; f could be a simple function and would not have to be
overloaded.

I read a nice, simple description the other day:

* Polymorphism is about one function handling multiple types.

* Generics is about producing a custom function for each type.

One could add a variant type (and a switch on the type) to that in order
to achieve the same effect.

>
> Just getting the basics working (eg. A+B if both are of type C) would be
> challenging here.
>
> So perhaps pin down more precisely what features you're thinking of having.
>

Frankly, I'd like the programmer to be able to work purely in terms of
the logic of his algorithm and for the compiler to pick the implementation.

I am looking into various options but there are plenty of 'em!


--
James Harris

Bart

unread,
Sep 9, 2021, 2:10:19 PMSep 9
to
There is also the case where B overrides some fields of A. They might be
of different types and sizes from A's, which makes keeping them at the
same offsets hard, and cause other difficulties in using a value X which
can be of either A or B types.

> As long as both types of object have A's fields at the same offsets then
> code which expects an A can still work on A's fields and doesn't care
> whether the object is really of type A or of type B.

In my dynamic language, none of these matters. The offsets can be
different, as they are picked up at runtime; and the types can be
different anyway.

But I ran into trouble with static code.


> ...
>
>>> The other is a variant
>>>
>>>    typedef C = int or float
>>>
>>>    function f(C P)
>>>
>>> A given activation of function f could have been passed an int or a
>>> float.
>>
>> Here it's easy to understand your intention. But possibly more
>> challenging to implement.
>
> Could function f not just use a switch?

What, for each reference to P inside the body? What about when you have
the expression P+Q+R, each of which has type C; will you need an 8-way
switch, or ... ?


As in
>
>   switch P.tag
>     case int
>       /* int processing */
>     case float
>       /* float processing */
>   end switch

... would the switch be on entry to the function, where it looks at all
parameter combinations and then calls a specific instantiation for that?
That would just be generics, and would preferably be done at a
compile-time by a compiler that is aware of the actual types.

> (That switch could either be explicit in the source code or implicit and
> generated by the compiler.)

>>
>> That looks a bit like a Haskell type, but if you're not implementing
>> that, would f be a generic function? Or would C be a variant type
>> along the lines of C++'s? Will there be overloads of f()?
>
> To answer your questions: in the example, I had in mind C being a
> variant type; f could be a simple function and would not have to be
> overloaded.
>
> I read a nice, simple description the other day:
>
> * Polymorphism is about one function handling multiple types.
>
> * Generics is about producing a custom function for each type.
>
> One could add a variant type (and a switch on the type) to that in order
> to achieve the same effect.

I use type-dispatching in dynamic code; I have difficulty seeing how it
could be applied efficiently to static code.

How about looking at an expression such as A:=B+C*D, where these are all
variants, and mocking up what generate code might look like. Try also
X[i] where X has variant elements, and i might be variant too.

Or maybe write some C++ code and see what the compiler generates.



> Frankly, I'd like the programmer to be able to work purely in terms of
> the logic of his algorithm and for the compiler to pick the implementation.

You might be on the verge of switching to a dynamically typed language!

James Harris

unread,
Sep 9, 2021, 2:32:04 PMSep 9
to
On 09/09/2021 19:10, Bart wrote:
> On 09/09/2021 18:25, James Harris wrote:
>> On 04/09/2021 22:35, Bart wrote:
>>> On 04/09/2021 21:44, James Harris wrote:

...

>>>>    type A
>>>>      |
>>>>    type B
>>>>
>>>> then a parameter declared to be of type A could include a function
>>>> from either type A or type B. (Or is it the other way round?)
>>>>
>>>>    function f(A P)
>>>>
>>>> A given activation of function f could be required to use either A's
>>>> or B's version of a function.
>>>
>>> I didn't understand that at all. I guess inheritance is not for me.
>>
>> Inheritance is maybe not for /anyone/! It encourages composition by
>> tall hierarchies rather than by flattening and simplifying but that's
>> another matter. The model I had in mind for that example is a simple one:
>>
>> An object of type A has fields Aa, Ab and Ac
>>
>> An object of type B has all of A's fields and some of its own.
>
> There is also the case where B overrides some fields of A. They might be
> of different types and sizes from A's, which makes keeping them at the
> same offsets hard, and cause other difficulties in using a value X which
> can be of either A or B types.

AIUI in a statically typed language data types are typically not
changed. (They may be in a language which has dynamic typing.) Only the
functions will be overrdden, not the data type. So the offsets would not
change, and a type B could appear to be a type A.

>
>> As long as both types of object have A's fields at the same offsets
>> then code which expects an A can still work on A's fields and doesn't
>> care whether the object is really of type A or of type B.
>
> In my dynamic language, none of these matters. The offsets can be
> different, as they are picked up at runtime; and the types can be
> different anyway.
>
> But I ran into trouble with static code.

Did you try to change the types?

>
>>>> The other is a variant
>>>>
>>>>    typedef C = int or float
>>>>
>>>>    function f(C P)
>>>>
>>>> A given activation of function f could have been passed an int or a
>>>> float.
>>>
>>> Here it's easy to understand your intention. But possibly more
>>> challenging to implement.
>>
>> Could function f not just use a switch?
>
> What, for each reference to P inside the body? What about when you have
> the expression P+Q+R, each of which has type C; will you need an 8-way
> switch, or ... ?
>

If those are integers I'd have thought /you/ would just expand them all
up ... to 64 bits!

A more typical use than numbers would probably be nodes of different types.

...

> I use type-dispatching in dynamic code; I have difficulty seeing how it
> could be applied efficiently to static code.
>
> How about looking at an expression such as A:=B+C*D, where these are all
> variants, and mocking up what generate code might look like. Try also
> X[i] where X has variant elements, and i might be variant too.

1. Convert i to int.
2. Get the address of the element.

What is needed next will be the same as operating on a scalar.

>
> Or maybe write some C++ code and see what the compiler generates.

Why would I want to do that?!

>
>
>
>> Frankly, I'd like the programmer to be able to work purely in terms of
>> the logic of his algorithm and for the compiler to pick the
>> implementation.
>
> You might be on the verge of switching to a dynamically typed language!
>

Not me!


--
James Harris

James Harris

unread,
Sep 9, 2021, 3:39:08 PMSep 9
to
On 03/09/2021 18:27, Dmitry A. Kazakov wrote:
> On 2021-09-03 18:43, James Harris wrote:
>> On 03/09/2021 16:28, Dmitry A. Kazakov wrote:
>>> On 2021-09-03 16:59, James Harris wrote:
>>>
>>>> Why not say that each surface has to provide a set of primitives
>>>> that other programs can use to draw on an instance of that type of
>>>> surface?
>>>
>>> Because that set is open ended. OOP allows you
>>>
>>> 1. adding new primitives (new shapes)
>>>
>>> 2. using primitives with new surfaces (inheritance)
>>>
>>> All counter proposals are static and do not scale in practice, which
>>> is why OOP is used so widely in large projects regardless any
>>> difficulties. There is simple no other way to do it, economically,
>>> there is no other way.
>>>
>>
>> Your example was good in showing multiple dispatch but I'm not so sure
>> it's a good solution in this case, let alone the only way!
>
> Of course it is. Write a program that renders a list of shapes in a
> widget in a way that you would not need to modify each time another
> developers team creates a new shape or supports another surface.

Sure. I would have surfaces support different interfaces: pixmap, vector
and whatever else was needed, and have a basic set of component shapes
that surfaces could draw on themselves: lines, triangles, rectangles,
ellipses, etc, and bit blit. Then widgets could then use those
primitives to do all required drawing of basic and complex shapes.

There is no need that I can see for the complexity or the lack of
scalability that you seem to be proposing.

>
>> The rights and wrongs of the approach are OT so all I'll say is that
>> M*N does not scale and that new shapes can be constructed from
>> primitives.
>
> You did not thought it through. A set of primitives supported by all
> shapes is already a class. This is exactly what a class of shapes is.
> Any client of the class expects to find an implementation of any
> primitive in any surface.

I don't know what that means but composite shapes can be made up of
primitive ones. I suspect, in fact, that that's how rendering engines
work: a set of primitives that the hardware or firmware knows how to
render and widget which know how to use those primitives to draw
everything they want to.

>
> Remove the class, you will need to write a new client for each surface.

I disagree. As above, I'd use a common interface.

>
>> Unless I am missing something the cartesian M*N would be costly and so
>> even economically, as you mention,
>
> And its existence is a fact of life. Shapes exist, surfaces exist,
> drawing exist.
>
>> I cannot agree that there is no other way.
>
> Show the other way to implement and use M*N operations safely.

See above.


--
James Harris

James Harris

unread,
Sep 9, 2021, 4:30:27 PMSep 9
to
On 04/09/2021 20:56, Dmitry A. Kazakov wrote:
> On 2021-09-04 19:43, James Harris wrote:
>> On 03/09/2021 16:46, Dmitry A. Kazakov wrote:
>>> On 2021-09-03 16:40, James Harris wrote:
>>>> On 03/09/2021 14:39, Dmitry A. Kazakov wrote:
>>
>> ...
>>
>>>>> Hash functions are pretty quick.
>>>>
>>>> Hash functions are not fast enough to find a call target! I was/am
>>>> even concerned about the extra cost of following a double pointer.
>>>> Using a hash table would be slower still.
>>>
>>> Faster than anything the user would have to write otherwise, in order
>>> to emulate dispatch.
>>
>> On the contrary, where the scheme I mentioned would have just two
>> dereferences to get the address of the required function ... your
>> scheme would have multiple dereferences, a function call, a hash
>> calculation, and possibly collision resolution.
>
> You did not propose anything for the case of full dispatch. Certainly
> specialized cases like single dispatch or multi-methods could be handled
> differently.

If by 'full dispatch' you mean selecting a function based on the types
of all of its arguments then I don't remember you proposing a mechanism
for it. Yes, you spoke about something that could be used in toy
situations where you had two or three parameters and a handful of types
but it would not scale. If you had just five parameters which could be
of a dozen types. You would need a dispatching table of 5^12 or 244
million pointers to who knows how many functions. That is fantasyland.

>
> The point was, if you do not support this in the language, the
> programmer will have to implement it, which is always slower and always
> error prone.

I agree that there are things the language should do for the programmer.
Where I am less certain is the multiple-dispatch /concept/ you want to
compile.

>
>>> Then performance is never a big concern.
>>
>> I'm really surprised that such a lookup would even be considered. It
>> would be much slower than a direct function call.
>
> Again, what is the alternative?

I have some ideas but I'd need to see some examples of what the problems
really are before coming up with a concrete proposal.

>
> GTK is written in C. C has no classes. So GTK emulates them using the
> GObject library. Guess, what is faster, C++ implementation of dispatch
> or the GObject kludge?

Straw man!

https://en.wikipedia.org/wiki/Straw_man

Your argument reminds me of how a certain graphics system got its
reputation completely trashed (unfairly, I gather) by a slow
implementation.

>
>> What is your objection to my proposed solution? To be clear, the idea
>> is to have the object's pointer to link directly to the method table
>> as follows.
>>
>>    object:   |___ptr___|___field0___|___field1___| etc
>>                   |
>>                   V
>>              |__type___|__reserved__|__ptr__|__ptr__|__ptr__|__ptr__|
>
> Show how this is supposed to work with full multiple dispatch, with
> multi-methods, with multiple inheritance, with constructors and
> destructors, with scalar types.

Those are /your/ desires. I am not sure I would have all those because
as I've said before the fundamental design which you are espousing does
not scale. It's possible that a rethink of the design is needed.

I /would/ support hooks such as constructors and destructors and I can
see a need to support multiple /interfaces/ but I am not sold on either
inheritance or the full dispatch that you keep talking about.

>
> BTW, there is a book by Bjarne Stroustrup with a detailed discussion
> regarding implementation of objects and dispatching tables (virtual
> tables) in C++.
>
>    https://www.stroustrup.com/arm.html
>
> I suggest you read it first in order to get an elementary understanding
> of the problematic.

If I ever start to think like Stroustrup I will have failed.


> And keep in mind that C++ has single dispatch.

OK.

...

> "Virtual function" is C++ term for method.

OK, though I cannot imagine what's virtual about a normal method.

>
>> If that's right is the issue that Ada prohibits virtual functions?
>
> No, Ada's name for virtual function is "primitive operation."

Whew! ;-)

>
>>> In Ada
>>>
>>>     procedure Foo (X : T) is
>>>     begin
>>>        X.Baz; -- Does not dispatch, the type is established to be T
>>>     end Foo;
>>
>> So T.Baz has to be in T and cannot be overridden in a subclass of T?
>
> X.Baz does not dispatch in Ada. In C++ X.baz () it does dispatch.
>

I am confused by your terms. When you say 'dispatch' do you mean
dispatch at run time aka dynamic dispatch

https://en.wikipedia.org/wiki/Dynamic_dispatch

as opposed to dispatch at compile time or static dispatch

https://en.wikipedia.org/wiki/Static_dispatch

?


--
James Harris

Bart

unread,
Sep 9, 2021, 5:05:02 PMSep 9
to
Where is that written?

I've just tried a bit of C++ online (this example:
https://www.w3schools.com/cpp/cpp_inheritance.asp). I was able to change
the type of 'brand' within derived class 'Vehicle'.

No errors were reported, but it either didn't work properly, or it crashed.

And while the methods could be overridden, I could change the function
signatures (eg. add extra params) and here it worked.

I also created a function taking the base class. I was able to pass both
an instance of the base class, or of the derived class. It worked, but
treated it as a base class (it would call the .honk method of the base
class).

If the function instead took the derived class, I couldn't pass it an
instance of a base class.

So there there seem to certain rules about what can and what can't be done.

>>
>>> As long as both types of object have A's fields at the same offsets
>>> then code which expects an A can still work on A's fields and doesn't
>>> care whether the object is really of type A or of type B.
>>
>> In my dynamic language, none of these matters. The offsets can be
>> different, as they are picked up at runtime; and the types can be
>> different anyway.
>>
>> But I ran into trouble with static code.
>
> Did you try to change the types?

I can't remember my tests. But then I wasn't aware of the rules that OOP
needs to follow to make this stuff work.

> If those are integers I'd have thought /you/ would just expand them all
> up ... to 64 bits!

A variant type can presumably be any mix of types. And if there was one
obvious dominant type, you wouldn't need a variant; you'd use that type.

> A more typical use than numbers would probably be nodes of different types.
>
> ...
>
>> I use type-dispatching in dynamic code; I have difficulty seeing how
>> it could be applied efficiently to static code.
>>
>> How about looking at an expression such as A:=B+C*D, where these are
>> all variants, and mocking up what generate code might look like. Try
>> also X[i] where X has variant elements, and i might be variant too.
>
> 1. Convert i to int.
> 2. Get the address of the element.
>
> What is needed next will be the same as operating on a scalar.

Exactly how different can variant types be? Maybe X is an int, and maybe
X is an array of structs. Or an array of Variants. What are the limitations?

Staying with my X[i] where X's type is array of Variant elements, they
could be ints, floats or strings. And maybe i is a 32-int, or 64-bit
int, or a float.

You need to analyse it each time is is accessed to see if conversion is
needed.

Also, Variants could presumably be assigned something else of a
different type? Then you can't even assume it will stay the same type.


>>
>> Or maybe write some C++ code and see what the compiler generates.
>
> Why would I want to do that?!

To see what limitations it places on the possibilities. Or maybe there
are no limitations and the resulting code is horrendous.

I don't quite understand why you seem to think that dealing with variant
types is straightforward.


>>> Frankly, I'd like the programmer to be able to work purely in terms
>>> of the logic of his algorithm and for the compiler to pick the
>>> implementation.
>>
>> You might be on the verge of switching to a dynamically typed language!
>>
>
> Not me!

If you have a fully flexible Variant type with no limitations, then you
will de facto have a language that supports dynamic typing.

The main difference is needing to specify the set of types it can
assume, which creates an extra difficulty, for example if assigning:

A := B

where A has type V1 and B has type V2, which are known at compile-time
to have some common types, but you still need a runtime check.

Dmitry A. Kazakov

unread,
Sep 10, 2021, 3:08:32 AMSep 10
to
Who calls these? Your rendering program?

You confuse implementation with means of. Rendering primitives are means
to implement the procedure Draw you would call from your program for
each element from the list shapes. The classic OO way:

for Shape in Shapes_List loop
Shape.Draw (Widget.Get_Redenring Context);
-- Dispatches to the Shape's Draw
end loop;

Your way?

>>> The rights and wrongs of the approach are OT so all I'll say is that
>>> M*N does not scale and that new shapes can be constructed from
>>> primitives.
>>
>> You did not thought it through. A set of primitives supported by all
>> shapes is already a class. This is exactly what a class of shapes is.
>> Any client of the class expects to find an implementation of any
>> primitive in any surface.
>
> I don't know what that means but composite shapes can be made up of
> primitive ones. I suspect, in fact, that that's how rendering engines
> work: a set of primitives that the hardware or firmware knows how to
> render and widget which know how to use those primitives to draw
> everything they want to.

Again, rendering primitives is just another higher-level language, like
OpenGL. This changes nothing. You still have to implement Draw in this
language for this type of shape.

>> Remove the class, you will need to write a new client for each surface.
>
> I disagree. As above, I'd use a common interface.

for Shape in Shapes_List loop

!!! Fill the gap !!!

end loop;

>>> Unless I am missing something the cartesian M*N would be costly and
>>> so even economically, as you mention,
>>
>> And its existence is a fact of life. Shapes exist, surfaces exist,
>> drawing exist.
>>
>>> I cannot agree that there is no other way.
>>
>> Show the other way to implement and use M*N operations safely.
>
> See above.

Nothing to see, so far.

Dmitry A. Kazakov

unread,
Sep 10, 2021, 3:42:36 AMSep 10
to
On 2021-09-09 22:30, James Harris wrote:
> On 04/09/2021 20:56, Dmitry A. Kazakov wrote:
>> On 2021-09-04 19:43, James Harris wrote:
>>> On 03/09/2021 16:46, Dmitry A. Kazakov wrote:
>>>> On 2021-09-03 16:40, James Harris wrote:
>>>>> On 03/09/2021 14:39, Dmitry A. Kazakov wrote:
>>>
>>> ...
>>>
>>>>>> Hash functions are pretty quick.
>>>>>
>>>>> Hash functions are not fast enough to find a call target! I was/am
>>>>> even concerned about the extra cost of following a double pointer.
>>>>> Using a hash table would be slower still.
>>>>
>>>> Faster than anything the user would have to write otherwise, in
>>>> order to emulate dispatch.
>>>
>>> On the contrary, where the scheme I mentioned would have just two
>>> dereferences to get the address of the required function ... your
>>> scheme would have multiple dereferences, a function call, a hash
>>> calculation, and possibly collision resolution.
>>
>> You did not propose anything for the case of full dispatch. Certainly
>> specialized cases like single dispatch or multi-methods could be
>> handled differently.
>
> If by 'full dispatch' you mean selecting a function based on the types
> of all of its arguments then I don't remember you proposing a mechanism
> for it.

Single dispatch = single controlling argument
Milti-method = several controlling arguments from the same hierarchy
Full multiple dispatch = several controlling arguments from any hierarchies

> Yes, you spoke about something that could be used in toy
> situations where you had two or three parameters and a handful of types
> but it would not scale. If you had just five parameters which could be
> of a dozen types. You would need a dispatching table of 5^12 or 244
> million pointers to who knows how many functions. That is fantasyland.

This is the requirement. Either way, manually or with the compiler
support, the programmers must do that. You cannot tell the customer, it
is too many shapes and too many surfaces you put in the requirements. It
is exactly as many as the customer requested.

>>>> Then performance is never a big concern.
>>>
>>> I'm really surprised that such a lookup would even be considered. It
>>> would be much slower than a direct function call.
>>
>> Again, what is the alternative?
>
> I have some ideas but I'd need to see some examples of what the problems
> really are before coming up with a concrete proposal.

See above. Single dispatch, multi-method, full multiple dispatch, it is
all you need to know.

>> GTK is written in C. C has no classes. So GTK emulates them using the
>> GObject library. Guess, what is faster, C++ implementation of dispatch
>> or the GObject kludge?
>
> Straw man!
>
>   https://en.wikipedia.org/wiki/Straw_man
>
> Your argument reminds me of how a certain graphics system got its
> reputation completely trashed (unfairly, I gather) by a slow
> implementation.

Irrelevant now as you already agreed with me on the principle, compiler
implementation is always better.

>>> What is your objection to my proposed solution? To be clear, the idea
>>> is to have the object's pointer to link directly to the method table
>>> as follows.
>>>
>>>    object:   |___ptr___|___field0___|___field1___| etc
>>>                   |
>>>                   V
>>>              |__type___|__reserved__|__ptr__|__ptr__|__ptr__|__ptr__|
>>
>> Show how this is supposed to work with full multiple dispatch, with
>> multi-methods, with multiple inheritance, with constructors and
>> destructors, with scalar types.
>
> Those are /your/ desires.

The grapes are sour...

> I /would/ support hooks such as constructors and destructors

Then you must confront the issue of dispatching from there.

[Just try to understand that the world is much more complex than you
imagine. Talking about fantasy lands...]

>> BTW, there is a book by Bjarne Stroustrup with a detailed discussion
>> regarding implementation of objects and dispatching tables (virtual
>> tables) in C++.
>>
>>     https://www.stroustrup.com/arm.html
>>
>> I suggest you read it first in order to get an elementary
>> understanding of the problematic.
>
> If I ever start to think like Stroustrup I will have failed.

You may think anything about him, but in the book he (and his co-writer)
discusses the technical issues of implementation of dispatching tables
specifically for single dispatch and full multiple inheritance. It is
very instructive, written in plain language with examples. It will help
you understand the basics. Since, from your proposals and questions, I
see that you still do not.

>> "Virtual function" is C++ term for method.
>
> OK, though I cannot imagine what's virtual about a normal method.

Compare this with "virtual memory" when the memory address goes through
the address table to get the physical address. So the method goes
through the virtual (dispatching) table to get the implementation of for
the specific type.

>>>> In Ada
>>>>
>>>>     procedure Foo (X : T) is
>>>>     begin
>>>>        X.Baz; -- Does not dispatch, the type is established to be T
>>>>     end Foo;
>>>
>>> So T.Baz has to be in T and cannot be overridden in a subclass of T?
>>
>> X.Baz does not dispatch in Ada. In C++ X.baz () it does dispatch.
>
> I am confused by your terms. When you say 'dispatch' do you mean
> dispatch at run time aka dynamic dispatch
>
>   https://en.wikipedia.org/wiki/Dynamic_dispatch
>
> as opposed to dispatch at compile time or static dispatch
>
>   https://en.wikipedia.org/wiki/Static_dispatch

Yes, I mean dispatch as using the dispatching table = dynamic
polymorphism. Static polymorphism is of no interest here.

luserdroog

unread,
Sep 20, 2021, 11:08:52 PMSep 20
to
On Monday, August 30, 2021 at 4:47:58 AM UTC-5, James Harris wrote:
> On 29/08/2021 20:32, Dmitry A. Kazakov wrote:
> > On 2021-08-29 20:24, James Harris wrote:
> >
> >> Ignoring the length part (as some objects would have their own length
> >> and some would not but that difference is immaterial here) how would a
> >> 'tag' system be laid out in memory and how would one use that to
> >> locate a method?
> >
> > You must consider the following inheritance variants:
> ...
>
> Thanks for the information, Dmitry. I might come back to you about some
> of that. But for this thread I was wondering how the memory might be
> laid out under the tag model that you mentioned. What form does a tag
> take and how does the runtime get from the tag to a method?
>
> If it depends on which model is in use then what about single
> inheritance and the Full MI model you mentioned?

For TMI on tags and tagging, see Representing Type Information in
Dynamically Typed Languages by David Gudeman.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.4394&rep=rep1&type=pdf

James Harris

unread,
Sep 22, 2021, 10:24:12 AMSep 22
to
On 09/09/2021 22:04, Bart wrote:
> On 09/09/2021 19:32, James Harris wrote:
>> On 09/09/2021 19:10, Bart wrote:
>>> On 09/09/2021 18:25, James Harris wrote:
>>>> On 04/09/2021 22:35, Bart wrote:
>>>>> On 04/09/2021 21:44, James Harris wrote:

...

>>>>>>    type A
>>>>>>      |
>>>>>>    type B

...

>>>> An object of type B has all of A's fields and some of its own.

...

>> So the offsets
>> would not change, and a type B could appear to be a type A.
>
> Where is that written?

Try the following set of slides. They are part of a compilers course and
show how such objects could be implemented, including how they could be
laid out in memory.

http://openclassroom.stanford.edu/MainFolder/courses/Compilers/docs/slides/12-06-object-layout-annotated.pdf

Keeping the fields is a natural consequence of the "is a" relationship
when using static typing. An object of type B _is_ also an object of
type A so it has to be usable wherever an object of type A is usable
and/or has been declared. That's the case even if type B provides new
methods to manipulate A's fields or adds some fields of its own.

To stress, this is about /static/ typing. If using dynamic typing then
the field types /could/ be overridden, and IIUC that's how the
dynamically typed Python works.

There are various permutations but you should find the above slides to
be really helpful in detailing the fundamentals. The full lectures are
probably findable on line, too.

...

> Exactly how different can variant types be? Maybe X is an int, and maybe
> X is an array of structs. Or an array of Variants. What are the
> limitations?

I don't see the need for any limitation on what types could be used but
the permissible types would have to be specified. For example,

typedef T = variant (int # array float # boolean)

(where # separates alternatives and "array float" means an array of
floats).

>
> Staying with my X[i] where X's type is array of Variant elements, they
> could be ints, floats or strings. And maybe i is a 32-int, or 64-bit
> int, or a float.
>
> You need to analyse it each time is is accessed to see if conversion is
> needed.
>
> Also, Variants could presumably be assigned something else of a
> different type? Then you can't even assume it will stay the same type.

Wouldn't code which handles variants typically include a switch on the
type as in the following?

switch T.type
case int:
/* Handle int */
case array float:
/* Handle array of float */
end switch /* Ignoring other cases */

>
>
>>>
>>> Or maybe write some C++ code and see what the compiler generates.
>>
>> Why would I want to do that?!
>
> To see what limitations it places on the possibilities. Or maybe there
> are no limitations and the resulting code is horrendous.

The problem with looking at how another language handles matters is that
it can limit the scope for creativity and tend to lead to the same ideas.

Wouldn't it be better to come up with designs based on how objects and
structures might be laid out and used?

>
> I don't quite understand why you seem to think that dealing with variant
> types is straightforward.

I don't think I ever said it was straightforward.

In fact, IMO the opposite is the case. There's a big language-design
challenge, here. Designers have come up with many ways to match objects
with code:

coercion
overloading
variants
polymorphism
generics
metatypes
macros

It may be better to get rid of some of them and absorb what they can do
into a common approach - as long as it makes programming easier rather
than harder.

>
>
>>>> Frankly, I'd like the programmer to be able to work purely in terms
>>>> of the logic of his algorithm and for the compiler to pick the
>>>> implementation.
>>>
>>> You might be on the verge of switching to a dynamically typed language!
>>>
>>
>> Not me!
>
> If you have a fully f