Function returning defined type?

john.chl...@gmail.com

Oct 20, 2012, 3:08:52 AM

Again, after doing due diligence with web searches, I couldn't resolve this.

I have:

module Coordinate

   implicit none

   integer, parameter :: wp = kind(1.0d0)
   private :: WP

   type Coords
      real(wp) :: x
      real(wp) :: y
      real(wp) :: z
   contains
      procedure, nopass :: add
      procedure, nopass :: mult
      procedure, nopass :: less_than
   end type Coords
   ...
end module Coordinate

module HashTbl

   use Coordinate
   ...
contains
   ...
   type(Coords) function get_sll(list, key)
      class(sllist), target, intent(in) :: list
      integer, intent(in) :: key

      class(sllist), pointer :: current
      current => list

      do while (current%key /= key)
         if ( .not. associated(current%next) ) then
            stop
         end if
         current => current%next
      end do

      get_sll = current%val

   end function get_sll
   ...
end module HashTbl

type(hash_tbl_sll) :: table
...
call table%put(1, Coords(1, 2, 3))
print*, table%get(1)%x

This results in:

$ gfortran HashTbl.f95
HashTbl.f95:173.22:

print*, table%get(1)%x
1
Error: Syntax error in PRINT statement at (1)

This may be too C/C++-esque, but the returned value is type(Coords); shouldn't I be able to reference a component of it?

I solved it by:

type(Coords) :: tmp
tmp = table%get(1)
print *, tmp%x

---John

jski

Oct 20, 2012, 3:16:42 AM

One missing piece:

module HashTbl
   ...
   type(Coords) function get_hash_tbl_sll(tbl, key)
      class(hash_tbl_sll), intent(in) :: tbl
      integer, intent(in) :: key

      integer :: hash

      hash = mod(key,tbl%sllhrdr_len) + 1
      get_hash_tbl_sll = tbl%sllhrdr(hash)%get(key=key)

   end function get_hash_tbl_sll
   ...
end module HashTbl

glen herrmannsfeldt

Oct 20, 2012, 5:00:49 AM

john.chl...@gmail.com wrote:

(snip)

> type(hash_tbl_sll) :: table
> ...
> call table%put(1, Coords(1, 2, 3))
> print*, table%get(1)%x

> This results in:

> $ gfortran HashTbl.f95
> HashTbl.f95:173.22:

> print*, table%get(1)%x
> 1
> Error: Syntax error in PRINT statement at (1)

> This may be too C/C++'esque but the returned value is type(Coords),
> shouldn't I be able to reference a member?

Yes, Fortran isn't C or C++. In C, [], (), and . are operators,
but they aren't in Fortran.

You can't subscript (asked reasonably often), call, substring,
or select structure members from the return value of a function.
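
A minimal sketch of what that rule means in practice, under the standards discussed in this thread. The module and the names point, make_point, squares and greeting are made up for illustration and are not from the original code. Each function result is first assigned to a variable as a whole, and only then is a component, element or substring taken from the variable.

```fortran
module demo_results
   implicit none
   type :: point
      real :: x, y, z
   end type point
contains
   ! Hypothetical function returning a derived type
   function make_point() result(p)
      type(point) :: p
      p = point(1.0, 2.0, 3.0)
   end function make_point

   ! Hypothetical function returning an array
   function squares(n) result(v)
      integer, intent(in) :: n
      integer :: v(n)
      integer :: i
      v = [(i*i, i = 1, n)]
   end function squares

   ! Hypothetical function returning a character string
   function greeting() result(s)
      character(len=5) :: s
      s = 'hello'
   end function greeting
end module demo_results

program use_whole_results
   use demo_results
   implicit none
   type(point) :: p
   integer :: v(4)
   character(len=5) :: s

   ! Rejected forms: make_point()%x, squares(4)(2), greeting()(1:2)
   p = make_point()   ! take the result as a whole ...
   v = squares(4)
   s = greeting()

   print *, p%x       ! ... then select a component of the variable
   print *, v(2)      ! ... or an element
   print *, s(1:2)    ! ... or a substring
end program use_whole_results
```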

-- glen

jski

Oct 21, 2012, 1:02:29 AM

On Oct 20, 5:00 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

Hmmmm ... think I'll charge "get" back into a subroutine.

BTW, into what category in Fortran does "%" fall?

---John

jski

Oct 21, 2012, 1:03:49 AM

... think I'll CHANGE "get" back into a subroutine.

Ron Shepard

Oct 21, 2012, 2:31:14 AM

In article
<91b30443-7f8c-490f...@b6g2000yqd.googlegroups.com>,

The other good reason for doing this is that when the function is used
in a print or write statement (as you were trying to do), you cannot
do any i/o within the function (e.g. for debugging, for printing error
messages, or anything else). This is called "recursive" i/o, and it is
not allowed at all up through f95, and only in some restricted
situations in the newer standards. So it is better to just avoid the
problem in the first place by using a subroutine rather than a
function.
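
A small sketch of that pattern, assuming a made-up lookup named get_value (not the poster's hash table). Because the lookup is a subroutine, any diagnostic output it produces happens before the PRINT statement starts, so no i/o is ever nested inside another i/o statement.

```fortran
module lookup_demo
   implicit none
contains
   ! Hypothetical lookup written as a subroutine. It may print diagnostics
   ! freely because it is never invoked from inside an i/o list.
   subroutine get_value(key, val, found)
      integer, intent(in)  :: key
      real,    intent(out) :: val
      logical, intent(out) :: found

      found = (key == 1)          ! pretend only key 1 is stored
      if (found) then
         val = 42.0
      else
         val = 0.0
         print *, 'get_value: key not found: ', key
      end if
   end subroutine get_value
end module lookup_demo

program no_recursive_io
   use lookup_demo
   implicit none
   real    :: val
   logical :: found

   call get_value(2, val, found)   ! diagnostic output happens here
   call get_value(1, val, found)
   if (found) print *, val         ! this PRINT references no function
end program no_recursive_io
```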

$.02 -Ron Shepard

michael...@compuserve.com

Oct 21, 2012, 7:54:39 AM

On Sunday, 21 October 2012 07:02:30 UTC+2, jski wrote:
> BTW, into what category in Fortran does "%" fall? ---John

It's a separator (MFE, Section 2.3) used here as a component selector (op. cit. Section 2.9).

Regards,

Mike Metcalf

jski

Oct 21, 2012, 11:41:41 PM

Mike,

At the risk of being flamed, without simply saying Fortran isn't C,
is there a rationale for:

if F(x) returns a type with components a, b, and c, why is
F(x)%c a "syntax error"?

---John

PS> Actually, in the case of the C++ I'm translating, it turns out
the function call I'm referring to here is the indexing into an STL
map<>. This is done over 2 dozen times in a loop that iterates at
least 1 million times per run. Replacing these index "calls" is an
obvious step to improving its performance.

Richard Maine

Oct 21, 2012, 11:47:23 PM

jski <john.chl...@gmail.com> wrote:

> At the risk of being flamed, without simply saying Fortran isn't C,
> is there a rationale for:
>
> if F(x) returns a type with components a, b, and c, why is
> F(x)%c a "syntax error"?

Because there are related forms that would be ambiguous. Sure, not that
one in particular, but others that are in the same general area that
would be let in if you allowed further stuff applied to a function
result. In particular, indexing an array function result could be
ambiguous.

--
Richard Maine
email: last name at domain . net
domain: summer-triangle

Richard Maine

Oct 22, 2012, 1:02:54 AM

Richard Maine <nos...@see.signature> wrote:

> jski <john.chl...@gmail.com> wrote:
>
> > At the risk of being flamed, without simply saying Fortran isn't C,
> > is there a rationale for:
> >
> > if F(x) returns a type with components a, b, and c, why is
> > F(x)%c a "syntax error"?
>
> Because there are related forms that would be ambiguous. Sure, not that
> one in particular, but others that are in the same general area that
> would be let in if you allowed further stuff applied to a function
> result. In particular, indexing an array function result could be
> ambiguous.

Hmm. Was a bit rushed to do an example when writing that. Now I'm having
trouble crafting one, though I'd swear I've seen ambiguous cases. Seems
like the best odds for ambiguity would be a function returning an array
of strings, where you have function arguments, array indices/sections,
and substrings all potentially involved. But I'm not seeing one right
now. Oh well.

glen herrmannsfeldt

Oct 22, 2012, 1:12:00 AM

jski <john.chl...@gmail.com> wrote:

(snip regarding indexing, substringing, function calling and
member selection on the return value of a function.)

> At the risk of being flamed, without simply saying Fortran isn't C,
> is there a rationale for:

> if F(x) returns a type with components a, b, and c, why is
> F(x)%c a "syntax error"?

As mentioned in another post, there are some ambiguous cases, even if
this isn't one. One that I miss more is arrays of pointers, which
are also not allowed, possibly because of the ambiguities.

> PS> Actually, in the case of the C++ I'm translating, it turns out
> the function call I'm referring to here is the indexing into an STL
> map<>. This is done over 2 dozen times in a loop that iterates at
> least 1 million times per run. Replacing these index "calls" is an
> obvious step to improving its performance.

Are you claiming that the Fortran syntax is slower because it
requires an explicit temporary variable?

It might be that the C or C++ compilers generate a temporary
anyway, and it might be that the Fortran compiler can
optimize out the store into the variable. It is a syntax
question, not a performance question. Besides, it might
take one extra instruction, and so your 12 million calls
might take a few milliseconds longer.

Now, note that returning a structure requires the function
to return all the elements of the structure, though
you only need one. (Well, possibly there is only one member.)

Same for arrays. Indexing a function returning an array is a
waste, unless the array only has one element.

If performance is a question, write a version of the function
that returns only that element. (Or an ENTRY point into the
existing function.)
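
A sketch of that suggestion, assuming a tiny constant table in place of the real hash table. The names table, get and get_x are hypothetical; Coords and wp follow the original post. The second function returns only the component that is needed, so nothing has to be selected from a function result.

```fortran
module coords_demo
   implicit none
   integer, parameter :: wp = kind(1.0d0)

   type :: Coords
      real(wp) :: x, y, z
   end type Coords

   ! Stand-in data, purely for illustration
   type(Coords), parameter :: table(2) = &
      [Coords(1.0_wp, 2.0_wp, 3.0_wp), Coords(4.0_wp, 5.0_wp, 6.0_wp)]
contains
   ! Returns the whole structure (all three components are copied)
   function get(i) result(c)
      integer, intent(in) :: i
      type(Coords) :: c
      c = table(i)
   end function get

   ! Returns only the x component
   function get_x(i) result(x)
      integer, intent(in) :: i
      real(wp) :: x
      x = table(i)%x
   end function get_x
end module coords_demo

program one_component
   use coords_demo
   implicit none
   type(Coords) :: tmp

   tmp = get(2)          ! whole structure, then select from the variable
   print *, tmp%x
   print *, get_x(2)     ! scalar result, usable directly in an expression
end program one_component
```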

-- glen

jski

Oct 22, 2012, 1:34:02 AM

> Are you claiming that the Fortran syntax is slower because it
> requires an explicit temporary variable?

Quite the opposite. Below is some of the C++ which is replicated many
times in a loop:

R_CG_Node.x = M6x6_RefNode(1,5)/M6x6_RefNode(1,1) +
node_geo[RefNode].x;
R_CG_Node.y = -M6x6_RefNode(0,5)/M6x6_RefNode(0,0) +
node_geo[RefNode].y;
R_CG_Node.z = M6x6_RefNode(0,4)/M6x6_RefNode(0,0) +
node_geo[RefNode].z;

Notice the calls: node_geo[RefNode].x, node_geo[RefNode].y, and
node_geo[RefNode].z. These are indexing into an instance of map<>:
node_geo. This map contains thousands of entries and the look-ups are
done with the same value for RefNode every time in an iteration. The
C++ code has node_geo[RefNode] dozens of times in the loop.

My Fortran (translation) is:

call node_geo%get(RefNode, tmp_coord)

R_CG_Node%x = M6x6_RefNode(2,6) / M6x6_RefNode(2,2) + tmp_coord%x
R_CG_Node%y = -M6x6_RefNode(1,6) / M6x6_RefNode(1,1) + tmp_coord%y
R_CG_Node%z = M6x6_RefNode(1,5) / M6x6_RefNode(1,1) + tmp_coord%z

which does the look-up in node_geo (an instance of my homespun hash
table) once. The Fortran is more efficient.

---John

michael...@compuserve.com

Oct 22, 2012, 4:08:33 AM

On Monday, 22 October 2012 05:41:42 UTC+2, jski wrote:
> On Oct 21, 7:54 am, michaelmetc...@compuserve.com wrote:
> > On Sunday, 21 October 2012 07:02:30 UTC+2, jski wrote:
> > > BTW, into what category in Fortran does "%" fall? ---John
> >
> > It's a separator (MFE, Section 2.3) used here as a component selector
> > (op. cit. Section 2.9).
> >
> > Regards,
> >
> > Mike Metcalf
>
> Mike,
>
> At the risk of being flamed, without simply saying Fortran isn't C,
> is there a rationale for:
>
> if F(x) returns a type with components a, b, and c, why is
> F(x)%c a "syntax error"?
>
> ---John
>
> PS> Actually, in the case of the C++ I'm translating, it turns out
> the function call I'm referring to here is the indexing into an STL
> map<>. This is done over 2 dozen times in a loop that iterates at
> least 1 million times per run. Replacing these index "calls" is an
> obvious step to improving its performance.

The memory dims, but this prohibition is part of a general rule: for any form of function result, it is not possible to reference an element, a section, a substring or a component. In other words, the result must always be used as a whole (MFE, Section 5.10). At least that's easy to remember.

HTH,

Mike Metcalf

Tobias Burnus

Oct 22, 2012, 5:25:59 AM

jski:

> At the risk of being flamed, without simply saying Fortran isn't C,
> is there a rationale for:
>
> if F(x) returns a type with components a, b, and c, why is
> F(x)%c a "syntax error"?

Not an answer to the question, but using BLOCK, one can relatively
simply create local variables. The C compiler would internally do
likewise for "c_function(x).comp":

block
   type(type_returned_by_f) :: tmp
   tmp = f(x)
   ! use tmp%comp
end block

Portability note: BLOCK is a Fortran 2008 feature and thus not yet
widely implemented. (gfortran has BLOCK support since GCC 4.5.)
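
For completeness, a self-contained sketch of the BLOCK approach that should compile with a Fortran 2008 compiler. The type result_t and the body of f are placeholders invented for this example; only the BLOCK pattern itself comes from the fragment above.

```fortran
module f_demo
   implicit none
   type :: result_t
      integer :: comp
   end type result_t
contains
   ! Placeholder for whatever function returns the derived type
   function f(x) result(r)
      integer, intent(in) :: x
      type(result_t) :: r
      r%comp = 2*x
   end function f
end module f_demo

program block_demo
   use f_demo
   implicit none
   integer :: x

   x = 21
   block
      type(result_t) :: tmp   ! exists only inside this BLOCK
      tmp = f(x)              ! store the whole function result
      print *, tmp%comp       ! then select the component
   end block
end program block_demo
```

The temporary is declared right where it is needed, so the specification part of the enclosing procedure does not have to change.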

(Side note: Fortran 2008 also allows a pointer-returning function on the
left (!) side of an assignment. But that also doesn't allow for
array/substring/component references. [This feature is implemented in
very few compilers; gfortran doesn't belong to those.])

Tobias

Dick Hendrickson

Oct 22, 2012, 11:27:28 AM

My recollection is about the same as Richard's and Mike's. If you allow
selection of structure-function elements it would be irregular to not
also allow selection of array-function elements and then the character
case is really ambiguous. True, normal character references like
CHAR_ARRAY(I:J)(K:L)
are also ambiguous and the ambiguity was simply defined away by making a
syntax choice. I think it was just viewed as more awkward with another
set of parentheses.

Some other points, in no particular order.

1) Structures and arrays were added about the same time but shepherded
by different sub-groups and other sub-groups did other stuff. It's
possible that something was badly ambiguous in early versions and
rejected. Subsequent (simplified?) versions might have
removed/mitigated the ambiguity and nobody went back and rethought early
decisions. It's too bad, but that's the way people often work.

2) Selectors are pretty inefficient in the worst case. Something like
matmul(a,b)(i,j)
would be done much better as
dot_product(a(i,:),b(:,j))
[or possibly dot_product(a(:,i),b(j,:)) I always get confused, sigh]
The committee was reluctant to add inefficient features. (No, this was
not a universal rule.)

3) As I recall, there was no demand. In the 80s many companies were
making parallel processors and often designing their own array
languages. None of them strongly advocated for allowing array function
selectors.

4) F90 was viewed as being big and complex. This would be one more
complexity.

5) Functions are merely one kind of expression. If we can select from
a function result, why not from an expression. Then I think something like
array*other_array(i,j)
is both ugly and ambiguous beyond redemption ;) .
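
To illustrate point 2, here is a small sketch comparing the two ways of getting one element of a matrix product. By the usual definition, element (i,j) of the product is the sum over k of a(i,k)*b(k,j), which is row i of a dotted with column j of b, i.e. the first dot_product form given above. The array size and the indices are arbitrary choices for the example.

```fortran
program matmul_element
   implicit none
   integer, parameter :: n = 3
   real :: a(n,n), b(n,n), c(n,n)
   integer :: i, j

   call random_number(a)
   call random_number(b)
   i = 2
   j = 3

   ! matmul(a,b)(i,j) is not accepted, so the whole product
   ! (n*n dot products) has to be stored first ...
   c = matmul(a, b)
   print *, 'via full product:', c(i,j)

   ! ... whereas a single dot product yields the same element directly.
   print *, 'via dot_product: ', dot_product(a(i,:), b(:,j))
end program matmul_element
```

The two printed values should agree up to rounding.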

Dick Hendrickson

glen herrmannsfeldt

Oct 22, 2012, 2:48:04 PM

Dick Hendrickson <dick.hen...@att.net> wrote:

(snip, and previous snip, on why no subscripting, substringing,
structure member selecting, and function calling on function results.)

>>> Because there are related forms that would be ambiguous. Sure, not that
>>> one in particular, but others that are in the same general area that
>>> would be let in if you allowed further stuff applied to a function
>>> result. In particular, indexing an array function result could be
>>> ambiguous.

>> Hmm. Was a bit rushed to do an example when writing that. Now I'm having
>> trouble crafting one, though I'd swear I've seen ambiguous cases. Seems
>> like the best odds for ambiguity would be a function returning an array
>> of strings, where you have function arguments, array indices/sections,
>> and substrings all potentially involved. But I'm not seeing one right
>> now. Oh well.
>>

> My recollection is about the same as Richard's and Mike's. If you allow
> selection of structure-function elements it would be irregular to not
> also allow selection of array-function elements and then the character
> case is really ambiguous. True, normal character references like
> CHAR_ARRAY(I:J)(K:L)
> are also ambiguous and the ambiguity was simply defined away by making a
> syntax choice. I think it was just viewed as more awkward with another
> set of parentheses.

In many other languages, substring is an intrinsic function, not
an operator. That allows it to be used on any expression, which
sometimes might be useful. The first language I knew with the
substring operator was HP2000 BASIC. (Is that where Fortran inherited
it from?)

Like Fortran, HP BASIC allows one to use the substring operator on
the left side on an assignment, but restricts against the creation
of a non-contiguous string. (String variables have a dynamic length
up to the declared maximum. You can't leave holes in assignment.)

PL/I has the SUBSTR function which allows for a substring of any
string expression, or any expression that can be converted to string.

There is also the SUBSTR pseudo-variable that can be used on the left
side of an assignment to assign to part of a string. There are no
(last I knew) user-defined pseudo-variables, and PVs don't nest.

You can't say SUBSTR(SUBSTR(S,1,10),4,3)="xx"; even if it
might make sense.

Three other pseudo-variables are REAL, IMAG, and COMPLEX, allowing
one to set the real part or imaginary part of a complex variable,
or extract the real and imaginary parts of a complex expression.
Again they don't nest, but it works well enough to allow:

X=3;
DO IMAG(X)=1 TO 100 BY 3;
PUT SKIP LIST(SQRT(X));
END;

which uses both the IMAG function and IMAG pseudo-variable.

Note for later that PL/I also allows for array expressions.

> Some other points, in no particular order.

> 1) Structures and arrays were added about the same time but shepherded
> by different sub-groups and other sub-groups did other stuff. It's
> possible that something was badly ambiguous in early versions and
> rejected. Subsequent (simplified?) versions might have
> removed/mitigated the ambiguity and nobody went back and rethought early
> decisions. It's too bad, but that's the way people often work.

and even more CHARACTER variables were added in Fortran 77, though I
don't remember if substrings were, or came later.

> 2) Selectors are pretty inefficient in the worst case. Something like
> matmul(a,b)(i,j)
> would be done much better as
> dot_product(a(i,:),b(:,j))
> [or possibly dot_product(a(:,i),b(j,:)) I always get confused, sigh]
> The committee was reluctant to add inefficient features. (No, this was
> not a universal rule.)

There is, at least, the possibility of an optimizer figuring
out that one. For user-defined functions it is much less likely.

Note that as mentioned early in the thread, C allows subscripting
function results, but C does NOT allow array expressions or functions
returning arrays. C allows for functions returning pointers, and
array subscripting is closely related to pointer indexing.

K&R C didn't allow for functions to return structures, but did
allow for structure pointers. ANSI added functions returning
structures, and structures can contain static sized arrays.

> 3) As I recall, there was no demand. In the 80s many companies were
> making parallel processors and often designing their own array
> languages. None of them strongly advocated for allowing array function
> selectors.

Unlike C, where subscripting operates on pointers, subscripting
Fortran array return values or expressions is likely to be
inefficient. (Unless the optimizer figures it out.)

> 4) F90 was viewed as being big and complex. This would be one more
> complexity.

> 5) Functions are merely one kind of expression. If we can select from
> a function result, why not from an expression. Then I think something like
> array*other_array(i,j)
> is both ugly and ambiguous beyond redemption ;) .

Well, there would be either array*(other_array(i,j)) or
(array*other_array)(i,j).

The former should be legal, the latter can be written as

array(i,j)*other_array(i,j).

Now, not that C does not allow for array expressions, so some that
might otherwise be ambiguous aren't.

In C, you can add or subtract to a pointer (which might represent
an array), though you can't multiply or divide one. You can also
subtract two pointers. The result is defined if both point to
(possibly different positions) within the same array.

C functions can validly return pointers to static arrays and to
dynamically allocated arrays. In the latter case, the returned
value should not be subscripted if it is the only pointer to
the object. Local variables, including arrays, go away on
function return (usually on the stack) and should not be used
after the return. It is an easy mistake in C to return a pointer
to a local auto array.

Note, though, that the structure member . is also used in object
oriented languages like C++ and Java for the class method selector.
It easily nests when you want to call a method in the class returned
by the first call. You might, for example:

String s="abcdefghijklmnop";
System.out.println(s.substring(3,6).substring(1,2));

in Java. (There are probably better examples, but you can invoke any
method of a class from any expression of the type of that class.
Well, you can invoke a method based on a reference to an Object of
the class.) With garbage collection, you don't have to worry
about what happens to all those Objects.

Simplifying some, static methods are called based on the static
type of the expression, where instance methods are called based
on the current type. The difference is important in the case
of subclasses.

It would be interesting to see more examples of the Object-Oriented
features of Fortran, and to see how expressions work.

-- glen

Wolfgang Kilian

Oct 23, 2012, 5:20:35 AM

On 10/22/2012 05:27 PM, Dick Hendrickson wrote:
> [...]

> 4) F90 was viewed as being big and complex. This would be one more
> complexity.
>
> 5) Functions are merely one kind of expression. If we can select from
> a function result, why not from an expression. Then I think
> something like
> array*other_array(i,j)
> is both ugly and ambiguous beyond redemption ;) .

Fortran already has the convention that enclosing an object reference in
parentheses evaluates it as an expression. Useful, for instance, in
procedure arguments.

I'd guess (but I may be proved wrong) that by requiring parentheses,
subscripting and component addressing could be made available for
arbitrary Fortran expressions without syntax ambiguities:

x = (array * other_array)(i,j)

a = (foo (x))%first_component

call (foobar(2))%some_method (y)

y = (result (u,v,w))(2)

A compiler may be able to optimize the overhead, in some cases.

This adds complexity to the language, but it eliminates the need for
extra temporaries, functions/subroutines or block structures. As
always, it depends on the programmer whether this makes code more or
less readable.

Syntax constructs like this are more common in modern languages than
some decades ago. Therefore, the demand by users may be increasing, as
recent postings show.
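
For comparison, something close to the second example can already be written with the Fortran 2003 ASSOCIATE construct, whose selector may be an expression such as a function reference. This is only a sketch: pair_t, second_component and the body of foo are invented for illustration, while foo and first_component echo the names used above.

```fortran
module foo_demo
   implicit none
   type :: pair_t
      real :: first_component
      real :: second_component
   end type pair_t
contains
   ! Hypothetical function returning a derived type
   function foo(x) result(p)
      real, intent(in) :: x
      type(pair_t) :: p
      p = pair_t(x, 2.0*x)
   end function foo
end module foo_demo

program associate_demo
   use foo_demo
   implicit none
   real :: a

   ! r names the value of foo(1.5) for the duration of the construct;
   ! no temporary has to be declared by hand.
   associate (r => foo(1.5))
      a = r%first_component
   end associate

   print *, a
end program associate_demo
```

This still takes three lines instead of one, so it does not remove the wish for the parenthesized form, but it avoids declaring a temporary.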

-- Wolfgang

David Thompson

Nov 4, 2012, 10:30:10 PM

On Mon, 22 Oct 2012 18:48:04 +0000 (UTC), glen herrmannsfeldt
<g...@ugcs.caltech.edu> wrote:

> Dick Hendrickson <dick.hen...@att.net> wrote:
>
> (snip, and previous snip, on why no subscripting, substringing,
> structure member selecting, and function calling on function results.)
>

<snip: believed or at least suspected to be ambiguous>

> In many other languages, substring is an intrinsic function, not
> an operator. That allows it to be used on any expression, which
> sometimes might be useful. The first language I knew with the
> substring operator was HP2000 BASIC. (Is that where Fortran inherited
> it from?)
>

Pedantic: As you said elsethread, strictly speaking, it's not an
operator in Fortran: in F77 it's one form of primary in an arithmetic,
character, or logical expression as applicable; in F90+ it's (more
formally) a data designator, which can be (and often is) an expression
primary. C's treatment of array subscripting and struct member
selection, and also simple and compound assignments, as operators
almost the same as traditional computational operators (like plus) and
relational and boolean operators -- rather than another category --
was unusual when it was created and still far from universal.

> Like Fortran, HP BASIC allows one to use the substring operator on
> the left side on an assignment, but restricts against the creation
> of a non-contiguous string. (String variables have a dynamic length
> up to the declared maximum. You can't leave holes in assignment.)
>
> PL/I has the SUBSTR function which allows for a substring of any
> string expression, or any expression that can be converted to string.
>

which with PL/I's expansive ideas about conversion is quite a lot <G>
especially since PL/I has both character strings AND bit strings.

> There is also the SUBSTR pseudo-variable that can be used on the left
> side of an assignment to assign to part of a string. There are no
> (last I knew) user-defined pseudo-variables, and PVs don't nest.
>
> You can't say SUBSTR(SUBSTR(S,1,10),4,3)="xx"; even if it
> might make sense.
>

Are you sure? Composing SUBSTR like that is silly and I wouldn't have
tried, but ISTR that I did use SUBSTR(UNSPEC(x),bits)=y. But it was
long ago and my memory could be suffering from alpha particles.

> Three other pseudo-variables are REAL, IMAG, and COMPLEX, <snip>

> Note for later that PL/I also allows for array expressions.
>
> > Some other points, in no particular order.
>
> > 1) Structures and arrays were added about the same time but shepherded
> > by different sub-groups and other sub-groups did other stuff. It's
> > possible that something was badly ambiguous in early versions and
> > rejected. Subsequent (simplified?) versions might have
> > removed/mitigated the ambiguity and nobody went back and rethought early
> > decisions. It's too bad, but that's the way people often work.
>
> and even more CHARACTER variables were added in Fortran 77, though I
> don't remember if substrings were, or came later.
>

Substrings are in F77; confirmed while checking my wording above.

> > 2) Selectors are pretty inefficient in the worst case. <snip>

> Note that as mentioned early in the thread, C allows subscripting
> function results, but C does NOT allow array expressions or functions
> returning arrays. C allows for functions returning pointers, and
> array subscripting is closely related to pointer indexing.
>

Continuing the above pedantry, C doesn't have what Fortran folks would
consider array expressions (add, multiply, transpose, etc). Formally
struct.memberarray is an expression in K&R C (vs s%a is a primary or
designator in Fortran) and array compound literal in C99 is a form of
postfix-expression (which is one step away from primary) vs
array-constructor is a designator in F90+.

> K&R C didn't allow for functions to return structures, but did
> allow for structure pointers. ANSI added functions returning
> structures, and structures can contain static sized arrays.
>

Yes (and also by-value argument, and assignment)

> > 3) As I recall, there was no demand. In the 80s many companies were
> > making parallel processors and often designing their own array
> > languages. None of them strongly advocated for allowing array function
> > selectors.
>
> Unlike C, where subscripting operates on pointers, subscripting
> Fortran array return values or expressions is likely to be
> inefficient. (Unless the optimizer figures it out.)
>
> > 4) F90 was viewed as being big and complex. This would be one more
> > complexity.
>
> > 5) Functions are merely one kind of expression. If we can select from

> > a function result, why not from an expression. <snip>

> Now, not that C does not allow for array expressions, so some that
> might otherwise be ambiguous aren't.
>

IHYM 'note'.

> In C, you can add or subtract to a pointer (which might represent
> an array), though you can't multiply or divide one. You can also
> subtract two pointers. The result is defined if both point to
> (possibly different positions) within the same array.
>
> C functions can validly return pointers to static arrays and to
> dynamically allocated arrays. In the latter case, the returned
> value should not be subscripted if it is the only pointer to

To be exact it causes a memory leak, which in general is a problem but
in some situations is okay. And exactly the same for member selection
from a returned structure. Nothing worse than a leak can happen ...

> the object. Local variables, including arrays, go away on
> function return (usually on the stack) and should not be used
> after the return. It is an easy mistake in C to return a pointer
> to a local auto array.
>

... whereas using 'auto' storage after the end of its lifetime is
always undefined and often gives at least garbage output.

> Note, though, that the structure member . is also used in object
> oriented languages like C++ and Java for the class method selector.
> It easily nests when you want to call a method in the class returned
> by the first call. You might, for example:
>
> String s="abcdefghijklmnop";
> System.out.println(s.substring(3,6).substring(1,2));
>
> in Java. (There are probably better example, but you can invoke any

Yes that is a silly example. Real ones that I have used often are
s .substring(...) .trim(), s .substring(...) .toLowerCase() (or toUpperCase),
s .substring(...) .replace(...) or variants.

Further, some classes are _designed_ to be used this way. E.g.
StringBuffer and StringBuilder .append methods return the object they
were invoked on, so you can do b .append("X=") .append(42) etc.
C++ also uses the same idea with operator syntax e.g. the canonical
output_stream << x << y and input_stream >> x >> y .

> method of a class from any expression of the type of that class.
> (Well, you can invoke a method based on a reference to an Object of
> the class.) With garbage collection, you don't have to worry
> about what happens to all those Objects.
>

That's kind of redundant. All 'object' values in Java, including
variables and method returns, are references; there are no by-value
or by-copy objects in the language -- although there is a trivial
standard interface for classes that choose to implement clone().
Also the box types silently convert to and from primitive types, with
(only) the latter primitive types by-value.

This can be most surprising for arrays, because all arrays in Java are
objects (though not strictly classes) even arrays of primitive types.
{ int a = 3; int b = a; a = 5; /* b is still 3 */ }
{ int[] a = {3}; int[] b = a; a[0] = 5; /* b[0] is now 5 */ }

>
> Simplifying some, static methods are called based on the static
> type of the expression, where instance methods are called based
> on the current type. The difference is important in the case
> of subclasses.
>

More exactly, dispatched. The calls you can compile depend on the
*declared* type of the object you use. Also you can write a static
method call using the typename (class or interface) directly instead
of an object declared as that type, and this is considered better style;
e.g. the Eclipse IDE by default puts ugly warning marks on
instance.static_method() calls.

For the more interesting non-static=instance methods, yes it does
dispatch on the runtime type as OO must. But if you have
class Foo { void walk () { ... } }
class BigFoo extends Foo { void shrink () { ... } }
then in a chunk of code
{ Foo x = new BigFoo ();
x.walk(); // okay, and also okay if x is actually a baseclass Foo
x.shrink(); // compile error, even though runtime x is a BigFoo
((BigFoo)x).shrink(); // okay
}
If you want to reach methods added (versus overridden) in known
subclasses you have to cast; usually to be safe you should cast only
after checking obj instanceof Subclass. If you don't know at coding
time (all) the possible subclasses of interest, at runtime you can use
reflection but it's more work and clutter.

glen herrmannsfeldt

Nov 5, 2012, 3:48:03 AM

David Thompson <dave.th...@verizon.net> wrote:

(snip)

>> In many other languages, substring is an intrinsic function, not
>> an operator. That allows it to be used on any expression, which
>> sometimes might be useful. The first language I knew with the
>> substring operator was HP2000 BASIC. (Is that where Fortran inherited
>> it from?)

> Pedantic: As you said elsethread, strictly speaking, it's not an
> operator in Fortran: in F77 it's one form of primary in an arithmetic,
> character, or logical expression as applicable; in F90+ it's (more
> formally) a data designator, which can be (and often is) an expression
> primary.

Yes, I didn't think of any other name for it. There isn't
a good antonym for function. How about operator-like form?

> C's treatment of array subscripting and struct member
> selection, and also simple and compound assignments, as operators
> almost the same as traditional computational operators (like plus) and
> relational and boolean operators -- rather than another category --
> was unusual when it was created and still far from universal.

I was reading since that post that subscripting in Java isn't an
operator, but it does allow subscripting of expressions of the
appropriate type, such as method return values. One that I have
done a few times, is to subscript the return value from get()
in HashTable.

>> Like Fortran, HP BASIC allows one to use the substring operator on
>> the left side on an assignment, but restricts against the creation
>> of a non-contiguous string. (String variables have a dynamic length
>> up to the declared maximum. You can't leave holes in assignment.)

I suppose it isn't an operator in HP BASIC, but then again,
there are no expressions that you could apply the operator
to in order to test it.

>> PL/I has the SUBSTR function which allows for a substring of any
>> string expression, or any expression that can be converted to string.

> which with PL/I's expansive ideas about conversion is quite a lot <G>
> especially since PL/I has both character strings AND bit strings.

>> There is also the SUBSTR pseudo-variable that can be used on the left
>> side of an assignment to assign to part of a string. There are no
>> (last I knew) user-defined pseudo-variables, and PVs don't nest.

>> You can't say SUBSTR(SUBSTR(S,1,10),4,3)="xx"; even if it
>> might make sense.

> Are you sure? Composing SUBSTR like that is silly and I wouldn't have
> tried, but ISTR that I did use SUBSTR(UNSPEC(x),bits)=y. But it was
> long ago and my memory could be suffering from alpha particles.

http://publibfp.dhe.ibm.com/epubs/pdf/ibm4lr02.pdf on page 429
says no nesting of pseudo-variables, and even gives an example
of what you can't do:

unspec(substr(A,1,2)) = '00'B;

>> Three other pseudo-variables are REAL, IMAG, and COMPLEX, <snip>
>> Note for later that PL/I also allows for array expressions.

>> > Some other points, in no particular order.

(snip)

>> Note that as mentioned early in the thread, C allows subscripting
>> function results, but C does NOT allow array expressions or functions
>> returning arrays. C allows for functions returning pointers, and
>> array subscripting is closely related to pointer indexing.

> Continuing the above pedantry, C doesn't have what Fortran folks would
> consider array expressions (add, multiply, transpose, etc). Formally
> struct.memberarray is an expression in K&R C (vs s%a is a primary or
> designator in Fortran) and array compound literal in C99 is a form of
> postfix-expression (which is one step away from primary) vs
> array-constructor is a designator in F90+.

>> K&R C didn't allow for functions to return structures, but did
>> allow for structure pointers. ANSI added functions returning
>> structures, and structures can contain static sized arrays.

> Yes (and also by-value argument, and assignment)

But not full structure expressions like PL/I.

(snip)

>> Now, not that C does not allow for array expressions, so some that
>> might otherwise be ambiguous aren't.

> IHYM 'note'.

I usually blame missed characters on this keyboard. I have another
one, but still haven't switched them.

(snip)

>> C functions can validly return pointers to static arrays and to
>> dynamically allocated arrays. In the latter case, the returned
>> value should not be subscripted if it is the only pointer to

> To be exact it causes a memory leak, which in general is a problem but
> in some situations is okay. And exactly the same for member selection
> from a returned structure. Nothing worse than a leak can happen ...

To some people, a leak is bad enough.

(big snip)

David Thompson

Nov 16, 2012, 11:21:59 PM

On Mon, 5 Nov 2012 08:48:03 +0000 (UTC), glen herrmannsfeldt
<g...@ugcs.caltech.edu> wrote:

> David Thompson <dave.th...@verizon.net> wrote:
>
> (snip)
>
> >> In many other languages, substring is an intrinsic function, not
> >> an operator. That allows it to be used on any expression, which
> >> sometimes might be useful. The first language I knew with the
> >> substring operator was HP2000 BASIC. (Is that where Fortran inherited
> >> it from?)
>
> > Pedantic: As you said elsethread, strictly speaking, it's not an
> > operator in Fortran: in F77 it's one form of primary in an arithmetic,
> > character or logical expression as applicable; in F90+ it's (more
> > formally) a data designator, which can be (and often is) an expression
> > primary.
>
> Yes, I didn't think of any other name for it. There isn't
> a good antonym for function. How about operator-like form?
>

That works. Or I might focus on 'syntax', such as 'specific syntax',
since to me the main point is that the actual capabilities of
functional and operator-style syntax are basically the same.
For example, LBOUND(a,n) is almost certainly inline code, not an
actual function call, while X/Y for a greater-than-hardware type like
quad float probably calls a runtime support function.

> > C's treatment of array subscripting and struct member
> > selection, and also simple and compound assignments, as operators
> > almost the same as traditional computational operators (like plus) and
> > relational and boolean operators -- rather than another category --
> > was unusual when it was created and still far from universal.
>
> Since that post I have been reading that subscripting in Java isn't an
> operator, but Java does allow subscripting of expressions of the
> appropriate type, such as method return values. One thing I have
> done a few times is to subscript the return value from get()
> in Hashtable.
>

Java calls it ArrayAccess and it is one form of Primary. It applies to
any value of array type -- where arrays in Java are always objects aka
references. An array can be "hidden" in an Object, but it can only be
subscripted while its type is known to be an array type:
  Hashtable<something,int[]> x ... x.get(...)[i] // works, but
  Hashtable<something,Object> y ... y.get(...) // even if that
  // table entry is actually an array, it must be cast before subscripting
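
Here is a minimal, runnable sketch of those two cases. The class name SubscriptDemo, the key "primes" and the sample values are made up for illustration and do not come from the thread. It shows a method result being subscripted directly when its static type is an array type, and the cast needed when the table only promises Object:

```java
import java.util.Hashtable;

// Hypothetical demo class; names and data are illustrative only.
class SubscriptDemo {
    // Table whose value type is statically known to be int[].
    static Hashtable<String, int[]> typedTable() {
        Hashtable<String, int[]> t = new Hashtable<>();
        t.put("primes", new int[] {2, 3, 5, 7});
        return t;
    }

    // Table whose value type is only Object, although the entry is an int[].
    static Hashtable<String, Object> untypedTable() {
        Hashtable<String, Object> t = new Hashtable<>();
        t.put("primes", new int[] {2, 3, 5, 7});
        return t;
    }

    static int fromTyped(int i) {
        // get() has static type int[] here, so its result can be subscripted directly.
        return typedTable().get("primes")[i];
    }

    static int fromUntyped(int i) {
        // untypedTable().get("primes")[i] would not compile: Object is not an array type.
        // Casting back to int[] first makes the subscript legal.
        return ((int[]) untypedTable().get("primes"))[i];
    }

    public static void main(String[] args) {
        System.out.println(fromTyped(2));
        System.out.println(fromUntyped(3));
    }
}
```

The cast is checked at run time, so if the entry were not actually an int[] it would fail with a ClassCastException rather than at compile time.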

<snip rest>