Special characters in method names

Peter Niederwieser

May 18, 2008, 9:15:54 PM
to JVM Languages
Does the JVM specification allow special characters (e.g. space,
comma, question mark) in method names? I couldn't find a clear answer
in the document. I'm asking because the following works for me in
Groovy:

class Foo {
    def "JVM, anyone?"() {}
}

I've tried to call this method from Groovy (foo."JVM, anyone?"()) and
Java (using reflection), and both times it worked. It would be great
if I could use such method names in my Groovy-based DSL, but first I
need to know whether this is officially supported by the JVM.
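
A rough sketch of what the reflective call from Java can look like. Plain Java source cannot declare such a method, so the sketch assembles a tiny class file by hand instead of compiling Groovy. The class name Gen, the helper names, the return value 42 and the choice of class-file version 49 (Java SE 5) are all assumptions made for illustration, not something taken from the setup described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Method;

// Hypothetical demo class; none of these names come from the thread.
final class SpecialNameDemo {

    // Builds a minimal class file for a public class "Gen" with a single
    // public static method `int <methodName>()` that returns 42.
    static byte[] classFile(String methodName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(49);                                  // major version 49 (Java SE 5)
            out.writeShort(8);                                   // constant pool count (7 entries + 1)
            out.writeByte(1); out.writeUTF("Gen");               // #1 Utf8: class name
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: superclass name
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeByte(1); out.writeUTF(methodName);          // #5 Utf8: method name
            out.writeByte(1); out.writeUTF("()I");               // #6 Utf8: descriptor
            out.writeByte(1); out.writeUTF("Code");              // #7 Utf8: attribute name
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(1);                                   // methods count
            out.writeShort(0x0009);                              // ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);                                   // method name index
            out.writeShort(6);                                   // method descriptor index
            out.writeShort(1);                                   // method attributes count
            out.writeShort(7);                                   // attribute name: "Code"
            out.writeInt(15);                                    // Code attribute length
            out.writeShort(1);                                   // max_stack
            out.writeShort(0);                                   // max_locals
            out.writeInt(3);                                     // code length
            out.writeByte(0x10); out.writeByte(42);              // bipush 42
            out.writeByte(0xAC);                                 // ireturn
            out.writeShort(0);                                   // exception table length
            out.writeShort(0);                                   // Code sub-attributes count
            out.writeShort(0);                                   // class attributes count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Defines the generated class in a fresh, throwaway class loader.
    static Class<?> define(String methodName) {
        final byte[] b = classFile(methodName);
        return new ClassLoader(SpecialNameDemo.class.getClassLoader()) {
            Class<?> load() { return defineClass("Gen", b, 0, b.length); }
        }.load();
    }

    // Looks the method up by its unusual name and invokes it reflectively.
    static int call(String methodName) {
        try {
            Method m = define(methodName).getMethod(methodName);
            return (Integer) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("JVM, anyone?"));  // prints 42
    }
}
```

Reflection only ever sees the name as a string, which is why the Java compiler's identifier rules do not get in the way here.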

Cheers,
Peter

Randall R Schulz

May 18, 2008, 9:31:52 PM
to jvm-la...@googlegroups.com
On Sunday 18 May 2008 18:15, Peter Niederwieser wrote:
> Does the JVM specification allow special characters (e.g. space,
> comma, question mark) in method names? I couldn't find a clear answer
> in the document.

Is there a reason to expect (long-term) that characters other than those
specified as acceptable in identifiers will be allowed? Would it not be
safe to simply limit those used to that set? It's pretty large. See
section 3.8 of the Java Language Specification.


> I'm asking because the following works for me in Groovy:

Today. Maybe not tomorrow...


> ...
>
> Cheers,
> Peter


Randall Schulz

Neal Gafter

May 18, 2008, 9:54:26 PM
to jvm-la...@googlegroups.com
The set of characters acceptable to the VM in identifiers was expanded in Java SE 5, and includes virtually every Unicode character except a few that have special meaning to the VM, such as '[', ';', '/', etc.:

4.3.2   Unqualified Names
Names of methods, fields and local variables are stored as unqualified names.
Unqualified names must not contain the characters '.', ';', '[' or '/'. Method
names are further constrained so that, with the exception of the special method
names (§3.9) <init> and <clinit>, they must not contain the characters
'<' or '>'.

Regards,
Neal
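
The rule in section 4.3.2 quoted above is small enough to write down as a check. The following is only a sketch of that rule as quoted; the class and method names are made up for illustration, and rejecting the empty name is an assumption based on later editions of the JVM specification rather than on the quoted text.

```java
// Hypothetical helper; the names here are illustrative, not a JDK API.
final class UnqualifiedNames {

    // Characters the quoted section forbids in every unqualified name.
    private static final String FORBIDDEN = ".;[/";

    // Applies to field, method and local variable names alike.
    static boolean isLegalUnqualifiedName(String name) {
        if (name.isEmpty()) {
            return false;  // assumption: an empty name is not allowed
        }
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    // Method names additionally exclude '<' and '>', except for the two
    // special names <init> and <clinit>.
    static boolean isLegalMethodName(String name) {
        if (name.equals("<init>") || name.equals("<clinit>")) {
            return true;
        }
        return isLegalUnqualifiedName(name)
                && name.indexOf('<') < 0
                && name.indexOf('>') < 0;
    }

    public static void main(String[] args) {
        String[] samples = {"JVM, anyone?", "+", "a/b", "<", "<init>"};
        for (String n : samples) {
            System.out.println("\"" + n + "\" -> " + isLegalMethodName(n));
        }
    }
}
```

Under this reading, "JVM, anyone?" and "+" pass, while "a/b" and "<" do not.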

Charles Oliver Nutter

May 19, 2008, 4:11:32 AM
to jvm-la...@googlegroups.com

I've been using special character names in Duby for operators.

class MyInteger
  def +(other)
    # do something
  end
end

...and the resulting Java class has a + method in it. When you add to
that Duby's full compile-time reflective type inference, you can build a
Ruby-like DSL using all the operators you'd expect (though as Neal G
points out, not < or >).

I really must get back to working on Duby one of these days.

- Charlie

Jochen Theodorou

May 19, 2008, 5:56:15 AM
to jvm-la...@googlegroups.com
Neal Gafter wrote:

> The set of characters acceptable to the VM in identifiers was expanded
> in Java SE 5, and includes virtually every unicode character except a
> few that have special meaning to the VM, such as '[', ';', '/', etc:
>
> 4.3.2 Unqualified Names
> Names of methods, fields and local variables are stored as unqualified
> names. Unqualified names must not contain the characters '.', ';', '['
> or '/'. Method names are further constrained so that, with the exception
> of the special method names (§3.9) <init> and <clinit>, they must not
> contain the characters '<' or '>'.

'/', because of class names, '[', because of arrays, ';' because it is
used to terminate a class name ([Ljava/lang/Object; = Object[])... but
why '.'?
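
To make the roles of '[', ';' and '/' concrete, here is a small sketch that prints a method descriptor using java.lang.invoke.MethodType (available since Java 7, so newer than this thread). The class name DescriptorDemo and the descriptor helper are made up. The second line printed shows that Class.getName() uses '.' where the descriptor uses '/', which may hint at why '.' is reserved as well, though that is only a guess.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class for illustration only.
final class DescriptorDemo {

    // Returns the JVM descriptor for a method with the given return and
    // parameter types.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // '[' marks an array, 'L...;' delimits a class name, '/' separates packages.
        System.out.println(descriptor(void.class, Object[].class, int.class));
        // prints ([Ljava/lang/Object;I)V

        // The name form returned by Class.getName() uses '.' instead of '/'.
        System.out.println(Object[].class.getName());
        // prints [Ljava.lang.Object;
    }
}
```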

bye blackdrag

--
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/
http://www.g2one.com/

Richard Warburton

May 19, 2008, 6:07:35 AM
to jvm-la...@googlegroups.com
> Neal Gafter wrote:
>> The set of characters acceptable to the VM in identifiers was expanded
>> in Java SE 5, and includes virtually every unicode character except a
>> few that have special meaning to the VM, such as '[', ';', '/', etc:

If one generates a method with such a naming, is it possible to call
it from java code? I presume reflection will allow it, but what about
otherwise?

Richard

Jochen Theodorou

May 19, 2008, 6:20:40 AM
to jvm-la...@googlegroups.com
Richard Warburton wrote:

I would expect the verifier not to allow it...

Peter Niederwieser

May 19, 2008, 7:06:58 AM
to JVM Languages
On May 19, 12:07 pm, "Richard Warburton" <richard.warbur...@gmail.com>
wrote:
> If one generates a method with such a naming, is it possible to call
> it from java code? I presume reflection will allow it, but what about
> otherwise?

According to the Java language specification, Java method names may
only contain characters for which Character.isJavaIdentifierStart()/
Character.isJavaIdentifierPart() returns true. Thus, ',' '?' ' ' are
not allowed. Nevertheless, names with such characters might be useful
in DSLs based on other JVM languages.
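
As a quick illustration of that character test, a minimal sketch (the class and method names are made up). It only checks the characters and deliberately ignores the further rule that keywords and literals such as "class" or "true" are not identifiers.

```java
// Hypothetical helper for illustration; not part of the JDK.
final class JavaIdentifiers {

    // True if the name consists only of characters the Java language allows
    // in an identifier. Keywords and literals are NOT checked here.
    static boolean hasOnlyIdentifierChars(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so supplementary characters are handled correctly.
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"anyone", "$x_1", "JVM, anyone?", "+", "1abc"};
        for (String n : samples) {
            System.out.println("\"" + n + "\" -> " + hasOnlyIdentifierChars(n));
        }
    }
}
```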

Cheers,
Peter

Richard Warburton

May 19, 2008, 7:13:35 AM
to jvm-la...@googlegroups.com
>> If one generates a method with such a naming, is it possible to call
>> it from java code? I presume reflection will allow it, but what about
>> otherwise?
>
> According to the Java language specification, Java method names may
> only contain characters for which Character.isJavaIdentifierStart()/
> Character.isJavaIdentifierPart() returns true. Thus, ',' '?' ' ' are
> not allowed. Nevertheless, names with such characters might be useful
> in DSL's based on other JVM languages.

Sorry, my question should probably have been phrased as 'is there a
clever workaround for this?'. I am aware of the basic limitations.

Richard

John Wilson

May 19, 2008, 9:12:25 AM
to jvm-la...@googlegroups.com
On 5/19/08, Richard Warburton <richard....@gmail.com> wrote:
>
> >> If one generates a method with such a naming, is it possible to call
> >> it from java code? I presume reflection will allow it, but what about
> >> otherwise?
> >
> > According to the Java language specification, Java method names may
> > only contain characters for which Character.isJavaIdentifierStart()/
> > Character.isJavaIdentifierPart() returns true. Thus, ',' '?' ' ' are
> > not allowed. Nevertheless, names with such characters might be useful
> > in DSL's based on other JVM languages.

Well, you could write a static varargs method in a Groovy class that
takes the instance, the name as a String, and the parameters, with a
body like

instance."$methodName"(*params)

(You could probably do the same in JRuby or Jython).
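
For comparison, a plain-Java counterpart of such a helper has to fall back on reflection, whereas the Groovy one-liner above goes through Groovy's own dynamic dispatch. The sketch below is an assumption-laden illustration: DynamicCall and invoke are made-up names, and the lookup simply takes the first public method whose name and parameter count match, with no real overload resolution.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper for illustration only.
final class DynamicCall {

    // Invokes the first public method with the given name and arity.
    // Real overload resolution is deliberately left out of this sketch.
    static Object invoke(Object instance, String methodName, Object... params) {
        for (Method m : instance.getClass().getMethods()) {
            if (m.getName().equals(methodName)
                    && m.getParameterCount() == params.length) {
                try {
                    return m.invoke(instance, params);
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException(
                "no public method " + methodName + " with " + params.length + " parameter(s)");
    }

    public static void main(String[] args) {
        // Works for ordinary names; a name such as "JVM, anyone?" would be
        // looked up in exactly the same way.
        System.out.println(invoke("hello", "concat", " world"));  // prints hello world
    }
}
```

Because the method name is just a string here, any name the JVM accepts can be passed, whether or not Java source could spell it.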

This is not an entirely facetious suggestion. All three dynamic
languages are getting better at method dispatch with every release. If
they are not already, they will soon be doing dynamic dispatch
considerably faster than reflection.

John Wilson

John Rose

May 19, 2008, 5:42:17 PM
to jvm-la...@googlegroups.com
On May 18, 2008, at 6:54 PM, Neal Gafter wrote:
> The set of characters acceptable to the VM in identifiers was
> expanded in Java SE 5, and includes virtually every unicode
> character except a few that have special meaning to the VM, such as
> '[', ';', '/', etc:

For a longer treatment of this issue, and a good JVM-level solution,
see the entry on "symbolic freedom" in my blog: http://blogs.sun.com/jrose

The JVM allows almost any string as a class, field, or method name.
I think the only reasonable way to serve many languages at once, with
good integration, is to agree on a convention for escaping (quoting,
lightly mangling) the few remaining illegal JVM names and have
languages use this convention in their backends and introspectors.
That way Scheme (or whoever) can just use names like + and / without
having a global prohibition against the latter, because of internal
JVM encoding conventions.
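
To give a feel for what such a convention involves, here is a deliberately simple sketch. It is NOT the scheme from the blog entry: the escape format (a backslash followed by four hex digits) and the names NameMangler, mangle and demangle are invented purely for illustration. The only property that matters is that every name round-trips and the mangled form contains none of the characters the JVM rejects in method names.

```java
// Hypothetical escaping scheme for illustration; not the blog's convention.
final class NameMangler {

    // Characters the JVM rejects in method names, plus the escape character
    // itself so that demangling stays unambiguous.
    private static final String SPECIAL = ".;[/<>\\";

    // Replaces each special character with '\' followed by four hex digits.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\').append(String.format("%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses mangle(); assumes the input was produced by mangle().
    static String demangle(String mangled) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < mangled.length(); i++) {
            char c = mangled.charAt(i);
            if (c == '\\') {
                sb.append((char) Integer.parseInt(mangled.substring(i + 1, i + 5), 16));
                i += 4;  // skip the four hex digits just consumed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String n : new String[] {"+", "/", "<=>", "a.b"}) {
            String m = mangle(n);
            System.out.println(n + " -> " + m + " -> " + demangle(m));
        }
    }
}
```

A language backend would apply something like mangle when emitting names, and an introspector would apply the inverse when presenting them, so that a Scheme-style / can exist as a method without clashing with the JVM's internal use of that character.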

Also, Java should have a convention for escaping names not otherwise
allowed in the JLS, with the same conventions for escaping at the JVM
level. That way it can be used as a systems programming language for
the JVM, even for languages that have non-JLS identifiers. Backslash
would work:

> int \+(int x, int y) { return x+y; }
> int foo() { return \+(1, 0); }


So would string or char quotes (as in Groovy):

> int '+'(int x, int y) { return x+y; }
> int foo() { return this.'+'(1, 0); }
> // qualified literal is a literal name, not string or char


Best,
-- John

P.S. Some would suggest backquote, as an unused quote-like
character. I don't favor this, because some language projects use
backquote for experimental syntaxes. It's not an important enough
feature on which to waste a new punctuation character. There is a
suggestion on record for using underbars to escape Java keywords, but
that is not a general enough solution.
