Some Lexical Questions

Jeremy Roman

Feb 3, 2014, 3:54:49 PM
1. Comments: The JLS has a fairly narrow definition of what constitutes a
traditional comment (in particular, requiring a NotStar after the first
star). Since Joos 1W apparently permits "Javadoc-style" comments which are
not mentioned in JLS 2ED as far as I can see, what is permitted? The
simplest thing would be to permit anything between /* and */, including
"/**/", as a comment, and this seems to be what the current version of the
JDK does. Is this correct?

2. Operators: The operator list in JLS 3.12 seems to be incorrect (it is
missing some, and = appears three times). Is it reasonable to correct this
for our own scanners, based on the operators mentioned elsewhere in the
JLS?

3. String/character escapes: it is possible for an octal escape sequence
to produce a character which is outside ASCII. Is it permissible to allow
these to occur in character and string literals, or should such programs
be rejected as invalid Joos 1W?

Ondřej Lhoták

Feb 3, 2014, 10:48:26 PM
On Mon, Feb 03, 2014 at 08:54:49PM +0000, Jeremy Roman wrote:
> 1. Comments: The JLS has a fairly narrow definition of what
> constitutes a traditional comment (in particular, requiring a
> NotStar after the first star). Since Joos 1W apparently permits
> "Javadoc-style" comments which are not mentioned in JLS 2ED as far
> as I can see, what is permitted? The simplest thing would be to
> permit anything between /* and */, including "/**/", as a comment,
> and this seems to be what the current version of the JDK does. Is
> this correct?

You are correct: the JLS2 taken literally does prohibit Javadoc-style
comments, even though they are ubiquitous in Java code and supported
by actual Java implementations. I would classify this as a bug in the
specification.

Yes, you should allow anything between /* and */ to be a comment,
including the comment "/**/".
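
As a rough illustration of that rule (a sketch only; the class and method names are made up here and are not part of any course-provided code), a scanner can treat the first "*/" found after the opening "/*" as the end of the comment:

```java
final class CommentScanner {
    /**
     * Returns the index just past the closing star-slash of the traditional
     * comment that starts at {@code start}, or -1 if no comment starts there
     * or the comment is never closed.
     */
    static int skipTraditionalComment(String src, int start) {
        if (!src.startsWith("/*", start)) {
            return -1;
        }
        // Start searching after the two opening characters, so that the
        // three-character input slash-star-slash is not treated as closed.
        int close = src.indexOf("*/", start + 2);
        return close < 0 ? -1 : close + 2;
    }
}
```

With this approach "/**/" and Javadoc-style "/** ... */" comments need no special handling, and comments do not nest: the first "*/" always ends the comment.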

> 2. Operators: The operator list in JLS 3.12 seems to be incorrect
> (it is missing some, and = appears three times). Is it reasonable to
> correct this for our own scanners, based on the operators mentioned
> elsewhere in the JLS?

Interesting. This appears to be a typographical error in the Safari
online version of the book. The paper copy version of the book does
not have this error, and the HTML version that Sun/Oracle used to make
available on the web did not have the error either. Yes, you should
correct the error. Apparently, the Safari version has other errors in
this list as well. The correct list from the paper version of the book
is:
= > < ! ~ ? :
== <= >= != && || ++ --
+ - * / & | ^ % << >> >>>
+= -= *= /= &= |= ^= %= <<= >>= >>>=
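
One possible way to use this list in a hand-written scanner is longest-match lookup, sketched below. The class and method names are hypothetical, and the sketch covers the full JLS list above; which of these operators a Joos 1W program may actually use is a separate question.

```java
final class OperatorScanner {
    // The 37 operators from the corrected list, ordered longest first so
    // that the first match found is also the longest match.
    private static final String[] OPERATORS = {
        ">>>=",
        "<<=", ">>=", ">>>",
        "==", "<=", ">=", "!=", "&&", "||", "++", "--", "<<", ">>",
        "+=", "-=", "*=", "/=", "&=", "|=", "^=", "%=",
        "=", ">", "<", "!", "~", "?", ":",
        "+", "-", "*", "/", "&", "|", "^", "%"
    };

    /** Returns the longest operator starting at pos, or null if there is none. */
    static String matchOperator(String src, int pos) {
        for (String op : OPERATORS) {
            if (src.startsWith(op, pos)) {
                return op;
            }
        }
        return null;
    }
}
```

A scanner generated from a DFA gets the same longest-match behaviour without an explicit ordering; the linear search here is only meant to make the rule visible.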

> 3. String/character escapes: it is possible for an octal escape
> sequence to produce a character which is outside ASCII. Is it
> permissible to allow these to occur in character and string
> literals, or should such programs be rejected as invalid Joos 1W?

Octal escapes outside the range of 7-bit ASCII characters (i.e. 0-127
or 0-0177) are allowed in Joos 1W. However, octal escapes must be in
the range allowed by Java (i.e. 0-255 or 0-0377), and your Joos compiler
must enforce this.
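
A minimal sketch of one way to enforce the range, assuming the scanner follows the JLS OctalEscape grammar (JLS 3.10.6), in which a third octal digit is allowed only when the first digit is 0-3. The class and method names are invented for illustration. An alternative design would read up to three digits and reject any value above 0377 explicitly.

```java
final class OctalEscape {
    private static boolean isOctalDigit(char c) {
        return c >= '0' && c <= '7';
    }

    /**
     * Number of octal digits consumed by an escape whose digits begin at pos
     * (the position just after the backslash). Returns 0 if there is no digit.
     */
    static int digitCount(String src, int pos) {
        if (pos >= src.length() || !isOctalDigit(src.charAt(pos))) {
            return 0;
        }
        // Three digits are allowed only when the first digit is 0-3,
        // which caps the value at 0377 (255).
        int max = src.charAt(pos) <= '3' ? 3 : 2;
        int n = 1;
        while (n < max && pos + n < src.length() && isOctalDigit(src.charAt(pos + n))) {
            n++;
        }
        return n;
    }

    /** Value of the octal escape whose digits begin at pos; always in 0-255. */
    static int value(String src, int pos) {
        int n = digitCount(src, pos);
        if (n == 0) {
            throw new IllegalArgumentException("no octal digit at position " + pos);
        }
        return Integer.parseInt(src.substring(pos, pos + n), 8);
    }
}
```

Under this grammar the digits "400" are read as the escape \40 followed by the character '0', so the string literal "\400" holds two characters, while the character literal '\400' is rejected because it would contain more than one character.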

Jeremy Roman

Feb 5, 2014, 10:19:37 AM
But the Joos "char" type must still be a 16-bit unsigned quantity (per the
JLS), correct (even though we have no literals for them)?
This seems to be valid Joos 1W that stores the 16-bit value 32000, since I
can't find anything that rules it out:

char c = (char) 32000;

Or should we instead deal with 8-bit chars and strings exclusively?

Ondřej Lhoták

Feb 5, 2014, 12:10:01 PM
On Wed, Feb 05, 2014 at 03:19:37PM +0000, Jeremy Roman wrote:
> But the Joos "char" type must still be a 16-bit unsigned quantity
> (per the JLS), correct (even though we have no literals for them)?

Yes, char is a 16-bit unsigned type.

> This seems to be valid Joos 1W that stores the 16-bit value 32000,
> since I can't find anything that rules it out:
>
> char c = (char) 32000;

Yes, this is valid Joos 1W.

> Or should we instead deal with 8-bit chars and strings exclusively?

Only character and string constants have this limitation. Variables of
type char store 16 bits, and strings are sequences of (16-bit) chars.
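
For reference, this is how the cast behaves in standard Java, which a Joos compiler would presumably need to match. The helper name below is made up for the example.

```java
final class CharWidthDemo {
    /**
     * Narrows an int to char and widens it back to int. The narrowing keeps
     * the low 16 bits; the widening is zero-extended because char is unsigned.
     */
    static int castToChar(int value) {
        char c = (char) value;
        return c;
    }
}
```

So (char) 32000 keeps the value 32000, (char) 65536 wraps to 0, and (char) -1 becomes 65535, even though a literal can only express values up to 255 through an octal escape.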
