Determining the last statement exercise

Csaba Gabor

Nov 3, 2009, 10:21:31 AM
Suppose you have some javascript statements in a string.
Can you determine whether the entire string is syntactically
valid, and if so, the starting position of the last statement?
In other words,

function lastStatementPos(code) {
  // returns the starting position within code of the last
  // javascript statement, and -1 if code is not syntactically
  // valid.
}


This came up in a different context today, and I thought
it would make an interesting exercise.

Csaba Gabor from Vienna

SAM

Nov 3, 2009, 10:28:27 AM
On 11/3/09 4:21 PM, Csaba Gabor wrote:

> Suppose you have some javascript statements in a string.

Can you give an example of that?

> Can you determine whether the entire string is syntactically
> valid, and if so, the starting position of the last statement?
> In other words,
>
> function lastStatementPos(code) {
> // returns the starting position within code of the last
> // javascript statement, and -1 if code is not syntactically
> // valid.
>
>
> This came up in a different context today, and I thought
> it would make an interesting exercise.

eval(code); ?


--
sm

Evertjan.

Nov 3, 2009, 10:33:29 AM

A string like?

str = "alert(11);{alert('alert(11); is correct Javascript');};"

Would that be useful?
I don't think it would.

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)

Richard Cornford

Nov 3, 2009, 10:56:51 AM
On Nov 3, 3:21 pm, Csaba Gabor wrote:
> Suppose you have some javascript statements in a string.
> Can you determine whether the entire string is syntactically
> valid, and if so, the starting position of the last statement?

It would be possible to write a javascript tokeniser/parser with
javascript, and have that determine the syntactic correctness of the
source and expose the contained statements in a way that would allow
you to determine the starting position of the last.

> In other words,
>
> function lastStatementPos(code) {
> // returns the starting position within code of the last
> // javascript statement, and -1 if code is not syntactically
> // valid.
>
> This came up in a different context today, and I thought
> it would make an interesting exercise.

Maybe a bit too big a task to be considered an 'exercise'. Quickest
results would probably start with something that was already doing
most of the job like Narcissus or JSLint.

<URL: http://mxr.mozilla.org/mozilla/source/js/narcissus/ >

Richard.

Johannes Baagoe

Nov 3, 2009, 11:19:56 AM
Csaba Gabor :

> Suppose you have some javascript statements in a string. Can you
> determine whether the entire string is syntactically valid, and if so,
> the starting position of the last statement? In other words,

> function lastStatementPos(code) {
>   // returns the starting position within code of the last
>   // javascript statement, and -1 if code is not syntactically
>   // valid.

What should lastStatementPos('{a = 1; b = 2}') return? 0 or 8?

--
Johannes

Thomas 'PointedEars' Lahn

Nov 3, 2009, 5:11:53 PM
Csaba Gabor wrote:

> Suppose you have some javascript statements in a string.

Define "javascript".

> Can you determine whether the entire string is syntactically
> valid,

Depends. Short of writing an ECMAScript parser, which would only be able
to cover specified syntax rules, where available you could try-catch the
SyntaxError that the eval() of the specific implementation would throw if
the code would not conform to its syntax rules. However, AFAICS that can
become quite a nuisance in Konqueror 4.3.2 as syntax errors cannot be
caught there (syntax errors caused Konqueror 3.5.x to crash, therefore I
had to modify the ECMAScript Support Matrix.)

> and if so, the starting position of the last statement?

Depends. What is, in your book, considered "the last statement" in

  if (x)
    y;
  else
    z;

(we are looking at the following production here:

  /IfStatement/ :
    if (/Expression/) /Statement/ else /Statement/

see ES3F, 12.5)


PointedEars
--
Use any version of Microsoft Frontpage to create your site.
(This won't prevent people from viewing your source, but no one
will want to steal it.)
-- from <http://www.vortex-webdesign.com/help/hidesource.htm> (404-comp.)

Csaba Gabor

Nov 3, 2009, 7:42:34 PM
On Nov 3, 4:33 pm, "Evertjan." <exjxw.hannivo...@interxnl.net> wrote:
> Csaba Gabor wrote on 03 nov 2009 in comp.lang.javascript:
>
> > Suppose you have some javascript statements in a string.
> > Can you determine whether the entire string is syntactically
> > valid, and if so, the starting position of the last statement?
> > In other words,
>
> > function lastStatementPos(code) {
> >   // returns the starting position within code of the last
> >   // javascript statement, and -1 if code is not syntactically
> >   // valid.
> > }
>
> > This came up in a different context today, and I thought
> > it would make an interesting exercise.
>
> A string like?
>
> str = "alert(11);{alert('alert(11); is correct Javascript');};"
>
> Would that be useful?
> I don't think it would.

You've made a good example, and yes it can be useful.
But I think I better give context to my question
and recast it, especially in light of the fact that my
proposed solution does not work across browsers.

The question arose in the following context: The
user is asked to enter some javascript statements
to affect a value. I want to ensure that all the
statements are syntactically correct, AND to insert
a return before the last statement, if it doesn't
start with return, but does make syntactic sense.

The validation is fairly straightforward:

function syntax_check(code) {
  // returns false if code is not syntactically OK
  // returns browser's interpretation of the code if it's OK
  try {
    var f = new Function(code);
    return f.toString();
  } catch (err) {
    return false; // syntax error
  }
}


On Firefox, the returned string is cleaned of all
comments, and is recast into a 'standard form'.
I was thinking that if the penultimate line of
this returned string did not consist of "}",
then a return could be prefixed, if it wasn't
already there. Any exceptions to this?

However, with IE the returned string is not gussied up
into standard form and is left pretty much as is.

So the question, exercise, or problem is: given
a function which will tell you whether or not you
have a syntactically correct javascript, can you
semi reasonably isolate the last statement to
determine whether it should be prefixed with a
return, and to do so in such case.


Some examples:
code = "'Fred'" =>
  return 'Fred';

code = "var i=7; i *= 9" =>
  var i=7;
  return i *= 9;

code = "x='word'\nx+='s' // making a plural\n" +
       " return x /* pluralizing a word */"
=> no change

Evertjan's code =>
  alert(11);
  return alert('alert(11); is correct Javascript');

code = '{a = 1; b = 2}' =>
  a = 1;
  return b = 2;


Two problem cases:
code = "var y=6;\n y*= 2 // difficult; hard case";
code = "var y=6;\n y*= 2; // difficult; hard case";
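
As a rough illustration (a sketch, not code from the thread): both problem
cases carry a semicolon inside a trailing // comment, so a scan for the
last ";" lands in the comment rather than at a statement boundary. The
name naiveLastStatementPos is hypothetical and exists only to show that
failure.

```javascript
// Hypothetical, deliberately naive helper: treat everything after the
// last ";" as the final statement.
function naiveLastStatementPos(code) {
  return code.lastIndexOf(";") + 1;
}

const hardCase = "var y=6;\n y*= 2 // difficult; hard case";
const guess = hardCase.slice(naiveLastStatementPos(hardCase));

// The last ";" sits inside the comment, so the guess is comment text
// (" hard case") rather than the real last statement, y*= 2.
console.log(JSON.stringify(guess));
```

Any workable approach therefore has to tell active code apart from
comments and strings before it trusts a ";" as a terminator.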

Evertjan.

Nov 4, 2009, 10:33:26 AM
Csaba Gabor wrote on 04 nov 2009 in comp.lang.javascript:

> Evertjan's code =>
> alert(11);
> return alert('alert(11); is correct Javascript');
>

Which is utter nonsense as

1 alert() does not have a return value

2 return is only sensible in a function

> The question arose in the following context:
> The user is asked to enter some javascript statements
> to affect a value.

Sorry, that is not only not very useful, but it could even be dangerous.

Thomas 'PointedEars' Lahn

Nov 4, 2009, 1:18:30 PM
Csaba Gabor wrote:

> The validation is fairly straightforward:

AISB, it is not.



> function syntax_check(code) {
> // returns false is code is not syntactically OK

Based on which syntax rules? ECMAScript's, JavaScript's, JScript's or
others'?

> // returns browser's interpretation of the code if it's OK
> try {

And if this already constitutes a syntax error, the whole thing breaks.

> var f = new Function(code);

And if the Function constructor is not supported, the whole thing breaks.

> return f.toString(); }
> catch (err) { return false; } // syntax error
> }

Does not work reliably. You have failed to realize, among other things,
that syntax is context-sensitive. For example, `return' is allowed in a
function, it is not allowed elsewhere.
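
A small sketch of that context dependence, assuming an engine that
provides both eval and the Function constructor: the same source text is
accepted as a function body but rejected as a program. The helper name
compiles is made up for this illustration.

```javascript
// Hypothetical helper: run a compile step and report whether it raised
// a SyntaxError. Other errors are passed through untouched.
function compiles(compile) {
  try {
    compile();
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const code = "return 42;";

// As a function body, `return` is legal. new Function only compiles the
// body here; the resulting function is never called.
const okAsFunctionBody = compiles(() => new Function(code));

// As a program (which is how eval parses its argument), `return` is a
// syntax error, so eval throws before running anything.
const okAsProgram = compiles(() => eval(code));

console.log(okAsFunctionBody, okAsProgram);
```

So a check built on the Function constructor answers "is this a valid
function body?", which is a different question from "is this a valid
program?".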


PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann

VK

Nov 4, 2009, 1:36:18 PM
Thomas 'PointedEars' Lahn wrote:
>
> Based on which syntax rules? ECMAScript's, JavaScript's, JScript's or
> others'?
>
> And if the Function constructor is not supported, the whole thing breaks.

While trying to get nasty, do not get dorky ;)

To OP: I am wondering if you are looking for a regexp solution
recreating parser rules - or some creative eval or eval-like approach.
Or is it up to the respondents to find the most effective way?


Thomas 'PointedEars' Lahn

Nov 4, 2009, 2:27:39 PM
VK wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Based on which syntax rules? ECMAScript's, JavaScript's, JScript's or
>> others'?
>>
>> And if the Function constructor is not supported, the whole thing breaks.
>
> While trying to get nasty, do not get dorky ;)

Please shut up until you know what you are talking about (suppose
that is ever going to happen). And lose the bogus smileys.


PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16

John G Harris

Nov 4, 2009, 2:35:47 PM
On Tue, 3 Nov 2009 at 16:42:34, in comp.lang.javascript, Csaba Gabor
wrote:

<snip>


>code = '{a = 1; b = 2}' =>
> a = 1;
> return b = 2;

<snip>

b=2 is not the last statement. The last statement is the one that ends
with } .

You do know that a block statement, { ... }, is a statement, don't you ?

John
--
John Harris

VK

Nov 4, 2009, 6:13:36 PM
Csaba Gabor wrote:
> So the question, exercise, or problem is: given
> a function which will tell you whether or not you
> have a syntactically correct javascript, can you
> semi reasonably isolate the last statement to
> determine whether it should be prefixed with a
> return, and to do so in such case.

No, it is not possible, because it would violate Rice's Theorem and
in turn the halting problem, which is known to be undecidable.
For the full description see http://en.wikipedia.org/wiki/Rice%27s_theorem
In a very simplified yet useful form, it states that "Any question
about what an arbitrary script does with an arbitrary input is
undecidable, unless it is trivial".

For "semi reasonably" practical use as a helper for beginners, it can
be a script that
1) takes the string input
2) gets a guarantee that it is a function body with the curly brackets
removed
3) gets a guarantee that there are no inner functions in the body
( steps 2 and 3 ensure that any closing curly bracket closes an
inner block of statements and nothing else )
4) finds the first line break, going from the end, that is followed by
non-white-space chars
5) if that last line is "return whatever" then OK, exit, else go to 6)
6) if the previous line contains "return \s*\n" then remove the \s*\n, OK,
exit
(a frequent beginners' mistake:
return
myReturnValue;)
else go to 7)
7) if the line from 4) contains "}" then return an error, the problem is
not decidable
8) if the line from 4) contains something else, then change it to
"return something else", OK, exit.

That is the physical maximum you can get; the actual coding and tuning
are skipped.

Csaba Gabor

Nov 5, 2009, 5:35:23 AM

A rhetorical question?
Nevertheless, since it has come up a few times in this
thread... This exercise is about determining the last
statement and its location (in possibly transformed code),
and the method is more interesting to me than the exact
definition used to get at an answer. What I am saying
is that I would be just as happy with either the original
{a = 1; b = 2} in the example above being returned, or
the b = 2 portion that FF indicates.

Csaba Gabor

Nov 5, 2009, 6:39:41 AM
On Nov 5, 12:13 am, VK <schools_r...@yahoo.com> wrote:
> Csaba  Gabor wrote:
> > So the question, exercise, or problem is: given
> > a function which will tell you whether or not you
> > have a syntactically correct javascript, can you
> > semi reasonably isolate the last statement to
> > determine whether it should be prefixed with a
> > return, and to do so in such case.
>
> No, it is not possible because it would violate Rice's Theorem and
> respectively would violate the halting problem which is known to be
> undecidable. For the full description see http://en.wikipedia.org/wiki/Rice%27s_theorem
> in a very simplified yet useful form it states that "Any question
> about what an arbitrary script does with an arbitrary input is
> undecidable, unless it is trivial".

An interesting approach VK. I am familiar with Rice's Theorem,
but let me show you that it does not apply, and then perhaps
you will be able to see WHY it does not apply.

Let's consider this same question in the PHP world,
where it is easier to handle. In PHP land, all statements
except the last one must end with ; or }.

Thus, ignoring the final character in the code, find
the prior occurrence of either ; or }. Strip the
string from the next character to the end and syntax
check it. If it doesn't pass, keep looking back for
the next prior occurrence of ; or }.

If instead, the syntax check passes, then the final
character is either the end of the prior statement or
at the end of a // comment. The latter is easily
checked by syntax checking the code with ' x y' affixed.
If at the end of a // comment, keep looking back
for the next prior occurrence of ; or }. Otherwise,
we're done.
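
The line-comment test described above carries over to JavaScript roughly
as follows. This is a minimal sketch, assuming the Function constructor
is an acceptable syntax checker; isValid and endsInLineComment are
hypothetical names, and " x y" is the junk suffix from the description.

```javascript
// Hypothetical syntax check: compile the text as a function body
// without ever running it.
function isValid(code) {
  try {
    new Function(code);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// If junk appended to the end still compiles, the junk must have been
// swallowed by a trailing // comment.
function endsInLineComment(code) {
  return isValid(code) && isValid(code + " x y");
}

console.log(endsInLineComment("var y=6;\n y*= 2 // difficult; hard case"));
console.log(endsInLineComment("var y=6;\n y*= 2"));
```

A trailing /* ... */ comment is not reported by this test, because the
junk lands after the comment has already closed.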

So, since I've demonstrated a solution to the problem
(in PHP land), it's clear that Rice's Theorem does
not apply here. My question to you is why not?

Javascript syntax is more complicated, however, and
the question remains whether the approach outlined
above is viable in JS land. In particular, how to
differentiate between (arbitrarily messy versions of):

while(truth)
x+="*"

and

whole(truth)
x+="*"

Csaba

John G Harris

Nov 5, 2009, 9:24:46 AM
On Thu, 5 Nov 2009 at 02:35:23, in comp.lang.javascript, Csaba Gabor
wrote:
>On Nov 4, 8:35 pm, John G Harris <j...@nospam.demon.co.uk> wrote:
>> On Tue, 3 Nov 2009 at 16:42:34, in comp.lang.javascript, Csaba Gabor
>> wrote:
>>
>>   <snip>
>>
>> >code = '{a = 1; b = 2}' =>
>> > a = 1;
>> > return b = 2;
>>
>>   <snip>
>>
>> b=2 is not the last statement. The last statement is the one that ends
>> with } .
>>
>> You do know that a block statement, { ... }, is a statement, don't you ?
>
>A rhetorical question?
>Nevertheless, since it has come up a few times in this
>thread... This exercise is about determining the last
>statement and its location (in possibly transformed code),
>and the method is more interesting to me than the exact
>definition used to get at an answer. What I am saying
>is that I would be just as happy with either the original
>{a = 1; b = 2} in the example above being returned, or
>the b = 2 portion that FF indicates.

The only kind of statement you can put a 'return' in front of is an
expression statement. E.g. a=b; becomes return a=b;
So you have to worry about this piece of code :

if ((new Date()).getDay() == 0) a = 23; else a = 42;

Which value do you want to return ?

I can't help thinking that you need to spend more time deciding what you
want, and also whether it's really needed.

John
--
John Harris

John G Harris

Nov 5, 2009, 9:04:56 AM
On Wed, 4 Nov 2009 at 15:13:36, in comp.lang.javascript, VK wrote:
>Csaba Gabor wrote:
>> So the question, exercise, or problem is: given
>> a function which will tell you whether or not you
>> have a syntactically correct javascript, can you
>> semi reasonably isolate the last statement to
>> determine whether it should be prefixed with a
>> return, and to do so in such case.
>
>No, it is not possible because it would violate Rice's Theorem and
>respectively would violate the halting problem which is known to be
>undecidable. For the full description see http://en.wikipedia.org/wiki/Rice%27s_theorem
>in a very simplified yet useful form it states that "Any question
>about what an arbitrary script does with an arbitrary input is
>undecidable, unless it is trivial".
<snip>

He isn't trying to check that the code does a particular job, so Rice's
theorem isn't relevant. Nor is he trying to check that the code will
terminate on all inputs, so Turing's halting theorem isn't relevant.

He wants to find the start and end of each statement in the code.
Javascript parsers are able to do that, as is well known.

John
--
John Harris

VK

Nov 5, 2009, 1:15:56 PM
Csaba Gabor wrote:
> Let's consider this same question in the PHP world,
> where it is easier to handle. In PHP land, all statements
> except the last one must end with ; or }.

It doesn't matter because Rice's Theorem doesn't depend on a formal
language syntax. The only thing that matters is if it's a general
purpose higher level programming language (skipping for now on exact
formal definitions of these terms). Perl also requires ; after each
statement, so does C++ if I recall properly - and their cases for
Rice's Theorem were proven 2 and 1 years ago respectively. No one
has bothered to do it for PHP yet, simply because once the generalized
proof is made it is not so interesting anymore.

> Thus, ignoring the final character in the code, find
> the prior occurrence of either ; or }. Strip the
> string from the next character to the end and syntax
> check it. If it doesn't pass, keep looking back for
> the next prior occurrence of ; or }.

I gave the possible block schema in my previous answer and
demonstrated that for the most trivial cases it is well possible to
write a "return adder". The only extension I would still suggest is to
add correction for cases like:
return
Oops_ReturnOnTheNextLine;

For a general case finding the right place for the return statement
means to algorithmically decide for an arbitrary program with an
arbitrary input if it has a return point and then to find that
intended return point of it, so, unlike Harris claimed, it is
necessary to prove that the halting problem is decidable and
consequently to resolve the Entscheidungsproblem. Another option is
to write a program that successfully passes the Turing test, so having
all qualities of a free-will human identity.
I guess that any of these three tasks from above is way beyond the
humble frames of clj. The future Nobel laureate should post such a
solution in a serious scientific preprint journal right away :)

Thomas 'PointedEars' Lahn

Nov 5, 2009, 1:59:22 PM
VK wrote:

> Csaba Gabor wrote:
>> So the question, exercise, or problem is: given
>> a function which will tell you whether or not you
>> have a syntactically correct javascript, can you
>> semi reasonably isolate the last statement to
>> determine whether it should be prefixed with a
>> return, and to do so in such case.
>
> No, it is not possible because it would violate Rice's Theorem and
> respectively would violate the halting problem which is known to be
> undecidable. For the full description see
> http://en.wikipedia.org/wiki/Rice%27s_theorem in a very simplified yet
> useful form it states that "Any question about what an arbitrary script
> does with an arbitrary input is undecidable, unless it is trivial".

Neither Rice's theorem nor the halting problem applies here.

If they did, no compiler could throw a compile error, and in particular no
script engine could throw a syntax error.

This is neither about deciding whether the result of an algorithm meets a
particular value (as it suffices to get the value) nor whether this
algorithm halts (as it always halts, either with success or failure). It is
simply about whether an input can be produced by a (context-free) grammar,
and that is a decidable problem.

As usual, you do not know what you are talking about.

PointedEars
--
Prototype.js was written by people who don't know javascript for people
who don't know javascript. People who don't know javascript are not
the best source of advice on designing systems that use javascript.
-- Richard Cornford, cljs, <f806at$ail$1$8300...@news.demon.co.uk>

Thomas 'PointedEars' Lahn

Nov 5, 2009, 2:06:24 PM
Csaba Gabor wrote:

> John G Harris wrote:
>> Csaba Gabor wrote:
>>
>> <snip>
>>
>> >code = '{a = 1; b = 2}' =>
>> > a = 1;
>> > return b = 2;
>>
>> <snip>
>>
>> b=2 is not the last statement. The last statement is the one that ends
>> with } .
>>
>> You do know that a block statement, { ... }, is a statement, don't you ?
>
> A rhetorical question?

Apparently not.

> Nevertheless, since it has come up a few times in this
> thread... This exercise is about determining the last
> statement and its location (in possibly transformed code),

That would be the /Block/ statement, then.

> and the method is more interesting to me than the exact
> definition used to get at an answer.

But the exact definition is a prerequisite for finding an answer that you
would consider correct.

> What I am saying is that I would be just as happy with either the original
> {a = 1; b = 2} in the example above being returned, or the b = 2 portion
> that FF indicates.

That would be the last statement *that is executed*, provided
the execution of the statement does not depend on a condition.
As you can see, your oversimplification is not helpful.

John G Harris

Nov 5, 2009, 3:19:32 PM
On Thu, 5 Nov 2009 at 10:15:56, in comp.lang.javascript, VK wrote:
>Csaba Gabor wrote:
>> Let's consider this same question in the PHP world,
>> where it is easier to handle. In PHP land, all statements
>> except the last one must end with ; or }.
>
>It doesn't matter because Rice's Theorem doesn't depend on a formal
>language syntax. The only thing that matters is if it's a general
>purpose higher level programming language (skipping for now on exact
>formal definitions of these terms).
<snip>

It doesn't have to be a high-level language. It can be a Turing machine,
and you can't get any lower-level than that.

Rice's theorem is about testing a function to see if it meets its
specification. That's not what's wanted here.


>For a general case finding the right place for the return statement
>means to algorithmically decide for an arbitrary program with an
>arbitrary input if it has a return point and then to find that
>intended return point of it, so, unlike Harris claimed, it is
>necessary to prove that the halting problem is decidable

Read the question. The OP wants to find the last part of the submitted
code and make it a return statement if it's an expression. It doesn't
matter in the least if execution never reaches that last part : he isn't
interested.


>and
>consequently to resolve the Entscheidungsproblem.

That's the problem of proving that a well formed formula is/isn't a
theorem, given a logic language and a set of axioms. It can be solved
for some, relatively simple, languages but not in general.


>Another option is
>to write a program that successfully passes the Turing test so having
>all qualities of a free-will human identity.
>I guess that any of these three tasks from above is way beyond the
>humble frames of clj. The future Nobel laureate should post such a
>solution in a serious scientific preprint journal right away :)

They don't award Nobel Prizes for mathematics.

John
--
John Harris

Csaba Gabor

Nov 8, 2009, 8:12:52 PM

My solution for determining the (position of) the
last javascript statement runs along the lines outlined
in my Nov. 5 response to Lasse's first post at
http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/91262ad01ca356bc/
and also in my response to VK within this thread:
http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/2aa9a60623eb5883/

In the below, the term inject will mean to insert
code at a given point within a larger block of code
and to have it be syntactically valid.

Unless I'm forgetting some cases, javascript statements
end with either ; or } or a newline. The idea will
be to go backwards (KSB = keep searching backwards)
through each viable one of these characters (call the
current character being checked cP; see the
description of KSB below for determining the next
previous viable terminating character) and make 9
checks to decide whether or not we have a last
statement:

1. First, we determine that the character after cP
is not part of a comment (or in other "inert" code
such as a string or regular expression) by injection
of non compiling code. If it's an inert position,
then KSB

2. Find the next non whitespace character, call it
chNext. If chNext is a semicolon or any closing
delimiter such as ], ), or } then we don't actually
have a statement (or it's empty) so KSB

3. If we can not inject an empty loop at this point
(and have the result be syntactically correct), then
cP does not represent an end of statement, so KSB

4. Starting with cP, find the first non whitespace
character going backwards, call it chPrev. If chPrev
is a semicolon or closing curly brace then we are done,
since those will mark the end of a statement. But
see the description of KSB below for why we don't have
to worry about embeddings.

5. If we can inject a "break; " at this point,
we're in a loop, so KSB.

6. If we can inject an " else; if (x) " at this point,
then we're in an if, so KSB.

7. If the final part of the code through chPrev is
an else, then we're in an if statement, so KSB.

8. If chNext is the first character of an autoincrement
or an autodecrement (++ or --) then it is independent
of the previous statement (since we passed point 3),
so we are done.

9. If chNext is not one of the five
characters +, -, (, [, or / then we are likewise done.
Otherwise, KSB. The other characters are not ambiguous
in their desire to connect to a previous item (such
as .), but those five are. However, if we have gotten
here, and one of those characters appears, then the
prior portion of code is an expression and chNext will
be connected to it. KSB
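
Checks 1, 3 and 5 can be sketched as small injection probes. This is
only an illustration under stated assumptions, not the implementation
mentioned in this post: the Function constructor stands in for
checkSyntax, every helper name is hypothetical, and the injected
snippets (" x y ", " while(0); ", " break; ") are one possible choice.

```javascript
// Hypothetical stand-in for checkSyntax: compile as a function body,
// never execute.
function isValid(code) {
  try {
    new Function(code);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Insert `snippet` at index `pos` and report whether the result parses.
function canInject(code, pos, snippet) {
  return isValid(code.slice(0, pos) + snippet + code.slice(pos));
}

// Check 1: junk is tolerated only inside a comment, string or regex.
function isInert(code, pos) {
  return canInject(code, pos, " x y ");
}

// Check 3: an empty loop fits only where a statement may begin.
function canStartStatement(code, pos) {
  return canInject(code, pos, " while(0); ");
}

// Check 5: a bare `break` is legal only inside a loop (or switch) body.
function isInsideLoop(code, pos) {
  return canInject(code, pos, " break; ");
}

const whileCode = 'while(truth)\nx+="*"';
const wholeCode = 'whole(truth)\nx+="*"';
console.log(isInsideLoop(whileCode, whileCode.indexOf("x+")));
console.log(isInsideLoop(wholeCode, wholeCode.indexOf("x+")));
```

The last two lines probe the while(truth) / whole(truth) pair raised
earlier in the thread: the break probe is accepted only in the first.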

KSB. To finish off the algorithm description, some
words on determining the prior newline, semicolon,
or right curly brace: If a right curly brace is
encountered, then anything within it cannot be a
last statement. Therefore, in that situation, find
the corresponding opening brace, and continue the
search backwards from there. Finding the matching
(opening) brace is straightforward because one just
searches backwards replacing the text starting from
each candidate opening brace till the closing one
with {x} or {x:y}, and then apply checkSyntax. The
first opening brace to pass such a syntax check is
the matching one. While this does not guarantee
that we will be at the top level (eg. if / nested
loops), at least we can be sure that we are not
embedded within {...}
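
A minimal sketch of that brace-matching search, again assuming the
Function constructor as the syntax checker; findMatchingOpenBrace and
isValid are hypothetical names. Engines that support ES2015 shorthand
properties accept {x} as an object literal as well as a block, so the
{x:y} fallback mainly matters on older engines. The sketch returns -1
for constructs such as switch bodies, where neither stub is valid.

```javascript
// Hypothetical syntax check: compile as a function body, never execute.
function isValid(code) {
  try {
    new Function(code);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Given the index of a closing brace, walk backwards over candidate "{"
// characters. The first candidate whose whole span can be swapped for a
// stub, leaving valid code, is taken as the matching opening brace.
function findMatchingOpenBrace(code, closePos) {
  const tail = code.slice(closePos + 1);
  for (let i = closePos - 1; i >= 0; i--) {
    if (code[i] !== "{") continue;
    const head = code.slice(0, i);
    if (isValid(head + "{x}" + tail) || isValid(head + "{x:y}" + tail)) {
      return i;
    }
  }
  return -1; // no candidate produced valid code
}

const code = 'if (a) { b = "{"; } else { c = {d: 1}; }';
console.log(findMatchingOpenBrace(code, code.length - 1) === code.indexOf("{ c"));
```

The "{" inside the string literal is skipped automatically, because
cutting the text there leaves an unterminated string that fails the
syntax check.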


I have implemented the above description into
working code, though there are some parts still
unexercised. Since this is already a long post, I
won't make it longer by posting the actual code.
However, one can piggyback off this to break up a
piece of code into all its component statements,
even within loops, conditionals, and blocks. That,
however, is a bit trickier since one can't simply
recurse with lastStatementPos at the lower levels,
because the code within a block won't necessarily
pass a syntax check (eg. if you try to work with:
{ var foo=7;
break; }
without including the relevant loop, checkSyntax
will be displeased, ie. it won't pass a syntax
check). try/catch also require a bit of care.


In any case, direct syntax parsing is more
efficient, but the method outlined here and in
removeTrailingComments, that of using javascript's
own syntax checking upon injection of a judicious
code snippet is sometimes far simpler than having
to write a parser from scratch. For example,
the method is easily adapted to removing all
comments from a given block of code. In addition,
it also solves the problem of injecting a return,
when possible, before the last statement under
either of the two possible definitions mentioned
(since if the final statement is a block, one can
just recurse into it).

Csaba Gabor from Vienna

kangax

Nov 9, 2009, 1:14:50 PM
Csaba Gabor wrote:
> On Nov 3, 4:21 pm, Csaba Gabor <dans...@gmail.com> wrote:
>> Suppose you have some javascript statements in a string.
>> Can you determine whether the entire string is syntactically
>> valid, and if so, the starting position of the last statement?
>> In other words,
>>
>> function lastStatementPos(code) {
>> // returns the starting position within code of the last
>> // javascript statement, and -1 if code is not syntactically
>> // valid.
>>
>> This came up in a different context today, and I thought
>> it would make an interesting exercise.
>>
>> Csaba Gabor from Vienna
>
> My solution for determining the (position of) the
> last javascript statement runs along the lines outlined
> in my Nov. 5 response to Lasse's first post at
> http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/91262ad01ca356bc/
> and also in my response to VK within this thread:
> http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/2aa9a60623eb5883/
>
> In the below, the term inject will mean to insert
> code at a given point within a larger block of code
> and to have it be syntactically valid.
>
> Unless I'm forgetting some cases, javascript statements
> end with either ; or } or a newline. The idea will

Don't forget that it's the LineTerminator production (see 7.3), not just
newline.

[...]

--
kangax
