Classification: UNCLASSIFIED
Jerome,
I strongly recommend reading and examining the parsers mentioned. My strategy for writing a parser would be these steps:
1) Tokenizer - Parse the input into tokens, which are independent atomic pieces. It is absolutely essential that this step come first.
2) Analyzer - Examine the relationships between tokens to provide meaning and syntax validation. You have to know whether something is being compared or assigned: what exactly is the current token doing in relation to the surrounding tokens? If you want to account for automatic semicolon insertion (ASI), do it in this step; if you choose to ignore ASI, be prepared to output an error when a semicolon is missing.
3) Builder - Once the analysis phase is complete, you have to organize the tokens into some kind of output. This output must be a structured, well-defined form that combines the tokens with the analysis so that the code can be executed merely by reading this output. Esprima, for example, produces JSON output in which metadata names describe the tokens and the nesting of the structure describes the relationships between them.
4) Evaluator - The code that uses the output from the previous step to execute the code and return a result. It is possible to combine steps 3 and 4, but I do not recommend this as it would make your application a bit more challenging to extend later. By keeping steps 3 and 4 separate you can output the parsed code for evaluation and troubleshooting of your parser, which helps find and remove bugs.
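To make the four steps concrete, here is a minimal sketch for a toy language of numbers, + - * / and parentheses. It is not how Esprima or any other parser mentioned here is implemented. The function names (tokenize, analyze, build, evaluate) and the token shape are hypothetical, the node names "Literal" and "BinaryExpression" are simply borrowed as familiar labels, and the builder uses precedence climbing, which is one technique among several.

```javascript
// Step 1: Tokenizer. Split the input into atomic tokens.
function tokenize(input) {
  const tokens = [];
  // Sticky regex: optional whitespace, then a number or one punctuator.
  const pattern = /\s*(?:(\d+(?:\.\d+)?)|([-+*\/()]))/y;
  let pos = 0;
  while (pos < input.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(input);
    if (match === null) {
      if (input.slice(pos).trim() === "") break; // only trailing whitespace left
      throw new SyntaxError("Unexpected character near position " + pos);
    }
    if (match[1] !== undefined) {
      tokens.push({ type: "number", value: Number(match[1]) });
    } else {
      tokens.push({ type: "punctuator", value: match[2] });
    }
    pos = pattern.lastIndex;
  }
  return tokens;
}

// Step 2: Analyzer. Validate each token against its neighbours.
function analyze(tokens) {
  let expectOperand = true; // true when a number or "(" must come next
  let depth = 0;            // number of currently open parentheses
  for (const token of tokens) {
    if (token.type === "number") {
      if (!expectOperand) throw new SyntaxError("Unexpected number " + token.value);
      expectOperand = false;
    } else if (token.value === "(") {
      if (!expectOperand) throw new SyntaxError("Unexpected (");
      depth += 1;
    } else if (token.value === ")") {
      if (expectOperand || depth === 0) throw new SyntaxError("Unexpected )");
      depth -= 1;
    } else {
      if (expectOperand) throw new SyntaxError("Unexpected operator " + token.value);
      expectOperand = true;
    }
  }
  if (expectOperand || depth !== 0) throw new SyntaxError("Unexpected end of input");
  return tokens;
}

// Step 3: Builder. Turn validated tokens into a tree (precedence climbing).
const PRECEDENCE = { "+": 1, "-": 1, "*": 2, "/": 2 };

function build(tokens) {
  let pos = 0;
  function primary() {
    const token = tokens[pos++];
    if (token.type === "number") return { type: "Literal", value: token.value };
    const inner = expression(1); // token was "(", so parse the group
    pos += 1;                    // skip the matching ")"
    return inner;
  }
  function expression(minPrecedence) {
    let left = primary();
    while (pos < tokens.length && PRECEDENCE[tokens[pos].value] >= minPrecedence) {
      const operator = tokens[pos++].value;
      const right = expression(PRECEDENCE[operator] + 1); // left-associative
      left = { type: "BinaryExpression", operator, left, right };
    }
    return left;
  }
  return expression(1);
}

// Step 4: Evaluator. Execute the code by reading only the builder's output.
function evaluate(node) {
  if (node.type === "Literal") return node.value;
  const left = evaluate(node.left);
  const right = evaluate(node.right);
  switch (node.operator) {
    case "+": return left + right;
    case "-": return left - right;
    case "*": return left * right;
    default: return left / right; // "/"
  }
}

const tree = build(analyze(tokenize("(1 + 2) * 3")));
console.log(JSON.stringify(tree)); // inspect the builder output on its own
console.log(evaluate(tree));       // 9
```

Because the builder returns plain data, the tree can be printed and inspected before any evaluation happens, which is the troubleshooting benefit of keeping steps 3 and 4 separate.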
I imagine the hardest part of writing a parser is the QA piece at the end. JavaScript allows a fair amount of sloppiness, and you have to test a diverse set of code samples to ensure you have not missed anything. Be prepared to push hundreds of samples through your parser to confirm it does exactly what you expect.
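One way to organize that QA pass is a table of samples, each marked as valid or invalid, run through the parser in a loop. The sketch below is only an illustration: runSamples and the sample table are hypothetical, and JSON.parse stands in for whatever parser you are testing.

```javascript
// Hypothetical sample table: each entry pairs an input with whether it should parse.
const samples = [
  { source: '{"a": [1, 2, 3]}', valid: true },
  { source: "[]", valid: true },
  { source: '{"a": 1,}', valid: false }, // trailing comma
  { source: "{'a': 1}", valid: false },  // single-quoted key
  { source: "", valid: false },          // empty input
];

// Run every case through the parser and collect the ones that misbehave.
function runSamples(parse, cases) {
  const failures = [];
  for (const testCase of cases) {
    let parsed = true;
    try {
      parse(testCase.source);
    } catch (error) {
      parsed = false;
    }
    if (parsed !== testCase.valid) failures.push(testCase.source);
  }
  return failures;
}

// JSON.parse is a stand-in for the parser under test.
const failures = runSamples(JSON.parse, samples);
console.log(failures.length === 0 ? "all samples behaved as expected" : failures);
```

Growing the table to hundreds of entries costs little, and invalid samples matter as much as valid ones, since a parser that accepts everything would pass a valid-only suite.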
Austin
On 03/31/13, jerome <jeromec...@gmail.com> wrote:
> Hi, I'm interested in defining some new data formats and learning how to parse those into valid JSON, CSS and HTML. I would like to write my own parsers in JavaScript to do this. Can any of you recommend a good starting point? I've read some of Crockford's writing on the parse tree used in JSLint and I have read through some of the codebase for PegJS, and now I need more. Please, suggestions!