For parser output, I would vote for an intermediate XML format with the
aim of compilation to XSLT (and ideally XQuery as well). I got feedback
from several people that compilation to a standard language would make
them more likely to use it.
This of course wouldn't prevent alternative implementations, such as
native support for Carrot in XQilla, which John Snelson proudly
demonstrated by the end of the day. :-)
Another resource that might be helpful:
http://www.fatdog.com/Extreme.html
Evan
I could take a stab at it.
An intermediate XML format would be useful, although maybe with an in-memory AST
first? We may need to do post-parsing logic to produce the right XML.
But having an intermediate XML lets you use XML tools to do the compilation
... is that what you had in mind?
I suggest
Parser -> In-memory AST -> XML
This decouples the parser and lets a compiler author plug in at either
the Java/memory/object layer or the XML layer.
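To make the layering concrete, here is a rough sketch of the AST -> XML step. It is in Python only for brevity (the real thing would presumably be Java on top of JavaCC). The Node class, the hand-built tree, and the element names are all made up for illustration; none of this is a proposed format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical in-memory AST node; 'kind' stands for a grammar production name."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def to_xml(node: Node) -> ET.Element:
    """Serialize an AST node and its subtree to an XML element."""
    elem = ET.Element(node.kind)
    if node.text:
        elem.text = node.text
    for child in node.children:
        elem.append(to_xml(child))
    return elem


# Stand-in for parser output: the call foo(1, "a") as a tiny hand-built AST.
ast = Node("FunctionCall", children=[
    Node("QName", "foo"),
    Node("IntegerLiteral", "1"),
    Node("StringLiteral", "a"),
])

xml_text = ET.tostring(to_xml(ast), encoding="unicode")
print(xml_text)
```

A compiler author who wants objects works with Node directly; one who wants XML tools takes the output of to_xml and never touches the parser.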
So given this, what XML format? Should we go with something close to
XQueryX or invent our own?
Do you have an XML Schema in mind?
----------------------------------------
David A. Lee
dl...@calldei.com
http://www.xmlsh.org
http://www.w3.org/2010/02/qt-applets/xquery10/
If it is what it claims to be ("it works!"),
I agree we should start from this. The parser can be JavaCC; the in-memory tree
can be JJTree.
XML format? A specialization/bastardization of XQueryX?
I think something like an "extended subset" of XQueryX would work great,
but if it's easier to use existing grammar names (and extra work to
translate to XQueryX-ish names), I would avoid the extra work. I don't
think the intermediate XML format needs to be standardized, just clear.
"Extended" in that we have our own top-level module format, and XQuery
expressions are extended to include:
* ruleset invocations (^foo()),
* shallow copy constructors (copy{}), and
* text node literals (`text`).
"Subset" in that we only use XQuery expressions, not XQuery's syntax for
defining functions, etc.
Evan
http://www.w3.org/2007/01/applets/
XQueryX is a nasty format to start from for performing the translation,
but it is a place to start. There is also an existing stylesheet which
will translate XQueryX back into XQuery - which is probably a good
foundation for some of the functionality.
Building an XQuery parser is hard - we shouldn't attempt it if we have a
default alternative. That said, I'd love to see an XQuery parser written
in XQuery - one day maybe.
John
--
John Snelson, Senior Engineer http://twitter.com/jpcs
MarkLogic Corporation http://www.marklogic.com
If no one else is chomping at the bit, I volunteer to take this W3C XQuery
XML and first:
1) Try to reproduce in my own environment a working implementation of a
standalone parser (not an applet).
2) Start adding mods one by one to evolve toward the Carrot spec (there is one,
right? :)
I don't particularly like JJTree because it has too little flexibility in
your in-memory data model, but if a framework that uses it is already
functional, it's probably best to keep it for now. That at least gives us
fewer variables for debugging.
-David
Despite the verbosity of yapp-xslt's output, I appreciated the fact that
the resulting XML was just a bunch of element markup added to the original
expression. That does seem way easier to use for our purposes than XQueryX.
Evan
Whereas yapp-xslt outputs the original query text with element decorations.
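A small sketch of why that property is handy: if the markup only decorates the query, the string value of the root element is the original query text. The decorated document and its element names below are invented for illustration; they are not yapp-xslt's actual vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical decorated output for the expression "1 + foo(2)".
# Element names are assumptions, not yapp-xslt's real output.
decorated = (
    "<AdditiveExpr><IntegerLiteral>1</IntegerLiteral> + "
    "<FunctionCall><QName>foo</QName>(<IntegerLiteral>2</IntegerLiteral>)"
    "</FunctionCall></AdditiveExpr>"
)

root = ET.fromstring(decorated)

# Concatenating every text node in document order gives back the source text.
original = "".join(root.itertext())
print(original)
```

The same idea would let a compiler copy untouched sub-expressions straight through by taking the string value of the element that wraps them.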
Evan
[24]   VarDecl   ::=
[26]   FunctionDecl   ::=
[84]   PrimaryExpr   ::=   Literal | VarRef | ParenthesizedExpr | ContextItemExpr | FunctionCall | OrderedExpr | UnorderedExpr | Constructor
[85]   Literal   ::=   NumericLiteral | StringLiteral | TextNodeLiteral
[109]   ComputedConstructor   ::=   CompDocConstructor
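As a rough illustration of the extended Literal production, here is a sketch that sorts a single token into one of its three alternatives. It assumes the backtick syntax for text node literals mentioned earlier in the thread, and the patterns are deliberately simplified: exponents, escaped quotes, and entity references are ignored, and any escaping rules inside backticks are unknown. The function name classify_literal is hypothetical.

```python
import re

# Simplified patterns; real XQuery literals have more forms than these.
NUMERIC = re.compile(r"\d+(\.\d*)?|\.\d+")
STRING = re.compile(r'"[^"]*"|\'[^\']*\'')
TEXT_NODE = re.compile(r"`[^`]*`")  # assumed backtick-delimited syntax


def classify_literal(token: str) -> str:
    """Return the name of the Literal alternative that matches the token."""
    if NUMERIC.fullmatch(token):
        return "NumericLiteral"
    if STRING.fullmatch(token):
        return "StringLiteral"
    if TEXT_NODE.fullmatch(token):
        return "TextNodeLiteral"
    raise ValueError(f"not a literal: {token!r}")


for tok in ["42", "3.14", '"hi"', "`some text`"]:
    print(tok, "->", classify_literal(tok))
```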
I found a newer link to the XQuery parser files:
http://www.w3.org/2010/02/qt-applets/
Any suggestions on what "base" we use for Carrot?
(The old link had some funky zip files without full source.)
John
http://www.w3.org/2010/02/qt-applets/
I've made tiny progress but got stuck. I've asked Liam for help.
Other suggestions welcome.
Here's the message to Liam:
--------------
Hi Liam. Great to see you again at Balisage.
I'm trying to run the XQuery grammar files here:
http://www.w3.org/2010/02/qt-applets/
So far I'm running into problems. Some I've resolved:
1) Missing xerces.jar -> I got this on the web.
2) Missing grammar.dtd -> I guessed the URL and got it.
Now I'm stuck in the build process while it looks for various "style" XSL
files, none of which are in the source or library zips.
From build.xml:
<property name="strip-grammar-file" value="../../style/strip.xsl"/>
<property name="assemble-spec-file"
value="../../style/assemble-parser-note.xsl"/>
<property name="grammar2spec-file" value="../../style/grammar2spec.xsl"/>
<property name="style-spec-file"
value="../../style/xmlspec-override.xsl"/>
<property name="style-shared-file"
value="../../style/lexnote-shared.xsl"/>
Where would I find these?
Preferably, is there a location with ALL the dependent files so that I could
run the parser generator?
If not ... I may be asking you more questions as I peel the onion.
Thanks very much!
-David Lee
--------------------
-----Original Message-----
From: carro...@googlegroups.com [mailto:carro...@googlegroups.com] On
Behalf Of John Snelson
Sent: Wednesday, August 10, 2011 7:06 AM
To: carro...@googlegroups.com
Subject: Re: [Carrot] Re: Parser and Architecture
Thinking of just starting with the JavaCC and/or JJTree code and skipping the
source XML.
Not really fond of JJTree ... but since it's already implemented that way it
might make a good start.
Suggestions?
John
I'm simply missing lots of files.
What's posted on the W3C site is a tiny subset of what's required. Lots of
references to ../../xxx
which is assumed to exist ...
Liam said he was going to refer my question to the person (unknown?) who's
in charge of the grammar, but I haven't heard anything else.
He also said that if I were a member I could pull the files (but I'm not).
Starting with the JJTree files wouldn't be the end of the world, I guess. Just not as nice to modify as it could be.
John
BUILD FAILED
C:\Work\DEI\carrot\xgrammar\grammar\parser\build.xml:415: The following
error occurred while executing this line:
C:\Work\DEI\carrot\xgrammar\grammar\parser\build.xml:496: input file
C:\Work\DEI\carrot\xgrammar\xquery-11\src\xquery.xml does not exist
The directory
C:\Work\DEI\carrot\xgrammar\xquery-11
doesn't exist and I can't find it on the W3C web site.
Any suggestions?
cd parser
ant gen-grammar -Dlanguage=xquery10 -Dspec=xquery10
You can change the parameters to generate different languages. For instance:
ant gen-grammar -Dlanguage=combined -Dspec=xquery10 -Dspec2=fulltext
-Dspec3=update
The results are put into build/${language}. I don't think you need
anything else in the build file.
John