Python Target fails with SPARK grammar

Simon Heisterkamp

Dec 20, 2021, 6:25:41 AM
to antlr-discussion
I would like to build my own parser for the Spark SQL syntax. Spark already uses ANTLR4, but generating the Python target from its grammar fails. The Spark SQL ANTLR4 grammar is the SqlBase.g4 file downloaded below.

Run this PowerShell code (or an equivalent):

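# Download the ANTLR tool and the Spark SQL grammar, then generate the Python 3 target.
# curl.exe is called explicitly because Windows PowerShell aliases curl to Invoke-WebRequest.
$jar = "antlr-4.9.3-complete.jar"
curl.exe -O "https://www.antlr.org/download/$jar"
$target = "SqlBase.g4"
curl.exe -O "https://raw.githubusercontent.com/apache/spark/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/$target"
java -jar $jar -Dlanguage=Python3 $target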

Then inspect SqlBaseLexer.py. From line 1907 to about line 1960, the generated code is suddenly C++ instead of Python.
I would like to submit this as an issue on the ANTLR GitHub. Any objections?

Best,
Simon

Mike Lischke

Dec 20, 2021, 7:03:28 AM
to ANTLR discussion group
Hi Simon,
The grammar contains some action code written for the C++ target (see the named actions @parser::members and @lexer::members). You have to port this manually to Python. 
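
As a rough illustration of what such a port could look like, here is a minimal sketch. It assumes the lexer members include a helper along the lines of isValidDecimal() that peeks at the next input character. That name and logic are an assumption about the grammar, not a verified copy of it. FakeCharStream and LexerMembersSketch are hypothetical stand-ins so the snippet runs without the ANTLR runtime. In a real grammar the method body would sit inside @lexer::members { ... } with Python indentation, and a predicate that calls it would likely need to read {self.isValidDecimal()}?.

```python
class FakeCharStream:
    """Hypothetical stand-in for an ANTLR character stream.

    LA(1) returns the code point of the next unread character,
    or -1 when the end of input has been reached.
    """

    def __init__(self, text, index=0):
        self.text = text
        self.index = index

    def LA(self, offset):
        pos = self.index + offset - 1
        if 0 <= pos < len(self.text):
            return ord(self.text[pos])
        return -1


class LexerMembersSketch:
    """Hypothetical Python version of a lexer helper (assumed, not copied from the grammar)."""

    def __init__(self, input_stream):
        # The generated lexer would already provide self._input.
        self._input = input_stream

    def isValidDecimal(self):
        # A decimal literal is rejected when it is directly followed by
        # an upper-case letter, a digit, or an underscore.
        next_char = self._input.LA(1)
        is_upper = ord("A") <= next_char <= ord("Z")
        is_digit = ord("0") <= next_char <= ord("9")
        return not (is_upper or is_digit or next_char == ord("_"))


if __name__ == "__main__":
    # The stream is positioned just after the literal "1.5".
    for text in ("1.5 ", "1.5E", "1.5"):
        lexer = LexerMembersSketch(FakeCharStream(text, index=3))
        print(repr(text), lexer.isValidDecimal())
```

Any member variables declared in @parser::members would need the same treatment: plain Python attributes, referenced through self in the predicates that use them.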

