I’m trying to build a COBOL parser. One of the more difficult COBOL statements to parse is the PICTURE clause.
I’m having the most trouble with the period ‘.’. It can mark the end of the statement, and it can also mark the position of the decimal point in a formatted string. The grammar below handles a single period at the end of the statement [PIC X(9).], but it does not handle a statement with two periods [Pic 999.99.], where the first is the decimal point and the second ends the statement.
An alternative approach I have been exploring is to treat everything between ‘Pic’ and the final period ‘.’ as a character string and let some Java code analyze the picture clause. In that case, I would accumulate the entire string, strip off the last period, and create an additional STATEMENT_END token to send to the parser. I’m looking for some help on how to make this work.
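A minimal sketch of the string-handling half of that idea in plain Java, assuming the text after the PIC keyword has already been collected into one string. The class and method names (PictureSplitter, stripTerminator, decimalPointIndex) are hypothetical and not part of the grammar, and emitting the STATEMENT_END token from the lexer is not shown:

```java
// Hypothetical helper class; the names are illustrative only.
final class PictureSplitter {

    // 'raw' is assumed to be everything after the PIC keyword,
    // including the terminating period, e.g. "999.99.".
    static String stripTerminator(String raw) {
        String text = raw.trim();
        if (!text.endsWith(".")) {
            throw new IllegalArgumentException("missing terminating period: " + raw);
        }
        // Drop only the last period; any earlier period is a decimal point.
        return text.substring(0, text.length() - 1);
    }

    // Index of the decimal point inside the picture string, or -1 if there is none.
    static int decimalPointIndex(String picture) {
        return picture.indexOf('.');
    }

    public static void main(String[] args) {
        for (String raw : new String[] {"X(9).", "999.99."}) {
            String picture = stripTerminator(raw);
            System.out.println(picture + " -> decimal point at " + decimalPointIndex(picture));
        }
    }
}
```

With the two examples from above, X(9). yields the picture string X(9) with no decimal point, and 999.99. yields 999.99 with the decimal point at index 3.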
Any thoughts on the best approach, or on how to change my grammar to handle multiple decimal points, are welcome.
My initial grammar is as follows.
grammar CoolPictureParser;

@header {
    package antlr;
    import java.util.*;
}

dd_picture_clauses
    : (dd_picture_kw picture_string END_OF_PIC)+
    ;

dd_picture_kw
    : KW_PICTURE
    | KW_PIC
    ;

picture_string
    : CURRENCY?
      (PICCHAR+ REPEAT?)+
      (PUNCTUATION (PICCHAR+ REPEAT?)+)*
    ;

KW_PIC : 'Pic';

PIC_WS : [ \t\r\n] -> skip;

CURRENCY : ~[0-9ABCDPRSVXZa-z\*\+\-\/\,\.\;\(\)\=\'\"] ;

PICCHAR
    : [ABEGPSVXZabegpsvxz90\+\-\*\$]
    | 'CR'
    | 'DB'
    ;

REPEAT : '(' DIGIT+ ')' ;

//PUNCTUATION : [\/\,\.\:] ;
PUNCTUATION : [\/\,\:] ;

END_OF_PIC : '.';

fragment
DIGIT : [0-9];