This might be something for Congo, not JavaCC21, but I thought I would ask now...
I'm investigating how to retire my legacy way of recovering from syntax errors and reporting them. In general, the way it was done (and still is in the JavaCC21 version of the grammar) is with a try/catch block that catches the ParseException and reports the error with a meaningful message, followed by a "manual" re-sync, hopefully to a known point in the source (consuming everything along the way). At that point another message is issued to declare that scanning has resumed, and the try/catch expansion ends successfully. It looks something like this:
UsingArgs :
    ATTEMPT
        ( <USING> Usings [ ReturningArg ] | ReturningArg [ <USING> Usings ] )
    RECOVER {
        indicateError(ErrorCode.BAD_USING_PHRASE);
        skipToBefore(
            DECLARATIVES,                        /* try beginning of DECLARATIVES section */
            Tokens.sequence(COBOL_WORD,SECTION), /* try a section header */
            Tokens.sequence(LEVEL_66,DOT),       /* try a paragraph header */
            Tokens.sequence(LEVEL_77,DOT),
            Tokens.sequence(LEVEL_78,DOT),
            Tokens.sequence(LEVEL_88,DOT),
            Tokens.sequence(LEVEL_01,DOT),
            Tokens.sequence(CHILD_LEVEL_NUMBER,DOT),
            Tokens.sequence(INTEGER,DOT),
            Tokens.sequence(COBOL_WORD,DOT),
            ACCEPT,                              /* try a statement */
            ADD,
            ALTER,
            CALL,
            CANCEL,
            CLOSE,
            COMPUTE,
            CONTINUE,
            DELETE,
            DISPLAY,
            DIVIDE,
            EVALUATE,
            EXEC_SQL,
            EXIT,
            GOBACK,
            Tokens.sequence(GO,TO),
            GO,
            IF,
            INITIALIZE,
            INSPECT,
            MERGE,
            MOVE,
            MULTIPLY,
            OPEN,
            PERFORM,
            READ,
            RELEASE,
            RETURN,
            REWRITE,
            SEARCH,
            SET,
            SORT,
            START,
            STOP,
            STRING,
            SUBTRACT,
            UNLOCK,
            UNSTRING,
            WRITE,
            Tokens.sequence(END,PROGRAM),        /* try the end of program */
            EOF);                                /* punt */
        printScanResumed();
    }
;
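To make the mechanics concrete, here is a minimal Java sketch of what a `skipToBefore`-style helper boils down to. It is not the actual helper used in the grammar above: tokens are modeled as plain strings rather than real token types, and the class name `SkipToBeforeSketch` and the method `matchesAt` are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SkipToBeforeSketch {

    // Hypothetical stand-in for skipToBefore: scans forward from `start` and
    // returns the index of the first position where any sync pattern matches.
    // Everything before that index is treated as consumed (skipped) input.
    public static int skipToBefore(List<String> tokens, int start,
                                   List<List<String>> syncPatterns) {
        for (int pos = start; pos < tokens.size(); pos++) {
            for (List<String> pattern : syncPatterns) {
                if (matchesAt(tokens, pos, pattern)) {
                    return pos;
                }
            }
        }
        // Nothing matched: behave as if end of input were the sync point.
        return tokens.size();
    }

    // True if `pattern` (a single token or a token sequence) occurs at `pos`.
    public static boolean matchesAt(List<String> tokens, int pos,
                                    List<String> pattern) {
        if (pos + pattern.size() > tokens.size()) {
            return false;
        }
        for (int i = 0; i < pattern.size(); i++) {
            if (!tokens.get(pos + i).equals(pattern.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "USING", "BOGUS", "JUNK", "COBOL_WORD", "DOT", "MOVE", "EOF");
        List<List<String>> sync = Arrays.asList(
            Arrays.asList("COBOL_WORD", "SECTION"), // section header
            Arrays.asList("COBOL_WORD", "DOT"),     // paragraph header
            Arrays.asList("MOVE"),                  // a statement
            Arrays.asList("EOF"));                  // punt
        int resumeAt = skipToBefore(tokens, 1, sync);
        System.out.println("resume at index " + resumeAt
            + " (" + tokens.get(resumeAt) + ")");
    }
}
```

The point of the sketch is only that every sync pattern has to be supplied by hand by the caller; nothing is derived from the grammar itself.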
Note the fragile, labor-intensive way it has to "guess" at the follow set (at least approximately) in order to position itself correctly to resume parsing. Clearly it is desirable to replace all this with something using the JavaCC21 (or Congo) feature of re-syncing in fault-tolerant mode. Something intuitively like:
UsingArgs :
    ATTEMPT
        <USING> Usings [ ReturningArg ] | ReturningArg [ <USING> Usings ]
    RECOVER (
        { indicateError(ErrorCode.BAD_USING_PHRASE); }
        ( <USING> Usings! [ ReturningArg ]! | ReturningArg! [ <USING> Usings ]! )
        { printScanResumed(); }
    )
;
This way, the error associated with the failing expansion can be issued, the expansion can be retried with fault-tolerant re-sync in effect, and the point at which normal parsing resumes can be identified.
[ I realize this thought experiment has issues right now, among them the ones flagged by the comments in the INJECT AttemptBlock about whether or not the first and follow sets should include the RECOVER expansion(s), if any. This is just to illustrate what I am thinking. ]
So I began by investigating this to see what was actually generated:
Sentence# :
    ATTEMPT
        ( ( Statement )* SomeDots ) // MV quirk (empty sentence)
    RECOVER (
        ( Statement )*! SomeDots // Re-sync on Statement or dots
        { printScanResumed(); }
    )
;
Well, it seems to cause an exception while generating the parser (in the lookahead code, I think). A simpler case that causes the same error is:
X :
    FooOrBar
;
FooOrBar :
    ATTEMPT
        "foo"
    RECOVER
        ( "bar" )
;
It seems that an ExpansionSequence is expected, but an AttemptBlock is an Expansion, not an ExpansionSequence.
So I think failure of the simple case above is a bug, and now I've revealed where I am headed with this, so any other guidance regarding that would be welcome.
BTW, this is the error:
```
Jan 13, 2023 4:30:39 PM freemarker.log.JDK14LoggerFactory$JDK14Logger error
SEVERE: java.lang.ClassCastException: com.javacc.parser.tree.AttemptBlock cannot be cast to com.javacc.core.ExpansionSequence
java.lang.ClassCastException: com.javacc.parser.tree.AttemptBlock cannot be cast to com.javacc.core.ExpansionSequence
----------
==> if grammar.choicePointExpansions?size !=0 [on line 39, column 5 in LookaheadRoutines.java.ftl]
in user-directive LookaheadCode.Generate [on line 326, column 1 in Parser.java.ftl]
----------
```
...snip ...