I am interested in learning more about the error reporting capabilities of CongoCC.
I have a JavaCC-based custom embedded expression language that can evaluate expressions against a data model in my application. The expression language supports the usual math, string, logical and relational operators plus a large set of functions. Expressions can be entered and compiled, and then subsequently evaluated against records in the data model. I would like to improve the error reporting capabilities of this language, possibly by converting it to CongoCC.
There are three points in the expression life cycle where errors might occur:
- during parsing, when the input does not match the language (for example, the wrong number of arguments);
- during semantic analysis, when the arguments are checked for appropriate data types (for example, that the operands of a binary operator agree in type); and
- at run time, for example when a substring index is out of bounds of the string.
Ideally I would like to handle each of these errors in a similar way:
- provide a description of the error (e.g. something like "function cos takes one numeric argument" or "substring index out of bounds (index=10, string length=5)");
- provide a context for where the error occurred: print a portion of the expression showing the closest token to where the error occurred.
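To make the goal concrete, here is a minimal sketch of the kind of uniform error I have in mind. Every name in it (`ExpressionError`, `Phase`, `render`) is hypothetical and not part of JavaCC or CongoCC. It assumes a single-line expression and that the begin/end character offsets of the offending token are available.

```java
// Hypothetical error type shared by all three phases; not a JavaCC/CongoCC class.
final class ExpressionError extends RuntimeException {
    enum Phase { PARSE, SEMANTIC, RUNTIME }

    private final Phase phase;
    private final String expression; // the full expression text (assumed single-line)
    private final int begin;         // offset of the first offending character, inclusive
    private final int end;           // offset just past the offending text, exclusive

    ExpressionError(Phase phase, String description, String expression, int begin, int end) {
        super(description);
        this.phase = phase;
        this.expression = expression;
        // Clamp the offsets so a bad span can never break the error report itself.
        this.begin = Math.max(0, Math.min(begin, expression.length()));
        this.end = Math.max(this.begin, Math.min(end, expression.length()));
    }

    // Renders the description, the expression, and a caret line under the offending text.
    String render() {
        StringBuilder sb = new StringBuilder();
        sb.append(phase).append(" error: ").append(getMessage()).append('\n');
        sb.append(expression).append('\n');
        sb.append(" ".repeat(begin));
        sb.append("^".repeat(Math.max(1, end - begin)));
        return sb.toString();
    }
}
```

For example, `new ExpressionError(ExpressionError.Phase.RUNTIME, "substring index out of bounds (index=10, string length=5)", "substring(name, 10)", 16, 18).render()` would put two carets under the `10`.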
I can easily trap the second and third kinds of error in the Java code that does the semantic analysis and runtime testing. Is it possible to put try-catch blocks into the grammar so that parsing failures are also caught and my custom messages are thrown?
My next question is how to print the expression context when dealing with an error. Of course, when you catch a parsing error you know the exact spot in the input where the train jumped the tracks. But for semantic and runtime errors, how is it possible to reconstruct the expression, given that you know what node you are in (and possibly what token is related to that node)? I recall Jonathan (@revusky) mentioning that CongoCC can fully reconstruct inputs from any token, but I cannot seem to find that post again.
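If I had to do this by hand, I imagine each node would carry a reference to the original text plus the offsets it covers, roughly as in the sketch below. `SourceSpan` and its methods are names I made up for illustration. Whether CongoCC nodes and tokens already expose something equivalent is exactly what I am asking.

```java
// Hypothetical helper: a region of the original expression text that a node
// or token covers. Not a CongoCC class; the names are illustrative only.
final class SourceSpan {
    private final String source; // the complete expression as entered
    private final int begin;     // inclusive offset into source
    private final int end;       // exclusive offset into source

    SourceSpan(String source, int begin, int end) {
        if (begin < 0 || end < begin || end > source.length()) {
            throw new IllegalArgumentException("bad span " + begin + ".." + end);
        }
        this.source = source;
        this.begin = begin;
        this.end = end;
    }

    // The exact text this span covers, reconstructed from the original input.
    String text() {
        return source.substring(begin, end);
    }

    // The span plus up to 'radius' characters on each side, with the span marked.
    String excerpt(int radius) {
        int from = Math.max(0, begin - radius);
        int to = Math.min(source.length(), end + radius);
        return (from > 0 ? "..." : "")
                + source.substring(from, begin)
                + ">>>" + text() + "<<<"
                + source.substring(end, to)
                + (to < source.length() ? "..." : "");
    }

    // A span for a whole node, built from the spans of its first and last tokens.
    static SourceSpan covering(SourceSpan first, SourceSpan last) {
        if (!first.source.equals(last.source)) {
            throw new IllegalArgumentException("spans come from different inputs");
        }
        return new SourceSpan(first.source, first.begin, last.end);
    }
}
```

With something like this, a runtime error raised while evaluating a node could report `node.span.excerpt(10)` without re-parsing anything, since the span is recorded once at parse time.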