One of the more satisfying parts of my ongoing work on this project has been getting rid of so much of the visual clutter in legacy JavaCC.
I mean, really, why should one have to write:
void Foobar() : {} {Foo() Bar()}
instead of just:
Foobar : Foo Bar ;
(Actually, it is pretty clear why you need empty parentheses when calling a method that takes no arguments in Java and other languages: they distinguish the call from a plain name, such as a field or variable. But that reason was never present in JavaCC!)
Or, by the same token, why should one have to write:
LOOKAHEAD(2) Foobar()
rather than something shorter and just as clear, i.e.:
SCAN 2 Foobar
Again, the parentheses served no purpose, and LOOKAHEAD is just more of a mouthful than SCAN. Admittedly, this sort of thing is all cosmetic, but I think that getting rid of the visual clutter already adds value, since it makes grammar files less intimidating and more approachable. In other words, even if (contrary to fact) the only thing I had ever done was streamline the syntax, that would still be a worthwhile contribution to the space. (That said, I am certain that even purely cosmetic proposals of this nature would have been rejected by the legacy JavaCC insiders. Or, if not explicitly rejected, they would just have bogged you down in endless nothingburger-ism.)
Looking back on this whole process (it's been a number of years at this point), I strain to remember some of the intermediate stages. I am pretty sure there was a stage where you had to write SCAN 2 => Foobar. That did not last very long, though, because I quickly realized that there was no real need to require the arrow in that construct.
On the other hand, if you have:
SCAN Foo Bar => Foobar
there, you do need the arrow. Otherwise, it's ambiguous. How would one know that the intention was not:
SCAN Foo => Bar Foobar
In short, the arrow is necessary when you have a separate lookahead expansion. Otherwise, it is not. Note, by the way, that you can still write SCAN 2 => Foo; the arrow is simply optional there.
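The ambiguity can be shown mechanically. The sketch below is not how the parser generator works internally. It is a small Python illustration, with hypothetical helper names (possible_splits, split_at_arrow), that lists every way the symbols after an arrow-less SCAN could be divided into a lookahead part and an expansion part.

```python
def possible_splits(symbols):
    """Return every (lookahead, expansion) reading of an arrow-less SCAN.

    Both parts must be non-empty, so n symbols give n - 1 readings.
    """
    return [(symbols[:i], symbols[i:]) for i in range(1, len(symbols))]


def split_at_arrow(symbols, arrow="=>"):
    """With an explicit arrow there is exactly one reading."""
    i = symbols.index(arrow)
    return symbols[:i], symbols[i + 1:]


# SCAN Foo Bar Foobar, written without an arrow
readings = possible_splits(["Foo", "Bar", "Foobar"])
for lookahead, expansion in readings:
    print("SCAN", " ".join(lookahead), "=>", " ".join(expansion))
```

With three symbols there are two candidate readings, which is exactly the pair shown above. Once the arrow is written, split_at_arrow has only one place to cut.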
But hey, here is a radical thought. Why do we need the SCAN keyword there at all? Certainly, with a simple numerical lookahead, there is no ambiguity. So why not allow people to write:
[2 Foobar]
Is that clear? We optionally enter a Foobar, but we scan ahead 2 tokens (instead of the default of 1 token) to decide whether to enter it. I've been aware of this possibility for quite some time but never acted on it. I guess it comes down to wondering whether it is possible to have too much terseness. Until now I had mostly thought about this in terms of straight numerical lookahead, but it seems quite possible that SCAN could be made optional even for syntactic lookahead. In the following:
SCAN Foo => Bar
is the SCAN even strictly necessary? I knew that LOOKAHEAD was verbose and could be replaced by something shorter, yet I still assumed that some keyword was needed here. Is it? (I'm actually less certain of this case and need to think about it.)
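For anyone unsure what the numeric form buys you, here is a toy Python model of the difference between the default 1-token check and a 2-token scan. It rests on loud assumptions: should_enter is a hypothetical helper, and Foobar is modelled as a single fixed token sequence (foo then bar), which is far simpler than a real production.

```python
def should_enter(tokens, pos, expected, k=1):
    """Toy numeric scan: True if the next k tokens equal the first k expected ones.

    Assumption: the expansion is modelled as one fixed token sequence,
    not as a real grammar production.
    """
    return tokens[pos:pos + k] == expected[:k]


foobar = ["foo", "bar"]   # hypothetical: Foobar is Foo followed by Bar
tokens = ["foo", "baz"]   # input that starts like Foobar, then diverges

enter_default = should_enter(tokens, 0, foobar)        # plain [Foobar]
enter_scan_2 = should_enter(tokens, 0, foobar, k=2)    # [2 Foobar]
print(enter_default, enter_scan_2)  # prints: True False
```

With one token of lookahead the optional Foobar is entered and then fails on the second token. With two, the mismatch is seen up front and the option is skipped.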
So, finally, I thought I'd put this up for discussion. What do people think? What about a further round of syntax streamlining?
Or will the resulting syntax be too cryptic? My intention is not to change anything in a non-backward-compatible way, so no need to worry about any existing grammars being broken...