adMartem
Shouldn't the LOOKAHEAD([...,]{...}) convert to SCAN {...}# [...] => in order to retain more consistent function?
Yes, you're right. 😅
I changed it, but I have to look at this carefully because, to be honest, I'm not 100% sure how it should really be, in general.
In legacy JavaCC, if you write:
LOOKAHEAD({someCondition()}) Expansion()
then, if the condition returns true, it enters the expansion without any actual lookahead. Not even one token, which is usually the default if nothing is specified. But not here. In other words, the above is the same as:
LOOKAHEAD(0, {someCondition()}) Expansion()
I'm not sure that the way JavaCC21 handles this stuff is so much better. But, if you want the above, as things stand, you would have to put in the zero explicitly, i.e.:
SCAN 0 {someCondition()} => Expansion
Currently, the logic is that a SCAN with no numerical or syntactic lookahead just scans to the end of the expansion. This is because, at some point earlier, before I even came up with the up-to-here concept, I decided that:
SCAN Foo Bar Baz
would be equivalent to:
LOOKAHEAD(Foo() Bar() Baz()) Foo() Bar() Baz()
Or, in other words, a lone SCAN in front of the expansion meant scanning to the end of the expansion. But now that there is a separate way of expressing it (i.e. Foo Bar Baz =>||), I am not sure I want to continue to support the older SCAN Foo Bar Baz form, which is why the syntax converter now rewrites those things.
I guess, generally speaking, in the move to the Congo branding, there is a one-time opportunity to get this stuff right. Quite possibly the better approach is that:
SCAN {someCondition()} => Expansion
should mean that it checks the condition AND checks the default single-token lookahead. Or maaaybeeee go back to the notion that if you express a semantic lookahead with no numerical or syntactic lookahead, we just check the condition and have zero tokens of lookahead, which is the legacy JavaCC behavior.
I'm a bit worried about the above causing an indefinite lookahead, because it could be a major gotcha, where people write these very expensive indefinite lookaheads without intending to do so. So that is an open question to throw out there.
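To make the three candidate readings of a condition-only SCAN concrete, here is a toy model in Python. It is only a sketch: none of the function names come from JavaCC or CongoCC, and an expansion is simplified to a fixed list of token types, which real expansions are not.

```python
from typing import Callable, Sequence

# Toy model only: none of these names come from JavaCC or CongoCC.
# An "expansion" is simplified to a fixed sequence of token types.


def enter_condition_only(condition: Callable[[], bool],
                         upcoming: Sequence[str],
                         expansion: Sequence[str]) -> bool:
    """Legacy JavaCC reading: check the condition, look at zero tokens."""
    return condition()


def enter_condition_plus_one(condition: Callable[[], bool],
                             upcoming: Sequence[str],
                             expansion: Sequence[str]) -> bool:
    """Check the condition AND the default single-token lookahead."""
    return condition() and list(upcoming[:1]) == list(expansion[:1])


def enter_condition_plus_full_scan(condition: Callable[[], bool],
                                   upcoming: Sequence[str],
                                   expansion: Sequence[str]) -> bool:
    """Check the condition AND scan to the end of the expansion."""
    n = len(expansion)
    return condition() and list(upcoming[:n]) == list(expansion)


expansion = ["FOO", "BAR", "BAZ"]
upcoming = ["FOO", "QUX"]  # first token matches, second does not
always = lambda: True

print(enter_condition_only(always, upcoming, expansion))            # True
print(enter_condition_plus_one(always, upcoming, expansion))        # True
print(enter_condition_plus_full_scan(always, upcoming, expansion))  # False
```

In this model only the third variant inspects more tokens as the expansion gets longer, which is the cost concern with an unintended scan to the end of the expansion.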
Though, all that said, this situation with the legacy JavaCC where semantic lookahead does apply in lookaheads but syntactic lookahead does not -- I wonder if anybody ever tried to justify that on theoretical grounds of some sort? Well, the reason it's like that is because, in their first-pass proof-of-concept implementation of this stuff back around 1996/1997, this was how it was implemented and it was never revisited!
Well, anyway, you're right that there is quite a bit of overlap between ASSERT and using lookaheads, i.e. SCAN. And maybe there is a historical opportunity to revisit this and get things right (or far closer to right) with a rebranding to CongoCC. And, of course, there is the possibility of introducing new constructs that work the way we prefer and using new keywords, like CHECK and/or VERIFY, say. And thus leaving SCAN and ASSERT working as before...
Well, the above includes some rather woolly-minded (at the moment) thoughts...