Well, I think the problem you're running into (or at least one of them) is that I changed, thinking I could get away with it, how any scanahead specified inside a non-terminal gets used.
The way it was before, if you wrote:
A B C
|
D E F
and let's say that B contains an up-to-here. That up-to-here would be used as long as the preceding expansions were potentially empty, i.e. could match without consuming any tokens. Potentially. So, A could be:
A: ["foo" | "bar" | "baz"];
which is *potentially* empty. Or if the first expansion in the choice above was:
[A] B C
which amounts to the same thing...
The way it's implemented now, the elements before the nonterminal (say B in this case) must consume no tokens. Since [A] is potentially non-empty, any up-to-here in B is ignored. But, in principle, you can still have:
ASSERT {condition1()} {doSomething()} B C
|
....
And it would use the up-to-here in B, because the elements preceding B do not consume any tokens. (Granted, the code block that is second in the sequence could explicitly call consumeToken() but that's getting entirely too tricky. We do just assume that a Java code block does not consume any input.)
But anyway, the way it was expressed before was that the things preceding it potentially consumed no input. And I surely was thinking about this at some point. I was probably thinking in terms of constructs like:
Modifiers TypeDefinition
where Modifiers (public, private, static etc.) is potentially empty so maybe the up-to-here is in the TypeDefinition. So there may well be a use-case for this (though none of my internal use was using this).
But finally (very recently) I decided that this was possibly a bit too tricky (not so much to implement as to just document!) and figured that I could get away with changing this so that the nonterminal has to be the first non-empty sub-expansion in the sequence. I knew this was changing behavior, but I considered it unlikely to affect anybody and figured that now was the time I could get away with it.
And if you really want to get dirty with the details a bit, this is where this is implemented: https://github.com/javacc21/javacc21/blob/master/src/java/com/javacc/core/NonTerminal.java#L67
So, the current "spec" is that the up-to-here (or SCAN) in a NonTerminal is used if:
1. The NonTerminal in question is the first non-empty sub-expansion in the sequence
2. There is no up-to-here (or SCAN) in the enclosing sequence that would have priority.
3. We're not more than 1 nesting level deep in terms of calling non-terminals or sub-expansions
It could be worth noting that points 1 and 2 are determined at build-time, while point 3 is at run-time, when the parser is actually being run. (Worth noting if you want to develop a conceptual model of how the thing actually works...)
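If it helps to see the three conditions side by side with the old behavior, here is a toy model in Java. To be clear, this is not the code in NonTerminal.java: the class UpToHereSketch, the Element type, its two flags and all the method names are made up for illustration, and reading point 3 as "depth <= 1" is just my shorthand for this sketch.

```java
import java.util.Arrays;
import java.util.List;

class UpToHereSketch {

    // Hypothetical stand-in for one element of a sequence (NOT a real JavaCC21 class).
    static final class Element {
        final String name;
        final boolean possiblyEmpty; // can match without consuming a token, e.g. [A]
        final boolean alwaysEmpty;   // never consumes a token, e.g. ASSERT or a code block

        Element(String name, boolean possiblyEmpty, boolean alwaysEmpty) {
            this.name = name;
            this.possiblyEmpty = possiblyEmpty;
            this.alwaysEmpty = alwaysEmpty;
        }
    }

    // Old behavior: everything before the nonterminal only had to be *potentially* empty.
    static boolean oldRule(List<Element> sequence, int index) {
        for (Element e : sequence.subList(0, index)) {
            if (!e.possiblyEmpty) return false;
        }
        return true;
    }

    // Point 1 of the current spec: everything before the nonterminal must consume no tokens at all.
    static boolean newRule(List<Element> sequence, int index) {
        for (Element e : sequence.subList(0, index)) {
            if (!e.alwaysEmpty) return false;
        }
        return true;
    }

    // Points 1 and 2 would be known at build time; point 3 (nesting depth) only at run time.
    static boolean nestedScanIsUsed(List<Element> sequence, int index,
                                    boolean enclosingHasScan, int nestingDepth) {
        return newRule(sequence, index) && !enclosingHasScan && nestingDepth <= 1;
    }

    public static void main(String[] args) {
        Element optionalA = new Element("[A]", true, false);
        Element assertion = new Element("ASSERT {condition1()}", true, true);
        Element codeBlock = new Element("{doSomething()}", true, true);
        Element b = new Element("B", false, false);
        Element c = new Element("C", false, false);

        List<Element> first = Arrays.asList(optionalA, b, c);             // [A] B C
        List<Element> second = Arrays.asList(assertion, codeBlock, b, c); // ASSERT ... B C

        System.out.println(oldRule(first, 1));                    // true: B's up-to-here was used
        System.out.println(newRule(first, 1));                    // false: B's up-to-here is now ignored
        System.out.println(nestedScanIsUsed(second, 2, false, 1)); // true: nothing before B consumes tokens
    }
}
```

The only difference between oldRule and newRule is which flag gets checked, and that is really the whole behavior change: "potentially empty" became "always empty".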
Anyway, the question now is basically:
Could you live with the above semantics?