Okay, as regards this case:
select HIGH_PRIORITY bla from table1
one way to deal with this situation using a lookahead would be to write a production that is used only in lookahead, i.e. something like:
(Note, BTW, that the #scan annotation means that the production is only used in a lookahead, so no regular parsing production is generated, nor any skeletal node class.)
And then this can be used in a lookahead like:
=> <SELECT> ACTIVATE_TOKENS HIGH_PRIORITY (<HIGH_PRIORITY>)
<IDENTIFIER> <FROM> <IDENTIFIER>
... (other choices)
So, you see, the lookahead SelecthighPriorityLA actually scans ahead, and if the token after <SELECT> is not "HIGH_PRIORITY", the lookahead fails and the parser moves on to the next choice.
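To make the control flow concrete, here is a minimal Python sketch of the idea behind a scan-only lookahead. It is not the code JavaCC21 generates. The names `tokenize`, `select_high_priority_la`, and `choose` are hypothetical, and matching the keyword case-insensitively is an assumption of this sketch.

```python
# Hypothetical hard keywords for this sketch. HIGH_PRIORITY is a soft keyword,
# so the default tokenization reports it as a plain IDENTIFIER.
KEYWORDS = {"select": "SELECT", "from": "FROM"}


def tokenize(text):
    """Split on whitespace and classify each word as a (type, image) pair."""
    return [(KEYWORDS.get(word.lower(), "IDENTIFIER"), word) for word in text.split()]


def select_high_priority_la(tokens):
    """Scan-only check: SELECT followed by an identifier spelled HIGH_PRIORITY.

    Nothing is consumed; the function only reports whether the choice applies.
    """
    if len(tokens) < 2:
        return False
    first_type, _ = tokens[0]
    second_type, second_image = tokens[1]
    return (
        first_type == "SELECT"
        and second_type == "IDENTIFIER"
        # Assumption: the soft keyword is matched case-insensitively.
        and second_image.upper() == "HIGH_PRIORITY"
    )


def choose(text):
    """Take the first choice whose lookahead succeeds, else fall through."""
    tokens = tokenize(text)
    if select_high_priority_la(tokens):
        return "select-with-high-priority"
    return "other-choice"
```

Calling `choose("select HIGH_PRIORITY bla from table1")` takes the first branch, while `choose("select bla from table1")` falls through to the other choices, which is the behaviour the lookahead is meant to give.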
You can see this being used in the current Java grammar here: https://github.com/javacc21/javacc21/blob/master/examples/java/Java.javacc#L174
Since "record" is a soft keyword, i.e. it is a keyword in the context of a type declaration but elsewhere can just be a regular identifier, we have this trick, which is basically the same thing as what I propose above. Note that this is a little tricky: the lookahead scans the string "record" as an identifier, but when the parser actually enters the RecordDeclaration production (see line 276 of the same file), it rescans "record" as a <RECORD> token.
Note also that for this to work, we need what is on line 48 of that file, i.e.
DEACTIVATE_TOKENS=RECORD, VAR, YIELD, SEALED, NON_SEALED, PERMITS;
The various soft tokens are de-activated by default and only activated in the spots where they are really not just identifiers.
Actually, another example that I think is worth studying is how roughly similar things are dealt with in the C# grammar I wrote recently. See here: https://github.com/javacc21/javacc21/blob/master/examples/csharp/CSharp.javacc#L1017
So, you see, there is this machinery to deal with context-sensitive tokenization. It is still a bit tricky because you have to be cognizant of whether a token type is activated at any given point. You can write a lookahead that relies on the token type not being active, but when you actually go and parse, ACTIVATE_TOKENS rolls back any tokens cached from the lookahead and rescans them with the soft token type active, so in the AST it shows up as that (soft) token type, not just an identifier.
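The roll-back-and-rescan behaviour can be illustrated with a toy lexer in Python. This is only a sketch of the concept, not JavaCC21's implementation. The `Lexer` class and its methods are hypothetical, and it leaves out the fact that ACTIVATE_TOKENS applies only within the parenthesized expansion that follows it.

```python
class Lexer:
    """Toy lexer with switchable token types and a cache of scanned tokens."""

    # Hypothetical token table; HIGH_PRIORITY is the soft keyword.
    KEYWORDS = {"select": "SELECT", "from": "FROM", "high_priority": "HIGH_PRIORITY"}

    def __init__(self, text, deactivated=("HIGH_PRIORITY",)):
        self.words = text.split()
        # Soft token types start out deactivated, as with DEACTIVATE_TOKENS.
        self.active = set(self.KEYWORDS.values()) - set(deactivated)
        self.pos = 0      # index of the next word to consume
        self.cache = {}   # word index -> token already scanned

    def _scan(self, index):
        """Classify one word using only the currently active token types."""
        word = self.words[index]
        kind = self.KEYWORDS.get(word.lower())
        if kind not in self.active:
            kind = "IDENTIFIER"
        return (kind, word)

    def peek(self, offset=0):
        """Look ahead without consuming; the scanned token is cached."""
        index = self.pos + offset
        if index not in self.cache:
            self.cache[index] = self._scan(index)
        return self.cache[index]

    def next(self):
        """Consume and return the next token."""
        token = self.peek()
        self.pos += 1
        return token

    def activate(self, kind):
        """Activate a token type and drop cached tokens that were not yet consumed,
        so they are rescanned under the new set of active types."""
        self.active.add(kind)
        self.cache = {i: t for i, t in self.cache.items() if i < self.pos}
```

With the input `select HIGH_PRIORITY bla from table1`, `peek(1)` during lookahead sees an IDENTIFIER. Once `select` has been consumed and `activate("HIGH_PRIORITY")` is called, the stale cached token is discarded and the next call to `next()` returns a HIGH_PRIORITY token.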