One of the most common tokens in SQL is the <identifier>. It is ubiquitous, mentioned in hundreds of productions.

There are three kinds of identifiers, defined like so:
<Identifier:
<RegularIdentifier> |
<DelimitedIdentifier> |
<UnicodeDelimitedIdentifier>
>

I attempted to use the <Identifier> in a production, like so:

TableName:
    LocalOrSchemaQualifiedName
;

LocalOrSchemaQualifiedName:
    [LocalOrSchemaQualifier <Period>] <Identifier>
;

And ran the following test case:

new JpParser("Alphabet").TableName();

Which produced the following error:

Encountered an error at (or somewhere around) input:1:1
Was expecting one of the following:
Identifier
Found string "Alphabet" of type RegularIdentifier

So here's my question. A RegularIdentifier is a kind of Identifier, yet the parser rejects it where an Identifier is expected. What's the best way of handling this without explicitly expanding every instance of <Identifier> in the grammar?

    Robert-Egan

    Hmm, actually this relates back to the question you posed earlier about "private" token definitions. Most likely, you want to define the different identifier patterns (RegularIdentifier, DelimitedIdentifier, and UnicodeDelimitedIdentifier) using the initial #. I am assuming that you really only want one token type corresponding to Identifier, i.e. you want Alphabet to match just as Identifier, not as RegularIdentifier. If so, then where RegularIdentifier is defined, make it #RegularIdentifier.
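
    As a rough sketch of that suggestion (not the actual grammar from this thread): the three sub-patterns are marked private with #, so the lexer only ever emits Identifier. The character classes below are placeholder assumptions, since the real definitions of the three patterns were not posted, and the layout assumes the same TOKEN block style used in the Java example grammar.

    ```
    TOKEN :
        <Identifier:
            <RegularIdentifier> |
            <DelimitedIdentifier> |
            <UnicodeDelimitedIdentifier>
        >
        |
        // Private (#) definitions: usable inside other token patterns,
        // but never produced as token types of their own.
        // The patterns are simplified placeholders, not the full SQL rules.
        <#RegularIdentifier: ["A"-"Z","a"-"z"] (["A"-"Z","a"-"z","0"-"9","_"])* >
        |
        <#DelimitedIdentifier: "\"" (~["\""])+ "\"" >
        |
        <#UnicodeDelimitedIdentifier: "U&" <DelimitedIdentifier> >
    ;
    ```

    With definitions along these lines, the input Alphabet should come out of the lexer as an Identifier, which is what the LocalOrSchemaQualifiedName production expects.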

    In terms of understanding how the lexical part is specified, you might do well to consult the various example grammars. For example, look at how FLOATING_POINT_LITERAL is defined in the Java grammar here: https://github.com/congo-cc/congo-parser-generator/blob/main/examples/java/JavaLexer.ccc#L224-L241
