I'm new to CongoCC and migrating a JavaCC-generated expression parser, sqlexpr-javacc, to a CongoCC-generated parser, sqlexpr-congocc. The language is a subset of SQL WHERE clause expressions.

I migrated the JavaCC SqlExprParser.jj definition to the CongoCC SqlExprParser.ccc definition. When I run the goodFilters() unit test in ParserTest.java, I get the following error.

FAILED: net.magneticpotato.sqlexpr.congocc.parser.ParserTest.goodFilters
jakarta.jms.InvalidSelectorException: name = 'Bud' AND tenant_id = 'iplantc.org'
	at net.magneticpotato.sqlexpr.congocc.parser.SqlExprParser.parse(SqlExprParser.java:91)
	at net.magneticpotato.sqlexpr.congocc.parser.SqlExprParser.parse(SqlExprParser.java:63)
	at net.magneticpotato.sqlexpr.congocc.parser.ParserTest.goodFilters(ParserTest.java:26)
        ...
Caused by: java.lang.NullPointerException: Cannot invoke "net.magneticpotato.sqlexpr.congocc.parser.Token.getTokenSource()" because "this.lastConsumedToken" is null
	at net.magneticpotato.sqlexpr.congocc.parser.SqlExprParser.openNodeScope(SqlExprParser.java:3556)
	at net.magneticpotato.sqlexpr.congocc.parser.SqlExprParser.JmsSelector(SqlExprParser.java:289)
	at net.magneticpotato.sqlexpr.congocc.parser.SqlExprParser.parse(SqlExprParser.java:89)

The test code is the same as in the JavaCC-generated parser and it works there. Before I go down a rabbit hole, is this a familiar error that has a simple fix? Thanks!

11 days later

I'm not seeing much activity on the discussion forum. Is CongoCC still active?

    rcdev58 Is CongoCC still active?

    Yes, but it's a small team and we have other projects on the go! Someone should be able to look at this soon. It's generally been uncommon to run into an NPE. If you can shrink it to a small example, feel free to add a bug report.

    I'll try to isolate the problem. It might take a while being new to CongoCC, but I'm hoping to use it rather than JavaCC in an upcoming project, so I'm motivated. Thanks.

      5 days later

      rcdev58
      Hi, I'm sorry to be so slow to get back to you. For some reason, I just didn't notice that you had posted this question (nearly 3 weeks ago!)

      Just eyeballing the code, it seems like the problem is that lastConsumedToken is null, which would be because you define a custom constructor that never sets it, specifically here: https://github.com/richcar58/sqlexpr-congocc/blob/main/src/main/resources/SqlExprParser.ccc#L110-L116

      You define your own constructor that doesn't set the lastConsumedToken variable. But the thing is that there is already a generated constructor that takes a String (but actually a CharSequence) as an argument. So maybe you don't really need that constructor, and you probably don't need the sql variable since you can always get the input string via: parser.getInputSource().toString().

      But okay, one thing to do, maybe, is just change your constructor to:

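       protected SqlExprParser(String sql) {
           // Delegate to the generated CharSequence constructor first,
           // so that lastConsumedToken gets set.
           this((CharSequence) sql);
           this.sql = sql;
       }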

      Then it is invoking the generated constructor that sets what needs to be set, in principle. But again, you probably don't really need the sql variable and it would be better just to get rid of your own injected constructor and just use the generated one.
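
      To show the failure mode in isolation, here is a minimal, self-contained sketch. Every name in it (ConstructorChainingSketch, Token, BrokenParser, FixedParser) is a hypothetical stand-in written for illustration, not the code CongoCC generates. It only assumes what is described above: one constructor does the required setup, and a hand-written constructor either skips it or chains to it.

```java
// Illustrative sketch only: these classes are hypothetical stand-ins,
// not the parser or token classes that CongoCC generates.
class ConstructorChainingSketch {

    // Stand-in token that simply remembers the input it came from.
    static class Token {
        private final String source;
        Token(String source) { this.source = source; }
        String getTokenSource() { return source; }
    }

    static class BrokenParser {
        private Token lastConsumedToken; // stays null unless a constructor assigns it
        private String sql;

        // Plays the role of the generated constructor: it does the required setup.
        BrokenParser(CharSequence input) {
            this.lastConsumedToken = new Token(input.toString());
        }

        // Hand-written constructor that never reaches the setup above.
        BrokenParser(String sql) {
            this.sql = sql;
        }

        String openNodeScope() {
            return lastConsumedToken.getTokenSource(); // NPE if the setup was skipped
        }
    }

    static class FixedParser {
        private Token lastConsumedToken;
        private String sql;

        FixedParser(CharSequence input) {
            this.lastConsumedToken = new Token(input.toString());
        }

        // Same hand-written constructor, but chained to the one that does the setup.
        // The cast selects the CharSequence overload instead of recursing into this one.
        FixedParser(String sql) {
            this((CharSequence) sql);
            this.sql = sql;
        }

        String openNodeScope() {
            return lastConsumedToken.getTokenSource();
        }
    }

    // True if the unchained constructor leads to a NullPointerException.
    static boolean brokenThrowsNpe() {
        try {
            new BrokenParser("name = 'Bud'").openNodeScope();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns the token source seen through the chained constructor.
    static String fixedTokenSource() {
        return new FixedParser("name = 'Bud'").openNodeScope();
    }

    public static void main(String[] args) {
        System.out.println("broken throws NPE: " + brokenThrowsNpe());
        System.out.println("fixed token source: " + fixedTokenSource());
    }
}
```

      Passing a String literal to either class picks the String overload, since it is the more specific match. That is why the hand-written constructor silently replaces the one doing the setup unless it chains to it explicitly.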

      I hope that is clear. If it isn't, by all means, ask more questions!

        revusky Thanks a lot for the advice! You hit the nail on the head. The code was adapted from the original JavaCC implementation that cached the parsed expression. I'm not so sure caching is even a good idea in this application. Either way, you solved my problem--thanks.

        I'm going to attempt to use CongoCC to generate Rust code. I'll use the fact that CongoCC already generates output in different languages as a model on how to approach the problem.

          rcdev58
          Hi Richard. It's good to hear you got it going. Actually, this was a flaw in the code, setting the lastConsumedToken in the constructor, and then if somebody defined their own constructor without setting that... Well, you can see the (quite trivial) change here

          I'm going to attempt to use CongoCC to generate Rust code. I'll use the fact that CongoCC already generates output in different languages as a model on how to approach the problem.

          Well, that's interesting. That is a fairly significant undertaking. It is Vinay who has done the most work to get the Python and C# code generation working and he could tell you how much work it is! But the place to look, in terms of implementing code generation for another language is here: https://github.com/congo-cc/congo-parser-generator/tree/main/src/templates

          So, in principle, it amounts to translating about 5000 lines of template code to generate Rust instead of the other languages. (There would be a bit of glue code to activate those templates as opposed to the others, but that is surely trivial.) One caveat I should mention is that the support for non-Java languages is still not really quite on a par with the Java implementation because, for example, INJECT is still not supported. Actually, I intend to get it working, and there was a massive refactoring to support that I carried out maybe two years ago (!) but, well...

          Of course, to get INJECT working for Rust or any other language, you would need to have a reliable parser for the language. And, actually, to generate code, it is nice to be able to do a final pass where you run it through some dead code reaper and pretty-printer and that requires being able to parse the language. And then it is fairly trivial to have the code formatter and dead code remover working for Rust, say. Of course, that said, the code generator would work without any beautification or dead code removal, but I guess what I'm thinking also is that if you want to get going, maybe it's a more manageable initial project to get the Rust parser working and the code beautifier and reaper, say. I mean, those things are even somewhat useful on their own. But they are necessary pieces if you really want to have a parser generator for Rust that is on a par with the one for Java...

          So, I'm thinking maybe that contributing a Rust grammar/parser could be a better initial project to sink one's teeth into. But I don't really know what your ambitions with this are, so I'm just thinking out loud, I guess. By all means, don't be shy about batting around some ideas here.

            revusky Thanks for your insights and suggestions. Ultimately, my goal is to generate a parser for a pipeline or workflow language that would be incorporated into an integration test generation facility. I want to write the whole thing in Rust, so I'm exploring Rust parsers and parser-generators. CongoCC offers a powerful and familiar way to specify language syntax, so it's a top contender for generating parsers in Rust.

            It turns out there is no Rust language specification and the Rust parser code is the actual specification for Rust syntax! Efforts have been started and abandoned to define Rust syntax in BNF. RFC 3355 focuses on writing a separate specification document for Rust, but work seems to be progressing slowly. Java was lucky to have significant, effective, institutional support early on.

            I need to complete my survey of what Rust parsing capabilities exist and what efforts are underway to advance the state of the art in Rust. It may make sense to join one of these efforts, or start my own, or carve out a modest piece of work to get my feet wet as you suggest.

            I have the luxury of choosing what I work on, but I need to choose wisely. Whatever I decide, I want to build something useful to people other than just myself.

            Because of how Rust's borrow checker works, and how a CongoCC parser will merge together templated code and user-specified code in the grammar, I expect a Rust-generating variant will be non-trivial, to say the least! It's hard enough fighting the borrow checker when all the code is your own. And I say this as someone who likes Rust!

              vsajip Agreed, I'm just scoping things out now to see how much I want to bite off. If nothing else, I'll learn about Rust syntax and parsing and then decide what's a good use of my limited time.