vsajip It might be OK as a quick hack,
Well, that's all it is: a quick hack. In fact, it's quite ghastly. For example, the way it uses the default lexical state in the grammar as a way of deducing the file extension (and the root production) is truly nasty! I would love for you to fix that up!
but it's too wedded to our built-in grammars (and perhaps not wedded enough, as the C# preprocessor is built-in too). In practice a user could be using any name for the top-level production (if there is one). IMO that whole preamble needs to go into a Main method, and perhaps a configuration value should be added to indicate a top-level production in a way that the user can control.
Well, sure. The entry production should be properly parametrized. What you see there now is just the result of me wanting to get the thing to work. So, by all means, feel free to fix it up.
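To make the "properly parametrized" idea concrete, here is a rough sketch in Java of picking the entry production by name at run time instead of hard-coding it. It assumes each production is exposed as a public no-argument method on the parser object. `ToyParser`, `invokeProduction` and the default name `CompilationUnit` are made up for illustration and are not taken from the actual harness.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class EntryProductionDemo {

    // Stand-in for a generated parser (hypothetical, for illustration only).
    public static class ToyParser {
        public String lastInvoked;
        public void CompilationUnit() { lastInvoked = "CompilationUnit"; }
        public void Root() { lastInvoked = "Root"; }
    }

    // Looks up a public no-argument method named after the production and calls it.
    // Assumption: productions are exposed as such methods on the parser object.
    public static void invokeProduction(Object parser, String productionName) {
        try {
            Method m = parser.getClass().getMethod(productionName);
            m.invoke(parser);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such production: " + productionName, e);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new RuntimeException("Production " + productionName + " failed", e);
        }
    }

    public static void main(String[] args) {
        // The entry production comes from the command line (or a config value),
        // falling back to an assumed default.
        String production = args.length > 0 ? args[0] : "CompilationUnit";
        ToyParser parser = new ToyParser();
        invokeProduction(parser, production);
        System.out.println("Invoked " + parser.lastInvoked);
    }
}
```

The same shape would carry over to the C# side with its own reflection API. The point is only that the production name becomes a value the user controls.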
On the Java side, what I think I'm going to do is consolidate all of these test harness programs, like JParse.java and CSParse.java and so on, and just have a single test harness that is a bit more generalized and can be used for all of them, and that can be in the congocc.jar file. I'm thinking that the ability to run over a directory of source files with whatever parser could be built into the jarfile, so you could just do:
java -jar congocc.jar jparse <directory>
and get some instant gratification that way.
Or the other possibility is to roll these things up as custom ant tasks, which is certainly possible. Ant also has the ability to define macros. So, maybe just leveraging Ant in a more sophisticated manner could be the right idea. Or... we could look for another build tool.
But, anyway, all those test harness programs like JParse.java and CSParse.java etc. are basically the same thing obviously. They just run over a directory and get the files and feed them to the parser. All these copy-paste-modify versions of the same thing look a bit silly. And, of course, what I did with C# is just another hack.
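Just to illustrate how little is really in those harnesses, here is a minimal sketch in Java of the common part: walk a directory, pick files by extension, and hand each one to whatever parser is plugged in. Everything named here (`GenericParseHarness`, `FileParser`, `collectFiles`, `run`) is hypothetical, and the parser hook in `main` is a placeholder that only reads the file. A real version would construct the generated parser and call its entry production there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GenericParseHarness {

    // Hypothetical hook: wraps whichever generated parser is being exercised.
    @FunctionalInterface
    public interface FileParser {
        void parse(Path file) throws Exception;
    }

    // Recursively collects regular files under root whose names end with the extension.
    public static List<Path> collectFiles(Path root, String extension) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(extension))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Feeds every file to the parser and returns the files that failed.
    public static List<Path> run(List<Path> files, FileParser parser) {
        List<Path> failures = new ArrayList<>();
        for (Path file : files) {
            try {
                parser.parse(file);
            } catch (Exception e) {
                failures.add(file);
                System.err.println("FAILED " + file + ": " + e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: GenericParseHarness <directory> <extension>");
            return;
        }
        List<Path> files = collectFiles(Paths.get(args[0]), args[1]);
        // Placeholder parser: only reads the file. Swap in a real parser here.
        FileParser parser = file -> Files.readAllBytes(file);
        List<Path> failures = run(files, parser);
        System.out.println(files.size() + " files, " + failures.size() + " failures");
    }
}
```

With something like this, the per-language programs would shrink to a choice of extension and a parser hook, which is the part that could be driven from the command line.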
So I have been thinking about how to organize things better. I think that ant, as retro as it is (you're supposed to use Maven or Gradle apparently, if you want to be one of the cool kids) surely has the needed machinery to organize things better. Really, so far I have been using it in an extremely bloody-minded brain-dead manner. So the idea would be just leveraging ant's feature set better: having some ant macros in some file like commonantmacros.xml that we import, and so on. The whole thing is gradually getting unwieldy. Though, that said, up to this point, I've never spent all that much time mucking with ant files. (And I don't really intend to start!)
And then, of course, the other possibility would be to find some other build tool. I was looking at this thing called Bazel. I guess it comes from Google. But I honestly don't know whether it's worth bothering with. Maybe Ant is not so bad. Not that I like ant so much, and its biggest downside is that it is horribly verbose because it's all in XML. Something more or less like ant with a more terse DSL, i.e. one that doesn't use XML, would maybe be just fine.
As the old test harnesses that you ripped out built everything correctly, that logic is perhaps what needs to be replicated in the current build procedure.
Well, I don't want to get contentious over this. But the truth of the matter is that I just never got it working locally, in particular the C# tests you put in place. I guess it does work, since it did work with GitHub's CI, but I never managed to get the whole IronPython+Dotnet thing going locally and I never understood why. When I tried to run it locally, I just always got these messages that I couldn't make any sense of. I honestly don't know how many people ran across the project, tried to get it working, failed, and then gave up. (Surely a non-zero number.)
I guess I am not absolutely opposed to IronPython in principle, I mean having some examples that use it, but I would say that first we always needed clear examples of just running it all using the more conventional toolset. And the fact remains, I'm pretty sure that most people don't know what ipy is, so there would be a need to document the whole thing better for people. You'd need to write some README explaining what IronPython is and why it's appealing to use it as a dot-net scripting language that can script these generated parsers and so on. I don't think you quite understand how opaque the whole thing was for most people.
And the fact (and facts are stubborn things, as the adage says) remains that we just don't have any users (that I know of!) on the non-Java languages. It doesn't work too badly really, though not as well as in Java obviously, but we're trying to remedy that. But we need some tutorial material -- even on the Java side, but certainly on the C#/Python side. Things that could serve as articles somewhere, I think.
The whole thing with ANTLR has sort of galvanized me, I think. I just look at that thing and I think we've beaten the living crap out of those people technically -- at least in terms of having a usable, practical tool. Now, okay, if you need a parser in some language that we're not supporting, JavaScript or whatever, then maybe one has to go with ANTLR, fine. But if you just wanted to do something in Java, CongoCC is just a vastly better tool. Actually, if you just wanted to do something in Java, and it was a choice between ANTLR and the legacy JavaCC, I think the practical choice could well be the old JavaCC. At least that, for all its limitations and glitches, tends to generate something reasonably performant and just somehow is not so strangely opaque as ANTLR. But, of course, why would anybody use the old JavaCC when CongoCC exists?