GPLEX is a scanner generator which produces lexical scanners written in C# V2 or higher. The input language is similar to the original LEX specification language, but allows full 21-bit Unicode scanners to be specified.
Alert: Between July 2013 and August 24 2013 the recommended distribution link on the downloads page erroneously linked to a previous release of GPLEX. This has now been fixed. John apologises for the error and suggests that users who want the latest version download the real v1.2.1 distribution.
GPLEX generates scanners based on finite state automata. By default the number of states in the generated automaton is minimized, and many options are available for table compression. The default compression scheme is chosen according to the cardinality of the input alphabet and almost always gives a reasonable result, but users can tune the behavior through these options if necessary.
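To make the idea of a table-driven automaton concrete, here is a minimal Python sketch. It is not GPLEX output and does not reproduce GPLEX's table layout or compression schemes; the state names, the `char_class` helper, and the `longest_match` function are all hypothetical. It illustrates one common compression idea, mapping characters to a small set of equivalence classes so that the transition table needs one column per class rather than one per character.

```python
# Hypothetical illustration only: not GPLEX output, not its real table layout.

# Character equivalence classes: the table gets one column per class.
LETTER, DIGIT, OTHER = 0, 1, 2


def char_class(ch: str) -> int:
    """Map a single character to its equivalence class."""
    if ch.isalpha() or ch == "_":
        return LETTER
    if ch.isdigit():
        return DIGIT
    return OTHER


# Automaton states; -1 in the table means "no transition".
START, IN_IDENT, IN_NUMBER = 0, 1, 2
NEXT_STATE = [
    # LETTER    DIGIT      OTHER
    [IN_IDENT, IN_NUMBER, -1],  # START
    [IN_IDENT, IN_IDENT, -1],   # IN_IDENT
    [-1, IN_NUMBER, -1],        # IN_NUMBER
]
ACCEPTING = {IN_IDENT: "ident", IN_NUMBER: "number"}


def longest_match(text: str, pos: int = 0):
    """Return (kind, end) for the longest accepted prefix at pos, or None."""
    state, last = START, None
    i = pos
    while i < len(text):
        nxt = NEXT_STATE[state][char_class(text[i])]
        if nxt == -1:
            break
        state = nxt
        i += 1
        if state in ACCEPTING:
            # Remember the most recent accepting position (longest match).
            last = (ACCEPTING[state], i)
    return last
```

In this toy version the table has three columns regardless of alphabet size. Without some such mapping, a table covering the full 21-bit Unicode range would need a column for each of 0x110000 possible code point values, which is why the alphabet cardinality matters when choosing a compression scheme.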
The tool implements many of the FLEX extensions, including start-state stacks.
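For readers unfamiliar with start-state stacks, the following Python sketch shows the concept on the classic case of nested comments. It is an illustration under stated assumptions, not GPLEX or FLEX code: the `StartStateStack` class and the `strip_nested_comments` function are hypothetical names, and the `push` and `pop` methods only mimic the role that `yy_push_state` and `yy_pop_state` play in FLEX.

```python
# Hypothetical illustration of a start-state stack; not GPLEX or FLEX code.

INITIAL, COMMENT = 0, 1


class StartStateStack:
    """Tracks the current start state and a stack of suspended ones."""

    def __init__(self):
        self.current = INITIAL
        self._saved = []

    def push(self, state):
        # Suspend the current start state and enter a new one.
        self._saved.append(self.current)
        self.current = state

    def pop(self):
        # Return to the most recently suspended start state.
        self.current = self._saved.pop()


def strip_nested_comments(text: str) -> str:
    """Remove /* ... */ comments, allowing them to nest."""
    states = StartStateStack()
    out = []
    i = 0
    while i < len(text):
        two = text[i:i + 2]
        if two == "/*":
            states.push(COMMENT)
            i += 2
        elif two == "*/" and states.current == COMMENT:
            states.pop()
            i += 2
        else:
            # Ordinary text is kept only outside comments.
            if states.current == INITIAL:
                out.append(text[i])
            i += 1
    return "".join(out)
```

Because each opening delimiter pushes and each closing delimiter pops, the scanner returns to the enclosing state at the right nesting depth, which a single "current start state" variable cannot do.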
The generated scanners are designed to interface cleanly with bottom-up parsers generated by the Gardens Point Parser Generator (GPPG). However, GPLEX-generated scanners have also been used successfully with handwritten parsers and with parsers generated by COCO/R.
Examples Of Use
A small number of examples are included in the download package, and these are fully discussed in the documentation. For a more complex example, GPLEX and its companion tool GPPG both use scanners and parsers that were themselves generated by GPLEX and GPPG.
There is a separate documentation file that deals with the special issues that arise with scanners that use the Unicode character set.
Is GPLEX What You Need?
GPLEX is a scanner generator. It is intended for generating scanners for compilers and other tools that process text. A generated scanner picks out non-overlapping substrings from a continuous input stream and returns an integer token identifier for each one. GPLEX may also be used for other simple regular expression recognition tasks, but it is not a replacement for the System.Text.RegularExpressions classes. It has no built-in mechanism for multiple substring capture or anything similar.
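The shape of that interface can be sketched in a few lines of Python. This is a hand-written toy, not a generated scanner, and it says nothing about the API of the C# code that GPLEX produces; the `Tokens` enumeration, the `ToyScanner` class, and its `next_token` method and `lexeme` attribute are all hypothetical names chosen for the illustration.

```python
# Hypothetical toy scanner; not GPLEX output and not its C# API.
from enum import IntEnum


class Tokens(IntEnum):
    EOF = 0
    IDENT = 1
    NUMBER = 2
    ERROR = 3


class ToyScanner:
    """Each call to next_token() consumes one substring and returns its id."""

    def __init__(self, text: str):
        self.text = text
        self.pos = 0
        self.lexeme = ""  # text of the most recently returned token

    def next_token(self) -> int:
        text = self.text
        # Skip whitespace between tokens.
        while self.pos < len(text) and text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(text):
            self.lexeme = ""
            return Tokens.EOF
        start = self.pos
        ch = text[start]
        if ch.isalpha() or ch == "_":
            while self.pos < len(text) and (
                text[self.pos].isalnum() or text[self.pos] == "_"
            ):
                self.pos += 1
            kind = Tokens.IDENT
        elif ch.isdigit():
            while self.pos < len(text) and text[self.pos].isdigit():
                self.pos += 1
            kind = Tokens.NUMBER
        else:
            # Any other single character is reported as an error token.
            self.pos += 1
            kind = Tokens.ERROR
        self.lexeme = text[start:self.pos]
        return kind
```

A caller, typically a parser, would call `next_token()` repeatedly until it returns `Tokens.EOF`, reading `lexeme` when it needs the matched text. Note that the scanner reports only which token was matched and where; it does not capture sub-groups within a match, which is the kind of task better suited to a regular expression library.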
GPLEX has historically been released about twice a year. If there is a feature that fits within the broad intention of the tool and that you feel is missing, raise an issue. If what you really want is a C# version of AWK, then GPLEX isn't it; the copyright notice explains the conditions under which you may use GPLEX code to help you implement AWK.NET yourself.