Quoted strings are not scanned correctly with /caseinsensitive

Mar 19, 2012 at 5:18 PM
Edited Mar 19, 2012 at 5:19 PM

Hi all,

Given the following example, the scanner does not match the quoted text when the caseinsensitive option is enabled. The text is scanned correctly when I disable the option. Is this expected behavior?


%namespace LexScanner
%option noparser nofiles caseinsensitive

string	\"[^\"]*\"

%%

{string}	Console.WriteLine("string: " + yytext);

%%

    public static void Main(string[] argp) {
        var foo = "\"a quoted string\"";
        Console.WriteLine("Scanning '" + foo + "'");
        Scanner scnr = new Scanner();
        // Scan the string starting at offset 0.
        scnr.SetSource(foo, 0);
        // Run the scanner over the input (assumed; not shown in the original snippet).
        scnr.yylex();
    }


I'm running GPLEX version: 1.1.6


Mar 21, 2012 at 11:46 AM

Hi swiebertje.  Yes, it looks like a bug.  I will check it out and get back when I have a fix.


Mar 22, 2012 at 5:19 AM


Ok, so this was a problem with the processing of inverted sets in the /caseinsensitive case. A couple of other small issues were fixed at the same time, and the updated sources are in the repository.  I will generate a new distribution within the next few days.

There were actually two problems. First, character sets were always broken when all of the following were true: the /caseinsensitive option was set, the set was inverted ([^...]), and the character set was 8-bit (the /nounicode option). This is why your example fails even though there is no alphabetic character in \"[^\"]*\". Second, the conversion of any such set to its case-insensitive form must be performed *before* the set inversion is processed.
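
To illustrate why the order of the two steps matters, here is a minimal C# sketch. It is not the GPLEX implementation: the class CaseFoldDemo and the helpers Fold and Invert are hypothetical names invented for this example, and case folding is restricted to ASCII letters over an assumed 8-bit alphabet. For a set such as [^a], folding first and then inverting excludes both 'a' and 'A', while inverting first and then folding puts 'a' back into the set.

```csharp
using System;
using System.Collections.Generic;

static class CaseFoldDemo
{
    // Assumption: an 8-bit alphabet, as with the /nounicode option.
    const int AlphabetSize = 256;

    // Add the other-case partner of every ASCII letter in the set.
    public static HashSet<char> Fold(HashSet<char> set)
    {
        var result = new HashSet<char>(set);
        foreach (char c in set)
        {
            if (c >= 'a' && c <= 'z') result.Add((char)(c - 32));
            else if (c >= 'A' && c <= 'Z') result.Add((char)(c + 32));
        }
        return result;
    }

    // Complement of the set relative to the 8-bit alphabet.
    public static HashSet<char> Invert(HashSet<char> set)
    {
        var result = new HashSet<char>();
        for (int i = 0; i < AlphabetSize; i++)
        {
            if (!set.Contains((char)i)) result.Add((char)i);
        }
        return result;
    }

    static void Main()
    {
        var listed = new HashSet<char> { 'a' };   // the "a" in [^a]

        var foldThenInvert = Invert(Fold(listed)); // intended order
        var invertThenFold = Fold(Invert(listed)); // wrong order

        Console.WriteLine(foldThenInvert.Contains('a')); // False
        Console.WriteLine(foldThenInvert.Contains('A')); // False
        Console.WriteLine(invertThenFold.Contains('a')); // True: 'a' leaks back in via 'A'
    }
}
```

In the wrong order, the inverted set already contains 'A', so folding re-adds 'a' and the inverted set ends up matching every character.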

Thank you very much for giving me a nice short example to beat on!


Mar 22, 2012 at 3:09 PM

Hi John.

Thanks for the fast fix and the effort you're putting into this nice project!