Returning a Token's Value

Aug 28, 2014 at 3:52 PM
Hi,

I'm sorry to ask such a dumb question, but I can't find a way to return the value associated with a token from the scanner to the parser.

I'm used to JFlex-like tools, where you can return a token like this:
return Symbol( sym.token, value);
Can you please tell me how I can do it?
Coordinator
Aug 29, 2014 at 11:15 AM
Hi NLK
The question is not dumb. The whole idea of the tools is to be used by people
who are experts in their application area but not necessarily in parsing
theory.

There are 150+ pages of documentation for gplex and gppg, so I will point
you at the areas that you need to read to help with this particular problem.
The page numbers refer to the current versions of the doco, 1.2.2 for gplex,
1.5.2 for gppg.

lex scanners and yacc parsers play nicely together by sharing two types: a
semantic action type %YYSTYPE which is the type of the yylval variable, and
the location %YYLTYPE which is the type of the yylloc variable. The default
types for these two variables are int and LexLocation.

For gplex/gppg there is an abstract scanner type
AbstractScanner<TValue,TSpan>
which is the base type of the scanner produced by gplex. The *.y file may
declare the %YYSTYPE and %YYLTYPE types and gppg will declare an instance
of the abstract base class with these type arguments.
The AbstractScanner interface is given in figure 2 on page 8 of gppg.pdf.
Setting the yylval and yylloc values is discussed in section 4.4 of
gplex.pdf, starting at page 32.
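
As a minimal sketch of the simplest case, assuming the default int value type and a hypothetical token named NUMBER declared with %token in the *.y file, a gplex rule can assign to yylval before returning the token. Tokens is the default name gppg gives the token enumeration. This is an illustration only, not code taken from the distribution:

```
%%
[0-9]+    { yylval = int.Parse(yytext);     /* the value the parser will see for this token */
            return (int)Tokens.NUMBER; }    /* NUMBER is a hypothetical token name */
```

This plays the role of the JFlex-style "return Symbol(sym.token, value)": the value travels in yylval and the token identity travels in the return value.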

The important thing to know about the semantic action type is that it is used
to pass values from the scanner to the parser, but also as the type referred
to by the $x variables in the semantic actions of the parser production rules.
So it is possible to choose a type that is the result type of the computations
of the parser. This is a typical use case with tree-building parsers which
will be passed leaf nodes from the scanner, and then build (sub-)trees by the
actions of the parser as particular structures are recognized. See the
tree-building example in the testfiles directory of the gppg distribution for
an example of this.

There are various patterns of use:
(1) a scanner interface where only the identity of the token matters, with the
parser grabbing any information it needs from the scanner's yytext variable if
needed. gplex's own scanner and parser work like this.
(2) an action type that in original lex would have been a union type. Since C#
does not have a union construct gppg uses a struct type with one field of each
required type. gppg's own parser has this structure.

Sections 2.4.7 and 2.4.8 of gppg.pdf explain how this works. In the case of
the union construct in gppg.y, the only field that is used by the scanner is
the integer iVal field. All the others are fields used by the parser
productions.

For the scanner, the semantic actions must name the field of yylval to which
they are assigning (see lines 158 and 159 of gppg.lex for an example).
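
A sketch of pattern (2) may help here. The field name iVal is borrowed from the description above; sVal, NUMBER and IDENT are assumptions made up for illustration and do not come from gppg.y. In the *.y file, the union and the token types could be declared like this:

```
%union {
    public int iVal;       /* filled in by the scanner for NUMBER */
    public string sVal;    /* hypothetical field, filled in by the scanner for IDENT */
}
%token <iVal> NUMBER
%token <sVal> IDENT
```

and in the *.lex file each action names the field it assigns:

```
%%
[0-9]+        { yylval.iVal = int.Parse(yytext); return (int)Tokens.NUMBER; }
[a-zA-Z_]+    { yylval.sVal = yytext;            return (int)Tokens.IDENT; }
```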

However in the parser, the declaration of the types of the non-terminal symbols
(see lines 42-46 of gppg.y) allows gppg to generate the field access code
itself from the production declarations. Thus, looking at the semantic action
at line 198 of gppg.y
$1.Add($3);
will use the known type of $1 (TokenList) and $3 (TokenDecl) to compute that
$1 must access the TokenList field of ValueType object at position-1 on the
value stack, while "$3" expands to access the tokenInfo field of the ValueType
object at position-3 on the value stack. Of course all of this is done by the
parser generator, so you don't need to know this.
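
To make the parser side concrete, here is a small sketch with hypothetical names (a list field and a numbers non-terminal, neither taken from gppg.y). The %token and %type declarations tell gppg which field each symbol uses, so the actions can be written with plain $n references:

```
%union {
    public int iVal;                                     /* set by the scanner */
    public System.Collections.Generic.List<int> list;    /* hypothetical, used only by the parser */
}
%token <iVal> NUMBER
%type  <list> numbers

%%
numbers : NUMBER               { $$ = new System.Collections.Generic.List<int>(); $$.Add($1); }
        | numbers ',' NUMBER   { $1.Add($3); $$ = $1; }   /* $1 uses the list field, $3 uses iVal */
        ;
```

In the second production, gppg works out from the declarations that $1 means the list field of the value-stack entry for numbers and $3 means the iVal field of the entry for NUMBER, in the same way as the $1.Add($3) example above.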

Hope this helps. Let me know if you have more questions.

Cheers
John