\input "supp-pdf"
\input "/yacco2/diagrams+etc/o2mac.tex"
\DOCtitle{Lr K Vocabulary}{yacco2\_k\_symbols} {NS\_yacco2\_k\_symbols}{8}
@i "/yacco2/library/copyright.w"
@** K symbols vocabulary.\fbreak
Ahh, the ``Constant grammar'' symbols used throughout all grammars.
Depending on the command line options, \O2 can generate the grammar
and possibly the various flavours of the Terminal vocabulary.
Under normal development, the grammar writer compiles and emits just the grammar.
Flavours of the terminal vocabulary are not usually generated unless
there have been changes to either {\bf error} or {\bf terminal},
together with the accompanying command line option.

At the initial ``big bang'' of bootstrapping \O2's library and
compiler / compiler, both {\bf raw characters} and {\bf lr k} terminals were
generated using their command line options {/rc} and {/lrk}.
The command line now uses Unix style options {-t, -err}, and
these 2 terminal types, ``lrk'' and ``raw characters'', are now cast in cement:
you cannot regenerate them from their ``*.T'' file definitions,
though you can change their ``big bang'' generated ``c++'' modules.
These terminals are now read-only: they will never be changed by
a user of \O2, and who'd want to anyway?

The hardwired ``k'' terminals are used by \O2's library for internal parsing situations.
Apart from {\bf eog}, which represents the end-of-grammar and end-of-file conditions,
none of the other definitions is part of the token source stream being parsed.
I call them meta terminals, as they are never in the token stream but
represent internal parsing conditions within the emitted finite-state table
that trigger \O2's library routines.
For example, the presence of \paralleloperator within a parse state indicates
that there are potential threads ``to run''.
If you look carefully, you will find their file definitions and implementations
in \O2's ``../yacco2/library/grammars'' folder.
Their definition files are ``yacco2\_k\_symbols.T'' and ``yacco2\_characters.T'',
with their ``c++'' variants having the ``.h'' and ``.cpp'' extensions.
@*2 {\bf eog}.\fbreak
Enum: T\_LR1\_eog\_ \fbreak
\line{Class: LR1\_eog \hfil AB: N \hfil AD: N}
Used to indicate an end-of-grammar or an end-of-file condition.
When the end of the token container is reached, calls for another terminal
will always return the |eog|.
It's your door bouncer before hell.
\fbreak
\hrule
@*3 eog user-declaration directive.
@<eog user-declaration directive@>=
LR1_eog();
@*3 eog user-implementation directive.
@<eog user-implementation directive@>=
LR1_eog::LR1_eog()
T_CTOR("eog",T_LR1_eog_,0,false,false)
{}
LR1_eog LR1_eog__;
yacco2::CAbs_lr1_sym* yacco2::PTR_LR1_eog__ = &LR1_eog__;
@*2 {\bf eolr}.\fbreak
Enum: T\_LR1\_eolr\_ \fbreak
\line{Class: LR1\_eolr \hfil AB: N \hfil AD: N}
Stands for all terminals of the terminal vocabulary, including itself.
It saves finger blisters: the grammar writer need not list every terminal
explicitly in the thread's lookahead expression.
Dieting hasn't been this effective against code bloat.
\fbreak
\hrule
@*3 eolr user-declaration directive.
@<eolr user-declaration directive@>=
LR1_eolr();
@*3 eolr user-implementation directive.
@<eolr user-implementation directive@>=
LR1_eolr::LR1_eolr()
T_CTOR("eolr",T_LR1_eolr_,0,false,false)
{}
LR1_eolr LR1_eolr__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_eolr__ = &LR1_eolr__;
@*2 {\bf \ALLshift{}}.\fbreak
Enum: T\_LR1\_all\_shift\_operator\_ \fbreak
\line{Class: LR1\_all\_shift\_operator \hfil AB: N \hfil AD: N}
Represents the wild token situation.
It lowers the number of specific shifts in the finite-state table and
allows the grammar writer to field the unexpected from returned threads.
Good stuff.
Caveat: one should use the \QUEshift to field unknown returned Ts
if they are to be interpreted as errors.
\fbreak
\hrule
@*3 \ALLshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\ALLshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=
LR1_all_shift_operator();
@*3 \ALLshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\ALLshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=
LR1_all_shift_operator::LR1_all_shift_operator()
T_CTOR("|+|",T_LR1_all_shift_operator_,0,false,false)
{}
LR1_all_shift_operator LR1_all_shift_operator__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_all_shift_operator__ = &LR1_all_shift_operator__;
@*2 {\bf \INVshift{}}.\fbreak
Enum: T\_LR1\_invisible\_shift\_operator\_ \fbreak
\line{Class: LR1\_invisible\_shift\_operator \hfil AB: N \hfil AD: N}
It's a nice way to program out of an ambiguous grammar.
It can also lower the code bloat of a thread's first set.
\fbreak
\hrule
@*3 \INVshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\INVshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=
LR1_invisible_shift_operator();
@*3 \INVshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\INVshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=
LR1_invisible_shift_operator::LR1_invisible_shift_operator()
T_CTOR("|.|",T_LR1_invisible_shift_operator_,0,false,false)
{}
LR1_invisible_shift_operator LR1_invisible_shift_operator__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_invisible_shift_operator__ = &LR1_invisible_shift_operator__;
@*2 {\bf \QUEshift{}}.\fbreak
Enum: T\_LR1\_questionable\_shift\_operator\_ \fbreak
\line{Class: LR1\_questionable\_shift\_operator \hfil AB: N \hfil AD: N}
Represents a questionable grammar situation.
It pinpoints programmed error points within the grammar.
The subrule using this symbol has an lr(0) reduction, as the lookahead is not kosher
and so the subrule would probably not reduce in the lr(1) context.
It can be used in both of the following grammar expressions:\fbreak
\INDENT{1.5cm}{1) \subrule \QUEshift}
\INDENT{1.5cm}{2) \subrule \PARshift \quad \QUEshift \quad NULL}
Point 1 covers the state where the current token being parsed is improper.
Point 2 is more interesting, as it captures a returned terminal that
the thread passes back as an error.
The \QUEshift was not one of the original ``k'' terminals.
It replaced the ``eof'' terminal, which was marginal in intent.
I felt the \QUEshift symbol drew the grammar reader's eye to where ``faulty'' points
were captured, and it forces lr(0) context processing to reduce its subrule.
Why lr(0) context?
Glad you asked. The lookahead terminal, that is, the current terminal being parsed,
is in error, so ``how is the subrule with the \QUEshift to reduce after its shifted T?''
It must be divorced of any lookahead and just acted upon.
Now another question arises:
``how is this condition detected in a parsing state of mixed conditions
(threading, shifting, reducing)?''
There is a pecking order on the conditions tried by the parser:\fbreak
\INDENT{1.5cm}{$\circ$ threading}
\INDENT{2.5cm}{if tried and unsuccessful, the balance of the conditions is attempted}
\INDENT{1.5cm}{$\circ$ shifts, in pecking order by their presence in the current parse state:}
\INDENT{2.5cm}{can the current token be shifted?}
\INDENT{2.5cm}{\QUEshift : error condition}
\INDENT{2.5cm}{\INVshift : explicit \emptyrule}
\INDENT{2.5cm}{\ALLshift : any terminal}
\INDENT{1.5cm}{$\circ$ reduce}
\INDENT{2.5cm}{note shifting is favoured over reducing}
\fbreak
\hrule
@*3 \QUEshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\QUEshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=
LR1_questionable_shift_operator();
@*3 \QUEshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\QUEshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=
LR1_questionable_shift_operator::LR1_questionable_shift_operator()
T_CTOR("|?|",T_LR1_questionable_shift_operator_,0,false,false)
{}
LR1_questionable_shift_operator LR1_questionable_shift_operator__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_questionable_shift_operator__ = &LR1_questionable_shift_operator__;
@*2 {\bf \REDshift{}}.\fbreak
Enum: T\_LR1\_reduce\_operator\_ \fbreak
\line{Class: LR1\_reduce\_operator \hfil AB: Y \hfil AD: Y}
Its presence within the individual state of the ``fsm'' table
is to force a reduce operation.
Why? It's a back-to-back situation within the state table whereby a thread
should reduce while its reducing lookahead is the \paralleloperator,
which indicates a thread to run.
\fbreak
\hrule
@*2 {\bf \TRAshift{}}.\fbreak
Enum: T\_LR1\_fset\_transience\_operator\_ \fbreak
\line{Class: LR1\_fset\_transience\_operator \hfil AB: Y \hfil AD: Y}
\TRAshift serves two purposes:
it is used in \O2linker to process the transient first sets generated by threads,
and it is used within a grammar's ``chained procedure call'' expression
to lower thread overhead by calling a procedure with the explicit intent
of using its ``first set'' token twice.
I'll give an example of a ``chained procedure call'' expression drawn from
the ``pass3.lex'' grammar handling the grammar's file include expression:\fbreak
\fbreak
\INDENT{1cm}{\subrule "\ATsign" Rprefile\_inc\_dispatcher}
\fbreak
The ``Rprefile\_inc\_dispatcher'' grammar rule has the following subrule:\fbreak
\fbreak
\INDENT{1cm}{\subrule \TRAshift "file-inclusion" NS\_prefile\_include::PROC\_TH\_prefile\_include}
\fbreak
The ``chained'' part lies in the duplication of ``\ATsign'':
the parsing mechanism does not fetch a new terminal when it shifts
but passes this T on to the called procedure.
The called PROC\_TH\_prefile\_include procedure / thread has its start rule as:\fbreak
\fbreak
\INDENT{1cm}{\subrule "\ATsign" Rpossible\_ws Rfile\_string Reof}
\fbreak
The repeated use of ``\ATsign'' is to reinforce the idea that
the procedure was called because of ``\ATsign'': there's that ``first set'' again.
Well, time will pass its comments on this thought process.
\fbreak
\hrule
@*3 \TRAshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\TRAshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=
LR1_fset_transience_operator();
@*3 \TRAshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\TRAshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=
LR1_fset_transience_operator::LR1_fset_transience_operator()
T_CTOR("|t|",T_LR1_fset_transience_operator_,0,false,false)
{}
LR1_fset_transience_operator LR1_fset_transience_operator__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_fset_transience_operator__ = &LR1_fset_transience_operator__;
@*2 {\bf \PARshift{}}.\fbreak
Enum: T\_LR1\_parallel\_operator\_ \fbreak
\line{Class: LR1\_parallel\_operator \hfil AB: N \hfil AD: N}
Its presence within the individual state of the ``fsm'' table
indicates that there are potential threads to run.
You see it sprinkled throughout my grammars to call threads.
This is part of \O2's raison d'{\^e}tre.
\fbreak
\hrule
@*3 \PARshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\PARshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=
LR1_parallel_operator();
@*3 \PARshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\PARshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=
LR1_parallel_operator::LR1_parallel_operator()
T_CTOR("|||",T_LR1_parallel_operator_,0,false,false)
{}
LR1_parallel_operator LR1_parallel_operator__;
yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_parallel_operator__ = &LR1_parallel_operator__;
@*1 {\bf lrk-sufx} directive.\fbreak
As they are constants, they are defined globally to save the space and overhead
of the typical create / delete cycle of terminals.
Thar's recycling going on in this green space.
@<lrk-sufx directive@>=
extern yacco2::CAbs_lr1_sym* PTR_LR1_parallel_operator__;
extern yacco2::CAbs_lr1_sym* PTR_LR1_fset_transience_operator__;
extern yacco2::CAbs_lr1_sym* PTR_LR1_invisible_shift_operator__;
extern yacco2::CAbs_lr1_sym* PTR_LR1_questionable_shift_operator__;
extern yacco2::CAbs_lr1_sym* PTR_LR1_all_shift_operator__;
extern yacco2::CAbs_lr1_sym* PTR_LR1_eolr__;
@** Index.