Wait and See Parser -- WASP

WASP is a wait and see parser which some linguists believe gives more insight into syntactic structures. Sentences move into the WASP processor through a noun-phrase preprocessor. This identifies the noun phrases and replaces the words by appropriate node. This is simpler in English than other languages because of the structure of the language.

Preprocessed sentences with words and noun phrase nodes stream from right to left through a three place buffer and eventually finish as a standard syntactic tree. A node stack is used and this is depicted horizontally rather than vertically. The top node is called the current node and in the diagram this is the rightmost node.

Flow on and off the node stack is controlled by condition action parsing rules. The condition depends upon the state of the buffer and the current node. Words and noun phrase nodes flow into the buffer whenever a place examined by the condition part is empty. The action part specifies one of the following:

create a new node and push it onto the node stack;

complete the top node in node stack and move it into the first place in the buffer shifting existing numbers to the right;

existing numbers to the right;

attach the first item in the buffer to the top node in the node stack and shift the other items to the left;

The key questions start stop ;and attach are answered explicitly by the above rules.

The following example considers a simple sentence and sufficient rules to parse it.

Consider the sentence

the silly robot moved the red pyramid to the top of the big table

The noun phrases in this sentence are

the silly robot

the red pyramid

the top

the big table

The main mechanism of the parser

continue until the main sentence is completely parsed or no active condition action rule applies:

activate the condition action rules associated with the current node;

use the first active rule that matches the condition of the buffer and the current node.

All the start top attach instructions are tied up in the rules.

Consider first the sentence rules

S1 IF first item is a noun-phrase node

THEN attach the first item

S2 IF first item is a verb-phrase node

THEN attach the first item

S3 IF first item is a verb

current node has noun-phrase attached

THEN create a verb-phrase node

S4 IF buffer is empty

THEN stop, reporting a successful parse

Consider next the prepositional phrase rules

PP1 IF first item is a preposition

THEN attach the first item

PP2 IF first item is a noun-phrase

THEN attach the first item

PP3 IF buffer is empty

THEN drop the current node into the buffer

Consider next the verb phrase rules

VP1 IF first item is an auxiliary verb

second item is a verb

THEN attach the first item

VP2 IF first item is a verb

THEN attach the first item

VP3 IF first item is a noun-phrase node

THEN attach the first item

VP4 IF first item is a preposition

second item is a noun-phrase

current node has a verb and noun-phrase attached

THEN create a prepositional-phrase node

VP5 IF first item is a prepositional-phrase node

THEN attach the first item

VP6 IF buffer is empty

THEN drop the current node into the buffer

Notes

In English certain words like verbs give an indication when it is time to start a node.

VP1 handles auxiliary verbs such as will have been or is going to have

The other Vp's such as VP2 handle the other conditions that can arise.

Parsing of the sentence

the silly robot moved the red pyramid to the top of the big table

The first step is to isolate the noun phrases

NPa (the silly robot) moved NPb (the red pyramid) to NPc (the top of the big table)

We are assuming that the preprocessor is intelligent in the SHRDLU sense in that it recognises the top of the big table as a noun-phrase with prepositional phrase embedded as top of the big table makes sense whereasthe phrase of the big table cannot be attached to pyramid.

The attached sheet shows the blow by blow account of how the sentence is parsed.

a parser is in initial state with S as only node in the stack and three sentence items in the buffer. In this state the sentence rules apply allowing S1 to trigger and fire

b in moving to b the noun phrase NPa is attached to S, parser is still in state S as S is the current node and proceeds to c as the first item is a verb using S3

c now a verb phrase node is created and so the parser moves to the verb phrase set of rules. The top place is moved and so VP2 is activated and the verb is attached and the parser moves on to d

d now the buffer has moved on one place and the box is occupied by a NPb so rule VP3 is triggered and the noun- phrase is attached and the parse moves on to e

e At this stage the VP is in the spotlight and also to is in the boxseat and it is a preposition thus as VP rules still apply the rule VP4 leads to a PP node being created.

f At this point PP rules apply and PP1 is fired

g Once again a PP rule is used this time PP2

h Now it is clear the buffer is empty so PP3 applies and the buffer is given a node

i Now VP rules come into favour and VP5 locates the PP in the buffer and attaches it

j Once again the buffer is empty and so rule VP6 is used to put the node back into the buffer

k The parser is on S rules and S2 is used to attach the VP

l merely serves to finally recognise the empty buffer and to report success

Notes

WASP and ATN parsers give different answers to the start stop and attach questions. In ATN's the answers are bound up in the way the parser uses the net, for WASP the answers are tied up in the rules. This simple version has a three item look ahead whereas ATN's only look at next item. It seems that looking three moves ahead is sensible and saves building doomed structures. It seems that English speakers confine themselves to the complexity that can be covered by three level lookahead. In ATN's syntactic knowledge is expressed in an interpreter procedure in nodes and arcs and in obscure arc procedures whilst in WASP syntactic knowledge is expressed in an interpreter procedure and in simple condition action rules. The interpreter part of the ATN is simpler to explain than that for WASP the procedures on the arcs are more complex than that of the rules. Arguably the balance is better in WASP because the linguistically motivated design choices are more accessible in WASP hence more subject to careful linguistic study.