|
![]() |
![]() |
(Please note that I have combined the tasks of Section 2.1 through Section 2.4 below.) 2.1 Grammar 1: Rules
for declarative sentences 2.2 Grammar 2: Verbs
that take full sentences as arguments 2.3 Grammar 3: Unbounded
dependencies and questions 2.4 Grammar 4: Agreement,
features, and grammar size
I
categorized the verbs found among the test phrases based on the types of arguments that they require (please note that I only
note the past-tense of each verb here for simplicity, since the all tenses of a verb follow the same categorization to begin
with): V1
-- requires an object as an argument; lost, saw,
solved V2
-- requires either no argument or else an argument that is a full proposition or sentence; thought V3
-- requires at least one object; sent V4
-- requires an argument that is a full proposition or sentence; believed V5
-- requires an adjective as an argument; were To
allow specified verbs to take a full proposition or sentence as an argument, I included the usage of [Sbar à Comp S], where the Comp position is either empty or filled with a complementizer,
such as that. Since
“to the police” should not modify “the solutions” in “poirot
sent the solutions to the police,” I required that when V3 verbs, such as sent, have only one object, that object must be an unmodified noun phrase, or UNP, which is either just N or Det
N. To
include the particular verb do, I separated present-tense verbs from past-tense
verbs by indicating them as dV#, where # is the respective verb category (as explained above) and d is to symbolize that the
present-tense verb can be associated with did, as in “did
solve,” “did send,” and
“did believe.” I
then expanded the grammar to handle the inverted form seen in questions, where the auxiliary verb did is switched with the subject noun phrase. This was done simply
by allowing [S à Aux NP dVP], in addition to the [S à NP Aux dVP], which had been allowed since the introduction of do. Since
wh-phrases are special noun phrases, WhNP, where the determiner preceding the noun is a wh-word, I wrote new NP rules specifically
for them. Also, since they are the phrases that are relocated to the beginning
of question sentences, they needed to be specified all the more. I additionally
allowed WhNPs to be a part of VPs, and also a part of the final sentence structure by adding [ I
considered applying the idea of embedded sentences uniformly, using Sbar to move the filler from the gap to its correct position
in question sentences. Comp had originally accounted only for words such as that, so instead of altering it to account for WhNP (filler) also, I implemented [S
à WhNP S-WhNP], where S-WhNP is the part of the sentence
with the WhNP removed. The “-WhNP” indicates the phrases with the WhNP eliminated, the “-WhF” indicates the phrases with the WhNP filler still existent, and the “-WhS” indicates the situation where NP is associated with VP-WhNP (the WhNP has been removed from the VP). The
grammar for handling issues from Section 2.1 through Section 2.3 can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/lab2bBo10.grm For
handling agreement phenomena, I essentially wrote a rule for each of the singular (noted with the prefix “si”) and plural (noted with the prefix “pl”) cases of the originally existent rules. In turn, the
grammar size |G| increased significantly; in Section 2.3, |G|=163, but in Section 2.4, |G|=323, marking an approximately 98%
increase in grammar size through the implementation of agreement phenomena. The
grammar for handling issues from Section 2.1 through Section 2.4 can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/lab2bBo12.grm A
record of a parser run on the supplied test sentences, using the grammar for handling issues from Section 2.1 through Section
2.4, can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/testout24f.txt 2.5
Grammar 5: Using feature grammars In
order to implement agreement phenomena using features, I used the grammar obtained for Section 2.3 and made changes to both
the rules and the lexicon parts. For correctly parsing the test sentences provided
(except, of course, the last test sentence, which will be handled in Section 2.6), it was necessary to add [Agr ?x] to 7 rules
in the grammar. Since such a feature would require a check-and-match process,
I estimated that it would contribute to the grammar size by increasing its associated rule’s size
by 1. For instance, the grammar count for the rule {S[Agr ?x] à NP[Agr ?x] VP[Agr ?x]} would be 4. Then,
|G|=173, which is only an approximately 6% increase from the grammar size of Section 2.3, as opposed to the approximately
98% increase marked by Section 2.4. Therefore, the usage of features notably
decreases the grammar size in this case. Extending
the above result, let |G| now denote the size of the feature grammar without features taken in to account; |G|=166. If we suppose that there are m agreement phenomena in a language,
with each feature binary-valued, then the maximum size of the nonterminal-expanded context-free grammar that would be needed
to encode m agreement phenomena would be GNF = (2^m)|G|. For an equivalent feature grammar, the maximum size would
be GF = km+|G|, where k is a constant
indicating the number of rules to which the feature for agreement phenomena must be applied (in our case, 7). The value of the ratio GF/GNF is then (km+|G|) / [(2^m)|G|], so as m approaches infinity, the ratio approaches 0. This limit clearly illustrates that, the greater the number of features (i.e. agreement phenomena), the
more size-efficient the feature grammar is when compared to nonterminal-expanded context-free grammar. The
feature grammar for handling the identical issue as that of Section 2.4 can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/lab2bBo14.grm A
record of a parser run on the supplied test sentences, using the feature grammar for handling the identical issues from Section
2.1 through Section 2.4, can be seen by clicking the following link (it is a reproduction of the previous record that used
nonterminal-expanded context-free grammar): http://web.mit.edu/~kaede11/Public/testout25b.txt 2.6 Grammar 6: Filler-gap
dependencies and features 2.6.1 Features Using
the new outlook of featurizing WhNP, I rewrote my grammar from Section 2.3 to utilize the feature-value pairs as explained
in the lab handout. This new grammar can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/lab2bBo15.grm A
record of a parser run on the seven sentences from Section 2.3, using the new grammar with featurized WhNP, can be seen by
clicking the following link: http://web.mit.edu/~kaede11/Public/testout26c.txt 2.6.2 Other sentence types with ‘wh’ features By
using [Wh +/-] and adding a feature WhPP to check for wh- prepositional phrases, such as “to
whom,” I modified the grammar from 2.6.1 above to account for the final test sentence. This modified grammar can be seen by clicking the following link: http://web.mit.edu/~kaede11/Public/lab2bBo16.grm A
record of a parser run on the same seven sentences plus the WhPP sentence, using the modified grammar with [Wh +/-], can be
seen by clicking the following link: http://web.mit.edu/~kaede11/Public/testout26f.txt 2.6.3 The ‘slash’ feature (I
searched for more information to gain further insight into this topic, and found http://www.informatics.sussex.ac.uk/research/nlp/gazdar/nlp-in-prolog/ to be helpful in outlining my following points of discussion.) An
expression A/B marks an A with B missing, so such usage of the slash permits any of the following types of sentences: the car mary went to to the car mary went go to the car, mary did This
means that a direct transfer of information about a missing item, between two sides of the arrow within a rule, can happen,
as opposed to the need for remembering what the displaced item is. Thus although
it may seem that the remaining need to expand out displaced items will cause no save in terms of the number of rules, the
usage of the slash provides the advantage that there is essentially no limit on the amount of intervening material that can
occur between the displaced constituent at the front of the sentence and the empty constituent that corresponds to it, occurring
within the sentence. While being able to account for such unbound dependencies,
a clear disadvantage of this feature is that, since phrases with gaps without displaced fillers, or vice versa, need to be
prevented from occurring, the value of what exactly is being displaced needs to be laboriously stated explicitly in the rules. |
![]() |
|
![]() |
![]() |
![]() |
Bo Sung Kim © 2005 kaede11 |