To carry out morphological analysis for a subset of Spanish, I primarily consulted the provided englishpyk.rul file.
A. Adding lexical characters
I initially added three lexical (underlying) characters, as mentioned in Section 1.1.1 of the Lab 1b handout: C for a possible c softening, J for a possible g softening, and Z for a possible z insertion. In the end, J was the only additional lexical character used in the morphological analyses.
B. Defining subsets
I define six subsets of characters, as mentioned in Section 1.1.2 of the Lab 1b handout: V for vowels, BACK for back vowels, FRONT for front vowels, LOW for low vowels, HIGH for high vowels, and CONSONANTS for consonants. The vowel subsets include the accented vowels that exist in Spanish (á, é, í, ó, and ü), and the consonant subset likewise contains the n tilde (ñ). In my rule automata, I only use the BACK, FRONT, and CONSONANTS subsets to specify the characters that condition the mutations and pluralization.
C. Rule automata
The g-j mutation rule (RULE 3 in my spanishrulBo.rul file, linked in Section II of this report) ensures that the consonant g becomes a j before back vowels, but remains a g otherwise. I use the defined BACK subset and the added lexical character J, which can be realized as either a g or a j on the surface. The following is the graphical form of the implemented finite state automaton:

The z-c mutation rule (RULE 4) ensures that the consonant z becomes a c before front vowels, but remains a z otherwise. I use the defined FRONT subset. As with RULE 3 above, the following is the graphical form of the implemented finite state automaton:

The pluralization rule (RULE 5) ensures that pluralizing a noun that ends in a consonant (by adding s) causes an e to appear on the surface before the s. I use the defined CONSONANTS subset. The following is the graphical form of the implemented finite state automaton (please note that CONSONANTS:CONSONANTS is represented as CONS:CONS):

D. Lexicon automaton
The lexicon automaton (contained in my spanishlexBo.lex file, linked in Section II of this report) has nine distinct states and seven transition arc descriptions, assembled as shown in the figure below:

The NUMBER description specifies whether the noun being handled is singular or plural. The V_SUFFIX1 description specifies the conjugation suffixes for ar verbs, V_SUFFIX2 does the same for er verbs, and V_SUFFIX3 for ir verbs. The N_ROOT description provides the root of the noun being handled and its English meaning, and V_ROOT provides the same information for a verb. The END description indicates the arc that the state transition follows at the end of an analysis.
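
As a rough illustration of how these arc descriptions fit together, the Python sketch below maps a lexical form to glosses in the style of the batch log in Section III. It does not reproduce the contents of spanishlexBo.lex: the dictionary layout and the gloss function are hypothetical, and only the roots, English meanings, and feature tags are taken from the log. V_SUFFIX3 is left out because the log contains no ir verbs.

```python
# Hypothetical miniature of the lexicon; entries are copied from the batch log.
N_ROOT = {"bota": "boot", "ciudad": "city"}
V_ROOT = {"cruz": ("cross", 1), "coJ": ("catch,seize,grab", 2)}

# NUMBER: an empty suffix means singular, "s" means plural.
NUMBER = {"": ".SG", "s": "+PL"}

# Suffix tables for ar verbs (class 1) and er verbs (class 2).
V_SUFFIX1 = {
    "ar": [".INF"], "o": ["+PresInd1pSG"], "as": ["+PresInd2pSG"],
    "a": ["+PresInd3pSG"], "amos": ["+PresInd1pPL"], "an": ["+PresInd3pPL"],
    "e": ["+PresSubj1pSG", "+PresSubj3pSG"], "es": ["+PresSubj2pSG"],
    "emos": ["+PresSubj1pPL"], "en": ["+PresSubj3pPL"],
}
V_SUFFIX2 = {
    "er": [".INF"], "o": ["+PresInd1pSG"], "es": ["+PresInd2pSG"],
    "e": ["+PresInd3pSG"], "emos": ["+PresInd1pPL"], "en": ["+PresInd3pPL"],
    "a": ["+PresSubj1pSG", "+PresSubj3pSG"], "as": ["+PresSubj2pSG"],
    "amos": ["+PresSubj1pPL"], "an": ["+PresSubj3pPL"],
}
V_SUFFIXES = {1: V_SUFFIX1, 2: V_SUFFIX2}


def gloss(lexical):
    """Return every gloss for a lexical form such as 'coJ+a' or 'bota'."""
    root, _, suffix = lexical.partition("+")
    results = []
    if root in N_ROOT and suffix in NUMBER:
        results.append(f"[Noun({N_ROOT[root]}) {NUMBER[suffix]}]")
    if root in V_ROOT:
        meaning, verb_class = V_ROOT[root]
        for feature in V_SUFFIXES[verb_class].get(suffix, []):
            results.append(f"[Verb({meaning}) {feature}]")
    return results
```

For example, gloss("coJ+a") yields two readings (first and third person singular subjunctive), matching the two analyses that the log lists for coja.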
E. Issue during development
The only major issue that I faced during development was the need to revise my initial formulation of the z-c mutation rule (RULE 4) because of its interaction with the pluralization rule (RULE 5). Initially, RULE 4 did not contain the arcs labeled +:e in its graphical form shown above. Because the rules apply simultaneously, the change from z to c must also be triggered when the front vowel is the e that RULE 5 inserts to pluralize a noun whose root ends in z (for example, l^piz+s surfacing as l^pices), so I added those arcs.
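
The interaction can be sketched in Python as a lexical-to-surface generator. This is only an approximation under stated assumptions, not the automata in spanishrulBo.rul: the membership of BACK, FRONT, and CONSONANTS is my guess (accented characters are omitted), the function name generate is hypothetical, and the sketch resolves the morpheme boundary first where the real rules apply in parallel. The point it illustrates is that z must look at the surface realization of +, not at the lexical + itself.

```python
# Assumed subset membership; the real subsets are defined in the rules file.
BACK = set("aou")
FRONT = set("ei")
CONSONANTS = set("bcdfghjklmnpqrstvwxyzñ")


def generate(lexical):
    """Map a lexical form such as 'l^piz+s' to its surface form."""
    chars = list(lexical)
    out = [None] * len(chars)

    # Pluralization (RULE 5): '+' surfaces as e between a consonant and a
    # word-final s, and as nothing everywhere else.
    for i, ch in enumerate(chars):
        if ch == "+":
            after_consonant = i > 0 and chars[i - 1] in CONSONANTS
            before_final_s = lexical[i + 1:] == "s"
            out[i] = "e" if after_consonant and before_final_s else ""

    def next_surface(i):
        # First non-empty surface character to the right of position i.
        for j in range(i + 1, len(chars)):
            symbol = out[j] if chars[j] == "+" else chars[j]
            if symbol:
                return symbol
        return ""

    for i, ch in enumerate(chars):
        if ch == "+":
            continue
        following = next_surface(i)
        if ch == "J":    # g-j mutation (RULE 3)
            out[i] = "j" if following in BACK else "g"
        elif ch == "z":  # z-c mutation (RULE 4), which also sees an inserted e
            out[i] = "c" if following in FRONT else "z"
        else:
            out[i] = ch
    return "".join(out)
```

With these assumptions, generate("l^piz+s") gives l^pices and generate("cruz+amos") gives cruzamos, in line with the batch log. If z looked only at the lexical string, the first case would come out as l^pizes, which the log lists among the bad examples.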
F. System extendibility
To add more nouns and verbs to the system (without extending the morphological processes that it can handle), one only needs to list their roots (specifying the verb type: ar (1), er (2), or ir (3)) under the N_ROOT and V_ROOT transition arc descriptions, respectively, in the lexicon file spanishlexBo.lex. My system is thus highly extendible to more nouns and verbs. Adding them could be made even easier with 1) a method for automatically specifying verb type and 2) a companion function (either within the system or separate from it) that outputs the root of any input verb or noun to be included, then automatically adds that root under the appropriate transition arc description in the lexicon file.
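
A minimal sketch of the first suggestion, assuming the verb type can be read off the infinitive ending, might look like the following. The function name split_infinitive is hypothetical, and the sketch does not write to the lexicon file. It also returns the plain orthographic stem, so a root that needs a special lexical character (coJ for coger, for instance) would still have to be adjusted by hand.

```python
VERB_TYPES = {"ar": 1, "er": 2, "ir": 3}


def split_infinitive(infinitive):
    """Split an infinitive into (stem, verb type) using its last two letters."""
    ending = infinitive[-2:]
    if len(infinitive) <= 2 or ending not in VERB_TYPES:
        raise ValueError(f"not a recognized infinitive: {infinitive!r}")
    return infinitive[:-2], VERB_TYPES[ending]
```

For example, split_infinitive("cruzar") returns ("cruz", 1), which is the root and verb type that a V_ROOT entry for cruzar would need.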
II. Pointers to my automata files
http://web.mit.edu/~kaede11/Public/spanishrulBo.rul
http://web.mit.edu/~kaede11/Public/spanishlexBo.lex
III. Log of batch run on spanish.rec
----- Tue, 15 Feb 2005 05:37 AM -----
;; Good examples (47)
coger -> coJ+er [Verb(catch,seize,grab) .INF]
cojo -> coJ+o [Verb(catch,seize,grab) +PresInd1pSG]
coges -> coJ+es [Verb(catch,seize,grab) +PresInd2pSG]
coge -> coJ+e [Verb(catch,seize,grab) +PresInd3pSG]
cogemos -> coJ+emos [Verb(catch,seize,grab) +PresInd1pPL]
cogen -> coJ+en [Verb(catch,seize,grab) +PresInd3pPL]
coja -> coJ+a [Verb(catch,seize,grab) +PresSubj1pSG]
        coJ+a [Verb(catch,seize,grab) +PresSubj3pSG]
llegar -> lleg+ar [Verb(arrive) .INF]
llego -> lleg+o [Verb(arrive) +PresInd1pSG]
llegan -> lleg+an [Verb(arrive) +PresInd3pPL]
pagar -> pag+ar [Verb(pay) .INF]
pago -> pag+o [Verb(pay) +PresInd1pSG]
pagan -> pag+an [Verb(pay) +PresInd3pPL]
cruzar -> cruz+ar [Verb(cross) .INF]
cruzo -> cruz+o [Verb(cross) +PresInd1pSG]
cruzas -> cruz+as [Verb(cross) +PresInd2pSG]
cruza -> cruz+a [Verb(cross) +PresInd3pSG]
cruzamos -> cruz+amos [Verb(cross) +PresInd1pPL]
cruzan -> cruz+an [Verb(cross) +PresInd3pPL]
cruce -> cruz+e [Verb(cross) +PresSubj1pSG]
         cruz+e [Verb(cross) +PresSubj3pSG]
l^piz -> l^piz [Noun(pencil) .SG]
l^pices -> l^piz+s [Noun(pencil) +PL]
ciudad -> ciudad [Noun(city) .SG]
ciudades -> ciudad+s [Noun(city) +PL]
bota -> bota [Noun(boot) .SG]
botas -> bota+s [Noun(boot) +PL]
cojas -> coJ+as [Verb(catch,seize,grab) +PresSubj2pSG]
cojamos -> coJ+amos [Verb(catch,seize,grab) +PresSubj1pPL]
cojan -> coJ+an [Verb(catch,seize,grab) +PresSubj3pPL]
conozcas -> conozc+as [Verb(know) +PresSubj2pSG]
conozcamos -> conozc+amos [Verb(know) +PresSubj1pPL]
conozcan -> conozc+an [Verb(know) +PresSubj3pPL]
parezcas -> parezc+as [Verb(seem) +PresSubj2pSG]
parezcamos -> parezc+amos [Verb(seem) +PresSubj1pPL]
parezcan -> parezc+an [Verb(seem) +PresSubj3pPL]
venzas -> venz+as [Verb(conquer,defeat) +PresSubj2pSG]
venzamos -> venz+amos [Verb(conquer,defeat) +PresSubj1pPL]
venzan -> venz+an [Verb(conquer,defeat) +PresSubj3pPL]
cuezas -> cuez+as [Verb(cook,bake) +PresSubj2pSG]
cuezamos -> cuez+amos [Verb(cook,bake) +PresSubj1pPL]
cuezan -> cuez+an [Verb(cook,bake) +PresSubj3pPL]
ejerzas -> ejerz+as [verb(exercise,practice) +PresSubj2pSG]
ejerzamos -> ejerz+amos [verb(exercise,practice) +PresSubj1pPL]
ejerzan -> ejerz+an [verb(exercise,practice) +PresSubj3pPL]
cruces -> cruz+es [Verb(cross) +PresSubj2pSG]
crucemos -> cruz+emos [Verb(cross) +PresSubj1pPL]
crucen -> cruz+en [Verb(cross) +PresSubj3pPL]
;;;
;;; Bad Examples (13)
;;;
llejo ->
lleja ->
cogo ->
coga ->
cruco ->
cruca ->
crucan ->
cruze ->
l^pizes ->
ciudads ->
l^pizs ->
l^pics ->
botaes ->