Ryan Georgi
Ling 567
Lab 5 Writeup

#0: Basic Lexicon/Test Suite Changes
---------------
After the last lab, I realized the importance of getting
the itsdb suite to use the regularized forms, so for this
lab I've changed all the lexical entries so that they'll
be using regularized morpheme forms.


#1: PNG & Pronouns
---------------

Arabic has a large set of pronouns that vary for all
person, number, and gender. Furthermore, the number system
includes dual in addition to singular and plural.

Here is a chart for the different forms of free-standing 
pronouns in Arabic:

Person/Gender         Sg.        Dual          Pl.

1                     ?ana        --          nah.nu

2/masc                ?anta   \               ?antum
                               => ?antuma:
2/fem                 ?anti   /               ?antunna

3/masc                h.uwa   \               hum
                               => huma:
3/fem                 hiya    /               hunna


All the singular and plural forms were specified in 
pronoun entries with their person, gender, and number
specified. The 2nd and 3rd person dual entries were
left underspecified for gender.

As I noted in a previous lab; these freestanding pronouns
are not commonly used in Arabic due to the unambiguous
person/number/gender marking on the verb. When these
pronouns do show up, it's usually to topicalize a
preceding noun or as the subject of a verbless sentence
(such as "?ana mudarris" = I  teacher = 'I am a teacher').

When these pronouns do appear, they are always nominative,
so I have put that constraint on the pronoun-lex type itself.

* Arabic doesn't have seperate determiners, so I don't have
determiners to test the optionality of, but I did make sure
the inflectional rules for definiteness I created below
don't apply to these pronouns.

#1b: Common Nouns
--------------

Gender of arabic nouns is generally quite regular.
With a few exceptions, all nouns whose base form
ends  in -t(a) are feminine, all others are masculine.

I have gone through and added gender to the lexical
entries in my language; though as I haven't yet
implemented agreement, this case won't be showing up
anywhere visible yet.

* ( I didn't specify their roots very well in the last lab,
but I have gone through with an Arabic dictionary to
make sure I have the correct genders. )

** Again, determiners don't exist in arabic, but there are
 quantifying affixes for definiteness/indefiniteness that I
 implemented (in a very simple way) as lexical rules below.

#2: Case / Inflection
-----------------

Case and inflection are really part and parcel of getting
any coverage over my test suite, so I implemented these two
packages to get some semblance of coverage.

First, I started simply by adding case to the verb-lex entries.
Since the basic word orders are SVO and VSO, as I posted
in the forums, I should be able to place case on the
ARG-ST without worrying (too much) about reordering at
this point.

Furthermore, subjects appear to always be nominative, so I
simply added the constraint that first arg on verb-lex's
ARG-ST (unifying with #subj) must be HEAD noun & CASE nom.


verb-lex := basic-verb-lex &
    ...
    ...
    ARG-ST < #subj &
             [ LOCAL [ CAT [VAL [ SPR < >,
                                 COMPS < > ],
			    HEAD noun &
				[CASE nom] ],
    ...
    ...

I also added a constraint to the transitive verb lex
that the second arg must be accusative.


At this point, I moved to writing some basic inflectional
rules. I began by defining the following i-rules:

nom_lex_rule, acc_lex rule
def_lex_rule, indef_lex_rule

The first two rules simply add to the SYNSEM the value of
CASE on a noun, and the instances of these rules add the
affixes -u (nom) and -a (acc).

After implementing these simple cases,I tried testing the 
following sentences, and successfully got parses for the
positives, and rejections for the negatives.
(I have added these simple sentences to my test suite
as well)


yins.arifu  lwaladu
y-ins.arifu ?al-walad-u
3MSG-leaves the-boy-NOM
"The boy leaves"

*yins.arifu  lwalada
*y-ins.arifu ?al-walad-a
3MSG-leaves the-boy-ACC
*"The boy leaves"

yadribu   lwaladu      lwalada
y-adribu  ?al-rajul-u  ?al-walad-a
3MSG-hits the-man-NOM  the-boy-ACC
"The man hits the boy"

... ( more permutations were added as well ) ...

** (...looks like I'm not immune to the violent tendencies
      of grammar-writing linguists...)

Lastly, nouns in arabic are inflected for not just case but
definiteness as well, so I created the new constraint FNT
to hold a value for finiteness:

+nj :+ [ CASE case,
          FNT finiteness ] .

* I used the disjoint noun/adjective set because adjectives
  are also inflected to agree with the case and definiteness
  of the noun that they modify.

** Also, with regards to my rules, I currently have the case
   ending being applied as the innermost rule, and as a
   infl-add-only-no-ccont-ltol-rule. In order to get it to
   apply case/finiteness to adjectives, I created a supertype
   +nj_lex that both common-noun-lex and adjective-lex
   inherit from. I use this supertype to constraint the
   DTR of these case rules so they don't spin in generation.

   I have the definiteness prefix/suffixes being applied
   as an infl-ltow-rule to make sure they are the last rules
   that apply. This is probably not the preferred way to
   do it, but I had to do some triage to get everything
   working.


#3: Adjectives
--------------------

As I mentioned above, adjectives show agreement with nouns,
so I implemented a single adjective, 'gabi:y' (stupid) to
test some of a number of sentences I'd included in my
test suite.

My TDL for the adjective structure was the same as that
presented in the instructions with the exception that
adjectives in arabic are post-head, so simply changed
[POSTHEAD - ] to [ POSTHEAD + ]

#4: Wrap-up
---------------------
While I now have at least *some* coverage in my test suite
(14.6%, according to itsdb) I've yet to implement agreement,
so my overgeneration is pretty bad with regards to noun/adj
agreement. However, now that I have some working lexical
rules, I should be able to increase coverage dramatically
by fleshing out my test suite's vocabulary.