Ryan Georgi
Ling 567
Lab 5 Writeup

#0: Basic Lexicon/Test Suite Changes
---------------
After the last lab, I realized the importance of getting the itsdb suite to use the regularized forms, so for this lab I've changed all the lexical entries to use regularized morpheme forms.

#1: PNG & Pronouns
---------------
Arabic has a large set of pronouns that vary for person, number, and gender. Furthermore, the number system includes dual in addition to singular and plural. Here is a chart of the different forms of free-standing pronouns in Arabic:

Person/Gender   Sg.      Dual             Pl.
1               ?ana     --               nah.nu
2/masc          ?anta    \                ?antum
                          => ?antuma:
2/fem           ?anti    /                ?antunna
3/masc          huwa     \                hum
                          => huma:
3/fem           hiya     /                hunna

All the singular and plural forms were entered as pronoun entries with their person, gender, and number specified. The 2nd and 3rd person dual entries were left underspecified for gender.

As I noted in a previous lab, these free-standing pronouns are not commonly used in Arabic because of the unambiguous person/number/gender marking on the verb. When they do show up, it's usually to topicalize a preceding noun or as the subject of a verbless sentence (such as "?ana mudarris" = I teacher = 'I am a teacher'). Whenever they appear, they are nominative, so I have put that constraint on the pronoun-lex type itself.

* Arabic doesn't have separate determiners, so I don't have determiners whose optionality I could test, but I did make sure the inflectional rules for definiteness I created below don't apply to these pronouns.

#1b: Common Nouns
--------------
Gender of Arabic nouns is generally quite regular. With a few exceptions, all nouns whose base form ends in -t(a) are feminine, and all others are masculine. I have gone through and added gender to the lexical entries in my grammar, though since I haven't yet implemented agreement, gender won't have any visible effect yet.
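As a rough illustration of the entries described in #1 and #1b, here is a minimal TDL sketch. It is not copied from the grammar: the entry names, the supertype of pronoun-lex, the path to PNG, and the PER/NUM/GEND value names (2nd, sg, du, masc) are all assumptions, and predicate values are left out entirely.

```tdl
; Hypothetical sketch: type, feature, and value names below are
; assumptions for illustration, not the grammar's actual TDL.

; Free-standing pronouns are always nominative, so CASE is fixed
; once on the type rather than on each entry.
pronoun-lex := basic-noun-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE nom ] .

; Singular and plural pronouns are fully specified for PNG.
anta := pronoun-lex &
  [ STEM < "?anta" >,
    SYNSEM.LOCAL.CONT.HOOK.INDEX.PNG [ PER 2nd,
                                       NUM sg,
                                       GEND masc ] ] .

; Dual pronouns leave GEND underspecified.
antuma := pronoun-lex &
  [ STEM < "?antuma:" >,
    SYNSEM.LOCAL.CONT.HOOK.INDEX.PNG [ PER 2nd,
                                       NUM du ] ] .

; Common nouns carry lexical gender.
walad := common-noun-lex &
  [ STEM < "walad" >,
    SYNSEM.LOCAL.CONT.HOOK.INDEX.PNG.GEND masc ] .
```

Leaving GEND off the dual entries means one entry each for ?antuma: and huma: is compatible with either gender, instead of needing a masculine and a feminine copy of each.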
* ( I didn't specify their roots very well in the last lab, but I have gone through with an Arabic dictionary to make sure I have the correct genders. )

** Again, determiners don't exist in Arabic, but there are quantifying affixes for definiteness/indefiniteness that I implemented (in a very simple way) as lexical rules below.

#2: Case / Inflection
-----------------
Case and inflection are really part and parcel of getting any coverage over my test suite, so I implemented these two packages to get some semblance of coverage.

First, I started simply by adding case to the verb-lex entries. Since the basic word orders are SVO and VSO, as I posted in the forums, I should be able to place case on the ARG-ST without worrying (too much) about reordering at this point. Furthermore, subjects appear to always be nominative, so I simply added the constraint that the first argument on verb-lex's ARG-ST (unifying with #subj) must be HEAD noun & CASE nom.

verb-lex := basic-verb-lex &
...
...
   ARG-ST < #subj &
            [ LOCAL [ CAT [ VAL [ SPR < >,
                                  COMPS < > ],
                            HEAD noun & [ CASE nom ] ],
...
...

I also added a constraint to the transitive verb lex type that the second argument must be accusative.

At this point, I moved to writing some basic inflectional rules. I began by defining the following i-rules:

   nom_lex_rule, acc_lex_rule
   def_lex_rule, indef_lex_rule

The first two rules simply specify the value of CASE in the noun's SYNSEM, and the instances of these rules add the affixes -u (nom) and -a (acc).

After implementing these simple cases, I tried testing the following sentences, and successfully got parses for the positive examples and rejections for the negative ones. (I have added these simple sentences to my test suite as well.)

yins.arifu lwaladu
y-ins.arifu ?al-walad-u
3MSG-leaves the-boy-NOM
"The boy leaves"

*yins.arifu lwalada
*y-ins.arifu ?al-walad-a
3MSG-leaves the-boy-ACC
*"The boy leaves"

yadribu lwaladu lwalada
y-adribu ?al-rajul-u ?al-walad-a
3MSG-hits the-man-NOM the-boy-ACC
"The man hits the boy"

...
( more permutations were added as well )
...

** (...looks like I'm not immune to the violent tendencies of grammar-writing linguists...)

Lastly, nouns in Arabic are inflected not just for case but for definiteness as well, so I created the new feature FNT (with the value type finiteness) to hold the definiteness value:

+nj :+ [ CASE case,
         FNT finiteness ] .

* I used the disjunctive noun/adjective type because adjectives are also inflected to agree with the case and definiteness of the noun that they modify.

** Also, with regard to my rules, I currently have the case ending being applied as the innermost rule, and as an infl-add-only-no-ccont-ltol-rule. In order to get it to apply case/definiteness to adjectives, I created a supertype +nj_lex that both common-noun-lex and adjective-lex inherit from. I use this supertype to constrain the DTR of these case rules so they don't spin in generation. I apply the definiteness prefixes/suffixes as an infl-ltow-rule to make sure they are the last rules to apply. This is probably not the preferred way to do it, but I had to do some triage to get everything working.

#3: Adjectives
--------------------
As I mentioned above, adjectives show agreement with nouns, so I implemented a single adjective, 'gabi:y' (stupid), to test a number of the sentences I'd included in my test suite. My TDL for the adjective structure was the same as that presented in the instructions, with the exception that adjectives in Arabic follow the noun they modify, so I simply changed [ POSTHEAD - ] to [ POSTHEAD + ].

#4: Wrap-up
---------------------
While I now have at least *some* coverage in my test suite (14.6%, according to itsdb), I've yet to implement agreement, so my overgeneration is pretty bad with regard to noun/adjective agreement. However, now that I have some working lexical rules, I should be able to increase coverage dramatically by fleshing out my test suite's vocabulary.