ToDo: 1. Copy/update all grammars for: /semi.vpm /lkb/script /lkb/globals.lsp : translate-grid /lkb/mt.lsp lkb/user-fns.lsp -- ? 2: Update directory identity/ setup.lisp what else? Soft link from $LOGONROOT/uw/mmt to lingo/grammars/matrix/gmmt 3. Take notes on DTD issues sg/du/pl isn't going to work --- we also need non-sg et al gender just isn't relevant for transfer should I just go ahead and rename things inside the grammars GEN -> GEND et al? 4. Add grammars to setup.lisp 5. Create bitext test suites, round-robin, or something 6. Consider adding transfer rules ------------------------------------------------------------------------ Exerpted from chat with Stephan on 5/18/07 To run a batch process from the command line: ./batch --binary --from hau --to isl --ascii ./uw/mmt/toy.txt If it works, results in a .fan file $LOGONROOT/summarize -o 10 on the file, and afterwards they should be diff(1)able. the contents of the .fan log is also in an [incr tsdb()] profile now, which you can browse ------------------------------------------------------------------------ Stephan: the main changes i would suggest to your grammars are the following Sent at 8:26 AM on Friday Stephan: decide on the inventory of MRS variable properties and values. i added a variable property mapping (VPM) to HAU and ISL to demonstrate the machinery. the assumption is that outside the FS universe, there is a limited set of properties and values, and that they are declared in the SEM-I (and ideally correspond to the DTD); take a look at `erg.smi' below lingo/erg/ to get the basics. there even is documentation on the wiki: LogonVpm this way, you can assume an interlingual space for tense, number, et al. each grammar maps into that space when constructing MRSs from AVMs, and the reverse mapping is applied when generating from an input MRS. me: The .smi file is enough to enable the mapping? Stephan: i would suggest adding a `semi.vpm' file to each grammar. no, the file you need is the .vpm file me: Ah, I see. For the most part, those will be pretty boring for these grammars, as we've done pretty extensive harmonization where possible on the shared attributes. But I guess we could replace the sg/non-sg for example with sg/pl in Icelandic, and have Icelandic pl map to interlingual non-sg? Stephan: yes, but the external property names are differend (just GEND, NUM, et al), i.e. they have their position in the index FS stripped. and, oddly, the current best practice in values (as represented by the `erg.smi', and mimicked in JaCY and GG), is not always the same as (a) what your grammars are using or (b) what is in the DTD. a very nasty space. between the ERG, JaCY, and GG we have so far succeeded in a bottom-up fashion. i would suggest you take inspiration as much as you can from there, and keep track of divergences (as we did in `erg.smi', comparing to the DTD). me: Where is the DTD -- on the wiki? Sent at 8:34 AM on Friday Stephan: i gave you one transfer grammar so far, called `identity' and it provides a set of files that could be shared across transfer grammars, once you bifurcate more; all variable properties and values in your semi.vpm files and my mrs.tdl need to be compatible. ann would have to say; i usually look at the version in src/rmrs/'. but since we are doing MRS and not RMRS, there is extra wiggle room. Stephan: if you look at uw/mmt/setup, you will see how the cross-multiplication of languages and association with transfer grammars is accomplished. fully configurable by you, and you can also use wildcards for source or target, e.g. to implement the target-specific idea. to create a new transfer grammar, just copy identity to a new name and make changes as you see fit. i have tried arranging the files so that mrs.tdl, mtr.tdl, et al. end up shared. this is modeled after what you will find below uio/, which is the Transfer MatriX emerging from LOGON, plus JaCY and GG. i decided to start you out in isolation, because there's lots more than you need just now, and we are still making adjustments in that space that are hard enough to synchronize across three grammars. Stephan: you'll need a VPM in each grammar Stephan: the rule-priority() function must always return an integer, at least ISL had a non-compliant version (probably all user-fns.lsp files are the same, so maybe you can just copy the ISL one around) the mt.lsp file contained assignements to deprecated variables, which yield load-time warning. the HAU and ISL one is now stripped down to what you actually need, so maybe (again) copy that version into all the other grammars. Stephan: each grammar should set the translate-grid variable suitably; see what i did in ISL and IDENTITY the mmt/tsdb/skeletons/ directory needs to be a proper skeleton root, i.e. contain Relations and Index.lisp; see what i did in my copy. it might be worth adding a (set-coding-system utf-8) to the top of each script Adding languages: Add them to mmt-languages in setup.lisp, and make sure they can be loaded by means of $LOGONROOT/uw/mmt//lkb/script Stephan: nope, you need a VPM in each grammar; and implicitly that makes decisions about the SEM-I for that grammar, but you need to write those decisions down as a .smi file for each grammar. you could think of `uw/mmt/mrs.tdl' as the meta SEM-I for your grammars, as they should all (post-VPM) conform to what you declare there ERB 2007-05-21: When adding new languages, it appears that *translate-grid* needs to be updated in setup.lisp as well as in ALL of the transfer grammars. ------------------------------------------------------------------- Updating script files: At top of file: (set-coding-system utf-8) Instead of old transfer stuff, in whatever form: ;;; ;;; the mapping from grammar-internal to SEM-I compliant variable properties ;;; (mt:read-vpm (lkb-pathname (parent-directory) "semi.vpm") :semi) (lkb-load-lisp (this-directory) "mt.lsp" t) Remove (index-for-generator) ------------------------------------------------------------------- Updating other files: Fix rule-priority() in user-fns.lsp cp isl/semi.vpm to each language --- note, no GENDER in VPM, as we want to strip it. For now assuming same VPM will work everywhere new mt.lsp should just have the following: (setf *transfer-filter-p* nil) (setf *lm-model* nil)