CoNLL 2019 Shared Task: Meaning Representation Parsing --- Training Data

Version 1.0; February 3, 2020


Overview
========

This directory contains training data for the MRP 2019 shared task: semantic graphs in five distinct frameworks: AMR, DM, EDS, PSD, and UCCA. All graphs are encoded in a uniform abstract representation with a common serialization based on JSON. For general information on the task and the meaning representation frameworks involved, please see:

  http://mrp.nlpl.eu

The JSON-based uniform interchange format for all frameworks is documented at:

  http://mrp.nlpl.eu/index.php?page=4#format


Contents
========

The main contents of this release are the following files:

  $ wc -l */*.mrp
      969 amr/amr-guidelines.mrp
     1061 amr/bolt.mrp
      213 amr/cctv.mrp
     7378 amr/dfa.mrp
    32914 amr/dfb.mrp
       48 amr/fables.mrp
     4440 amr/lorelei.mrp
      203 amr/mt09sdl.mrp
     6603 amr/proxy.mrp
      527 amr/rte.mrp
      865 amr/wb.mrp
      191 amr/wiki.mrp
       87 amr/wsj.mrp
      741 amr/xinhua.mrp
    35656 dm/wsj.mrp
    35656 eds/wsj.mrp
    35656 psd/wsj.mrp
     3812 ucca/ewt.mrp
     2673 ucca/wiki.mrp
       87 ucca/wsj.mrp
   169780 total

Here, line counts correspond to the number of graphs available in each of the frameworks. Where there are multiple files for one framework, the union of all graphs constitutes the training data for this task. AMR, for example, draws on a diverse range of text types and domains; these various segments are preserved in the distribution, but the sub-division has no technical relevance in the MRP 2019 context. We anticipate that participants may simply concatenate these files into one.

The task setup does not designate a specific sub-set of the training graphs as development or validation data. Participants are free to put the full training data (from this package) to use as they best see fit.

In general, the goal of this distribution is to re-package the five collections of semantic graphs in a uniform representation, to facilitate cross-framework learning and unified evaluation.
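Because the line counts above equal the graph counts, each ‘.mrp’ file evidently holds one JSON-encoded graph per line. Under that assumption, a minimal Python sketch for reading and concatenating files could look as follows (the function names are our own, not part of any official tooling):

```python
import json

def read_mrp(path):
    """Read an .mrp file, assuming one JSON-encoded graph per line."""
    graphs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                graphs.append(json.loads(line))
    return graphs

def concatenate(paths):
    """Union of graphs across several .mrp files, e.g. all AMR segments."""
    combined = []
    for path in paths:
        combined.extend(read_mrp(path))
    return combined
```

For example, passing all fourteen ‘amr/*.mrp’ paths to concatenate() would yield the complete AMR training set as one list of graph objects.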
Thus, the MRP graphs only contain information that parsers are expected to predict, i.e. structural and labeling components that will be considered in evaluation. In several cases, this design decision has led to the omission of additional information from the original annotations, for example the :wiki links in AMR and the encodings of tense, number, et al. in EDS.

The MRP training data includes what is called the shared sample of WSJ graphs: 89 sentences for which annotations are available in all five frameworks. This sample is also available as a separate package, including visual renderings of these graphs in DOT and PDF format, which may facilitate human inspection:

  http://svn.nlpl.eu/mrp/2019/public/sample.tgz


AMR: Abstract Meaning Representation
====================================

In the AMR graphs, all nodes have the ‘label’ property (holding what AMR calls concept identifiers), and many nodes additionally use ‘properties’ and ‘values’, for example to encode negative :polarity or the various components of complex proper names, e.g. :op1, :op2, etc. The AMR graphs are unordered, i.e. there are no instances of the ‘anchors’ property on nodes.

As discussed by Kuhlmann & Oepen (2016; CL), AMR graphs can be viewed in two variants, viz. either in the tree-like structure that is created by annotators or in a normalized variant, where inverse edges (such as ‘ARG0-of’) are un-inverted, i.e. treated as an ‘ARG0’ edge in the opposite direction. There is an established tradition in AMR evaluation to score the normalized graphs, i.e. to assume that there can be multiple equivalent serializations of the same graph. On the other hand, at least some AMR parsers have found it beneficial to predict graphs in the tree-like, un-normalized topology, and therefore the MRP release represents both views on the AMR graphs in the same structure: The ‘source’, ‘target’, and ‘label’ properties on edge objects correspond to the tree-like form, i.e.
AMR graphs as annotated; an optional ‘normal’ property on edges indicates inversion. On an ‘ARG0-of’ edge, for example, the ‘normal’ property will be ‘ARG0’; conversely, a ‘consist-of’ edge (which superficially might look like an inverted edge, but is of course not) does not carry the ‘normal’ property.


DM: DELPH-IN MRS Bi-Lexical Dependencies
========================================

The DM (and PSD, see below) graphs are essentially a re-release of the training data from Task 18 at the 2015 Semantic Evaluation Exercise (SemEval); see:

  http://sdp.delph-in.net/2015/

The main difference is that the MRP version of these graphs further breaks the linkage to established conventions in syntactic dependency parsing: The inputs to parsing are no longer presented with gold-standard tokenization (and PoS and lemma values), as this would essentially pre-determine the inventory of nodes and a good part of their labeling. Instead, the DM (and PSD) graphs in MRP now comprise only content nodes: Semantically vacuous tokens (e.g. function words such as auxiliaries, complementizers, or particles) are not part of the graph.

Nodes in DM are labeled with lemmas and carry two additional properties that jointly determine the predicate sense, viz. ‘pos’ and ‘frame’. Even though at least the first of these properties might look morpho-syntactic in nature, they encode semantically relevant distinctions, e.g. verbal vs. nominal ‘board’; the ‘frame’ values make further sense distinctions, e.g. causative vs. inchoative ‘increase’, or plain vs. verb–particle ‘look’ vs. ‘look up’.

Another update compared to the original SDP 2015 release of the DM graphs is that lemma values have been normalized for several classes of ‘generic’ lexical entries (in the underlying grammar used to construct these annotations): Thus, lemma values like ‘_generic_proper_ne_’ or ‘_generic_card_ne_’ in this release have been replaced with actual values.
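Returning briefly to the AMR edge encoding described above: converting the tree-like view into the normalized one amounts to re-orienting every edge that carries a ‘normal’ property. A minimal Python sketch, assuming the dictionary layout of the MRP serialization as described in this document (the function name is our own):

```python
def normalize(graph):
    """Re-orient inverted AMR edges, dropping the 'normal' marker.

    An edge like {'source': s, 'target': t, 'label': 'ARG0-of',
    'normal': 'ARG0'} becomes {'source': t, 'target': s,
    'label': 'ARG0'}; edges without 'normal' (including the
    superficially inverted-looking 'consist-of') stay unchanged.
    """
    edges = []
    for edge in graph.get("edges", []):
        if "normal" in edge:
            edges.append({"source": edge["target"],
                          "target": edge["source"],
                          "label": edge["normal"]})
        else:
            edges.append(dict(edge))
    return {**graph, "edges": edges}
```

Scoring against the normalized view, as is traditional in AMR evaluation, then treats both serializations of such an edge as equivalent.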
The DM (and PSD) graphs are ordered, in the sense of a strict, complete linear ordering among their nodes, reflecting the surface order of their anchoring into sub-strings of the underlying input string. Both node properties and edge labels in DM are functional, i.e. there cannot be duplicates.


EDS: Elementary Dependency Structures
=====================================

The EDS graphs, in a sense, present a middle ground between the bi-lexical DM graphs and the unanchored AMR ones: All nodes are anchored onto sub-strings of the input, but anchors can correspond to arbitrary character ranges (e.g. affix or phrasal sub-strings), and multiple nodes can have overlapping anchors. Node labels in EDS are semantic predicates that are sense-disambiguated inasmuch as is determined by syntactic structure, e.g. ‘_increase_v_cause’ or ‘_look_v_up’ for some of the same examples invoked in the discussion of DM above.

In the context of MRP 2019, EDS node properties (encoding for example tense or number) have been omitted, such that the only framework-specific property used on nodes is ‘carg’ (for constant argument), a string-valued parameter that is used with predicates like ‘named’ or ‘dofw’, for proper names and the days of the week, respectively. EDS does not use edge properties, and while there can in principle be multiple edges between two nodes, edge labels are functional.


PSD: Prague Semantic Dependencies
=================================

Please see the discussion of the DM framework above for general updates in the PSD graphs compared to their original release for the SDP 2015 (SemEval) task. In addition to these somewhat foundational differences, lemma values in the PSD graphs have been corrected in cases where the original conversion (from the FGD tecto-grammatical trees) had accidentally introduced a kind of reduplication of lemma values on re-entrant nodes, e.g. values like ‘bill_bill’, now corrected to just ‘bill’.
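The character-range anchoring described for EDS above (and used by UCCA leaf nodes as well) can be made concrete with a small helper. This sketch assumes the MRP serialization encodes anchors as a list of {"from": i, "to": j} objects indexing into the raw input string; the helper name is our own:

```python
def anchored_text(node, sentence):
    """Recover the input sub-string(s) an MRP node is anchored to.

    Assumes anchors are encoded as [{"from": i, "to": j}, ...],
    i.e. character ranges into the raw input string; multiple,
    non-consecutive ranges are joined with an ellipsis.
    """
    spans = node.get("anchors") or []
    return " ... ".join(sentence[a["from"]:a["to"]] for a in spans)
```

Because EDS anchors are arbitrary character ranges rather than token indices, several nodes over the same sentence may yield overlapping sub-strings from this helper.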
The PSD (and DM) graphs are ordered, in the sense of a strict, complete linear ordering among their nodes, reflecting the surface order of their anchoring into sub-strings of the underlying input string. PSD uses the same node-local properties as DM (‘pos’ and ‘frame’), but the latter is only present on verbal nodes and (unlike in DM) actually encodes a lemma-specific sense identifier in the associated EngValLex valency dictionary. Unlike DM, edge labels in PSD are not functional: in coordinate and appositive structures, there will frequently be multiple outgoing edges from a node with the same label.


UCCA: Universal Conceptual Cognitive Annotation
===============================================

In the UCCA graphs, nodes are generally unlabeled and free of properties, as they essentially work as group-forming structural elements. Leaf nodes in the graphs are anchored to non-overlapping sub-strings of the underlying input, but there can be multiple, non-consecutive anchors on a node (e.g. for discontinuous multi-word expressions like ‘neither ... nor’).

UCCA is the only framework with edge properties that parsers are expected to predict (and which will be considered in evaluation, unlike the AMR ‘normal’ property on edges, which merely provides structural hints). On re-entrant nodes (with in-degree greater than one), all but one of the incoming edges will be considered remote participants (from other UCCA units). This distinction is encoded through a boolean-valued ‘remote’ property, which (currently at least) is only present on edges that actually are remote (i.e. have a ‘true’ value for this property).
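Following the description above, a remote UCCA edge can be recognized by the presence of a true-valued ‘remote’ property. The sketch below assumes edge-level properties are serialized as parallel ‘attributes’ and ‘values’ lists on edge objects (our reading of the MRP interchange format), with the attribute simply absent on primary edges:

```python
def is_remote(edge):
    """True iff a UCCA edge carries a 'remote' attribute valued true.

    Assumes parallel 'attributes' and 'values' lists on edge objects;
    edges without the attribute are primary (non-remote) edges.
    """
    attributes = edge.get("attributes") or []
    values = edge.get("values") or []
    return any(name == "remote" and value is True
               for name, value in zip(attributes, values))
```

On a re-entrant UCCA node, filtering its incoming edges with this predicate would separate the single primary edge from the remote participants.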
Since the original release of the MRP 2019 training data, the UCCA graphs have been updated twice: first in May 2019 (to provide additional annotations and to improve sentence and token splitting; released as a separate, UCCA-only overlay package), and again in February 2020 (to correct the erroneous inclusion of part of the held-out 2019 evaluation data in the ‘overlay’ release). Because most of the submissions to the 2019 shared task were trained on the erroneous release, UCCA evaluation scores regrettably are likely to be artificially high.


Known Limitations
=================

Anchor values in EDS graphs sometimes include character positions corresponding to punctuation marks (reflecting an idiosyncrasy in the original annotation process), which we plan to further normalize for increased predictability in evaluation.


Release History
===============

[Version 1.0; February 3, 2020]

+ Re-release of all data, plugging ‘leakage’ of UCCA evaluation graphs.

[Version 0.9; April 11, 2019]

+ First release of MRP 2019 training data in all frameworks.


Contact
=======

For questions or comments, please do not hesitate to email the task organizers at: ‘mrp-organizers@nlpl.eu’.

Omri Abend
Jan Hajič
Daniel Hershcovich
Marco Kuhlmann
Stephan Oepen
Tim O'Gorman
Nianwen Xue