CoNLL 2019 Shared Task: Meaning Representation Parsing --- Training Data

Version 1.0; February 3, 2020


Overview
========

This directory contains training data for the MRP 2019 shared task: semantic graphs in five distinct frameworks: AMR, DM, EDS, PSD, and UCCA. All graphs are encoded in a uniform abstract representation with a common serialization based on JSON. For general information on the task and the meaning representation frameworks involved, please see:

  http://mrp.nlpl.eu

The JSON-based uniform interchange format for all frameworks is documented at:

  http://mrp.nlpl.eu/index.php?page=4#format


Contents
========

The main contents of this release are the following files:

  $ wc -l */*.mrp
      969 amr/amr-guidelines.mrp
     1061 amr/bolt.mrp
      213 amr/cctv.mrp
     7378 amr/dfa.mrp
    32914 amr/dfb.mrp
       48 amr/fables.mrp
     4440 amr/lorelei.mrp
      203 amr/mt09sdl.mrp
     6603 amr/proxy.mrp
      527 amr/rte.mrp
      865 amr/wb.mrp
      191 amr/wiki.mrp
       87 amr/wsj.mrp
      741 amr/xinhua.mrp
    35656 dm/wsj.mrp
    35656 eds/wsj.mrp
    35656 psd/wsj.mrp
     3812 ucca/ewt.mrp
     2673 ucca/wiki.mrp
       87 ucca/wsj.mrp
   169780 total

Here, line counts correspond to the number of graphs available in each of the frameworks. Where there are multiple files for one framework, the union of all graphs constitutes the training data for this task. AMR, for example, draws on a diverse range of text types and domains; these various segments are preserved in the distribution, but the sub-division has no technical relevance in the MRP 2019 context. We anticipate that participants may simply concatenate these files into one.

The task setup does not designate a specific sub-set of the training graphs as development or validation data. Participants are free to put the full training data (from this package) to use as they best see fit.

In general, the goal of this distribution is to re-package the five collections of semantic graphs in a uniform representation, to facilitate cross-framework learning and unified evaluation.
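Because the line counts above equal the graph counts, each ‘.mrp’ file evidently holds one JSON-encoded graph per line. Under that assumption, a minimal Python sketch for reading and concatenating files could look as follows (the function names are our own, not part of any official tooling):

```python
import json

def read_mrp(path):
    """Read an .mrp file, assuming one JSON-encoded graph per line."""
    graphs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                graphs.append(json.loads(line))
    return graphs

def concatenate(paths):
    """Union of graphs across several .mrp files, e.g. all AMR segments."""
    combined = []
    for path in paths:
        combined.extend(read_mrp(path))
    return combined
```

For example, passing all fourteen ‘amr/*.mrp’ paths to concatenate() would yield the complete AMR training set as one list of graph objects.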
Thus, the MRP graphs only contain information that parsers are expected to predict, i.e. structural and labeling components that will be considered in evaluation. In several cases, this design decision has led to the omission of additional information from the original annotations, for example the :wiki links in AMR and the encodings of tense, number, et al. in EDS.

The MRP training data includes what is called the shared sample of WSJ graphs: 89 sentences for which annotations are available in all five frameworks. This sample is also available as a separate package, including visual renderings of these graphs in DOT and PDF format, which may facilitate human inspection:

  http://svn.nlpl.eu/mrp/2019/public/sample.tgz


AMR: Abstract Meaning Representation
====================================

In the AMR graphs, all nodes have the ‘label’ property (holding what AMR calls concept identifiers), and many nodes additionally use ‘properties’ and ‘values’, for example to encode negative :polarity or the various components of complex proper names, e.g. :op1, :op2, etc. The AMR graphs are unordered, i.e. there are no instances of the ‘anchors’ property on nodes.

As discussed by Kuhlmann & Oepen (2016; CL), AMR graphs can be viewed in two variants, viz. either in the tree-like structure that is created by annotators or in a normalized variant, where inverse edges (such as ‘ARG0-of’) are un-inverted, i.e. treated as an ‘ARG0’ edge in the opposite direction. There is an established tradition in AMR evaluation to score the normalized graphs, i.e. to assume that there can be multiple equivalent serializations of the same graph. On the other hand, at least some AMR parsers have found it beneficial to predict graphs in the tree-like, un-normalized topology, and therefore the MRP release represents both views on the AMR graphs in the same structure: The ‘source’, ‘target’, and ‘label’ properties on edge objects correspond to the tree-like form, i.e.
AMR graphs as annotated; an optional ‘normal’ property on edges indicates inversion. On an ‘ARG0-of’ edge, for example, the ‘normal’ property will be ‘ARG0’; conversely, a ‘consist-of’ edge (which superficially might look like an inverted edge, but is of course not) does not carry the ‘normal’ property.


DM: DELPH-IN MRS Bi-Lexical Dependencies
========================================

The DM (and PSD, see below) graphs are essentially a re-release of the training data from Task 18 at the 2015 Semantic Evaluation Exercise (SemEval); see:

  http://sdp.delph-in.net/2015/

The main difference is that the MRP version of these graphs further breaks the linkage to established conventions in syntactic dependency parsing: The inputs to parsing are no longer presented with gold-standard tokenization (and PoS and lemma values), as this would essentially pre-determine the inventory of nodes and a good part of their labeling. Instead, the DM (and PSD) graphs in MRP now comprise only content nodes: Semantically vacuous tokens (e.g. function words such as auxiliaries, complementizers, or particles) are not part of the graph.

Nodes in DM are labeled with lemmas and carry two additional properties that jointly determine the predicate sense, viz. ‘pos’ and ‘frame’. Even though at least the first of these properties might look morpho-syntactic in nature, they encode semantically relevant distinctions, e.g. verbal vs. nominal ‘board’; the ‘frame’ values make further sense distinctions, e.g. causative vs. inchoative ‘increase’, or plain vs. verb–particle ‘look’ vs. ‘look up’.

Another update compared to the original SDP 2015 release of the DM graphs is that lemma values have been normalized for several classes of ‘generic’ lexical entries (in the underlying grammar used to construct these annotations): Thus, lemma values like ‘_generic_proper_ne_’ or ‘_generic_card_ne_’ in this release have been replaced with actual values.
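Returning briefly to the AMR edge encoding described above: converting the tree-like view into the normalized one amounts to re-orienting every edge that carries a ‘normal’ property. A minimal Python sketch, assuming the dictionary layout of the MRP serialization as described in this document (the function name is our own):

```python
def normalize(graph):
    """Re-orient inverted AMR edges, dropping the 'normal' marker.

    An edge like {'source': s, 'target': t, 'label': 'ARG0-of',
    'normal': 'ARG0'} becomes {'source': t, 'target': s,
    'label': 'ARG0'}; edges without 'normal' (including the
    superficially inverted-looking 'consist-of') stay unchanged.
    """
    edges = []
    for edge in graph.get("edges", []):
        if "normal" in edge:
            edges.append({"source": edge["target"],
                          "target": edge["source"],
                          "label": edge["normal"]})
        else:
            edges.append(dict(edge))
    return {**graph, "edges": edges}
```

Scoring against the normalized view, as is traditional in AMR evaluation, then treats both serializations of such an edge as equivalent.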
The DM (and PSD) graphs are ordered, in the sense of a strict, complete linear ordering among their nodes, reflecting the surface order of their anchoring into sub-strings of the underlying input string. Both node properties and edge labels in DM are functional, i.e. there cannot be duplicates.


EDS: Elementary Dependency Structures
=====================================

The EDS graphs, in a sense, present a middle ground between the bi-lexical DM graphs and the unanchored AMR ones: All nodes are anchored onto sub-strings of the input, but anchors can correspond to arbitrary character ranges (e.g. affix or phrasal sub-strings), and multiple nodes can have overlapping anchors. Node labels in EDS are semantic predicates that are sense-disambiguated inasmuch as is determined by syntactic structure, e.g. ‘_increase_v_cause’ or ‘_look_v_up’ for some of the same examples invoked in the discussion of DM above.

In the context of MRP 2019, EDS node properties (encoding for example tense or number) have been omitted, such that the only framework-specific property used on nodes is ‘carg’ (for constant argument), a string-valued parameter that is used with predicates like ‘named’ or ‘dofw’, for proper names and the days of the week, respectively. EDS does not use edge properties, and while there can in principle be multiple edges between two nodes, edge labels are functional.


PSD: Prague Semantic Dependencies
=================================

Please see the discussion of the DM framework above for general updates in the PSD graphs compared to their original release for the SDP 2015 (SemEval) task. In addition to these somewhat foundational differences, lemma values in the PSD graphs have been corrected in cases where the original conversion (from the FGD tecto-grammatical trees) had accidentally introduced a kind of reduplication of lemma values on re-entrant nodes, e.g. values like ‘bill_bill’, now corrected to just ‘bill’.
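The character-range anchoring described for EDS above (and used by UCCA leaf nodes as well) can be made concrete with a small helper. This sketch assumes the MRP serialization encodes anchors as a list of {"from": i, "to": j} objects indexing into the raw input string; the helper name is our own:

```python
def anchored_text(node, sentence):
    """Recover the input sub-string(s) an MRP node is anchored to.

    Assumes anchors are encoded as [{"from": i, "to": j}, ...],
    i.e. character ranges into the raw input string; multiple,
    non-consecutive ranges are joined with an ellipsis.
    """
    spans = node.get("anchors") or []
    return " ... ".join(sentence[a["from"]:a["to"]] for a in spans)
```

Because EDS anchors are arbitrary character ranges rather than token indices, several nodes over the same sentence may yield overlapping sub-strings from this helper.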
The PSD (and DM) graphs are ordered, in the sense of a strict, complete linear ordering among their nodes, reflecting the surface order of their anchoring into sub-strings of the underlying input string. PSD uses the same node-local properties as DM (‘pos’ and ‘frame’), but the latter is only present on verbal nodes and (unlike in DM) actually encodes a lemma-specific sense identifier in the associated EngValLex valency dictionary. Unlike DM, edge labels in PSD are not functional: in coordinate and appositive structures, there will frequently be multiple outgoing edges from a node with the same label.


UCCA: Universal Conceptual Cognitive Annotation
===============================================

In the UCCA graphs, nodes are generally unlabeled and free of properties, as they essentially work as group-forming structural elements. Leaf nodes in the graphs are anchored to non-overlapping sub-strings of the underlying input, but there can be multiple, non-consecutive anchors on a node (e.g. for discontinuous multi-word expressions like ‘neither ... nor’).

UCCA is the only framework with edge properties that parsers are expected to predict (and which will be considered in evaluation, unlike the AMR ‘normal’ property on edges, which merely provides structural hints). On re-entrant nodes (with in-degree greater than one), all but one of the incoming edges will be considered remote participants (from other UCCA units). This distinction is encoded through a boolean-valued ‘remote’ property, which (currently at least) is only present on edges that actually are remote (i.e. have a ‘true’ value for this property).
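Following the description above, a remote UCCA edge can be recognized by the presence of a true-valued ‘remote’ property. The sketch below assumes edge-level properties are serialized as parallel ‘attributes’ and ‘values’ lists on edge objects (our reading of the MRP interchange format), with the attribute simply absent on primary edges:

```python
def is_remote(edge):
    """True iff a UCCA edge carries a 'remote' attribute valued true.

    Assumes parallel 'attributes' and 'values' lists on edge objects;
    edges without the attribute are primary (non-remote) edges.
    """
    attributes = edge.get("attributes") or []
    values = edge.get("values") or []
    return any(name == "remote" and value is True
               for name, value in zip(attributes, values))
```

On a re-entrant UCCA node, filtering its incoming edges with this predicate would separate the single primary edge from the remote participants.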
Since the original release of the MRP 2019 training data, the UCCA graphs have been updated twice: first in May 2019 (to provide additional annotations and to improve sentence and token splitting; released as a separate, UCCA-only overlay package), and again in February 2020 (to correct the erroneous inclusion of part of the held-out 2019 evaluation data in the ‘overlay’ release). Because most of the submissions to the 2019 shared task were trained on the erroneous release, UCCA evaluation scores regrettably are likely to be artificially high.


Known Limitations
=================

Anchor values in EDS graphs sometimes include character positions corresponding to punctuation marks (reflecting an idiosyncrasy in the original annotation process), which we plan to further normalize for increased predictability in evaluation.


Release History
===============

[Version 1.0; February 3, 2020]

+ Re-release of all data, plugging ‘leakage’ of UCCA evaluation graphs.

[Version 0.9; April 11, 2019]

+ First release of MRP 2019 training data in all frameworks.


Contact
=======

For questions or comments, please do not hesitate to email the task organizers at: ‘mrp-organizers@nlpl.eu’.

Omri Abend
Jan Hajič
Daniel Hershcovich
Marco Kuhlmann
Stephan Oepen
Tim O'Gorman
Nianwen Xue