CoNLL 2020 Shared Task: Meaning Representation Parsing
--- Evaluation Data
Version 1.0; July 22, 2020

Overview
========

This directory (combined with its ‘sibling’ directory for the
cross-lingual track) contains the evaluation data for the MRP 2020
shared task, i.e. the parser inputs that form the starting point for
system submissions.  The files use a ‘stripped-down’ version of the
common MRP serialization format, with the fields to be predicted by
participating systems suppressed, and some additional fields indicating
which target frameworks can be evaluated for each input.

For general information on the task and the meaning representation
frameworks involved, please see:

  http://mrp.nlpl.eu

The JSON-based uniform interchange format for all frameworks is
documented at:

  http://mrp.nlpl.eu/2020/index.php?page=14#format


Cross-Framework Track
=====================

The main content of this release is the file providing the English
parser inputs:

  $ wc -l input.mrp
  9355 input.mrp

Here, the number of lines corresponds to the number of parser inputs,
i.e. the MRP 2020 evaluation data comprises 9355 strings to be parsed.
Parser inputs are presented as otherwise empty MRP graphs, where the
‘input’ property provides the string to be parsed (each corresponding
to one sentence-like unit, i.e. one parser input).  Each input
additionally carries a top-level property ‘targets’ that indicates the
range of frameworks that will be evaluated for this sentence.  Because
only a small subset of the MRP 2020 evaluation data is annotated in all
frameworks, most inputs have between one and three elements in their
‘targets’ list.

Additionally, parser inputs are accompanied by the same type of
‘companion’ morpho-syntactic trees as the training data, produced with
the same software version and parsing model:

  $ wc -l udpipe.mrp
  9355 udpipe.mrp

For additional technical information on the preparation of these
companion analyses, please see the original companion package for the
training data:

  http://svn.nlpl.eu/mrp/2020/public/companion/README.txt


Cross-Lingual Track
===================

Parallel to the cross-framework track, parser inputs (in Chinese,
Czech, and German) for the cross-lingual track are provided in MRP
format, where parsers are expected to fill in the missing fields
(i.e. the actual meaning representation graphs), according to the
top-level ‘language’ and ‘targets’ properties.  Files for the
cross-lingual track reside in the directory ‘.../mrp/2020/cl/evaluation/’:

  $ wc -l input.mrp
  8036 input.mrp

Also in parallel, parser inputs are accompanied by the same type of
‘companion’ morpho-syntactic trees as the training data, produced with
the same software version and parsing models.  However, these are
provided in MRP format for each of the languages separately:

  $ wc -l ces.mrp deu.mrp zho.mrp
   5476 ces.mrp
    847 deu.mrp
   1713 zho.mrp
   8036 total


System Submissions
==================

The files in this archive provide the starting point for participants
in the MRP 2020 shared task.  The evaluation period will run between
Monday, July 27, and Monday, August 10, 2020; no submissions will be
possible after that date.

The task emphasizes a cross-framework perspective and in one of its
tracks invites submissions that include predicted semantic graphs in
all five frameworks (AMR, DRG, EDS, PTG, and UCCA).  Thus, in the
cross-framework track, a complete submission will provide graphs for
all parser ‘input’ strings and all of their ‘targets’ elements (10,502
predicted graphs in total).
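For orientation, the per-framework breakdown of these 10,502 graphs can
be recovered directly from the ‘targets’ lists.  The following Python 3
sketch is an unofficial convenience, not part of the task
infrastructure; it only assumes that ‘input.mrp’ (with one JSON graph
per line, as described above) is in the current directory:

  #!/usr/bin/env python3
  # Tally the number of graphs a complete cross-framework submission
  # must provide: one per element of each input's 'targets' list.
  import json
  from collections import Counter

  counts = Counter()
  with open("input.mrp", encoding="utf-8") as stream:
      for line in stream:                  # one JSON graph per line
          counts.update(json.loads(line).get("targets", []))

  for framework, n in sorted(counts.items()):
      print(f"{framework}\t{n}")
  print(f"total\t{sum(counts.values())}")  # 10,502 expected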
All graphs should be concatenated into a single file called
‘output.mrp’ and must be uploaded through the CodaLab submission
interface.  In other words, the output of a complete submission should
include four separate graphs for a (hypothetical) parser input like the
following:

  {"id": "111450", "version": 1.1, "time": "2020-07-22",
   "language": "eng", "source": "lpps", "provenance": "MRP 2020",
   "targets": ["eds", "ptg", "ucca", "amr"],
   "input": "So the little prince tamed the fox."}

For the cross-lingual track, the above applies with equal force, if
only in principle: in the cross-lingual track, there are (regrettably)
no sentences with annotations in multiple frameworks.  Therefore, the
‘targets’ list will always contain exactly one entry (AMR, DRG, PTG,
or UCCA).

Submissions to the cross-framework and cross-lingual tracks must be
combined into a single file (as above: ‘output.mrp’); they will be
separated (based on graph ‘id’entifiers and ‘framework’ values) for
evaluation; the two tracks will be scored separately.  All submitted
graphs must be serialized in the unified MRP interchange format and
must pass the MRP validator.  For background on the file format and
tool support for pre-submission validation, please see:

  http://mrp.nlpl.eu/2020/index.php?page=14#format
  https://github.com/cfmrp/mtool#validation

A rough, unofficial completeness check is additionally sketched in the
appendix at the end of this file.

For further information for participants, including instructions on how
to access the CodaLab site for the task, how to package a submission,
the MRP 2020 policy on re-submissions (within the evaluation deadline),
and more, please see:

  http://mrp.nlpl.eu/index.php?page=16#submission


Known Limitations
=================

Anchor values in EDS graphs sometimes include character positions
corresponding to adjacent punctuation marks (reflecting their
morpho-syntactic analysis as a kind of ‘pseudo-affix’ in the underlying
annotations).  Evaluation of anchoring in the official MRP scorer is
somewhat robust to such variation, i.e. there is a notion of anchor
normalization (ignoring a specific set of punctuation marks in prefix
or suffix position), such that it will be legitimate for a parser to
directly predict normalized anchors.  For background, please see:

  http://mrp.nlpl.eu/2020/index.php?page=15#software


Release History
===============

[Version 1.0; July 22, 2020]

+ First release of MRP 2020 evaluation parser inputs in all frameworks.


Contact
=======

For questions or comments, please do not hesitate to email the task
organizers at: ‘mrp-organizers@nlpl.eu’.

Omri Abend, Lasha Abzianidze, Johan Bos, Jan Hajič, Daniel Hershcovich,
Bin Li, Stephan Oepen (chair), Tim O'Gorman, Nianwen Xue, and Dan Zeman
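
Appendix: Pre-Submission Self-Check
===================================

Complementing (but not replacing) the MRP validator referenced under
‘System Submissions’ above, the following Python 3 sketch cross-checks
a combined ‘output.mrp’ against the parser inputs of both tracks: every
(identifier, framework) pair required by a ‘targets’ list should be
matched by exactly one predicted graph.  This is an unofficial
convenience; the paths in INPUTS are placeholders and need to be
adjusted to wherever the two evaluation directories were unpacked.

  #!/usr/bin/env python3
  # Unofficial sanity check: confirm that a combined submission file
  # ('output.mrp') covers every (id, framework) pair required by the
  # 'targets' lists of the parser inputs, with no gaps or duplicates.
  import json
  from collections import Counter

  # placeholder paths: list the 'input.mrp' files of both tracks here
  INPUTS = ("input.mrp",)

  def read_graphs(path):
      # MRP files contain one JSON graph per line
      with open(path, encoding="utf-8") as stream:
          for line in stream:
              yield json.loads(line)

  required = set()
  for path in INPUTS:
      for graph in read_graphs(path):
          required.update((graph["id"], target)
                          for target in graph.get("targets", []))

  submitted = Counter((graph["id"], graph["framework"])
                      for graph in read_graphs("output.mrp"))

  for id_, framework in sorted(required - set(submitted)):
      print(f"missing\t{id_}\t{framework}")
  for (id_, framework), n in sorted(submitted.items()):
      if (id_, framework) not in required or n > 1:
          print(f"surplus\t{id_}\t{framework}\t{n}")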