For reasons of comparability and fairness, the MRP 2020 shared task imposes some constraints on which third-party data or pre-trained models can be used in addition to the resources distributed by the task organizers. Following is a ‘white-list’ of legitimate resources, which was constructed from nominations by prospective participants. The deadline for suggesting additional resources is Monday, June 15, 2020. In general, only resources that all participants could in principle obtain are considered for white-listing.

Some of the MRP task data intricately overlaps with common syntactic treebanks. Therefore, a general rule is that resources like the Penn Treebank, its derivatives like PropBank, as well as the Universal Dependencies treebanks need to be used with some care in MRP system development. For example, common English parsers (like CoreNLP, spaCy, or UDPipe) have been trained on some of the same texts that are annotated in the MRP training split, which will most likely lead to unrealistically high syntactic parsing accuracy during development and, correspondingly, a distinct drop in parser performance when moving to held-out evaluation data. To avoid such effects, the companion data for the task provides high-quality morpho-syntactic dependency parses that were produced using jack-knifing; please see:

  http://svn.nlpl.eu/mrp/2020/public/companion/README.txt

The parser and models used to produce the morpho-syntactic companion trees will be released to the public upon completion of the shared task. Within reason, the organizers can parse additional corpora for participants, provided that the text and parses can be shared with all participants. Anyone who would like to use this service should contact ‘mrp-organizers@nlpl.eu’ to discuss specifics of data preparation and turnaround time.
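For readers unfamiliar with the term, the jack-knifing scheme mentioned above can be sketched roughly as follows: the training data is split into k folds, and each fold is parsed by a model trained only on the other k-1 folds, so no sentence is ever parsed by a model that saw it during training. The `train` and `parse` callables in this sketch are hypothetical stand-ins for an arbitrary parser toolkit, not the organizers' actual pipeline:

```python
def jackknife(sentences, train, parse, k=10):
    """Return automatic parses for `sentences`, where each sentence is
    parsed by a model trained on the k-1 folds it does not belong to.
    `train` and `parse` are placeholder callables for some parser toolkit."""
    n = len(sentences)
    fold_of = [i % k for i in range(n)]   # assign sentences round-robin to k folds
    parses = [None] * n
    for fold in range(k):
        # train on everything outside the current fold
        train_data = [sentences[i] for i in range(n) if fold_of[i] != fold]
        model = train(train_data)
        # parse only the held-out fold with that model
        for i in range(n):
            if fold_of[i] == fold:
                parses[i] = parse(model, sentences[i])
    return parses
```

This is why the companion trees avoid the inflated in-domain accuracy described above: the automatic annotations on the training split are produced under the same train/test separation that will hold at evaluation time.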
+ CoNLL 2017 texts and embeddings: http://hdl.handle.net/11234/1-1989
+ NLPL EngC3 Corpus: http://corpora.nlpl.eu/engc3/10/txt/
+ Wikipedia dumps: https://dumps.wikimedia.org/
+ Huggingface Transformers: https://huggingface.co/transformers/
+ BERT: https://github.com/google-research/bert
+ ELMo: https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
+ ERNIE: https://github.com/thunlp/ERNIE
+ FastText: https://fasttext.cc/docs/en/english-vectors.html
+ GloVe embeddings: https://nlp.stanford.edu/projects/glove/
+ NLPL Vectors Repository: http://vectors.nlpl.eu
+ CoreNLP: https://stanfordnlp.github.io/CoreNLP/
+ spaCy: https://spacy.io/
+ Stanza: https://stanfordnlp.github.io/stanza/
+ UDPipe: http://ufal.mff.cuni.cz/udpipe
+ UDify: https://github.com/Hyperparticle/udify
+ Illinois Named Entity Tagger: https://cogcomp.org/page/software_view/NETagger
+ FrameNet: https://framenet.icsi.berkeley.edu
+ Older (Version 1) UCCA annotations of ‘20K Leagues Under The Sea’:
  https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-20K
  https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_German-20K
  https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_French-20K
+ CzEngVallex: http://hdl.handle.net/11234/1-1512
+ ERG Semantic Interface and lexicon:
  http://svn.delph-in.net/erg/tags/1214/etc
  http://svn.delph-in.net/erg/tags/1214/lexicon.tdl
+ VerbNet: http://verbs.colorado.edu/verb-index/vn/verbnet-3.2.tar.gz
+ Princeton WordNet: https://wordnet.princeton.edu/
+ Open Multilingual WordNet: http://compling.hss.ntu.edu.sg/omw/
+ ConceptNet: http://conceptnet.io/
+ Microsoft Concept Graph: https://concept.research.microsoft.com/Home/Download