The scripts described in this document can be used to merge the training, development and evaluation files of the three tasks in EPE.

Scripts:
--------
1. generate_keys.py
2. pack.py
3. validate.py

Usage:
--------
First, run the script `generate_keys.py' using the following command:

    python generate_keys.py --path [path to the root directory of the EPE files]

The script takes an optional argument to specify the task (negation, events or opinion) for which to generate key files. If no task is specified, the script generates key files for all three tasks. The output of the script is two pickle files per task, containing a mapping from file names to keys and the other way around. For example, for the task events the script generates two pickle files named events_fk.p and events_kf.p: the first contains a dictionary mapping file names to keys, and the second contains a dictionary mapping keys to file names.

Note that the script only needs to be run once, and the generated files must not be changed afterwards, i.e. the script must not be run more than once, and definitely not after the following scripts have been called.

Second, run the `pack.py' script, which packs and unpacks the separate EPE files. To pack the files, run the following command:

    python pack.py -mode pack --path [path to the EPE directory]

Like the previous script, pack.py also takes an optional argument `task' to run the script for a specific task. Otherwise, the script will pack the files of the three tasks, generating three files: events.txt, opinion.txt and negation.txt. The script adds a separator after each file with the following format:

    Document NUMERIC_IDENTIFIER ends.

Third, to unpack the files of a specific task from one CoNLL-U-formatted file, run:

    python pack.py -mode unpack --task [opinion,events,negation] --infile [path to and name of the CoNLL-U file] --outpath [path to output folder]

All the arguments in the previous command *are required*.
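
To illustrate the separator convention, here is a minimal sketch (not the code in pack.py) of how a packed plain-text file could be split back into per-document chunks. It assumes that the separator sits on a line of its own and that NUMERIC_IDENTIFIER is a non-negative integer; the function name split_packed is hypothetical. How the separator appears in parsed CoNLL-U output depends on the parser and is not covered here.

```python
import re

# Assumed separator shape: "Document <digits> ends." on a line of its own.
SEPARATOR = re.compile(r"^Document (\d+) ends\.$")


def split_packed(text):
    """Return a list of (identifier, document_text) pairs from packed text.

    Any text after the last separator is ignored, since every packed
    file is expected to be followed by its own separator.
    """
    documents = []
    current = []
    for line in text.splitlines():
        match = SEPARATOR.match(line.strip())
        if match:
            # The separator closes the document collected so far.
            documents.append((int(match.group(1)), "\n".join(current)))
            current = []
        else:
            current.append(line)
    return documents


if __name__ == "__main__":
    sample = "First file, line one.\nDocument 7 ends.\nSecond file.\nDocument 12 ends.\n"
    for key, body in split_packed(sample):
        print(key, repr(body))
```

The numeric identifier recovered this way would then be looked up in the keys-to-file-names dictionary (e.g. events_kf.p) to restore the original file name.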

In unpack mode, the script takes a single file and unpacks it into separate files (following the original names in EPE) in separate directories corresponding to the original splits, i.e. training, development and evaluation. The script will create the directories if they do not already exist.

Please note that the `pack.py' script, in either mode, requires the key-filename mapping files (i.e. the pickle files) to be in the same working directory as the script itself.

Fourth, to validate that the number of tokens in the parsed and unpacked files is comparable to the number of tokens in a tokenized version of the EPE files, you can run the script `validate.py' using the following command:

    python validate.py -epe [path to EPE directory] -parsed [path to your parsed and unpacked files] --task [negation,opinion,events]

The first two arguments (epe and parsed) are required, but the third one (task) is optional. If the task argument is not provided, the validation script will check the files of all three tasks.
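
As a rough illustration of the kind of count the validation step relies on, the sketch below counts the tokens in CoNLL-U text. It is not taken from validate.py: the function name count_conllu_tokens is hypothetical, and the choice to skip multiword-token ranges (IDs such as 1-2) and empty nodes (IDs such as 1.1) is an assumption about what should count as a token. The exact criterion validate.py uses for "comparable" is not specified in this document.

```python
def count_conllu_tokens(lines):
    """Count word lines in CoNLL-U text.

    Skips blank lines, comment lines starting with '#', multiword-token
    ranges (ID like '1-2') and empty nodes (ID like '1.1').
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # The ID is the first tab-separated column of a CoNLL-U line.
        token_id = line.split("\t", 1)[0]
        if token_id.isdigit():
            count += 1
    return count


if __name__ == "__main__":
    sample = [
        "# text = Hi there\n",
        "1\tHi\t_\t_\t_\t_\t_\t_\t_\t_\n",
        "2\tthere\t_\t_\t_\t_\t_\t_\t_\t_\n",
        "\n",
    ]
    print(count_conllu_tokens(sample))
```

To count the tokens of an unpacked file, the function can be passed an open file handle, e.g. `with open(path, encoding="utf-8") as f: n = count_conllu_tokens(f)`.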