Tilde's wrapper system for CollTerm
Contract no. 248347

B-LOC    59.89  81.54  98.45  69.06
I-PERS   65.43  89.83  99.66  75.71
I-LOC    13.60  54.20  98.98  21.74
B-PERS   48.50  90.94  99.33  63.26
I-ORG    23.60  71.35  98.19  35.47
I-TIME   26.14  86.79  99.73  40.18
B-ORG    27.36  89.80  98.54  41.94
B-DATE   46.26  82.98  98.94  59.40
B-TIME   10.68  91.67  99.81  19.13
I-MON    42.27  87.58  99.60  57.02
I-PROD   26.00  57.96  99.35  35.90
B-PROD   29.40  60.22  99.32  39.51

The columns in the tab-separated result file represent, in this exact order: result category, recall, precision, accuracy and F-measure. For full named entities, accuracy results are not given (accuracy can be estimated on single-token performance only, not on multi-token sequences, because the interpretation of non-entities and their possible sequences is ambiguous).

The command line to call the evaluation script is as follows:

perl ./NEEvaluation_v2.pl [1: Gold data directory] [2: Test result directory] [3: Output file]

The script requires three arguments in total, passed in a fixed order:
1. The path of the directory containing the human annotated/gold documents.
2. The path of the directory containing the test result documents.
3. The path to the evaluation result output file.

The script depends on the “NEUtilities.pm” Perl module.

3.1.5.3.5 Tagging and evaluating files in a directory

As the bootstrapping and NE training scripts require tagging of multiple full directories of files (development, test and unlabelled data), the script “NETagDirectory.pl” is provided. The script executes Stanford NER NE classification, NE refinements and, optionally, evaluation on all files in a directory. The files have to be pre-processed (for input data formats refer to sections 3.1.6.3 and 3.1.6.4). The script creates tab-separated NE-tagged files (for the output data format refer to section 3.1.6.5).
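As an illustration of the evaluation result format described earlier (result category, recall, precision, accuracy, F-measure, separated by tabs), the following Python sketch parses one token-level row and recomputes the F-measure from recall and precision. It is not part of the toolkit. The function names are hypothetical, and treating the F-measure as the balanced harmonic mean of recall and precision is an assumption, although it agrees with the B-LOC row shown in the sample output. Rows for full named entities, which carry no accuracy value, are not handled because their layout is not specified here.

```python
def parse_result_line(line):
    """Split one tab-separated, five-column result row into named fields."""
    category, recall, precision, accuracy, f_measure = line.rstrip("\n").split("\t")
    return {
        "category": category,
        "recall": float(recall),
        "precision": float(precision),
        "accuracy": float(accuracy),
        "f_measure": float(f_measure),
    }


def balanced_f_measure(recall, precision):
    """Harmonic mean of recall and precision (assumed definition)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Row taken from the sample output in the text.
sample = "B-LOC\t59.89\t81.54\t98.45\t69.06"
row = parse_result_line(sample)
recomputed = balanced_f_measure(row["recall"], row["precision"])
print(row["category"], round(recomputed, 2))
```

A check of this kind can help confirm that the columns of a result file are being read in the intended order.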
The command line to call the NE-tagging for a single directory is as follows:

perl ./NETagDirectory.pl [1: NER model path] [2: Input directory] [3: Output directory] [4: Input file extension] [5: Output file extension] [6: Tagging property file] [7: Evaluation result file] [8: Refinement order definition string]

D2.6 V3.0
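Because NETagDirectory.pl takes eight positional arguments in a fixed order, it is easy to misplace one when calling it from another program. The following Python sketch assembles the argument list in the documented order. It is only an illustration: the helper name, the keyword names and all paths and values are hypothetical placeholders, and the sketch does not show how to skip the optional evaluation or what a valid refinement order definition string looks like, since neither is specified in this section.

```python
import shlex

# Keyword names (hypothetical) for the eight positional arguments,
# listed in the order given in the text.
TAG_DIRECTORY_ARGS = (
    "ner_model_path",
    "input_directory",
    "output_directory",
    "input_file_extension",
    "output_file_extension",
    "tagging_property_file",
    "evaluation_result_file",
    "refinement_order",
)


def build_tag_directory_command(script="./NETagDirectory.pl", **values):
    """Return the argument list for perl, with arguments in documented order."""
    missing = [name for name in TAG_DIRECTORY_ARGS if name not in values]
    unknown = [name for name in values if name not in TAG_DIRECTORY_ARGS]
    if missing or unknown:
        raise ValueError(f"missing: {missing}, unknown: {unknown}")
    return ["perl", script] + [str(values[name]) for name in TAG_DIRECTORY_ARGS]


# All values below are placeholders, not real paths or settings.
command = build_tag_directory_command(
    ner_model_path="PLACEHOLDER_MODEL_PATH",
    input_directory="PLACEHOLDER_INPUT_DIR",
    output_directory="PLACEHOLDER_OUTPUT_DIR",
    input_file_extension="PLACEHOLDER_IN_EXT",
    output_file_extension="PLACEHOLDER_OUT_EXT",
    tagging_property_file="PLACEHOLDER_PROPERTY_FILE",
    evaluation_result_file="PLACEHOLDER_EVAL_FILE",
    refinement_order="PLACEHOLDER_REFINEMENT_ORDER",
)
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` once the placeholders are replaced with real values. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.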