Contract no. 248347
B-LOC     59.89   81.54   98.45   69.06
I-PERS    65.43   89.83   99.66   75.71
I-LOC     13.60   54.20   98.98   21.74
B-PERS    48.50   90.94   99.33   63.26
I-ORG     23.60   71.35   98.19   35.47
I-TIME    26.14   86.79   99.73   40.18
B-ORG     27.36   89.80   98.54   41.94
B-DATE    46.26   82.98   98.94   59.40
B-TIME    10.68   91.67   99.81   19.13
I-MON     42.27   87.58   99.60   57.02
I-PROD    26.00   57.96   99.35   35.90
B-PROD    29.40   60.22   99.32   39.51
The columns in the tab-separated result file represent, in this exact order:
result category, recall, precision, accuracy and F-measure. For full named entities, accuracy
results are not given: accuracy can be estimated only on single-token performance, not
on multi-token sequences, because the interpretation of non-entities and their possible
sequences is ambiguous.
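The column order can be illustrated with a short Python sketch that splits one result line into its five fields and recomputes the F-measure from recall and precision. The function names are hypothetical and not part of the toolkit. The sketch assumes the balanced F-measure (harmonic mean of recall and precision), which is consistent with the values listed above, and it handles only token-level lines that carry all five columns.

```python
def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure (harmonic mean) of recall and precision, in percent."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def parse_result_line(line: str) -> dict:
    """Split one tab-separated result line into the five documented columns.

    Assumption: the line holds exactly five fields (token-level categories);
    lines for full named entities, which carry no accuracy, are not handled.
    """
    category, recall, precision, accuracy, f1 = line.rstrip("\n").split("\t")
    return {
        "category": category,
        "recall": float(recall),
        "precision": float(precision),
        "accuracy": float(accuracy),
        "f_measure": float(f1),
    }


# The B-LOC values from the results listed above, joined with tabs.
row = parse_result_line("B-LOC\t59.89\t81.54\t98.45\t69.06")
recomputed = round(f_measure(row["recall"], row["precision"]), 2)
print(row["category"], row["f_measure"], recomputed)
```

Recomputing the last column in this way is a cheap sanity check that a result file has been read with the columns in the right order.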
The command line to call the evaluation script is as follows:
perl ./NEEvaluation_v2.pl [1: Gold data directory] [2: Test result
directory] [3: Output file]
The script requires exactly three arguments, passed in a fixed order:
1. The path of the directory containing the human annotated/gold documents.
2. The path of the directory containing the test result documents.
3. The path to the evaluation result output file.
The script depends on the “NEUtilities.pm” Perl module.
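When the evaluation is driven from another program, the fixed argument order can be captured in a small wrapper. The following Python sketch is an illustration only: the function names and the example paths are hypothetical, and it assumes that perl is on the PATH and that the script is in the current working directory.

```python
import subprocess
from pathlib import Path


def build_evaluation_command(gold_dir, test_dir, output_file,
                             script="./NEEvaluation_v2.pl"):
    """Return the argument list in the documented order:
    1. gold data directory, 2. test result directory, 3. output file."""
    return ["perl", script, str(gold_dir), str(test_dir), str(output_file)]


def run_evaluation(gold_dir, test_dir, output_file):
    """Check that both input directories exist, then call the script."""
    for directory in (gold_dir, test_dir):
        if not Path(directory).is_dir():
            raise NotADirectoryError(str(directory))
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(
        build_evaluation_command(gold_dir, test_dir, output_file),
        check=True,
    )


# Hypothetical paths, shown only to illustrate the argument order.
command = build_evaluation_command("data/gold", "data/tagged", "eval_results.txt")
print(" ".join(command))
```

Passing the arguments as a list rather than as a single shell string avoids quoting problems when a directory name contains spaces.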
3.1.5.3.5 Tagging and evaluating files in a directory
As the bootstrapping and NE training scripts require tagging of multiple full directories of
files (development, test and unlabelled data), the script “NETagDirectory.pl” is provided.
The script executes Stanford NER NE classification, NE refinements and, optionally,
evaluation on all files in a directory. The files have to be pre-processed (for input data
formats refer to sections 3.1.6.3 and 3.1.6.4). The script creates tab-separated NE-tagged
files (for the output data format refer to section 3.1.6.5).
The command line to call the NE-tagging for a single directory is as follows:
perl ./NETagDirectory.pl [1: NER model path] [2: Input directory] [3:
Output directory] [4: Input file extension] [5: Output file extension] [6:
Tagging property file] [7: Evaluation result file] [8: Refinement order
definition string]
D2.6 V3.0