LBJava User Guide - Cognitive Computation Group

Label                    Precision  Recall      F1  LCount  PCount
------------------------------------------------------------------
alt.atheism                 80.000  80.000  80.000      80      80
comp.graphics               78.814  77.500  78.151     120     118
comp.os.ms-windows.misc     80.198  79.412  79.803     102     101
comp.sys.ibm.pc.hardware    74.074  79.208  76.555     101     108
comp.sys.mac.hardware       80.000  77.551  78.756      98      95
comp.windows.x              82.955  85.882  84.393      85      88
misc.forsale                70.588  80.769  75.336     104     119
rec.autos                   77.551  89.063  82.909     128     147
rec.motorcycles             78.571  84.615  81.481     104     112
rec.sport.baseball          81.197  91.346  85.973     104     117
rec.sport.hockey            90.291  90.291  90.291     103     103
sci.crypt                   90.816  85.577  88.119     104      98
sci.electronics             77.570  85.567  81.373      97     107
sci.med                     83.019  88.000  85.437     100     106
sci.space                   91.837  78.947  84.906     114      98
soc.religion.christian      84.946  79.000  81.865     100      93
talk.politics.guns          86.747  72.727  79.121      99      83
talk.politics.mideast       91.262  89.524  90.385     105     103
talk.politics.misc          85.915  76.250  80.795      80      71
talk.religion.misc          86.792  63.889  73.600      72      53
------------------------------------------------------------------
Accuracy                    82.150                    2000
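The Precision, Recall, and F1 columns in a report like this one are consistent with the standard definitions: precision is the number of correct predictions of a label divided by PCount, recall is that same number divided by LCount, and F1 is their harmonic mean. The following is a minimal sketch of that arithmetic, not LBJava code. The class LabelMetrics and its methods are hypothetical names, and the count of 93 correct predictions for comp.graphics is inferred from the printed percentages, since the report does not print it.

```java
import java.util.Locale;

// Hypothetical helper, not part of LBJava: recomputes one report row from raw counts.
class LabelMetrics {
    /** Precision in percent: correct predictions of a label over all predictions of it (PCount). */
    static double precision(int correct, int pCount) {
        return pCount == 0 ? 0.0 : 100.0 * correct / pCount;
    }

    /** Recall in percent: correct predictions of a label over all examples carrying it (LCount). */
    static double recall(int correct, int lCount) {
        return lCount == 0 ? 0.0 : 100.0 * correct / lCount;
    }

    /** F1: harmonic mean of precision and recall (same units as the inputs). */
    static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    public static void main(String[] args) {
        int correct = 93;   // assumption: inferred from the row, not printed in the report
        int lCount = 120;   // comp.graphics LCount
        int pCount = 118;   // comp.graphics PCount
        double p = precision(correct, pCount);
        double r = recall(correct, lCount);
        System.out.println(String.format(Locale.ROOT, "%.3f %.3f %.3f", p, r, f1(p, r)));
    }
}
```

Run as written, this prints 78.814 77.500 78.151, matching the comp.graphics row.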
The TestDiscrete class also supports the notion of a null label, which is a label intended to
represent the absence of a prediction. The 20 Newsgroups task doesn’t make use of this concept,
but if our task were, e.g., named entity classification, in which every phrase is potentially a named
entity, then the classifier would likely output a prediction that we interpret as meaning “this phrase
is not a named entity.” In that case, we would also be interested in overall precision, recall, and F1
scores aggregated over the non-null labels. On the TestDiscrete command line, all arguments
after the four we’ve already seen are optional null labels. The output with a single null label
“O” might look like this (note the Overall row at the bottom):
Label     Precision  Recall      F1  LCount  PCount
---------------------------------------------------
LOC          88.453  87.153  87.798    1837    1810
MISC         83.601  79.067  81.271     922     872
ORG          76.226  76.510  76.368    1341    1346
PER          86.554  88.762  87.644    1842    1889
---------------------------------------------------
O             0.000   0.000   0.000     581     606
---------------------------------------------------
Overall      84.350  83.995  84.172    5942    5917
Accuracy     76.514                    6523
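The Overall row is consistent with pooling the counts of every non-null label before dividing, rather than averaging the per-label percentages. The sketch below illustrates that aggregation under this assumption. It is not LBJava's implementation: OverallMetrics and overall are hypothetical names, and the per-label correct counts (1601, 729, 1026, 1635) are inferred from the printed percentages, since the report does not list them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of LBJava: pools counts over the non-null labels.
class OverallMetrics {
    /** Returns {precision, recall, F1} in percent, pooled over all labels not in nullLabels. */
    static double[] overall(String[] labels, int[] correct, int[] lCount, int[] pCount,
                            Set<String> nullLabels) {
        long c = 0, l = 0, p = 0;
        for (int i = 0; i < labels.length; i++) {
            if (nullLabels.contains(labels[i])) continue; // null labels are left out entirely
            c += correct[i];
            l += lCount[i];
            p += pCount[i];
        }
        double precision = p == 0 ? 0.0 : 100.0 * c / p;
        double recall = l == 0 ? 0.0 : 100.0 * c / l;
        double sum = precision + recall;
        double f1 = sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        String[] labels = {"LOC", "MISC", "ORG", "PER", "O"};
        int[] correct = {1601, 729, 1026, 1635, 0}; // assumption: inferred; the O entry is never read
        int[] lCount = {1837, 922, 1341, 1842, 581};
        int[] pCount = {1810, 872, 1346, 1889, 606};
        Set<String> nullLabels = new HashSet<>(Arrays.asList("O"));

        double[] m = overall(labels, correct, lCount, pCount, nullLabels);
        System.out.println(String.format(Locale.ROOT, "%.3f %.3f %.3f", m[0], m[1], m[2]));
    }
}
```

Run as written, this prints 84.350 83.995 84.172, matching the Overall row. The pooled LCount and PCount (5942 and 5917) are likewise the sums of the four non-null rows.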