public class TestDiscrete extends Object
A program that evaluates a Classifier against an oracle Classifier on the objects returned from a Parser.
Usage:
java edu.illinois.cs.cogcomp.lbjava.classify.TestDiscrete [-t <n>] <classifier> <oracle> <parser> <input file> [<null label> [<null label> ...]]
Options: The -t <n> option is similar to the LBJava compiler's command line option of the same name. When <n> is greater than 0, a time stamp is printed to STDOUT after every <n> examples are processed.
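The effect of -t <n> can be pictured with a small loop. This is a minimal sketch, not LBJava's implementation: the class name TimestampSketch and the method timestampPoints are hypothetical, and the method only records the example counts at which a time stamp would be printed.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class TimestampSketch {
    /** Returns the example counts after which a time stamp would be printed. */
    public static List<Integer> timestampPoints(int totalExamples, int n) {
        List<Integer> points = new ArrayList<>();
        for (int processed = 1; processed <= totalExamples; processed++) {
            // Assumption: a non-positive n disables time stamps entirely.
            if (n > 0 && processed % n == 0) {
                points.add(processed);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Hypothetical run: 10 examples with a time stamp every 4 examples.
        for (int count : timestampPoints(10, 4)) {
            System.out.println(count + " examples processed at " + new Date());
        }
    }
}
```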
Input: The first three command line parameters are fully qualified class names, e.g. myPackage.myClassifier. Next, <input file> is passed (as a String) to the constructor of <parser>. The optional parameter <null label> identifies one of the possible labels produced by <oracle> as representing "no classification". It is used during the computation of overall precision, recall, and F1 scores. Finally, it is also assumed that <classifier> is discrete, and that its discreteValue(Object) method is implemented.
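Because <parser> arrives as a fully qualified class name and <input file> is handed to its constructor as a String, the instantiation can be pictured with Java reflection. The sketch below is an assumption about the mechanism, not LBJava's actual code: the helper name instantiateWithString is hypothetical, and java.lang.StringBuilder stands in for a parser class only because it also has a public String constructor.

```java
import java.lang.reflect.Constructor;

public class ReflectiveLoadSketch {
    /** Loads the named class and calls its single-String constructor. */
    public static Object instantiateWithString(String className, String argument)
            throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);
        Constructor<?> constructor = type.getConstructor(String.class);
        return constructor.newInstance(argument);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // StringBuilder is only a stand-in for a parser with a String constructor.
        Object instance = instantiateWithString("java.lang.StringBuilder", "test.data");
        System.out.println(instance.getClass().getName() + " built from " + instance);
    }
}
```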
Output: First, some timing information is presented. The first time reported is the time taken to load the specified classifier's Java class into memory. This reflects the time taken for LBJava to load the classifier's internal representation if the classifier does not make use of the cachedin keyword. Next, the time taken to evaluate the first example is reported. It isn't particularly informative unless the classifier does make use of the cachedin keyword. In that case, it is a better indication than the first time reported of how long LBJava takes to load the classifier's internal representation. Finally, the average time taken to execute the classifier's discreteValue(Object) method is reported.
After the timing information, an ASCII table is written to STDOUT reporting precision, recall, and F1 scores itemized by the values that either the classifier or the oracle produced during the test. The two rightmost columns are named "LCount" and "PCount" (standing for "labeled count" and "predicted count" respectively); they report the number of times the oracle produced each label and the number of times the classifier predicted each label, respectively. If a "null label" is specified, overall precision, recall, and F1 scores and a total count of non-null-labeled examples are reported at the bottom of the table. In the last row, whether a "null label" is specified or not, overall accuracy is reported in the precision column. In the count column, the total number of predictions (or labels, equivalently) is reported.
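The scores in the table follow from three counts per label: how often the oracle produced it (LCount), how often the classifier predicted it (PCount), and how often the two agreed. The following is a minimal sketch of that arithmetic under the standard definitions of precision, recall, and F1. It is not LBJava's code: the class and method names are hypothetical, the counts are made up, returning 0 for an empty denominator is an assumption, and pooling counts over non-null labels is only one plausible reading of "overall".

```java
import java.util.Collections;
import java.util.Set;

public class ScoreTableSketch {
    public static double precision(int correct, int predicted) {
        return predicted == 0 ? 0.0 : (double) correct / predicted;
    }

    public static double recall(int correct, int labeled) {
        return labeled == 0 ? 0.0 : (double) correct / labeled;
    }

    public static double f1(double p, double r) {
        return p + r == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    /** Overall {precision, recall, F1}, pooling counts over labels that are not "null". */
    public static double[] overall(String[] labels, int[] labeled, int[] predicted,
                                   int[] correct, Set<String> nullLabels) {
        int l = 0, p = 0, c = 0;
        for (int i = 0; i < labels.length; i++) {
            if (nullLabels.contains(labels[i])) continue; // skip "no classification"
            l += labeled[i];
            p += predicted[i];
            c += correct[i];
        }
        double pr = precision(c, p);
        double re = recall(c, l);
        return new double[] { pr, re, f1(pr, re) };
    }

    public static void main(String[] args) {
        // Made-up counts for three labels; "O" plays the role of the null label.
        String[] labels = { "PER", "LOC", "O" };
        int[] labeled = { 10, 5, 85 };
        int[] predicted = { 8, 6, 86 };
        int[] correct = { 6, 4, 80 };
        for (int i = 0; i < labels.length; i++) {
            double p = precision(correct[i], predicted[i]);
            double r = recall(correct[i], labeled[i]);
            System.out.printf("%s P=%.3f R=%.3f F1=%.3f LCount=%d PCount=%d%n",
                    labels[i], p, r, f1(p, r), labeled[i], predicted[i]);
        }
        double[] all = overall(labels, labeled, predicted, correct, Collections.singleton("O"));
        System.out.printf("Overall P=%.3f R=%.3f F1=%.3f%n", all[0], all[1], all[2]);
    }
}
```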
Modifier and Type | Field and Description |
---|---|
protected HashMap | correctHistogram: The histogram of correct predictions. |
protected HashMap | goldHistogram: The histogram of correct labels. |
protected HashSet | nullLabels: The set of "null" labels whose statistics are not included in overall precision, recall, F1, or accuracy. |
protected HashMap | predictionHistogram: The histogram of predictions. |
Constructor and Description |
---|
TestDiscrete(): Default constructor. |
Modifier and Type | Method and Description |
---|---|
void | addNull(String n): Adds a label to the set of "null" labels. |
String[] | getAllClasses(): Returns the set of all classes reported as either predictions or labels. |
int | getCorrect(String p): Returns the number of times the requested prediction was reported correctly. |
double | getF(double b, String l): Returns the $F_\beta$ score associated with the given label. |
double | getF1(String l): Returns the F1 score associated with the given label. |
int | getLabeled(String l): Returns the number of times the requested label was reported. |
String[] | getLabels(): Returns the set of labels that have been reported so far. |
double[] | getOverallStats(): Computes the overall statistics: precision, recall, F1, and accuracy. |
double[] | getOverallStats(double b): Computes the overall statistics: precision, recall, $F_\beta$, and accuracy. |
double | getPrecision(String p): Returns the precision associated with the given prediction. |
int | getPredicted(String p): Returns the number of times the requested prediction was reported. |
String[] | getPredictions(): Returns the set of predictions that have been reported so far. |
double | getRecall(String l): Returns the recall associated with the given label. |
boolean | hasNulls(): Returns true iff there exist "null" labels. |
protected void | histogramAdd(HashMap histogram, String key, int amount): Takes a histogram implemented as a map and increments the count for the given key by the given amount. |
protected void | histogramAddAll(HashMap h1, HashMap h2): Takes two histograms implemented as maps and adds the amounts found in the second histogram to the amounts found in the first. |
protected int | histogramGet(HashMap histogram, String key): Takes a histogram implemented as a map and retrieves the count for the given key. |
boolean | isNull(String n): Determines if a label is treated as a "null" label. |
static void | main(String[] args): The entry point of this program. |
void | printPerformance(PrintStream out): Performance results are written to the given stream in the form of precision, recall, and F1 statistics. |
void | removeNull(String n): Removes a label from the set of "null" labels. |
void | reportAll(TestDiscrete t): Report all the predictions in the argument's histograms. |
void | reportPrediction(String p, String l): Whenever a prediction is made, report that prediction and the correct label with this method. |
static TestDiscrete | testDiscrete(Classifier classifier, Classifier oracle, Parser parser): Tests the given discrete classifier against the given oracle using the given parser to provide the labeled testing data. |
static TestDiscrete | testDiscrete(TestDiscrete tester, Classifier classifier, Classifier oracle, Parser parser, boolean output, int outputGranularity): Tests the given discrete classifier against the given oracle using the given parser to provide the labeled testing data. |
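The reporting and histogram methods in this summary suggest simple count bookkeeping. As a rough sketch of that idea (an assumption about the design, not the LBJava source), the hypothetical class below keeps three maps and updates them the way reportPrediction(String, String), histogramAdd, histogramGet, and reportAll are described. It uses generic Map<String, Integer> fields rather than the raw HashMap that appears in the signatures.

```java
import java.util.HashMap;
import java.util.Map;

public class HistogramSketch {
    public final Map<String, Integer> gold = new HashMap<>();
    public final Map<String, Integer> predictions = new HashMap<>();
    public final Map<String, Integer> correct = new HashMap<>();

    /** Increments the count stored under key by amount. */
    public static void add(Map<String, Integer> histogram, String key, int amount) {
        histogram.merge(key, amount, Integer::sum);
    }

    /** Returns the count stored under key, or 0 if the key is absent. */
    public static int get(Map<String, Integer> histogram, String key) {
        return histogram.getOrDefault(key, 0);
    }

    /** Records one prediction p together with its correct label l. */
    public void report(String p, String l) {
        add(predictions, p, 1);
        add(gold, l, 1);
        if (p.equals(l)) {
            add(correct, p, 1);
        }
    }

    /** Adds every count held by the other sketch into this one. */
    public void reportAll(HistogramSketch other) {
        other.gold.forEach((key, count) -> add(gold, key, count));
        other.predictions.forEach((key, count) -> add(predictions, key, count));
        other.correct.forEach((key, count) -> add(correct, key, count));
    }

    public static void main(String[] args) {
        HistogramSketch stats = new HistogramSketch();
        stats.report("A", "A"); // correct prediction
        stats.report("A", "B"); // wrong prediction
        stats.report("B", "B"); // correct prediction
        System.out.println("predicted A: " + get(stats.predictions, "A"));
        System.out.println("labeled B:   " + get(stats.gold, "B"));
        System.out.println("correct A:   " + get(stats.correct, "A"));
    }
}
```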
protected HashMap goldHistogram
protected HashMap predictionHistogram
protected HashMap correctHistogram
protected HashSet nullLabels
public static void main(String[] args)
Parameters:
args - The command line parameters.
public static TestDiscrete testDiscrete(Classifier classifier, Classifier oracle, Parser parser)
Tests the given discrete classifier against the given oracle using the given parser to provide the labeled testing data. Unlike testDiscrete(TestDiscrete,Classifier,Classifier,Parser,boolean,int), this method assumes there are no null predictions and that output should not be generated on STDOUT.
Parameters:
classifier - The classifier to be tested.
oracle - The classifier to test against.
parser - The parser supplying the labeled example objects.
Returns:
A TestDiscrete object filled with testing statistics.
public static TestDiscrete testDiscrete(TestDiscrete tester, Classifier classifier, Classifier oracle, Parser parser, boolean output, int outputGranularity)
Tests the given discrete classifier against the given oracle using the given parser to provide the labeled testing data. If the parser returns examples as Object[]s containing arrays of ints and doubles, as would be the case if pre-extraction was performed, then it is assumed that the example array already includes the label, so it is used directly and the oracle classifier is ignored. In this case, it is also assumed that the given discrete classifier is an instance of Learner, so that a lexicon of label mappings can be retrieved from it.
Parameters:
tester - An object of this class that has already been told via addNull(String) which prediction values are considered to be null predictions.
classifier - The classifier to be tested.
oracle - The classifier to test against.
parser - The parser supplying the labeled example objects.
output - Whether or not to produce output on STDOUT.
outputGranularity - The number of examples processed in between time stamp messages.
Returns:
The TestDiscrete object passed in the first argument, after being filled with statistics.
public void reportPrediction(String p, String l)
Parameters:
p - The prediction.
l - The correct label.
public void reportAll(TestDiscrete t)
Parameters:
t - Another object of this class.
public String[] getLabels()
public String[] getPredictions()
public String[] getAllClasses()
public void addNull(String n)
Parameters:
n - The label to add.
public void removeNull(String n)
Parameters:
n - The label to remove.
public boolean isNull(String n)
Parameters:
n - The label in question.
Returns:
true iff n is one of the "null" labels.
public boolean hasNulls()
Returns:
true iff there exist "null" labels.
protected void histogramAdd(HashMap histogram, String key, int amount)
Parameters:
histogram - The histogram.
key - The key whose count should be incremented.
amount - The amount by which to increment.
protected int histogramGet(HashMap histogram, String key)
Parameters:
histogram - The histogram.
key - The key whose count should be retrieved.
protected void histogramAddAll(HashMap h1, HashMap h2)
Parameters:
h1 - The first histogram, whose values will be modified.
h2 - The second histogram, whose values will be added into the first's.
public int getLabeled(String l)
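Parameters:
l - The label in question.
Returns:
The number of times l was reported.
public int getPredicted(String p)
Parameters:
p - The prediction in question.
Returns:
The number of times p was reported.
public int getCorrect(String p)
Parameters:
p - The prediction in question.
Returns:
The number of times p was reported correctly.
public double getPrecision(String p)
Parameters:
p - The given prediction.
Returns:
The precision associated with p.
public double getRecall(String l)
Parameters:
l - The given label.
Returns:
The recall associated with l.
public double getF1(String l)
Parameters:
l - The given label.
Returns:
The F1 score associated with l.
public double getF(double b, String l)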
$F_\beta = \frac{(\beta^2 + 1) \cdot P \cdot R}{\beta^2 \cdot P + R}$
Parameters:
b - The value of beta.
l - The given label.
Returns:
The $F_\beta$ score associated with l.
public double[] getOverallStats()
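To make the role of beta concrete, the formula given for getF(double, String) can be evaluated directly. This is a small sketch with a hypothetical class name and made-up precision and recall values; returning 0 when the denominator is 0 is an assumption, not documented behavior.

```java
public class FBetaSketch {
    /** F-beta = (beta^2 + 1) * P * R / (beta^2 * P + R). */
    public static double fBeta(double beta, double precision, double recall) {
        double b2 = beta * beta;
        double denominator = b2 * precision + recall;
        // Assumption: treat an empty denominator as a score of 0.
        return denominator == 0.0 ? 0.0 : (b2 + 1) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        double p = 0.75; // made-up precision
        double r = 0.6;  // made-up recall
        System.out.println("F0.5 = " + fBeta(0.5, p, r));
        System.out.println("F1   = " + fBeta(1.0, p, r));
        System.out.println("F2   = " + fBeta(2.0, p, r));
    }
}
```

With beta = 1 the formula reduces to the usual F1 score, 2PR / (P + R); values of beta above 1 weight recall more heavily, and values below 1 weight precision more heavily.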
public double[] getOverallStats(double b)
Parameters:
b - The value of beta.
public void printPerformance(PrintStream out)
Parameters:
out - The stream to write to.
Copyright © 2016. All rights reserved.