edu.illinois.cs.cogcomp.lbj.coref.scorers
Class BCubedBase

java.lang.Object
  extended by edu.illinois.cs.cogcomp.lbj.coref.scorers.Scorer<ChainSolution<T>>
      extended by edu.illinois.cs.cogcomp.lbj.coref.scorers.ChainScorer<Mention>
          extended by edu.illinois.cs.cogcomp.lbj.coref.scorers.BCubedBase
Direct Known Subclasses:
BCubedUniformPerMentionBase

public abstract class BCubedBase
extends ChainScorer<Mention>

Base class for scorers that implement some version of Bagga and Baldwin's B-Cubed scoring algorithm. See A. Bagga and B. Baldwin (MUC-7, 1998).

Author:
Eric Bengtson

Constructor Summary
protected BCubedBase()
          Default constructor.
 
Method Summary
protected  java.util.List<java.util.Set<Mention>> getPartition(java.util.Set<Mention> keyChain, ChainSolution<Mention> predSol)
          Partitions the key chain into a list of sets such that each set in the result contains elements that are together in a chain in the predicted solution.
 double getPrecision(ChainSolution<Mention> key, ChainSolution<Mention> pred)
          Computes the B-Cubed precision for a chain solution.
abstract  double getPrecision(java.util.List<ChainSolution<Mention>> keys, java.util.List<ChainSolution<Mention>> preds)
          Computes the B-Cubed precision for a collection of documents.
 double getRecall(ChainSolution<Mention> key, ChainSolution<Mention> pred)
          Computes the B-Cubed recall for a chain solution.
abstract  double getRecall(java.util.List<ChainSolution<Mention>> keys, java.util.List<ChainSolution<Mention>> preds)
          Computes the B-Cubed recall for a collection of documents.
 Score getScore(ChainSolution<Mention> key, ChainSolution<Mention> pred)
          Computes the B-Cubed F-Score for a chain solution.
abstract  Score getScore(java.util.List<ChainSolution<Mention>> keys, java.util.List<ChainSolution<Mention>> preds)
          Computes the B-Cubed F-Score for a collection of documents.
protected  boolean haveSameMembers(ChainSolution<Mention> sol1, ChainSolution<Mention> sol2)
          Determines whether the specified solutions have exactly the same set of mentions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BCubedBase

protected BCubedBase()
Default constructor.

Method Detail

getScore

public Score getScore(ChainSolution<Mention> key,
                      ChainSolution<Mention> pred)
Computes the B-Cubed F-Score for a chain solution.

The precision of a chain solution is the average of the precisions of all mentions in the solution. The precision of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the predicted cluster containing m.

The recall of a chain solution is the average of the recalls of all mentions in the solution. The recall of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the true cluster containing m.

The B-Cubed F-Score is the harmonic mean of the precision and recall defined above.

Overrides:
getScore in class ChainScorer<Mention>
Parameters:
key - The true (gold standard) solution.
pred - The predicted solution.
Returns:
The B-Cubed F-Score.
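
The per-mention definitions above can be illustrated with a minimal sketch. It does not use this package's classes: plain strings stand in for Mention, a list of sets stands in for ChainSolution, and the names BCubedSketch, chainOf, overlap, average and f1 are hypothetical. It also assumes that both solutions are non-empty and contain exactly the same mentions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: strings and sets stand in for Mention and ChainSolution.
final class BCubedSketch {

    // Maps each mention to the chain (cluster) that contains it.
    static Map<String, Set<String>> chainOf(List<Set<String>> chains) {
        Map<String, Set<String>> byMention = new HashMap<>();
        for (Set<String> chain : chains) {
            for (String mention : chain) {
                byMention.put(mention, chain);
            }
        }
        return byMention;
    }

    // Counts the mentions that two chains share.
    static int overlap(Set<String> a, Set<String> b) {
        int shared = 0;
        for (String mention : a) {
            if (b.contains(mention)) {
                shared++;
            }
        }
        return shared;
    }

    // Averages overlap / denominator over all mentions. The denominator is the
    // predicted chain size for precision and the key chain size for recall.
    static double average(List<Set<String>> key, List<Set<String>> pred,
                          boolean divideByPredicted) {
        Map<String, Set<String>> keyOf = chainOf(key);
        Map<String, Set<String>> predOf = chainOf(pred);
        double sum = 0.0;
        for (String mention : keyOf.keySet()) {
            Set<String> keyChain = keyOf.get(mention);
            Set<String> predChain = predOf.get(mention);
            int denominator = divideByPredicted ? predChain.size() : keyChain.size();
            sum += (double) overlap(keyChain, predChain) / denominator;
        }
        return sum / keyOf.size();
    }

    static double precision(List<Set<String>> key, List<Set<String>> pred) {
        return average(key, pred, true);
    }

    static double recall(List<Set<String>> key, List<Set<String>> pred) {
        return average(key, pred, false);
    }

    // Harmonic mean of precision and recall; returning 0 when both are 0 is an
    // assumption of this sketch.
    static double f1(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        List<Set<String>> key = List.of(Set.of("a", "b", "c", "d"), Set.of("e"));
        List<Set<String>> pred = List.of(Set.of("a", "b"), Set.of("c", "d", "e"));
        double p = precision(key, pred);
        double r = recall(key, pred);
        System.out.println(p + " " + r + " " + f1(p, r));
    }
}
```

For the key {a, b, c, d}, {e} and the prediction {a, b}, {c, d, e}, the per-mention precisions are 1, 1, 2/3, 2/3 and 1/3 (average 11/15), the per-mention recalls are 1/2, 1/2, 1/2, 1/2 and 1 (average 3/5), and the harmonic mean is 0.66.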

getPrecision

public double getPrecision(ChainSolution<Mention> key,
                           ChainSolution<Mention> pred)
Computes the B-Cubed precision for a chain solution.

The precision of a chain solution is the average of the precisions of all mentions in the solution. The precision of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the predicted cluster containing m.

Parameters:
key - The true (gold standard) solution.
pred - The predicted solution.
Returns:
The B-Cubed precision.

getRecall

public double getRecall(ChainSolution<Mention> key,
                        ChainSolution<Mention> pred)
Computes the B-Cubed recall for a chain solution.

The recall of a chain solution is the average of the recalls of all mentions in the solution. The recall of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the true cluster containing m.

Parameters:
key - The true (gold standard) solution.
pred - The predicted solution.
Returns:
The B-Cubed recall.

getScore

public abstract Score getScore(java.util.List<ChainSolution<Mention>> keys,
                               java.util.List<ChainSolution<Mention>> preds)
Computes the B-Cubed F-Score for a collection of documents.

The precision of a collection of documents is the average of the precisions of all mentions in all documents. The precision of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the predicted cluster containing m.

The recall of a collection of documents is the average of the recalls of all mentions in all documents. The recall of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the true cluster containing m.

The B-Cubed F-Score is the harmonic mean of the precision and recall defined above.

Specified by:
getScore in class ChainScorer<Mention>
Parameters:
keys - A collection of true (gold standard) solutions (for example, one per document)
preds - A collection of predicted solutions (for example, one per document)
Returns:
The B-Cubed F-Score.
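
Because this method is abstract, the exact weighting is left to subclasses. As a rough illustration of what "the average over all mentions in all documents" means, the sketch below pools per-mention scores across documents and contrasts the result with an average of per-document averages. The class and method names are hypothetical, the per-mention scores are supplied directly as numbers, and returning 0 for an empty input is an assumption of the sketch.

```java
// Illustrative only: contrasts two ways of combining per-mention scores.
final class CollectionAverageSketch {

    // Averages per-mention scores over all mentions in all documents, so a
    // document with more mentions carries more weight.
    static double perMentionAverage(double[][] scoresByDocument) {
        double sum = 0.0;
        int count = 0;
        for (double[] document : scoresByDocument) {
            for (double score : document) {
                sum += score;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // For contrast: averages the per-document averages, so every non-empty
    // document carries the same weight regardless of its size.
    static double perDocumentAverage(double[][] scoresByDocument) {
        double sum = 0.0;
        int documents = 0;
        for (double[] document : scoresByDocument) {
            if (document.length == 0) {
                continue; // skip empty documents
            }
            double documentSum = 0.0;
            for (double score : document) {
                documentSum += score;
            }
            sum += documentSum / document.length;
            documents++;
        }
        return documents == 0 ? 0.0 : sum / documents;
    }

    public static void main(String[] args) {
        double[][] scores = { {1.0, 1.0, 1.0, 1.0}, {0.5} };
        System.out.println(perMentionAverage(scores));
        System.out.println(perDocumentAverage(scores));
    }
}
```

With four mentions scoring 1.0 in one document and a single mention scoring 0.5 in another, pooling over mentions gives 4.5 / 5 = 0.9, whereas averaging the two document averages gives 0.75.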

getPrecision

public abstract double getPrecision(java.util.List<ChainSolution<Mention>> keys,
                                    java.util.List<ChainSolution<Mention>> preds)
Computes the B-Cubed precision for a collection of documents.

The precision of a collection of documents is the average of the precisions of all mentions in all documents. The precision of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the predicted cluster containing m.

Parameters:
keys - A collection of true (gold standard) solutions (for example, one per document)
preds - A collection of predicted solutions (for example, one per document)
Returns:
The B-Cubed precision.

getRecall

public abstract double getRecall(java.util.List<ChainSolution<Mention>> keys,
                                 java.util.List<ChainSolution<Mention>> preds)
Computes the B-Cubed recall for a collection of documents.

The recall of a collection of documents is the average of the recalls of all mentions in all documents. The recall of a mention m is calculated as the number of mentions correctly predicted to be in the same cluster as m (including m) divided by the number of mentions in the true cluster containing m.

Parameters:
keys - A collection of true (gold standard) solutions (for example, one per document)
preds - A collection of predicted solutions (for example, one per document)
Returns:
The B-Cubed recall.

getPartition

protected java.util.List<java.util.Set<Mention>> getPartition(java.util.Set<Mention> keyChain,
                                                              ChainSolution<Mention> predSol)
Partitions the key chain into a list of sets such that each set in the result contains elements that are together in a chain in the predicted solution.

Parameters:
keyChain - A chain (cluster) from the key solution.
predSol - The predicted solution.
Returns:
A partition of the elements from the key chain.
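
The partitioning described above can be sketched as follows. This is not the library's implementation: strings stand in for Mention, a list of sets stands in for the predicted ChainSolution, and the names PartitionSketch and partition are hypothetical. The sketch assumes every mention of the key chain appears in some predicted chain; any mention that does not is simply left out of the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: splits one key chain along the predicted chains.
final class PartitionSketch {

    static List<Set<String>> partition(Set<String> keyChain,
                                       List<Set<String>> predictedChains) {
        List<Set<String>> parts = new ArrayList<>();
        for (Set<String> predictedChain : predictedChains) {
            // Keep only the key-chain mentions that share this predicted chain.
            Set<String> part = new LinkedHashSet<>(keyChain);
            part.retainAll(predictedChain);
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Set<String> keyChain = Set.of("a", "b", "c", "d");
        List<Set<String>> predicted = List.of(Set.of("a", "b"), Set.of("c", "d", "e"));
        System.out.println(partition(keyChain, predicted));
    }
}
```

For the key chain {a, b, c, d} and predicted chains {a, b} and {c, d, e}, the result holds the two sets {a, b} and {c, d}; the mention e is not part of the key chain and therefore does not appear.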

haveSameMembers

protected boolean haveSameMembers(ChainSolution<Mention> sol1,
                                  ChainSolution<Mention> sol2)
Determines whether the specified solutions have exactly the same set of mentions.

Parameters:
sol1 - One solution.
sol2 - Another solution.
Returns:
true if both solutions contain exactly the same set of mentions; false otherwise.
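
A minimal sketch of such a membership check, under the same stand-in assumptions (strings for Mention, a list of sets for ChainSolution, hypothetical names SameMembersSketch and allMentions), could look like this. It compares only which mentions occur, not how they are grouped into chains.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: compares the mention sets of two solutions.
final class SameMembersSketch {

    // Collects every mention that occurs in any chain of the solution.
    static Set<String> allMentions(List<Set<String>> solution) {
        Set<String> mentions = new HashSet<>();
        for (Set<String> chain : solution) {
            mentions.addAll(chain);
        }
        return mentions;
    }

    static boolean haveSameMembers(List<Set<String>> sol1, List<Set<String>> sol2) {
        return allMentions(sol1).equals(allMentions(sol2));
    }

    public static void main(String[] args) {
        List<Set<String>> key = List.of(Set.of("a", "b"), Set.of("c"));
        List<Set<String>> pred = List.of(Set.of("a"), Set.of("b", "c"));
        System.out.println(haveSameMembers(key, pred));
    }
}
```

Two solutions that cluster the mentions a, b and c differently still have the same members, while a solution that is missing c does not.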