edu.illinois.cs.cogcomp.lbj.coref.decoders
Class BestLinkDecoder

java.lang.Object
  extended by edu.illinois.cs.cogcomp.lbj.coref.decoders.DecoderWithOptions<ST>
      extended by edu.illinois.cs.cogcomp.lbj.coref.decoders.ChainSolutionDecoder<ChainSolution<Mention>,Mention,CExample>
          extended by edu.illinois.cs.cogcomp.lbj.coref.decoders.CorefDecoder
              extended by edu.illinois.cs.cogcomp.lbj.coref.decoders.ScoredCorefDecoder
                  extended by edu.illinois.cs.cogcomp.lbj.coref.decoders.BestLinkDecoder
All Implemented Interfaces:
SolutionDecoder<ChainSolution<Mention>>, java.io.Serializable

public class BestLinkDecoder
extends ScoredCorefDecoder
implements java.io.Serializable

Translates classification decisions into a collection of coreference equivalence classes, in the form of a ChainSolution, via the decode method according to the best link decoding algorithm. Under best link decoding, each mention m is linked to its highest-scoring preceding mention a only if ScoredCorefDecoder.predictedCoreferential(CExample) returns true for the example doc.getCExampleFor(a, m). Several options modify the behavior of the best link decoding algorithm; see the relevant setter methods for details.
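
The selection rule described above can be illustrated with a minimal, self-contained sketch. It is not the library's code: mentions are reduced to integer positions, PairScorer is a hypothetical stand-in for the scoring classifier, and a plain threshold comparison is assumed in place of ScoredCorefDecoder.predictedCoreferential(CExample).

```java
import java.util.Arrays;

/** Illustrative sketch of best-link selection; all names here are hypothetical. */
class BestLinkSketch {
    /** Hypothetical pairwise scorer over mention positions (candidate a, mention m). */
    interface PairScorer {
        double score(int a, int m);
    }

    /**
     * For each mention m, returns the position of the highest-scoring preceding
     * mention if its score exceeds the threshold, or -1 when no link is made.
     */
    static int[] bestLinks(int numMentions, PairScorer scorer, double threshold) {
        int[] antecedent = new int[numMentions];
        Arrays.fill(antecedent, -1);
        for (int m = 1; m < numMentions; m++) {
            int best = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            // Find the single highest-scoring preceding mention.
            for (int a = 0; a < m; a++) {
                double s = scorer.score(a, m);
                if (s > bestScore) {
                    bestScore = s;
                    best = a;
                }
            }
            // Assumption: "predicted coreferential" means the score beats the threshold.
            if (best >= 0 && bestScore > threshold) {
                antecedent[m] = best;
            }
        }
        return antecedent;
    }
}
```

At most one link is produced per mention, so a mention whose best candidate fails the check stays unlinked even if the sketch were extended with lower-ranked candidates.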

Author:
Eric Bengtson
See Also:
Serialized Form

Field Summary
protected  boolean m_allowCataphora
          Whether to allow cataphora.
protected  boolean m_experimental
          Currently does nothing.
protected  boolean m_preventLongDistPRO
          Whether to prevent long distance pronoun reference.
protected  java.io.PrintStream m_scoresLog
          Holds the optional scores log.
private static long serialVersionUID
           
 
Fields inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.ScoredCorefDecoder
m_cacheScores, m_recordAllScores, m_scorer
 
Fields inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.CorefDecoder
m_classifier
 
Fields inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.DecoderWithOptions
m_options, m_train
 
Constructor Summary
BestLinkDecoder(LBJ2.learn.LinearThresholdUnit scorer)
          Constructor for the case where a scoring classifier has had its threshold set.
BestLinkDecoder(LBJ2.learn.LinearThresholdUnit scorer, LBJ2.classify.Classifier decider)
          Constructor for use when the scoring classifier is not sufficient to decide whether links should be made, such as when inference is being applied.
 
Method Summary
 ChainSolution<Mention> decode(Doc doc)
          Takes the mentions in the specified document and produces a collection of coreference equivalence classes.
 java.lang.String getStatsString()
          Enables recorded statistics to be returned.
 void processOption(java.lang.String option, java.lang.String value)
          Processes the options by calling super and calling the dedicated methods for setting specific options.
protected  void recordStatsFor(CExample ex)
          Enables the recording of data about coreference examples as they are used in the decoding algorithm.
 void setAllowPronounCataphora(boolean allow)
          Specifies whether to allow pronoun cataphora. Specifically, if allow is false, a pronoun cannot take a referent that appears after the pronoun.
 void setPreventLongDistPRO(boolean prevent)
          Specifies whether to limit pronoun reference to within a small number of sentences.
 
Methods inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.ScoredCorefDecoder
getEdgeLabels, getFeatureWeights, getMinimumPartsOfCompounds, getScorer, getSumsOfParts, getTotalsOfPartsOfCompounds, getTrueScore, predictedCoreferential, setScorer, setup
 
Methods inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.CorefDecoder
getClassifier, setClassifier
 
Methods inherited from class edu.illinois.cs.cogcomp.lbj.coref.decoders.DecoderWithOptions
getBooleanOption, getOption, getRealOption, setOption, setOption, setOption, setTrain
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

private static final long serialVersionUID
See Also:
Constant Field Values

m_allowCataphora

protected boolean m_allowCataphora
Whether to allow cataphora.


m_preventLongDistPRO

protected boolean m_preventLongDistPRO
Whether to prevent long distance pronoun reference.


m_experimental

protected boolean m_experimental
Currently does nothing.


m_scoresLog

protected java.io.PrintStream m_scoresLog
Holds the optional scores log.

Constructor Detail

BestLinkDecoder

public BestLinkDecoder(LBJ2.learn.LinearThresholdUnit scorer)
Constructor for the case where a scoring classifier has had its threshold set.

Parameters:
scorer - A scoring classifier (specifically, a LinearThresholdUnit), whose threshold should be set using its setThreshold method. scorer's discreteValue takes CExamples and returns "true" or "false". It also provides scores for the "true" value.

BestLinkDecoder

public BestLinkDecoder(LBJ2.learn.LinearThresholdUnit scorer,
                       LBJ2.classify.Classifier decider)
Constructor for use when the scoring classifier is not sufficient to decide whether links should be made, such as when inference is being applied. Both scorer and decider must return "true" for an example to be considered coreferential.

Parameters:
scorer - Determines the score or confidence. Takes CExamples and returns a score.
decider - Final arbiter of linking decisions. Takes CExamples and returns "true" or "false".
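
The requirement that both scorer and decider agree can be sketched as follows. This is an illustration only: the generic type E stands in for CExample, the functional interfaces stand in for the LBJ classifiers, and treating the scorer's "true" as a score above a threshold is an assumption.

```java
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

/** Illustrative sketch of a two-stage linking decision; names are hypothetical. */
class TwoStageLinkSketch<E> {
    private final ToDoubleFunction<E> scorer; // stands in for the scoring classifier
    private final Predicate<E> decider;       // stands in for the final arbiter
    private final double threshold;           // assumed decision boundary of the scorer

    TwoStageLinkSketch(ToDoubleFunction<E> scorer, Predicate<E> decider, double threshold) {
        this.scorer = scorer;
        this.decider = decider;
        this.threshold = threshold;
    }

    /** An example counts as coreferential only when both components say so. */
    boolean predictedCoreferential(E example) {
        return scorer.applyAsDouble(example) > threshold && decider.test(example);
    }
}
```
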
Method Detail

decode

public ChainSolution<Mention> decode(Doc doc)
Takes the mentions in the specified document and produces a collection of coreference equivalence classes. Under best link decoding, each mention m is linked to its highest-scoring preceding mention a only if ScoredCorefDecoder.predictedCoreferential(CExample) returns true for the example doc.getCExampleFor(a, m). Note: Several options bypass the ScoredCorefDecoder.predictedCoreferential(CExample) method; in these cases, links may still be made even when a decider returns false, possibly interfering with successful inference.

Specified by:
decode in interface SolutionDecoder<ChainSolution<Mention>>
Specified by:
decode in class ChainSolutionDecoder<ChainSolution<Mention>,Mention,CExample>
Parameters:
doc - a document whose mentions will be placed in coreference classes.
Returns:
A ChainSolution representing the coreference equivalence classes as chains. Links established between mentions will also be given labels in the solution.
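
How pairwise links give rise to equivalence classes can be sketched with a small union-find. This is only an illustration: mentions are integer positions, the antecedent array (with -1 meaning "no link") is a hypothetical input format, and the real ChainSolution also stores link labels, which are omitted here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: grouping linked mentions into chains; names are hypothetical. */
class ChainBuilderSketch {
    /** Finds the representative of x, shortening paths along the way. */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** antecedent[m] is the mention linked to m, or -1 if m has no link. */
    static List<List<Integer>> chains(int[] antecedent) {
        int n = antecedent.length;
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        // Merge the classes of each linked pair.
        for (int m = 0; m < n; m++) {
            if (antecedent[m] >= 0) {
                parent[find(parent, m)] = find(parent, antecedent[m]);
            }
        }
        // Collect mentions by representative, preserving document order.
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int m = 0; m < n; m++) {
            byRoot.computeIfAbsent(find(parent, m), k -> new ArrayList<>()).add(m);
        }
        return new ArrayList<>(byRoot.values());
    }
}
```

Unlinked mentions end up in singleton chains, so every mention belongs to exactly one class.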

setAllowPronounCataphora

public void setAllowPronounCataphora(boolean allow)
Specifies whether to allow pronoun cataphora. Specifically, if allow is false, a pronoun cannot take a referent that appears after the pronoun.

Parameters:
allow - Whether to allow a pronoun to refer to mentions appearing after the pronoun.
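
One way to picture the effect of this option is as a change to the set of candidate referents considered for a pronoun. The sketch below is an assumption about the behavior, not the library's code: mentions are integer positions, and the library may apply further restrictions that are not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of candidate selection with optional cataphora; hypothetical. */
class CataphoraSketch {
    /** Returns candidate referent positions for mention m among n mentions. */
    static List<Integer> candidates(int m, int n, boolean isPronoun, boolean allowCataphora) {
        List<Integer> result = new ArrayList<>();
        // Preceding mentions are always candidates.
        for (int a = 0; a < m; a++) {
            result.add(a);
        }
        // Following mentions are candidates only for pronouns when cataphora is allowed.
        if (isPronoun && allowCataphora) {
            for (int a = m + 1; a < n; a++) {
                result.add(a);
            }
        }
        return result;
    }
}
```
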

setPreventLongDistPRO

public void setPreventLongDistPRO(boolean prevent)
Specifies whether to limit pronoun reference to within a small number of sentences. Specifically, if prevent is true, a pronoun cannot take an antecedent from any sentence earlier than the previous sentence.

Parameters:
prevent - Whether to prevent long-distance pronoun reference.
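
The sentence-distance rule stated above can be written as a small predicate. This is a sketch under assumptions: sentence positions are zero-based integers supplied by the caller, and the method name and parameters are hypothetical rather than part of the library.

```java
/** Illustrative sketch of the long-distance pronoun restriction; hypothetical. */
class PronounDistanceSketch {
    /**
     * Returns whether a mention in pronounSentence may take an antecedent
     * located in antecedentSentence.
     */
    static boolean allowed(boolean isPronoun, int pronounSentence,
                           int antecedentSentence, boolean prevent) {
        if (!prevent || !isPronoun) {
            return true; // the restriction applies only to pronouns, and only when enabled
        }
        // Permit the pronoun's own sentence or the one immediately before it.
        return pronounSentence - antecedentSentence <= 1;
    }
}
```
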

processOption

public void processOption(java.lang.String option,
                          java.lang.String value)
Processes the options by calling super and calling the dedicated methods for setting specific options. Also, if scoreslog is set, constructs the m_scoresLog print stream.

Overrides:
processOption in class DecoderWithOptions<ChainSolution<Mention>>
Parameters:
option - The name of the option, which is generally all lowercase.
value - The value, which may be the string representation of a boolean or real value (in a format supported by Boolean.parseBoolean(java.lang.String) or Double.parseDouble(java.lang.String)) or any arbitrary string.
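
The dispatch pattern described for processOption might look roughly like the sketch below. The option names "allowcataphora" and "preventlongdistpro" are invented for illustration, since this page does not list the actual option names; the map stands in for the call to super, and construction of the scores log is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of option dispatch; option names are hypothetical. */
class OptionSketch {
    final Map<String, String> options = new HashMap<>();
    boolean allowCataphora;
    boolean preventLongDistPro;

    void processOption(String option, String value) {
        options.put(option, value); // stands in for super.processOption(option, value)
        switch (option) {
            case "allowcataphora": // hypothetical option name
                allowCataphora = Boolean.parseBoolean(value);
                break;
            case "preventlongdistpro": // hypothetical option name
                preventLongDistPro = Boolean.parseBoolean(value);
                break;
            default:
                break; // other options are only stored
        }
    }
}
```

Boolean.parseBoolean returns true only for the string "true" (ignoring case), so values such as "yes" or "1" are read as false.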

recordStatsFor

protected void recordStatsFor(CExample ex)
Enables the recording of data about coreference examples as they are used in the decoding algorithm. Currently does nothing, but may be overridden or revised to record any statistic. This method should be called in the decode method whenever an example is examined (once per examination).

Parameters:
ex - The example whose statistics should be recorded.

getStatsString

public java.lang.String getStatsString()
Enables recorded statistics to be returned. Currently does nothing, since no statistics are being recorded, but may be overridden or revised to enable statistics output.

Overrides:
getStatsString in class ScoredCorefDecoder
Returns:
The statistics string, which is currently empty.
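
Since recordStatsFor and getStatsString are described as hooks that may be overridden, a subclass could pair them as in the sketch below. The base class here is a hypothetical stand-in with the same two hooks, String replaces CExample, and the examine method merely simulates the decoder calling the hook once per examined example.

```java
/** Illustrative sketch of overriding the statistics hooks; all types are hypothetical. */
class StatsHookSketch {
    /** Stand-in for a decoder whose hooks do nothing by default. */
    static class BaseDecoder {
        protected void recordStatsFor(String ex) {
            // intentionally empty
        }

        public String getStatsString() {
            return "";
        }

        /** Simulates decode examining one example. */
        public void examine(String ex) {
            recordStatsFor(ex);
        }
    }

    /** Counts how many examples were examined. */
    static class CountingDecoder extends BaseDecoder {
        private int examined;

        @Override
        protected void recordStatsFor(String ex) {
            examined++;
        }

        @Override
        public String getStatsString() {
            return "examples examined: " + examined;
        }
    }
}
```
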