edu.illinois.cs.cogcomp.lbj.coref.features
Class TokenFeatures

java.lang.Object
  extended by edu.illinois.cs.cogcomp.lbj.coref.features.TokenFeatures

public class TokenFeatures
extends java.lang.Object

Collection of feature generating functions that return tokens (word strings) in or around the mentions of a CExample.


Constructor Summary
protected TokenFeatures()
          Should not need to construct this static feature collection.
 
Method Summary
static java.lang.String getLastHeadWordPair(CExample ex)
          Gets the last word of each mention's head, conjoining the ordered pair of words with "_AND_".
static java.lang.String[] getSharedWords(CExample ex, boolean useHead)
          Gets the set of all words that are contained in both mentions.
static java.lang.String lastWordPair(CExample ex, boolean useHead)
          Gets the last word of each mention, conjoined by "_AND_".
static java.lang.String mTypeProWord(CExample ex)
          Gets the mention types of both mentions, conjoined by "&&", except that if the second mention is a pronoun, the last word of its head is substituted for its mention type.
static java.lang.String[] preWordPairs(CExample ex)
          Gets the pairs of words preceding the heads, conjoining each ordered pair of words with "_AND_".
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenFeatures

protected TokenFeatures()
Should not need to construct this static feature collection.

Method Detail

preWordPairs

public static java.lang.String[] preWordPairs(CExample ex)
Gets the pairs of words preceding the heads, conjoining each ordered pair of words with "_AND_". Rare words are replaced with "_RARE_", and if both words in a pair are rare the string is "_Rare_Duplicate".

Parameters:
ex - The example whose mentions will be processed.
Returns:
An array of strings containing conjoined pairs of words.

getSharedWords

public static java.lang.String[] getSharedWords(CExample ex,
                                                boolean useHead)
Gets the set of all words that are contained in both mentions.

Parameters:
ex - The example whose mentions are processed.
useHead - Should the heads or the extents of the mentions be used?
Returns:
An array view of the set of all shared words.
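
The shared-word computation described above amounts to a set intersection over the tokens of the two mentions. A minimal sketch follows, assuming each mention is already available as a String[] of tokens and that matching is case-sensitive; the class SharedWordsSketch and its sharedWords method are hypothetical stand-ins, not part of this package, and the result order is a choice of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration; not the library's implementation.
public class SharedWordsSketch {

    // Returns the words that occur in both token arrays,
    // in the order they first appear in the first mention.
    static String[] sharedWords(String[] first, String[] second) {
        Set<String> shared = new LinkedHashSet<>(Arrays.asList(first));
        // Keep only the words that also occur in the second mention.
        shared.retainAll(new HashSet<>(Arrays.asList(second)));
        return shared.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] m1 = {"the", "Illinois", "senator"};
        String[] m2 = {"senator", "from", "Illinois"};
        // prints [Illinois, senator]
        System.out.println(Arrays.toString(sharedWords(m1, m2)));
    }
}
```

In the real method, the useHead flag would decide whether the token arrays come from the mention heads or from the full extents.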

lastWordPair

public static java.lang.String lastWordPair(CExample ex,
                                            boolean useHead)
Gets the last word of each mention, conjoined by "_AND_". If either word is rare, "_RARE_" is substituted. If both words are rare, the result is "_Rare_Duplicate".

Parameters:
ex - The example whose words are retrieved.
useHead - Whether the last word of the head or the extent should be retrieved.
Returns:
The string containing the last word of each mention, conjoined with "_AND_".
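
The substitution rules above can be sketched in isolation. This page does not say how a word is judged rare, so the sketch assumes a hypothetical set of known words and treats anything outside it as rare; the class LastWordPairSketch and its isRare and conjoin helpers are illustrative names only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not the library's implementation.
public class LastWordPairSketch {

    // Assumption: a word is "rare" when it is absent from a known-word set.
    static boolean isRare(String word, Set<String> knownWords) {
        return !knownWords.contains(word);
    }

    // Conjoins two words with "_AND_", applying the documented rare-word rules.
    static String conjoin(String w1, String w2, Set<String> knownWords) {
        boolean rare1 = isRare(w1, knownWords);
        boolean rare2 = isRare(w2, knownWords);
        if (rare1 && rare2) {
            return "_Rare_Duplicate";
        }
        String left = rare1 ? "_RARE_" : w1;
        String right = rare2 ? "_RARE_" : w2;
        return left + "_AND_" + right;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("president", "he"));
        System.out.println(conjoin("president", "he", known));      // president_AND_he
        System.out.println(conjoin("president", "Zaphod", known));  // president_AND__RARE_
        System.out.println(conjoin("Zaphod", "Beeblebrox", known)); // _Rare_Duplicate
    }
}
```

Because the pair is ordered, conjoin("he", "president", known) yields a different feature string than conjoin("president", "he", known).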

mTypeProWord

public static java.lang.String mTypeProWord(CExample ex)
Gets the mention types of both mentions, conjoined by "&&", except that if the second mention is a pronoun, the last word of its head is substituted for its mention type.

Parameters:
ex - The example whose types are retrieved.
Returns:
A string containing the mention types conjoined by "&&", except that if the second mention is a pronoun, the last word of its head replaces its type.
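
As a rough illustration of the pronoun substitution, the sketch below works on plain strings. The type labels "NAM", "NOM", and "PRO" are assumed for the example and are not confirmed by this page, and the class MTypeProWordSketch with its combine method is hypothetical.

```java
// Hypothetical illustration; not the library's implementation.
public class MTypeProWordSketch {

    // Assumption: pronoun mentions carry the type label "PRO".
    static final String PRONOUN_TYPE = "PRO";

    // Conjoins the two mention types with "&&". If the second mention is a
    // pronoun, the last word of its head replaces its type.
    static String combine(String type1, String type2, String[] head2) {
        String second = PRONOUN_TYPE.equals(type2)
                ? head2[head2.length - 1]
                : type2;
        return type1 + "&&" + second;
    }

    public static void main(String[] args) {
        System.out.println(combine("NAM", "PRO", new String[] {"he"}));      // NAM&&he
        System.out.println(combine("NAM", "NOM", new String[] {"senator"})); // NAM&&NOM
    }
}
```

Only the second mention is lexicalized, so a pronoun in the first position still contributes its type label.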

getLastHeadWordPair

public static java.lang.String getLastHeadWordPair(CExample ex)
Gets the last word of each mention's head, conjoining the ordered pair of words with "_AND_". Rare words are replaced with "_RARE_", and if both words are rare the string is "_Rare_Duplicate".

Parameters:
ex - The example whose mentions will be processed.
Returns:
The conjoined pair of words.