edu.illinois.cs.cogcomp.lbj.coref
Class CorefPlainText

java.lang.Object
  extended by edu.illinois.cs.cogcomp.lbj.coref.CorefPlainText

public class CorefPlainText
extends java.lang.Object

Use this program to add coreference annotations to a plain text input file. It requires at least 1 GB of memory to run correctly, which is why the usage line below passes -Xmx1g to the JVM.

Usage

   java -Xmx1g edu.illinois.cs.cogcomp.lbj.coref.CorefPlainText <text file>
 

Input

<text file> is the absolute or relative path and name of the input file, which should contain naturally written, unannotated, plain English text.

Output

The output is the plain text of the input file, annotated by the coreference classifier. Annotations mark sequences of words as mentions of an entity. A single asterisk character (*) is prepended to the first word of the mention. An asterisk followed by an underscore (_) and an integer is appended to the last word of the mention. For example, a two-word mention tagged with the integer 3 is rendered as *word1 word2*_3. Mentions that the classifier believes refer to the same entity are annotated with the same integer. Mentions can be nested, but they never overlap otherwise.
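To make the annotation format concrete, the following sketch reads an annotated string and groups the mention texts by their integer. It is not part of this package: the class name CorefOutputSketch and the method groupMentions are hypothetical, and the code assumes that tokens are separated by whitespace, that closing markers sit at the very end of a token, that when several mentions end on the same word the innermost one is closed first, and that the text contains no literal asterisks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of edu.illinois.cs.cogcomp.lbj.coref.
public class CorefOutputSketch {
    // One closing marker: an asterisk, an underscore, and the entity integer.
    private static final Pattern CLOSER = Pattern.compile("\\*_(\\d+)");
    // All closing markers found at the end of a token (assumed position).
    private static final Pattern TRAILING = Pattern.compile("(?:\\*_\\d+)+$");

    /** Groups mention texts by the integer they were annotated with. */
    public static Map<Integer, List<String>> groupMentions(String annotated) {
        Map<Integer, List<String>> byEntity = new TreeMap<>();
        List<String> words = new ArrayList<>();
        // Word indexes at which currently open mentions started.
        Deque<Integer> openStarts = new ArrayDeque<>();

        for (String token : annotated.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Each leading asterisk opens one mention at this word.
            int opens = 0;
            while (opens < token.length() && token.charAt(opens) == '*') {
                opens++;
            }
            String rest = token.substring(opens);

            // Separate the bare word from any trailing closing markers.
            Matcher tail = TRAILING.matcher(rest);
            String closers = tail.find() ? tail.group() : "";
            String word = rest.substring(0, rest.length() - closers.length());

            words.add(word);
            for (int i = 0; i < opens; i++) {
                openStarts.push(words.size() - 1);
            }

            // Each closing marker ends the most recently opened mention.
            Matcher closer = CLOSER.matcher(closers);
            while (closer.find() && !openStarts.isEmpty()) {
                int start = openStarts.pop();
                String text = String.join(" ", words.subList(start, words.size()));
                int entity = Integer.parseInt(closer.group(1));
                byEntity.computeIfAbsent(entity, k -> new ArrayList<>()).add(text);
            }
        }
        return byEntity;
    }

    public static void main(String[] args) {
        // Hand-written sample in the documented format, not real classifier output.
        String sample = "*Mary*_0 said *she*_0 saw *the dog*_1 .";
        System.out.println(groupMentions(sample)); // prints {0=[Mary, she], 1=[the dog]}
    }
}
```

A stack of start positions is used because mentions may nest but never otherwise overlap, so every closing marker can be paired with the most recently opened mention.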


Constructor Summary
CorefPlainText()
           
 
Method Summary
static void main(java.lang.String[] args)
           
private static java.lang.String readLine(java.io.BufferedReader in, java.lang.String filename)
          Read a line from an input stream, printing an error message and terminating the program on error.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CorefPlainText

public CorefPlainText()

Method Detail

main

public static void main(java.lang.String[] args)

readLine

private static java.lang.String readLine(java.io.BufferedReader in,
                                         java.lang.String filename)
Read a line from an input stream, printing an error message and terminating the program on error.

Parameters:
in - The stream to read from.
filename - The name of the file that the stream is reading from.
Returns:
A line of text from the input stream, without any trailing newline character.
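
The actual source of readLine is not shown on this page. As a rough illustration of the documented contract only, a helper with the same signature might look like the sketch below. The class name ReadLineSketch, the wording of the error message, and the exit status 1 are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch of the documented behavior, not the library's source.
public class ReadLineSketch {
    /**
     * Reads one line from the stream. On an I/O error, prints a message
     * naming the file and terminates the program.
     */
    public static String readLine(BufferedReader in, String filename) {
        try {
            // BufferedReader.readLine() strips the line terminator and
            // returns null once the end of the stream is reached.
            return in.readLine();
        } catch (IOException e) {
            // Message text and exit status are assumptions for illustration.
            System.err.println("Can't read from " + filename + ": " + e.getMessage());
            System.exit(1);
            return null; // never reached; keeps the compiler satisfied
        }
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("first line\nsecond line\n"));
        System.out.println(readLine(in, "in-memory sample")); // prints first line
    }
}
```

Passing the file name alongside the stream lets the error message identify which input failed, since a BufferedReader does not know what it is reading from.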