edu.illinois.cs.cogcomp.lbj.coref.io.loaders
Class DocFromTextLoader

java.lang.Object
  extended by edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocLoader
      extended by edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocFromTextLoader

public class DocFromTextLoader
extends DocLoader

Loads a document from a plain text string, rather than from a file.

To load one or more documents from plain text, construct an instance of this class and then call DocLoader.loadDoc(String) once per document, passing the plain text as input.

Author:
Eric Bengtson

Field Summary
protected  boolean m_doWordSplit
           
 
Fields inherited from class edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocLoader
m_caser, m_fileListFN, m_mdDecoder, m_mTypeClassifier
 
Constructor Summary
DocFromTextLoader(MentionDecoder mentionDetector, LBJ2.classify.Classifier mTyper)
          Constructs a loader that will detect and type mentions automatically.
DocFromTextLoader(MentionDecoder mentionDetector, LBJ2.classify.Classifier mTyper, boolean doWordSplit)
          Constructs a loader that will detect and type mentions automatically, with word splitting controlled by doWordSplit.
 
Method Summary
protected  Doc createDoc(java.lang.String text)
          Constructs and returns a document from the given plain text.
 
Methods inherited from class edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocLoader
getDefaultLoader, getDefaultLoader, getFilenames, getPredMents, loadDoc, loadDocs
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_doWordSplit

protected boolean m_doWordSplit

Constructor Detail

DocFromTextLoader

public DocFromTextLoader(MentionDecoder mentionDetector,
                         LBJ2.classify.Classifier mTyper)
Constructs a loader that will detect and type mentions automatically. Words are split by an automatic word splitting algorithm. Sentence boundaries, quotations, and part-of-speech tags are also detected automatically.

To load a document from plain text, construct an instance of this class and then call DocLoader.loadDoc(String) with the plain text as input.

Parameters:
mentionDetector - Extracts mentions from a document by predicting the head and extent boundaries of each mention.
mTyper - Determines the type of each mention. Takes a Mention object as input and returns its type as a string: "NAM", "NOM", "PRE", or "PRO".

DocFromTextLoader

public DocFromTextLoader(MentionDecoder mentionDetector,
                         LBJ2.classify.Classifier mTyper,
                         boolean doWordSplit)
Constructs a loader that will detect and type mentions automatically. Words are split according to doWordSplit. Sentence boundaries, quotations, and part-of-speech tags are detected automatically.

To load a document from plain text, construct an instance of this class and then call DocLoader.loadDoc(String) with the plain text as input.

Parameters:
mentionDetector - Extracts mentions from a document by predicting the head and extent boundaries of each mention.
mTyper - Determines the type of each mention. Takes a Mention object as input and returns its type as a string: "NAM", "NOM", "PRE", or "PRO".
doWordSplit - If true, split words using an automatic word splitting algorithm; otherwise, split words based on whitespace.
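
The difference between the two doWordSplit settings can be illustrated with a minimal, self-contained sketch. The class WordSplitSketch and all of its methods are hypothetical and are not part of this package. The regular-expression splitter is only a stand-in: the automatic word splitting algorithm that the loader actually uses is not described on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not part of the coref package. */
public class WordSplitSketch {
    // Stand-in for an automatic splitter (an assumption, not the library's algorithm):
    // a token is either a run of letters/digits or a single non-space symbol.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    /** Mirrors doWordSplit == false: tokens are separated by whitespace only. */
    public static List<String> splitOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** Mirrors doWordSplit == true, using the stand-in pattern above. */
    public static List<String> splitAutomatically(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    /** Selects a strategy the way the doWordSplit flag is documented to. */
    public static List<String> split(String text, boolean doWordSplit) {
        return doWordSplit ? splitAutomatically(text) : splitOnWhitespace(text);
    }

    public static void main(String[] args) {
        String text = "Mr. Smith said, \"Hello.\"";
        System.out.println(split(text, false)); // punctuation stays attached to words
        System.out.println(split(text, true));  // punctuation becomes separate tokens
    }
}
```

Whitespace splitting suits text that has already been tokenized, where inserting further splits would disturb the existing token boundaries.
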
Method Detail

createDoc

protected Doc createDoc(java.lang.String text)
Constructs and returns a document from the given plain text.

Specified by:
createDoc in class DocLoader
Parameters:
text - The plain text of the document.
Returns:
A document representing the specified text.
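
Because createDoc is protected and specified by DocLoader, callers reach it through the public DocLoader.loadDoc(String). The relationship can be sketched as below. Every class here (SketchDoc, SketchLoader, SketchFromTextLoader) is a hypothetical stand-in, and the assumption that loadDoc delegates directly to createDoc is an inference from this page, not something it states.

```java
/** Hypothetical stand-ins; the real DocLoader and Doc classes have more members. */
public class LoaderSketch {

    /** Minimal stand-in for Doc: it only holds the text. */
    public static class SketchDoc {
        public final String text;

        public SketchDoc(String text) {
            this.text = text;
        }
    }

    /** Stand-in for DocLoader: a public entry point plus a protected hook. */
    public abstract static class SketchLoader {
        public SketchDoc loadDoc(String input) {
            // Assumption: the base class delegates to the subclass hook.
            return createDoc(input);
        }

        protected abstract SketchDoc createDoc(String input);
    }

    /** Stand-in for DocFromTextLoader: the input is the document text itself. */
    public static class SketchFromTextLoader extends SketchLoader {
        @Override
        protected SketchDoc createDoc(String text) {
            return new SketchDoc(text);
        }
    }

    public static void main(String[] args) {
        SketchLoader loader = new SketchFromTextLoader();
        SketchDoc doc = loader.loadDoc("John said he was tired.");
        System.out.println(doc.text);
    }
}
```

Under this reading, a loader for another input source would override only createDoc, and calling code would continue to use loadDoc unchanged.
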