edu.illinois.cs.cogcomp.lbj.coref.io.loaders
Class DocPlainTextLoader
java.lang.Object
edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocLoader
edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocPlainTextLoader
public class DocPlainTextLoader
- extends DocLoader
Loads documents from the filenames listed in the specified file.
To load documents from files,
construct this loader with the filename of a file containing
a list of plain-text document filenames (one per line),
and then call DocLoader.loadDocs().
(Note: Each filename should be specified relative to
a location in the classpath.)
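The corpus file format described above (one document filename per line) can be illustrated with a short sketch. This is not the library's own parsing code: the class name FileListSketch, the method readFileList, and the sample filenames are hypothetical, and trimming whitespace and skipping blank lines are assumptions rather than documented behavior of DocPlainTextLoader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the coref library.
class FileListSketch {

    // Reads one document filename per line from the given reader.
    // Assumption: surrounding whitespace is trimmed and blank lines are
    // skipped; the library's handling of such lines is not documented here.
    static List<String> readFileList(BufferedReader reader) throws IOException {
        List<String> filenames = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String name = line.trim();
            if (!name.isEmpty()) {
                filenames.add(name);
            }
        }
        return filenames;
    }

    public static void main(String[] args) throws IOException {
        // Made-up corpus file contents, one filename per line.
        String corpus = "docs/a.txt\ndocs/b.txt\n";
        List<String> names =
            readFileList(new BufferedReader(new StringReader(corpus)));
        for (String name : names) {
            System.out.println(name);
        }
    }
}
```

In actual use, the name of such a list file would be passed as the fileListFN constructor argument, and each listed name would be resolved relative to a location in the classpath.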
- Author:
- Eric Bengtson
Constructor Summary |
DocPlainTextLoader(java.lang.String fileListFN,
MentionDecoder mentionDetector,
LBJ2.classify.Classifier mTyper)
Constructs a loader that loads plain text files,
and will detect and type mentions automatically. |
Method Summary |
protected Doc |
createDoc(java.lang.String filename)
Constructs and returns a document
from the specified plain text file. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
DocPlainTextLoader
public DocPlainTextLoader(java.lang.String fileListFN,
MentionDecoder mentionDetector,
LBJ2.classify.Classifier mTyper)
- Constructs a loader that loads plain text files,
and will detect and type mentions automatically.
Words will be split by an automatic word splitting algorithm.
Sentence boundaries, quotations, and part-of-speech tags will
also automatically be discovered.
The file named by fileListFN contains a list of filenames, one per line.
- Parameters:
fileListFN
- The name of the corpus file, relative
to a location in the classpath,
containing a list of plain text document filenames, one per line.
Each document filename should be specified relative to the classpath.
mentionDetector
- The mention detector extracts mentions from
a document by predicting the head and extent boundaries of all mentions.
mTyper
- Determines the mention type of each mention.
Takes Mention
objects as input and returns the type as a string:
"NAM", "NOM", "PRE", or "PRO".
createDoc
protected Doc createDoc(java.lang.String filename)
- Constructs and returns a document
from the specified plain text file.
- Specified by:
createDoc
in class DocLoader
- Parameters:
filename
- The name of a plain text file,
relative to a location in the classpath.
- Returns:
- A document representing the specified plain text file.