java.lang.Object
  edu.illinois.cs.cogcomp.lbj.coref.io.loaders.DocLoader
public abstract class DocLoader
Loads a corpus of documents.
To load a document, construct a subclass of this class and then call
the loadDocs() method or the loadDoc(java.lang.String) method
with the correct type of input (see the relevant subclass for details).
To get the default document loader (which currently loads
documents from annotated .apf.xml files), use getDefaultLoader(java.lang.String).
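The load pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's code: SketchDoc, SketchLoader, and TextLoader are hypothetical stand-ins for Doc, DocLoader, and a concrete subclass, the in-memory list stands in for the file list, and having loadDocs() call loadDoc() once per entry is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DocLoaderSketch {
    /** Hypothetical stand-in for the library's Doc class; it only holds text. */
    static class SketchDoc {
        final String text;
        SketchDoc(String text) { this.text = text; }
    }

    /** Hypothetical stand-in that mirrors the documented shape of DocLoader. */
    static abstract class SketchLoader {
        // Stands in for the list of filenames read from m_fileListFN.
        protected final List<String> inputs;

        SketchLoader(List<String> inputs) { this.inputs = inputs; }

        /** Subclasses decide whether inputString is a filename or raw text. */
        protected abstract SketchDoc createDoc(String inputString);

        /** Loads one document by delegating to createDoc. */
        public SketchDoc loadDoc(String inputString) {
            return createDoc(inputString);
        }

        /** Loads one document per listed input (assumed behavior). */
        public List<SketchDoc> loadDocs() {
            List<SketchDoc> docs = new ArrayList<>();
            for (String input : inputs) {
                docs.add(loadDoc(input));
            }
            return docs;
        }
    }

    /** A subclass that treats its input as plain text rather than a filename. */
    static class TextLoader extends SketchLoader {
        TextLoader(List<String> inputs) { super(inputs); }

        @Override
        protected SketchDoc createDoc(String inputString) {
            return new SketchDoc(inputString.trim());
        }
    }

    public static void main(String[] args) {
        SketchLoader loader =
            new TextLoader(Arrays.asList("First document. ", "Second document."));
        System.out.println(loader.loadDoc("  Some raw text  ").text); // prints "Some raw text"
        System.out.println(loader.loadDocs().size());                 // prints 2
    }
}
```

With the real library, the equivalent calls would go through a concrete DocLoader subclass or the loader returned by getDefaultLoader(java.lang.String); the kind of input expected by loadDoc depends on that subclass.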
Field Summary | |
---|---|
protected LBJ2.classify.Classifier | m_caser: Classifier that decides the true case (uppercase, etc.) of text. |
protected java.lang.String | m_fileListFN: Name of file containing list of document filenames, one per line. |
protected MentionDecoder | m_mdDecoder: Decoder that extracts predicted mentions from a document. |
protected LBJ2.classify.Classifier | m_mTypeClassifier: Classifier that determines the mention types of a mention. |
Constructor Summary | |
---|---|
DocLoader() | Default constructor. |
DocLoader(MentionDecoder mentionDecoder, LBJ2.classify.Classifier mTyper) | Construct a loader for use when no file is used. |
DocLoader(java.lang.String fileListFN) | Construct a loader that loads a list of documents from a file. |
DocLoader(java.lang.String fileListFN, MentionDecoder mentionDecoder, LBJ2.classify.Classifier mTyper) | Construct a loader that loads a list of documents from a file. |
Method Summary | |
---|---|
protected abstract Doc | createDoc(java.lang.String inputString): Create a document from the given string, treating inputString as a filename or as text depending on the subclass. |
static DocLoader | getDefaultLoader(): Gets the default loader. |
static DocLoader | getDefaultLoader(java.lang.String fileList): Gets the default loader. |
java.lang.String[] | getFilenames(java.lang.String fileListFN): Opens the given file and reads a list of filenames from it, one per line. |
protected java.util.List<Mention> | getPredMents(Doc doc): Predicts mentions using the predicted mention decoder, sets the mention types predicted by the mention type classifier, and sets entity types using the entity type feature. |
Doc | loadDoc(java.lang.String inputString): Loads a document. |
java.util.List<Doc> | loadDocs(): Load all the documents using filename and utilities already set. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
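protected LBJ2.classify.Classifier m_caser
Classifier that decides the true case (uppercase, etc.) of text.
protected java.lang.String m_fileListFN
Name of file containing list of document filenames, one per line.
protected MentionDecoder m_mdDecoder
Decoder that extracts predicted mentions from a document.
protected LBJ2.classify.Classifier m_mTypeClassifier
Classifier that determines the mention types of a mention. Takes Mention
objects as input and returns the type as a string:
"NAM", "NOM", "PRE", or "PRO".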
Constructor Detail |
---|
public DocLoader(java.lang.String fileListFN, MentionDecoder mentionDecoder, LBJ2.classify.Classifier mTyper)
Construct a loader that loads a list of documents from a file.
Parameters:
fileListFN - The name of the corpus file, containing a list of document filenames, one per line.
mentionDecoder - The mention decoder, which extracts mentions from a document.
mTyper - Determines the mention type of each mention. Takes Mention objects as input and returns the type as a string: "NAM", "NOM", "PRE", or "PRO".
public DocLoader(MentionDecoder mentionDecoder, LBJ2.classify.Classifier mTyper)
Construct a loader for use when no file is used. Do not call loadDocs(), but rather call loadDoc(String inputString) using the text as the input. Mentions will be predicted using the provided decoders and classifiers.
Parameters:
mentionDecoder - The mention decoder, which extracts mentions from a document.
mTyper - Determines the mention type of each mention. Takes Mention objects as input and returns the type as a string: "NAM", "NOM", "PRE", or "PRO".
public DocLoader(java.lang.String fileListFN)
Construct a loader that loads a list of documents from a file.
Parameters:
fileListFN - The name of the corpus file, containing a list of document filenames, one per line.
public DocLoader()
Default constructor. Do not call loadDocs(), but rather call loadDoc(String inputString) using the text as the input.
Method Detail |
---|
public java.util.List<Doc> loadDocs()
Load all the documents using filename and utilities already set.
public Doc loadDoc(java.lang.String inputString)
Loads a document using the createDoc method, which may treat inputString as a filename or as text.
Parameters:
inputString - The filename or text, depending on the subclass. If a filename, it may end with the appropriate extension.
Returns:
The document for inputString, either representing the text of inputString or saved in the file named by inputString.
protected abstract Doc createDoc(java.lang.String inputString)
Create a document from the given string, treating inputString as a filename or as text depending on the subclass.
Parameters:
inputString - The filename or text, depending on the subclass. If a filename, it may end with the appropriate extension.
Returns:
The document for inputString, either representing the text of inputString or saved in the file named by inputString.
public java.lang.String[] getFilenames(java.lang.String fileListFN)
Opens the given file and reads a list of filenames from it, one per line.
Parameters:
fileListFN - The name of a file, relative to the "fileLists" directory in the classpath, containing a list of filenames.
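Reading a file list of this kind, one filename per line, can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: FileListSketch, readFilenames, and demo are hypothetical names, the sketch reads from an ordinary filesystem path rather than resolving against the "fileLists" directory in the classpath, and skipping blank lines is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FileListSketch {
    /** Reads one filename per line; blank lines are skipped (an assumption). */
    static String[] readFilenames(Path listFile) throws IOException {
        List<String> names = new ArrayList<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names.toArray(new String[0]);
    }

    /** Writes a small temporary list file, reads it back, then deletes it. */
    static String[] demo() {
        try {
            Path tmp = Files.createTempFile("fileList", ".txt");
            try {
                Files.write(tmp,
                    Arrays.asList("doc1.apf.xml", "", "doc2.apf.xml"),
                    StandardCharsets.UTF_8);
                return readFilenames(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [doc1.apf.xml, doc2.apf.xml]
    }
}
```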
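protected java.util.List<Mention> getPredMents(Doc doc)
Predicts mentions using the predicted mention decoder, sets the mention types predicted by the mention type classifier, and sets entity types using the entity type feature.
Parameters:
doc - The document whose mentions should be predicted.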
public static DocLoader getDefaultLoader(java.lang.String fileList)
Gets the default loader.
Parameters:
fileList - The name of the file list (see the DocAPFLoader constructor).
public static DocLoader getDefaultLoader()
Gets the default loader.