java.lang.Object
  edu.illinois.cs.cogcomp.lbj.coref.parsers.CoParser
public class CoParser
Extracts coreference examples for use in training an LBJ classifier. The examples are extracted from a corpus of documents specified either by providing a file name containing a list of document filenames or by providing a document loader. From each document, examples are extracted according to the specified example extractor. See the various constructors for details. To extract examples, repeatedly call the next method until it returns null.
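The call pattern described above (call next until it returns null) can be sketched as follows. This is a minimal illustration rather than code from the library: ExampleSource, fromList, and drain are hypothetical stand-ins so that the sketch compiles without LBJ on the classpath, and a String stands in for a CExample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class NextUntilNullSketch {
    /** Hypothetical stand-in for the parser contract: next() yields an example, or null when exhausted. */
    interface ExampleSource<T> {
        T next();
    }

    /** Hypothetical list-backed source, used here in place of a real CoParser. */
    static <T> ExampleSource<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Collects every example by calling next() until it returns null. */
    static <T> List<T> drain(ExampleSource<T> source) {
        List<T> out = new ArrayList<>();
        for (T ex = source.next(); ex != null; ex = source.next()) {
            out.add(ex);
        }
        return out;
    }

    public static void main(String[] args) {
        ExampleSource<String> source = fromList(Arrays.asList("ex1", "ex2", "ex3"));
        System.out.println(drain(source));
    }
}
```

With the real class, the same loop shape would presumably read the examples from a CoParser built by one of the constructors below, with CExample as the loop variable type.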
Field Summary | |
---|---|
private CExampleExtractor | m_cExExtractor |
private java.util.List<Doc> | m_docs |
private java.util.List<CExample> | m_examples |
private int | m_iD |
private int | m_iX |
Constructor Summary | |
---|---|
CoParser(DocLoader loader, CExampleExtractor extractor) | Constructs a Parser that extracts coreference examples from a corpus, with documents loaded by a specified document loader and coreference examples extracted from each document using the specified example extractor. |
CoParser(java.lang.String fileListFN) | Constructs a Parser that extracts coreference examples from a corpus loaded using the default document loader, and examples extracted using the default example extractor. |
CoParser(java.lang.String fileListFN, CExampleExtractor extractor) | Constructs a Parser that extracts coreference examples from a corpus loaded using the default document loader as specified by DocLoader.getDefaultLoader(java.lang.String), and examples extracted using the specified example extractor. |
Method Summary | |
---|---|
private void | advanceDoc(): Prepares to extract examples from the next document (including resetting the document). |
protected void | cleanup(): Called immediately before next returns null. |
void | close() |
void | enqueue(java.lang.Object q): Does nothing. |
private java.util.List<CExample> | getExamples(): Load all examples from the example extractor. |
private CExample | getNextExample(): Gets an example from the cache and prepares for the next example. |
CExample | next(): Gets the next coreference example, or null if no more examples remain. |
void | reset(): Resets the parser to the first document in the corpus and resets the example extractor. |
private void | resetDoc(): Resets the document, including caching the examples from the example extractor. |
protected void | startup(DocLoader loader): Prepares the parser, by loading documents and resetting the doc. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private java.util.List<Doc> m_docs
private CExampleExtractor m_cExExtractor
private java.util.List<CExample> m_examples
private int m_iD
private int m_iX
Constructor Detail |
---|
public CoParser(java.lang.String fileListFN)
Constructs a Parser that extracts coreference examples from a corpus loaded using the default document loader, and examples extracted using the default example extractor, CExExClosestPosAllNeg, which loads examples as follows:
For each mention, creates a positive example with the nearest preceding coreferential mention, and creates negative examples with each preceding non-coreferential mention.
Does not include any cataphoric examples (examples where a pronoun precedes a non-pronoun).
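The pairing rule described above can be sketched as follows. This is an illustration of the stated rule, not the source of CExExClosestPosAllNeg: the Mention and Pair types, the entityId field, and the extract method are all hypothetical. One point the description leaves open is what happens when the nearest preceding coreferential mention forms a cataphoric pair; the sketch assumes that pair is simply dropped and no farther antecedent is tried.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClosestPosAllNegSketch {
    /** Hypothetical mention: equal entity ids mean the mentions are coreferential. */
    static final class Mention {
        final String text;
        final int entityId;
        final boolean pronoun;

        Mention(String text, int entityId, boolean pronoun) {
            this.text = text;
            this.entityId = entityId;
            this.pronoun = pronoun;
        }
    }

    /** Hypothetical labeled example: antecedent index, mention index, coreference label. */
    static final class Pair {
        final int antecedent;
        final int mention;
        final boolean coreferent;

        Pair(int antecedent, int mention, boolean coreferent) {
            this.antecedent = antecedent;
            this.mention = mention;
            this.coreferent = coreferent;
        }

        @Override
        public String toString() {
            return "(" + antecedent + "," + mention + "," + (coreferent ? "+" : "-") + ")";
        }
    }

    static List<Pair> extract(List<Mention> mentions) {
        List<Pair> out = new ArrayList<>();
        for (int j = 0; j < mentions.size(); j++) {
            Mention m = mentions.get(j);
            boolean positiveSeen = false;
            // Walk backwards so the first coreferential hit is the nearest one.
            for (int i = j - 1; i >= 0; i--) {
                Mention a = mentions.get(i);
                boolean coreferent = a.entityId == m.entityId;
                if (coreferent && positiveSeen) {
                    continue; // only the nearest coreferential mention yields a positive
                }
                if (coreferent) {
                    positiveSeen = true;
                }
                // Assumption: a cataphoric pair (pronoun before non-pronoun) is dropped.
                if (a.pronoun && !m.pronoun) {
                    continue;
                }
                out.add(new Pair(i, j, coreferent));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Mention> mentions = Arrays.asList(
                new Mention("Alice", 1, false),
                new Mention("Bob", 2, false),
                new Mention("she", 1, true),
                new Mention("Alice", 1, false));
        System.out.println(extract(mentions));
    }
}
```

Each pair is printed as (antecedent index, mention index, label), with + for a positive example and - for a negative one.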
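Parameters:
fileListFN - The classpath-relative filename of the corpus file, containing a list of document filenames, one per line. Each filename should be specified relative to a location in the classpath.
public CoParser(DocLoader loader, CExampleExtractor extractor)
Constructs a Parser that extracts coreference examples from a corpus, with documents loaded by a specified document loader and coreference examples extracted from each document using the specified example extractor.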
Parameters:
loader - A document loader that loads a corpus of documents.
extractor - A coreference example extractor.
public CoParser(java.lang.String fileListFN, CExampleExtractor extractor)
Constructs a Parser that extracts coreference examples from a corpus loaded using the default document loader as specified by DocLoader.getDefaultLoader(java.lang.String), and examples extracted using the specified example extractor.
Parameters:
fileListFN - The classpath-relative filename of the corpus file, containing a list of document filenames, one per line. Each filename should be specified relative to a location in the classpath.
extractor - A coreference example extractor.
Method Detail |
---|
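public CExample next()
Gets the next coreference example, or null if no more examples remain.
Specified by: next in interface LBJ2.parse.Parser
public void reset()
Resets the parser to the first document in the corpus and resets the example extractor.
Specified by: reset in interface LBJ2.parse.Parser
public void close()
Specified by: close in interface LBJ2.parse.Parser
public void enqueue(java.lang.Object q)
Does nothing.
Parameters:
q - An arbitrary object.
private CExample getNextExample()
Gets an example from the cache and prepares for the next example. Expects that m_examples is initialized and that m_iX is less than the size of m_examples.
private void advanceDoc()
Prepares to extract examples from the next document (including resetting the document).
private void resetDoc()
Resets the document, including caching the examples from the example extractor.
private java.util.List<CExample> getExamples()
Load all examples from the example extractor for the current document (the document at index m_iD). Should not be called if the document does not exist.
protected void startup(DocLoader loader)
Prepares the parser, by loading documents and resetting the doc.
Parameters:
loader - The loader from which to get the documents.
protected void cleanup()
Called immediately before next returns null.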
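The method descriptions above suggest a two-level cursor: one index over documents and one over the examples cached for the current document. The sketch below shows one way such a design could work. It is not the CoParser source: the class name is hypothetical, a list of strings stands in for each document's extracted examples, and the fields iD, iX, and examples only mirror the roles that m_iD, m_iX, and m_examples appear to play.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class TwoCursorParserSketch {
    private final List<List<String>> docs; // stand-in: the examples each document yields
    private List<String> examples;         // cache for the current document
    private int iD;                        // index of the current document
    private int iX;                        // index of the next example in the cache

    TwoCursorParserSketch(List<List<String>> docs) {
        this.docs = docs;
        reset();
    }

    /** Returns to the first document and refills the cache. */
    void reset() {
        iD = 0;
        resetDoc();
    }

    /** Refills the cache for the current document, or empties it if no document exists. */
    private void resetDoc() {
        iX = 0;
        examples = iD < docs.size() ? docs.get(iD) : Collections.<String>emptyList();
    }

    /** Moves to the next document and refills the cache. */
    private void advanceDoc() {
        iD++;
        resetDoc();
    }

    /** Returns the next example, or null once every document is exhausted. */
    String next() {
        // Skip documents whose cache is exhausted or empty.
        while (iD < docs.size() && iX >= examples.size()) {
            advanceDoc();
        }
        if (iD >= docs.size()) {
            return null;
        }
        return examples.get(iX++);
    }

    public static void main(String[] args) {
        TwoCursorParserSketch parser = new TwoCursorParserSketch(Arrays.asList(
                Arrays.asList("a", "b"),
                Collections.<String>emptyList(),
                Arrays.asList("c")));
        for (String ex = parser.next(); ex != null; ex = parser.next()) {
            System.out.println(ex);
        }
    }
}
```

Caching one document's examples at a time keeps memory bounded by the largest document rather than by the whole corpus, which may be the motivation for the separate resetDoc and advanceDoc steps.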