DocPlainText

edu.illinois.cs.cogcomp.lbj.coref.ir.docs
Class DocPlainText

java.lang.Object
  edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocBase
      edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocPlainText

All Implemented Interfaces: Doc, java.io.Serializable

public class DocPlainText
extends DocBase
implements Doc

Represents a Doc constructed from plain text.

To load a document from a string, construct it with the no-arg constructor and then call loadFromPlainText(java.lang.String). To load the document including mention detection, see DocFromTextLoader.

To load a document given the name of a plain text file, see DocPlainText(String). To load the document including mention detection, see DocPlainTextLoader.
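The two-step pattern described above (no-arg constructor, then loadFromPlainText) can be sketched as follows. This is a minimal illustration, not code from the library: the class DocPlainTextUsageSketch and its loadFromString helper are hypothetical, and the calls are made through reflection only so that the sketch compiles and runs even when the coreference library is not on the classpath. With the library available, the direct form shown in the comment is the natural way to write it.

```java
import java.lang.reflect.Method;

// Hypothetical helper class; not part of the documented package.
class DocPlainTextUsageSketch {
    static final String CLASS_NAME =
            "edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocPlainText";

    /**
     * Reflective equivalent of the direct calls:
     *   DocPlainText doc = new DocPlainText();
     *   doc.loadFromPlainText(text);
     * Returns null when the library is not on the classpath.
     */
    static Object loadFromString(String text) {
        Class<?> cls;
        try {
            cls = Class.forName(CLASS_NAME);
        } catch (ClassNotFoundException e) {
            return null; // library absent: nothing to demonstrate
        }
        try {
            // Step 1: construct an empty document.
            Object doc = cls.getConstructor().newInstance();
            // Step 2: build it from the text string.
            Method load = cls.getMethod("loadFromPlainText", String.class);
            load.invoke(doc, text);
            return doc;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load document", e);
        }
    }

    public static void main(String[] args) {
        Object doc = loadFromString("John saw Mary. She waved to him.");
        System.out.println(doc == null
                ? "DocPlainText not found on the classpath"
                : "Loaded a " + doc.getClass().getSimpleName());
    }
}
```

As the descriptions on this page state, loading this way does not set mentions or entities; the loader classes named above cover mention detection.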

Author: Eric Bengtson
See Also: Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from class edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocBase
`DocBase.PosSource`

Field Summary
`private static long`	`serialVersionUID`

Fields inherited from class edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocBase
`goodEnds, goodStarts, m_annotationAuthor, m_baseFN, m_bNeedsCasing, m_caser, m_dateTime, m_docID, m_docType, m_encoding, m_headline, m_slug, m_source, m_text, m_version, medEnds, totalMentions`

Constructor Summary
`DocPlainText()` Constructs an empty document.
`DocPlainText(java.lang.String filename)` Constructs a document using the specified plain text file.

Method Summary
`void`	`loadFromFilename(java.lang.String filename)` Builds this document from the specified plain text file.
`void`	`loadFromPlainText(java.lang.String text)` Builds the document from the given plain text, automatically splitting sentences, determining quote levels, determining part-of-speech tags, and splitting words by an automatic word-splitting algorithm.
`void`	`loadFromPlainText(java.lang.String text, boolean doWordSplit)` Builds the document from the given plain text, automatically splitting sentences, determining quote levels, determining part-of-speech tags, and either splitting words by whitespace or using a word-splitter.
`void`	`write(java.lang.String filename, boolean usePredictions)` Writes this Doc in the appropriate format.

Methods inherited from class edu.illinois.cs.cogcomp.lbj.coref.ir.docs.DocBase
addHeadPrediction, addPredEntities, addRelation, addTrueEntity, addTrueMention, alignPredMentsToTrue, buildMentionsContaining, buildMentionsInSents, calcAndSetQuotes, getBestMentionFor, getCExampleFor, getCoherenceInfo, getCoherenceInfo, getCorefChains, getDocID, getEntities, getEntityFor, getEntityFor, getEntityFor, getGExampleFor, getHeadPrediction, getInCorpusInverseFreq, getInDocInverseFreq, getInverseTrueHeadFreq, getInverseTrueHeadFreq, getMention, getMentions, getMentionsContainedIn, getMentionsContaining, getMentionsInSent, getMentionsInSentences, getMentionsWithExtentStartingAt, getMentionsWithHeadStartingAt, getNumMentions, getNumRelations, getNumSentences, getPlainText, getPOS, getPOS, getPredEntities, getPredMention, getPredMentions, getQuoteNestLevel, getRelation, getSentNum, getShortEID, getStartCharNum, getTextFirstWordNum, getTrueEntities, getTrueMention, getTrueMentionFor, getTrueMentions, getWholeDocCounts, getWord, getWordNum, getWords, hasHeadPrediction, hasPredEntities, hasPredMentions, hasTrueEntities, hasTrueMentions, initMembersDefault, isCaseSensitive, loadChunkedText, loadFromText, loadFromText, loadPOSTaggerOutput, loadPOSTags, loadSGMText, makeBestMentionMap, makeChunk, printChunkValidity, recordWordLocation, removeTagsAndExtraNL, repeat, save, setCorpusCounts, setPlainText, setPOSTags, setPredEntities, setPredictedMentions, setQuoteLevels, setSentenceNumbers, setUsePredictedEntities, setUsePredictedMentions, setWords, setWords, sortEntitiesByListOrder, sortPredictedMentions, sortTrueMentions, toAnnotatedString, toAnnotatedString, toCoherenceTableString, toCoherenceTableString, toString, toSubstituteString, translateEscaped, usePredictedEntities, usePredictedMentions, write

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`

Methods inherited from interface edu.illinois.cs.cogcomp.lbj.coref.ir.docs.Doc
getBestMentionFor, getCExampleFor, getCoherenceInfo, getCoherenceInfo, getCorefChains, getDocID, getEntities, getEntityFor, getEntityFor, getGExampleFor, getInCorpusInverseFreq, getInDocInverseFreq, getInverseTrueHeadFreq, getInverseTrueHeadFreq, getMentions, getMentionsContainedIn, getMentionsContaining, getMentionsInSent, getMentionsInSentences, getMentionsWithExtentStartingAt, getMentionsWithHeadStartingAt, getNumRelations, getNumSentences, getPlainText, getPOS, getPOS, getPredEntities, getPredMentions, getQuoteNestLevel, getRelation, getSentNum, getStartCharNum, getTextFirstWordNum, getTrueEntities, getTrueMentionFor, getTrueMentions, getWholeDocCounts, getWord, getWordNum, getWords, hasPredEntities, hasPredMentions, hasTrueEntities, hasTrueMentions, isCaseSensitive, makeChunk, save, setCorpusCounts, setPredEntities, setPredictedMentions, setUsePredictedEntities, setUsePredictedMentions, toAnnotatedString, toAnnotatedString, toCoherenceTableString, toCoherenceTableString, toSubstituteString, usePredictedEntities, usePredictedMentions, write

Field Detail

serialVersionUID

private static final long serialVersionUID

See Also: Constant Field Values

Constructor Detail

DocPlainText

public DocPlainText()

Constructs an empty document. Follow this constructor with a call to loadFromPlainText(java.lang.String) to build a document from a text string.

DocPlainText

public DocPlainText(java.lang.String filename)

Constructs a document using the specified plain text file. Automatically splits sentences, determines quote levels, determines part-of-speech tags, and splits words using an automatic word-splitting algorithm. Mentions and entities will not be set here.

Parameters: filename - The name of the plain text file to load.

Method Detail

loadFromFilename

public void loadFromFilename(java.lang.String filename)

Builds this document from the specified plain text file. Automatically splits sentences, determines quote levels, determines part-of-speech tags, and splits words using an automatic word-splitting algorithm. Mentions and entities will not be set here.

Parameters: filename - The name of a file containing plain text.

loadFromPlainText

public void loadFromPlainText(java.lang.String text)

Builds the document from the given plain text, automatically splitting sentences, determining quote levels, determining part-of-speech tags, and splitting words by an automatic word-splitting algorithm. Mentions and entities will not be set here.

Parameters: text - The text of the document.

loadFromPlainText

public void loadFromPlainText(java.lang.String text,
                              boolean doWordSplit)

Builds the document from the given plain text, automatically splitting sentences, determining quote levels, determining part-of-speech tags, and either splitting words by whitespace or using a word-splitter. Mentions and entities will not be set here.

Parameters:
text - The text of the document.
doWordSplit - If true, words will be split by an automatic word-splitting algorithm; otherwise words will be assumed to be separated by whitespace.
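To illustrate what the doWordSplit flag changes, the following sketch contrasts whitespace-only splitting with a deliberately naive punctuation-aware splitter. The class WordSplitSketch and both of its methods are hypothetical and are not part of the library; the actual word-splitting algorithm is not described on this page and may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the library's word-splitting algorithm.
class WordSplitSketch {

    /** Whitespace-only splitting: punctuation stays attached to its word. */
    static List<String> splitOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /**
     * Naive stand-in for a word-splitter: runs of letters and digits form
     * one token, and each punctuation character becomes its own token.
     */
    static List<String> splitNaively(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else {
                // Any non-alphanumeric character ends the current word.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                // Whitespace is dropped; punctuation is kept as a token.
                if (!Character.isWhitespace(c)) {
                    tokens.add(String.valueOf(c));
                }
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Mary said, \"Hello.\"";
        System.out.println(String.join(" | ", splitOnWhitespace(text)));
        System.out.println(String.join(" | ", splitNaively(text)));
    }
}
```

For the sample sentence, whitespace splitting yields three tokens with the comma, period, and quotation marks still attached, while the naive splitter yields seven tokens. Passing false for doWordSplit therefore suits text that has already been tokenized and joined with spaces.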

write

public void write(java.lang.String filename,
                  boolean usePredictions)

Description copied from interface: Doc

Writes this Doc in the appropriate format.

Specified by: write in interface Doc
Specified by: write in class DocBase

Parameters:
filename - The name of the target file.
usePredictions - Whether predicted mentions and entities should be written.
