public class NLDocument
extends edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector

SentenceSplitter and Sentence.wordSplit() are used to represent the text of
the document internally as a collection of vectors of words. As such, the
text of the document is assumed plain, i.e. there should not be any mark-up.

Constructor and Description |
---|
NLDocument(NLDocument p, String file): Creates a document from the contents of the named file. |
NLDocument(NLDocument p, String[] text): This constructor takes the entire text of the document in a String array as input and initializes the representation. |
NLDocument(String file): Creates a document from the contents of the named file. |
NLDocument(String[] text): This constructor takes the entire text of the document in a String array as input and initializes the representation. |
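
The String[] constructors expect one array element per line of input, with no line termination characters. As a minimal sketch of preparing such an array from raw text, the helper below uses only the Java standard library; the class name LineArrayExample and the method toLines are hypothetical and not part of LBJava, and the NLDocument call is left as a comment because it requires LBJava on the classpath.

```java
import java.io.BufferedReader;
import java.io.StringReader;

class LineArrayExample {
    /**
     * Hypothetical helper (not part of LBJava): splits raw text into lines.
     * BufferedReader treats \n, \r, and \r\n as terminators and does not
     * include them in the returned lines.
     */
    public static String[] toLines(String rawText) {
        return new BufferedReader(new StringReader(rawText))
                .lines()
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] text = toLines("The cat sat.\r\nIt purred.\n");
        for (String line : text) {
            System.out.println("[" + line + "]");
        }
        // Assumed usage, with LBJava available:
        // NLDocument doc = new NLDocument(text);
    }
}
```

Stripping the terminators up front keeps the array in the shape the constructor documentation describes, whatever line-ending convention the source text used.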
Modifier and Type | Method and Description |
---|---|
void | addAll(SentenceSplitter splitter): Adds all the sentences that come from the argument sentence splitter to this document after using a word splitter to chop them up. |
String | getFileName(): Returns the name of the file this document came from, or null if one was not specified. |
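
The two-stage shape described for addAll (split into sentences, then chop each sentence into words) can be illustrated with a rough, self-contained sketch. Everything below is a stand-in: the class NaiveSplitSketch and its methods are hypothetical, and the regex-based splitting is a deliberately naive substitute, not the logic of LBJava's SentenceSplitter or Sentence.wordSplit(). It only shows the resulting "collection of vectors of words" structure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NaiveSplitSketch {
    /** Naive stand-in for a sentence splitter: breaks after '.', '!' or '?' followed by whitespace. */
    public static List<String> splitSentences(String[] lines) {
        String joined = String.join(" ", lines).trim();
        List<String> sentences = new ArrayList<>();
        if (joined.isEmpty()) {
            return sentences;
        }
        sentences.addAll(Arrays.asList(joined.split("(?<=[.!?])\\s+")));
        return sentences;
    }

    /** Naive stand-in for a word splitter: whitespace tokens, punctuation stays attached. */
    public static List<String> splitWords(String sentence) {
        return Arrays.asList(sentence.trim().split("\\s+"));
    }

    /** Mirrors the shape of addAll: each sentence is split into words and appended in order. */
    public static List<List<String>> toDocument(String[] lines) {
        List<List<String>> document = new ArrayList<>();
        for (String sentence : splitSentences(lines)) {
            document.add(splitWords(sentence));
        }
        return document;
    }

    public static void main(String[] args) {
        String[] text = {"The cat sat. It", "purred loudly."};
        System.out.println(toDocument(text));
    }
}
```

Note that a sentence may span several input lines, which is why the sketch joins the lines before looking for sentence boundaries.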

public NLDocument(String[] text)
text - The entire text of the document. Each element of this array should represent a line of input without any line termination characters.

public NLDocument(NLDocument p, String[] text)
p - The previous child in the parent vector.
text - The entire text of the document. Each element of this array should represent a line of input without any line termination characters.

public NLDocument(String file)
file - The name of the file containing a natural language, plain text document.

public NLDocument(NLDocument p, String file)
p - The previous child in the parent vector.
file - The name of the file containing a natural language, plain text document.

public String getFileName()
Returns the name of the file this document came from, or null if one was not specified.

public void addAll(SentenceSplitter splitter)
splitter - A sentence splitter.

Copyright © 2017. All rights reserved.