public class WordSplitter extends Object implements edu.illinois.cs.cogcomp.lbjava.parse.Parser
Takes the Sentences returned by another parser (e.g., SentenceSplitter) and splits them into Word objects. Entire sentences, now represented as LinkedVectors, are then returned one at a time by calls to the next() method.
A main(String[]) method is also implemented, which applies this class to plain text in a straightforward way.
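The wrapping pattern described above (a parser that consumes another parser's sentences and hands back one tokenized sentence per next() call) can be sketched as follows. This is a self-contained illustration, not the library's implementation: the nested Parser interface is a stand-in that only mirrors the three methods listed on this page, ListSentenceParser and SimpleWordSplitter are hypothetical names, sentences and words are plain Strings and Lists rather than Sentence, Word, and LinkedVector, words are split on whitespace only, and a null return at end of input is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordSplitterSketch {
    /** Stand-in mirroring the next(), reset(), and close() methods listed on this page. */
    interface Parser {
        Object next();
        void reset();
        void close();
    }

    /** Hypothetical sentence source: returns one String per call, then null (assumed end marker). */
    static class ListSentenceParser implements Parser {
        private final List<String> sentences;
        private int index = 0;

        ListSentenceParser(List<String> sentences) { this.sentences = sentences; }

        public Object next() { return index < sentences.size() ? sentences.get(index++) : null; }
        public void reset() { index = 0; }
        public void close() { /* nothing to release in this sketch */ }
    }

    /** Wraps a sentence-returning parser and returns each sentence as a list of words. */
    static class SimpleWordSplitter implements Parser {
        protected final Parser parser; // the sentence-returning parser

        SimpleWordSplitter(Parser p) { parser = p; }

        public Object next() {
            String sentence = (String) parser.next();
            if (sentence == null) return null; // propagate end of input
            // Simplification: whitespace splitting only; the real class applies its own word rules.
            return new ArrayList<>(Arrays.asList(sentence.trim().split("\\s+")));
        }

        public void reset() { parser.reset(); }
        public void close() { parser.close(); }
    }

    /** Drains a parser, collecting every tokenized sentence until next() returns null. */
    @SuppressWarnings("unchecked")
    static List<List<String>> readAll(Parser p) {
        List<List<String>> result = new ArrayList<>();
        for (Object o = p.next(); o != null; o = p.next()) {
            result.add((List<String>) o);
        }
        return result;
    }

    public static void main(String[] args) {
        Parser splitter = new SimpleWordSplitter(
            new ListSentenceParser(Arrays.asList("The cat sat .", "It purred .")));
        for (List<String> sentence : readAll(splitter)) {
            System.out.println(sentence.size() + " words: " + sentence);
        }
        splitter.reset(); // start over from the first sentence
        System.out.println(readAll(splitter).size() + " sentences after reset()");
        splitter.close();
    }
}
```

With the actual library, the same loop shape would presumably apply to a WordSplitter constructed around a sentence-returning parser such as SentenceSplitter, with each returned object cast to LinkedVector.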
Modifier and Type | Field and Description |
---|---|
protected edu.illinois.cs.cogcomp.lbjava.parse.Parser | parser: The Sentence returning parser. |
Constructor and Description |
---|
WordSplitter(edu.illinois.cs.cogcomp.lbjava.parse.Parser p): Initializing constructor. |
Modifier and Type | Method and Description |
---|---|
void | close(): Frees any resources this parser may be holding. |
static void | main(String[] args): Run this program on a file containing plain text, and it will produce the same text on STDOUT rearranged so that each line contains exactly one sentence, and so that character sequences deemed to be "words" are delimited by whitespace. |
Object | next(): Returns LinkedVectors of Word objects one at a time. |
void | reset(): Sets this parser back to the beginning of the raw data. |
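The output layout described for main(String[] args), one sentence per line with words delimited by whitespace, can be sketched as below. This is only an illustration of the layout: the class name OneSentencePerLine and the format helper are hypothetical, the tokenized sentences are supplied by hand rather than produced by WordSplitter, and treating punctuation marks as separate words is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

public class OneSentencePerLine {
    /** Joins each sentence's words with single spaces and ends each sentence with a newline. */
    static String format(List<List<String>> sentences) {
        StringBuilder out = new StringBuilder();
        for (List<String> words : sentences) {
            out.append(String.join(" ", words)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written tokenization; the real word boundaries are decided by WordSplitter.
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("Hello", ",", "world", "."),
            Arrays.asList("It", "works", "."));
        System.out.print(format(sentences));
    }
}
```

Running the sketch prints two lines, "Hello , world ." and "It works .", which is the shape of output the method description refers to.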
protected edu.illinois.cs.cogcomp.lbjava.parse.Parser parser
The Sentence returning parser.

public WordSplitter(edu.illinois.cs.cogcomp.lbjava.parse.Parser p)
Parameters:
p - The Sentence returning parser.

public static void main(String[] args)
Run this program on a file containing plain text, and it will produce the same text on STDOUT rearranged so that each line contains exactly one sentence, and so that character sequences deemed to be "words" are delimited by whitespace.
Usage:
java edu.illinois.cs.cogcomp.lbjava.nlp.WordSplitter <file name>
Parameters:
args - The command line arguments.

public Object next()
Returns LinkedVectors of Word objects one at a time.
Specified by: next in interface edu.illinois.cs.cogcomp.lbjava.parse.Parser

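public void reset()
Sets this parser back to the beginning of the raw data.
Specified by: reset in interface edu.illinois.cs.cogcomp.lbjava.parse.Parser

public void close()
Frees any resources this parser may be holding.
Specified by: close in interface edu.illinois.cs.cogcomp.lbjava.parse.Parser
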
Copyright © 2017. All rights reserved.