Tokenizer.Tokenization

| Constructor and Description |
|---|
| IllinoisTokenizer() Deprecated. |

| Modifier and Type | Method and Description |
|---|---|
| Pair<String[],IntPair[]> | tokenizeSentence(String sentence) Deprecated. Given a sentence, return a set of tokens and their character offsets. |
| Tokenizer.Tokenization | tokenizeTextSpan(String text) Deprecated. Given a span of text, return a list of Pair< String[], IntPair[] > corresponding to tokenized sentences, where the String[] is the ordered list of sentence tokens and the IntPair[] is the corresponding list of character offsets with respect to the original text. |
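
The two methods above share one contract: every token comes with character offsets into the original input, and sentence boundaries are reported as one-past-the-end token offsets. The sketch below illustrates only that data shape. It is not the IllinoisTokenizer implementation: `TokenOffsetSketch`, `Result`, and `tokenize` are hypothetical names, plain `int[]` pairs stand in for `IntPair`, tokens are split on whitespace only, and a token ending in `.`, `!` or `?` is naively assumed to close a sentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library's tokenizer: a whitespace splitter
// that returns tokens, character offsets, and sentence end positions.
final class TokenOffsetSketch {

    static final class Result {
        final String[] tokens;      // ordered token strings
        final int[][] charOffsets;  // charOffsets[i] = {start, end}, end exclusive
        final int[] sentenceEnds;   // token index one past each sentence's last token

        Result(String[] tokens, int[][] charOffsets, int[] sentenceEnds) {
            this.tokens = tokens;
            this.charOffsets = charOffsets;
            this.sentenceEnds = sentenceEnds;
        }
    }

    static Result tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        List<int[]> offsets = new ArrayList<>();
        List<Integer> ends = new ArrayList<>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String token = text.substring(start, i);
            tokens.add(token);
            offsets.add(new int[] {start, i});
            // Naive assumption: terminal punctuation closes the sentence.
            char last = token.charAt(token.length() - 1);
            if (last == '.' || last == '!' || last == '?') {
                ends.add(tokens.size());
            }
        }
        // Close a trailing sentence that has no terminal punctuation.
        if (!tokens.isEmpty()
                && (ends.isEmpty() || ends.get(ends.size() - 1) != tokens.size())) {
            ends.add(tokens.size());
        }
        return new Result(
                tokens.toArray(new String[0]),
                offsets.toArray(new int[0][]),
                ends.stream().mapToInt(Integer::intValue).toArray());
    }

    public static void main(String[] args) {
        String text = "Hello world. Bye now";
        Result r = tokenize(text);
        for (int k = 0; k < r.tokens.length; k++) {
            int[] span = r.charOffsets[k];
            // Slicing the original text with an offset pair recovers the token.
            System.out.println(r.tokens[k] + " [" + span[0] + ", " + span[1] + ") -> "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Because the offsets refer to the original text, a caller can map any token back to its source span with `substring`, and the tokens of sentence `s` are those from index `sentenceEnds[s - 1]` (or 0 for the first sentence) up to, but not including, `sentenceEnds[s]`.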

public Pair<String[],IntPair[]> tokenizeSentence(String sentence)
Specified by: tokenizeSentence in interface Tokenizer
Parameters: sentence - the plain text sentence to tokenize

public Tokenizer.Tokenization tokenizeTextSpan(String text)
Specified by: tokenizeTextSpan in interface Tokenizer
Parameters: text - an arbitrary span of text.
Returns: Tokenization object containing the ordered token strings, their character offsets, and sentence end positions (as one-past-the-end token offsets)

Copyright © 2017. All rights reserved.