public class Word
extends edu.illinois.cs.cogcomp.lbjava.parse.LinkedChild
In general, only the `form` and `capitalized` fields described below can be counted on to have meaningful values. The `form` field can be assumed to be filled in because it is hard to imagine a situation in which a `Word` object should be created without any knowledge of how that word appeared in text. The `capitalized` field is computed from `form` by this class's constructor.
All other fields must be obtained or computed externally. Space is provided for them in this class' implementation as a convenience, since we expect the user will make frequent use of these fields.
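
As a rough illustration of how `capitalized` could be derived from `form`, the sketch below uses a hypothetical stand-in class, `CapWord`, which is not part of the library. It assumes that "capitalized" means the first character is an upper-case letter; the actual constructor may apply a different rule.

```java
// Hypothetical stand-in for Word; only the two always-populated fields are modeled.
class CapWord {
    public final String form;
    public final boolean capitalized;

    CapWord(String f) {
        form = f;
        // Assumption: a word counts as capitalized when its first character is upper case.
        capitalized = f != null && !f.isEmpty() && Character.isUpperCase(f.charAt(0));
    }
}
```

Under this assumption, `new CapWord("Paris").capitalized` is `true`, while `new CapWord("paris").capitalized` is `false`.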

This class extends `LinkedChild`, so objects of this class contain references to both the previous and the next word in the sentence. Constructors are available that take the previous word as an argument and set that reference. Thus, a useful technique for constructing all the words in a sentence involves code that looks like this (where `form` is a `String`):
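```java
Word current = new Word(form);
// a loop of some sort that assigns the next word's text to form
{
    current.next = new Word(form, current);
    // The cast assumes next is declared with the LinkedChild type.
    current = (Word) current.next;
}
```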
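
The linking pattern above can also be tried out in isolation. The following is a minimal sketch that uses a hypothetical stand-in class, `ChainWord`, rather than the library's `Word`; the `link` helper and the `String[]` input are assumptions made for illustration only.

```java
// Hypothetical stand-in for Word with only the pieces needed to show the linking pattern.
class ChainWord {
    final String form;
    ChainWord previous;
    ChainWord next;

    ChainWord(String f) {
        this(f, null);
    }

    // Mirrors the (form, previous word) constructor shape described above.
    ChainWord(String f, ChainWord p) {
        form = f;
        previous = p;
    }

    // Builds a doubly linked chain from the tokens and returns the first word,
    // or null when there are no tokens.
    static ChainWord link(String[] forms) {
        if (forms.length == 0) return null;
        ChainWord first = new ChainWord(forms[0]);
        ChainWord current = first;
        for (int i = 1; i < forms.length; ++i) {
            // The constructor sets the backward reference; the assignment sets the forward one.
            current.next = new ChainWord(forms[i], current);
            current = current.next;
        }
        return first;
    }
}
```

For example, `ChainWord.link(new String[] {"The", "dog", "barks"})` returns the word for "The", from which the rest of the sentence can be reached by following `next`.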
| Modifier and Type | Field and Description |
|---|---|
| boolean | `capitalized`: Whether or not the word is capitalized is determined automatically by the constructor. |
| String | `form`: The actual text from the corpus that represents the word. |
| String | `lemma`: The base form of the word. |
| String | `partOfSpeech`: Names the part of speech of this word. |
| String | `wordSense`: An indication of the meaning or usage of this instance of this word. |
| Constructor and Description |
|---|
| `Word(String f)`: When all that is known is the spelling of the word. |
| `Word(String f, int start, int end)`: When you have offset information. |
| `Word(String f, String pos)`: Sets the actual text and the part of speech. |
| `Word(String f, String pos, int start, int end)`: When you have offset information. |
| `Word(String f, String pos, String l, String sense, Word p, int start, int end)`: This constructor is useful when the sentence is being parsed forwards. |
| `Word(String f, String pos, Word p)`: This constructor is useful when the sentence is being parsed forwards. |
| `Word(String f, String pos, Word p, int start, int end)`: This constructor is useful when the sentence is being parsed forwards. |
| `Word(String f, Word p)`: This constructor is useful when the sentence is being parsed forwards. |
| `Word(String f, Word p, int start, int end)`: This constructor is useful when the sentence is being parsed forwards. |
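
The offset-taking constructors expect `start` and `end` positions within the parent document, and the parameter descriptions on this page define `end` as the offset of the word's last character. A minimal sketch of computing such spans for whitespace-separated tokens follows. The `OffsetSketch` class and its `spans` method are hypothetical helpers, not part of the library, and whitespace tokenization is assumed only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds the offsets of whitespace-separated tokens in a text.
class OffsetSketch {
    // Each int[] holds {start, end}, where end is the offset of the token's
    // last character (inclusive), matching the parameter descriptions.
    static List<int[]> spans(String text) {
        List<int[]> result = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            if (i >= text.length()) break;
            int start = i;
            // Advance to one past the token's final character.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            result.add(new int[] {start, i - 1});
        }
        return result;
    }
}
```

Each pair could then be passed as the `start` and `end` arguments of, for example, `Word(String f, int start, int end)`, with `text.substring(start, end + 1)` as the form.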
| Modifier and Type | Method and Description |
|---|---|
| String | `toString()`: The string representation of a word is its POS bracket form, or, if the part of speech is not available, it is just the spelling of the word. |
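
To make the description of `toString()` concrete, here is a minimal sketch. It assumes that "POS bracket form" means `(POS form)` with a single space between the two, which this page does not spell out. It also assumes that the bracket translation given in the method detail at the end of this page ("-LRB-" for left brackets, "-RRB-" for right brackets) applies to the form only. `BracketSketch` is a hypothetical helper, not the library's implementation.

```java
// Hypothetical helper sketching the described string form; not the library's code.
class BracketSketch {
    // Replaces every bracket character with its "-LRB-" or "-RRB-" token.
    static String escape(String s) {
        return s.replace("(", "-LRB-").replace("[", "-LRB-").replace("{", "-LRB-")
                .replace(")", "-RRB-").replace("]", "-RRB-").replace("}", "-RRB-");
    }

    static String render(String form, String partOfSpeech) {
        // Without a part of speech, the representation is just the spelling.
        if (partOfSpeech == null) return form;
        // Assumption: the bracket form is "(POS form)".
        return "(" + partOfSpeech + " " + escape(form) + ")";
    }
}
```

Under these assumptions, `BracketSketch.render("dog", "NN")` yields `(NN dog)` and `BracketSketch.render("dog", null)` yields `dog`.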
public String form
public boolean capitalized
public String partOfSpeech
public String lemma
public String wordSense
public Word(String f)
f - The actual text of the word.

public Word(String f, String pos)
f - The actual text of the word.
pos - A token representing the word's part of speech.

public Word(String f, Word p)
f - The actual text of the word.
p - The word that came before this one in the sentence.

public Word(String f, String pos, Word p)
f - The actual text of the word.
pos - A token representing the word's part of speech.
p - The word that came before this one in the sentence.

public Word(String f, int start, int end)
f - The actual text of the word.
start - The offset into the parent document at which the first character of this word is found.
end - The offset into the parent document at which the last character of this word is found.

public Word(String f, String pos, int start, int end)
f - The actual text of the word.
pos - A token representing the word's part of speech.
start - The offset into the parent document at which the first character of this word is found.
end - The offset into the parent document at which the last character of this word is found.

public Word(String f, Word p, int start, int end)
f - The actual text of the word.
p - The word that came before this one in the sentence.
start - The offset into the parent document at which the first character of this word is found.
end - The offset into the parent document at which the last character of this word is found.

public Word(String f, String pos, Word p, int start, int end)
f - The actual text of the word.
pos - A token representing the word's part of speech.
p - The word that came before this one in the sentence.
start - The offset into the parent document at which the first character of this word is found.
end - The offset into the parent document at which the last character of this word is found.

public Word(String f, String pos, String l, String sense, Word p, int start, int end)
f - The actual text of the word.
pos - A token representing the word's part of speech.
l - The base form of the word.
sense - The sense of the word.
p - The word that came before this one in the sentence.
start - The offset into the parent document at which the first character of this word is found.
end - The offset into the parent document at which the last character of this word is found.

public String toString()
Left brackets ("(", "[", and "{") are rendered as "-LRB-" and right brackets (")", "]", "}") as "-RRB-".

Copyright © 2017. All rights reserved.