public class TextAnnotationUtilities extends Object
Modifier and Type | Field and Description |
---|---|
static Comparator<Constituent> | constituentEndComparator |
static Comparator<Constituent> | constituentLengthComparator |
static Comparator<Constituent> | constituentStartComparator |
static Comparator<Constituent> | constituentStartEndComparator - Sorts constituents by start location; where starts are equal, sorts by end location as well, so the shorter constituents come first. |
static Comparator<IntPair> | IntPairComparator |
static Comparator<Sentence> | sentenceStartComparator |
Constructor and Description |
---|
TextAnnotationUtilities() |
Modifier and Type | Method and Description |
---|---|
static void | copyAttributesFromTo(HasAttributes origObj, HasAttributes newObj) |
static Constituent | copyConstituentWithNewTokenOffsets(TextAnnotation newTA, Constituent c, int offset) - Creates a new Constituent with token offsets shifted by the specified amount. |
static Relation | copyRelation(Relation r, Map<Constituent,Constituent> consMap) - Required: consMap *must* contain the source and target constituents of r as keys, and their values must be non-null. |
static void | copyViewFromTo(String vuName, TextAnnotation ta, TextAnnotation newTA, int sourceStartTokenIndex, int sourceEndTokenIndex, int offset) |
static void | copyViewsFromTo(TextAnnotation ta, TextAnnotation newTA, int sourceStartTokenIndex, int sourceEndTokenIndex, int offset) - Copies views from the relevant span of ta to newTA. |
static TextAnnotation | createFromTokenizedString(String text) |
static List<String> | getSentenceList(TextAnnotation ta) |
static TextAnnotation | getSubTextAnnotation(TextAnnotation ta, int sentenceId) - Given a TextAnnotation object and a sentence id, returns a smaller TextAnnotation which contains the annotations specific to the given sentence id. |
static String | getTokenSequence(TextAnnotation ta, int start, int end) |
static void | mapSentenceAnnotationsToText(TextAnnotation sentenceTa, TextAnnotation textTa, int sentenceId) - Given a TextAnnotation for a sentence with annotations, maps its annotations into a TextAnnotation object for a longer text containing that sentence. |
static TextAnnotation | mapTransformedTextAnnotationToSource(TextAnnotation ta, StringTransformation st) - Given a TextAnnotation generated from the transformed text of a StringTransformation object, and the corresponding StringTransformation object, generates a new TextAnnotation whose annotations correspond to those of the transformed-text TextAnnotation, but whose offsets correspond to the original text. Example: you parse an XML-formatted news document and use a StringTransformation to record all the places where you removed XML markup or made other changes. |
static void | printTextAnnotation(PrintStream out, TextAnnotation ta) |
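The offset shifting that copyConstituentWithNewTokenOffsets describes in the table above can be illustrated with plain token arithmetic. The sketch below does not use the library: `Span`, `copyWithOffset`, and the check against a negative start index are hypothetical stand-ins chosen for illustration, and the span is assumed to be a half-open token range. It only shows why a negative offset is useful when moving a span from a document into a single sentence.

```java
public class TokenOffsetShiftSketch {
    /** Hypothetical stand-in for a Constituent: a labeled token span [start, end). */
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns a copy of the span with both token indexes shifted by offset.
     * The offset may be negative. Rejecting a negative start is an assumption
     * of this sketch, not documented behavior of the library.
     */
    static Span copyWithOffset(Span c, int offset) {
        int newStart = c.start + offset;
        int newEnd = c.end + offset;
        if (newStart < 0) {
            throw new IllegalArgumentException("shifted span starts before token 0: " + newStart);
        }
        return new Span(c.label, newStart, newEnd);
    }

    public static void main(String[] args) {
        // A span over tokens 12 and 13 in a document whose sentence starts at token 10.
        Span inDocument = new Span("PER", 12, 14);
        int sentenceStart = 10;

        // Moving into sentence-local indexes uses a negative offset.
        Span inSentence = copyWithOffset(inDocument, -sentenceStart);
        System.out.println(inSentence.start + " " + inSentence.end); // prints 2 4

        // Moving back into document indexes uses the positive offset.
        Span back = copyWithOffset(inSentence, sentenceStart);
        System.out.println(back.start + " " + back.end); // prints 12 14
    }
}
```

The same arithmetic applies in either direction, which is presumably why the offset parameter is documented as allowed to be negative.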
public static final Comparator<Constituent> constituentStartEndComparator
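The ordering documented for constituentStartEndComparator (start first, then end when starts are equal) can be sketched with the standard `Comparator` combinators. This is only an illustration under stated assumptions: `Span` is a hypothetical stand-in for Constituent, and `START_END` and `sortedCopy` are names invented here, not part of the library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StartEndOrderingSketch {
    /** Hypothetical stand-in for a Constituent: a span [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Orders by start; ties on start are broken by end, so shorter spans come first. */
    static final Comparator<Span> START_END =
            Comparator.comparingInt((Span s) -> s.start).thenComparingInt(s -> s.end);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Span> sortedCopy(List<Span> spans) {
        List<Span> out = new ArrayList<>(spans);
        out.sort(START_END);
        return out;
    }

    public static void main(String[] args) {
        List<Span> spans = List.of(
                new Span(3, 9), new Span(0, 4), new Span(3, 5), new Span(0, 2));
        System.out.println(sortedCopy(spans)); // prints [[0,2), [0,4), [3,5), [3,9)]
    }
}
```

Among spans that share a start, the one with the smaller end is necessarily the shorter one, which is how a tie-break on end yields "shorter first".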
public static final Comparator<Constituent> constituentStartComparator
public static final Comparator<Sentence> sentenceStartComparator
public static final Comparator<Constituent> constituentEndComparator
public static final Comparator<Constituent> constituentLengthComparator
public static final Comparator<IntPair> IntPairComparator
public static TextAnnotation createFromTokenizedString(String text)
public static String getTokenSequence(TextAnnotation ta, int start, int end)
public static List<String> getSentenceList(TextAnnotation ta)
public static void printTextAnnotation(PrintStream out, TextAnnotation ta)
public static void mapSentenceAnnotationsToText(TextAnnotation sentenceTa, TextAnnotation textTa, int sentenceId)
Given a TextAnnotation for a sentence with annotations, maps its annotations into a TextAnnotation object for a longer text containing that sentence.
sentenceTa - annotated TextAnnotation for the sentence
textTa - TextAnnotation for the longer text containing the sentence, without annotations for that sentence
sentenceId - index of the sentence in the longer text
public static TextAnnotation getSubTextAnnotation(TextAnnotation ta, int sentenceId)
Given a TextAnnotation object and a sentence id, returns a smaller TextAnnotation which contains the annotations specific to the given sentence id. The underlying text is just the sentence text, and character offsets are modified to correspond to this new, shorter text.
public static void copyViewsFromTo(TextAnnotation ta, TextAnnotation newTA, int sourceStartTokenIndex, int sourceEndTokenIndex, int offset)
ta -
newTA -
sourceStartTokenIndex -
sourceEndTokenIndex -
offset -
public static void copyViewFromTo(String vuName, TextAnnotation ta, TextAnnotation newTA, int sourceStartTokenIndex, int sourceEndTokenIndex, int offset)
public static Relation copyRelation(Relation r, Map<Constituent,Constituent> consMap)
r - relation to copy
consMap - map from original constituents to new counterparts
public static void copyAttributesFromTo(HasAttributes origObj, HasAttributes newObj)
public static Constituent copyConstituentWithNewTokenOffsets(TextAnnotation newTA, Constituent c, int offset)
newTA - TextAnnotation which will contain the new Constituent
c - original Constituent to copy
offset - the offset to shift token indexes of new Constituent. Can be negative.
public static TextAnnotation mapTransformedTextAnnotationToSource(TextAnnotation ta, StringTransformation st)
ta -
st -
Copyright © 2017. All rights reserved.