public class POSMikheevCounter extends POSBaseLineCounter
Modifier and Type | Field and Description |
---|---|
protected HashMap<String,TreeMap<String,Integer>> | firstCapitalized: A map for capitalized words appearing first in the sentence. |
protected HashMap<String,TreeMap<String,Integer>> | notFirstCapitalized: A map for capitalized words not appearing first in the sentence. |
Fields inherited from class POSBaseLineCounter: corpusName, table
Constructor and Description |
---|
POSMikheevCounter(String corpusName) |
Modifier and Type | Method and Description |
---|---|
Set<String> | allowableTags(TextAnnotation ta, int tokenId): Returns the set of tags that the given word's suffix has been observed with, or a reasonable default if the suffix has never been observed. |
void | buildTable(String home): Builds a table from either a source corpus file or a source corpus directory by counting the number of times that each suffix-POS association occurs in the source corpus. |
void | count(HashMap<String,TreeMap<String,Integer>> table, String suffix, String tag): Increments the count for the given suffix and tag. |
void | doneLearning(): Runs after all learning is complete. |
void | forget(): Clears out the table to start fresh. |
void | prune(HashMap<String,TreeMap<String,Integer>> table): Prunes the specified table. |
static POSMikheevCounter | read(String json): Reads an instance of the POSMikheevCounter class from JSON format. |
String | tag(int tokenId, TextAnnotation ta): Gives a tag for the given form. |
static String | write(POSMikheevCounter counter): Writes an instance of the POSMikheevCounter class to JSON format. |
Methods inherited from class POSBaseLineCounter: allowableTags, count, getCorpusName, looksLikeNumber, observed, observedCount, write
protected HashMap<String,TreeMap<String,Integer>> firstCapitalized
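
The field descriptions suggest that capitalized words are tracked separately depending on whether they open a sentence. The following is a minimal sketch of that routing, not the class's actual code: the class name `CapitalizedRoutingSketch`, the `observe` helper, and the choice to key each map by the word form are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not depend on POSMikheevCounter.
final class CapitalizedRoutingSketch {
    final HashMap<String, TreeMap<String, Integer>> firstCapitalized = new HashMap<>();
    final HashMap<String, TreeMap<String, Integer>> notFirstCapitalized = new HashMap<>();

    // Records (form, tag) in one of the two maps; lower-case forms are ignored here.
    void observe(String form, String tag, int positionInSentence) {
        if (form.isEmpty() || !Character.isUpperCase(form.charAt(0))) {
            return;
        }
        HashMap<String, TreeMap<String, Integer>> target =
                positionInSentence == 0 ? firstCapitalized : notFirstCapitalized;
        // Assumption: the maps are keyed by the word form itself.
        target.computeIfAbsent(form, k -> new TreeMap<>()).merge(tag, 1, Integer::sum);
    }

    public static void main(String[] args) {
        String[] forms = {"The", "mayor", "of", "London", "spoke"};
        String[] tags = {"DT", "NN", "IN", "NNP", "VBD"};
        CapitalizedRoutingSketch sketch = new CapitalizedRoutingSketch();
        for (int i = 0; i < forms.length; i++) {
            sketch.observe(forms[i], tags[i], i);
        }
        System.out.println(sketch.firstCapitalized);    // {The={DT=1}}
        System.out.println(sketch.notFirstCapitalized); // {London={NNP=1}}
    }
}
```

Keeping the two maps apart makes sense because a sentence-initial capital says little about whether a word is a proper noun, while a capital elsewhere in the sentence is a much stronger signal.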
public POSMikheevCounter(String corpusName)
public void buildTable(String home) throws Exception
Overrides: buildTable in class POSBaseLineCounter
Parameters:
home - file name or directory name of the source corpus
Throws:
Exception
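
Counting suffix-POS associations in a `HashMap<String,TreeMap<String,Integer>>` can be sketched as below. This is an illustrative stand-in rather than the library's implementation: the class `SuffixCountSketch`, the helper `suffixOf`, and the fixed suffix length of 3 are assumptions, and the tagged words are made-up sample data instead of a real corpus.

```java
import java.util.HashMap;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not depend on POSMikheevCounter.
final class SuffixCountSketch {

    // Increments the count stored for (suffix, tag), creating entries on first sight.
    static void count(HashMap<String, TreeMap<String, Integer>> table,
                      String suffix, String tag) {
        table.computeIfAbsent(suffix, k -> new TreeMap<>())
             .merge(tag, 1, Integer::sum);
    }

    // Assumption: the suffix is the last `length` characters of the lower-cased form.
    static String suffixOf(String form, int length) {
        String lower = form.toLowerCase();
        return lower.length() <= length ? lower : lower.substring(lower.length() - length);
    }

    public static void main(String[] args) {
        HashMap<String, TreeMap<String, Integer>> table = new HashMap<>();
        String[][] tagged = {{"running", "VBG"}, {"jumping", "VBG"}, {"morning", "NN"}};
        for (String[] pair : tagged) {
            count(table, suffixOf(pair[0], 3), pair[1]);
        }
        System.out.println(table); // {ing={NN=1, VBG=2}}
    }
}
```

The inner `TreeMap` keeps tags in sorted order, so the per-suffix counts print and iterate deterministically.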
public void count(HashMap<String,TreeMap<String,Integer>> table, String suffix, String tag)
Parameters:
table - The table in which a count should be incremented.
suffix - The suffix.
tag - The POS tag.
public void doneLearning()
public void prune(HashMap<String,TreeMap<String,Integer>> table)
Parameters:
table - The table.
public void forget()
Overrides: forget in class POSBaseLineCounter
public String tag(int tokenId, TextAnnotation ta)
Overrides: tag in class POSBaseLineCounter
public Set<String> allowableTags(TextAnnotation ta, int tokenId)
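
The lookup-with-fallback behavior described for `allowableTags` might look roughly like the sketch below. It is simplified on purpose: the real method takes a `TextAnnotation` and a token id, whereas this hypothetical version takes the suffix string directly, and the contents of `DEFAULT_TAGS` are an assumption, since the documentation only promises "a reasonable default".

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-alone sketch; it does not depend on POSMikheevCounter.
final class AllowableTagsSketch {

    // Assumed fallback; the actual default used by the library is not documented here.
    static final Set<String> DEFAULT_TAGS = new TreeSet<>(Arrays.asList("NN"));

    // Returns the tags seen with the suffix, or the fallback if the suffix is unseen.
    static Set<String> allowableTags(HashMap<String, TreeMap<String, Integer>> table,
                                     String suffix) {
        TreeMap<String, Integer> counts = table.get(suffix);
        if (counts == null || counts.isEmpty()) {
            return DEFAULT_TAGS;
        }
        return new TreeSet<>(counts.keySet());
    }

    public static void main(String[] args) {
        HashMap<String, TreeMap<String, Integer>> table = new HashMap<>();
        TreeMap<String, Integer> ing = new TreeMap<>();
        ing.put("VBG", 2);
        ing.put("NN", 1);
        table.put("ing", ing);

        System.out.println(allowableTags(table, "ing")); // [NN, VBG]
        System.out.println(allowableTags(table, "xyz")); // [NN]
    }
}
```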
public static String write(POSMikheevCounter counter)
public static POSMikheevCounter read(String json)
Copyright © 2017. All rights reserved.