public class POSBaseLineCounter extends Object
Modifier and Type | Field and Description |
---|---|
protected String | corpusName: The name of the corpus used as the source. |
protected HashMap<String,TreeMap<String,Integer>> | table: This map associates forms with maps that associate POS tags with counts. |
Constructor and Description |
---|
POSBaseLineCounter(String corpusName) |
Modifier and Type | Method and Description |
---|---|
Set<String> | allowableTags(String form): Returns the set of tags that the given form has been observed with. |
void | buildTable(String home): Builds the table from a source corpus file or a source corpus directory by counting the number of times each form-POS association appears in the source corpus. |
void | count(String form, String tag): Increments the count of the given form-POS association. |
void | forget(): Clears out the table to start fresh. |
String | getCorpusName(): Returns the name of the source corpus. |
boolean | looksLikeNumber(String form): Determines whether the input form looks like a number of some sort. |
boolean | observed(String form): Indicates whether the input form was observed in the source corpus. |
int | observedCount(String form): Returns the number of times the given form has been observed. |
static POSBaseLineCounter | read(String json): Reads an instance of POSBaseLineCounter from JSON format. |
String | tag(int tokenId, TextAnnotation ta): Returns a tag for the form of the given token. |
static String | write(POSBaseLineCounter counter): Writes an instance of POSBaseLineCounter to JSON format. |
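The looksLikeNumber rule summarized above can be illustrated with a minimal sketch. It assumes the rule means "only digits and the characters `.`, `,`, `-`, with at least one digit"; the class name `NumberLikeCheck` is hypothetical and this is not the library's own code.

```java
// Hypothetical helper illustrating the documented rule; not the library's code.
class NumberLikeCheck {
    // Returns true iff every character is a digit or one of '.', ',', '-'
    // and at least one character is a digit.
    static boolean looksLikeNumber(String form) {
        boolean sawDigit = false;
        for (int i = 0; i < form.length(); i++) {
            char c = form.charAt(i);
            if (Character.isDigit(c)) {
                sawDigit = true;              // Character.isDigit also accepts non-ASCII digits
            } else if (c != '.' && c != ',' && c != '-') {
                return false;                 // any other character disqualifies the form
            }
        }
        return sawDigit;                      // "-" or "" alone is not a number
    }
}
```

Under these assumptions, `NumberLikeCheck.looksLikeNumber("1,234.5")` is true, while `"-"` and `"12a"` are rejected.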
protected HashMap<String,TreeMap<String,Integer>> table
protected final String corpusName
public POSBaseLineCounter(String corpusName)
public void buildTable(String home) throws Exception
Parameters: home - file name or directory name of the source corpus
Throws: Exception
public void count(String form, String tag)
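As a rough illustration of how count and the query methods might operate on the nested form-to-tag-count map, here is a minimal self-contained sketch. The class `FormTagTable` is a hypothetical stand-in, and returning 0 or an empty set for unseen forms is an assumption, not documented behavior.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for the counting table; not the library's actual code.
class FormTagTable {
    // form -> (POS tag -> number of times the pair was counted)
    private final HashMap<String, TreeMap<String, Integer>> table = new HashMap<>();

    // Increment the count of one form-POS association.
    void count(String form, String tag) {
        table.computeIfAbsent(form, k -> new TreeMap<>()).merge(tag, 1, Integer::sum);
    }

    // True if at least one tag has been counted for this form.
    boolean observed(String form) {
        return table.containsKey(form);
    }

    // Total observations of the form, summed over all of its tags (assumed 0 if unseen).
    int observedCount(String form) {
        TreeMap<String, Integer> counts = table.get(form);
        if (counts == null) return 0;
        int total = 0;
        for (int c : counts.values()) total += c;
        return total;
    }

    // Tags counted with the form (assumed empty if unseen).
    Set<String> allowableTags(String form) {
        TreeMap<String, Integer> counts = table.get(form);
        return counts == null ? Collections.<String>emptySet() : counts.keySet();
    }

    // Drop all counts.
    void forget() {
        table.clear();
    }
}
```

Using a `TreeMap` for the inner map keeps each form's tags in sorted order, which makes iteration over the counts deterministic.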
public void forget()
public String tag(int tokenId, TextAnnotation ta)
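The page does not say how tag chooses its answer. A common baseline, assumed here purely for illustration, is to return the tag most frequently counted for the form. The sketch below works on one form's tag-count map; the class `MostFrequentTag`, the `fallback` parameter, and the tie-breaking rule are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical most-frequent-tag picker; an assumed baseline, not the library's code.
class MostFrequentTag {
    // Returns the tag with the highest count, or fallback when counts is null or empty.
    // Ties go to the tag that sorts first, because TreeMap iterates in key order
    // and only a strictly larger count replaces the current best.
    static String pick(TreeMap<String, Integer> counts, String fallback) {
        if (counts == null || counts.isEmpty()) return fallback;
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

How unseen forms are actually handled by the class is not documented here, so the fallback is left to the caller.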
public boolean looksLikeNumber(String form)
Parameters: form - The form to check.
Returns: true iff the form consists only of digits and characters in ".,-" and contains at least one digit.
public boolean observed(String form)
Parameters: form - The form to look up.
Returns: true if this learner contains statistics for the input form.
public int observedCount(String form)
public Set<String> allowableTags(String form)
Parameters: form - The form to look up.
public String getCorpusName()
public static String write(POSBaseLineCounter counter)
public static POSBaseLineCounter read(String json)
Copyright © 2017. All rights reserved.