public class OntonotesNamedEntityReader extends AnnotationReader<XmlTextAnnotation>
Modifier and Type | Field and Description |
---|---|
protected String | currentfile: The current file ready to be read. |
protected int | fileindex: The index of the current file we are looking at. |
protected ArrayList<File> | filelist: The list of files, compiled during initialization, used to iterate over the parse trees. |
protected String | homeDirectory: The home directory to traverse. |
Fields inherited from class AnnotationReader: corpusName, currentAnnotationId, resourceManager
Constructor and Description |
---|
OntonotesNamedEntityReader(String nerHome, String language): Reads the specified sections from Penn Treebank. |
Modifier and Type | Method and Description |
---|---|
String | generateReport(): TODO: generate a human-readable report of the annotations read from the source file, plus any other statistics relevant to the user. |
boolean | hasNext(): Returns true while another file remains. All files found are assumed to be correct, so each remaining file produces another text annotation. |
protected void | initializeReader(): Called by the constructor to perform subclass-specific initialization. |
static void | main(String[] args): Reads the OntoNotes data from the provided directory and writes the resulting NER view data to the specified output directory in CoNLL column format. |
XmlTextAnnotation | next(): Returns the next annotation object. |
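Methods inherited from class AnnotationReader: iterator, remove, reset
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator
Methods inherited from interface java.util.Iterator: forEachRemaining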
protected final String homeDirectory
protected ArrayList<File> filelist
protected int fileindex
protected String currentfile
public OntonotesNamedEntityReader(String nerHome, String language) throws IllegalArgumentException, IOException
Parameters:
nerHome - The directory that points to the merged (mrg) files of the WSJ portion
language - the language
annotationFileExtension - the name of the annotation file
Throws:
IOException
IllegalArgumentException
public boolean hasNext()
hasNext in interface Iterator<XmlTextAnnotation>
hasNext in class AnnotationReader<XmlTextAnnotation>
public XmlTextAnnotation next()
next in interface Iterator<XmlTextAnnotation>
next in class AnnotationReader<XmlTextAnnotation>
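The interplay of filelist, fileindex, and currentfile with hasNext() and next() can be illustrated with a small stand-in. This is a minimal sketch, not the library's implementation: the class name FileListIteratorSketch is hypothetical, and it returns the file path where the real reader would return a parsed XmlTextAnnotation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the reader's iteration state; not the library's code.
class FileListIteratorSketch implements Iterator<String> {
    private final List<File> filelist;  // compiled once, before iteration starts
    private int fileindex = 0;          // index of the next file to hand out
    private String currentfile = null;  // path of the file most recently handed out

    FileListIteratorSketch(List<File> files) {
        this.filelist = new ArrayList<>(files);
    }

    @Override
    public boolean hasNext() {
        // Every listed file is assumed to be valid, so a remaining file means a remaining item.
        return fileindex < filelist.size();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more files");
        }
        currentfile = filelist.get(fileindex++).getPath();
        // A real reader would parse the file here; the sketch only returns its path.
        return currentfile;
    }

    String currentFile() {
        return currentfile;
    }

    public static void main(String[] args) {
        // The file names are made up for the demonstration.
        List<File> files = Arrays.asList(new File("a.name"), new File("b.name"));
        FileListIteratorSketch it = new FileListIteratorSketch(files);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Because hasNext() only compares an index against the list size, it never touches the file system; any read failure would surface in next().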
public String generateReport()
generateReport in class AnnotationReader<XmlTextAnnotation>
protected void initializeReader()
Description copied from class AnnotationReader: called by constructor to perform subclass-specific initialization.
initializeReader in class AnnotationReader<XmlTextAnnotation>
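Since filelist is described as compiled during initialization by traversing homeDirectory, the following sketch shows one way such a list could be built. It is an assumption-laden illustration, not the reader's actual code: the class FileListBuilderSketch, the collect method, and the ".name" extension used in main are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; the real initializeReader() may work differently.
class FileListBuilderSketch {

    // Recursively collects regular files under homeDirectory whose names end with extension.
    static ArrayList<File> collect(String homeDirectory, String extension) throws IOException {
        Path home = Paths.get(homeDirectory);
        if (!Files.isDirectory(home)) {
            throw new IllegalArgumentException("not a directory: " + homeDirectory);
        }
        // Files.walk keeps directory handles open, so the stream is closed explicitly.
        try (Stream<Path> paths = Files.walk(home)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(extension))
                    .sorted() // fixed order so repeated runs iterate identically
                    .map(Path::toFile)
                    .collect(Collectors.toCollection(ArrayList::new));
        }
    }

    public static void main(String[] args) throws IOException {
        String home = args.length > 0 ? args[0] : ".";
        // ".name" is an assumed example extension, not confirmed by this page.
        for (File f : collect(home, ".name")) {
            System.out.println(f.getPath());
        }
    }
}
```

Rejecting a missing directory with IllegalArgumentException mirrors the exceptions the constructor declares, though the conditions under which the real constructor throws are not stated on this page.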
public static void main(String[] args) throws IOException
Parameters:
args - command line args specify input data directory, language and output directory.
Throws:
IOException
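To give a rough idea of what column-format NER output looks like, here is a simplified sketch that turns tokens and entity spans into one tab-separated line per token with BIO tags. It assumes a two-column layout (token, tag); the layout that main actually writes may contain more columns, and ConllColumnSketch, Span, and toColumns are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified illustration of column output; not the format main() is guaranteed to produce.
class ConllColumnSketch {

    // Hypothetical entity span: tokens in [start, end) carry the given label.
    static final class Span {
        final int start;
        final int end;
        final String label;

        Span(int start, int end, String label) {
            this.start = start;
            this.end = end;
            this.label = label;
        }
    }

    // Produces one "token<TAB>tag" line per token, using B-/I-/O tags.
    static List<String> toColumns(List<String> tokens, List<Span> spans) {
        String[] tags = new String[tokens.size()];
        Arrays.fill(tags, "O"); // tokens outside every span stay "O"
        for (Span s : spans) {
            for (int i = s.start; i < s.end; i++) {
                tags[i] = (i == s.start ? "B-" : "I-") + s.label;
            }
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            lines.add(tokens.get(i) + "\t" + tags[i]);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sentence and spans for the demonstration.
        List<String> tokens = Arrays.asList("John", "Smith", "visited", "Paris");
        List<Span> spans = Arrays.asList(new Span(0, 2, "PERSON"), new Span(3, 4, "GPE"));
        for (String line : toColumns(tokens, spans)) {
            System.out.println(line);
        }
    }
}
```

The sketch assumes spans are non-overlapping and within bounds; a production writer would validate both.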
Copyright © 2017. All rights reserved.