public class OntonotesTreebankReader extends AnnotationReader<TextAnnotation>
| Modifier and Type | Field and Description |
|---|---|
| protected String | currentfile: the current file ready to be read. |
| protected int | fileindex: the index of the current file we are looking at. |
| protected ArrayList<File> | filelist: the list of files, compiled during initialization, used to iterate over the parse trees. |
| protected String | homeDirectory: the home directory to traverse. |
| protected String | parseViewName: the name of the resulting view. |
| static String | PENN_TREEBANK_ONTONOTES: the view name we will employ. |
| protected int | treesProduced: the number of trees produced. |

Inherited fields: corpusName, currentAnnotationId, resourceManager

| Constructor and Description |
|---|
| OntonotesTreebankReader(String treebankHome, String language): Reads the specified sections from the Penn Treebank. |

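The field summary says that filelist is compiled during initialization by traversing homeDirectory, and the constructor documentation mentions merged (mrg) files. The following is a minimal sketch of how such a list could be built. It is not the reader's actual code: the class MrgFileListSketch, the methods collectMrgFiles and demo, the ".mrg" suffix filter, and the sorted ordering are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in; NOT the OntonotesTreebankReader implementation.
class MrgFileListSketch {

    // Walks homeDirectory recursively and collects regular files.
    // Assumption: parse files are identified by a ".mrg" suffix.
    static ArrayList<File> collectMrgFiles(String homeDirectory) {
        Path home = new File(homeDirectory).toPath();
        if (!Files.isDirectory(home)) {
            throw new IllegalArgumentException("Not a directory: " + homeDirectory);
        }
        try (Stream<Path> paths = Files.walk(home)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".mrg"))
                    .sorted() // assumption: a stable, path-sorted order
                    .map(Path::toFile)
                    .collect(Collectors.toCollection(ArrayList::new));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small temporary directory tree and returns the names found.
    static ArrayList<String> demo() {
        try {
            Path home = Files.createTempDirectory("treebank");
            Path section = Files.createDirectory(home.resolve("00"));
            Files.createFile(section.resolve("wsj_0001.mrg"));
            Files.createFile(section.resolve("wsj_0002.mrg"));
            Files.createFile(section.resolve("notes.txt")); // filtered out
            ArrayList<String> names = new ArrayList<>();
            for (File f : collectMrgFiles(home.toString())) {
                names.add(f.getName());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Rejecting a non-directory argument with IllegalArgumentException mirrors the exception types declared by the constructor, but which conditions actually trigger them in the real class is not stated in this page.
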
| Modifier and Type | Method and Description |
|---|---|
| String | generateReport(): TODO: generate a human-readable report of annotations read from the source file (plus whatever other relevant statistics the user should know about). |
| boolean | hasNext(): All files found are assumed to be correct; hence, if there is another file, another text annotation will be produced. |
| protected void | initializeReader(): Called by the constructor to perform subclass-specific initialization. |
| static void | main(String[] args): Reads the OntoNotes data from the provided directory and writes the resulting serialized JSON form of the Penn Treebank data to the specified output directory. |
| TextAnnotation | next(): Returns the next annotation object. |

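The hasNext() description states that each remaining file yields one more text annotation. A minimal sketch of that file-list iteration pattern follows. It is not the reader's actual code: FileListIteratorSketch is a hypothetical stand-in that returns file names instead of TextAnnotation objects, and it assumes exactly one item is counted per file, which the real treesProduced counter may not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in; NOT the OntonotesTreebankReader implementation.
// It yields file names in place of TextAnnotation objects.
class FileListIteratorSketch implements Iterator<String> {
    private final List<String> filelist; // compiled once, up front
    private int fileindex = 0;           // index of the next file to hand out
    private String currentfile = null;   // file most recently handed out
    private int treesProduced = 0;       // assumption: one item counted per file

    FileListIteratorSketch(List<String> files) {
        this.filelist = new ArrayList<>(files);
    }

    // Another file in the list means another item will be produced.
    @Override
    public boolean hasNext() {
        return fileindex < filelist.size();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        currentfile = filelist.get(fileindex++);
        treesProduced++;
        return currentfile;
    }

    int getTreesProduced() {
        return treesProduced;
    }

    public static void main(String[] args) {
        FileListIteratorSketch reader =
                new FileListIteratorSketch(Arrays.asList("a.mrg", "b.mrg"));
        while (reader.hasNext()) {
            System.out.println(reader.next());
        }
        System.out.println(reader.getTreesProduced());
    }
}
```

Because hasNext() only checks the index against the list size, a malformed file would surface as a failure in next() rather than being skipped, which matches the stated assumption that all files found are correct.
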
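Methods inherited from class AnnotationReader: iterator, remove, reset

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable: forEach, spliterator

Methods inherited from interface java.util.Iterator: forEachRemaining

Field Detail

public static final String PENN_TREEBANK_ONTONOTES
protected final String homeDirectory
protected String parseViewName
protected ArrayList<File> filelist
protected int fileindex
protected String currentfile
protected int treesProduced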

Constructor Detail

public OntonotesTreebankReader(String treebankHome, String language) throws IllegalArgumentException, IOException
Parameters:
treebankHome - the directory that points to the merged (mrg) files of the WSJ portion
language - the language
annotationFileExtension - the name of the annotation file (not in the constructor signature)
Throws:
IOException
IllegalArgumentException

Method Detail

public boolean hasNext()
hasNext in interface Iterator<TextAnnotation>
hasNext in class AnnotationReader<TextAnnotation>

public TextAnnotation next()
next in interface Iterator<TextAnnotation>
next in class AnnotationReader<TextAnnotation>

public String generateReport()
generateReport in class AnnotationReader<TextAnnotation>

protected void initializeReader()
Description copied from class AnnotationReader: called by the constructor to perform subclass-specific initialization.
initializeReader in class AnnotationReader<TextAnnotation>

public static void main(String[] args) throws IOException
Parameters:
args - command-line arguments specifying the input data directory, the language, and the output directory.
Throws:
IOException

Copyright © 2017. All rights reserved.