public abstract class AbstractIncrementalCorpusReader<T> extends AnnotationReader<T>

| Modifier and Type | Field and Description |
|---|---|
| protected List<List<Path>> | fileList: contains pointers to the files comprising the corpus. |
| protected String | sourceDirectory: root directory of the corpus. |

Inherited fields: corpusName, currentAnnotationId, resourceManager

| Constructor and Description |
|---|
| AbstractIncrementalCorpusReader(ResourceManager rm): the ResourceManager must specify the fields CorpusReaderConfigurator.CORPUS_NAME and CorpusReaderConfigurator.CORPUS_DIRECTORY, plus whatever is required by the derived class for initializeReader(). |

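
Because the base class constructor invokes initializeReader(), the hook runs before any field initializer or constructor body of the derived class. The sketch below illustrates that ordering with self-contained stand-in classes. BaseReaderSketch, DerivedReaderSketch, and their members are hypothetical names chosen for illustration, and a plain String replaces the ResourceManager, so this is not the library's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the base reader: its constructor calls an overridable hook.
abstract class BaseReaderSketch {
    protected final String corpusName;

    BaseReaderSketch(String corpusName) {
        this.corpusName = corpusName;
        // Runs before any field initializer or constructor body of the subclass.
        initializeReader();
    }

    protected abstract void initializeReader();
}

// Hypothetical derived reader showing which assignments survive construction.
class DerivedReaderSketch extends BaseReaderSketch {
    // No initializer here, so the value assigned by the hook is kept.
    private List<String> suffixes;

    // This initializer runs after super(...) returns and replaces the hook's value.
    private String label = "set by field initializer";

    DerivedReaderSketch(String corpusName) {
        super(corpusName);
    }

    @Override
    protected void initializeReader() {
        suffixes = new ArrayList<>();
        suffixes.add(".txt");
        label = "set by initializeReader";
    }

    List<String> getSuffixes() {
        return suffixes;
    }

    String getLabel() {
        return label;
    }
}
```

Under these assumptions, state that the hook must establish is best declared without a field initializer, since Java runs subclass field initializers only after the superclass constructor has returned.
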
| Modifier and Type | Method and Description |
|---|---|
| String | generateReport(): generate a human-readable report of annotations read from the source file (plus whatever other relevant statistics the user should know about). |
| abstract List<T> | getAnnotationsFromFile(List<Path> corpusFileListEntry): given an entry from the corpus file list generated by getFileListing(), parse its contents and get zero or more TextAnnotation objects. |
| abstract List<List<Path>> | getFileListing(): generate a list of files comprising the corpus. |
| String | getSourceDirectory() |
| boolean | hasNext(): is there another annotation object to return? |
| protected void | initializeReader(): this method is called by the base class constructor, so all subclass-specific object initialization must be done here. |
| T | next(): returns the next element in the iteration. |
| void | reset(): override this to conform to whatever the derived class's state mechanism requires. |
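
The methods listed above suggest an incremental pattern: a file listing is produced once, and each entry is parsed only when hasNext() or next() needs more annotations. The sketch below shows one way such a loop could work, including entries that yield zero results. It is a self-contained illustration under stated assumptions, not the library's implementation: IncrementalReaderSketch, WordReaderSketch, and parse() are hypothetical names, and plain strings stand in for the List<Path> entries and TextAnnotation objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an incremental reader: entries are parsed lazily, one at a time.
abstract class IncrementalReaderSketch<E, T> implements Iterator<T> {
    private final List<E> entries;                        // analogous to the corpus file list
    private int nextEntry = 0;                            // index of the next unparsed entry
    private final Deque<T> buffer = new ArrayDeque<>();   // results not yet returned

    IncrementalReaderSketch(List<E> entries) {
        this.entries = entries;
    }

    // Analogous to getAnnotationsFromFile: may return an empty list.
    protected abstract List<T> parse(E entry);

    @Override
    public boolean hasNext() {
        // Keep parsing until some entry yields a result or the listing is exhausted.
        while (buffer.isEmpty() && nextEntry < entries.size()) {
            buffer.addAll(parse(entries.get(nextEntry++)));
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.removeFirst();
    }

    // Rewind to the first entry and discard any buffered results.
    public void reset() {
        nextEntry = 0;
        buffer.clear();
    }
}

// Hypothetical concrete reader: each "file" is a string, each "annotation" is a word.
class WordReaderSketch extends IncrementalReaderSketch<String, String> {
    WordReaderSketch(List<String> entries) {
        super(entries);
    }

    @Override
    protected List<String> parse(String entry) {
        List<String> words = new ArrayList<>();
        for (String w : entry.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }
}
```

Buffering the parsed results lets a single entry contribute several elements to the iteration, while the loop in hasNext() skips entries that produce none.
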
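
Methods inherited from class AnnotationReader: iterator, remove
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator
Methods inherited from interface java.util.Iterator: forEachRemaining

protected List<List<Path>> fileList
Contains pointers to the files comprising the corpus.

protected String sourceDirectory
Root directory of the corpus.

public AbstractIncrementalCorpusReader(ResourceManager rm) throws Exception
ResourceManager must specify the fields CorpusReaderConfigurator.CORPUS_NAME and CorpusReaderConfigurator.CORPUS_DIRECTORY, plus whatever is required by the derived class for initializeReader().
Parameters: rm - ResourceManager
Throws: Exception

protected void initializeReader()
This method is called by the base class constructor, so all subclass-specific object initialization must be done here.
initializeReader in class AnnotationReader<T>

public void reset()
Description copied from class AnnotationReader: override this to conform to whatever the derived class's state mechanism requires.
reset in interface IResetableIterator<T>
reset in class AnnotationReader<T>

public String getSourceDirectory()

public boolean hasNext()
Description copied from class AnnotationReader: is there another annotation object to return?

public T next()
Returns the next element in the iteration.
next in interface Iterator<T>
next in class AnnotationReader<T>
Throws: NoSuchElementException - if the iteration has no more elements

public abstract List<List<Path>> getFileListing() throws IOException
Generate a list of files comprising the corpus.
Throws: IOException

public abstract List<T> getAnnotationsFromFile(List<Path> corpusFileListEntry) throws Exception
Given an entry from the corpus file list generated by getFileListing(), parse its contents and get zero or more TextAnnotation objects.
Parameters: corpusFileListEntry - corpus file containing content to be processed
Throws: Exception

public String generateReport()
Generate a human-readable report of annotations read from the source file (plus whatever other relevant statistics the user should know about).
generateReport in class AnnotationReader<T>

Copyright © 2017. All rights reserved.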