public class MultilingualTokenizeTextToColumn extends Object
Constructor and Description |
---|
MultilingualTokenizeTextToColumn(String lang) |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
void | processDir(String corpusName, String inDir, String outDir) Process a directory containing plain text files (possibly in subdirectories). |
void | processFile(String corpus, File in, String out) Given an input containing plain text, tokenize and write to named output file. |
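The summary says processDir handles plain text files "possibly in subdirectories". The sketch below shows one way such a recursive traversal could be written with the JDK's `Files.walk`. It is not this class's implementation: `DirWalkSketch` and `mirrorPaths` are hypothetical names, and mirroring the input layout under the output directory is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of MultilingualTokenizeTextToColumn.
class DirWalkSketch {
    /** Returns one output path per regular file under inDir, keeping relative paths. */
    static List<Path> mirrorPaths(Path inDir, Path outDir) throws IOException {
        // Files.walk descends into subdirectories; the stream must be closed.
        try (Stream<Path> walk = Files.walk(inDir)) {
            return walk.filter(Files::isRegularFile)
                       // Assumption: each output keeps the input's relative path.
                       .map(p -> outDir.resolve(inDir.relativize(p)))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway tree with one nested file.
        Path in = Files.createTempDirectory("in");
        Files.createDirectories(in.resolve("sub"));
        Files.write(in.resolve("a.txt"), "x".getBytes());
        Files.write(in.resolve("sub").resolve("b.txt"), "y".getBytes());
        for (Path p : mirrorPaths(in, Paths.get("out"))) {
            System.out.println(p);
        }
    }
}
```

Each returned path could then be passed to a per-file step such as processFile. Whether the real method mirrors the directory layout or flattens it is not stated in this page.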
public MultilingualTokenizeTextToColumn(String lang)
public static void main(String[] args)
public void processFile(String corpus, File in, String out) throws IOException
corpus - name of corpus
in - file to tokenize
out - output file for tokenized text
IOException
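To make the idea of tokenizing plain text into a column file concrete, here is a minimal sketch using the JDK's locale-aware `java.text.BreakIterator`. It rests on several assumptions: that "column" output means one token per line, that a language tag such as "en" selects the tokenization rules, and that no `corpus` argument is needed. `ColumnTokenizeSketch`, `tokenize`, and `tokenizeFile` are hypothetical names; the tokenizer this class actually uses is not documented here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, not the real MultilingualTokenizeTextToColumn.
class ColumnTokenizeSketch {
    /** Splits text at locale-aware word boundaries, dropping whitespace-only spans. */
    static List<String> tokenize(String text, String lang) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag(lang));
        words.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String span = text.substring(start, end);
            if (!span.trim().isEmpty()) {
                tokens.add(span);
            }
        }
        return tokens;
    }

    /** Reads a UTF-8 text file and writes its tokens one per line (assumed layout). */
    static void tokenizeFile(Path in, Path out, String lang) throws IOException {
        String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        Files.write(out, tokenize(text, lang), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("sample", ".txt");
        Path out = Files.createTempFile("sample", ".col");
        Files.write(in, "Hello world".getBytes(StandardCharsets.UTF_8));
        tokenizeFile(in, out, "en");
        Files.readAllLines(out, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Column formats used for tagging tasks often carry extra fields per line or blank lines between sentences. This sketch omits both because the page does not describe the output layout.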
public void processDir(String corpusName, String inDir, String outDir) throws IOException
corpusName - name of corpus
inDir - directory of files to process
outDir - output directory for processed files
IOException - if the input or output directories are invalid

Copyright © 2017. All rights reserved.