edu.brandeis.cs.steele.wn
Class FileBackedDictionary

java.lang.Object
  extended by edu.brandeis.cs.steele.wn.FileBackedDictionary
All Implemented Interfaces:
DictionaryDatabase

public class FileBackedDictionary
extends java.lang.Object
implements DictionaryDatabase

A DictionaryDatabase that retrieves objects from the text files in the WordNet distribution directory. A FileBackedDictionary has an entity cache, which resolves multiple temporally contiguous lookups of the same entity to the same object. For example, successive calls to lookupIndexWord with the same parameters return the same value (== as well as equals), as does traversal of two Pointers that share the same target. The current implementation uses an LRU cache, so two different objects can represent the same entity if their retrieval is separated by enough other database operations to evict the first. The LRU cache will be replaced by a cache based on WeakHashMap, once JDK 1.2 becomes more widely available.
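
The identity behavior described above can be illustrated with a small stand-alone sketch. It is not the library's Cache or LRUCache class: LruEntityCacheSketch, its lookup helper, and the capacity of 2 used in the note below are hypothetical, and the sketch builds on java.util.LinkedHashMap in access order, which needs JDK 1.4 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an entity cache; NOT the library's LRUCache.
final class LruEntityCacheSketch<K, V> {
    private final Map<K, V> entries;

    LruEntityCacheSketch(final int capacity) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the capacity is exceeded.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    V get(K key) {
        return entries.get(key);
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    // Simulates a database lookup: reuse the cached entity on a hit,
    // otherwise create a fresh object (standing in for parsing a file record).
    static Object lookup(LruEntityCacheSketch<String, Object> cache, String key) {
        Object entity = cache.get(key);
        if (entity == null) {
            entity = new Object();
            cache.put(key, entity);
        }
        return entity;
    }
}
```

With a capacity of 2, two back-to-back calls to lookup for "dog" return the same object. If lookups for two other keys intervene, "dog" is evicted and the next lookup creates a new object that is not == to the first.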

Author:
Oliver Steele, steele@cs.brandeis.edu
See Also:
Cache, LRUCache

Nested Class Summary
protected  class FileBackedDictionary.DatabaseKey
           
 
Field Summary
protected  FileManagerInterface db
           
protected  int DEFAULT_CACHE_CAPACITY
           
protected  Cache entityCache
           
protected static java.lang.String[] POS_FILENAME_ROOTS
           
protected static POS[] POS_KEYS
           
 
Constructor Summary
FileBackedDictionary()
          Construct a dictionary backed by a set of files contained in the default WN search directory.
FileBackedDictionary(FileManagerInterface fileManager)
          Construct a DictionaryDatabase that retrieves file data from fileManager.
FileBackedDictionary(java.lang.String searchDirectory)
          Construct a dictionary backed by a set of files contained in searchDirectory.
 
Method Summary
protected static java.lang.String getDatabaseSuffixName(POS pos)
           
protected static java.lang.String getDataFilename(POS pos)
           
protected static java.lang.String getExceptionsFilename(POS pos)
           
protected static java.lang.String getIndexFilename(POS pos)
           
protected  IndexWord getIndexWordAt(POS pos, long offset)
           
 Synset getSynsetAt(POS pos, long offset)
           
protected  Synset getSynsetAt(POS pos, long offset, java.lang.String line)
           
 java.lang.String lookupBaseForm(POS pos, java.lang.String derivation)
          Return the base form of an exceptional derivation, if an entry for it exists in the database.
 IndexWord lookupIndexWord(POS pos, java.lang.String string)
          Look up a word in the database.
 java.util.Enumeration searchIndexWords(POS pos, java.lang.String substring)
          Return an enumeration of all the IndexWords whose lemmas contain substring as a substring.
 void setEntityCache(Cache cache)
          Set the dictionary's entity cache.
 java.util.Enumeration synsets(POS pos)
          Return an enumeration over all the Synsets in the database.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

db

protected final FileManagerInterface db

DEFAULT_CACHE_CAPACITY

protected final int DEFAULT_CACHE_CAPACITY
See Also:
Constant Field Values

entityCache

protected Cache entityCache

POS_KEYS

protected static final POS[] POS_KEYS

POS_FILENAME_ROOTS

protected static final java.lang.String[] POS_FILENAME_ROOTS

Constructor Detail

FileBackedDictionary

public FileBackedDictionary(FileManagerInterface fileManager)
Construct a DictionaryDatabase that retrieves file data from fileManager. A client can use this to create a DictionaryDatabase backed by a RemoteFileManager.

See Also:
RemoteFileManager

FileBackedDictionary

public FileBackedDictionary()
Construct a dictionary backed by a set of files contained in the default WN search directory. See FileManager for a description of the location of the default search directory.


FileBackedDictionary

public FileBackedDictionary(java.lang.String searchDirectory)
Construct a dictionary backed by a set of files contained in searchDirectory.

Method Detail

setEntityCache

public void setEntityCache(Cache cache)
Set the dictionary's entity cache.


getDatabaseSuffixName

protected static java.lang.String getDatabaseSuffixName(POS pos)

getDataFilename

protected static java.lang.String getDataFilename(POS pos)

getIndexFilename

protected static java.lang.String getIndexFilename(POS pos)

getExceptionsFilename

protected static java.lang.String getExceptionsFilename(POS pos)

getIndexWordAt

protected IndexWord getIndexWordAt(POS pos,
                                   long offset)

getSynsetAt

protected Synset getSynsetAt(POS pos,
                             long offset,
                             java.lang.String line)

getSynsetAt

public Synset getSynsetAt(POS pos,
                          long offset)

lookupIndexWord

public IndexWord lookupIndexWord(POS pos,
                                 java.lang.String string)
Description copied from interface: DictionaryDatabase
Look up a word in the database. The search is case-independent, and phrases are separated by spaces ("look up", not "look_up").

Specified by:
lookupIndexWord in interface DictionaryDatabase
Parameters:
pos - The part-of-speech.
string - The orthographic representation of the word.
Returns:
An IndexWord representing the word, or null if no such entry exists.
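
The WordNet index files store lemmas in lower case, with underscores joining the words of a phrase, while callers pass spaces. The sketch below shows one way that mapping could be done. LemmaKeySketch and both of its methods are hypothetical names, and whether FileBackedDictionary normalizes its argument exactly this way is an assumption.

```java
import java.util.Locale;

// Hypothetical helper; not part of the edu.brandeis.cs.steele.wn API.
final class LemmaKeySketch {
    private LemmaKeySketch() {}

    // Maps caller-facing text such as "Look Up" to an index-style key "look_up".
    static String toIndexKey(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replace(' ', '_');
    }

    // Reverse direction for display: "look_up" becomes "look up".
    static String toDisplayForm(String key) {
        return key.replace('_', ' ');
    }
}
```

Under these assumptions, a caller would keep passing "look up" to lookupIndexWord, and a conversion of this kind would stay inside the dictionary implementation.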

lookupBaseForm

public java.lang.String lookupBaseForm(POS pos,
                                       java.lang.String derivation)
Description copied from interface: DictionaryDatabase
Return the base form of an exceptional derivation, if an entry for it exists in the database.

Specified by:
lookupBaseForm in interface DictionaryDatabase
Parameters:
pos - The part-of-speech.
derivation - The inflected form of the word.
Returns:
The uninflected word, or null if no exception entry exists.
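
As a rough illustration of the contract above, the following sketch resolves exceptional forms from lines in the style of the WordNet exception lists, where each line holds an inflected form followed by one or more base forms. ExceptionListSketch is a hypothetical class. It loads the lines into an in-memory map instead of searching the file as a file-backed implementation would, and both the lower-casing of the argument and the choice of the first base form are assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for an exception-list lookup; not the library's code.
final class ExceptionListSketch {
    private final Map<String, String> baseForms = new HashMap<String, String>();

    // Each line is expected to look like "geese goose": inflected form, then base form(s).
    ExceptionListSketch(String[] lines) {
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 2) {
                baseForms.put(fields[0], fields[1]); // assumption: keep the first base form only
            }
        }
    }

    // Returns the base form, or null if no exception entry exists.
    String lookupBaseForm(String derivation) {
        return baseForms.get(derivation.toLowerCase(Locale.ROOT));
    }
}
```

A regular form such as "dogs" has no entry in an exception list, so a null result means "not an exceptional derivation", not "unknown word".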

searchIndexWords

public java.util.Enumeration searchIndexWords(POS pos,
                                              java.lang.String substring)
Description copied from interface: DictionaryDatabase
Return an enumeration of all the IndexWords whose lemmas contain substring as a substring.

Specified by:
searchIndexWords in interface DictionaryDatabase
Parameters:
pos - The part-of-speech.
substring - The substring to search for in each lemma.
Returns:
An enumeration of IndexWords.
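
The sketch below shows the general shape of a substring search that hands its results back as a java.util.Enumeration. SubstringSearchSketch is a hypothetical class that uses plain strings in place of IndexWords and an in-memory list in place of the index file. Its matching is case-sensitive, whereas the case handling of searchIndexWords is not specified here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical stand-in; strings take the place of IndexWord objects.
final class SubstringSearchSketch {
    private SubstringSearchSketch() {}

    // Collects every lemma containing the substring, preserving input order.
    static Enumeration<String> search(List<String> lemmas, String substring) {
        List<String> matches = new ArrayList<String>();
        for (String lemma : lemmas) {
            if (lemma.contains(substring)) {
                matches.add(lemma);
            }
        }
        return Collections.enumeration(matches);
    }
}
```

A caller consumes the result with the usual hasMoreElements() and nextElement() loop, or copies it into a list with Collections.list.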

synsets

public java.util.Enumeration synsets(POS pos)
Description copied from interface: DictionaryDatabase
Return an enumeration over all the Synsets in the database.

Specified by:
synsets in interface DictionaryDatabase
Parameters:
pos - The part-of-speech.
Returns:
An enumeration of Synsets.