The SNoW (Sparse Network of Winnows) learning architecture is a sparse network of linear functions over a pre-defined or incrementally learned feature space. It is specifically tailored for learning in domains in which the potential number of features taking part in decisions is very large but may be unknown a priori. Among the characteristics of this learning framework are its sparsely connected units, its data-driven allocation of features and links, its computational dependence on the number of active features rather than the total number of features, and its use of a feature-efficient update rule.
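To make the dependence on active features concrete, the following minimal sketch (illustrative Python, not part of the SNoW distribution; all names and values are hypothetical) computes a target node's activation by touching only the features that are active in a given example:

    # Minimal sketch of a sparse linear target node (illustrative only).
    def activation(weights, active_features):
        """Activation of one target node.

        weights: dict mapping feature id -> weight (the links allocated so far).
        active_features: dict mapping feature id -> strength (1.0 for Boolean).

        Cost is proportional to the number of active features in the example,
        not to the size of the full feature space.
        """
        return sum(weights.get(f, 0.0) * v for f, v in active_features.items())

    example = {1003: 1.0, 1005: 1.0, 2175: 1.0}   # three active Boolean features
    w = {1003: 0.8, 2175: 1.9}                    # only linked features carry weights
    print(activation(w, example))                 # 2.7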
SNoW has been used successfully in a variety of large-scale learning tasks in domains such as natural language [Roth, 1998; Golding and Roth, 1999; Roth and Zelenko, 1998; Munoz et al., 1999; Punyakanok and Roth, 2001; Shen and Joshi, 2003], bioinformatics [Chuang and Roth, 2001], and visual processing [Roth et al., 2000; Yang et al., 2000; Agarwal and Roth, 2002].
Several update rules may be used within SNoW: classical Winnow and Perceptron, variations of regularized Winnow and regularized Perceptron, regression algorithms based on gradient descent, and the naive Bayes algorithm.
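Of these, Winnow illustrates what feature-efficient means: the update is mistake-driven and multiplicative, promoting the weights of active features by a factor alpha > 1 after a mistake on a positive example and demoting them by a factor beta < 1 after a mistake on a negative example, leaving inactive features untouched. The sketch below is illustrative only; the function name, the link-allocation policy, and the default parameter values are assumptions chosen for illustration:

    def winnow_update(weights, active_features, label, theta=1.0,
                      alpha=1.5, beta=0.85, default_weight=1.0):
        """One mistake-driven Winnow update for a single target node.

        weights: dict feature id -> weight; a link is allocated with
        default_weight the first time a feature is seen active (an
        assumed allocation policy, for illustration).
        label: True if this example belongs to the target.
        """
        act = sum(weights.setdefault(f, default_weight) * v
                  for f, v in active_features.items())
        predicted = act >= theta
        if predicted == label:
            return                      # no mistake, no update
        factor = alpha if label else beta
        for f in active_features:       # multiplicative, active features only
            weights[f] *= factor

    w = {}
    winnow_update(w, {1003: 1.0, 1005: 1.0}, label=False)  # false positive: demote
    print(w)                                               # {1003: 0.85, 1005: 0.85}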
SNoW can be thought of as a general-purpose multi-class classifier, and this release also includes a true multi-class training capability in addition to the standard one-vs-all training policy. Beyond the predicted class label, SNoW can assign a prediction confidence value to each label, calculated as a function of the distance between the target node's activation and its threshold.
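The confidence function is not specified at this point in the manual; one plausible reading, assumed here purely for illustration, is to squash the distance between the activation and the threshold through a sigmoid and compare the resulting scores across targets:

    import math

    def confidence(activation, theta=1.0):
        """Map the distance between activation and threshold to (0, 1).

        The sigmoid is one plausible squashing function, assumed here
        for illustration; SNoW's actual confidence computation is
        described later in this manual.
        """
        return 1.0 / (1.0 + math.exp(-(activation - theta)))

    activations = {"target0": 2.7, "target1": 0.4}
    scores = {t: confidence(a) for t, a in activations.items()}
    print(max(scores, key=scores.get))                     # target0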
SNoW should also be thought of as a learning architecture framework within which the user designs a specific architecture. At a minimum, this means defining the number of class representations to be learned; it may also mean specifying many more parameters of the architecture, including update rules and their parameters, regularization parameters, and training policies.
In SNoW's documentation, the user-defined architecture and all data accumulated therein are referred to collectively as the network. In the network, class labels are called targets, and they are learned as sparse linear functions over the input features. Sparse in this context means that each target may be learned as a function of a (small) subset of all features in the feature space, in a data-driven way that is partially controlled by user-set parameters.
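As a toy illustration of this data-driven sparseness (the sketch below is not SNoW's code, and the fixed allocation policy it uses is an assumption), a target can acquire a link to a feature only when that feature is active in an example of that target, so each target ends up as a function of its own subset of the feature space:

    from collections import defaultdict

    class ToyNetwork:
        """Toy illustration of data-driven link allocation (not SNoW's code).

        Each target keeps weights only for features it has been linked to,
        so no target is a function of the entire feature space.
        """
        def __init__(self, default_weight=1.0):
            self.targets = defaultdict(dict)   # target -> {feature id: weight}
            self.default_weight = default_weight

        def see(self, target, active_features):
            for f in active_features:          # allocate missing links only
                self.targets[target].setdefault(f, self.default_weight)

    net = ToyNetwork()
    net.see("sports", [1003, 1005])
    net.see("politics", [1005, 2175])
    print(sorted(net.targets["sports"]))       # [1003, 1005]; no link to 2175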
When viewing SNoW simply as a classification system, the typical input would be a collection of labeled examples, consisting of Boolean or real valued features, in a format specified in Chapter 6. The following section provides a slightly more abstract view that may be useful for people in the stage of modeling their problem as a learning problem.
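The authoritative description of the input format is Chapter 6. Purely for flavor, and subject to that chapter, a SNoW-style sparse example is typically written as a comma-separated list of the ids of its active features, with the target label appearing as one of the ids and a colon terminating the example; the two lines below are illustrative only:

    0, 1003, 1005, 2175:
    1, 1004, 3117: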