public class SupportVectorMachine extends Learner
A wrapper for the liblinear library, which supports support vector machine classification. That library must be downloaded separately and placed on your CLASSPATH for this class to work correctly. This class can perform both binary classification and multi-class classification.
It is assumed that Learner.labeler is a single discrete classifier that produces the same feature for every example object. Assertions will produce error messages if this assumption does not hold.
When calling this algorithm in a with clause inside an LBJava source file, there is no need to specify the rounds clause. At runtime, calling Learner.learn(Object) merely performs feature extraction and stores an indexed representation of the example vector in memory. The learning algorithm executes when doneLearning() is called. This call also frees the memory in which the example vectors are stored. Thus, subsequent calls to Learner.learn(Object) and doneLearning() will discard the previous hypothesis and learn an entirely new one.
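The buffer-then-train lifecycle described above can be sketched without LBJava or liblinear. BufferingLearnerSketch and its members are hypothetical names introduced only for illustration, and the "training" step is a placeholder that records how many examples were seen. It is not the actual implementation of this class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lifecycle described above; NOT the LBJava class.
class BufferingLearnerSketch {
    private List<int[]> bufferedExamples = new ArrayList<>();
    private int trainedOn = -1; // -1 means no hypothesis has been learned yet

    // Mirrors learn(Object): only stores the indexed example, no training happens.
    void learn(int[] featureIndices) {
        bufferedExamples.add(featureIndices.clone());
    }

    // Mirrors doneLearning(): trains on everything buffered, then frees the buffer.
    void doneLearning() {
        trainedOn = bufferedExamples.size();   // placeholder for the real training call
        bufferedExamples = new ArrayList<>();  // release the stored example vectors
    }

    int bufferedCount() { return bufferedExamples.size(); }

    int trainedOn() { return trainedOn; }

    public static void main(String[] args) {
        BufferingLearnerSketch learner = new BufferingLearnerSketch();
        learner.learn(new int[] {0, 2});
        learner.learn(new int[] {1});
        learner.doneLearning();
        System.out.println(learner.trainedOn()); // prints 2

        // A second learn/doneLearning cycle replaces the earlier hypothesis.
        learner.learn(new int[] {3});
        learner.doneLearning();
        System.out.println(learner.trainedOn()); // prints 1
    }
}
```

In this sketch the second cycle sees only the one new example, which matches the statement that a new hypothesis is learned from scratch.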
liblinear performs binary classification (as opposed to 1-vs.-all) whenever the solver type is not MCSVM_CS and exactly two class labels are observed in the training data.
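The rule above reduces to a two-part condition. The following is a minimal sketch of that condition only. BinaryModeSketch and usesBinaryClassification are hypothetical names, not part of LBJava or liblinear.

```java
// Hypothetical helper restating the rule above; not an LBJava or liblinear API.
class BinaryModeSketch {
    // True when the solver is not MCSVM_CS and exactly two labels were observed.
    static boolean usesBinaryClassification(String solverType, int observedLabels) {
        return !"MCSVM_CS".equals(solverType) && observedLabels == 2;
    }

    public static void main(String[] args) {
        System.out.println(usesBinaryClassification("L2LOSS_SVM", 2)); // prints true
        System.out.println(usesBinaryClassification("MCSVM_CS", 2));   // prints false
        System.out.println(usesBinaryClassification("L2LOSS_SVM", 3)); // prints false
    }
}
```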
This algorithm's user-configurable parameters are stored in member fields of this class. They may be set via either a constructor that names each parameter explicitly or a constructor that takes an instance of Parameters as input. The documentation in each member field in this class indicates the default value of the associated parameter when using the former type of constructor. The documentation of the associated member field in the Parameters class indicates the default value of the parameter when using the latter type of constructor.
Modifier and Type | Class and Description |
---|---|
static class | SupportVectorMachine.Parameters - A container for all of SupportVectorMachine's configurable parameters. |
Modifier and Type | Field and Description |
---|---|
protected edu.illinois.cs.cogcomp.core.datastructures.vectors.OVector | allExamples - The array of example vectors. |
protected edu.illinois.cs.cogcomp.core.datastructures.vectors.IVector | allLabels - The array of example labels. |
protected String[] | allowableValues - The label producing classifier's allowable values. |
protected double | bias |
protected int | biasFeatures - The number of bias features; there are either 0 or 1 of them. |
protected double | C - The cost parameter C; default defaultC. |
protected boolean | conjunctiveLabels - Whether or not this learner's labeler produces conjunctive features. |
static double | defaultBias - Default for bias. |
static double | defaultC - Default for C. |
static double | defaultEpsilon - Default for epsilon. |
static String | defaultSolverType - Default for solverType. |
protected boolean | displayLL - Controls if liblinear-related messages are output. |
protected double | epsilon - The tolerance of termination criterion; default defaultEpsilon. |
protected Lexicon | newLabelLexicon - Created during doneLearning() in case the training examples observed by learn(int[],double[],int[],double[]) are only a subset of a larger, pre-extracted set. |
protected int | numClasses - The number of unique class labels seen during training. |
protected int | numFeatures - The number of unique features seen during training. |
protected String | solverType - The type of solver; default defaultSolverType unless there are more than 2 labels observed in the training data, in which case "MCSVM_CS" becomes the default. |
protected double[] | weights - An array of weights representing the weight vector learned after training with liblinear. |
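The table above leaves the relationship between bias and biasFeatures unstated. In liblinear itself, a non-negative bias appends one constant feature to every example, and a negative bias adds none. The sketch below assumes this class follows that convention, which this page does not confirm. BiasFeatureSketch and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration of the liblinear bias convention; not the LBJava implementation.
class BiasFeatureSketch {
    // Assumption: bias >= 0 enables exactly one bias feature, otherwise none.
    static int biasFeatures(double bias) {
        return bias >= 0 ? 1 : 0;
    }

    // Returns a copy of the dense example with the bias value appended when enabled.
    static double[] withBias(double[] denseExample, double bias) {
        if (biasFeatures(bias) == 0) {
            return denseExample.clone();
        }
        double[] extended = Arrays.copyOf(denseExample, denseExample.length + 1);
        extended[denseExample.length] = bias;
        return extended;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(withBias(new double[] {1.0, 2.0}, 1.0)));  // prints [1.0, 2.0, 1.0]
        System.out.println(Arrays.toString(withBias(new double[] {1.0, 2.0}, -1.0))); // prints [1.0, 2.0]
    }
}
```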
Fields inherited from class Learner: candidates, encoding, extractor, labeler, labelLexicon, lcFilePath, lexFilePath, lexicon, lossFlag, predictions, readLexiconOnDemand
Fields inherited from class Classifier: containingPackage, name
Constructor and Description |
---|
SupportVectorMachine() - Default constructor. |
SupportVectorMachine(double c) - Initializing constructor. |
SupportVectorMachine(double c, double e) - Initializing constructor. |
SupportVectorMachine(double c, double e, double b) - Initializing constructor. |
SupportVectorMachine(double c, double e, double b, String s) - Initializing constructor. |
SupportVectorMachine(double c, double e, double b, String s, boolean d) - Initializing constructor. |
SupportVectorMachine(String n) - Initializing constructor. |
SupportVectorMachine(String n, double c) - Initializing constructor. |
SupportVectorMachine(String n, double c, double e) - Initializing constructor. |
SupportVectorMachine(String n, double c, double e, double b) - Initializing constructor. |
SupportVectorMachine(String n, double c, double e, double b, String s) - Initializing constructor. |
SupportVectorMachine(String n, double c, double e, double b, String s, boolean d) - Initializing constructor. |
SupportVectorMachine(String n, SupportVectorMachine.Parameters p) - Initializing constructor. |
SupportVectorMachine(SupportVectorMachine.Parameters p) - Initializing constructor. |
Modifier and Type | Method and Description |
---|---|
String[] | allowableValues() - Returns the array of allowable values that a feature returned by this classifier may take. |
FeatureVector | classify(int[] exampleFeatures, double[] exampleValues) - Evaluates the given example using liblinear's prediction method. |
protected Feature | conjunctiveValueOf(int[] exampleFeatures, double[] exampleValues, Iterator I) - This method is a surrogate for valueOf(int[],double[],Collection) when the labeler is known to produce conjunctive features. |
String | discreteValue(int[] exampleFeatures, double[] exampleValues) - The evaluate method returns the class label which yields the highest score for this example. |
void | doneLearning() - This method converts the arrays of examples stored in this class into input for the liblinear training method. |
Feature | featureValue(int[] f, double[] v) - Returns the classification of the given example as a single feature instead of a FeatureVector. |
void | forget() - Resets the internal bookkeeping. |
int | getNumClasses() |
Learner.Parameters | getParameters() - Retrieves the parameters that are set in this learner. |
double[] | getWeights() |
void | initialize(int ne, int nf) - Initializes the example vector arrays. |
void | learn(int[] exampleFeatures, double[] exampleValues, int[] exampleLabels, double[] labelValues) - This method adds the example's features and labels to the arrays storing the training examples. |
void | read(edu.illinois.cs.cogcomp.core.datastructures.vectors.ExceptionlessInputStream in) - Reads the binary representation of a learner with this object's run-time type, overwriting any and all learned or manually specified parameters as well as the label lexicon but without modifying the feature lexicon. |
double | score(int[] exampleFeatures, double[] exampleValues, int label) - Computes the dot product of the specified feature vector and the weight vector associated with the supplied class. |
double | score(Object example) - Computes the dot product of the specified example vector and the weight vector associated with the supplied class. |
double | score(Object example, int label) - Computes the dot product of the specified example vector and the weight vector associated with the supplied class. |
ScoreSet | scores(int[] exampleFeatures, double[] exampleValues) - An SVM returns a classification score for each class. |
void | setLabeler(Classifier l) - Sets the labels list. |
void | setParameters(SupportVectorMachine.Parameters p) - Sets the values of parameters that control the behavior of this learning algorithm. |
Feature | valueOf(int[] exampleFeatures, double[] exampleValues, Collection candidates) - Using this method, the winner-take-all competition is narrowed to involve only those labels contained in the specified list. |
Feature | valueOf(Object example, Collection candidates) - Using this method, the winner-take-all competition is narrowed to involve only those labels contained in the specified list. |
void | write(edu.illinois.cs.cogcomp.core.datastructures.vectors.ExceptionlessOutputStream out) - Writes the learned function's internal representation in binary form. |
void | write(PrintStream out) - Writes the algorithm's internal representation as text. |
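The score and valueOf entries in the table above describe a per-class dot product followed by a winner-take-all choice restricted to a candidate list. The sketch below illustrates that idea on sparse index/value arrays. It assumes one weight row per class, which is a simplification chosen for readability and may not match how the weights field is laid out. WinnerTakeAllSketch and its method signatures are hypothetical and return plain Strings rather than Feature objects.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical illustration of dot-product scoring and candidate-restricted selection.
class WinnerTakeAllSketch {
    // Dot product of a sparse example (parallel index/value arrays) with one class's weights.
    static double score(int[] exampleFeatures, double[] exampleValues, double[] classWeights) {
        double sum = 0.0;
        for (int i = 0; i < exampleFeatures.length; i++) {
            int feature = exampleFeatures[i];
            // Features without a learned weight contribute nothing.
            if (feature >= 0 && feature < classWeights.length) {
                sum += classWeights[feature] * exampleValues[i];
            }
        }
        return sum;
    }

    // Highest-scoring label among the candidates, or null if no candidate is a known label.
    static String valueOf(int[] exampleFeatures, double[] exampleValues,
                          String[] labels, double[][] weights, Collection<String> candidates) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < labels.length; c++) {
            if (!candidates.contains(labels[c])) {
                continue; // label excluded from the competition
            }
            double s = score(exampleFeatures, exampleValues, weights[c]);
            if (best == null || s > bestScore) {
                bestScore = s;
                best = labels[c];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] labels = {"a", "b", "c"};
        double[][] weights = {{1.0, 0.0}, {0.0, 1.0}, {2.0, 2.0}};
        int[] features = {0, 1};
        double[] values = {1.0, 3.0};
        // Scores are a = 1.0, b = 3.0, c = 8.0; excluding c leaves b as the winner.
        System.out.println(valueOf(features, values, labels, weights, Arrays.asList("a", "b"))); // prints b
    }
}
```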
Methods inherited from class Learner: classify, classify, classify, classify, clone, countFeatures, createPrediction, createPrediction, demandLexicon, discreteValue, discreteValue, doneWithRound, emptyClone, featureValue, featureValue, getCurrentLexicon, getExampleArray, getExampleArray, getExtractor, getLabeler, getLabelLexicon, getLexicon, getLexiconDiscardCounts, getLexiconLocation, getModelLocation, getPrunedLexiconSize, learn, learn, learn, learn, read, readLabelLexicon, readLearner, readLearner, readLearner, readLearner, readLearner, readLearner, readLexicon, readLexicon, readLexiconOnDemand, readLexiconOnDemand, readModel, readModel, readParameters, realValue, realValue, realValue, save, saveLexicon, saveModel, scores, scores, scoresAugmented, setCandidates, setEncoding, setExtractor, setLabelLexicon, setLexicon, setLexiconLocation, setLexiconLocation, setLossFlag, setModelLocation, setModelLocation, setParameters, setReadLexiconOnDemand, unclone, unsetLossFlag, write, writeLexicon, writeModel, writeParameters
Methods inherited from class Classifier: classify, discreteValueArray, getCompositeChildren, getInputType, getOutputType, realValueArray, test, toString, valueIndexOf
public static final String defaultSolverType
Default for solverType.

public static final double defaultC
Default for C.

public static final double defaultEpsilon
Default for epsilon.

public static final double defaultBias
Default for bias.

protected String solverType
The type of solver; default defaultSolverType unless there are more than 2 labels observed in the training data, in which case "MCSVM_CS" becomes the default. Note that if you are doing multi-class classification, you can still override the "MCSVM_CS" default to use another solver type.
Possible values:
"L2_LR" = L2-regularized logistic regression;
"L2LOSS_SVM_DUAL" = L2-loss support vector machines (dual);
"L2LOSS_SVM" = L2-loss support vector machines (primal);
"L1LOSS_SVM_DUAL" = L1-loss support vector machines (dual);
"MCSVM_CS" = multi-class support vector machines by Crammer and Singer
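
The defaulting rule for solverType can be sketched as a small function. This is an illustration under stated assumptions: a null argument stands for "the user did not choose a solver", and the value passed as defaultSolverType in the demo is only a placeholder, since this page does not state the actual default. SolverTypeSketch and effectiveSolverType are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the solverType defaulting rule; not the LBJava implementation.
class SolverTypeSketch {
    // The solver names listed as possible values above.
    static final List<String> KNOWN = Arrays.asList(
            "L2_LR", "L2LOSS_SVM_DUAL", "L2LOSS_SVM", "L1LOSS_SVM_DUAL", "MCSVM_CS");

    // Assumption: requested == null means no explicit choice was made.
    static String effectiveSolverType(String requested, String defaultSolverType, int observedLabels) {
        if (requested != null) {
            if (!KNOWN.contains(requested)) {
                throw new IllegalArgumentException("Unknown solver type: " + requested);
            }
            return requested; // an explicit choice overrides the MCSVM_CS default
        }
        return observedLabels > 2 ? "MCSVM_CS" : defaultSolverType;
    }

    public static void main(String[] args) {
        String placeholderDefault = "L2LOSS_SVM"; // placeholder, not a documented value
        System.out.println(effectiveSolverType(null, placeholderDefault, 2));    // prints L2LOSS_SVM
        System.out.println(effectiveSolverType(null, placeholderDefault, 5));    // prints MCSVM_CS
        System.out.println(effectiveSolverType("L2_LR", placeholderDefault, 5)); // prints L2_LR
    }
}
```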
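protected double C
The cost parameter C; default defaultC.

protected double epsilon
The tolerance of termination criterion; default defaultEpsilon.

protected double bias

protected int biasFeatures
The number of bias features; there are either 0 or 1 of them.

protected boolean displayLL
Controls if liblinear-related messages are output.

protected int numClasses
The number of unique class labels seen during training.

protected int numFeatures
The number of unique features seen during training.

protected boolean conjunctiveLabels
Whether or not this learner's labeler produces conjunctive features.

protected double[] weights
An array of weights representing the weight vector learned after training with liblinear.

protected edu.illinois.cs.cogcomp.core.datastructures.vectors.IVector allLabels
The array of example labels.

protected edu.illinois.cs.cogcomp.core.datastructures.vectors.OVector allExamples
The array of example vectors.

protected String[] allowableValues
The label producing classifier's allowable values.

protected Lexicon newLabelLexicon
Created during doneLearning() in case the training examples observed by learn(int[],double[],int[],double[]) are only a subset of a larger, pre-extracted set. If this is not the case, it will simply be a duplicate reference to Learner.labelLexicon.

public SupportVectorMachine()
Default constructor.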
.public SupportVectorMachine()
public SupportVectorMachine(double c)
c
- The desired C value.public SupportVectorMachine(double c, double e)
c
- The desired C value.e
- The desired epsilon value.public SupportVectorMachine(double c, double e, double b)
c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.public SupportVectorMachine(double c, double e, double b, String s)
c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.s
- The solver type.public SupportVectorMachine(double c, double e, double b, String s, boolean d)
c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.s
- The solver type.d
- Toggles if the liblinear
-related output should be displayed.public SupportVectorMachine(String n)
n
- The name of the classifier.public SupportVectorMachine(String n, double c)
n
- The name of the classifier.c
- The desired C value.public SupportVectorMachine(String n, double c, double e)
n
- The name of the classifier.c
- The desired C value.e
- The desired epsilon value.public SupportVectorMachine(String n, double c, double e, double b)
n
- The name of the classifier.c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.public SupportVectorMachine(String n, double c, double e, double b, String s)
n
- The name of the classifier.c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.s
- The solver type.public SupportVectorMachine(String n, double c, double e, double b, String s, boolean d)
n
- The name of the classifier.c
- The desired C value.e
- The desired epsilon value.b
- The desired bias.s
- The solver type.d
- Toggles if the liblinear
-related output should be displayed.public SupportVectorMachine(SupportVectorMachine.Parameters p)
SupportVectorMachine.Parameters
object. The name of the classifier gets the empty
string.p
- The settings of all parameters.public SupportVectorMachine(String n, SupportVectorMachine.Parameters p)
SupportVectorMachine.Parameters
object.n
- The name of the classifier.p
- The settings of all parameters.public double[] getWeights()
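
public int getNumClasses()

public void setParameters(SupportVectorMachine.Parameters p)
Sets the values of parameters that control the behavior of this learning algorithm.
Parameters:
p - The parameters.

public Learner.Parameters getParameters()
Retrieves the parameters that are set in this learner.
Overrides: getParameters in class Learner

public void setLabeler(Classifier l)
Sets the labels list.
Overrides: setLabeler in class Learner
Parameters:
l - A new label producing classifier.

public String[] allowableValues()
Returns the array of allowable values that a feature returned by this classifier may take.
Overrides: allowableValues in class Classifier

public void initialize(int ne, int nf)
Initializes the example vector arrays.
Overrides: initialize in class Learner
Parameters:
ne - The number of examples to train.
nf - The number of features.

public void learn(int[] exampleFeatures, double[] exampleValues, int[] exampleLabels, double[] labelValues)
This method adds the example's features and labels to the arrays storing the training examples. These arrays are later passed to liblinear.Linear.train() for training. Note that learning via the liblinear library does not actually take place until doneLearning() is called.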
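
public void doneLearning()
This method converts the arrays of examples stored in this class into input for the liblinear training method. The learned weight vector is stored in weights.
Overrides: doneLearning in class Learner

public void write(PrintStream out)
Writes the algorithm's internal representation as text. The parameters appear in the order C, epsilon, bias, and finally solverType.

public void write(edu.illinois.cs.cogcomp.core.datastructures.vectors.ExceptionlessOutputStream out)
Writes the learned function's internal representation in binary form.

public void read(edu.illinois.cs.cogcomp.core.datastructures.vectors.ExceptionlessInputStream in)
Reads the binary representation of a learner with this object's run-time type, overwriting any and all learned or manually specified parameters as well as the label lexicon but without modifying the feature lexicon.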

public Feature featureValue(int[] f, double[] v)
Returns the classification of the given example as a single feature instead of a FeatureVector.
Overrides: featureValue in class Learner
Parameters:
f - The features array.
v - The values array.

public String discreteValue(int[] exampleFeatures, double[] exampleValues)
The evaluate method returns the class label which yields the highest score for this example.
Overrides: discreteValue in class Learner
Parameters:
exampleFeatures - The example's array of feature indices.
exampleValues - The example's array of feature values.

public FeatureVector classify(int[] exampleFeatures, double[] exampleValues)
Evaluates the given example using liblinear's prediction method. Returns a DiscretePrimitiveStringFeature set to the label value.

public ScoreSet scores(int[] exampleFeatures, double[] exampleValues)
An SVM returns a classification score for each class. See score(int[],double[],int).

public double score(Object example)
Computes the dot product of the specified example vector and the weight vector associated with the supplied class.
Parameters:
example - The example object.

public double score(Object example, int label)
Computes the dot product of the specified example vector and the weight vector associated with the supplied class.
Parameters:
example - The example object.
label - The class label.

public double score(int[] exampleFeatures, double[] exampleValues, int label)
Computes the dot product of the specified feature vector and the weight vector associated with the supplied class.
Parameters:
exampleFeatures - The example's array of feature indices.
exampleValues - The example's array of feature values.
label - The class label.

public Feature valueOf(Object example, Collection candidates)
Using this method, the winner-take-all competition is narrowed to involve only those labels contained in the specified list. The list is expected to contain ByteStrings.
Parameters:
example - The example object.
candidates - A list of the only labels the example may take.
Returns null if the network did not contain any of the specified labels.

public Feature valueOf(int[] exampleFeatures, double[] exampleValues, Collection candidates)
Using this method, the winner-take-all competition is narrowed to involve only those labels contained in the specified list. The list is expected to contain Strings.
Parameters:
exampleFeatures - The example's array of feature indices.
exampleValues - The example's array of feature values.
candidates - A list of the only labels the example may take.
Returns null if the network did not contain any of the specified labels.

protected Feature conjunctiveValueOf(int[] exampleFeatures, double[] exampleValues, Iterator I)
This method is a surrogate for valueOf(int[],double[],Collection) when the labeler is known to produce conjunctive features. It is necessary because when given a string label from the collection, we will not know how to construct the appropriate conjunctive feature key for lookup in the label lexicon. So, we must go through each feature in the label lexicon and use Feature.valueEquals(String).
Parameters:
exampleFeatures - The example's array of feature indices.
exampleValues - The example's array of feature values.
I - An iterator over the set of labels to choose from.
Returns null if the network did not contain any of the specified labels.

Copyright © 2016. All rights reserved.