supportVectorTrainer¶
Trains a Support Vector Machine (SVM) classifier.
Synopsis¶
supportVectorTrainer (nClasses, classification, x)
Required Arguments¶
- int nClasses (Input) - Number of unique target classification values.
- float classification[] (Input) - Array of length nPatterns containing the target classification values for each of the training patterns.
- float x[[]] (Input) - Array of size nPatterns by nAttributes containing the training data matrix.
Return Value¶
An Imsls_d_svm_model structure containing the trained support vector classifier model. If training is unsuccessful, None is returned. To release this space, use svmClassifierFree.
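A minimal call sequence using only the required arguments might look like the following sketch; the data values here are made up for illustration, and the call pattern mirrors the examples later on this page:

from numpy import array
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
from pyimsl.stat.svmClassifierFree import svmClassifierFree

# Three training patterns with two attributes each, in two classes
classification = array([1.0, 1.0, 2.0])
x = array([[0.1, 0.2],
           [0.2, 0.1],
           [0.8, 0.9]])

# Train with all defaults: C-SVC with a radial basis kernel
svmClassifier = supportVectorTrainer(2, classification, x)

# Classify patterns, then release the model
predicted = supportVectorClassification(svmClassifier, x)
svmClassifierFree(svmClassifier)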
Optional Arguments¶
svmCSvcType, float C, float weightClass[], float weight[] (Input)
Specifies that the C-support vector classification (C-SVC) algorithm is to be used to create the classification model. This is the default type of SVM used.
- float C (Input) - The regularization parameter. C must be greater than 0. By default, the penalty parameters are set to the regularization parameter C. The penalty parameters can be changed by scaling C by the values specified in weight below.
- float weightClass[] (Input) - An array of length nrWeight containing the target classification values that are to be weighted.
- float weight[] (Input) - An array of length nrWeight containing the weights corresponding to the target classification values in weightClass, used to change the penalty parameters.
Default: C-SVC is the default SVM type used, with C = 5.0, nrWeight = 0, weightClass = None, and weight = None.
or
svmNuSvcType, float (Input)
Specifies that the ν-support vector classification (ν-SVC) algorithm is to be used to create the classification model.
or
svmOneClassType, float (Input)
Specifies that the distribution estimation (one-class SVM) algorithm is to be used to create the classification model.
or
svmEpsilonSvrType, float C, float p (Input)
Specifies that the ɛ-support vector regression (ɛ-SVR) algorithm is to be used to create the classification model.
- float C (Input) - The regularization parameter. C must be greater than 0.
- float p (Input) - The insensitivity band parameter. p must be positive.
or
svmNuSvrType, float C, float nu (Input)
Specifies that the ν-support vector regression (ν-SVR) algorithm is to be used to create the classification model.
- float C (Input) - The regularization parameter. C must be greater than 0.
- float nu (Input) - The parameter nu controls the number of support vectors, and nu ∈ (0,1].
svmWorkArraySize, float (Input)
Sets the number of megabytes allocated for the work array used during the decomposition method. A larger work array size can reduce the computational time of the decomposition method.
Default: svmWorkArraySize = 1.0.
svmEpsilon, float (Input)
The absolute accuracy tolerance for the termination criterion. The function uses the SMO algorithm to solve the optimization problem; when the Lagrange multipliers used in the SMO algorithm satisfy the Karush-Kuhn-Tucker (KKT) conditions within epsilon, convergence is assumed.
Default: epsilon = 0.001.
svmNoShrinking (Input)
Specifies that the shrinking technique is not to be used in the SMO algorithm. The shrinking technique tries to identify and remove some bounded elements during the application of the SMO algorithm, so that a smaller optimization problem is solved.
Default: Shrinking is performed.
svmTrainEstimateProb (Input)
Instructs the trainer to include information in the resultant classifier model that enables probability estimates to be obtained when invoking supportVectorClassification.
Default: Information necessary to obtain probability estimates is not included in the model.
svmKernelLinear (Input)
Specifies that the inner-product kernel type
\[K(x_i , x_j) = x_i^Tx_j\]
is to be used. This kernel type is best used when the relation between the target classification values and attributes is linear or when the number of attributes is large (for example, 1000 attributes).
or
svmKernelPolynomial, int degree, float gamma, float coef0 (Input)
Specifies that the polynomial kernel type
\[K(x_i , x_j) = (\gamma x_i^Tx_j + r)^d\]
is to be used. Use this argument when the data are not linearly separable.
- int degree (Input) - Specifies the order of the polynomial kernel; degree = d in the equation above.
- float gamma (Input) - Must be greater than 0; gamma = \(\gamma\) in the equation above.
- float coef0 (Input) - Corresponds to r in the equation above.
or
svmKernelRadialBasis, float gamma (Input)
Specifies that the radial basis function kernel type
\[K(x_i , x_j) = \exp (-\gamma \|x_i - x_j\|^2)\]
is to be used. Use this kernel type when the relation between the class labels and attributes is nonlinear, although it can also be used when the relation between the target classification values and attributes is linear. This kernel type exhibits fewer numerical difficulties. If no kernel type is specified, this is the kernel type used.
or
svmKernelSigmoid, float gamma, float coef0 (Input)
Specifies that the sigmoid kernel type
\[K(x_i , x_j) = \tanh(\gamma x_i^Tx_j + r)\]
is to be used.
- float gamma (Input) - gamma = \(\gamma\) in the equation above.
- float coef0 (Input) - Corresponds to r in the equation above.
or
svmKernelPrecomputed, float kernelValues[[]] (Input)
Indicates that the kernel function values have been precomputed for the training and testing data sets. If svmKernelPrecomputed is used, the required argument x is ignored.
- float kernelValues[] (Input) - An array of size nPatterns by nPatterns containing the precomputed kernel function values. Assume there are L training instances \(x_1,x_2,\ldots,x_L\) and let \(K(x,y)\) be the kernel function value of two instances x and y. Row i of the testing or training data set is then represented by \(K(x_i,x_1), K(x_i,x_2), \ldots, K(x_i,x_L)\). All kernel function values, including zeros, must be provided.
Default: svmKernelRadialBasis, gamma = 1.0/nAttributes. (An illustrative sketch of the kernel computations follows this argument list.)
svmCrossValidation, int nFolds (Input/Output)
Conducts cross validation on nFolds folds of the data. randomUniformDiscrete is used during the cross validation step. See the Description section for more information on cross validation. See the Usage Notes in Chapter 12, "Random Number Generation" for instructions on setting the seed to the random number generator if different seeds are desired.
- int nFolds (Input) - The number of folds of the data to be used in cross validation. nFolds must be greater than 1 and less than nPatterns.
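The kernel formulas above can be evaluated directly with NumPy, as in the following illustrative sketch. The attribute vectors and parameter values are made up; the point is only to show what each kernel computes and how a precomputed kernel array for svmKernelPrecomputed is laid out:

import numpy as np

x_i = np.array([0.5, 0.3, 0.1])
x_j = np.array([0.4, 0.2, 0.6])
gamma = 1.0 / 3          # the default, 1.0/nAttributes
r, d = 1.0, 3            # coef0 and degree

# Linear: K(x_i, x_j) = x_i^T x_j
k_linear = x_i @ x_j
# Polynomial: K(x_i, x_j) = (gamma * x_i^T x_j + r)^d
k_poly = (gamma * (x_i @ x_j) + r) ** d
# Radial basis: K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
k_rbf = np.exp(-gamma * np.sum((x_i - x_j) ** 2))
# Sigmoid: K(x_i, x_j) = tanh(gamma * x_i^T x_j + r)
k_sigmoid = np.tanh(gamma * (x_i @ x_j) + r)

# For svmKernelPrecomputed, row i of the kernel array holds
# K(x_i, x_1), ..., K(x_i, x_L); e.g., the linear Gram matrix of an
# nPatterns x nAttributes data matrix X:
X = np.array([[0.5, 0.3, 0.1],
              [0.4, 0.2, 0.6]])
gram = X @ X.T           # nPatterns x nPatterns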
Description¶
Function supportVectorTrainer trains an SVM classifier for classifying
data into one of nClasses target classes. Several SVM formulations are
supported through the optional arguments for classification, regression,
and distribution estimation. The C-support vector classification (C-SVC)
algorithm is the fundamental algorithm for the SVM optimization problem,
and its primal form is
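\[\min_{w,b,\xi} \frac{1}{2}w^Tw + C\sum_{i=1}^{l}\xi_i\]
\[\text{subject to } y_i\left(w^T\phi(x_i)+b\right) \geq 1-\xi_i, \quad \xi_i \geq 0, \quad i=1,\ldots,l\]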
where \((x_i,y_i)\) are the instance-label pairs for a given training set,
l is the number of training examples, \(x_i\in R^n\), and
\(y_i\in\{1,-1\}\). The \(\xi_i\) are slack variables in the optimization,
and \(\sum_{i}\xi_i\) is an upper bound on the number of training errors.
The regularization parameter
\(C>0\) acts as a tradeoff parameter between error and margin. This is
the default algorithm used and can be controlled through the use of the
svmCSvcType
optional argument.
The ν-support vector classification (ν-SVC) algorithm introduces a new
parameter \(\nu\in \left(0,1\right]\) which acts as an upper bound on the
fraction of training errors and a lower bound on the fraction of support
vectors. This algorithm is selected with the svmNuSvcType optional
argument. The primal optimization problem for the binary variable
\(y\in\left\{1,-1\right\}\) is
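\[\min_{w,b,\xi,\rho} \frac{1}{2}w^Tw - \nu\rho + \frac{1}{l}\sum_{i=1}^{l}\xi_i\]
\[\text{subject to } y_i\left(w^T\phi(x_i)+b\right) \geq \rho-\xi_i, \quad \xi_i \geq 0, \quad \rho \geq 0\]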
The one-class SVM algorithm estimates the support of a high-dimensional
distribution without any class information. Control of this algorithm is
through the use of the svmOneClassType
optional argument. The primal
problem of one-class SVM is
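\[\min_{w,\xi,\rho} \frac{1}{2}w^Tw - \rho + \frac{1}{\nu l}\sum_{i=1}^{l}\xi_i\]
\[\text{subject to } w^T\phi(x_i) \geq \rho-\xi_i, \quad \xi_i \geq 0\]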
If \(z_i\) is the target output and given the parameters \(C>0\), \(\varepsilon>0\), the standard form of ɛ-support vector regression (ɛ-SVR) is
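\[\min_{w,b,\xi,\xi^*} \frac{1}{2}w^Tw + C\sum_{i=1}^{l}\xi_i + C\sum_{i=1}^{l}\xi_i^*\]
\[\text{subject to } w^T\phi(x_i)+b-z_i \leq \varepsilon+\xi_i, \quad z_i-w^T\phi(x_i)-b \leq \varepsilon+\xi_i^*, \quad \xi_i,\xi_i^* \geq 0\]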
where the two slack variables \(\xi_i\) and \(\xi_i^*\) are introduced,
one for exceeding the target value by more than ɛ and the other for being
more than ɛ below the target. This algorithm is selected with the
svmEpsilonSvrType optional argument.
Similar to ν-SVC, in ν-support vector regression (ν-SVR) the parameter
\(\nu\in\left(0,1\right]\) controls the number of support vectors. Use
svmNuSvrType to select this algorithm. The ν-SVR primal problem is
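\[\min_{w,b,\xi,\xi^*,\varepsilon} \frac{1}{2}w^Tw + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}\left(\xi_i+\xi_i^*\right)\right)\]
\[\text{subject to } \left(w^T\phi(x_i)+b\right)-z_i \leq \varepsilon+\xi_i, \quad z_i-\left(w^T\phi(x_i)+b\right) \leq \varepsilon+\xi_i^*, \quad \xi_i,\xi_i^* \geq 0, \quad \varepsilon \geq 0\]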
The decomposition method used to solve the dual formulation of these primal problems is an SMO-type (sequential minimal optimization) decomposition method proposed by Fan et al. (2005).
The svmCrossValidation optional argument allows one to estimate how
accurately the resulting training model will perform in practice. The cross
validation technique partitions the training data into nFolds
complementary subsets. The model is trained with each subset held out in
turn and validated against that held-out subset, and the validation results
of the rounds are then averaged. The result is usually a good indicator of
how the trained model will perform on unclassified data.
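For intuition, the fold-and-average logic can be sketched by hand using the trainer and classifier functions from this chapter. The sketch below is illustrative only, not the library's internal implementation; the helper name manual_cross_validation is hypothetical, and the accuracy measure is simply the fraction of correctly classified held-out patterns:

import numpy as np
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
from pyimsl.stat.svmClassifierFree import svmClassifierFree

def manual_cross_validation(nClasses, classification, x, nFolds):
    # Randomly partition the pattern indices into nFolds subsets
    idx = np.random.permutation(len(classification))
    folds = np.array_split(idx, nFolds)
    accuracies = []
    for k in range(nFolds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(nFolds) if j != k])
        # Train on all folds except fold k, validate on fold k
        model = supportVectorTrainer(nClasses, classification[train], x[train])
        predicted = np.asarray(supportVectorClassification(model, x[test]))
        accuracies.append(np.mean(predicted == classification[test]))
        svmClassifierFree(model)
    # Average the per-fold validation results
    return np.mean(accuracies)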
Function supportVectorTrainer
is based on LIBSVM, Copyright (c)
2000-2013, with permission from the authors, Chih-Chung Chang and Chih-Jen
Lin, with the following disclaimer:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Examples¶
Example 1¶
In this example, we use a subset of the Fisher Iris data to train the
classifier. The default values of supportVectorTrainer are used in the
training. The resultant classifier model, stored in svmClassifier, is then
used as input to supportVectorClassification to classify all of the
patterns in the Fisher Iris data set. Results of the classification are
then printed. In the Fisher Iris data set, the first column is the target
classification value, 1=Setosa, 2=Versicolour, and 3=Virginica. Columns 2
through 5 contain the attributes sepal length, sepal width, petal length,
and petal width.
from __future__ import print_function
from numpy import *
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.scaleFilter import scaleFilter
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
nPatterns = 150 # 150 total patterns
nAttributes = 4 # four attributes
nClasses = 3 # three classification categories
predictedClass = []
# irisData: The raw data matrix. This is a 2-D matrix with 150
# rows and 5 columns. The first column is the target
# classification value (1-3), and the last 4 columns
# are the continuous input attributes. These data
# contain no categorical input attributes.
irisData = dataSets(3)
# Data corrections described in the KDD data mining archive
irisData[34, 4] = 0.1
irisData[37, 2] = 3.1
irisData[37, 3] = 1.5
# Set up the required input arrays from the data matrix
classification = irisData[:, 0]
x = irisData[:, 1:5]
# Scale the data
method = 1
xx = scaleFilter(x.reshape(150 * 4), method,
scaleLimits={'realMin': 0.,
'realMax': 10.,
'targetMin': 0.,
'targetMax': 1.}).reshape((150, 4))
# Use a subset of the data for training
trainingClassification = concatenate((classification[0:10],
classification[50:60],
classification[100:110]))
trainingData = concatenate((xx[0:10, :], xx[50:60, :], xx[100:110, :]))
print("\n The Input Classification and Training Data \n\n")
print("Classification Sepal Sepal Petal Petal")
print(" Value Length Width Length Width\n")
for i in range(len(trainingClassification)):
    print("%8.4f %8.4f %8.4f %8.4f %8.4f" %
          (trainingClassification[i], trainingData[i, 0],
           trainingData[i, 1], trainingData[i, 2], trainingData[i, 3]))
# Train with the training data
svmClassifier = supportVectorTrainer(
nClasses, trainingClassification, trainingData)
# Classify the entire test set
classErrors = {"classification": classification, "classErrors": []}
predictedClass = supportVectorClassification(svmClassifier, xx,
classError=classErrors)
print("\n\n\n Some Output Classifications\n")
print("Pattern Predicted Actual")
print("Number Classification Classification\n")
for i in range(10):
    print(" %d %8.4f %8.4f" %
          (i, predictedClass[i], classification[i]))
errors = classErrors["classErrors"]
print("\n\n Iris Classification Error Rates\n")
print("--------------------------------------------------------------")
print(" Setosa Versicolour Virginica | TOTAL\n")
print(" %d/%d %d/%d %d/%d | %d/%d\n" %
(errors[0][0], errors[0][1], errors[1][0], errors[1][1],
errors[2][0], errors[2][1], errors[3][0], errors[3][1]))
print("--------------------------------------------------------------")
Output¶
The Input Classification and Training Data
Classification Sepal Sepal Petal Petal
Value Length Width Length Width
1.0000 0.5100 0.3500 0.1400 0.0200
1.0000 0.4900 0.3000 0.1400 0.0200
1.0000 0.4700 0.3200 0.1300 0.0200
1.0000 0.4600 0.3100 0.1500 0.0200
1.0000 0.5000 0.3600 0.1400 0.0200
1.0000 0.5400 0.3900 0.1700 0.0400
1.0000 0.4600 0.3400 0.1400 0.0300
1.0000 0.5000 0.3400 0.1500 0.0200
1.0000 0.4400 0.2900 0.1400 0.0200
1.0000 0.4900 0.3100 0.1500 0.0100
2.0000 0.7000 0.3200 0.4700 0.1400
2.0000 0.6400 0.3200 0.4500 0.1500
2.0000 0.6900 0.3100 0.4900 0.1500
2.0000 0.5500 0.2300 0.4000 0.1300
2.0000 0.6500 0.2800 0.4600 0.1500
2.0000 0.5700 0.2800 0.4500 0.1300
2.0000 0.6300 0.3300 0.4700 0.1600
2.0000 0.4900 0.2400 0.3300 0.1000
2.0000 0.6600 0.2900 0.4600 0.1300
2.0000 0.5200 0.2700 0.3900 0.1400
3.0000 0.6300 0.3300 0.6000 0.2500
3.0000 0.5800 0.2700 0.5100 0.1900
3.0000 0.7100 0.3000 0.5900 0.2100
3.0000 0.6300 0.2900 0.5600 0.1800
3.0000 0.6500 0.3000 0.5800 0.2200
3.0000 0.7600 0.3000 0.6600 0.2100
3.0000 0.4900 0.2500 0.4500 0.1700
3.0000 0.7300 0.2900 0.6300 0.1800
3.0000 0.6700 0.2500 0.5800 0.1800
3.0000 0.7200 0.3600 0.6100 0.2500
Some Output Classifications
Pattern Predicted Actual
Number Classification Classification
0 1.0000 1.0000
1 1.0000 1.0000
2 1.0000 1.0000
3 1.0000 1.0000
4 1.0000 1.0000
5 1.0000 1.0000
6 1.0000 1.0000
7 1.0000 1.0000
8 1.0000 1.0000
9 1.0000 1.0000
Iris Classification Error Rates
--------------------------------------------------------------
Setosa Versicolour Virginica | TOTAL
0/50 4/50 5/50 | 9/150
--------------------------------------------------------------
Example 2¶
In this example, we use a subset of the Fisher Iris data to train the
classifier and use the cross-validation option with various combinations of
C and gamma to find a combination which yields the best results on the
training data. The best combination of C and gamma is then used to create
the classification model, stored in svmClassifier. This model is then used
as input to supportVectorClassification to classify all of the patterns in
the Fisher Iris data set. Results of the classification are then printed.
from __future__ import print_function
from numpy import *
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.scaleFilter import scaleFilter
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
from pyimsl.stat.svmClassifierFree import svmClassifierFree
nPatterns = 150 # 150 total patterns
nAttributes = 4 # four attributes
nClasses = 3 # three classification categories
classErrors = []
predictedClass = []
classLabel = ["Setosa ", "Versicolour", "Virginica "]
# irisData: The raw data matrix. This is a 2-D matrix with 150
# rows and 5 columns. The first column is the target
# classification value (1-3), and the last 4 columns
# are the continuous input attributes. These data
# contain no categorical input attributes.
irisData = dataSets(3)
# Data corrections described in the KDD data mining archive
irisData[34, 4] = 0.1
irisData[37, 2] = 3.1
irisData[37, 3] = 1.5
# Set up the required input arrays from the data matrix
classification = irisData[:, 0]
x = irisData[:, 1:5]
# Scale the data
method = 1
xx = scaleFilter(x.reshape(150 * 4), method,
scaleLimits={'realMin': 0.,
'realMax': 10.,
'targetMin': 0.,
'targetMax': 1.}).reshape((150, 4))
# Use a subset of the data for training
trainingClassification = concatenate((classification[0:10],
classification[50:60],
classification[100:110]))
trainingData = concatenate((xx[0:10, :], xx[50:60, :], xx[100:110, :]))
# Try different combinations of C and gamma to settle on model parameters
C = 2.0
svmCrossValidation = {"nFolds": 3, "target": [], "result": []}
bestAccuracy = 0.0
bestC = 0.0
bestGamma = 0.0
# The cross validation option uses randomUniformDiscrete, so set the seed in
# order to get consistent results from this example.
randomSeedSet(123457)
for i in range(10):
    gamma = .1
    for j in range(5):
        svmClassifier = supportVectorTrainer(nClasses, trainingClassification,
                                             trainingData,
                                             svmCSvcType={
                                                 "C": C, "weightClass": [], "weight": []},
                                             svmKernelRadialBasis=gamma,
                                             svmCrossValidation=svmCrossValidation)
        result = svmCrossValidation["result"]
        if result > bestAccuracy:
            bestAccuracy = result
            bestC = C
            bestGamma = gamma
        gamma = gamma * 2.0
        svmClassifierFree(svmClassifier)
    C = C * 2.0
# Train with the best resultant parameters
svmClassifier = supportVectorTrainer(nClasses, trainingClassification,
trainingData,
svmCSvcType={
"C": bestC, "weightClass": [], "weight": []},
svmKernelRadialBasis=bestGamma)
# Call supportVectorClassification on the entire test set
classErrors = {"classification": classification, "classErrors": []}
predictedClass = supportVectorClassification(svmClassifier, xx,
classError=classErrors)
errors = classErrors["classErrors"]
print("\n\n Iris Classification Error Rates\n")
print("--------------------------------------------------------------")
print(" Setosa Versicolour Virginica | TOTAL\n")
print(" %d/%d %d/%d %d/%d | %d/%d\n" %
(errors[0][0], errors[0][1], errors[1][0], errors[1][1],
errors[2][0], errors[2][1], errors[3][0], errors[3][1]))
print("--------------------------------------------------------------")
Output¶
Iris Classification Error Rates
--------------------------------------------------------------
Setosa Versicolour Virginica | TOTAL
0/50 1/50 3/50 | 4/150
--------------------------------------------------------------
Example 3¶
One thousand deviates from a uniform distribution are used in the training
data set of this example. svmOneClassType is used to produce the model
during training. A test data set of one hundred uniform deviates is
produced and contaminated with ten normal deviates.
supportVectorClassification is then called in an attempt to pick out the
contaminated data in the test data set. The suspect observations are
printed.
from __future__ import print_function
from numpy import *
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.randomUniform import randomUniform
from pyimsl.stat.randomNormal import randomNormal
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
nPatternsTrain = 1000
nPatternsTest = 100
nPatternsTen = 10
nClasses = 1
# Create the training set from a uniform distribution
randomSeedSet(123457)
xTrain = randomUniform(nPatternsTrain)
classificationTrain = ones(nPatternsTrain)
svmClassifier = supportVectorTrainer(nClasses, classificationTrain, xTrain,
svmOneClassType=.001)
# Create a testing set from a uniform distribution
xTest = randomUniform(nPatternsTest)
# Contaminate the testing set with deviates from a normal distribution
xTestContaminant = randomNormal(nPatternsTen, mean=.1, variance=.2)
for i in range(10):
    xTest[i * 10] = xTestContaminant[i]
target = supportVectorClassification(svmClassifier, xTest)
print("\n\n Classification Results \n")
for i in range(nPatternsTest):
    if (target[i] != 1.0):
        print("The %d-th observation may not belong to the target distribution.\n" % i)
Output¶
Classification Results
The 0-th observation may not belong to the target distribution.
The 20-th observation may not belong to the target distribution.
The 30-th observation may not belong to the target distribution.
The 40-th observation may not belong to the target distribution.
The 60-th observation may not belong to the target distribution.
The 70-th observation may not belong to the target distribution.
Example 4¶
This example uses svmNuSvrType to create a regression model, which is then
used by supportVectorClassification in an attempt to predict values in the
test data set. The predicted values and the mean squared error are printed.
from __future__ import print_function
from numpy import *
from pyimsl.stat.supportVectorTrainer import supportVectorTrainer
from pyimsl.stat.supportVectorClassification import supportVectorClassification
nPatternsTrain = 10
nPatternsTest = 4
nClasses = 2
classificationTrain = [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
classificationTest = [1.0, 1.0, 2.0, 2.0]
xTrain = [[0.19, 0.61],
[0.156, 0.564],
[0.224, 0.528],
[0.178, 0.51],
[0.234, 0.578],
[0.394, 0.296],
[0.478, 0.254],
[0.454, 0.294],
[0.48, 0.358],
[0.398, 0.336]]
xTest = [[0.316, 0.556],
[0.278, 0.622],
[0.562, 0.336],
[0.522, 0.412]]
svmClassifier = supportVectorTrainer(nClasses, classificationTrain, xTrain,
svmNuSvrType={"C": 50., "nu": .01})
target = supportVectorClassification(svmClassifier, xTest)
mse = 0.0
print("Predicted Actual Difference")
for i in range(nPatternsTest):
    diff = target[i] - classificationTest[i]
    print("%f %f %f" % (target[i], classificationTest[i], diff))
    mse = mse + (diff * diff)
mse = mse / nPatternsTest
print("\nThe Mean squared error for the predicted values is %f" % mse)
Output¶
Predicted Actual Difference
1.443569 1.000000 0.443569
1.397248 1.000000 0.397248
1.648531 2.000000 -0.351469
1.598311 2.000000 -0.401689
The Mean squared error for the predicted values is 0.159861
Warning Errors¶
IMSLS_OPTION_NOT_SUPPORTED
The optional argument # is not supported for #.
IMSLS_LABEL_NOT_FOUND
The class label # specified in "weight" was not found.
IMSLS_INADEQUATE_MODEL
The model used contains inadequate information to compute the requested probability.
IMSLS_TWO_CLASS_LINE_SEARCH
The line search failed in a two-class probability estimation while performing cross validation.
IMSLS_VALIDATION_MAX_ITERATIONS
The maximum number of iterations was reached in a #-class probability estimation while performing cross validation.