The previous section essentially completed the theory of the SVM: the margin-maximization problem was eventually transformed into solving for the Lagrange multipliers alpha; from alpha we can solve for the SVM weights w, and with the weights we have the maximum-margin classifier. That derivation, however, rested on one assumption: the training set is linearly separable, so each alpha lies in [0, +∞). But what if the data are not linearly separable? In that case we allow some samples to cross the classifier, and the objective function changes accordingly: we introduce a slack variable ξ_n that prices the misclassification of each sample. It equals 0 when a sample is correctly classified, and ξ_n = |t_n − y(x_n)| when it is misclassified, where t_n denotes the sample's true label, −1 or 1. Recall from the previous section that we fixed the support vectors at distance 1 from the boundary, so the distance between the two classes' support vectors is greater than 1; a misclassified sample therefore has ξ_n > 1, as shown in (Figure five):
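To make the slack variable concrete, here is a minimal numpy sketch that computes ξ_n from hypothetical margins t_n·y(x_n) (the margin values below are made up for illustration): a margin ≥ 1 gives ξ_n = 0, a margin in (0, 1) gives 0 < ξ_n < 1 (inside the margin but correctly classified), and a negative margin (misclassified) gives ξ_n > 1.

```python
import numpy as np

# Hypothetical margins t_n * y(x_n) for four training points:
# >= 1 means outside the margin, (0, 1) inside the margin, < 0 misclassified.
margins = np.array([1.5, 1.0, 0.4, -0.7])

# Slack for each point: xi_n = max(0, 1 - t_n * y(x_n))
xi = np.maximum(0.0, 1.0 - margins)
print(xi)  # [0.  0.  0.6 1.7]
```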

(Figure five)

That is the misclassification cost. We add this misclassification cost to the objective function (formula four), giving the form of (formula eight):

(formula eight)

  min_{w, b, ξ}  C·Σ_{n=1..N} ξ_n + (1/2)‖w‖²
  subject to  t_n·y(x_n) ≥ 1 − ξ_n,  ξ_n ≥ 0,  n = 1, …, N

Repeating the Lagrange-multiplier steps of the previous section gives (formula nine):

(formula nine)

  L(w, b, ξ, a, μ) = (1/2)‖w‖² + C·Σ_n ξ_n − Σ_n a_n·( t_n·y(x_n) − 1 + ξ_n ) − Σ_n μ_n·ξ_n

Now there is an extra multiplier μ_n. Our job is still to optimize this objective function; repeating the earlier steps and taking derivatives gives (formula ten):


(formula ten)

  ∂L/∂w = 0  ⇒  w = Σ_n a_n·t_n·φ(x_n)
  ∂L/∂b = 0  ⇒  Σ_n a_n·t_n = 0
  ∂L/∂ξ_n = 0  ⇒  a_n = C − μ_n

Because a_n ≥ 0 and μ_n ≥ 0, and (from formula ten) a_n = C − μ_n, it follows that 0 ≤ a_n ≤ C. To explain this clearly, we write out the KKT conditions of (formula nine) (the third kind of optimization problem from the previous section), in which μ_n ≥ 0:

  a_n ≥ 0,  t_n·y(x_n) − 1 + ξ_n ≥ 0,  a_n·( t_n·y(x_n) − 1 + ξ_n ) = 0
  μ_n ≥ 0,  ξ_n ≥ 0,  μ_n·ξ_n = 0

The basic form of the optimization function has not changed; we have merely added a misclassification cost and one more condition, 0 ≤ alpha ≤ C, where C is a constant. Its role is to trade off the misclassification penalty against the margin width: too large a C leads to overfitting, too small a C to underfitting. The next step should be familiar to everyone: with this extra constraint involving the constant C, we continue with quadratic programming, i.e. the SMO algorithm. But first I want to cover the kernel function, as promised earlier: if the samples are linearly inseparable, then after introducing a kernel function and mapping the samples into a high-dimensional space, they can become linearly separable. (Figure six) shows such linearly inseparable samples:
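In practice the box constraint 0 ≤ a_n ≤ C is enforced by simply clipping each multiplier into [0, C] after an update, which is exactly what the SMO code later does with its clipAlpha helper; a minimal numpy sketch (the multiplier values are made up):

```python
import numpy as np

C = 0.6
alphas = np.array([-0.2, 0.3, 0.9])  # hypothetical multipliers after an unconstrained update step
clipped = np.clip(alphas, 0.0, C)    # enforce the box constraint 0 <= alpha_n <= C
print(clipped)  # [0.  0.3 0.6]
```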

(Figure six)

In (Figure six) the samples are clearly not linearly separable, but suppose we apply some operations to the existing samples X, as shown on the right side of (Figure six): wouldn't the resulting f serve well as new samples (or new features)? Now X has been mapped into the high-dimensional f. But how do we compute f? This is where the kernel function comes into play. Take the Gaussian kernel as an example: in (Figure seven) a few sample points are selected as reference points, and the kernel function is used to compute the new features f, as shown in (Figure seven):

(Figure seven)

So now we have f. The kernel function here is, in effect, a similarity measure between a sample X and a reference point: the further X is from the reference point, the more the new feature decays toward zero. With the features f formed from X, we can feed f into the SVM described above, solve for alpha, and then obtain the weights. The principle is that simple. To look a bit more academic, the kernel function is usually folded directly into the objective function, as shown in (formula eleven):
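To make the reference-point idea concrete, here is a sketch of computing Gaussian-kernel features f from a set of reference points. The function name rbf_features and the 2σ² convention are my own choices for illustration; the kernelTrans code later in this article divides by σ² instead.

```python
import numpy as np

def rbf_features(X, landmarks, sigma):
    """Map each sample x to f_i = exp(-||x - l_i||^2 / (2*sigma^2))."""
    # X: (n, d) samples; landmarks: (k, d) reference points -> (n, k) new features
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.array([[1.0, 1.0], [4.0, 4.0]])
landmarks = np.array([[1.0, 1.0]])     # one reference point placed on the first sample
f = rbf_features(X, landmarks, sigma=1.0)
# f[0, 0] is 1.0 (zero distance); f[1, 0] is nearly 0 (far from the reference point)
```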

(formula eleven)

  L̃(a) = Σ_n a_n − (1/2)·Σ_n Σ_m a_n·a_m·t_n·t_m·k(x_n, x_m)
  subject to  0 ≤ a_n ≤ C,  Σ_n a_n·t_n = 0

Here k(x_n, x_m) is the kernel function; apart from it, the objective function changes little compared with the version without a kernel. The SMO optimization code is as follows:

def smoPK(dataMatIn, classLabels, C, toler, maxIter, kTup=('lin', 0)):  # full Platt SMO
    oS = optStruct(mat(dataMatIn), mat(classLabels).transpose(), C, toler, kTup)
    iter = 0
    entireSet = True; alphaPairsChanged = 0
    while (iter < maxIter) and ((alphaPairsChanged > 0) or entireSet):
        alphaPairsChanged = 0
        if entireSet:  # go over all alphas
            for i in range(oS.m):
                alphaPairsChanged += innerL(i, oS)  # innerL (the SMO inner loop) is defined elsewhere
                print("fullSet, iter: %d i:%d, pairs changed %d" % (iter, i, alphaPairsChanged))
            iter += 1
        else:  # go over non-bound (non-railed) alphas
            nonBoundIs = nonzero((oS.alphas.A > 0) * (oS.alphas.A < C))[0]
            for i in nonBoundIs:
                alphaPairsChanged += innerL(i, oS)
                print("non-bound, iter: %d i:%d, pairs changed %d" % (iter, i, alphaPairsChanged))
            iter += 1
        if entireSet:
            entireSet = False  # toggle entire-set loop
        elif alphaPairsChanged == 0:
            entireSet = True
        print("iteration number: %d" % iter)
    return oS.b, oS.alphas

Next, a small demonstration: handwriting recognition.

(1) Collect data: text files are provided.

(2) Prepare data: construct vectors from the binary image data.

(3) Analyze data: visually inspect the image vectors.

(4) Train the algorithm: run the SMO algorithm with two kinds of kernel functions, using different settings for the radial basis function.

(5) Test the algorithm: write a function to test the different kernel functions and compute the error rate.

(6) Use the algorithm: a full image-recognition application would also need some image processing, which this demo omits.
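Step (2) above, constructing a vector from the binary image data, can be sketched as follows. The helper name img2vector and the 32×32 text-file layout (one '0'/'1' character per pixel) follow the usual conventions of this handwriting example; treat them as assumptions.

```python
from numpy import zeros

def img2vector(filename):
    """Read a 32x32 text image of '0'/'1' characters into a 1x1024 row vector."""
    returnVect = zeros((1, 1024))
    with open(filename) as fr:
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect
```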

The complete code is as follows:

from numpy import *
from time import sleep

def loadDataSet(fileName):
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = line.strip().split('\t')
        dataMat.append([float(lineArr[0]), float(lineArr[1])])
        labelMat.append(float(lineArr[2]))
    return dataMat, labelMat

def selectJrand(i, m):
    j = i  # we want to select any j not equal to i
    while (j == i):
        j = int(random.uniform(0, m))
    return j

def clipAlpha(aj, H, L):
    if aj > H:
        aj = H
    if L > aj:
        aj = L
    return aj

def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    dataMatrix = mat(dataMatIn); labelMat = mat(classLabels).transpose()
    b = 0; m, n = shape(dataMatrix)
    alphas = mat(zeros((m, 1)))
    iter = 0
    while (iter < maxIter):
        alphaPairsChanged = 0
        for i in range(m):
            fXi = float(multiply(alphas, labelMat).T * (dataMatrix * dataMatrix[i, :].T)) + b
            Ei = fXi - float(labelMat[i])
            # check if this example violates the KKT conditions
            if ((labelMat[i] * Ei < -toler) and (alphas[i] < C)) or \
               ((labelMat[i] * Ei > toler) and (alphas[i] > 0)):
                j = selectJrand(i, m)
                fXj = float(multiply(alphas, labelMat).T * (dataMatrix * dataMatrix[j, :].T)) + b
                Ej = fXj - float(labelMat[j])
                alphaIold = alphas[i].copy(); alphaJold = alphas[j].copy()
                if (labelMat[i] != labelMat[j]):
                    L = max(0, alphas[j] - alphas[i])
                    H = min(C, C + alphas[j] - alphas[i])
                else:
                    L = max(0, alphas[j] + alphas[i] - C)
                    H = min(C, alphas[j] + alphas[i])
                if L == H:
                    print("L==H"); continue
                eta = 2.0 * dataMatrix[i, :] * dataMatrix[j, :].T - \
                      dataMatrix[i, :] * dataMatrix[i, :].T - \
                      dataMatrix[j, :] * dataMatrix[j, :].T
                if eta >= 0:
                    print("eta>=0"); continue
                alphas[j] -= labelMat[j] * (Ei - Ej) / eta
                alphas[j] = clipAlpha(alphas[j], H, L)
                if (abs(alphas[j] - alphaJold) < 0.00001):
                    print("j not moving enough"); continue
                # update i by the same amount as j, in the opposite direction
                alphas[i] += labelMat[j] * labelMat[i] * (alphaJold - alphas[j])
                b1 = b - Ei - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[i, :].T - \
                     labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[i, :] * dataMatrix[j, :].T
                b2 = b - Ej - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[j, :].T - \
                     labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[j, :] * dataMatrix[j, :].T
                if (0 < alphas[i]) and (C > alphas[i]):
                    b = b1
                elif (0 < alphas[j]) and (C > alphas[j]):
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alphaPairsChanged += 1
                print("iter: %d i:%d, pairs changed %d" % (iter, i, alphaPairsChanged))
        if (alphaPairsChanged == 0):
            iter += 1
        else:
            iter = 0
        print("iteration number: %d" % iter)
    return b, alphas

def kernelTrans(X, A, kTup):
    # calc the kernel, or transform data to a higher-dimensional space
    m, n = shape(X)
    K = mat(zeros((m, 1)))
    if kTup[0] == 'lin':
        K = X * A.T  # linear kernel
    elif kTup[0] == 'rbf':
        for j in range(m):
            deltaRow = X[j, :] - A
            K[j] = deltaRow * deltaRow.T
        K = exp(K / (-1 * kTup[1] ** 2))  # divide in NumPy is element-wise, not matrix-wise like in Matlab
    else:
        raise NameError('Houston We Have a Problem -- That Kernel is not recognized')
    return K

class optStruct:
    def __init__(self, dataMatIn, classLabels, C, toler, kTup):
        # initialize the structure with the parameters
        self.X = dataMatIn
        self.labelMat = classLabels
        self.C = C
        self.tol = toler
        self.m = shape(dataMatIn)[0]
        self.alphas = mat(zeros((self.m, 1)))
        self.b = 0
        self.eCache = mat(zeros((self.m, 2)))  # first column is valid flag
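As a quick sanity check of the 'rbf' branch's convention, exp(-||x_j − A||² / σ²), here is a self-contained recomputation for two toy points, assuming σ = 2 (the points and σ are made up for illustration):

```python
from numpy import mat, exp, zeros, shape

X = mat([[0.0, 0.0], [3.0, 4.0]])
A = X[0, :]          # evaluate the kernel column against the first sample
sigma = 2.0
m = shape(X)[0]
K = mat(zeros((m, 1)))
for j in range(m):
    deltaRow = X[j, :] - A
    K[j] = deltaRow * deltaRow.T          # squared Euclidean distance to A
K = exp(K / (-1 * sigma ** 2))            # same convention as the 'rbf' branch above
# K[0, 0] == exp(0) == 1.0 and K[1, 0] == exp(-25/4)
```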

This concludes the body of this section.