Wednesday, May 26, 2010

How does scaling the kernel matrix affect the SVM outputs?

Writing this down so I don't forget; I'll translate it at some point:

"Hi Pavan,

I was telling you about the baseline that performs differently when the kernels are scaled. I found that the cause is in the SVM step. When the kernels are scaled so that trace = 1, the SVM returns fewer support vectors. However, when I increase the C parameter, the output matches the old output again. So the conclusion is that scaling the kernels corresponds to (proportionally) scaling C: multiplying the kernel by a factor has the same effect as multiplying C by that factor.

I think the problem was using a small C (C=10). In this case the SVM effectively optimizes only the first term of its objective (the regularizer on w, or f(w)), so kernel selection plays almost no role in the cost function. This carries over to the MKL formulation: it optimizes error + regularizer on p, and since changing the weights makes no difference to the error, MKL chooses p so that sum_of_p decreases. That is why p approaches 0.

JFI
YGZ
"