Noting this down so I don't forget; I'll translate it at some point:
"Hi Pavan,
I was telling you about the baseline that performs differently when the kernels are scaled. I found that the problem lies in the SVM part. When the kernels are scaled so that trace = 1, the number of support vectors returned decreases. However, when I increase the C parameter, the output matches the old output again. So the conclusion is that scaling the kernels corresponds to (proportionally) scaling C.
I think the problem was using a small C (C=10). In this case the SVM essentially optimizes only the first component (the regularizer on w, or f(w)), so kernel selection does not play a role in the cost function. This is also reflected in the MKL formulation: since MKL optimizes error + regularizer on p, and changing the weights does not change the error, MKL chooses p so that sum_of_p decreases. That is why p approaches 0.
JFI
YGZ
"
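A quick sanity check of the "scaling the kernel corresponds to scaling C" claim, as a minimal sketch and not the actual baseline code. In the SVM dual the kernel only enters through the quadratic term, so replacing K by s*K and C by C/s should just rescale the dual variables by 1/s and leave the decision values unchanged. Everything concrete here is an assumption made for illustration: the toy 2-D data, the linear kernel, the bias-free SVM formulation, and the simple coordinate-ascent solver (train_dual is a hypothetical helper, not a library call).

```python
# Toy check: scaling a kernel by s while dividing C by s leaves the SVM output unchanged.
# Assumptions (not from the email): made-up 2-D data, linear kernel,
# bias-free soft-margin SVM, plain dual coordinate ascent as the solver.

X = [(2.0, 1.0), (1.5, 2.0), (-0.5, -1.0), (-1.0, -1.5), (-2.0, -0.5), (1.0, 0.5)]
y = [1, 1, 1, -1, -1, -1]  # the classes overlap on purpose, so C can matter


def linear_kernel(points):
    """Gram matrix of plain dot products."""
    return [[a[0] * b[0] + a[1] * b[1] for b in points] for a in points]


def train_dual(K, labels, C, sweeps=200):
    """Maximize sum(a) - 0.5 * a^T Q a with 0 <= a_i <= C, where Q_ij = y_i y_j K_ij."""
    n = len(labels)
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Partial derivative of the dual objective with respect to alpha_i.
            grad = 1.0 - labels[i] * sum(alpha[j] * labels[j] * K[i][j] for j in range(n))
            # Exact one-dimensional step, then clip back into the box [0, C].
            alpha[i] = min(max(alpha[i] + grad / K[i][i], 0.0), C)
    return alpha


def decision_values(K, labels, alpha):
    """f(x_i) = sum_j alpha_j y_j K(x_j, x_i) on the training points (no bias term)."""
    n = len(labels)
    return [sum(alpha[j] * labels[j] * K[i][j] for j in range(n)) for i in range(n)]


K = linear_kernel(X)
C = 10.0

# Scale the kernel so that its trace is 1.
s = 1.0 / sum(K[i][i] for i in range(len(X)))
K_scaled = [[s * v for v in row] for row in K]

alpha_orig = train_dual(K, y, C)
alpha_scaled = train_dual(K_scaled, y, C / s)  # C scaled up to compensate

f_orig = decision_values(K, y, alpha_orig)
f_scaled = decision_values(K_scaled, y, alpha_scaled)

max_diff = max(abs(a - b) for a, b in zip(f_orig, f_scaled))
print("max |f_orig - f_scaled| =", max_diff)
```

By the same argument, training on K_scaled with the unscaled C = 10 is the same as solving the original problem with C * s, i.e. a much smaller effective C, which is consistent with the mismatch described in the email.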