What is the fundamental idea behind "Maximal Margin Classifiers" (as well as their extensions, the "Support Vector Classifier" and the "Support Vector Machine")?
What is a support vector?
In the plot below, which points are the "support vectors"?
Sketch or code (using Python) the following two-dimensional hyperplanes, indicating for each the set of points where the expression is greater than zero (e.g., $1 + 3X_1 - X_2 > 0$) and the set where it is less than zero (e.g., $1 + 3X_1 - X_2 < 0$).
a. $1 + 3X_1 - X_2 = 0$
b. $-2 + X_1 + 2X_2 = 0$
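One way to check which side of each hyperplane a point falls on is to evaluate the expression directly and look at its sign. A minimal, dependency-free sketch (the helper `side` and the sample points are illustrative, not part of the exercise):

```python
def f_a(x1, x2):
    # hyperplane (a): 1 + 3*X1 - X2 = 0
    return 1 + 3 * x1 - x2

def f_b(x1, x2):
    # hyperplane (b): -2 + X1 + 2*X2 = 0
    return -2 + x1 + 2 * x2

def side(f, points):
    """Return +1 where f > 0, -1 where f < 0, and 0 on the hyperplane."""
    return [(v > 0) - (v < 0) for v in (f(x1, x2) for x1, x2 in points)]

points = [(0, 0), (0, 5), (2, 0)]
print(side(f_a, points))  # -> [1, -1, 1]
print(side(f_b, points))  # -> [-1, 1, 0]  (the point (2, 0) lies on hyperplane (b))
```

For an actual sketch, the same sign test can drive a `matplotlib` filled-contour plot over a grid of $(X_1, X_2)$ values.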
Fundamentally, how are "Support Vector Classifier" and "Support Vector Machines" extensions of "Maximal Margin Classifiers"?
If $C$ is large for a support vector classifier in Scikit-Learn, will there be more or fewer support vectors than if $C$ is small? Explain your answer.
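A quick experiment to check your intuition, assuming scikit-learn is installed (the dataset and the particular values of $C$ are illustrative): fit the same linear SVC with a small and a large $C$, then compare the number of support vectors via the fitted model's `support_` attribute.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping classes so the margin matters.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

# In Scikit-Learn, small C = heavy regularization = wider, more tolerant margin.
loose = SVC(kernel="linear", C=0.01).fit(X, y)
tight = SVC(kernel="linear", C=100.0).fit(X, y)

print(len(loose.support_), len(tight.support_))
```

With the wider margin (small $C$), more training points sit on or inside the margin, and every such point is a support vector.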
Is the "confidence score" output by an SVM classifier the same as a "probability score"?
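In Scikit-Learn you can inspect this directly, assuming scikit-learn is installed (the dataset is illustrative): `decision_function` returns signed distances to the separating hyperplane, which are not constrained to $[0, 1]$, and `predict_proba` is only available when the model is fit with `probability=True`.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# No predict_proba here: probability=True was not set at fit time.
clf = SVC(kernel="linear").fit(X, y)

scores = clf.decision_function(X[:5])
print(scores)  # signed distances to the hyperplane, not probabilities
```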
Say you trained an SVM classifier with an RBF kernel. It seems to underfit the training set: should you increase or decrease $\gamma$ (gamma) and/or $C$?
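A small sanity check of the underfitting scenario, assuming scikit-learn is installed (the dataset and hyperparameter values are illustrative): an RBF SVC with very small $\gamma$ and $C$ is too constrained to fit even the training set, and raising them improves training accuracy.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

# Tiny gamma and C: an overly smooth, heavily regularized model.
underfit = SVC(kernel="rbf", gamma=0.01, C=0.01).fit(X, y)
# Larger gamma and C: more flexible decision boundary.
better = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)

print(underfit.score(X, y), better.score(X, y))
```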