Finite function classes
Suppose $l$ is the zero-one loss. Let $F$ be a finite function class ($|F|\lt \infty$). Fix any $\delta \in(0,1)$. With probabilty (w.p.) at least $1-\delta$ :
$$\max_{f \in F}[l[f]-\hat{l_n}[f]] \leq \sqrt{\frac{2log{|F|/\delta}}{n}} $$
- $|F|$ amount of function we have
- $\delta$ failed probabilty
- $n$ number of samples
The same bound holds reverse for
$$\max_{f \in F}[\hat{l_n}[f]-l[f]]$$
We will use Hoeffding's inequality for proofing above
Proof
$f \in F$, define random veriable $X_i$
$$\psi_i=1\left[y_i \neq sgn(f(x_i))\right]$$
$$M_n = \sum_{i=1}^n \psi_i$$
By Hoeffding's inequality
:
$$
\begin{aligned}
\mathbb{P} ( \mathbb{E}[m_n]-m_n \geq t) &\leq exp(\frac{-t^2}{2\sum_{i=1}^n a_i^2 = 1})\\
& = exp(\frac{-t^2}{2n})
\end{aligned}$$
change $s = \frac{t}{n}$
$$\mathbb{P} ( \frac{1}{n}\mathbb{E}[m_n]-\frac{1}{n}m_n \geq t) \leq exp(\frac{-nt^2}{2})$$
- $$
\begin{aligned}
\frac{1}{n} \mathbb{E}[m_n] & = \frac{1}{n} \sum_{i=1}^n \mathbb{E}[1[y_i \neq sgn(f(x_i))] \\
& = \frac{1}{n} \sum_{i=1}^n l(f) = l(f)
\end{aligned}
$$
why can equal to $l(f)$, due to $l(f) = \mathbb{E} 1[y \neq sgn(f(x))]$
2.
$$
\begin{aligned}
\frac{1}{n}m_n & = \frac{1}{n} \sum_{i=1}^n 1[ y_i \neq sgn(f(x_i))] \\
& = \hat{L_n}[f]
\end{aligned}
$$
for fixed $f \in F$:
$$
\mathbb{P}(L[f] - \hat{L_n}[f] \geq s) \leq exp(\frac{-ns^2}{2})
$$
Main Part coming!!!!
$$
\begin{aligned}
\mathbb{P} (\max_{f \in F}[L[f]-\hat{L_n}[f]] \geq s) &= \mathbb{P} (\bigcup_{f \in F}[L[f]-\hat{L_n}[f]] \geq s) \\
&\leq \sum_{f \in F} \mathbb{P} (L[f] -\hat{L_n}[f] \geq s)\quad\text{due to union bound} \\
&\leq \sum_{f \in F} exp(\frac{-ns^2}{2}) = |F|exp(\frac{-ns^2}{2})\quad\text{due to Hoeffding’s}
\end{aligned}
$$
Cal failure prob that $\leq \delta$
$$
|F|exp(\frac{-ns^2}{2}) \leq \delta
$$
$$
=> s = \sqrt{\frac{2log{|F|/\delta}}{n}}
$$