Finite function classes

Suppose $l$ is the zero-one loss. Let $F$ be a finite function class ($|F|\lt \infty$). Fix any $\delta \in(0,1)$. With probabilty (w.p.) at least $1-\delta$ :

$$\max_{f \in F}[l[f]-\hat{l_n}[f]] \leq \sqrt{\frac{2log{|F|/\delta}}{n}} $$

  • $|F|$ amount of function we have
  • $\delta$ failed probabilty
  • $n$ number of samples

The same bound holds reverse for

$$\max_{f \in F}[\hat{l_n}[f]-l[f]]$$

We will use Hoeffding's inequality for proofing above

Proof

$f \in F$, define random veriable $X_i$
$$\psi_i=1\left[y_i \neq sgn(f(x_i))\right]$$

$$M_n = \sum_{i=1}^n \psi_i$$
By Hoeffding's inequality:

$$
\begin{aligned}
\mathbb{P} ( \mathbb{E}[m_n]-m_n \geq t) &\leq exp(\frac{-t^2}{2\sum_{i=1}^n a_i^2 = 1})\\
& = exp(\frac{-t^2}{2n})
\end{aligned}$$

change $s = \frac{t}{n}$

$$\mathbb{P} ( \frac{1}{n}\mathbb{E}[m_n]-\frac{1}{n}m_n \geq t) \leq exp(\frac{-nt^2}{2})$$

  1. $$
    \begin{aligned}
    \frac{1}{n} \mathbb{E}[m_n] & = \frac{1}{n} \sum_{i=1}^n \mathbb{E}[1[y_i \neq sgn(f(x_i))] \\
    & = \frac{1}{n} \sum_{i=1}^n l(f) = l(f)
    \end{aligned}
    $$

why can equal to $l(f)$, due to $l(f) = \mathbb{E} 1[y \neq sgn(f(x))]$

2.
$$
\begin{aligned}
\frac{1}{n}m_n & = \frac{1}{n} \sum_{i=1}^n 1[ y_i \neq sgn(f(x_i))] \\
& = \hat{L_n}[f]
\end{aligned}
$$

for fixed $f \in F$:
$$
\mathbb{P}(L[f] - \hat{L_n}[f] \geq s) \leq exp(\frac{-ns^2}{2})
$$

Main Part coming!!!!
$$
\begin{aligned}
\mathbb{P} (\max_{f \in F}[L[f]-\hat{L_n}[f]] \geq s) &= \mathbb{P} (\bigcup_{f \in F}[L[f]-\hat{L_n}[f]] \geq s) \\
&\leq \sum_{f \in F} \mathbb{P} (L[f] -\hat{L_n}[f] \geq s)\quad\text{due to union bound} \\
&\leq \sum_{f \in F} exp(\frac{-ns^2}{2}) = |F|exp(\frac{-ns^2}{2})\quad\text{due to Hoeffding’s}
\end{aligned}
$$

Cal failure prob that $\leq \delta$
$$
|F|exp(\frac{-ns^2}{2}) \leq \delta
$$
$$
=> s = \sqrt{\frac{2log{|F|/\delta}}{n}}
$$