Sub-Gaussian random variables

Definition:

A random variable $X$ is $\sigma$-sub-Gaussian
if, for all $\lambda \in \mathbb{R}$,
$$
\mathbb{E}[\exp(\lambda (X - \mathbb{E}[X]))] \leq \exp(\frac{\lambda^2 \sigma^2}{2})
$$
Note that the definition is stated for the centered random variable $X - \mathbb{E}[X]$.
For comparison: if $X\sim N(\mu, \sigma^2)$ (a Gaussian distribution), then for all
$$
\lambda \in \mathbb{R}, \qquad \mathbb{E}[\exp(\lambda(X-\mu))] = \exp(\frac{\lambda^2 \sigma^2}{2})
$$
so a Gaussian satisfies the defining inequality with equality.
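As a quick numerical sanity check (a minimal Monte Carlo sketch; the values of $\mu$, $\sigma$, $\lambda$, and the sample size are arbitrary choices), we can verify the Gaussian MGF identity:

```python
import numpy as np

# Monte Carlo check of E[exp(lambda (X - mu))] = exp(lambda^2 sigma^2 / 2)
rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0
x = rng.normal(mu, sigma, size=1_000_000)

for lam in [-1.0, -0.5, 0.5, 1.0]:
    empirical = np.exp(lam * (x - mu)).mean()  # E[exp(lambda (X - mu))]
    exact = np.exp(lam**2 * sigma**2 / 2)      # exp(lambda^2 sigma^2 / 2)
    print(f"lambda={lam:+.1f}: MC={empirical:.4f}, exact={exact:.4f}")
```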

Sub-Gaussian concentration (tail bound for sub-Gaussian random variables)

Let $X$ be a $\sigma$-sub-Gaussian random variable. Then:
$$
\forall t \gt 0, \qquad \mathbb{P}(X-\mathbb{E}[X] \geq t) \leq \exp(\frac{-t^2}{2\sigma^2})
$$

Proof:

Fix $\lambda \gt 0$. By Markov's inequality applied to $\exp(\lambda(X-\mathbb{E}[X]))$ (the Chernoff bound):
$$
\begin{aligned}
\mathbb{P}(X-\mathbb{E}[X] \geq t) &\leq \exp(-\lambda t) \cdot \mathbb{E}[\exp(\lambda(X-\mathbb{E}[X]))] \qquad \text{(Chernoff/Markov)}\\
&\leq \exp(-\lambda t) \cdot \exp(\frac{\lambda^2 \sigma^2}{2}) \qquad \text{(sub-Gaussian property)}\\
&= \exp\{-\lambda t + \frac{\lambda^2 \sigma^2}{2}\}
\end{aligned}
$$
Now we choose $\lambda$ to make this bound as small as possible.
So, let
$$
g(\lambda)=-\lambda t + \frac{\lambda^2 \cdot \sigma^2}{2}
$$
Set the derivative of $g(\lambda)$ to zero:
$$
\begin{aligned}
g'(\lambda) = -t+\lambda \sigma^2 &= 0 \\
\lambda_* &=\frac{t}{\sigma^2}
\end{aligned}
$$
Since $g''(\lambda)=\sigma^2 \gt 0$, $\lambda_*$ is the minimizer, and $\lambda_* \gt 0$ as required.
Then we can compute $g(\lambda_*)$:
$$
g(\lambda_*)=-\frac{t^2}{\sigma^2}+\frac{t^2}{2\sigma^2}=\frac{-t^2}{2\sigma^2}
$$
Finally, substitute $\lambda = \lambda_*$ into the bound above:
$$
\begin{aligned}
\mathbb{P}(X-\mathbb{E}[X] \geq t) &\leq \exp\{-\lambda_* t + \frac{\lambda_*^2 \sigma^2}{2}\} \\
&= \exp\{g(\lambda_*)\}\\
&= \exp\{ \frac{-t^2}{2\sigma^2}\}
\end{aligned}
$$
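As a numerical illustration (a minimal sketch; a standard normal is used since it is $1$-sub-Gaussian, and the $t$ values and sample size are arbitrary), the empirical tail indeed sits below the bound:

```python
import numpy as np

# Compare the empirical tail P(X - E[X] >= t) with exp(-t^2 / (2 sigma^2))
rng = np.random.default_rng(0)
sigma = 1.0
x = rng.normal(0.0, sigma, size=1_000_000)  # centered: E[X] = 0

for t in [0.5, 1.0, 2.0, 3.0]:
    empirical = (x >= t).mean()              # P(X >= t)
    bound = np.exp(-t**2 / (2 * sigma**2))   # sub-Gaussian tail bound
    print(f"t={t}: tail={empirical:.5f} <= bound={bound:.5f}")
```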

Next, we show that bdd (bounded) r.v.s are sub-Gaussian.

(This is the key step toward Hoeffding's inequality.)

A Rademacher r.v. is 1-sub-Gaussian.

Define a Rademacher r.v. $\epsilon$ by:
$$
\mathbb{P}(\epsilon = +1)=\frac{1}{2} \qquad \mathbb{P}(\epsilon = -1)=\frac{1}{2}
$$

Proof that a Rademacher r.v. is 1-sub-Gaussian:

Fix $\lambda \in \mathbb{R}$. Since $\mathbb{E}[\epsilon] = 0$, we need to show:
$$
\mathbb{E}[\exp(\lambda \epsilon)] \leq \exp(\frac{\lambda^2}{2})
$$
Expand the expectation directly:
$$
\begin{aligned}
\mathbb{E}[\exp(\lambda \epsilon)] &= \frac{1}{2}\exp(\lambda)+\frac{1}{2}\exp(-\lambda) \\
&=\frac{1}{2}[\sum_{k \geq 0} \frac{\lambda^k}{k!} + \sum_{k \geq 0} \frac{(-\lambda)^k}{k!}] \qquad (\exp(x)=\sum_{k \geq 0}\frac{x^k}{k!}) \\
&= \sum_{k=0}^\infty \frac{\lambda^{2k}}{(2k)!} \qquad \text{(odd terms cancel, even terms double)}
\end{aligned}
$$

Now, we need the factorial bound (for $k \geq 1$):
$$
(2k)!=(2k)(2k-1)\cdots(k+1)\,k! \geq 2^k k!
$$
since each of the $k$ factors $(2k)(2k-1)\cdots(k+1)$ is at least $2$. Therefore:
$$
\begin{aligned}
\sum_{k=0}^\infty \frac{\lambda^{2k}}{(2k)!} &\leq \sum_{k=0}^\infty \frac{\lambda^{2k}}{2^k k!} \\
&= \sum_{k=0}^\infty \frac{(\lambda^2/2)^k}{k!} = \exp(\frac{\lambda^2}{2})
\end{aligned}
$$
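Since $\mathbb{E}[\exp(\lambda\epsilon)] = \cosh(\lambda)$ in closed form, the claim can also be checked numerically (a minimal sketch; the grid of $\lambda$ values is an arbitrary choice):

```python
import numpy as np

# Check cosh(lambda) = E[exp(lambda * eps)] <= exp(lambda^2 / 2)
lams = np.linspace(-5.0, 5.0, 11)
mgf = np.cosh(lams)          # (exp(lam) + exp(-lam)) / 2
bound = np.exp(lams**2 / 2)  # 1-sub-Gaussian bound
assert np.all(mgf <= bound)
print(np.column_stack([lams, mgf, bound]))
```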

Claim: Let $X$ be any r.v. with $X \in [-1, 1]$. Then $X$ is sub-Gaussian: the symmetrization argument below gives $\sigma = 2$ (the sharp constant $\sigma = 1$ also holds, by Hoeffding's lemma, which we do not prove here).

Proof (symmetrization):

"Symmetry argument": introduce $X'$, an independent copy of $X$ (i.i.d. with $X$), and an independent Rademacher r.v. $\epsilon$.
Observe (obs): since $X - X'$ is symmetric about $0$,
$$X'-X \underset{\text{distribution}}{=} X-X' \underset{\text{distribution}}{=} \epsilon(X-X')$$

$$
\begin{aligned}
\mathbb{E}[\exp(\lambda(X-\mathbb{E}[X]))] &= \underset{X}{\mathbb{E}} [\exp(\lambda(X-\underset{X'}{\mathbb{E}}[X']))]\\
&=\underset{X}{\mathbb{E}}[\exp(\underset{X'}{\mathbb{E}}[\lambda(X-X')])] \qquad (X \perp X') \\
&\leq \underset{X,X'}{\mathbb{E}}[\exp(\lambda(X-X'))] \qquad (\text{Jensen's ineq.: } f \text{ convex} \Rightarrow f(\mathbb{E}[Y]) \leq \mathbb{E}[f(Y)]) \\
&= \underset{X,X',\epsilon}{\mathbb{E}}[\exp(\lambda(X-X')\epsilon)] \qquad (X-X' \underset{\text{distribution}}{=} \epsilon(X-X')) \\
&= \underset{X,X'}{\mathbb{E}}\, \underset{\epsilon}{\mathbb{E}}[\exp(\lambda(X-X')\epsilon)] \qquad (\lambda(X-X') \text{ is a constant inside } \underset{\epsilon}{\mathbb{E}}[\cdot]) \\
&\leq \underset{X,X'}{\mathbb{E}}[\exp(\frac{\lambda^2(X-X')^2}{2})] \qquad (\epsilon \text{ is 1-sub-Gaussian}) \\
&\leq \exp(\frac{\lambda^2 \cdot 2^2}{2}) \qquad (X, X' \in [-1,1] \Rightarrow |X-X'| \leq 2)
\end{aligned}
$$
This is the defining inequality with $\sigma = 2$.
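To see the bound numerically (a minimal Monte Carlo sketch; the uniform distribution on $[-1,1]$ is just one arbitrary choice of bounded r.v.):

```python
import numpy as np

# MGF of a bounded r.v. vs. the sub-Gaussian bounds with sigma = 1 and sigma = 2
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=1_000_000)  # bounded r.v. on [-1, 1], E[X] = 0

for lam in [0.5, 1.0, 2.0]:
    mgf = np.exp(lam * (x - x.mean())).mean()
    sharp = np.exp(lam**2 / 2)        # sigma = 1 (Hoeffding's lemma)
    sym = np.exp(lam**2 * 2**2 / 2)   # sigma = 2 (symmetrization argument)
    print(f"lam={lam}: MGF={mgf:.4f} <= sharp={sharp:.4f} <= sym={sym:.2f}")
```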

Hoeffding's inequality

We build up the proof via three claims:

Claim 1: Let $X_1$, $X_2$, $X_3$, ..., $X_n$ be independent random variables, and suppose $X_i$ is $\sigma_i$-sub-Gaussian.
Then $S_n=\sum_{i=1}^n X_i$ is $\sqrt{\sum_{i=1}^n \sigma_i^2}$-sub-Gaussian.
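Claim 1 follows in one line, because independence factorizes the centered MGF:
$$
\mathbb{E}[\exp(\lambda(S_n-\mathbb{E}[S_n]))]
=\prod_{i=1}^n \mathbb{E}[\exp(\lambda(X_i-\mathbb{E}[X_i]))]
\leq \prod_{i=1}^n \exp(\frac{\lambda^2\sigma_i^2}{2})
=\exp(\frac{\lambda^2\sum_{i=1}^n\sigma_i^2}{2})
$$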

Claim 2: Let $X$ be $\sigma$-sub-Gaussian.
Then $\alpha X$ is $|\alpha|\sigma$-sub-Gaussian.
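Claim 2 also follows directly from the definition, by absorbing $\alpha$ into $\lambda$:
$$
\mathbb{E}[\exp(\lambda(\alpha X-\mathbb{E}[\alpha X]))]
=\mathbb{E}[\exp((\lambda\alpha)(X-\mathbb{E}[X]))]
\leq \exp(\frac{(\lambda\alpha)^2\sigma^2}{2})
=\exp(\frac{\lambda^2(|\alpha|\sigma)^2}{2})
$$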

Claim 3: If $X \in [-a, a]$, then $X$ is $a$-sub-Gaussian (rescale the $[-1,1]$ case above by $\alpha = a$ using Claim 2, with the sharp constant $1$). Hence, given independent $X_i \in [-a_i, a_i]$,
$S_n=\sum_{i=1}^n X_i$ is $\sqrt{\sum_{i=1}^n a_i^2}$-sub-Gaussian by Claim 1.

Combining the claims with the sub-Gaussian tail bound proved earlier:

$$\mathbb{P}\{S_n-\mathbb{E}[S_n] \geq t \} \leq \exp(\frac{-t^2}{2 \sum_{i=1}^n a_i^2})$$

which is Hoeffding's inequality.
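As a final numerical illustration (a minimal Monte Carlo sketch; the choice of uniform $X_i$ on $[-1,1]$, of $n$, and of the $t$ values are all arbitrary):

```python
import numpy as np

# Hoeffding's inequality for a sum of n independent uniforms on [-1, 1]
rng = np.random.default_rng(0)
n, trials = 100, 100_000
a = np.ones(n)                               # X_i in [-a_i, a_i] with a_i = 1

s = rng.uniform(-1.0, 1.0, size=(trials, n)).sum(axis=1)  # S_n, E[S_n] = 0

for t in [5.0, 10.0, 20.0]:
    empirical = (s >= t).mean()
    bound = np.exp(-t**2 / (2 * a.dot(a)))   # exp(-t^2 / (2 sum a_i^2))
    print(f"t={t}: P(S_n >= t)={empirical:.5f} <= bound={bound:.5f}")
```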