
Stochastic Process 2023 08

From Hao Zhang’s 2023 Lecture 08

Gaussian Processes

Sample Mean and Variance

Let \(X_1, X_2, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)\), and consider the sample mean and sample variance \(\overline{X} = \dfrac{1}{n} \sum_{k=1}^{n} X_k, \quad \overline{S} = \dfrac{1}{n-1} \sum_{k=1}^n (X_k - \overline{X})^2\). We will prove that \(\overline{X}\) and \(\overline{S}\) are independent.

\[\begin{align} & (n-1) \overline{S} = \sum_{k=1}^n (X_k - \overline{X})^2\\ =& \sum_{k=1}^n (X_k^2 - 2X_k \overline{X} + (\overline{X})^2)\\ =& \sum_{k=1}^n X_k^2 - n (\overline{X})^2 \end{align}\]

We design a matrix whose first row is \(\big( \frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \dots, \frac{1}{\sqrt{n}} \big)\),

\[A = \begin{pmatrix} \frac{1}{\sqrt{n}} & \frac{1}{\sqrt{n}} & \dots & \frac{1}{\sqrt{n}}\\ \vdots & \vdots & \ddots & \vdots \end{pmatrix}\]

with the remaining \(n - 1\) rows chosen (for instance by Gram–Schmidt) to complete an orthonormal basis, so that \(A A^{\intercal} = A^\intercal A = I\).

Then, for \(Y = AX\), we know \(Y \sim N(A \mu, A (\sigma^2 I) A^\intercal) = N(A \mu, \sigma^2 I)\), where \(\mu\) here denotes the mean vector whose entries all equal \(\mu\); in particular, \(Y_1 = \frac{1}{\sqrt{n}} \sum_{k=1}^n X_k = \sqrt{n}\cdot\overline{X}\). Also, since \(A\) is orthogonal,

\[X = A^\intercal Y\] \[\begin{align} & (n-1) \overline{S} = \sum_{k=1}^n X_k^2 - n(\overline{X})^2\\ =& X^\intercal X - n(\overline{X})^2\\ =& Y^\intercal A A ^\intercal Y - Y_1^2\\ =& \sum_{k=2}^n Y_k^2 \end{align}\]

Thus \((n-1) \overline{S} = \sum_{k=2}^n Y_k^2\) depends only on \(Y_2, \dots, Y_n\), which are independent of \(Y_1 = \sqrt{n}\,\overline{X}\); hence \(\overline{S}\) and \(\overline{X}\) are independent.
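As a quick numerical sanity check (a sketch, not a substitute for the proof), we can simulate i.i.d. normal samples and confirm that \(\overline{X}\) and \(\overline{S}\) are uncorrelated; the parameters `mu`, `sigma`, `n`, and the trial count below are arbitrary choices.

```python
import numpy as np

# Simulate many independent datasets of n i.i.d. N(mu, sigma^2) draws and
# check that the sample mean and sample variance are (empirically) uncorrelated.
rng = np.random.default_rng(0)
mu, sigma, n, trials = 1.0, 2.0, 10, 100_000

samples = rng.normal(mu, sigma, size=(trials, n))
xbar = samples.mean(axis=1)         # sample mean of each dataset
sbar = samples.var(axis=1, ddof=1)  # unbiased sample variance of each dataset

# Should be close to 0 (independence implies zero correlation).
print(np.corrcoef(xbar, sbar)[0, 1])
```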

Conditional Distribution

\[(X_1, X_2) \sim N\Bigg( \begin{pmatrix} \mu_1\\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \Bigg), \quad X_1 \in \mathbb{R}^m, X_2 \in \mathbb{R}^n\]

We want to find the conditional density \(f_{X_2 \vert X_1} (x_2 \vert x_1) = f_{X_1, X_2}(x_1, x_2) / f_{X_1}(x_1)\); taking this ratio, we need to deal with the following exponent

\[\exp \Bigg( -\frac{1}{2} \begin{pmatrix} x_1 - \mu_1\\ x_2 - \mu_2 \end{pmatrix}^\intercal \begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1} \begin{pmatrix} x_1 - \mu_1\\ x_2 - \mu_2 \end{pmatrix} + \frac{1}{2} (x_1^\intercal - \mu_1^\intercal) \Sigma_{11}^{-1} (x_1 - \mu_1) \Bigg)\]

First we need to calculate the inverse of the block covariance matrix.

\[\begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \begin{array}{c} \begin{pmatrix} I & 0 \\ -\Sigma_{21} \Sigma_{11}^{-1} & I \end{pmatrix} \\ \xrightarrow[\text{left}]{\hspace{3cm}} \end{array} \begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ 0 & \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} \end{pmatrix}\] \[\begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ 0 & \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} \end{pmatrix} \begin{array}{c} \begin{pmatrix} I & -\Sigma_{11}^{-1} \Sigma_{12} \\ 0 & I \end{pmatrix} \\ \xrightarrow[\text{right}]{\hspace{3cm}} \end{array} \begin{pmatrix} \Sigma_{11} & 0\\ 0 & \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} \end{pmatrix}\]

Thus

\[\begin{pmatrix} I & 0 \\ -\Sigma_{21} \Sigma_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \begin{pmatrix} I & -\Sigma_{11}^{-1} \Sigma_{12} \\ 0 & I \end{pmatrix}= \begin{pmatrix} \Sigma_{11} & 0\\ 0 & \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} \end{pmatrix}\]

we know

\[\begin{pmatrix} I & 0\\ A & I \end{pmatrix}^{-1} = \begin{pmatrix} I & 0\\ -A & I \end{pmatrix}\] \[\begin{pmatrix} I & A\\ 0 & I \end{pmatrix}^{-1} = \begin{pmatrix} I & -A\\ 0 & I \end{pmatrix}\]

thus

\[\begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}= \begin{pmatrix} I & 0 \\ \Sigma_{21} \Sigma_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} \Sigma_{11} & 0\\ 0 & \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} \end{pmatrix} \begin{pmatrix} I &\Sigma_{11}^{-1} \Sigma_{12} \\ 0 & I \end{pmatrix}\]

thus

\[\begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1}= \begin{pmatrix} I &-\Sigma_{11}^{-1} \Sigma_{12} \\ 0 & I \end{pmatrix} \begin{pmatrix} \Sigma_{11}^{-1} & 0\\ 0 & (\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ -\Sigma_{21} \Sigma_{11}^{-1} & I \end{pmatrix}\]
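As a sketch, we can verify this factorization of the inverse numerically on a random positive-definite matrix; the block sizes and seed below are arbitrary.

```python
import numpy as np

# Check Sigma^{-1} = U D L on a random symmetric positive-definite matrix,
# where U, D, L are the three factors displayed above.
rng = np.random.default_rng(1)
m, n = 2, 3
M = rng.normal(size=(m + n, m + n))
Sigma = M @ M.T + (m + n) * np.eye(m + n)  # well-conditioned SPD matrix

S11, S12 = Sigma[:m, :m], Sigma[:m, m:]
S21, S22 = Sigma[m:, :m], Sigma[m:, m:]
S11_inv = np.linalg.inv(S11)
schur = S22 - S21 @ S11_inv @ S12          # Schur complement of Sigma_11

U = np.block([[np.eye(m), -S11_inv @ S12],
              [np.zeros((n, m)), np.eye(n)]])
D = np.block([[S11_inv, np.zeros((m, n))],
              [np.zeros((n, m)), np.linalg.inv(schur)]])
L = np.block([[np.eye(m), np.zeros((m, n))],
              [-S21 @ S11_inv, np.eye(n)]])

print(np.allclose(U @ D @ L, np.linalg.inv(Sigma)))  # expect True
```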

Returning to the quadratic form, we can continue

\[\begin{align} &\begin{pmatrix} x_1 - \mu_1\\ x_2 - \mu_2 \end{pmatrix}^\intercal \begin{pmatrix} \Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1} \begin{pmatrix} x_1 - \mu_1\\ x_2 - \mu_2 \end{pmatrix}\\ =& \big( x_1^\intercal - \mu_1^\intercal \, , \, x_2^\intercal - \mu_2^\intercal - (x_1^\intercal - \mu_1^\intercal)\Sigma_{11}^{-1} \Sigma_{12}\big) \begin{pmatrix} \Sigma_{11}^{-1} & 0\\ 0 & (\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})^{-1} \end{pmatrix} \begin{pmatrix} x_1 - \mu_1\\ x_2 - \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1) \end{pmatrix}\\ =& (x_1^\intercal - \mu_1^\intercal) \Sigma_{11}^{-1} (x_1 - \mu_1) + \big(x_2 - \mu_2 - \Sigma_{21} \Sigma_{11}^{-1}(x_1-\mu_1) \big)^\intercal (\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})^{-1} \big(x_2 - \mu_2 - \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1)\big) \end{align}\]

thus the conditional density is proportional to

\[\exp \Big( -\dfrac{1}{2} \big(x_2 - \mu_2 - \Sigma_{21} \Sigma_{11}^{-1}(x_1-\mu_1) \big)^\intercal (\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})^{-1} \big(x_2 - \mu_2 - \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1)\big) \Big)\]

and therefore

\[X_2 \mid X_1 = x_1 \sim N\big(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1),\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\big)\]
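A minimal Monte Carlo sketch of this formula with scalar blocks: we condition empirically by keeping joint samples whose first coordinate falls near \(x_1\). All numbers below are arbitrary illustrations.

```python
import numpy as np

# Compare the closed-form conditional mean/variance with an empirical
# estimate obtained by rejection (keeping samples with X1 close to x1).
rng = np.random.default_rng(2)
mu = np.array([0.5, -1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.5]])
x1 = 1.2  # conditioning value

cond_mean = mu[1] + Sigma[1, 0] / Sigma[0, 0] * (x1 - mu[0])
cond_var = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]

draws = rng.multivariate_normal(mu, Sigma, size=2_000_000)
near = draws[np.abs(draws[:, 0] - x1) < 0.01, 1]  # X2 given X1 ~ x1

print(cond_mean, near.mean())  # should roughly agree
print(cond_var, near.var())    # should roughly agree
```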

Special Cases

If \(X_1, X_2 \in \mathbb{R}\)

\[E(X_2 \vert X_1 = x_1) = \mu_2 + \dfrac{\sigma_{21}}{\sigma_{11}}(x_1 - \mu_1)\] \[\mathrm{Var}(X_2 \vert X_1 = x_1) = \sigma_{22} - \dfrac{\sigma_{21}^2}{\sigma_{11}}\]
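Writing \(\sigma_{11} = \sigma_1^2\), \(\sigma_{22} = \sigma_2^2\), and \(\sigma_{21} = \rho \sigma_1 \sigma_2\) with \(\rho\) the correlation coefficient, this takes the familiar form

\[E(X_2 \vert X_1 = x_1) = \mu_2 + \rho \dfrac{\sigma_2}{\sigma_1} (x_1 - \mu_1), \quad \mathrm{Var}(X_2 \vert X_1 = x_1) = \sigma_2^2 (1 - \rho^2)\]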

More Examples

With this conclusion, we can compute

\[E(f(X_2) \vert X_1)\]

for any function \(f\), since we know the conditional distribution is normal.

We have the conditional pdf and the conditional expectation, so why do we not have a "conditional random variable"?

For \(Y \sim N(0, \sigma^2)\), let’s calculate \(E(Y^n)\)

\[E(Y^n) = \begin{cases} 0, \quad n = 2k-1\\ (2k-1)!! \sigma^{2k}, \quad n = 2k \end{cases}\] \[\begin{align} &\int_{-\infty}^{\infty} y^{2k} \exp\big(-\dfrac{y^2}{2\sigma^2} \big) dy\\ =& -\sigma^2 \int_{-\infty}^{\infty} y^{2k-1} d\Big( \exp\big( - \dfrac{y^2}{2\sigma^2}\big) \Big)\\ =& -\sigma^2 y^{2k-1} \exp \big( -\dfrac{y^2}{2\sigma^2} \big) \Big\vert_{-\infty}^{\infty} + (2k-1)\sigma^2 \int_{-\infty}^{\infty} \exp\big(-\dfrac{y^2}{2\sigma^2} \big) y^{2k-2} dy\\ =& (2k-1)\sigma^2 \int_{-\infty}^{\infty} \exp\big(-\dfrac{y^2}{2\sigma^2} \big) y^{2k-2} dy \end{align}\]

Dividing both sides by the normalizing constant \(\sqrt{2\pi}\,\sigma\), we obtain the recursion

\[E(Y^{2k}) = (2k-1)\sigma^2 E(Y^{2k-2})\]

Thus

\[E(Y^{2k}) = (2k-1)!! \sigma^{2k}\]

where

\[(2k-1)!! = (2k-1) (2k-3) \dots 1\] \[\begin{align} E(Y^2) &= \sigma^2\\ E(Y^4) &= 3 \sigma^4\\ E(Y^6) &= 15 \sigma^6 \end{align}\]
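A quick Monte Carlo sketch of the even-moment formula (the value of \(\sigma\) is arbitrary):

```python
import numpy as np

# Check E(Y^{2k}) = (2k-1)!! * sigma^{2k} for a few k by simulation.
rng = np.random.default_rng(3)
sigma = 1.5
y = rng.normal(0.0, sigma, size=5_000_000)

def double_factorial(m):
    # m!! for odd m: m * (m - 2) * ... * 1
    return int(np.prod(np.arange(m, 0, -2)))

for k in (1, 2, 3):
    empirical = np.mean(y ** (2 * k))
    exact = double_factorial(2 * k - 1) * sigma ** (2 * k)
    print(k, empirical, exact)
```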

For \(Y \sim N(\mu, \sigma^2)\), how do we calculate \(E(\cos(Y))\)? Recall the characteristic function \(\Phi_Y(\omega) = E(e^{j\omega Y}) = \exp\big(j\mu\omega - \frac{1}{2}\sigma^2\omega^2\big)\). Then

\[\cos(Y) = \dfrac{1}{2} \big( \exp(jY) + \exp(-jY) \big)\] \[\begin{align} E(\cos(Y)) &= \dfrac{1}{2} E\big( \exp(jY) + \exp(-jY) \big)\\ &= \dfrac{1}{2} \big(\Phi_Y(1) + \Phi_Y(-1) \big)\\ &= \dfrac{1}{2} \big( \exp(j \mu - \frac{1}{2}\sigma^2) + \exp(-j\mu - \frac{1}{2} \sigma^2) \big)\\ &= \exp\big(-\frac{1}{2} \sigma^2 \big) \cos(\mu) \end{align}\]
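Again a quick Monte Carlo sketch (arbitrary \(\mu\) and \(\sigma\)):

```python
import numpy as np

# Check E(cos(Y)) = exp(-sigma^2 / 2) * cos(mu) by simulation.
rng = np.random.default_rng(4)
mu, sigma = 0.7, 1.3
y = rng.normal(mu, sigma, size=2_000_000)

print(np.cos(y).mean(), np.exp(-0.5 * sigma ** 2) * np.cos(mu))
```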

Bayes

Assume \(X \sim N(\mu, \Sigma_X)\) and \(Y = B X + Z\), where \(X, Z\) are independent and \(Z \sim N(0, \sigma^2 I)\). We want to investigate \(X \vert Y\). This can be understood as follows: \(Y\) is the observed value and \(X\) is the hidden state.
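As a sketch of where this leads, note that \((Y, X)\) is jointly Gaussian with \(\operatorname{Cov}(X, Y) = \Sigma_X B^\intercal\) and \(\operatorname{Cov}(Y) = B \Sigma_X B^\intercal + \sigma^2 I\), so \(X \vert Y = y\) follows from the conditioning formula derived above. The dimensions and numbers below are arbitrary illustrations.

```python
import numpy as np

# Posterior of X given Y = y in the linear-Gaussian model Y = B X + Z,
# obtained by plugging the blocks of Cov(Y, X) into the conditioning formula.
rng = np.random.default_rng(5)
n_x, n_y, sigma = 3, 2, 0.5
mu = np.array([1.0, -0.5, 0.2])
A = rng.normal(size=(n_x, n_x))
Sigma_X = A @ A.T + np.eye(n_x)  # an arbitrary positive-definite prior covariance
B = rng.normal(size=(n_y, n_x))
y = np.array([0.3, -0.1])        # an arbitrary observed value

S_yy = B @ Sigma_X @ B.T + sigma ** 2 * np.eye(n_y)  # Cov(Y)
S_xy = Sigma_X @ B.T                                 # Cov(X, Y)

K = S_xy @ np.linalg.inv(S_yy)   # gain applied to the residual y - B mu
post_mean = mu + K @ (y - B @ mu)
post_cov = Sigma_X - K @ S_xy.T

print(post_mean)
print(post_cov)
```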
