Matrix algebra and some distributions

\(\def\bs#1{\boldsymbol{#1}} \def\b#1{\mathbf{#1}} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator{\r}{r} \DeclareMathOperator{\det}{det} \DeclareMathOperator{\logit}{\text{logit}}\)

Topics of matrix algebra

Matrix \((r\times c)\): \(\b{A} = (a_{ij})\), \(i=1,\ldots,r\) and \(j=1,\ldots,c\)

Vector \((r\times 1)\): \(\b{x} = (x_{i})\), \(i=1,\ldots,r\)

Note

Simple operations with matrices, such as sum, multiplication, transposition and the dot product between two vectors, are assumed to be known!

Some particular cases

\(\b{1}_r=(1, 1, \ldots, 1)'\) – ones vector \((r\times 1)\)

\(\b{0}_r=(0, 0, \ldots, 0)'\) – zero vector \((r\times 1)\)

Diagonal matrix – a square matrix where all the elements not on the main diagonal are zero

\(\b{I}_r\) – identity matrix – a diagonal matrix \((r\times r)\) with ones on the main diagonal

Some matrix operators

For a square matrix \(\b{A}_{r\times r}\):

diagonal – \(\diag(\b{A})= (a_{11}, \ldots, a_{rr})'\)

trace – \(\tr(\b{A}) = \sum_{i=1}^r a_{ii} = \b{1}_r'\diag(\b{A})\)

determinant – \(\det(\b{A}) = |\b{A}|\) – a scalar function of the entries of \(\b{A}\) whose formula is not relevant here

For a general matrix \(\b{A}_{r\times c}\):

rank – \(\r(\b{A}) =\) the maximum number of linearly independent columns (or rows) of \(\b{A}\)

Some properties of trace and rank

For \(\b{A}_{r\times c}\) and \(\b{B}_{c\times r}\):

  1. \(\tr(\b{A}\b{B}) = \tr(\b{B}\b{A})\)

  2. \(\tr(\b{A}\b{A}') = \sum_{i=1}^r \sum_{j=1}^c{a_{ij}^2}\)

  3. \(\r(\b{A}) \leq \min(r, c)\)

    If \(\r(\b{A}) = \min(r, c)\) then \(\b{A}\) is said to be a full-rank matrix

  4. \(\r(\b{A}) = \r(\b{A}') = \r(\b{A}\b{A}') = \r(\b{A}'\b{A})\)
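The trace and rank properties above can be verified numerically; the sketch below (not part of the original notes) uses numpy with an arbitrary random example matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # A is 3x5
B = rng.standard_normal((5, 3))   # B is 5x3

# 1. tr(AB) = tr(BA)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# 2. tr(AA') equals the sum of squared entries of A
assert np.isclose(np.trace(A @ A.T), np.sum(A**2))

# 3. r(A) <= min(r, c); a random A is full rank with probability 1
assert np.linalg.matrix_rank(A) == min(A.shape)

# 4. r(A) = r(A') = r(AA') = r(A'A)
ranks = {np.linalg.matrix_rank(M) for M in (A, A.T, A @ A.T, A.T @ A)}
assert len(ranks) == 1
```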

Some definitions and properties of square matrices

  1. \(\b{A}\) is said to be invertible if there exists \(\b{B}\) such that \(\b{A}\b{B}=\b{B}\b{A}=\b{I}\)

    \(\b{B}=\b{A}^{-1}\) is called the inverse of \(\b{A}\)

  2. \(\b{A}\) is called non-invertible or singular if and only if \(|\b{A}|=0\)

  3. A full-rank square matrix is invertible.

  4. \((\b{A}')^{-1} = (\b{A}^{-1})'\)

  5. \((\b{A}\b{B})^{-1} = \b{B}^{-1}\b{A}^{-1}\)

  6. \(\b{A}\) is called orthogonal if \(\b{A}\b{A}'=\b{A}'\b{A}=\b{I}\)

    \(\b{A}\) orthogonal \(\implies |\b{A}|=\pm 1\) and \(\b{A}'=\b{A}^{-1}\)

  7. \(\b{A}\) is called idempotent if \(\b{A}\b{A}=\b{A}\)

    \(\b{A}\) idempotent \(\implies \begin{cases} \tr(\b{A})=\r(\b{A})\\ \b{I}-\b{A}\, \text{is also idempotent}\\ \r(\b{A})+\r(\b{I}-\b{A})=r \end{cases}\)
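The classic idempotent example is the hat matrix \(\b{H} = \b{X}(\b{X}'\b{X})^{-1}\b{X}'\) of a regression design. The sketch below (not in the original notes) checks the idempotency properties on an arbitrary full-rank \(\b{X}\).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))                  # illustrative design matrix
H = X @ np.linalg.inv(X.T @ X) @ X.T              # hat (projection) matrix
I = np.eye(10)

assert np.allclose(H @ H, H)                                # HH = H
assert np.isclose(np.trace(H), np.linalg.matrix_rank(H))    # tr(H) = r(H) = 3
assert np.allclose((I - H) @ (I - H), I - H)                # I - H idempotent
assert np.linalg.matrix_rank(H) + np.linalg.matrix_rank(I - H) == 10
```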

Some properties of the determinant

For \(\b{A}_{r\times r}\) and \(\b{B}_{r\times r}\):

  1. \(|\b{A}'| = |\b{A}|\)

  2. \(|\b{A}\b{B}| = |\b{A}|\times |\b{B}|\)

  3. \(|\b{A}^{-1}| = |\b{A}|^{-1}\)

Eigenvalues and eigenvectors

The eigenvalues, \(\lambda_1,\ldots,\lambda_r\), of \(\b{A}\) \((r\times r)\) are the roots of the characteristic polynomial \(P(\lambda) = |\b{A}-\lambda\b{I}|\).

For each \(\lambda_i\) there is a non-null vector \(\b{p}_i\), called an eigenvector, such that \(\b{A}\b{p}_i=\lambda_i\b{p}_i\).

Properties

If \(\b{A}\) is symmetric:

  1. The eigenvalues are real numbers and the eigenvectors are orthogonal

  2. \(\r(\b{A})\) is equal to the number of eigenvalues which are not zero

  3. \(\tr(\b{A}) = \sum_{i=1}^r\lambda_i\)

  4. \(|\b{A}| = \prod_{i=1}^r\lambda_i\)

  5. Spectral decomposition \[\b{A} = \b{P}\bs{\Lambda}\b{P}'\] where \(\b{P}\) is the orthogonal matrix whose columns are the eigenvectors and \(\bs{\Lambda}\) is a diagonal matrix with the eigenvalues on its main diagonal.
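These eigenvalue properties can be checked with numpy's `eigh` (eigendecomposition for symmetric matrices); the example matrix below is arbitrary and not from the notes.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M + M.T                         # force symmetry

lam, P = np.linalg.eigh(A)          # real eigenvalues, orthogonal P
assert np.allclose(P @ np.diag(lam) @ P.T, A)    # A = P Λ P'
assert np.allclose(P.T @ P, np.eye(4))           # P is orthogonal
assert np.isclose(np.trace(A), lam.sum())        # tr(A) = Σ λ_i
assert np.isclose(np.linalg.det(A), lam.prod())  # |A| = Π λ_i
```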

Quadratic forms

A symmetric matrix \(\b{A}\) defines an associated quadratic form: \[Q(\b{x})= \b{x}'\b{A}\b{x}\]

\(\b{A}\) is said to be positive definite if \(Q(\b{x})>0\) for all \(\b{x}\neq\b{0}\).

Properties

\(\b{A}_{r\times r}\) is positive definite:

  1. \(\implies \r(\b{A})=r\) and \(a_{ii}>0,\,\forall i\)

  2. \(\implies \b{A}^{-1}\) is positive definite

  3. \(\implies \b{P}'\b{A}\b{P}\) is positive definite, where \(\b{P}\) is non-singular

  4. \(\iff\) all its eigenvalues are positive

  5. \(\iff \exists \b{F}\) non-singular such that \(\b{A}=\b{F}\b{F}'\)
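Property 5 is exactly what a Cholesky factorization delivers: the (lower-triangular, non-singular) Cholesky factor is one such \(\b{F}\). A sketch, with an arbitrary positive definite example built as \(\b{M}'\b{M}+\b{I}\):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = M.T @ M + np.eye(4)                    # positive definite by construction

assert np.all(np.linalg.eigvalsh(A) > 0)   # property 4: all eigenvalues > 0
F = np.linalg.cholesky(A)                  # lower triangular, non-singular
assert np.allclose(F @ F.T, A)             # property 5: A = F F'
```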

Some matrix derivatives

\(f(\b{x}) = \b{a}'\b{x}=\b{x}'\b{a}\implies \dfrac{df}{d\b{x}}=\b{a}\)

\(\b{f}(\b{x}) = \b{A}\b{x}\implies \dfrac{d\b{f}}{d\b{x}}=\b{A}'\)

\(f(\b{x}) = \b{x}'\b{A}\b{x}\implies \dfrac{df}{d\b{x}}=(\b{A} + \b{A}')\b{x}\)
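The last derivative can be validated by a central finite-difference check at an arbitrary point (a sanity-check sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)
h = 1e-6

f = lambda v: v @ A @ v                          # f(x) = x'Ax
grad_numeric = np.array([
    (f(x + h * e) - f(x - h * e)) / (2 * h)      # central differences
    for e in np.eye(3)
])
assert np.allclose(grad_numeric, (A + A.T) @ x, atol=1e-5)
```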

Some probability distributions

Univariate distributions

Theorem

  1. \(X\sim N(0,1) \implies X^2\sim \chi_{(1)}^2\)

  2. If \(X_1,\ldots,X_n\) are independent r.v. such that \(X_i\sim\chi_{(1)}^2\), then \(\sum_{i=1}^n{X_i}\sim \chi_{(n)}^2\)

A r.v. \(X\) with p.d.f. \[f_X(x)=\frac{(1/2)^{n/2}}{\Gamma\left(n/2\right)}x^{n/2-1}e^{-x/2},\, x>0,\] is said to have a \(\chi^2\) distribution with \(n\) degrees of freedom, \(X\sim \chi_{(n)}^2\), \(n\in\mathbb{N}\).
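As a quick sanity check (not from the original notes), the density above can be coded directly and compared with `scipy.stats.chi2`:

```python
import numpy as np
from math import gamma
from scipy import stats

def chi2_pdf(x, n):
    # chi-square density with n degrees of freedom, as written above
    return (0.5) ** (n / 2) / gamma(n / 2) * x ** (n / 2 - 1) * np.exp(-x / 2)

x = np.linspace(0.5, 10, 20)
for n in (1, 2, 5, 10):
    assert np.allclose(chi2_pdf(x, n), stats.chi2.pdf(x, n))
```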

\(\chi^2\) densities

Let \(X_1,\ldots,X_n\) be independent r.v. such that \(X_i\sim N\left(\mu,\sigma^2\right)\). Then:

  1. \(\dfrac{\sum_{i=1}^n{(X_i-\mu)^2}}{\sigma^2}\sim \chi_{(n)}^2\)

  2. \(\dfrac{\sum_{i=1}^n{(X_i-\bar{X})^2}}{\sigma^2}=\frac{(n-1)S^2}{\sigma^2}\sim \chi_{(n-1)}^2\)
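A Monte Carlo sketch of the two results above (illustrative \(\mu\), \(\sigma\), \(n\); not part of the notes): the two quadratic quantities should average to the means of \(\chi_{(n)}^2\) and \(\chi_{(n-1)}^2\), i.e. \(n\) and \(n-1\).

```python
import numpy as np

rng = np.random.default_rng(8)
mu, sigma, n, reps = 2.0, 1.5, 10, 200_000

X = rng.normal(mu, sigma, size=(reps, n))
q1 = ((X - mu) ** 2).sum(axis=1) / sigma**2                       # result 1
q2 = ((X - X.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) / sigma**2  # result 2

assert np.isclose(q1.mean(), n, rtol=0.02)        # E[chi2_(n)] = n
assert np.isclose(q2.mean(), n - 1, rtol=0.02)    # E[chi2_(n-1)] = n - 1
```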

Theorem

If \(X\sim N(0,1)\) and \(Y\sim\chi_{(n)}^2\) are independent r.v. then \[\frac{X}{\sqrt{Y/n}}\sim t_{(n)}\]

A r.v. \(X\) with p.d.f. \[f_X(x)=\frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\Gamma\left(\frac{n}{2}\right)}\left(1+\frac{x^2}{n}\right)^{-\frac{n+1}{2}},\, x\in\mathbb{R}\] is said to have a Student's t distribution with \(n\) degrees of freedom, \(X\sim t_{(n)}\), \(n\in\mathbb{N}\).
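The same kind of sanity check works for the t density, comparing the formula above against `scipy.stats.t` (not part of the original notes):

```python
import numpy as np
from math import gamma, pi, sqrt
from scipy import stats

def t_pdf(x, n):
    # Student's t density with n degrees of freedom, as written above
    c = gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
    return c * (1 + x**2 / n) ** (-(n + 1) / 2)

x = np.linspace(-4, 4, 21)
for n in (1, 3, 10, 30):
    assert np.allclose(t_pdf(x, n), stats.t.pdf(x, n))
```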

Student's t densities

Multivariate distributions

Let \(\b{y}\) be a random vector of dimension \(m\).

\(E[\b{y}]=\bs{\mu}=\left(E[y_1],\ldots, E[y_m]\right)'\)

\(Var[\b{y}]=\bs{\Sigma}=E\left[(\b{y}-\bs{\mu})(\b{y}-\bs{\mu})'\right]=\left(\sigma_{ij}\right)_{m\times m}\) where \(\sigma_{ij}=Cov[y_i, y_j]\)

Note \(\bs{\Sigma}\) is a symmetric positive semi-definite matrix

Properties

  1. \(E[\b{A}\b{y}+\b{b}]=\b{A}\bs{\mu}+\b{b}\)

  2. \(Var[\b{A}\b{y}+\b{b}]=\b{A}\bs{\Sigma}\b{A}'\)
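A Monte Carlo check of both properties (illustrative \(\bs{\mu}\), \(\bs{\Sigma}\), \(\b{A}\), \(\b{b}\); not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([3.0, -3.0])

y = rng.multivariate_normal(mu, Sigma, size=200_000)  # rows are draws of y
z = y @ A.T + b                                       # z = Ay + b, row-wise

assert np.allclose(z.mean(axis=0), A @ mu + b, rtol=0.05, atol=0.02)  # E[Ay+b]
assert np.allclose(np.cov(z.T), A @ Sigma @ A.T, rtol=0.05, atol=0.02)  # Var
```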

The multivariate normal distribution

\(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma})\)

\[f(\b{y}) = (2\pi)^{-\frac{m}{2}}|\bs{\Sigma}|^{-\frac{1}{2}}\exp\left[-\frac{1}{2}(\b{y}-\bs{\mu})'\bs{\Sigma}^{-1}(\b{y}-\bs{\mu})\right],\,\b{y}\in\mathbb{R}^m\]

  1. \(E[\b{y}]=\bs{\mu}\)

  2. \(Var[\b{y}]=\bs{\Sigma}\)

Some properties

  1. If \(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma})\) and \(\b{A}_{p\times m}\) with \(\r(\b{A})=p\), then \[\b{z}=\b{A}\b{y}+\b{b}\sim N_p(\b{A}\bs{\mu}+\b{b}, \b{A}\bs{\Sigma}\b{A}')\]
  2. If \(\b{y}\) is partitioned in two blocks \(\b{y}_1\) and \(\b{y}_2\) with \(n_1\) and \(n_2\) components (\(n_1+n_2=n\)), such that \[\b{y}=\begin{pmatrix}\b{y}_1\\ \b{y}_2\end{pmatrix}\sim N_n\left(\begin{pmatrix}\bs{\mu}_1\\ \bs{\mu}_2\end{pmatrix}, \begin{pmatrix}\bs{\Sigma}_{11} & \bs{\Sigma}_{12}\\ \bs{\Sigma}_{21} & \bs{\Sigma}_{22}\end{pmatrix}\right)\] then \[\b{y}_i\sim N_{n_i}(\bs{\mu}_i, \bs{\Sigma}_{ii})\]
  3. For the same blocks, \[\b{y}_1\mid \b{y}_2\sim N_{n_1}\left(\bs{\mu}_1 + \bs{\Sigma}_{12}\bs{\Sigma}_{22}^{-1}(\b{y}_2-\bs{\mu}_2), \bs{\Sigma}_{11} - \bs{\Sigma}_{12}\bs{\Sigma}_{22}^{-1}\bs{\Sigma}_{21}\right)\]
  4. For the same blocks, \(\b{y}_1\) and \(\b{y}_2\) are independent iff \(\bs{\Sigma}_{12}=\bs{\Sigma}_{21}'=\b{0}\)
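The conditional covariance \(\bs{\Sigma}_{11} - \bs{\Sigma}_{12}\bs{\Sigma}_{22}^{-1}\bs{\Sigma}_{21}\) is a Schur complement, which equals the inverse of the \((1,1)\) block of the joint precision matrix \(\bs{\Sigma}^{-1}\). A numerical sketch of that identity, with an arbitrary positive definite \(\bs{\Sigma}\) partitioned into two blocks of size 2 (not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((4, 4))
Sigma = M @ M.T + np.eye(4)          # positive definite joint covariance

S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
S21, S22 = Sigma[2:, :2], Sigma[2:, 2:]

cond_cov = S11 - S12 @ np.linalg.inv(S22) @ S21   # conditional covariance
precision = np.linalg.inv(Sigma)                  # joint precision matrix
assert np.allclose(cond_cov, np.linalg.inv(precision[:2, :2]))
```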

Non-central distributions

  1. \(X\sim N(0,1) \implies X^2\sim \chi_{(1)}^2\)

  2. \(X\sim N(\mu,1) \implies X^2\sim \chi_{(1, \mu^2)}^2\) (non-central \(\chi^2\) distr.)

  3. If \(X_1,\ldots,X_k\) are independent r.v. such that \(X_i\sim\chi_{(n_i,\lambda_i)}^2\), then \(\sum_{i=1}^k{X_i}\sim \chi_{(n, \lambda)}^2\) with \(n=\sum_{i=1}^k{n_i}\) and \(\lambda=\sum_{i=1}^k{\lambda_i}\)

  1. If \(X\sim N(0,1)\) and \(Y\sim\chi_{(n)}^2\) are independent r.v. then \[T=\frac{X}{\sqrt{Y/n}}\sim t_{(n)}\]

  2. If instead \(X\sim N(\mu,1)\), with \(Y\) independent as above, then \(T\sim t_{(n,\mu)}\) (non-central \(t\) distr.)

The F distribution

  1. If \(X_1\sim\chi_{(n_1)}^2\) and \(X_2\sim\chi_{(n_2)}^2\) are independent r.v. then \[W=\frac{X_1/n_1}{X_2/n_2}\sim F_{(n_1,n_2)}\]

  2. If instead \(X_1\sim\chi_{(n_1, \lambda)}^2\) then \(W\sim F_{(n_1,n_2, \lambda)}\) (non-central \(F\) distr.)
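A Monte Carlo sketch of the (central) F construction, using the known mean \(E[F_{(n_1,n_2)}]=n_2/(n_2-2)\) as the check; \(n_1\), \(n_2\) are illustrative choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(9)
n1, n2, reps = 5, 12, 400_000

X1 = rng.chisquare(n1, size=reps)        # independent chi-square numerator
X2 = rng.chisquare(n2, size=reps)        # and denominator
W = (X1 / n1) / (X2 / n2)                # W ~ F_(n1, n2)

assert np.isclose(W.mean(), n2 / (n2 - 2), rtol=0.02)   # E[F] = n2/(n2-2)
```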

Some multivariate results

Let \(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma})\) and let \(\b{A}\) be a symmetric matrix:

  1. \(E[\b{y}'\b{A}\b{y}] = \tr(\b{A}\bs{\Sigma}) + \bs{\mu}'\b{A}\bs{\mu}\)

  2. \(Var[\b{y}'\b{A}\b{y}] = 2\tr\left((\b{A}\bs{\Sigma})^2\right) + 4\bs{\mu}'\b{A}\bs{\Sigma}\b{A}\bs{\mu}\)

  3. \(\b{y}'\b{A}\b{y}\sim\chi_{(r, \lambda)}^2\) with \(\lambda=\bs{\mu}'\b{A}\bs{\mu}\) iff \(\b{A}\bs{\Sigma}\) is idempotent of rank \(r\)

    1. \(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma}) \implies (\b{y}-\bs{\mu})'\bs{\Sigma}^{-1}(\b{y}-\bs{\mu})\sim\chi_{(m)}^2\)

    2. \(\b{y}\sim N_m(\bs{\mu}, \sigma^2\b{I}) \implies \frac{\b{y}'\b{y}}{\sigma^2}\sim\chi_{(m, \lambda)}^2\) with \(\lambda=\dfrac{\bs{\mu}'\bs{\mu}}{\sigma^2}\)

  4. Let \(\b{B}\) be a symmetric matrix. Then \(\b{y}'\b{A}\b{y}\) and \(\b{y}'\b{B}\b{y}\) are independent iff \(\b{A}\bs{\Sigma}\b{B}=\b{0}\).
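A Monte Carlo sketch of result 1, \(E[\b{y}'\b{A}\b{y}] = \tr(\b{A}\bs{\Sigma}) + \bs{\mu}'\b{A}\bs{\mu}\), with fixed illustrative \(\bs{\mu}\), positive definite \(\bs{\Sigma}\), and symmetric \(\b{A}\) (none of them from the notes):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])       # positive definite
A = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.1],
              [0.0, 0.1, 1.0]])           # symmetric

y = rng.multivariate_normal(mu, Sigma, size=500_000)
q = np.einsum('ni,ij,nj->n', y, A, y)     # y'Ay for each draw

expected = np.trace(A @ Sigma) + mu @ A @ mu   # tr(AΣ) + μ'Aμ = 7.76 here
assert np.isclose(q.mean(), expected, rtol=0.02)
```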
