\(\def\bs#1{\boldsymbol{#1}} \def\b#1{\mathbf{#1}} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator{\r}{r} \DeclareMathOperator{\det}{det} \DeclareMathOperator{\logit}{\text{logit}}\)
Matrix \((r\times c)\): \(\b{A} = (a_{ij})\), \(i=1,\ldots,r\) and \(j=1,\ldots,c\)
Vector \((r\times 1)\): \(\b{x} = (x_{i})\), \(i=1,\ldots,r\)
Note
Simple operations with matrices, such as addition, multiplication, transposition, and the dot product between two vectors, are assumed to be known.
Some particular cases
\(\b{1}_r=(1, 1, \ldots, 1)'\) – vector of ones \((r\times 1)\)
\(\b{0}_r=(0, 0, \ldots, 0)'\) – zero vector \((r\times 1)\)
Diagonal matrix – a square matrix where all the elements not on the main diagonal are zero
\(\b{I}_r\) – identity matrix – a diagonal matrix with ones on the main diagonal
Some matrix operators
For a square matrix \(\b{A}_{r\times r}\):
diagonal – \(\diag(\b{A})= (a_{11}, \ldots, a_{rr})'\)
trace – \(\tr(\b{A}) = \sum_{i=1}^r a_{ii} = \b{1}_r'\diag(\b{A})\)
determinant – \(\det(\b{A}) = |\b{A}|\) – a scalar function of the entries of \(\b{A}\) whose formula is not relevant here
For a general matrix \(\b{A}_{r\times c}\):
rank – \(\r(\b{A}) =\) maximum number of columns (or rows) of \(\b{A}\) which are linearly independent
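A minimal numerical sketch of these operators, assuming numpy is available (the matrix below is an arbitrary illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

d = np.diag(A)                       # diag(A) = (a_11, ..., a_rr)'
print(np.trace(A), np.ones(3) @ d)   # tr(A) = 1_r' diag(A)
print(np.linalg.det(A))              # |A|
print(np.linalg.matrix_rank(A))      # r(A)
```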
Some properties of trace and rank
For \(\b{A}_{r\times c}\) and \(\b{B}_{c\times r}\):
\(\tr(\b{A}\b{B}) = \tr(\b{B}\b{A})\)
\(\tr(\b{A}\b{A}') = \sum_{i=1}^r \sum_{j=1}^c{a_{ij}^2}\)
\(\r(\b{A}) \leq \min(r, c)\)
If \(\r(\b{A}) = \min(r, c)\) then \(\b{A}\) is said to be a full-rank matrix
\(\r(\b{A}) = \r(\b{A}') = \r(\b{A}\b{A}') = \r(\b{A}'\b{A})\)
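These identities are easy to check numerically; a short sketch with arbitrary random matrices, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 5))   # r x c
B = rng.normal(size=(5, 3))   # c x r

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # tr(AB) = tr(BA)
print(np.isclose(np.trace(A @ A.T), np.sum(A**2)))    # tr(AA') = sum of squared entries
print([np.linalg.matrix_rank(M) for M in (A, A.T, A @ A.T, A.T @ A)])  # all equal
```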
Some definitions and properties of square matrices
\(\b{A}\) is said to be invertible if there exists \(\b{B}\) such that \(\b{A}\b{B}=\b{B}\b{A}=\b{I}\)
\(\b{B}=\b{A}^{-1}\) is called the inverse of \(\b{A}\)
\(\b{A}\) is called non-invertible or singular if and only if \(|\b{A}|=0\)
A full rank square matrix is invertible.
\((\b{A}')^{-1} = (\b{A}^{-1})'\)
\((\b{A}\b{B})^{-1} = \b{B}^{-1}\b{A}^{-1}\)
\(\b{A}\) is called orthogonal if \(\b{A}\b{A}'=\b{A}'\b{A}=\b{I}\)
\(\b{A}\) orthogonal \(\implies |\b{A}|=\pm 1\) and \(\b{A}'=\b{A}^{-1}\)
\(\b{A}\) is called idempotent if \(\b{A}\b{A}=\b{A}\)
\(\b{A}\) idempotent \(\implies \begin{cases} \tr(\b{A})=\r(\b{A})\\ \b{I}-\b{A}\, \text{is also idempotent}\\ \r(\b{A})+\r(\b{I}-\b{A})=r \end{cases}\)
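A sketch of the idempotency properties, using the projection matrix \(\b{P}=\b{X}(\b{X}'\b{X})^{-1}\b{X}'\) built from an arbitrary full-column-rank \(\b{X}\) (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))              # full column rank (almost surely)
P = X @ np.linalg.inv(X.T @ X) @ X.T     # idempotent projection matrix
I = np.eye(6)

print(np.allclose(P @ P, P))                                     # PP = P
print(np.trace(P), np.linalg.matrix_rank(P))                     # tr(P) = r(P) = 2
print(np.allclose((I - P) @ (I - P), I - P))                     # I - P idempotent
print(np.linalg.matrix_rank(P) + np.linalg.matrix_rank(I - P))   # = 6 = r
```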
Some properties of the determinant
For \(\b{A}_{r\times r}\) and \(\b{B}_{r\times r}\):
\(|\b{A}'| = |\b{A}|\)
\(|\b{A}\b{B}| = |\b{A}|\times |\b{B}|\)
\(|\b{A}^{-1}| = |\b{A}|^{-1}\)
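A quick numerical check of these determinant properties with arbitrary random matrices (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))                       # |A'| = |A|
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # |AB| = |A||B|
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))      # |A^{-1}| = |A|^{-1}
```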
Eigenvalues and eigenvectors
The eigenvalues, \(\lambda_1,\ldots,\lambda_r\), of \(\b{A}\) \((r\times r)\) are the roots of the characteristic polynomial \(P(\lambda) = |\b{A}-\lambda\b{I}|\).
For each \(\lambda_i\) there is a non-null vector \(\b{p}_i\), called an eigenvector, such that \(\b{A}\b{p}_i=\lambda_i\b{p}_i\).
Properties
If \(\b{A}\) is symmetric:
The eigenvalues are real numbers and the eigenvectors can be chosen to be orthogonal
\(\r(\b{A})\) is equal to the number of eigenvalues which are not zero
\(\tr(\b{A}) = \sum_{i=1}^r\lambda_i\)
\(|\b{A}| = \prod_{i=1}^r\lambda_i\)
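A sketch of these properties for an arbitrary symmetric matrix \(\b{A}=\b{M}+\b{M}'\), assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
A = M + M.T                                       # symmetric

lam, P = np.linalg.eigh(A)                        # real eigenvalues, orthonormal eigenvectors
print(np.allclose(P.T @ P, np.eye(4)))            # eigenvectors are orthogonal
print(np.isclose(np.trace(A), lam.sum()))         # tr(A) = sum of eigenvalues
print(np.isclose(np.linalg.det(A), lam.prod()))   # |A| = product of eigenvalues
print(np.linalg.matrix_rank(A), np.sum(~np.isclose(lam, 0)))  # rank = # non-zero eigenvalues
```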
Quadratic forms
A symmetric matrix \(\b{A}\) defines an associated quadratic form: \[Q(\b{x})= \b{x}'\b{A}\b{x}\]
\(\b{A}\) is positive definite if \(\forall \b{x}\neq \b{0} : Q(\b{x})>0\)
\(\b{A}\) is positive semi-definite if \(\forall \b{x} : Q(\b{x})\geq 0\) and \(\exists \b{x}\neq \b{0} : Q(\b{x})=0\)
Properties
\(\b{A}_{r\times r}\) is positive definite:
\(\implies \r(\b{A})=r\) and \(a_{ii}>0,\,\forall i\)
\(\implies \b{A}^{-1}\) is positive definite
\(\implies \b{P}'\b{A}\b{P}\) is positive definite, where \(\b{P}\) is non-singular
\(\iff\) all its eigenvalues are positive
\(\iff \exists \b{F}\) non-singular such that \(\b{A}=\b{F}\b{F}'\)
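A sketch of these characterizations using an arbitrary \(\b{A}=\b{F}\b{F}'\) with \(\b{F}\) non-singular (numpy assumed available; positive definiteness is checked here through the eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(4)
F = rng.normal(size=(3, 3))     # non-singular (almost surely)
A = F @ F.T                     # positive definite by construction

def is_pd(M):
    # all eigenvalues of a symmetric matrix positive <=> positive definite
    return np.all(np.linalg.eigvalsh(M) > 0)

P = rng.normal(size=(3, 3))     # non-singular (almost surely)
print(is_pd(A), np.all(np.diag(A) > 0))   # A p.d., diagonal entries positive
print(is_pd(np.linalg.inv(A)))            # A^{-1} p.d.
print(is_pd(P.T @ A @ P))                 # P'AP p.d.
```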
Some matrix derivatives
\(f(\b{x}) = \b{a}'\b{x}=\b{x}'\b{a}\implies \dfrac{df}{d\b{x}}=\b{a}\)
\(\b{f}(\b{x}) = \b{A}\b{x}\implies \dfrac{d\b{f}}{d\b{x}}=\b{A}'\)
\(f(\b{x}) = \b{x}'\b{A}\b{x}\implies \dfrac{df}{d\b{x}}=(\b{A} + \b{A}')\b{x}\)
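These gradients can be checked against central finite differences; a sketch with arbitrary \(\b{a}\), \(\b{A}\) and \(\b{x}\) (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.normal(size=4)
A = rng.normal(size=(4, 4))
x = rng.normal(size=4)
h = 1e-6

def num_grad(f, x):
    # central finite-difference approximation of df/dx
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(np.allclose(num_grad(lambda v: a @ v, x), a))                   # d(a'x)/dx = a
print(np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x))   # d(x'Ax)/dx = (A + A')x
```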
Univariate distributions
Theorem
\(X\sim N(0,1) \implies X^2\sim \chi_{(1)}^2\)
If \(X_1,\ldots,X_n\) are independent r.v. such that \(X_i\sim\chi_{(1)}^2\) then \(\sum_{i=1}^n{X_i}\sim \chi_{(n)}^2\)
A r.v. \(X\) with p.d.f. \[f_X(x)=\frac{(1/2)^{n/2}}{\Gamma\left(n/2\right)}x^{n/2-1}e^{-x/2},\, x>0,\] is said to have a \(\chi^2\) distribution with \(n\) degrees of freedom, \(X\sim \chi_{(n)}^2\), \(n\in\mathbb{N}\).
\(\chi^2\) densities
Let \(X_1,\ldots,X_n\) be independent r.v. such that \(X_i\sim N\left(\mu,\sigma^2\right)\). Then:
\(\dfrac{\sum_{i=1}^n{(X_i-\mu)^2}}{\sigma^2}\sim \chi_{(n)}^2\)
\(\dfrac{\sum_{i=1}^n{(X_i-\bar{X})^2}}{\sigma^2}=\frac{(n-1)S^2}{\sigma^2}\sim \chi_{(n-1)}^2\)
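A simulation sketch of these two results (sample size and parameter values are arbitrary; numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, mu, sigma = 5, 2.0, 1.5
X = rng.normal(mu, sigma, size=(100_000, n))      # 100000 samples of size n

Q1 = ((X - mu) ** 2).sum(axis=1) / sigma**2       # should be chi^2_(n)
Q2 = (n - 1) * X.var(axis=1, ddof=1) / sigma**2   # should be chi^2_(n-1)

print(Q1.mean(), stats.chi2(n).mean())            # both close to n
print(Q2.mean(), stats.chi2(n - 1).mean())        # both close to n - 1
```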
Theorem. If \(X\sim N(0,1)\) and \(Y\sim\chi_{(n)}^2\) are independent r.v., then \[\frac{X}{\sqrt{Y/n}}\sim t_{(n)}\]
A r.v. \(X\) with p.d.f. \[f_X(x)=\frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\Gamma\left(\frac{n}{2}\right)}\left(1+\frac{x^2}{n}\right)^{-\frac{n+1}{2}},\, x\in\mathbb{R}\] is said to have a Student's \(t\) distribution with \(n\) degrees of freedom, \(X\sim t_{(n)}\), \(n\in\mathbb{N}\).
Note
Student's \(t\) densities
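A simulation sketch of the theorem above, with arbitrary \(n\) and simulation size (numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, nsim = 10, 200_000
X = rng.standard_normal(nsim)
Y = rng.chisquare(n, size=nsim)
T = X / np.sqrt(Y / n)

# simulated quantiles of X / sqrt(Y/n) vs. theoretical t_(n) quantiles
print(np.quantile(T, [0.05, 0.5, 0.95]))
print(stats.t(n).ppf([0.05, 0.5, 0.95]))
```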
Let \(\b{y}\) be a random vector of dimension \(m\).
\(E[\b{y}]=\bs{\mu}=\left(E[y_1],\ldots, E[y_m]\right)'\)
\(Var[\b{y}]=\bs{\Sigma}=E\left[(\b{y}-\bs{\mu})(\b{y}-\bs{\mu})'\right]=\left(\sigma_{ij}\right)_{m\times m}\) where \(\sigma_{ij}=Cov[y_i, y_j]\)
Note: \(\bs{\Sigma}\) is a symmetric positive semi-definite matrix
Properties
\(E[\b{A}\b{y}+\b{b}]=\b{A}\bs{\mu}+\b{b}\)
\(Var[\b{A}\b{y}+\b{b}]=\b{A}\bs{\Sigma}\b{A}'\)
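A simulation sketch of these two properties, with arbitrary \(\bs{\mu}\), \(\bs{\Sigma}\), \(\b{A}\) and \(\b{b}\) (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(8)
mu = np.array([1.0, -1.0, 0.5])
F = rng.normal(size=(3, 3))
Sigma = F @ F.T                                  # a valid covariance matrix
A = rng.normal(size=(2, 3))
b = np.array([0.5, 2.0])

y = rng.multivariate_normal(mu, Sigma, size=200_000)
z = y @ A.T + b                                  # each row is A y + b

print(z.mean(axis=0), A @ mu + b)                # E[Ay + b] = A mu + b
print(np.cov(z, rowvar=False))                   # close to A Sigma A'
print(A @ Sigma @ A.T)
```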
The multivariate normal distribution
\(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma})\)
\[f(\b{y}) = (2\pi)^{-\frac{m}{2}}|\bs{\Sigma}|^{-\frac{1}{2}}\exp\left[-\frac{1}{2}(\b{y}-\bs{\mu})'\bs{\Sigma}^{-1}(\b{y}-\bs{\mu})\right],\,\b{y}\in\mathbb{R}^m\]
\(E[\b{y}]=\bs{\mu}\)
\(Var[\b{y}]=\bs{\Sigma}\)
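A sketch checking the density formula against scipy.stats.multivariate_normal for an arbitrary \(\bs{\mu}\), \(\bs{\Sigma}\) and point \(\b{y}\):

```python
import numpy as np
from scipy import stats

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
y = np.array([0.3, 0.7])

m = mu.size
d = y - mu
f = (2 * np.pi) ** (-m / 2) * np.linalg.det(Sigma) ** (-0.5) \
    * np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d)

print(f)
print(stats.multivariate_normal(mu, Sigma).pdf(y))   # same value
```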
Some properties
Non-central distributions
\(X\sim N(0,1) \implies X^2\sim \chi_{(1)}^2\)
\(X\sim N(\mu,1) \implies X^2\sim \chi_{(1, \mu^2)}^2\) (non-central \(\chi^2\) distr.)
If \(X_1,\ldots,X_k\) are independent r.v. such that \(X_i\sim\chi_{(n_i,\lambda_i)}^2\) then \(\sum_{i=1}^k{X_i}\sim \chi_{(n, \lambda)}^2\) with \(n=\sum_{i=1}^k{n_i}\) and \(\lambda=\sum_{i=1}^k{\lambda_i}\)
If \(X\sim N(0,1)\) and \(Y\sim\chi_{(n)}^2\) are independent r.v. then \[T=\frac{X}{\sqrt{Y/n}}\sim t_{(n)}\]
If instead \(X\sim N(\mu,1)\) (with \(Y\) as above) then \(T\sim t_{(n,\mu)}\) (non-central \(t\) distr.)
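A simulation sketch of the non-central results above, with arbitrary \(\mu\) and \(n\) (numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
mu, n, nsim = 1.5, 8, 200_000
X = rng.normal(mu, 1.0, size=nsim)
Y = rng.chisquare(n, size=nsim)

print(np.mean(X**2), stats.ncx2(df=1, nc=mu**2).mean())            # non-central chi^2
print(np.mean(X / np.sqrt(Y / n)), stats.nct(df=n, nc=mu).mean())  # non-central t
```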
The F distribution
If \(X_1\sim\chi_{(n_1)}^2\) and \(X_2\sim\chi_{(n_2)}^2\) are independent r.v. then \[W=\frac{X_1/n_1}{X_2/n_2}\sim F_{(n_1,n_2)}\]
If instead \(X_1\sim\chi_{(n_1, \lambda)}^2\) (non-central) then \(W\sim F_{(n_1,n_2, \lambda)}\) (non-central \(F\) distr.)
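A simulation sketch of the (central) F result, with arbitrary degrees of freedom (numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
n1, n2, nsim = 4, 12, 200_000
W = (rng.chisquare(n1, nsim) / n1) / (rng.chisquare(n2, nsim) / n2)

print(np.quantile(W, [0.5, 0.95]))        # simulated quantiles
print(stats.f(n1, n2).ppf([0.5, 0.95]))   # theoretical F_(n1, n2) quantiles
```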
Some multivariate results
Let \(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma})\) and let \(\b{A}\) be a symmetric matrix:
\(E[\b{y}'\b{A}\b{y}] = \tr(\b{A}\bs{\Sigma}) + \bs{\mu}'\b{A}\bs{\mu}\)
\(Var[\b{y}'\b{A}\b{y}] = 2\tr\!\left[(\b{A}\bs{\Sigma})^2\right] + 4\bs{\mu}'\b{A}\bs{\Sigma}\b{A}\bs{\mu}\)
\(\b{y}'\b{A}\b{y}\sim\chi_{(r, \lambda)}^2\) with \(\lambda=\bs{\mu}'\b{A}\bs{\mu}\) iff \(\b{A}\bs{\Sigma}\) is idempotent of rank \(r\)
\(\b{y}\sim N_m(\bs{\mu}, \bs{\Sigma}) \implies (\b{y}-\bs{\mu})'\bs{\Sigma}^{-1}(\b{y}-\bs{\mu})\sim\chi_{(m)}^2\)
\(\b{y}\sim N_m(\bs{\mu}, \sigma^2\b{I}) \implies \frac{\b{y}'\b{y}}{\sigma^2}\sim\chi_{(m, \lambda)}^2\) with \(\lambda=\bs{\mu}'\bs{\mu}/\sigma^2\)
Let \(\b{B}\) be a symmetric matrix. Then \(\b{y}'\b{A}\b{y}\) and \(\b{y}'\b{B}\b{y}\) are independent iff \(\b{A}\bs{\Sigma}\b{B}=\b{0}\).
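A simulation sketch of the first moment formula and of the chi-square result \((\b{y}-\bs{\mu})'\bs{\Sigma}^{-1}(\b{y}-\bs{\mu})\sim\chi_{(m)}^2\), with arbitrary \(\bs{\mu}\), \(\bs{\Sigma}\) and symmetric \(\b{A}\) (numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
mu = np.array([1.0, 0.0, -1.0])
F = rng.normal(size=(3, 3))
Sigma = F @ F.T                                      # a valid covariance matrix
M = rng.normal(size=(3, 3))
A = M + M.T                                          # symmetric

y = rng.multivariate_normal(mu, Sigma, size=200_000)
Q = np.einsum('ij,jk,ik->i', y, A, y)                # y'Ay for each sample
print(Q.mean(), np.trace(A @ Sigma) + mu @ A @ mu)   # E[y'Ay] = tr(A Sigma) + mu'A mu

d = y - mu
Q2 = np.einsum('ij,jk,ik->i', d, np.linalg.inv(Sigma), d)
print(Q2.mean(), stats.chi2(3).mean())               # both close to m = 3
```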