矩阵求导

矩阵求导

矩阵求导

矩阵求导的定义

自变量↓\因变量→ 标量$y$ 向量$\mathbf{y}$ 矩阵$\mathbf{Y}$
标量$x$ $\frac{\partial y}{\partial x}$ $\frac{\partial \mathbf{y}}{\partial x}$ $\frac{\partial \mathbf{Y}}{\partial x}$
向量$\mathbf{x}$ $\frac{\partial y}{\partial \mathbf{x}}$ $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$ $\frac{\partial \mathbf{Y}}{\partial \mathbf{x}}$
矩阵$\mathbf{X}$ $\frac{\partial y}{\partial \mathbf{X}}$ $\frac{\partial \mathbf{y}}{\partial \mathbf{X}}$ $\frac{\partial \mathbf{Y}}{\partial \mathbf{X}}$

矩阵求导的两种布局:

分子布局($numerator\ layout$)和分母布局($denominator\ layout$ )。

对于分子布局和分母布局的结果来说,两者相差一个转置。

自变量↓\因变量→ 标量$y$ 向量$\mathbf{y}$ 矩阵$\mathbf{Y}(m \times n)$
标量$x$ / $\frac{\partial \mathbf{y}}{\partial x}$
分子布局:m维列向量(默认布局)
分母布局:m维行向量
$\frac{\partial \mathbf{Y}}{\partial x}$
分子布局:$𝑝 \times 𝑞$ 矩阵(默认布局)
分母布局:$𝑞 \times 𝑝$矩阵
向量$\mathbf{x}$ $\frac{\partial y}{\partial \mathbf{x}}$
分子布局:m维列向量
分母布局:m维行向量(默认布局)
$\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$
分子布局:$m \times n$雅克比矩阵(默认布局)
分母布局:$𝑛 \times 𝑚$梯度矩阵
/
矩阵$\mathbf{X}$ $\frac{\partial y}{\partial \mathbf{X}}$
分子布局:$n \times m$矩阵
分母布局:$𝑚 \times 𝑛$矩阵(默认布局)
/ /

矩阵求导的计算

自变量↓\因变量→ 标量$y$ 向量$\mathbf{y}$ 矩阵$\mathbf{Y}$
标量$x$ $\frac{\partial y}{\partial x}$ $\frac{\partial \mathbf{y}}{\partial x} = \left[ \begin{matrix} \frac{\partial{y_1}}{\partial{x}} \frac{\partial{y_2}}{\partial{x}} \end{matrix} \dots \frac{\partial{y_m}}{\partial{x}} \right] ^T$ (分子布局)
$\frac{\partial \mathbf{y}}{\partial x} = \left[ \begin{matrix} \frac{\partial{y_1}}{\partial{x}} \frac{\partial{y_2}}{\partial{x}} \end{matrix} \dots \frac{\partial{y_m}}{\partial{x}} \right]$ (分母布局)
$R_{m \times n}$
$R_{i,j} = {\frac{\partial{y_{i,j}}}{\partial x}}$ (分子布局),
$R_{i,j} = {\frac{\partial{y_{j,i}}}{\partial x}}$(分母布局)
向量$\mathbf{x}$ $\frac{\partial y}{\partial \mathbf{x}} = \left[ \begin{matrix} \frac{\partial{y}}{\partial{x_1}} \frac{\partial{y}}{\partial{x_2}} \end{matrix} \dots \frac{\partial{y}}{\partial{x_m}} \right]$ (分子布局)
$\frac{\partial y}{\partial \mathbf{x}} = \left[ \begin{matrix} \frac{\partial{y}}{\partial{x_1}} \frac{\partial{y}}{\partial{x_2}} \end{matrix} \dots \frac{\partial{y}}{\partial{x_m}} \right]^T$(分母布局)
$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \left( \begin{array}{ccc} \frac{\partial y_1}{\partial x_1}& \frac{\partial y_1}{\partial x_2}& \ldots & \frac{\partial y_1}{\partial x_n}\\ \frac{\partial y_2}{\partial x_1}& \frac{\partial y_2}{\partial x_2} & \ldots & \frac{\partial y_2}{\partial x_n}\\ \vdots& \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1}& \frac{\partial y_m}{\partial x_2} & \ldots & \frac{\partial y_m}{\partial x_n} \end{array} \right)$ (分子布局)

$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \left( \begin{array}{ccc} \frac{\partial y_1}{\partial x_1}& \frac{\partial y_2}{\partial x_1}& \ldots & \frac{\partial y_m}{\partial x_1}\\ \frac{\partial y_1}{\partial x_2}& \frac{\partial y_2}{\partial x_2} & \ldots & \frac{\partial y_m}{\partial x_2}\\ \vdots& \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n}& \frac{\partial y_2}{\partial x_n} & \ldots & \frac{\partial y_m}{\partial x_n} \end{array} \right)$ (分母布局)
$\frac{\partial \mathbf{Y}}{\partial \mathbf{x}}$
矩阵$\mathbf{X}$ $R_{m \times n}$
$R_{i,j} = {\frac{\partial{y}}{\partial{x_{i,j}}}}$ (分子布局),
$R_{i,j} = {\frac{\partial{y}}{\partial{x_{j,i}}}}$(分母布局)
$\frac{\partial \mathbf{y}}{\partial \mathbf{X}}$ $\frac{\partial \mathbf{Y}}{\partial \mathbf{X}}$

矩阵求导规则

$$
\frac{d x^T}{dx} = I
\tag{1.1}
$$

$$
\frac{dx}{dx^T} = I
\tag{1.2}
$$

$$
\frac{dx^T A}{dx} = A
\tag{2.1}
$$

$$
\frac{dAx}{dx^T} = A
\tag{2.2}
$$

$$
\frac{dAx}{dx} = A^T
\tag{2.3}
$$

$$
\frac{dxA}{dx} = A^T
\tag{2.4}
$$

$$
\frac{\partial{u}}{\partial{x^T}} = \left ( \frac{\partial{u^T}}{\partial x} \right)^T
\tag{3}
$$

$$
\frac{\partial{u^Tv}}{\partial{x}} = \frac{\partial{u^T}}{\partial{x}}v + \frac{\partial{v^T}}{\partial{x}}u^T
\tag{4.1}
$$

$$
\frac{\partial{uv^T}}{\partial{x}} = \frac{\partial{u}}{\partial{x}}v^T + u\frac{\partial{v^T}}{\partial{x}}
\tag{4.2}
$$

$$
\frac{d x ^Tx}{dx} = 2x
\tag{4.3}
$$

$$
\frac{dx^TAx}{dx} = (A+A^T)x
\tag{4.4}
$$

$$
\frac{\partial{AB}}{\partial{x}} = \frac{\partial{A}}{x}B + A\frac{\partial{B}}{\partial{x}}
\tag{5}
$$

$$
\frac{\partial{u^T}Xv}{\partial{X}} = uv^T
\tag{6.1}
$$

$$
\frac{\partial{u^T}X^TXu}{\partial{X}} = 2Xuu^T
\tag{6.2}
$$

$$
\frac{\partial{\left [ \left( Xu - v \right)^T \left( Xu - v \right)\right]}}{\partial X} = 2 \left( Xu - v \right) u^T
\tag{6.3}
$$

作者

Cyrusky

发布于

2019-10-03

更新于

2024-11-18

许可协议

评论