Deep Learning Book Series · 2.10 The Trace Operator
Introduction
You should be able to read this chapter in a couple of minutes, which is nice after the last two rather long ones! We will see what the trace of a matrix is. It will be needed for the last chapter on Principal Component Analysis (PCA).
2.10 The Trace Operator
The trace of a matrix
The trace of a square matrix, noted $\mathrm{Tr}(\bs{A})$, is the sum of the entries on its main diagonal:

$ \mathrm{Tr}(\bs{A}) = \sum_i A_{i,i} $
$ \bs{A}= \begin{bmatrix} 2 & 9 & 8 \\ 4 & 7 & 1 \\ 8 & 2 & 5 \end{bmatrix} $
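For this matrix, the diagonal entries are 2, 7 and 5, so:

$ \mathrm{Tr}(\bs{A}) = 2 + 7 + 5 = 14 $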
NumPy provides the function trace() to calculate it:
import numpy as np

A = np.array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])
A
array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])
A_tr = np.trace(A)
A_tr
14
Goodfellow et al. explain that the trace can be used to specify the Frobenius norm of a matrix (see 2.5). The Frobenius norm is the equivalent of the $L^2$ norm for matrices. It is defined by:

$ \left\Vert \bs{A} \right\Vert_F = \sqrt{\sum_{i,j} A_{i,j}^2} $

Take the square of all elements and sum them, then take the square root of the result. This norm can also be calculated with the trace:

$ \left\Vert \bs{A} \right\Vert_F = \sqrt{\mathrm{Tr}(\bs{A}\bs{A}^{\text{T}})} $
We can check this. A first way to compute the norm is the command np.linalg.norm():
np.linalg.norm(A)
17.549928774784245
The Frobenius norm of $\bs{A}$ is 17.549928774784245.
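The same value can be obtained directly from the definition above by squaring every element, summing, and taking the square root (a quick check using the same matrix $\bs{A}$):

np.sqrt(np.sum(A**2))
17.549928774784245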
With the trace the result is identical:
np.sqrt(np.trace(A.dot(A.T)))
17.549928774784245
Since transposing a matrix doesn’t change its diagonal, the trace of a matrix is equal to the trace of its transpose:

$ \mathrm{Tr}(\bs{A}) = \mathrm{Tr}(\bs{A}^{\text{T}}) $
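We can verify this with the matrix $\bs{A}$ used above: the trace of its transpose is also 14.

np.trace(A.T)
14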
Trace of a product

The trace of a product of matrices is invariant under cyclic permutation of its factors:

$ \mathrm{Tr}(\bs{ABC}) = \mathrm{Tr}(\bs{CAB}) = \mathrm{Tr}(\bs{BCA}) $

Example 1.

Let’s see an example of this property.
$ \bs{A}= \begin{bmatrix} 4 & 12 \\ 7 & 6 \end{bmatrix} $
$ \bs{B}= \begin{bmatrix} 1 & -3 \\ 4 & 3 \end{bmatrix} $
$ \bs{C}= \begin{bmatrix} 6 & 6 \\ 2 & 5 \end{bmatrix} $
A = np.array([[4, 12], [7, 6]])
B = np.array([[1, -3], [4, 3]])
C = np.array([[6, 6], [2, 5]])
np.trace(A.dot(B).dot(C))
531
np.trace(C.dot(A).dot(B))
531
np.trace(B.dot(C).dot(A))
531
The three products are different matrices, but they all have the same trace:

$ \bs{ABC}= \begin{bmatrix} 360 & 432 \\ 180 & 171 \end{bmatrix} $
$ \bs{CAB}= \begin{bmatrix} 498 & 126 \\ 259 & 33 \end{bmatrix} $
$ \bs{BCA}= \begin{bmatrix} -63 & -54 \\ 393 & 594 \end{bmatrix} $
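These products can be reproduced with NumPy, reusing the arrays A, B and C defined above:

A.dot(B).dot(C)
array([[360, 432], [180, 171]])
C.dot(A).dot(B)
array([[498, 126], [259, 33]])
B.dot(C).dot(A)
array([[-63, -54], [393, 594]])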
References

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

Feel free to drop me an email or a comment. The syllabus of this series can be found in the introduction post. All the notebooks can be found on GitHub.

This content is part of a series following chapter 2 on linear algebra from the Deep Learning Book by Goodfellow, I., Bengio, Y., and Courville, A. (2016). It aims to provide intuitions, drawings, and Python code on these mathematical concepts and is built around my understanding of them. You can check the syllabus in the introduction post.
1. Scalars, Vectors, Matrices and Tensors
2. Multiplying Matrices and Vectors
3. Identity and Inverse Matrices
4. Linear Dependence and Span
5. Norms
6. Special Kinds of Matrices and Vectors
7. Eigendecomposition
8. Singular Value Decomposition
9. The Moore-Penrose Pseudoinverse
10. The Trace Operator
11. The Determinant
12. Principal Components Analysis (PCA)