Linear Algebra Done Right, Axler

Then the eigenvectors of consist precisely of the entries on the diagonal of that upper-triangular matrix

I think this is a typo and should be "eigenvalues" instead of "eigenvectors"?

The determinant is negative when the operator flips all the vectors it works on.

This could be misleading. E.g. the operator f(v) := -v that literally just flips all vectors has determinant (-1)^n, where n is the dimension of the space it's working on. The sign of the determinant tells you whether an operator flips the orientation of volumes, it can't tell you anything about what it does to individual vectors.

(Regarding "orientation of volumes": in the 2D case, think of R^2 as a sheet of paper, then f(v) := -v is just a 180 degree rotation, so the same side stays up, and the determinant is positive. In contrast, flipping along an axis requires turning over the paper, so negative determinant. Unfortunately this can't really be visualized the same way in 3D, so then you have to think about ordered bases.)

[-]David Udell3y30

Thanks -- right on both counts! Post amended.

[-]Oliver Sourbut3y70

This is great! I'll thread a few nits under this comment

[-]Oliver Sourbut3y92

We call a subspace invariant under $f \in L (V)$ if, for all $\to s \in S$ ,

$f (\to s) = \to s$

should read

$f (\to s) \in S$

[-]Oliver Sourbut3y82

Let now specifically be a one-dimensional subspace of $V$ such that, for all $\to v \in V$ ,

$S = {a \to v : a \in F}$

I think such $S$ can not exist in most cases, and it should instead read '... for some $\to v \in V$ ...'

The expression for $S$ is describing the span of the vector $\to v$ , so certainly if $V$ is more than one-dimensional, if some subspace $S$ has this property for all $\to v \in V$ then it has this property for linearly independent vectors in $V$ , which is a contradiction.

[-]Gurkenglas3y40

The definition of matrix ("the basis maps to:") ought to come after the "uniquely determines the linear map" that justifies it.

For interpreting v as a slim matrix, I would use bra-ket notation: |v> for the function of type V <- R, <v| for the function whose type is the dual R <- V. Then <v|v> has type R <- R (and corresponds to multiplication by a scalar) and |v><v| has type V <- V.

An inner product just maps |v> to <v|. (Though I don't quite see what the symmetry is for.)

Mapping a point cloud through a linear map thins it by a factor of the determinant; this generalizes to smooth maps, since they are locally linear.

^{^}

Intuitively, a a homomorphism is a function showing how the operation of vector addition can be translated from one vector space into another and back.

More precisely, a homomorphism is a function (here, from a vector space $V$ to a vector space $W$ ) such that

f (\to v + \to x) = f (\to v) + f (\to x)

with $\to v, \to x \in V$ and $f (\to v), f (\to x) \in W$ .

The vector addition symbol $+$ on the left side of the equality, inside the function, is defined in $V$ , and the addition symbol $+$ on the right side of the equality, between the function values, is defined in $W$ .

^{^}

Vectors can be interpreted geometrically as rays from the origin out to points in a space. Vectors can also be understood algebraically as ordered sets of numbers (with each number representing a coordinate over in the ray interpretation).

As far as notation goes, we'll use variables with arrows $\to v$ for vectors, lowercase variables $x$ for numbers, and capital variables $V$ for other larger mathematical structures, such as vector spaces.

^{^}

In this book, that field $F$ will be either the reals $R$ or the complexes $C$ .

^{^}

Take note of how homomorphism-ish the below distributive relationships are!

^{^}

Vectors are conventionally written vertically. But each vector $\to v = [\begin{matrix} 10 \end{matrix}]$ has a transpose $[1, 0]^{T} = \to v = [\begin{matrix} 10 \end{matrix}]$ , where the vector is written out horizontally instead.

So we'll use vector transposes to stay in line with conventional notation while not writing out those giant vertical vectors everywhere.

^{^}

One deep idea out of mathematics is that the dimensionality of a system is just the number of variables in that system that can vary independently of every other variable. You live in $3$ -dimensional space because you can vary your horizontal, vertical, and $z$ -dimensional position without necessarily changing your position in the other two spatial dimensions by doing so.

^{^}

Note that the set ${\to 0}$ , where $\to 0$ is a vector containing only $0$ any number $n \in N$ of times, satisfies the vector space axioms!

\to 0 + \to 0 = \to 0 = (\to 0 + \to 0) + \to 0 = \to 0 + (\to 0 + \to 0)

establishes closure under addition, existence of an additive identity, existence of an additive inverse for all vectors, additive commutativity, and additive associativity. Letting the field be the reals with $n, m \in R$

n \to 0 = \to 0 = m (n \to 0) = (m n) \to 0 = 1 (\to 0)

establishes closure under multiplication, multiplicative associativity, and the existence of a multiplicative identity. Finally,

n (\to 0 + \to 0) = n \to 0 + n \to 0 = \to 0 = (n + m) \to 0 = n \to 0 + m \to 0

establishes distributivity.

Any such vector space ${\to 0}$ has just one basis, $\emptyset$ . Intuitively, since you live at the origin, the origin is already spanned by no vectors at all -- i.e., the empty set of vectors. Any additional vector would be redundant, so no other sets constitute bases for ${\to 0}$ .

^{^}

In math, the bigger and/or fancier the symbol, the bigger the set or class that symbol usually stands for.

^{^}

A vector $\to p$ can stand for a polynomial by containing all the coefficients in the polynomial, coefficients ordered by the degree of each coefficient's monomial.

^{^}

This is addition of functions, $(f + g) x = f (x) + g (x)$ , on the left side of the equation. $I$ is the identity function.

^{^}

$dim V$ is the dimension of $V$ , formalized as the number of vectors in any basis of $V$ .

^{^}

Intuitively, orthonormal sets are nice sets of vectors like ${[1, 0, 0]^{T}, [0, 1, 0]^{T}, [0, 0, 1]^{T}}$ , where each vector has length one and is pointing out in a separate dimension.

More precisely, a set of vectors is called orthonormal when its elements are pairwise orthogonal and each vector has a norm of $1$ . We will especially care about orthonormal bases, like the set above with respect to $R^{3}$ .

^{^}

The adjoint of a linear map $f : V \to W$ is a linear map $f^{*} : W \to V$ such that the inner product of $f (\to v)$ and $\to w$ equals the inner product of $\to v$ and $f^{*} (\to w)$ for all $\to v \in V$ and $\to w \in W$ .

Remember that inner products aren't generally commutative, so the order of arguments matters. Adjoints feel very anticommutative.

An operator $f \in L (V)$ on an inner-product space $V$ is called normal when

f f^{*} = f^{*} f

^{^}

An operator $f$ is self-adjoint when $f = f^{*}$ .

^{^}

A block diagonal matrix is a square matrix of the form
$⎡ ⎢ ⎢ ⎣ \begin{matrix} A_{1} & 0 ⋱ 0 & A_{m} \end{matrix} ⎤ ⎥ ⎥ ⎦$
where $A_{1}, \dots, A_{m}$ are square matrices lying along the diagonal and all the other entries of the matrix equal $0$ (p. 142).

^{^}

Suppose $V$ is a complex vector space and $f \in L (V)$ . Let $λ_{1}, \dots, λ_{m}$ denote the distinct eigenvalues of $f$ . Let $d_{j}$ denote the multiplicity of $λ_{j}$ as an eigenvalue of $f$ . The polynomial
$(x - λ_{1})^{d_{1}} \dots (x - λ_{m})^{d_{m}}$
is called the characteristic polynomial of $f$ . Note that the degree of the characteristic polynomial of $f$ equals $dim V$ … the roots of the characteristic polynomial of $f$ equal the eigenvalues of $f$ (p. 172; notation converted).

Characteristic polynomials can also be defined for real vector spaces, though the reals are a little less well behaved as vector spaces than the complexes.

Suppose $V$ is a real vector space and $f \in L (V)$ . With respect to some basis of $V$ , $f$ has a block upper-triangular matrix [any entries acceptable above $A_{1}, \dots, A_{m}$ ] of the form
$⎡ ⎢ ⎢ ⎣ \begin{matrix} A_{1} & * ⋱ 0 & A_{m} \end{matrix} ⎤ ⎥ ⎥ ⎦$
where each $A_{j}$ is a $1$ -by- $1$ or a $2$ -by- $2$ matrix with no eigenvalues. We define the characteristic polynomial of $f$ to be the product of the characteristic polynomials of $A_{1}, \dots, A_{m}$ . Explicitly, for each $j$ , define $q_{j} \in P (R)$ by
$q_{j} (x) = ⎧ ⎪ ⎨ ⎪ ⎩ \begin{matrix} x - λ & if A_{j} = [λ]; (x - a) (x - d) - b c & if A_{j} = [\begin{matrix} a & c b & d \end{matrix}] \end{matrix}$
Then the characteristic polynomial of $f$ is
$q_{1} (x) \dots q_{m} (x)$
Clearly the characteristic polynomial of $f$ has degree $dim V$ … The characteristic polynomial of $f$ depends only on $f$ and not on the choice of a particular basis (p. 206; notation converted).

LESSWRONG
LW

LESSWRONG
LW

57

Linear Algebra Done Right, Axler

57

57

Contents and Notes

1. Vector Spaces

2. Finite-Dimensional Vector Spaces

3. Linear Maps

The Matrix of a Linear Map

4. Polynomials

5. Eigenvalues and Eigenvectors

Polynomials Applied to Operators

Upper-Triangular Matrices

Diagonal Matrices

6. Inner-Product Spaces

7. Operators on Inner-Product Spaces

8. Operators on Complex Vector Spaces

9. Operators on Real Vector Spaces

10. Trace and Determinant