Linear transformations and matrices form the backbone of numerous mathematical and scientific disciplines, ranging from computer graphics and machine learning to physics and engineering. The interplay between these concepts is not only intellectually stimulating but also crucial in solving real-world problems efficiently.
In this article, we will explore the deep connection between linear transformations and matrices, shedding light on their fundamental principles and practical applications.
Generalities on linear transformations
To begin, let’s define what a linear transformation is. Informally, a linear transformation is a function that preserves vector addition and scalar multiplication. The precise definition is as follows.
Definition of a linear transformation
Let $E$ and $F$ be vector spaces over a field $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$. An application $f: E\to F$ is called a linear transformation if for any $u,v\in E$ and $\lambda\in\mathbb{K},$ \begin{align*} f(u+\lambda v)=f(u)+\lambda f(v).\end{align*}
The set of all linear transformations from $E$ to $F$ is denoted by $\mathcal{L}(E,F)$ and it is a vector space.
The space $\mathcal{L}(E):=\mathcal{L}(E,E)$ is called the space of endomorphisms of $E$.
Examples of linear applications
$\bullet$ Let $E$ be a vector space over $\mathbb{K}$ and $\lambda\in \mathbb{K}$. Define the application $f:E\to E$ by $f(x)=\lambda x$. This $f$ is a linear transformation.
$\bullet$ Let $T:\mathbb{R}^3\to \mathbb{R}^2$ be the application defined by $$ T\begin{pmatrix} x\\y\\z\end{pmatrix}=\begin{pmatrix}2x-y+z\\ y-z\end{pmatrix}.$$ Then $T$ is a linear application.
$\bullet$ Let $\mathbb{R}[X]$ be the space of all polynomials with real coefficients and let $g: \mathbb{R}[X]\to \mathbb{R}$ be defined by $$ g(P)=P(0)+P'(0),\qquad P\in \mathbb{R}[X].$$ Then $g$ is a linear application.
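The defining identity $f(u+\lambda v)=f(u)+\lambda f(v)$ can be spot-checked numerically. The following is a minimal sketch, not a proof: it tests the map $T$ of the second example on random integer vectors. The helper names `looks_linear`, `add`, `scale` and the contrasting map `shifted` are illustrative assumptions and do not come from the text above.

```python
import random

def T(v):
    """The map T(x, y, z) = (2x - y + z, y - z) from the second example."""
    x, y, z = v
    return (2 * x - y + z, y - z)

def add(u, v):
    """Componentwise sum of two tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(lam, v):
    """Scalar multiple lam * v of a tuple."""
    return tuple(lam * a for a in v)

def looks_linear(f, dim, trials=100, seed=0):
    """Test f(u + lam*v) == f(u) + lam*f(v) on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(rng.randint(-9, 9) for _ in range(dim))
        v = tuple(rng.randint(-9, 9) for _ in range(dim))
        lam = rng.randint(-9, 9)
        if f(add(u, scale(lam, v))) != add(f(u), scale(lam, f(v))):
            return False
    return True

def shifted(v):
    """A hypothetical non-linear map (it sends 0 to (1, 0)), used only for contrast."""
    x, y, z = v
    return (x + 1, y)

print(looks_linear(T, 3))
print(looks_linear(shifted, 3))
```

A passing check only says that no counterexample was found among the sampled vectors. A failing check, on the other hand, does show that the map is not linear.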
The first property of linear transformations $f$ is $f(0)=0$. In fact, $f(0)=f(0+0)=f(0)+f(0)=2f(0)$, which implies that $f(0)=0$.
The range of a linear transformation $f: E\to F$ denoted as ${\rm Im}(f)$ is defined by $${\rm Im}(f)=\{f(u):u\in E\}.$$
The kernel of a linear map $f: E\to F$ is denoted as $\ker(f)$ and defined by $$\ker(f):=\{u\in E:f(u)=0\}.$$ It is a subspace of $E$.
We recall that an application $f: E\to F$ is injective if, for all $x,y\in E$, the equality $f(x)=f(y)$ implies that $x=y$. If in addition $f$ is linear, then the injectivity of $f$ is equivalent to $\ker(f)=\{0\}$.
The rank theorem:
If the spaces $E$ and $F$ are finite-dimensional and if the map $f: E\to F$ is linear, then \begin{align*} \dim(E)=\dim(\ker(f))+\dim({\rm Im}(f)).\end{align*} The number $\dim({\rm Im}(f))$ is called the rank of $f$ and is denoted by ${\rm rank}(f)$.
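As an illustration, consider the map $T(x,y,z)=(2x-y+z,\,y-z)$ from $\mathbb{R}^3$ to $\mathbb{R}^2$. Its kernel is the line spanned by $(0,1,1)$, so the rank theorem predicts ${\rm rank}(T)=3-1=2$. The sketch below checks this by Gaussian elimination on the array of coefficients of $T$, using exact rational arithmetic. The function name `rank` and the variable names are assumptions made for this example.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # look for a row at or below position r with a nonzero entry in column c
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # clear column c in the rows below the pivot row
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Row k lists the coefficients of x, y, z in the k-th component of T.
A = [[2, -1, 1],
     [0, 1, -1]]

rank_T = rank(A)
dim_ker = len(A[0]) - rank_T  # rank theorem: dim(E) = dim(ker) + rank

# (0, 1, 1) is sent to zero by both components of T
kernel_vector = (0, 1, 1)
image = [sum(a * x for a, x in zip(row, kernel_vector)) for row in A]

print(rank_T, dim_ker, image)
```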
Here we gather some classical examples of linear transformations with detailed solutions.
Linear transformations examples
Example on the space of continuous functions
Let $E=\mathscr{C}(\mathbb{R},\mathbb{R})$ be the vector space over $\mathbb{R}$ of all continuous functions from $\mathbb{R}$ to $\mathbb{R}$. If $f$ and $g$ are real functions, then the usual product of $f$ by $g$ is defined by $(fg)(x)=f(x)g(x)$ for any $x\in\mathbb{R}$. We denote by $id:\mathbb{R}\to \mathbb{R}$ the identity function, $id(x)=x$ for all $x\in \mathbb{R}$. Let us now define the application $\Phi:E\to E$ by $\Phi(f)=id\cdot f$, that is, \begin{align*} \Phi(f)(x)=xf(x),\qquad \forall f\in E,\quad x\in \mathbb{R}. \end{align*}
Prove that $\Phi$ is a linear transformation and determine the kernel of $\Phi$. Is $\Phi$ surjective? Determine the image of $\Phi$.
First of all, we need to prove that the map $\Phi$ is well defined. In fact, if $f\in E$, then the function $x\mapsto xf(x)$ is continuous as a product of continuous functions. This implies that $\Phi(f)\in E$, so $\Phi$ defines a function from $E$ to $E$. Let us now prove that $\Phi$ is a linear transformation on $E$. Let $f,g\in E$ and $\lambda\in \mathbb{R}$. For any $x\in \mathbb{R},$ we have \begin{align*} \Phi(f+\lambda g)(x)&= x(f(x)+\lambda g(x))\cr &= xf(x)+\lambda (xg(x))\cr &=\Phi(f)(x)+\lambda \Phi(g)(x). \end{align*} Hence \begin{align*} \Phi(f+\lambda g)=\Phi(f)+\lambda \Phi(g). \end{align*} This means that $\Phi$ is linear.
Next, let us compute the kernel of $\Phi$. For $f\in \ker(\Phi)$, we have $xf(x)=0$ for any $x\in \mathbb{R}$. This equality always holds for $x=0$. For $x\neq 0$, it implies that $f(x)=0$. As $f$ is continuous at zero, taking the limit of $f(x)$ as $x\to 0$ gives $f(0)=0$. This means that $f$ is identically zero on $\mathbb{R}$. Thus $f=0$, and then $\ker(\Phi)=\{0\},$ which means that the application $\Phi$ is injective.
The map $\Phi$ is not surjective. In fact, let us define the function $g(x)=1$ for all $x\in\mathbb{R}$, so that $g\in E$ as a constant function. If we assume that $\Phi$ is surjective, then we can find $f\in E$ such that $\Phi(f)=g$. This means that for any $x\in \mathbb{R}$ we have $xf(x)=1$. But if we take $x=0$, we find $0=1$. This is absurd, and hence $\Phi$ is not surjective.
Finally, let us determine ${\rm Im}(\Phi)$. Let $g\in {\rm Im}(\Phi)$. Then there is $f\in E$ such that $g(x)=xf(x)$ for all $x\in \mathbb{R}$. In particular, $g(0)=0$. On the other hand, \begin{align*} \forall x\in \mathbb{R}\setminus\{0\},\quad \frac{g(x)-g(0)}{x-0}=f(x). \end{align*} As $f$ is continuous at $0,$ we deduce that $g$ is differentiable at $0$ and $g'(0)=f(0)$. This implies that \begin{align*} {\rm Im}(\Phi)\subset \{g\in E: g(0)=0\;\text{and}\; g\;\text{is differentiable at}\; 0\}. \end{align*} Conversely, let $g\in E$ be such that $g(0)=0$ and $g$ is differentiable at $0$. Then the function \begin{align*} f(x)=\begin{cases}\frac{g(x)}{x},& x\neq 0,\cr g'(0),& x=0,\end{cases} \end{align*} is continuous on $\mathbb{R}\setminus\{0\}$ as a quotient of continuous functions, and it is continuous at $0$ because $\frac{g(x)}{x}=\frac{g(x)-g(0)}{x-0}\to g'(0)$ as $x\to 0$. Thus $f$ is an element of $E$ and $g=\Phi(f)\in {\rm Im}(\Phi)$. Hence \begin{align*} {\rm Im}(\Phi)= \{g\in E: g(0)=0\;\text{and}\; g\;\text{is differentiable at}\; 0\}. \end{align*}
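To make the description of the image concrete, take $g=\sin$, which satisfies $g(0)=0$ and is differentiable at $0$ with $g'(0)=1$. The construction in the proof gives the preimage $f(x)=\sin(x)/x$ for $x\neq 0$ and $f(0)=1$. The sketch below checks at a few sample points that $\Phi(f)$ agrees with $\sin$ up to rounding. The choice of $g$, the sample points and the names `Phi` and `preimage_of_sin` are assumptions made for this illustration.

```python
import math

def Phi(f):
    """Return the function x -> x * f(x)."""
    return lambda x: x * f(x)

def preimage_of_sin(x):
    """f(x) = sin(x)/x for x != 0, and f(0) = sin'(0) = cos(0) = 1."""
    return math.sin(x) / x if x != 0 else 1.0

g = Phi(preimage_of_sin)

samples = [-2.0, -0.5, 0.0, 1e-8, 0.3, 3.0]
max_error = max(abs(g(x) - math.sin(x)) for x in samples)
print(max_error)
```

Whatever $f$ is, $\Phi(f)(0)=0\cdot f(0)=0$, which is the numerical counterpart of the argument showing that the constant function $1$ has no preimage.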
The following example uses linear transformations to solve a problem on supplementary spaces.
Example on linear transformations and supplementary spaces:
Let $\mathbb{K}$ be a field of characteristic zero (for instance $\mathbb{R}$ or $\mathbb{C}$) and consider the $\mathbb{K}$-vector space $E=\mathbb{K}^n$ with $n\ge 1$. Let $a=(1,1,\cdots,1)\in E$ and denote \begin{align*} A&:=\mathbb{K}a=\{(\lambda,\lambda,\cdots,\lambda):\lambda\in \mathbb{K}\}\cr B&:=\{x=(x_1,x_2,\cdots,x_n)\in E: x_1+x_2+\cdots+x_n=0\}. \end{align*} Consider the linear transformation \begin{align*} f:E\to \mathbb{K},\quad f(x_1,x_2,\cdots,x_n)=\sum_{i=1}^n x_i. \end{align*}
- Determine $\ker(f)$. Deduce that $B$ is a subspace.
- Let $x\in E$. Prove that there exists a unique scalar $\lambda\in\mathbb{K}$ such that $f(x)=\lambda f(a)$.
- Justify that $A\cap B=\{0\}$.
- Deduce that $A$ and $B$ are supplementary in $E$, that is, $E=A\oplus B$.
$\bullet$ The kernel of $f$ is, by definition, \begin{align*} \ker(f)&=\left\{(x_1,x_2,\cdots,x_n)\in E: f(x_1,x_2,\cdots,x_n)=0\right\}\cr &=\left\{(x_1,x_2,\cdots,x_n)\in E: x_1+x_2+\cdots+x_n=0\right\}\cr &= B. \end{align*} As $B$ coincides with the kernel of a linear transformation, it is a subspace.
$\bullet$ Let’s first prove the uniqueness of $\lambda$. Assume that there exist $\lambda,\mu\in\mathbb{K}$ such that $f(x)=\lambda f(a)=\mu f(a)$. As $f(a)=n,$ we then have $(\lambda-\mu) n=0$, and since $n\neq 0$ in $\mathbb{K}$, it follows that $\lambda=\mu$. Now let us prove existence. Let $x=(x_1,x_2,\cdots,x_n)\in E$. It suffices to find $\lambda\in\mathbb{K}$ such that $x-\lambda a\in \ker(f)$, that is, \begin{align*} \sum_{i=1}^n x_i-\lambda n=0. \end{align*} It suffices to take \begin{align*} \lambda=\frac{x_1+x_2+\cdots+x_n}{n}. \end{align*}
$\bullet$ Let $x\in A\cap B$. Then there exists $\lambda\in \mathbb{K}$ such that \begin{align*} x=\lambda a\quad\text{and}\quad x_1+x_2+\cdots+x_n=0. \end{align*} As $f$ is linear, $f(x)=\lambda f(a)$. By the previous question, this $\lambda$ is unique and given by \begin{align*} \lambda=\frac{x_1+x_2+\cdots+x_n}{n}=\frac{0}{n}=0. \end{align*} Hence $x=\lambda a=0\times a=0$, so that $A\cap B=\{0\}$.
$\bullet$ Because of the previous question, it suffices to show that $E=A+B$. Let $x\in E$. According to question 2, there exists $\lambda\in\mathbb{K}$ such that $f(x)=\lambda f(a)$, so that $f(x-\lambda a)=0$. Then $x-\lambda a\in \ker(f)=B$. On the other hand, $\lambda a \in A$. Thus $x=\lambda a+(x-\lambda a)\in A+B$. This ends the proof.
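The decomposition $x=\lambda a+(x-\lambda a)$ is easy to compute explicitly. The following is a minimal sketch over the rationals, assuming integer input coordinates; the function name `decompose` and the sample vector are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def decompose(x):
    """Split x in Q^n as x = lam*a + b, with a = (1, ..., 1) and sum(b) == 0."""
    n = len(x)
    lam = Fraction(sum(x), n)                 # lam = f(x) / f(a) = (x_1 + ... + x_n) / n
    in_A = [lam] * n                          # the component lam * a, which lies in A
    in_B = [Fraction(xi) - lam for xi in x]   # the component x - lam * a, which lies in B
    return lam, in_A, in_B

x = [3, -1, 4, 2]
lam, in_A, in_B = decompose(x)

print(lam)          # the mean of the coordinates
print(sum(in_B))    # the B-component has coordinate sum zero
```

In other words, the component of $x$ along $A$ is the constant vector whose entries equal the mean of the coordinates of $x$, and the component along $B$ is what remains after subtracting that mean.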
The link between Linear Transformations and Matrices
Let $U$ and $V$ be two finite-dimensional vector spaces with $\dim(U)=n$ and $\dim(V)=m$ and consider a linear transformation $T\in \mathcal{L}(U,V)$. Let $\mathscr{B}_U=\{u_1,\cdots,u_n\}$ be a basis of $U$, and $\mathscr{B}_V=\{v_1,\cdots,v_m\}$ a basis of $V$.
For any $i=1,2,\cdots,n$, we have $T(u_i)\in V$. Then we can expand the vector $T(u_i)$ in the basis $\mathscr{B}_V$ as $$ T(u_i)=\sum_{k=1}^m a_{ki}v_k,$$ where the $a_{ki}$ are scalars.
Then we associate to $T$ a matrix denoted as $$ {\rm Mat}(T):={\rm Mat}(T,\mathscr{B}_U\to \mathscr{B}_V)$$ and defined by $$ {\rm Mat}(T)=\begin{pmatrix} T(u_1)&T(u_2)&\cdots & T(u_n)\end{pmatrix},$$ where the $i$-th column contains the coordinates of $T(u_i)$ in the basis $\mathscr{B}_V$. That is, $$ {\rm Mat}(T)=\begin{pmatrix} a_{11}& a_{12}&\cdots & a_{1n}\\ a_{21}& a_{22}&\cdots & a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}& a_{m2}&\cdots & a_{mn}\end{pmatrix}.$$ We denote by ${\rm Mat}(m\times n)$ the vector space of all matrices with $m$ rows and $n$ columns. In addition, if $A\in {\rm Mat}(m\times n)$, we write $A=(a_{ij})_{1\le i\le m,1\le j\le n}$.
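The column-by-column construction can be sketched in a few lines. The example below builds the matrix of $T(x,y,z)=(2x-y+z,\,y-z)$ with respect to the standard bases of $\mathbb{R}^3$ and $\mathbb{R}^2$, where the coordinates of a vector are simply its components. The helper name `matrix_of` is an assumption made for this illustration.

```python
def matrix_of(T, n):
    """Matrix of T: K^n -> K^m in the standard bases; column i holds T(e_i)."""
    columns = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]  # i-th standard basis vector
        columns.append(list(T(e_i)))
    m = len(columns[0])
    # entry (k, i) of the matrix is the k-th coordinate of T(e_i)
    return [[columns[i][k] for i in range(n)] for k in range(m)]

def T(v):
    """T(x, y, z) = (2x - y + z, y - z)."""
    x, y, z = v
    return (2 * x - y + z, y - z)

M = matrix_of(T, 3)
for row in M:
    print(row)
```

The result has $2$ rows and $3$ columns, in agreement with $\dim(V)=m=2$ and $\dim(U)=n=3$.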
Product of matrices
Consider three vector spaces $U,V$ and $W$ with bases $\mathscr{B}_U,\mathscr{B}_V$, and $\mathscr{B}_W$, respectively. We assume that $\dim(U)=n$, $\dim(V)=m$ and $\dim(W)=k$. Let $T\in\mathcal{L}(U,V)$ and $S\in\mathcal{L}(V,W)$. Then $$ S\circ T\in \mathcal{L}(U,W).$$ Let us denote $$ A={\rm Mat}(T,\mathscr{B}_U\to \mathscr{B}_V)\in {\rm Mat}(m\times n)$$ and $$ B={\rm Mat}(S,\mathscr{B}_V\to \mathscr{B}_W)\in {\rm Mat}(k\times m).$$ Then $$ C:=BA={\rm Mat}(S\circ T,\mathscr{B}_U\to \mathscr{B}_W)\in {\rm Mat}(k\times n).$$ Writing $A=(a_{pj})$, $B=(b_{ip})$ and $C=(c_{ij})$, the entries of the matrix $C$ are given by
$$ c_{ij}=\sum_{p=1}^m b_{ip}a_{pj},\qquad 1\le i\le k,\quad 1\le j\le n.$$
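The fact that the product $BA$ represents the composition $S\circ T$ can be checked on a small case. The sketch below uses the matrix of $T(x,y,z)=(2x-y+z,\,y-z)$ in the standard bases and a hypothetical map $S:\mathbb{R}^2\to\mathbb{R}^3$ whose matrix `B` is invented for the example. The helper names `mat_mul` and `apply` are also assumptions.

```python
def mat_mul(B, A):
    """Product C = BA, with c_ij equal to the sum over p of b_ip * a_pj."""
    inner = len(A)
    assert all(len(row) == inner for row in B), "columns of B must match rows of A"
    return [[sum(B[i][p] * A[p][j] for p in range(inner))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[2, -1, 1],
     [0, 1, -1]]   # matrix of T: R^3 -> R^2, size 2 x 3

B = [[1, 1],
     [1, -1],
     [0, 3]]       # matrix of a hypothetical S: R^2 -> R^3, size 3 x 2

C = mat_mul(B, A)  # matrix of S o T: R^3 -> R^3, size 3 x 3

v = [1, 2, 3]
via_composition = apply(B, apply(A, v))  # first T, then S
via_product = apply(C, v)                # the single matrix C

print(C)
print(via_composition, via_product)
```

Note the order of the factors: the matrix of the map applied first, here $A$, stands on the right.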
We know that a finite-dimensional vector space does not have a unique basis: on the same vector space, one can find several bases. This leads to the following natural question:
Question
Let $\mathscr{B}_U$ be a basis of $U$, and $\mathscr{B}_V$ a basis of $V$. Let now $\mathscr{B}'_U$ and $\mathscr{B}'_V$ be two other bases of $U$ and $V$, respectively. Do we have $${\rm Mat}(T,\mathscr{B}_U\to \mathscr{B}_V)={\rm Mat}(T,\mathscr{B}'_U\to \mathscr{B}'_V)?$$
The answer is no in general. Let us see this on a simple example. Consider the linear application $T:\mathbb{R}^2\to \mathbb{R}^2$ defined by $$ T\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}x+y\\ x-y\end{pmatrix}.$$ Consider the vectors $$ e_1=\begin{pmatrix}1\\ 0\end{pmatrix},\quad e_2=\begin{pmatrix}0\\ 1\end{pmatrix},\quad v_2=\begin{pmatrix}1\\ 1\end{pmatrix}.$$ It is not difficult to prove that $\mathscr{B}=\{e_1,e_2\}$ and $\mathscr{B}'=\{e_1,v_2\}$ are bases of $\mathbb{R}^2$. Observe that $$ T(e_1)=e_1+e_2,\quad T(e_2)=e_1-e_2.$$ Then $$ {\rm Mat}(T,\mathscr{B}\to \mathscr{B})=\begin{pmatrix} 1&1\\1&-1\end{pmatrix}.$$ On the other hand, $$ T(e_1)=\begin{pmatrix}1\\1\end{pmatrix}=v_2,\qquad T(v_2)= \begin{pmatrix}2\\0\end{pmatrix}=2 e_1.$$ Then $$ {\rm Mat}(T,\mathscr{B}'\to \mathscr{B}')=\begin{pmatrix} 0&2\\1&0\end{pmatrix}.$$
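The two matrices of this example can be recomputed mechanically: apply $T$ to each basis vector, then solve a $2\times 2$ linear system to find the coordinates of the result in the same basis. The following is a minimal sketch restricted to $\mathbb{R}^2$, using Cramer's rule. The helper names `coords` and `matrix_in_basis` are assumptions made for this illustration.

```python
from fractions import Fraction

def T(v):
    """T(x, y) = (x + y, x - y)."""
    x, y = v
    return (x + y, x - y)

def coords(w, basis):
    """Coordinates (s, t) of w in a basis (b1, b2) of R^2, by Cramer's rule."""
    (a, c), (b, d) = basis            # b1 = (a, c) and b2 = (b, d)
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two vectors do not form a basis")
    s = Fraction(w[0] * d - b * w[1], det)
    t = Fraction(a * w[1] - c * w[0], det)
    return (s, t)                     # w = s*b1 + t*b2

def matrix_in_basis(T, basis):
    """Matrix of T with the same basis on both sides; column j holds the coordinates of T(b_j)."""
    cols = [coords(T(b), basis) for b in basis]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

standard = [(1, 0), (0, 1)]   # e_1 and e_2
other = [(1, 0), (1, 1)]      # e_1 and v_2

M_standard = matrix_in_basis(T, standard)
M_other = matrix_in_basis(T, other)

print(M_standard)
print(M_other)
```

The same map $T$ therefore has two different matrices, depending only on the basis in which its values are expanded.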