Dividing by a vector

There are various ways that we can multiply vectors.

  • We can multiply a vector by a scalar to get another vector: $w=\lambda v$.
  • We can multiply a vector by a linear map to get another vector: $w=\alpha v$.
  • We can dot-product two vectors to get a scalar: $\lambda = w\cdot v$.
  • We can cross-product two vectors to get another vector: $u = w\times v$.

If we wanted to divide by a vector we would need one of these equations to have a unique solution for the unknown. Sadly this rarely happens.

  • Given $v$ and $w$ there’s no scalar $\lambda$ such that $w=\lambda v$, unless $v$ and $w$ happen to lie on the same line through the origin.
  • Given $v$ and $w$ there are typically many linear maps such that $w=\alpha v$. If $v\in V$ is nonzero and $w\in W$ then the space of linear maps sending $v$ to $w$ has dimension $(\dim(V)-1)\dim(W)$.
  • Given $\lambda$ and $v$ there’s an entire hyperplane of solutions to $\lambda = w\cdot v$.
  • Given $u$ and $v$, there are no solutions to $u = w\times v$ unless $u$ and $v$ are perpendicular, in which case there is an entire line of solutions.

But there are some cases where you can divide by a vector. The case I want to look at today is the one where $V$ is $1$-dimensional.

Suppose $v$ is a nonzero element of a $1$-dimensional vector space $V$. Then for any vector $w$ there’s a unique linear map $\alpha$ such that $w=\alpha v$, so we can write $\alpha = w/v$. Concretely: since $V$ is $1$-dimensional, every element of $V$ can be written as $\lambda v$, and we define $\frac{w}{v}(\lambda v)=\lambda w$.
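
To make this concrete, here is a minimal Python sketch (purely for illustration; the class name OneDimVector is made up). Vectors in a $1$-dimensional real space are stored as a coordinate with respect to some fixed but arbitrary basis vector, and division returns the unique scalar relating two vectors.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class OneDimVector:
        coord: float  # coordinate with respect to an arbitrary fixed basis vector

        def __add__(self, other):
            return OneDimVector(self.coord + other.coord)

        def __sub__(self, other):
            return OneDimVector(self.coord - other.coord)

        def __rmul__(self, scalar):
            return OneDimVector(scalar * self.coord)

        def __truediv__(self, other):
            # w / v: the unique scalar lambda with w = lambda * v.
            # Rescaling the basis rescales both coordinates equally,
            # so this ratio does not depend on the choice of basis.
            return self.coord / other.coord

    v = OneDimVector(4.0)
    w = OneDimVector(10.0)
    assert w / v == 2.5
    assert (w / v) * v == w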

This is of course rather trivial. But trivial is not the same as unimportant. The following examples show how useful this notion can be.

Units

When labelling a chart or a table of numbers, it’s useful to specify the units the numbers are measured in. For example a column might be labelled ‘Time (s)’, indicating that the time is measured in seconds. But you will also see this written as ‘Time/s’. Is this just notation, or is ‘/’ being used with its mathematical meaning? I claim that this use of the division symbol is a genuine example of division by a vector.

Time is definitely a vector quantity. You can add two periods of time, or subtract them; $240s-100s=140s$. It also makes sense to multiply a period of time by a scalar; $2\times 100s=200s$. But you can’t multiply two times together (at least not to get another time). We can therefore view the set of all possible durations as forming a $1$-dimensional vector space. Let’s call it $T$. There’s no canonical time scale to the universe because Newton’s laws give exactly the same result if you double all velocities and then run everything at half speed. So $T$ doesn’t have a canonical basis. But we can arbitrarily pick a basis, a specific duration like the second, and then use this to measure other durations.

Given some duration $t\in T$, the number of seconds it represents is the real scalar by which the duration $1s$ must be multiplied in order to get the duration $t$. In other words $t=\frac{t}{1s}\cdot 1s$.

So when someone writes ‘Time/s’ as a column heading they really do mean what they’ve written. They are measuring the time and then dividing by the quantity of one second to get a real number, which is then the number they write in the table.
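
In terms of the hypothetical OneDimVector sketch above (assuming that class is in scope), the bookkeeping looks like this:

    second = OneDimVector(1.0)        # an arbitrary choice of basis duration
    t = 240 * second - 100 * second   # the duration 140 s
    print(t / second)                 # 140.0: the number written under ‘Time/s’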

Logarithms

The function $x\mapsto\log_b(x)$ is the inverse of the function $n\mapsto b^n$. It satisfies various nice properties, most importantly that $\log_b(xy)=\log_b(x)+\log_b(y)$. This property holds for every base $b\in\mathbb R_{>0}$ with $b\neq 1$.

Quite often one will see the expression ‘$\log(x)$’, written without a base $b$. The base is supposed to be known to the reader, but occasionally there is ambiguity. This can act as a shibboleth for various fields. Mathematicians on the whole assume base $e$, but information theorists and computer scientists often assume base $2$. Everyone else assumes base $10$.

A paper by Michael P. Frank suggests the following idea: rather than picking a base, we view $\log(x)$ not as a number at all, but as an element of a $1$-dimensional vector space. We can define addition on this space by

$$\log(x)+\log(y)=\log(xy)$$

and scalar multiplication by

$$\lambda \log(x)=\log\left(x^\lambda\right).$$

We can then carry these quantities through our equations without ever having to choose a base.

When we do want to settle on a base we can do so using the following formula, an instance of vector division,

$$\frac{\log(x)}{\log(b)}=\log_b(x).$$
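
Here is a toy Python sketch of this idea (not Frank’s notation; the internal representation is arbitrarily taken to be the natural logarithm, and nothing about the interface depends on that choice):

    import math
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Log:
        _coord: float  # coordinate in an arbitrary internal base (natural log here)

        def __add__(self, other):
            return Log(self._coord + other._coord)   # log(x) + log(y) = log(xy)

        def __rmul__(self, scalar):
            return Log(scalar * self._coord)         # lambda log(x) = log(x^lambda)

        def __truediv__(self, other):
            return self._coord / other._coord        # log(x) / log(b) = log_b(x)

    def log(x):
        return Log(math.log(x))

    assert math.isclose(log(8) / log(2), 3.0)             # log_2(8) = 3
    assert math.isclose((log(4) + log(2)) / log(2), 3.0)  # log_2(4 * 2) = 3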

Determinants

Let $\alpha:V\to V$ be a linear map from an $n$-dimensional vector space $V$ to itself. The inverse of $\alpha$ is usually calculated in terms of its adjugate $\mathrm{adj}(\alpha):V\to V$ and its determinant $\det(\alpha)\in\mathbb F$ (where $\mathbb F$ is the underlying field).

These satisfy the property that $\mathrm{adj}(\alpha)\circ\alpha=\det(\alpha)\mathrm{id}_V=\alpha\circ\mathrm{adj}(\alpha)$. Therefore whenever $\det(\alpha)$ is nonzero the map $\alpha$ is invertible with inverse $\frac{\mathrm{adj}(\alpha)}{\det(\alpha)}$. This is an example of ordinary division, because $\det(\alpha)$ is an element of the field of scalars.

But now suppose that $\alpha$ maps between two different vector spaces $\alpha:V\to W$, where both spaces still have dimension $n$. One may try to calculate the determinant of $\alpha$ by picking bases of $V$ and $W$ and then calculating the determinant of the matrix for $\alpha$ with respect to these bases. If you do this you’ll find that the result depends on the bases chosen. Specifically, if one applies change-of-basis matrices $A$ to $V$ and $B$ to $W$ then the result is multiplied by $\det(A)^{-1}\det(B)$.
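
Here is a quick numerical check of that factor (a numpy sketch, using the convention that the new matrix is $BMA^{-1}$):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    M = rng.standard_normal((n, n))  # matrix of alpha in the original bases
    A = rng.standard_normal((n, n))  # change of basis on V
    B = rng.standard_normal((n, n))  # change of basis on W

    M_new = B @ M @ np.linalg.inv(A)  # matrix of alpha in the new bases
    factor = np.linalg.det(B) / np.linalg.det(A)
    assert np.isclose(np.linalg.det(M_new), factor * np.linalg.det(M))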

Can $\det(\alpha)$ still be defined? It can, but no longer as an element of $\mathbb F$. Instead it must be seen as an element of the $1$-dimensional vector space $\Lambda^n(V)^*\otimes\Lambda^n(W)$, where $\Lambda^n$ denotes the $n$-th exterior power.

If we do similar calculations for $\mathrm{adj}$, we find that it picks up the same factor as $\det$ when we change bases: on top of its usual transformation law, each entry is multiplied by $\det(A)^{-1}\det(B)$. This means that $\mathrm{adj}(\alpha)$ is not an element of $\mathrm{Hom}(W,V)$ but rather of $\mathrm{Hom}(W,V)\otimes\Lambda^n(V)^*\otimes\Lambda^n(W)$.

But the situation rights itself when we calculate the inverse by dividing the adjugate by the determinant: we define $\alpha^{-1}=\frac{\mathrm{adj}(\alpha)}{\det(\alpha)}$ to be the unique linear map taking the determinant to the adjugate. Using the fact that the tensor product of a $1$-dimensional vector space and its dual is canonically isomorphic to the underlying field, we have

\begin{align*}
\alpha^{-1}&\in\mathrm{Hom}(\Lambda^n(V)^*\otimes\Lambda^n(W),\mathrm{Hom}(W,V)\otimes\Lambda^n(V)^*\otimes\Lambda^n(W))\\
&\cong\mathrm{Hom}(W,V)\otimes\Lambda^n(V)^*\otimes\Lambda^n(V)\otimes\Lambda^n(W)^*\otimes\Lambda^n(W)\\
&\cong\mathrm{Hom}(W,V)\otimes\mathbb F\otimes\mathbb F\\
&\cong\mathrm{Hom}(W,V)
\end{align*}

which is what we wanted, since we expected $\alpha^{-1}$ to be a map from $W$ to $V$. Alternatively we can just note that the two factors of $\det(A)^{-1}\det(B)$ cancel out, and hence the matrix of $\alpha^{-1}$ transforms appropriately for a linear map $W\to V$.
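
Continuing the numpy sketch above (with M, A, B and M_new still in scope), the cancellation can be checked numerically:

    # inv(M_new) = A @ inv(M) @ inv(B): the det(A)^{-1} det(B) factors have
    # cancelled, and the matrix transforms as a linear map W -> V should.
    assert np.allclose(np.linalg.inv(M_new),
                       A @ np.linalg.inv(M) @ np.linalg.inv(B))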

Calculus

Let $\mathcal M$ be a $1$-dimensional manifold, and suppose we have a function $x:\mathcal M\to \mathbb R$. Suppose we also have a function $f:\mathbb R\to\mathbb R$. Then by composition we may define a new function $y=f\circ x:\mathcal M \to \mathbb R$.

Since $x$ and $y$ are scalar functions on $\mathcal M$ they have differentials $\mathrm{d}x$ and $\mathrm{d}y$, which at each point live in the cotangent space of $\mathcal M$. Since $\mathcal M$ is $1$-dimensional, the cotangent space at any given point is a $1$-dimensional vector space. Therefore wherever $\mathrm{d}x$ is nonzero we may divide $\mathrm{d}y$ by $\mathrm{d}x$. Thus we can view the expression $\mathrm{d}y/\mathrm{d}x$ as a genuine fraction, and we do indeed get

$$\frac{\mathrm{d}y}{\mathrm{d}x}=f'(x).$$

(See also the very popular math.stackexchange question Is $\frac{\mathrm{d}y}{\mathrm{d}x}$ not a ratio?.)
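
To see this concretely in a chart, here is a small sympy check (the chart function and $f$ below are arbitrary choices for illustration): in a local coordinate $t$ we have $\mathrm dx = x'(t)\,\mathrm dt$ and $\mathrm dy = y'(t)\,\mathrm dt$, so the quotient of the two cotangent vectors is the ratio of their coefficients.

    import sympy as sp

    t = sp.symbols('t')   # a local coordinate on the 1-dimensional manifold
    x = sp.sin(t)         # an arbitrary chart function x: M -> R
    y = x**2              # y = f(x) with f(u) = u^2
    # dx = cos(t) dt and dy = 2 sin(t) cos(t) dt, so dy/dx is the
    # ratio of the dt-coefficients:
    dy_dx = sp.diff(y, t) / sp.diff(x, t)
    assert sp.simplify(dy_dx - 2*x) == 0  # agrees with f'(x) = 2x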

This formalism can be useful when it comes to multivariable calculus. Suppose $\mathcal M$ is now an $n$-dimensional manifold, with functions $x_0, x_1,\dots,x_n:\mathcal M \to \mathbb R$, where $x_1,\dots,x_n$ form a local coordinate system. Then if we take the chain rule

$$\mathrm dx_0 = \left(\frac{\partial x_0}{\partial x_1}\right)_{x_2,\dots, x_{n}}\mathrm dx_1 + \dots + \left(\frac{\partial x_0}{\partial x_{n}}\right)_{x_1, \dots, x_{n-1}}\mathrm dx_{n}$$

and wedge it with $\mathrm dx_2\wedge\dots\wedge\mathrm dx_{n}$ we obtain

$$\mathrm dx_0\wedge \mathrm dx_2\wedge\dots\wedge\mathrm dx_{n}=\left(\frac{\partial x_0}{\partial x_1}\right)_{x_2, \dots, x_{n}}\mathrm dx_1\wedge \mathrm dx_2\wedge\dots\wedge\mathrm dx_{n}$$

and hence

$$\left(\frac{\partial x_0}{\partial x_1}\right)_{x_2, \dots, x_{n}} = \frac{\mathrm dx_0\wedge \mathrm dx_2\wedge\dots\wedge\mathrm dx_{n}}{\mathrm dx_1\wedge \mathrm dx_2\wedge\dots\wedge\mathrm dx_{n}}.$$

This formula allows one to convert multivariable calculus problems to exterior algebra.
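
As a small sanity check (an illustrative example, not from the derivation above), take $n=2$ and $x_0=x_1^2x_2$. Then

$$\mathrm dx_0 = 2x_1x_2\,\mathrm dx_1 + x_1^2\,\mathrm dx_2,$$

and wedging with $\mathrm dx_2$ kills the second term, giving

$$\frac{\mathrm dx_0\wedge\mathrm dx_2}{\mathrm dx_1\wedge\mathrm dx_2} = 2x_1x_2 = \left(\frac{\partial x_0}{\partial x_1}\right)_{x_2}$$

as expected.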

Exercise

Prove the triple product rule

$$\left(\frac{\partial x}{\partial y}\right)_z\left(\frac{\partial y}{\partial z}\right)_x\left(\frac{\partial z}{\partial x}\right)_y = -1$$

where $x$, $y$ and $z$ are functions on a $2$-dimensional manifold, any two of which form a local coordinate system.


