The Existence of the Exponential Function

Introduction

The purpose of this paperlet is to use some homological algebra in order to prove the existence of a power series e(x) (with coefficients in ${\mathbb Q}$) which satisfies the non-linear equation

 [Main]
e(x + y) = e(x)e(y)

as well as the initial condition

 [Init]
e(x) = 1 + x + (higher order terms).

Alternative proofs of the existence of e(x) are of course available, including the explicit formula $e(x)=\sum_{k=0}^\infty\frac{x^k}{k!}$. Thus the value of this paperlet is not in the result it proves but rather in the allegorical story it tells: that there is a technique to solve functional equations such as [Main] using homology. There are plenty of other examples for the use of that technique, in which the equation replacing [Main] isn't as easy (see Further Examples below). Thus the exponential function seems to be the easiest illustration of a general principle and as such it is worthy of documenting.

Thus below we will pretend not to know the exponential function and/or its relationship with the differential equation e' = e.

Acknowledgment

Significant parts of this paperlet were contributed by Omar Antolin Camarena. Further thanks to Yael Karshon, to Peter Lee and to the students of Math 1350 (2006) in general.

The Scheme

We aim to construct e(x) and solve [Main] inductively, degree by degree. Equation [Init] gives e(x) in degrees 0 and 1, and the given formula for e(x) indeed solves [Main] in degrees 0 and 1. So booting the induction is no problem. Now assume we've found a degree 7 polynomial e7(x) which solves [Main] up to and including degree 7, but at this stage of the construction, it may well fail to solve [Main] in degree 8. Thus modulo degrees 9 and up, we have

 [M]
e7(x + y) − e7(x)e7(y) = M(x,y),

where M(x,y) is the "mistake for e7", a certain homogeneous polynomial of degree 8 in the variables x and y.

Our hope is to "fix" the mistake M by replacing e7(x) with e8(x) = e7(x) + ε(x), where ε(x) is a degree 8 "correction", a homogeneous polynomial of degree 8 in x (well, in this simple case, just a multiple of $x^8$).

 *1 The terms containing no ε's make a copy of the left hand side of [M]. The terms linear in ε are ε(x + y), − e7(x)ε(y) and − ε(x)e7(y). Note that since the constant term of e7 is 1 and since we only care about degree 8, the last two terms can be replaced by − ε(y) and − ε(x), respectively. Finally, we don't even need to look at terms higher than linear in ε, for these have degree 16 or more, high in the stratosphere.

So we substitute e8(x) = e7(x) + ε(x) into e(x + y) − e(x)e(y) (a version of [Main]), expand, and consider only the low degree terms - those below and including degree 8:*1

e8(x + y) − e8(x)e8(y) = M(x,y) − ε(y) + ε(x + y) − ε(x).

We define a "differential" $d:{\mathbb Q}[x]\to{\mathbb Q}[x,y]$ by (df)(x,y) = f(y) − f(x + y) + f(x), and the above equation becomes

e8(x + y) − e8(x)e8(y) = M(x,y) − (dε)(x,y).
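
As a sanity check, here is the same computation carried out for the very first step of the induction (from degree 1 to degree 2); it is only an illustration, not part of the general argument. Take $e_1(x) = 1 + x$ from [Init]. Its mistake in degree 2 is

$M(x,y) = e_1(x+y) - e_1(x)e_1(y) = (1+x+y) - (1+x)(1+y) = -xy.$

A degree 2 correction has the form $\epsilon(x) = cx^2$ for an unknown rational c, and

$(d\epsilon)(x,y) = c\left(y^2 - (x+y)^2 + x^2\right) = -2cxy.$

So $M - d\epsilon$ vanishes exactly when $c = \frac12$, giving $e_2(x) = 1 + x + \frac{x^2}{2}$.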
 *2 It is worth noting that in some a priori sense the existence of a solution of e(x + y) = e(x)e(y) is unexpected. For e must be an element of the relatively small space ${\mathbb Q}[[x]]$ of power series in one variable, but the equation it is required to satisfy lives in the much bigger space ${\mathbb Q}[[x,y]]$ of power series in two variables. Thus in some sense we have more equations than unknowns and a solution is unlikely. How fortunate we are that exponentials do exist, after all!

To continue with our inductive construction we need to have that e8(x + y) − e8(x)e8(y) = 0. Hence the existence of the exponential function hinges upon our ability to find an ε for which M = dε. In other words, we must show that M is in the image of d. This appears hopeless unless we learn more about M, for the domain space of d is much smaller than its target space and thus d cannot be surjective, and if M were in any sense "random", we simply wouldn't be able to find our correction term ε.*2
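
The inductive step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (mistake, correction, exponential) are hypothetical, polynomials are stored as lists of exact rational coefficients, and the code does not prove that a correction exists; it merely raises an error if the candidate values for ε disagree.

```python
from fractions import Fraction
from math import comb


def mistake(a):
    """Degree n+1 part of e_n(x+y) - e_n(x) e_n(y), for e_n(x) = sum a[k] x^k.

    Returns {(i, j): coefficient of x^i y^j} for i + j = n + 1 and i, j >= 1.
    The coefficients with i = 0 or j = 0 vanish because e_n has no term of
    degree n + 1, so they are omitted.
    """
    n = len(a) - 1
    # e_n(x + y) has no terms of degree n + 1; only the product contributes.
    return {(i, n + 1 - i): -a[i] * a[n + 1 - i] for i in range(1, n + 1)}


def correction(a):
    """Find c such that eps(x) = c x^(n+1) satisfies d(eps) = M."""
    n = len(a) - 1
    # d(c x^(n+1)) = c (y^(n+1) - (x+y)^(n+1) + x^(n+1)), whose x^i y^j
    # coefficient (i, j >= 1) is -c * C(n+1, i).
    candidates = {-m / comb(n + 1, i) for (i, _), m in mistake(a).items()}
    if len(candidates) != 1:
        raise ValueError("the mistake M is not in the image of d")
    return candidates.pop()


def exponential(degree):
    """Coefficients of e(x) up to the given degree, built degree by degree."""
    a = [Fraction(1), Fraction(1)]  # [Init]: e(x) = 1 + x + ...
    while len(a) <= degree:
        a.append(correction(a))
    return a
```

For instance, exponential(4) returns the coefficients 1, 1, 1/2, 1/6, 1/24. That every candidate value of c agrees at each step is exactly the statement that M lies in the image of d.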

As we shall see momentarily by "finding syzygies", ε and M fit within the 1st and 2nd chain groups of a rather short complex

 [Complex]
$\left(\epsilon\in C^1={\mathbb Q}[[x]]\right)\longrightarrow\left(M\in C^2={\mathbb Q}[[x,y]]\right)\longrightarrow\left(C^3={\mathbb Q}[[x,y,z]]\right)$,

whose first differential was already written and whose second differential is given by $(d^2m)(x,y,z) = m(y,z) - m(x+y,z) + m(x,y+z) - m(x,y)$ for any $m\in{\mathbb Q}[[x,y]]$. We shall further see that for "our" M, we have $d^2M = 0$. Therefore in order to show that M is in the image of $d^1$, it suffices to show that the kernel of $d^2$ is equal to the image of $d^1$, or simply that $H^2 = 0$.

Finding a Syzygy

So what kind of relations can we get for M? Well, it measures how close e7 is to turning sums into products, so we can look for preservation of properties that both addition and multiplication have. For example, they're both commutative, so we should have M(x,y) = M(y,x), and indeed this is obvious from the definition. Now let's try associativity, that is, let's compute e7(x + y + z) associating first as (x + y) + z and then as x + (y + z). In the first way we get

$e_7(x+y+z) = M(x+y,z)+e_7(x+y)e_7(z) = M(x+y,z)+\left(M(x,y)+e_7(x)e_7(y)\right)e_7(z)$.

In the second we get

$e_7(x+y+z) = M(x,y+z)+e_7(x)e_7(y+z) = M(x,y+z)+e_7(x)\left(M(y,z)+e_7(y)e_7(z)\right)$.

Comparing these two we get an interesting relation for M: M(x + y,z) + M(x,y)e7(z) = M(x,y + z) + e7(x)M(y,z). Since we'll only use M to find the next highest term, we can be sloppy about all but the lowest-degree term of M. This means that in the relation we just found we can replace e7 by its constant term, namely 1. Upon rearranging, we get the relation promised for M: $d^2M = M(y,z) - M(x+y,z) + M(x,y+z) - M(x,y) = 0$.
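
The syzygy can be checked numerically with exact rational arithmetic. The sketch below assumes that e7 has coefficients 1/k! (the values the induction produces), so that its degree 8 mistake is the explicit polynomial written in the code; the names M and d2 are just hypothetical helpers.

```python
from fractions import Fraction
from math import factorial


def M(x, y, n=7):
    """Degree n+1 part of e_n(x+y) - e_n(x) e_n(y), with e_n(x) = sum x^k / k!.

    Only the product e_n(x) e_n(y) contributes in degree n + 1, with
    coefficient -1 / (i! j!) on x^i y^j for i + j = n + 1 and i, j >= 1.
    """
    x, y = Fraction(x), Fraction(y)
    return -sum(
        x**i * y**(n + 1 - i) / (factorial(i) * factorial(n + 1 - i))
        for i in range(1, n + 1)
    )


def d2(m, x, y, z):
    """The second differential: m(y,z) - m(x+y,z) + m(x,y+z) - m(x,y)."""
    return m(y, z) - m(x + y, z) + m(x, y + z) - m(x, y)


# Evaluate at an arbitrary rational point; both checks are exact.
x, y, z = Fraction(1, 2), Fraction(-3, 5), Fraction(7, 3)
commutative = M(x, y) == M(y, x)
syzygy_holds = d2(M, x, y, z) == 0
```

Both commutative and syzygy_holds come out True. A check at one point is of course not a proof, but since everything is a polynomial identity, any other rational point can be substituted.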

Computing the Homology, Easy but Limited

Now let's prove that $H^2 = 0$ for our (piece of) chain complex. That is, letting $M(x,y) \in \mathbb{Q}[[x,y]]$ be such that $d^2M = 0$, we'll prove that for some $\epsilon(x) \in \mathbb{Q}[[x]]$ we have $d^1\epsilon = M$.

Write the two power series as $\epsilon(x) = \sum{\frac{\alpha_i}{i!}x^i}$ and $M(x,y) = \sum{\frac{m_{ij}}{i!j!}x^i y^j}$, where the $\alpha_i$ are the unknowns we wish to solve for.

The coefficient of $x^iy^jz^k$ in M(y,z) − M(x + y,z) + M(x,y + z) − M(x,y) is

$\frac{\delta_{i0} m_{jk}}{j!k!} - {{i+j} \choose i} \frac{m_{i+j,k}}{(i+j)!k!} + {{j+k} \choose j} \frac{m_{i,j+k}}{i!(j+k)!} - \frac{\delta_{k0} m_{ij}}{i!j!}$.

Here, $\delta_{i0}$ is a Kronecker delta: 1 if i = 0 and 0 otherwise. Since $d^2M = 0$, this coefficient is zero. Multiplying by i!j!k! (and noting that, for example, the first term doesn't need an i! since the delta is only nonzero when i = 0) we get:

 [m]
$\delta_{i0}m_{jk} - m_{i+j,k} + m_{i,j+k} - \delta_{k0}m_{ij} = 0.$

An entirely analogous procedure tells us that the equations we must solve boil down to $\delta_{i0}\alpha_j - \alpha_{i+j} + \delta_{j0}\alpha_i = m_{ij}$.

By setting i = j = 0 in this last equation we see that $\alpha_0 = m_{00}$. Now let i and j be arbitrary positive integers. This solves for most of the coefficients: $\alpha_{i+j} = -m_{ij}$. Any integer at least two can be written as i + j, so this determines all of the $\alpha_m$ for $m \ge 2$. We just need to prove that $\alpha_m$ is well defined, that is, that $\alpha_{i+j} = -m_{ij}$ doesn't depend on i and j but only on their sum.

But when i and k are strictly positive, the relation [m] reads $m_{i+j,k} = m_{i,j+k}$, which shows that we can "transfer" j from one index to the other, which is what we wanted.

It only remains to find $\alpha_1$, but it's easy to see that the equations cannot determine it: if ε satisfies $d^1\epsilon = M$, then so does ε(x) + kx for any k, so $\alpha_1$ is arbitrary. How do our coefficient equations tell us this?

Well, we can't find a single equation for $\alpha_1$! We've already tried taking both i and j to be zero, and also taking them both positive. The only case left is taking one zero and one positive. Doing so gives two necessary conditions for the existence of the $\alpha_m$: $m_{0r} = m_{r0} = 0$ for r > 0. So no $\alpha_1$ comes up, and we're still not done: we must check that these conditions hold. Fortunately, setting one of i and k to be zero and the other positive in the relation [m] for the $m_{ij}$ does the trick (with i = 0 < k it reads $m_{0,j+k} = 0$, and with k = 0 < i it reads $m_{i+j,0} = 0$).
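
The bookkeeping in this section can be replayed on a small example. The sketch below is an illustration only: the sample coefficients in alpha are made up, the function names are hypothetical, and everything is truncated at a fixed total degree. It builds the $m_{ij}$ from a known ε, checks the relation [m], and then recovers the $\alpha_m$ from the $m_{ij}$ alone.

```python
def d1_coefficient(alpha, i, j):
    """m_ij for M = d^1 eps: delta_{i0} alpha_j - alpha_{i+j} + delta_{j0} alpha_i."""
    return (alpha[j] if i == 0 else 0) - alpha[i + j] + (alpha[i] if j == 0 else 0)


def satisfies_m(m, top):
    """Check the relation [m] for all i, j, k with i + j + k <= top."""
    for i in range(top + 1):
        for j in range(top + 1 - i):
            for k in range(top + 1 - i - j):
                total = ((m[j, k] if i == 0 else 0) - m[i + j, k]
                         + m[i, j + k] - (m[i, j] if k == 0 else 0))
                if total != 0:
                    return False
    return True


def solve_for_alpha(m, top):
    """Recover alpha_0 and alpha_2, ..., alpha_top from the m_ij.

    alpha_1 is not determined by the equations; 0 is an arbitrary choice.
    """
    alpha = [m[0, 0], 0]
    for n in range(2, top + 1):
        # alpha_n = -m_ij for any i, j > 0 with i + j = n; all must agree.
        values = {-m[i, n - i] for i in range(1, n)}
        if len(values) != 1:
            raise ValueError(f"alpha_{n} is not well defined")
        alpha.append(values.pop())
    return alpha


# Made-up sample data: the alpha_i of some eps, truncated at degree 4.
alpha = [5, 3, 7, -2, 11]
top = len(alpha) - 1
m = {(i, j): d1_coefficient(alpha, i, j)
     for i in range(top + 1) for j in range(top + 1 - i)}
recovered = solve_for_alpha(m, top)
```

Here recovered equals [5, 0, 7, -2, 11]: every coefficient comes back except $\alpha_1$, which was 3 in the sample and is invisible to the equations.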

Computing the Homology, Hard but Rewarding

Let $C_n^k$ denote the space of degree n polynomials in (commuting) variables $x_1,\ldots,x_k$ (with rational coefficients) and let $d^k:C_n^k\to C_n^{k+1}$ be defined by $d^k=\sum_{i=0}^{k+1}(-)^i d^k_i$, where $(d^k_0f)(x_1,\ldots,x_{k+1}):=f(x_2,\ldots,x_{k+1})$, $(d^k_if)(x_1,\ldots,x_{k+1}):=f(x_1,\ldots,x_i+x_{i+1},\ldots,x_{k+1})$ for $1\leq i\leq k$ and $(d^k_{k+1}f)(x_1,\ldots,x_{k+1}):=f(x_1,\ldots,x_k)$. It is easy to verify that ${\mathcal C}_n:=(C_n^\star, d)$ is a chain complex, and that (for k = 1,2,3) it agrees with the degree n piece of the complex in [Complex]. We need to show that $H^2({\mathcal C}_n)=0$ for n > 1 (we don't need the vanishing of $H^2$ for n = 0,1 as these degrees are covered by the initial condition [Init]). This follows from the following theorem.

Theorem. $H^1({\mathcal C}_1)$ is ${\mathbb Q}$; otherwise $H^k({\mathcal C}_n)=0$.

Proof (sketch). It is easy to verify "by hand" that $\dim H^k({\mathcal C}_1)=\delta_{k1}$. For n > 1 let ${\mathcal C}_1^{\otimes n}$ be the nth interior power of ${\mathcal C}_1$, whose kth chain group is $(C_1^k)^{\otimes n}$ and whose differential is defined using the diagonal action of the $d^k_i$'s. The permutation group $S_n$ acts on ${\mathcal C}_1^{\otimes n}$ by permuting the tensor factors. If R denotes the trivial representation of $S_n$, then $R\otimes_{S_n}{\mathcal C}_1^{\otimes n}={\mathcal C}_n$, and so

$H^\star({\mathcal C}_n) = H^\star(R\otimes_{S_n}{\mathcal C}_1^{\otimes n}) = R\otimes_{S_n}H^\star({\mathcal C}_1^{\otimes n}) = R\otimes_{S_n}H^\star({\mathcal C}_1)^{\otimes n}$

and the result readily follows. Note that the last equality uses the Eilenberg-Zilber-Künneth formula, which holds because ${\mathcal C}_n$ (and especially ${\mathcal C}_1$) is a co-simplicial space with the $d^k_i$'s as co-face maps and with $(s^k_i f)(x_1,\ldots,x_{k-1}):=f(x_1,\ldots,x_{i-1},0,x_i,\ldots,x_{k-1})$ as co-degeneracies.
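
For small k and n the theorem can also be checked by brute-force linear algebra, which is a sanity check rather than a proof. The sketch below assumes $n \geq 1$ and takes $C_n^k$ to be spanned by the degree n monomials in k variables; the function names are hypothetical. It builds the matrix of $d^k$ on monomials and computes $\dim H^k = \dim C_n^k - \operatorname{rank} d^k - \operatorname{rank} d^{k-1}$ over the rationals.

```python
from fractions import Fraction
from itertools import product
from math import comb


def monomials(k, n):
    """Exponent tuples of all degree-n monomials in k variables."""
    return [e for e in product(range(n + 1), repeat=k) if sum(e) == n]


def coface(e, i):
    """Apply d_i to the monomial with exponents e; return {exponents: coeff}."""
    k = len(e)
    if i == 0:
        return {(0,) + e: 1}        # f(x_2, ..., x_{k+1})
    if i == k + 1:
        return {e + (0,): 1}        # f(x_1, ..., x_k)
    a = e[i - 1]                    # expand (x_i + x_{i+1})^a binomially
    return {e[:i - 1] + (a - b, b) + e[i:]: comb(a, b) for b in range(a + 1)}


def differential(k, n):
    """Matrix of d^k from C_n^k to C_n^{k+1} in the monomial bases."""
    rows, cols = monomials(k + 1, n), monomials(k, n)
    index = {e: r for r, e in enumerate(rows)}
    mat = [[Fraction(0)] * len(cols) for _ in rows]
    for c, e in enumerate(cols):
        for i in range(k + 2):
            for target, coeff in coface(e, i).items():
                mat[index[target]][c] += (-1) ** i * coeff
    return mat


def rank(mat):
    """Rank over the rationals by Gaussian elimination."""
    mat = [row[:] for row in mat]
    r = 0
    for c in range(len(mat[0]) if mat else 0):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][c] / mat[r][c]
            mat[i] = [x - factor * y for x, y in zip(mat[i], mat[r])]
        r += 1
    return r


def betti(k, n):
    """dim H^k of the complex C_n, for k >= 1 and n >= 1."""
    return len(monomials(k, n)) - rank(differential(k, n)) - rank(differential(k - 1, n))
```

With these definitions betti(1, 1) is 1, while betti(2, n) is 0 for the small n tried, in agreement with the theorem. The matrices grow quickly with k and n, so this is only practical in low degrees.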

Further Examples

Let us briefly list a number of other places in mathematics where similar "non-linear algebraic functional equations" need to be solved. The techniques we have developed to solve [Main] can be applied in all of those cases, though sometimes they are fully successful and sometimes something breaks down somewhere along the way.

 In knot theory, solving this equation is as easy as calculating 1+1=2 on an abacus: See .

The "Duflo Homomorphism" Equation

This is the equation

$(\Delta\otimes 1)\Upsilon = \Upsilon^{12}\Upsilon^{23}$,

written in $\left(S(L^\star)_L\otimes S(L^\star)_L\otimes U(L)\right)^L$, for an unknown $\Upsilon\in \left(S(L^\star)_L\otimes U(L)\right)^L$ where L is a Lie algebra. With appropriate qualifications, $S(L^\star)\otimes U(L)=\operatorname{Hom}(S(L),U(L))$, and our equation becomes the statement "Υ is an algebra homomorphism between the invariants of S(L) and the invariants of U(L)", and its solution is the so-called "Duflo homomorphism". This is a typical "homomorphism wanted" equation and it has many relatives, including our primary example e(x + y) = e(x)e(y) which is also known as the statement "the additive group of ${\mathbb R}$ is isomorphic to the multiplicative group of ${\mathbb R}_+$". Note that in general, the equation for being a homomorphism, say $\varphi(xy)=\varphi(x)\varphi(y)$, is non-linear in the homomorphism $\varphi$ itself.

Where from cometh the syzygies? As in the case of the exponential function, they come from the fact that for a homomorphism, associativity in the target follows from associativity in the domain.

The "Formal Quantization" Equation

This is the equation

$(f\star g)\star h = f\star(g\star h),$

written for an unknown "$\star$-product", within a certain complicated space of "potential products" which resembles $\operatorname{Hom}(V\otimes V,V)$ or $V^\star\otimes V^\star\otimes V$ for some vector space V. This is a typical "algebraic structure wanted" equation, in which the unknown is an "algebraic structure" (an associative product $\star$, in this case) and the equation is "the structure satisfies a law" (the associative law, in our case). Note that algebraic laws are often non-linear in the structure that they govern (the example relevant here is that the associative law is "quadratic as a function of the product"). Other examples abound, with the "structure" replaced by a "bracket" or anything else, and the "law" replaced by "Jacobi's equation" or whatever you fancy.

Where from cometh the syzygies? From the pentagon shown on the right.

The Drinfel'd Pentagon Equation

This is the equation

$\Phi(t^{12},t^{23})\Phi(t^{12}+t^{13},t^{24}+t^{34})\Phi(t^{23},t^{34}) = \Phi(t^{13}+t^{23},t^{34})\Phi(t^{12},t^{24}+t^{34}).$

It is an equation written in some strange non-commutative algebra ${\mathcal A}^h_4$, for an unknown "function" Φ(a,b) which in itself lives in some non-commutative algebra ${\mathcal A}^h_3$. This equation is related to tensor categories, to quasi-Hopf algebras and (strange as it may seem) to knot theory, and is commonly summarized by either of the following two pictures:

This equation is a close friend of the Drinfel'd Hexagon Equation, and you can read more about both of them at , and . These equations have many relatives living in many further exotic spaces.

Where from cometh the syzygies? Here the equations come from the pentagon, so the syzygies must come from somewhere more complicated - the "Stasheff Polyhedron":

The Drinfel'd Twist Equation

This is the equation

$\Phi'(x,y,z) = F^{-1}(x+y,z)F^{-1}(x,y)\Phi(x,y,z)F(y,z)F(x,y+z).$

This equation is written in another strange non-commutative algebra ${\mathcal A}_3$, which is a superset of ${\mathcal A}^h_3$. The unknown F lives in some space ${\mathcal A}_2$, and Φ and Φ' are solutions of the Drinfel'd pentagon of before. You can read more about this equation at , and . This equation has many relatives living in many further exotic spaces.

Where from cometh the syzygies? Again from the pentagon, but in a different way.

The Braidor Equation

This is the equation

$B(x_1,x_2,x_3)B(x_1+x_3,x_2,x_4)B(x_1,x_3,x_4) = B(x_1+x_2,x_3,x_4)B(x_1,x_2,x_4)B(x_1+x_4,x_2,x_3),$

written in ${\mathcal A}_4$ for an unknown $B\in{\mathcal A}_3$. It is related to the picture on the right and through it to the third Reidemeister move of braid theory and knot theory. At present, the only place to read more about it is 06-1350/Homework Assignment 4.

Where from cometh the syzygies? The third Reidemeister move comes from resolving a triple point. The corresponding syzygy comes from resolving a quadruple point. See a picture at 06-1350/Homework Assignment 4.

Further Further Examples

This subsection is by definition forever empty, for if a worthwhile further further example comes to mind, mine or yours, it should be added as a subsection right above.

References

[Bar-Natan_97] D. Bar-Natan, Non-associative tangles, in Geometric topology (proceedings of the Georgia international topology conference), (W. H. Kazez, ed.), 139-183, Amer. Math. Soc. and International Press, Providence, 1997.

[Bar-Natan_Le_Thurston_03] D. Bar-Natan, T. Q. T. Le and D. P. Thurston, Two applications of elementary knot theory to Lie algebras and Vassiliev invariants, Geometry and Topology 7-1 (2003) 1-31, arXiv:math.QA/0204311.

[Drinfeld_90] V. G. Drinfel'd, Quasi-Hopf algebras, Leningrad Math. J. 1 (1990) 1419-1457.

[Drinfeld_91] V. G. Drinfel'd, On quasitriangular Quasi-Hopf algebras and a group closely connected with $\operatorname{Gal}(\bar{\mathbb Q}/{\mathbb Q})$, Leningrad Math. J. 2 (1991) 829-860.

[Le_Murakami_96] T. Q. T. Le and J. Murakami, The universal Vassiliev-Kontsevich invariant for framed oriented links, Compositio Math. 102 (1996), 41-64, arXiv:hep-th/9401016.