The Existence of the Exponential Function




The purpose of this paperlet is to use some homological algebra in order to prove the existence of a power series e(x) (with coefficients in {\mathbb Q}) which satisfies the non-linear equation

[Main]    e(x+y)=e(x)e(y)

as well as the initial condition

[Init]    e(x)=1+x+(higher order terms).

Alternative proofs of the existence of e(x) are of course available, including the explicit formula e(x)=\sum_{k=0}^\infty\frac{x^k}{k!}. Thus the value of this paperlet is not in the result it proves but rather in the allegorical story it tells: that there is a technique to solve functional equations such as [Main] using homology. There are plenty of other examples for the use of that technique, in which the equation replacing [Main] isn't as easy (see Further Examples below). Thus the exponential function seems to be the easiest illustration of a general principle and as such it is worthy of documenting.

Thus below we will pretend not to know the exponential function and/or its relationship with the differential equation e'=e.


Significant parts of this paperlet were contributed by Omar Antolin Camarena. Further thanks to Yael Karshon, to Peter Lee and to the students of Math 1350 (2006) in general.

The Scheme

We aim to construct e(x) and solve [Main] inductively, degree by degree. Equation [Init] gives e(x) in degrees 0 and 1, and the given formula for e(x) indeed solves [Main] in degrees 0 and 1. So booting the induction is no problem. Now assume we've found a degree 7 polynomial e_7(x) which solves [Main] up to and including degree 7, but at this stage of the construction, it may well fail to solve [Main] in degree 8. Thus modulo degrees 9 and up, we have

[M]    e_7(x+y)-e_7(x)e_7(y) = M(x,y),

where M(x,y) is the "mistake for e_7", a certain homogeneous polynomial of degree 8 in the variables x and y.

Our hope is to "fix" the mistake M by replacing e_7(x) with e_8(x)=e_7(x)+\epsilon(x), where \epsilon(x) is a degree 8 "correction", a homogeneous polynomial of degree 8 in x (well, in this simple case, just a multiple of x^8).

So we substitute e_8(x)=e_7(x)+\epsilon(x) into e(x+y)-e(x)e(y) (a version of [Main]), expand, and consider only the low degree terms - those below and including degree 8:*1

e_8(x+y)-e_8(x)e_8(y) = M(x,y)-\epsilon(y)+\epsilon(x+y)-\epsilon(x).

*1 The terms containing no \epsilon's make a copy of the left hand side of [M]. The terms linear in \epsilon are \epsilon(x+y), -e_7(x)\epsilon(y) and -\epsilon(x)e_7(y). Note that since the constant term of e_7 is 1 and since we only care about degree 8, the last two terms can be replaced by -\epsilon(y) and -\epsilon(x), respectively. Finally, we don't even need to look at terms higher than linear in \epsilon, for these have degree 16 or more, high in the stratosphere.

We define a "differential" d:{\mathbb Q}[x]\to{\mathbb Q}[x,y] by (df)(x,y)=f(y)-f(x+y)+f(x), and the above equation becomes

e_8(x+y)-e_8(x)e_8(y) = M(x,y)-(d\epsilon)(x,y).

To continue with our inductive construction we need to have that e_8(x+y)-e_8(x)e_8(y)=0. Hence the existence of the exponential function hinges upon our ability to find an \epsilon for which M=d\epsilon. In other words, we must show that M is in the image of d. This appears hopeless unless we learn more about M, for the domain space of d is much smaller than its target space and thus d cannot be surjective, and if M were in any sense "random", we simply wouldn't be able to find our correction term \epsilon.*2

*2 It is worth noting that in some a priori sense the existence of a solution of e(x+y)=e(x)e(y) is unexpected. For e must be an element of the relatively small space {\mathbb Q}[[x]] of power series in one variable, but the equation it is required to satisfy lives in the much bigger space {\mathbb Q}[[x,y]] of power series in two variables. Thus in some sense we have more equations than unknowns and a solution is unlikely. How fortunate we are that exponentials do exist, after all!

As we shall see momentarily by "finding syzygies", \epsilon and M fit within the 1st and 2nd chain groups of a rather short complex

[Complex]    \left(\epsilon\in C^1={\mathbb Q}[[x]]\right)\longrightarrow\left(M\in C^2={\mathbb Q}[[x,y]]\right)\longrightarrow\left(C^3={\mathbb Q}[[x,y,z]]\right),

whose first differential was already written and whose second differential is given by (d^2m)(x,y,z)=m(y,z)-m(x+y,z)+m(x,y+z)-m(x,y) for any m\in{\mathbb Q}[[x,y]]. We shall further see that for "our" M, we have d^2M=0. Therefore in order to show that M is in the image of d^1, it suffices to show that the kernel of d^2 is equal to the image of d^1, or simply that H^2=0.
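The inductive step of the scheme can be sketched in a few lines of code. This is only an illustration under stated assumptions, not part of the argument: e_n is stored as a list of rational coefficients a_0,\ldots,a_n; for i,j\geq 1 with i+j=n+1 the coefficient of x^i y^j in M is -a_i a_j (since e_n(x+y) has no terms of degree n+1); and the coefficient of x^i y^j in d(c x^{n+1}) is -c\binom{n+1}{i}. The names extend and build are hypothetical.

```python
from fractions import Fraction
from math import comb

def extend(coeffs):
    """Given the coefficients a_0..a_n of e_n (n >= 1), return those of e_{n+1}."""
    deg = len(coeffs)  # this is n + 1, the degree being corrected
    candidates = set()
    for i in range(1, deg):
        j = deg - i
        # Coefficient of x^i y^j in the mistake M = e_n(x+y) - e_n(x) e_n(y).
        m_ij = -coeffs[i] * coeffs[j]
        # d(c x^deg) has coefficient -c * comb(deg, i) at x^i y^j,
        # so M = d(epsilon) forces this value of c:
        candidates.add(m_ij / (-comb(deg, i)))
    if len(candidates) != 1:
        # Different (i, j) demand different corrections: M is not in the image of d.
        raise ValueError("M is not in the image of d")
    return coeffs + [candidates.pop()]

def build(n):
    """Run the induction from the initial condition [Init] up to degree n >= 1."""
    coeffs = [Fraction(1), Fraction(1)]
    while len(coeffs) <= n:
        coeffs = extend(coeffs)
    return coeffs
```

The check that all candidates agree is the computational shadow of the claim that M lies in the image of d; the code does not explain why it succeeds, which is what the homological argument is for.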

Finding a Syzygy

[Figure: A Syzygy for Exponentiation]

So what kind of relations can we get for M? Well, it measures how close e_7 is to turning sums into products, so we can look for preservation of properties that both addition and multiplication have. For example, they're both commutative, so we should have M(x,y)=M(y,x), and indeed this is obvious from the definition. Now let's try associativity, that is, let's compute e_7(x+y+z) associating first as (x+y)+z and then as x+(y+z). In the first way we get

e_7(x+y+z) = M(x+y,z)+e_7(x+y)e_7(z) = M(x+y,z)+\left(M(x,y)+e_7(x)e_7(y)\right)e_7(z).

In the second we get

e_7(x+y+z) = M(x,y+z)+e_7(x)e_7(y+z) = M(x,y+z)+e_7(x)\left(M(y,z)+e_7(y)e_7(z)\right).

Comparing these two we get an interesting relation for M: M(x+y,z)+M(x,y)e_7(z) = M(x,y+z) + e_7(x)M(y,z) . Since we'll only use M to find the next highest term, we can be sloppy about all but the first term of M. This means that in the relation we just found we can replace e_7 by its constant term, namely 1. Upon rearranging, we get the relation promised for M:  d^2M = M(y,z)-M(x+y,z)+M(x,y+z)-M(x,y) = 0.
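The relation d^2M=0 can be sanity-checked numerically. The sketch below breaks the pretence of the text and peeks at the known answer: it assumes e_n is the truncated series \sum_{k\leq n}x^k/k!, so that the degree n+1 part of the mistake is M(x,y)=-\sum x^iy^j/(i!j!) over i,j\geq 1 with i+j=n+1. The function names M and d2M are chosen here for illustration only.

```python
from fractions import Fraction
from math import factorial

def M(x, y, n=7):
    """Degree n+1 part of e_n(x+y) - e_n(x) e_n(y), assuming e_n = sum_{k<=n} x^k / k!."""
    deg = n + 1
    return -sum(
        Fraction(1, factorial(i) * factorial(deg - i)) * x**i * y**(deg - i)
        for i in range(1, deg)
    )

def d2M(x, y, z):
    """The second differential applied to M, evaluated at a point (x, y, z)."""
    return M(y, z) - M(x + y, z) + M(x, y + z) - M(x, y)
```

Evaluating d2M with exact rational arguments, for example d2M(Fraction(1, 2), Fraction(-3), Fraction(5, 7)), gives exactly zero, and M(2, 3) equals M(3, 2), in line with the commutativity remark above.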

Computing the Homology, Easy but Limited

Now let's prove that H^2=0 for our (piece of) chain complex. That is, letting M(x,y) \in \mathbb{Q}[[x,y]] be such that d^2M=0, we'll prove that for some \epsilon(x) \in \mathbb{Q}[[x]] we have  d^1\epsilon = M .

Write the two power series as  \epsilon(x) = \sum{\frac{\alpha_i}{i!}x^i} and  M(x,y) = \sum{\frac{m_{ij}}{i!j!}x^i y^j} , where the \alpha_i are the unknowns we wish to solve for.

The coefficient of x^i y^j z^k in  M(y,z)-M(x+y,z)+M(x,y+z)-M(x,y) is

 \frac{\delta_{i0} m_{jk}}{j!k!} - {{i+j} \choose i} \frac{m_{i+j,k}}{(i+j)!k!} + {{j+k} \choose j} \frac{m_{i,j+k}}{i!(j+k)!} - \frac{\delta_{k0} m_{ij}}{i!j!}.

Here,  \delta_{i0} is a Kronecker delta: 1 if  i=0 and 0 otherwise. Since  d^2 M = 0 , this coefficient is zero. Multiplying by  i!j!k! (and noting that, for example, the first term doesn't need an  i! since the delta is only nonzero when  i=0 ) we get:

[m]    \delta_{i0} m_{jk} - m_{i+j,k} + m_{i,j+k} - \delta_{k0} m_{ij}=0.

An entirely analogous procedure tells us that the equations we must solve boil down to  \delta_{i0} \alpha_j - \alpha_{i+j} + \delta_{j0} \alpha_i = m_{ij} .

By setting  i=j=0 in this last equation we see that  \alpha_0 = m_{00} . Now let i and j be arbitrary positive integers. This solves for most of the coefficients: \alpha_{i+j}=-m_{ij}. Any integer at least two can be written as  i+j , so this determines all of the  \alpha_m for  m \ge 2 . We just need to prove that  \alpha_m is well defined, that is, that \alpha_{i+j}=-m_{ij} doesn't depend on  i and  j but only on their sum.

But when  i and  k are strictly positive, the relation [m] reads  m_{i+j,k} = m_{i,j+k} , which shows that we can "transfer"  j from one index to the other, which is what we wanted.

It only remains to find  \alpha_1 but it's easy to see this is impossible: if  \epsilon satisfies  d^1\epsilon = M , then so does  \epsilon(x)+kx for any  k , so  \alpha_1 is arbitrary. How do our coefficient equations tell us this?

Well, we can't find a single equation for  \alpha_1 ! We've already tried taking both  i and  j to be zero, and also taking them both positive. The only case left is taking one zero and one positive. Doing so gives two necessary conditions for the existence of the  \alpha_m :  m_{0r} = m_{r0} = 0 for  r>0 . So no  \alpha_1 comes up, and we're still not done. Fortunately setting one of  i and  k to be zero and the other positive in the relation for the  m_{ij} does the trick.
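The coefficient bookkeeping of this section can be replayed in code. The following is a minimal sketch, assuming the normalized coefficients m_{ij} are given as a dictionary keyed by (i,j) for i+j\leq N; the names d1_coeffs and solve_alpha are hypothetical, and the free parameter \alpha_1 is arbitrarily set to 0.

```python
def d1_coeffs(alpha, N):
    """Normalized coefficients m_ij (i + j <= N) of d^1 epsilon, where
    m_ij = delta_{i0} alpha_j - alpha_{i+j} + delta_{j0} alpha_i."""
    m = {}
    for i in range(N + 1):
        for j in range(N + 1 - i):
            m[i, j] = (alpha[j] if i == 0 else 0) - alpha[i + j] + (alpha[i] if j == 0 else 0)
    return m

def solve_alpha(m, N):
    """Recover alpha_0..alpha_N from the m_ij, or raise if M is not in the image of d^1."""
    # Necessary conditions: m_{0r} = m_{r0} = 0 for r > 0.
    for r in range(1, N + 1):
        if m[0, r] != 0 or m[r, 0] != 0:
            raise ValueError("boundary coefficients must vanish")
    alpha = [0] * (N + 1)
    alpha[0] = m[0, 0]
    # alpha[1] is a free parameter; it is left at 0 here.
    for s in range(2, N + 1):
        # alpha_s = -m_ij must not depend on how s is split as i + j.
        values = {-m[i, s - i] for i in range(1, s)}
        if len(values) != 1:
            raise ValueError("alpha is not well defined in degree %d" % s)
        alpha[s] = values.pop()
    return alpha
```

Feeding solve_alpha the output of d1_coeffs returns the original coefficients except for \alpha_1, which reflects the fact that \epsilon(x)+kx has the same image under d^1 as \epsilon(x).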

Computing the Homology, Hard but Rewarding

Let C_n^k denote the space of homogeneous degree n polynomials in (commuting) variables x_1,\ldots,x_k (with rational coefficients) and let d^k:C_n^k\to C_n^{k+1} be defined by d^k=\sum_{i=0}^{k+1}(-1)^i d^k_i, where (d^k_0f)(x_1,\ldots,x_{k+1}):=f(x_2,\ldots,x_{k+1}), (d^k_if)(x_1,\ldots,x_{k+1}):=f(x_1,\ldots,x_i+x_{i+1},\ldots,x_{k+1}) for 1\leq i\leq k and (d^k_{k+1}f)(x_1,\ldots,x_{k+1}):=f(x_1,\ldots,x_k). It is easy to verify that {\mathcal C}_n:=(C_n^\star, d) is a chain complex, and that (for k=1,2,3) it agrees with the degree n piece of the complex in [Complex]. We need to show that H^2({\mathcal C}_n)=0 for n>1 (we don't need the vanishing of H^2 for n=0,1 as these degrees are covered by the initial condition [Init]). This follows from the following theorem.

Theorem. H^1({\mathcal C}_1) is {\mathbb Q}; otherwise H^k({\mathcal C}_n)=0.

Proof (sketch). It is easy to verify "by hand" that \dim H^k({\mathcal C}_1)=\delta_{k1}. For n>1 let {\mathcal C}_1^{\otimes n} be the nth tensor power of {\mathcal C}_1, whose kth chain group is (C_1^k)^{\otimes n} and whose differential is defined using the diagonal action of the d^k_i's. The permutation group S_n acts on {\mathcal C}_1^{\otimes n} by permuting the tensor factors. If R denotes the trivial representation of S_n, then R\otimes_{S_n}{\mathcal C}_1^{\otimes n}={\mathcal C}_n, and so

H^\star({\mathcal C}_n) = H^\star(R\otimes_{S_n}{\mathcal C}_1^{\otimes n}) = R\otimes_{S_n}H^\star({\mathcal C}_1^{\otimes n}) = R\otimes_{S_n}H^\star({\mathcal C}_1)^{\otimes n}

and the result readily follows. Note that the last equality uses the Eilenberg-Zilber-Künneth formula, which holds because {\mathcal C}_n (and especially {\mathcal C}_1) is a co-simplicial space with the d^k_i's as co-face maps and with (s^k_i f)(x_1,\ldots,x_{k-1}):=f(x_1,\ldots,x_{i-1},0,x_i,\ldots,x_{k-1}) as co-degeneracies.
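The co-face maps d^k_i and the differential d^k defined above can be made concrete. Here is a small sketch, assuming a polynomial in k variables is stored as a dictionary from exponent tuples of length k to coefficients; the names coface and d are hypothetical. It lets one check on examples that d^1 and d^2 agree with the formulas used earlier and that the composition of two consecutive differentials vanishes.

```python
from math import comb

def coface(f, i, k):
    """Apply d^k_i to f, a dict {exponent tuple of length k: coefficient}."""
    out = {}
    for exps, c in f.items():
        if i == 0:
            # f(x_2, ..., x_{k+1}): the new first variable does not appear.
            terms = [((0,) + exps, c)]
        elif i == k + 1:
            # f(x_1, ..., x_k): the new last variable does not appear.
            terms = [(exps + (0,), c)]
        else:
            # Substitute x_i + x_{i+1} into slot i and expand binomially.
            a = exps[i - 1]
            terms = [
                (exps[:i - 1] + (t, a - t) + exps[i:], c * comb(a, t))
                for t in range(a + 1)
            ]
        for e, v in terms:
            out[e] = out.get(e, 0) + v
    return out

def d(f, k):
    """The alternating sum of the co-face maps, with zero terms dropped."""
    out = {}
    for i in range(k + 2):
        sign = (-1) ** i
        for e, v in coface(f, i, k).items():
            out[e] = out.get(e, 0) + sign * v
    return {e: v for e, v in out.items() if v != 0}
```

For instance, d({(2,): 1}, 1) computes y^2-(x+y)^2+x^2 and returns {(1, 1): -2}, while d({(1,): 1}, 1) returns the empty dictionary, matching the fact that \epsilon(x)=x lies in the kernel of d^1.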

Further Examples

Let us briefly list a number of other places in mathematics where similar "non-linear algebraic functional equations" need to be solved. The techniques we have developed to solve [Main] can be applied in all of those cases, though sometimes they are fully successful and sometimes something breaks down somewhere along the way.

In knot theory solving this equation is as easy as calculating 1+1=2 on an abacus:
[Figure: One Plus One on an Abacus]

See [Bar-Natan_Le_Thurston_03].

The "Duflo Homomorphism" Equation

This is the equation

(\Delta\otimes 1)\Upsilon = \Upsilon^{12}\Upsilon^{23},

written in \left(S(L^\star)_L\otimes S(L^\star)_L\otimes U(L)\right)^L, for an unknown \Upsilon\in \left(S(L^\star)_L\otimes U(L)\right)^L where L is a Lie algebra. With appropriate qualifications, S(L^\star)\otimes U(L)=\operatorname{Hom}(S(L),U(L)), and our equation becomes the statement "\Upsilon is an algebra homomorphism between the invariants of S(L) and the invariants of U(L)", and its solution is the so-called "Duflo homomorphism". This is a typical "homomorphism wanted" equation and it has many relatives, including our primary example e(x+y)=e(x)e(y) which is also known as the statement "the additive group of {\mathbb R} is isomorphic to the multiplicative group of {\mathbb R}_+". Note that in general, the equation for being a homomorphism, say \varphi(xy)=\varphi(x)\varphi(y), is non-linear in the homomorphism \varphi itself.

Where from cometh the syzygies? As in the case of the exponential function, they come from the fact that for a homomorphism, associativity in the target follows from associativity in the domain.

The "Formal Quantization" Equation

This is the equation

(f\star g)\star h = f\star(g\star h),
[Figure: The Pentagon for an Associative Product]

written for an unknown "\star-product", within a certain complicated space of "potential products" which resembles \operatorname{Hom}(V\otimes V,V) or V^\star\otimes V^\star\otimes V for some vector space V. This is a typical "algebraic structure wanted" equation, in which the unknown is an "algebraic structure" (an associative product \star, in this case) and the equation is "the structure satisfies a law" (the associative law, in our case). Note that algebraic laws are often non-linear in the structure that they govern (the example relevant here is that the associative law is "quadratic as a function of the product"). Other examples abound, with the "structure" replaced by a "bracket" or anything else, and the "law" replaced by "Jacobi's equation" or whatever you fancy.

Where from cometh the syzygies? From the pentagon shown on the right.

The Drinfel'd Pentagon Equation

This is the equation

\Phi(t^{12},t^{23}) \Phi(t^{12}+t^{13},t^{24}+t^{34}) \Phi(t^{23},t^{34}) = \Phi(t^{13}+t^{23},t^{34}) \Phi(t^{12}, t^{24}+t^{34}).

It is an equation written in some strange non-commutative algebra {\mathcal A}^h_4, for an unknown "function" \Phi(a,b) which in itself lives in some non-commutative algebra {\mathcal A}^h_3. This equation is related to tensor categories, to quasi-Hopf algebras and (strange as it may seem) to knot theory, and is commonly summarized by either of the following two pictures:

[Figure: Two Forms of the Pentagon]

This equation is a close friend of the Drinfel'd Hexagon Equation, and you can read more about both of them at [Drinfeld_90], [Drinfeld_91] and [Bar-Natan_97]. These equations have many relatives living in many further exotic spaces.

Where from cometh the syzygies? Here the equations come from the pentagon, so the syzygies must come from somewhere more complicated - the "Stasheff Polyhedron":

[Figure: The Stasheff Polyhedron]

The Drinfel'd Twist Equation

This is the equation

\Phi'(x,y,z)=F^{-1}(x+y,z)F^{-1}(x,y)\Phi F(y,z)F(x,y+z).

This equation is written in another strange non-commutative algebra {\mathcal A}_3, which is a superset of {\mathcal A}^h_3. The unknown F lives in some space {\mathcal A}_2, and \Phi and \Phi' are solutions of the Drinfel'd pentagon of before. You can read more about this equation at [Drinfeld_90], [Drinfeld_91] and [Le_Murakami_96]. This equation has many relatives living in many further exotic spaces.

Where from cometh the syzygies? Again from the pentagon, but in a different way.


The Braidor Equation

This is the equation

B(x_1,x_2,x_3) B(x_1+x_3,x_2,x_4) B(x_1,x_3,x_4)=B(x_1+x_2,x_3,x_4) B(x_1,x_2,x_4) B(x_1+x_4,x_2,x_3),

written in {\mathcal A}_4 for an unknown B\in{\mathcal A}_3. It is related to the picture on the right and through it to the third Reidemeister move of braid theory and knot theory. At present, the only place to read more about it is 06-1350/Homework Assignment 4.

Where from cometh the syzygies? The third Reidemeister move comes from resolving a triple point. The corresponding syzygy comes from resolving a quadruple point. See a picture at 06-1350/Homework Assignment 4.

Further Further Examples

This subsection is by definition forever empty, for if a worthwhile further further example comes to mind, mine or yours, it should be added as a subsection right above.


[Bar-Natan_97] D. Bar-Natan, Non-associative tangles, in Geometric topology (proceedings of the Georgia international topology conference), (W. H. Kazez, ed.), 139-183, Amer. Math. Soc. and International Press, Providence, 1997.

[Bar-Natan_Le_Thurston_03] D. Bar-Natan, T. Q. T. Le and D. P. Thurston, Two applications of elementary knot theory to Lie algebras and Vassiliev invariants, Geometry and Topology 7-1 (2003) 1-31, arXiv:math.QA/0204311.

[Drinfeld_90] V. G. Drinfel'd, Quasi-Hopf algebras, Leningrad Math. J. 1 (1990) 1419-1457.

[Drinfeld_91] V. G. Drinfel'd, On quasitriangular Quasi-Hopf algebras and a group closely connected with \operatorname{Gal}(\bar{\mathbb Q}/{\mathbb Q}), Leningrad Math. J. 2 (1991) 829-860.

[Le_Murakami_96] T. Q. T. Le and J. Murakami, The universal Vassiliev-Kontsevich invariant for framed oriented links, Compositio Math. 102 (1996) 41-64, arXiv:hep-th/9401016.