]> Ehrhart polynomials and integer points in polytopes

Back to the blog entry

Technical remark: This text makes heavy use of MathML and SVG, using the compound XHTML+MathML+SVG DTD. Firefox should render it just fine, Safari renders the SVG but chokes on the MathML; I don't know about other browsers. This page will not validate because of some insane non-symmetry in how the svg tag and/or namespace are treated. I have no idea who is responsible for that mess, and since the math tag works in the expected and logical way, I refuse to fix this page to pander to the insanity that the W3 validator requires. See this website on the compound DTD for some background information.

Scaling polygons

Our goal is to prove the existence of Ehrhart polynomials (warning: link contains spoilers!), but we start this journey slowly, by looking at what happens to the area of a polygon when we scale it, in the following example by 2.

A polygon of area 1.5 The same polygon, scaled by a factor of 2

How do we compute the area of these polygons? One way is to overlay our pictures with a grid of unit squares, like so:

A polygon of area 1.5 The same polygon, scaled by a factor of 2

The original polygon consists of one unit square plus one half unit square and therefore has an area of 1.5. Similarly, we can count the unit squares for the scaled polygon and obtain its area, which is 6.

There is a different way of looking at it, though. If we take the original polygon on the unit square grid and scale both the polygon and the grid at the same time, we get the following picture.

The same polygon, scaled by a factor of 2

The relationship between polygon and grid has remained the same: The polygon still consists of one square plus one half square. Therefore we only need to know the area of the scaled squares to determine the area of the scaled polygon. But the area of a unit square scaled by k is just k2, and so the area of our polygon scaled by k is simply 1.5k 2.

In our argument, we only used the fact that the relationship between polygon and grid does not change. The particular shape of the polygon was irrelevant. In fact, we didn't even really need that it was a polygon: it could have been a circle or some really odd shape as well – the argument becomes a bit hand-wavey, relies a lot on intuition, and making it more robust takes some work. In any case, I hope I was able convince you that when we have any two-dimensional shape with area a and scale it by a factor k, the new shape will have area ak2.

What happens when we look into higher dimensional spaces? Now we are talking about cylinders, pyramids, prisms and other high-dimensional bodies. Let us compare them with a grid of unit cubes to determine their volume – after all, a square is simply a two-dimensional cube – and we will see that again when we scale both the bodies and the grid their relationship does not change.

So what is the volume of a unit cube when we scale it by k? The answer depends on the dimension d. We have seen the case d=2 before, where the answer was k2. Where did this answer come from?

We can intuitively see it once more from counting unit squares within the scaled square, at least when k is integer (it also works when k is non-integer, but it's a bit less intuitive).

A square of side-length 3

In this case, k=3 and the scaled square consists of kk=33=9 unit squares. That is where the answer k2 comes from.

So in d=3, by the same unit cube counting argument, the solution will be k3 – just picture a larger cube, like a Rubik's cube, built out of smaller cubes. From there it shouldn't be hard to see that in general, the volume of a cube of side length k is kd.

Returning to our previous argument that scaling a body and the unit cube grid simultaneously does not change their relationship, we can conclude that when we scale a d-dimensional body of volume a by a factor of k, the resulting body will have volume akd.

Integer points

Let us return to two-dimensional polygons for a while. We know how to deal with the area, but what happens if we're interested in the number of integer points?

A polygon of area 1.5 The same polygon, scaled by a factor of 2

The original polygon contains 5 integer points and scaling by 2 gives us a polygon with 12 integer points. It is clear already that the old rule "multiply by k2" is not true here.

Also, if we think of scaling as a smooth process, where we smoothly increase the scale factor k as we scale our polygon, the area changes smoothly as well, but the number of integer points certainly does not. Rather, the number of integer points jumps. For this reason, it seems to make sense to limit our thought to polygons whose vertices (i.e. the corners) are integer points, and to limit the scale factor k to be integral as well. This guarantees that we will always have polygons with integer vertices, which are also called lattice polygons.

But back to trying to understand what happens to the integer points when scaling such a polygon. Perhaps we should look at a very simple example first.

A square of side-length 1 A square of side-length 3

The left picture is just the unit square unscaled, which is the same as scaling by k=1. In the right picture, the unit square is scaled by k=3. We see that the number of integer points on each side is always k+1, and so the total number of points is (k+1) (k+1) = k2+2k+1 . Being a polynomial of degree 2, that's a fairly reasonable expression. Now let us look at another example.

A square of side-length 1 A square of side-length 3

How does one determine the number of integer points in these triangles? Note that the topmost row of integer points contains one point, the second row contains two points, the third row contains three, and so on. So we need to calculate the sum 1+2+3+...+k+ (k+1) There is a famous anecdote about how Gauß solved this problem as a pupil, but you really don't have to be one of the most prolific mathematicians of all time to get it. The simple idea is that when you combine the first row with the last row, then the second row with the second to last row, and so on, you always get the same number of points in those combinations – in our case, k+2. How many of those combinations are there? Well, we have k+1 rows to begin with, and then we pair them up, giving us (k+1)/2 pairs. So the total number of points is (k+2) k+1 2 We should really say something about the case when the number of rows is odd (because in that case, the middle row cannot be paired up), but it does work out fine in that case as well.

As an aside: If you look at analogous shapes in higher dimension, you will encounter analogous summation formulas.

Gluing things together

We now understand what happens when scaling the unit square and the particular triangle of the last example. What about the polygon we started out with? Note how this polygon is actually the union of a unit square and a triangle:

Our original polygon is the union of two smaller shapes

It would be nice if we could just add the number of integer points from the two smaller shapes. That doesn't quite work, however, because we would count those points twice that lie on the line segment where the two shapes meet. We need to correct that by subtracting again the number of points in this intersection. We can express this idea with the following equation:

Our original polygon is the sum of two smaller shapes minus their intersection = +

For somebody who was conditioned by school to think that mathematics is about numbers this may be hard to believe, but mathematicians really do use equations like this. From this equation, we can immediately fill in the facts we have learned above to get the number of integer points in our polygon (this is actually an application of a linear function from the vector space generated by characteristic functions of polytopes – a.k.a. the algebra of polytopes – to the vector space of functions): (k2+2k+1) + (k+2) k+1 2 - (k+1) The last part of this expression comes from the number of integers in the line segment which we subtract – by now you should be familiar with how one can obtain the corresponding formula. This expression is a bit unwieldy. Of course we could simplify it, but that's not the point. The point is that we obtained it by a succession of small, logical, and verifiable steps and should therefore be convinced that it will give the correct value for all integral scaling factors k (unless we made a silly mistake on the way).

Mathematics is also about observing patterns, and there is one particular pattern that is beginning to emerge. For all the two-dimensional polygons we have looked at so far, the answer was eventually some polynomial of degree 2, that is, some expression that could be simplified to the form a2k2+ a1k+ a0 . We have looked at one one-dimensional polytope (the line segment) and obtained a polynomial of degree 1. Is this a general pattern? Will we obtain a polynomial of degree d for each d-dimensional polytope?

First of all, note that – while certainly somewhat surprising – this would not be an entirely implausible result. After all, we already know that the volume of a polytope scales asymptotically like kd as k goes to infinity, that is, the volume of a polytope is Θ (kd) . One would intuitively expect that the number of integer points is therefore also Θ (kd) . Of course, such an intuition is not proof. Even if the intuition is right, this is still far from showing that the number of integer points is a polynomial in the scaling factor. After all, there could be non-polynomial lower order terms.

We have already seen an important tool for approaching this question, namely gluing things together. Conversely, we can decompose a complicated shape into simpler shapes. This is not entirely trivial in higher dimensions, but at least in two dimensions you should be able to convince yourself that one can decompose any polygon into triangles – this is called a triangulation. We can then get the number of integer points in any polygon by adding the number of points in all those triangles, then subtracting points on line segments where triangles meet, and possibly adding again single points where three or more triangles meet. If we can prove our conjecture for all those simple shapes, then it will be true for all polygons, because adding and subtracting the corresonding polynomials will again give us a polynomial of degree d.

Note that there is an important point here that I am ignoring. Adding and subtracting polynomials can never result in a polynomial of higher degree, but it can, in general, result in a polynomial of lower degree. This will not happen in our situation, however. Can you figure out why?

Simplices and cones

So we are back at trying to understand what happens when we scale a simplex. Simplices are the d-dimensional generalization of line segments, triangles, and tetrahedra. To be more concrete, a d-dimensional simplex Δ is the convex hull of d+1 affinely independent points which we call a0, a1, ..., ad . Any point xΔ can be written in a unique way as a convex combination of those points, that is x= i=0 d λi ai where λi0 and i=0 d λi =1 .

A triangle x x
a1 a_1

The vector ( λ0, λ1, ..., λd ) gives the barycentric coordinates of x.

Since we want to scale the simplex, it makes sense to have one vertex of the simplex which remains fixed under scaling. By translating the simplex, we can assume that a0 =0 . Then the remaining vectors span a simplicial cone. Let's look at an example.

The cone generated by the simplex

The cone is generated by the vectors a1 and a2 which are vertices of the indicated fundamental simplex. With this perspective, we can write the simplex as a set of conic combinations Δ= { x: x= i=1 d λi ai , λi0, i=1 d λi 1 } . This representation encourages us to divide the cone into zones according to the value of i=1 d λi :

The cone generated by the simplex

This in turn is directly linked to scaling the simplex by an integer factor k. Remember that we want to count the integer points in those scaled simplices, ideally in terms of some fundamental unit of the cone. One way to approach this is by tiling the cone.

The vectors ai span the fundamental parallelepiped Π= { x: x= i=1 d λi ai , 0λi<1 } . As indicated in the next picture, the cone can indeed be tiled into disjoint translates of Π of the form y+Π where y is a non-negative integer combination of the ai.

Fundamental parallelepipeds in the cone

Let us try combining the last two pictures in an attempt to count the integer points of a multiple of Δ.

Using the cone to understand scaled simplices

In this example, the fundamental parallelepiped decomposes into two zones, colored orange and olive, and the scaled simplex consists of translates of these two zone types.

Note that this is where two-dimensional pictures start being misleading for the higher dimensional case. In general, the fundamental parallelepiped decomposes into d zones, based on the value of i=1 d λi which is between 0 and d for points of Π. Translates of these zones indeed give you the scaled simplex, but the zones themselves are no longer simplices. Only the two end zones between 0 and 1 and between d-1 and d are simplices. You can visualize this in three dimensions by slicing a cube accordingly.

Nevertheless, the basic idea is now fairly simple: The translates of one such zone always look the same as far as integer points are concerned; in particular, they always have the same number of integer points. So we only need to count how many translates of which zone are contained in the scaled simplex kΔ and we're done. Unfortunately, there's still a catch. Let us look at the integer points again.

Using the cone to understand scaled simplices

In order to count the integer points reliably, we have to assign each integer point to exactly one of the translated zones. Otherwise, we would count points more than once and get an incorrect result. The trick we will use here is two-fold: make the zones half-closed in the right way, and decompose the cone into the interiors of its faces. Let's do the second step first. Every point of the cone is in the interior of exactly one of the cone's faces. This is illustrated for integer points in the following picture.

The cone generated by the simplex

In the picture, we have a large group of teal points which are in the interior of the cone itself. Then there are two groups of blue points which lie in the interiors of the one-dimensional faces of the cone. Those boundary rays are in fact one-dimensional cones, and so we can treat those blue points separately using the same argument in lower dimension or, more formally, using induction. The single black point belongs to the interior of a zero-dimensional cone, and so the same argument applies. This means that we can concentrate on the teal points in our argument.

Using the cone to understand scaled simplices

Now it should be clear how to assign the integer points to zones. By making the zones half-closed in the right way, we will consistently assign one point to each orange zone and three points to each olive zone in this example.

Counting the zones

Let us try to understand those zones better before we count them. The fundamental parallelepiped is divided into d zones Π1 ,..., Πd where Πj= { x: x= i=1 d λi ai , 0<λi1 , j-1< i=1 d λi j } Note that I have changed the direction of half-closedness to match the previous discussion. Every point of the scaled half-closed simplex kΔ belongs to exactly one translated zone of the form y+Πj where y= i=1 d μi ai with μi and i=1 d μi k-j . The last condition comes from the fact that every point in kΔ has a sum of coefficients of at most k, which is distributed to the part that comes from y and the part that comes from Πj.

The number of translates of zone Πj is thus the number of possible vectors of natural numbers ( μ1, ..., μd ) whose component sum is at most k-j. I will call such vectors valid from now on. But how can we count those valid vectors?

We can think of it in this way: We have k-j black balls, and we want to assign those in an arbitrary way into d groups. There may be some balls which are not assigned to any of the groups, and we only care about the number of balls assigned to each group. Let's put those k-j balls in a row and add d more balls. Then we mark d arbitrary balls in red so that we again have k-j black balls, but they are now separated into groups. Here's an example with k-j=7 and d=3.

Counting natural number vectors with a bounded component sum

Now we collect black balls starting from the left until we hit the first red ball and assign those black balls to the first group. We discard the red ball. Then we continue collecting black balls until the next red ball and assign them to the second group, and so on. In this example, we get the valid vector (2,0,3). Note that the last two black balls in this example are discarded and not assigned to any group, because they appear after the last red ball.

The crucial observation that makes this analogy work is that every such coloring of balls corresponds to exactly one valid vector, and every valid vector corresponds to exactly one coloring of balls. Therefore the number of valid vectors (and thus the number of translates of zone Πj) is exactly the number of possible colorings.

In how many ways can we choose which d out of the k-j+d balls we are coloring red? The answer is the binomial coefficient ( k-j+d d ) . In fact, this binomial coefficient is even pronounced "k-j+d choose d". Since we can treat j and d as if they were constant, this binomial coefficient is in fact a polynomial in k of degree d, which proves that the number of zones is a polynomial of degree d, which completes the proof that the number of integer points in a polytope scaled by an integer scaling factor k is a polynomial in k of degree d.

Wait, what?

The funny thing about mathematics is that it is usually written down in the wrong order. When you read a typical textbook, it starts with small propositions and lemmas which ultimately lead, just before the end of the book, to the big theorems in the field that the book covers. This reflects the structure of formal mathematical proofs, where one starts from definitions and axioms to prove ever more complicated statements about the objects of interest. However, when mathematics is practiced, one often starts at the end, with the big statement that one wants to prove, and then tries to find possible approaches, thereby reducing the problem into a growing number of smaller problems. I have tried to give an account of such a development in this text, which is why the end of the last section may have come a little abruptly. It is worth it, perhaps, to think about how one would lay out this development in the style of more formal proofs.

Remember, the final goal is to prove that given any d-dimensional lattice polytope, the number of integer points in that polytope scaled by k is a polynomial in k of degree d.

First, we would talk about how a half-closed scaled simplex can be divided into translates of zones. We then see that the number of translates of each zone is a polynomial. Since the number of points in the scaled simplex is just the number of points per zone (which does not depend on the scaling factor) times the number of translates of the zone, summed up over all d zone types, it follows that the number of points in the half-closed simplex is also a polynomial in the scaling factor. Then we can argue that we can decompose each d-dimensional simplex into a half-closed d-dimensional simplex and many lower-dimensional half-closed simplices. We already know that the number of integer points in each of those half-closed simplices is a polynomial. The number of points in the full simplex is just the sum of those polynomials, and is therefore again polynomial in the scaling factor.

After that, we see that every polytope can be written as a linear combination of simplices, where the d-dimensional simplices are added and the lower-dimensional simplices are either added or subtracted. Again, this means that the number of integer points of the polytope can be computed by adding and subtracting all these polynomials for the simplices, which finally gives us the conclusion that for every polytope, the number of integer points in the polytope that one obtains by scaling with k is a polynomial in k of degree d. This polynomial is called the Ehrhart polynomial of the polytope.

As you can see, this account is in a very different order from the order in which those ideas actually appeared in the text.

Further topics

Understanding the coefficients of the Ehrhart polynomial is very difficult. We can say something about the leading two coefficients and the constant coefficient, which means that the problem is understood in two dimensions. In this case, the Ehrhart polynomial is essentially a different perspective on Pick's theorem, which relates the number of integer points in a lattice polygon to its area and boundary. Unfortunately, things are more complicated even in three dimensions. One cannot expect to find an easy solution in higher dimension, because computing the Ehrhart polynomial includes counting the number of integer points, which is already a very difficult problem – but that is a topic for another time.