Gaussian Quadratures

Last time I talked about quadratures and orthogonal polynomials, so now it’s time to combine the two.

Theorem: A quadrature scheme with n interpolation points will be exact for polynomials of order up to 2n-1 if the quadrature points are chosen to be the roots of the associated nth-order orthogonal polynomial.

Note that, for an arbitrary choice of sample points, we would generally expect n points to give a quadrature that is exact for polynomials of order n-1, so 2n-1 is a substantial improvement. For example, a single interpolation point can exactly integrate polynomials of order one rather than just zero, as we saw in our original example, and two interpolation points can integrate polynomials of order three rather than one, even better than our original example. Three points can integrate exactly up to order five!

Proof: We can split any (2n-1)-order polynomial P_{2n-1}(x) according to the orthogonal polynomial f_n(x). In particular, we can use the decomposition

P_{2n-1}(x) =P_{n-1}(x)f_n(x)+Q_{n-1}(x),

where P_{n-1} and Q_{n-1} are polynomials of order n-1. This decomposition is always possible: since f_n is fixed, it is just polynomial division of P_{2n-1} by f_n, with quotient P_{n-1} and remainder Q_{n-1}. Counting coefficients, a (2n-1)-order polynomial has 2n coefficients to specify, and P_{n-1} and Q_{n-1} have 2n coefficients between them, so we have exactly as many free coefficients as we need.

When we integrate over this expression, we see that

\int_a^b P_{2n-1}(x)\, \textrm{d}x =\int_a^b (P_{n-1}(x) f_n(x)+Q_{n-1}(x) ) \,\textrm{d}x =\int_a^b Q_{n-1}(x) \,\textrm{d}x ,

since f_n(x) is orthogonal to any polynomial of order n-1, and in particular to P_{n-1}, regardless of our choice of integration points. We therefore have the equality

\int_a^b P_{2n-1}(x)\, \textrm{d}x =\int_a^b Q_{n-1}(x) \,\textrm{d}x = \sum_{k=1}^n \omega_k Q_{n-1}(x_k) ,

where the second equality holds because an n-point quadrature with weights \omega_k = \int_a^b l_k(x)\,\textrm{d}x is exact for any polynomial of order n-1, whatever the points x_k. It remains to show that \sum_{k=1}^n \omega_k P_{2n-1}(x_k) is equal to these expressions. But, if we take x_k to be the roots of f_n(x), then this follows immediately, since the first term in the decomposition of P_{2n-1}(x_k) vanishes at every x_k, and so P_{2n-1}(x_k) = Q_{n-1}(x_k). \, \square

We can, therefore, choose our interpolation points to be the roots of an orthogonal polynomial to optimise our quadrature method, with respect to the order of polynomials whose integrals are computed exactly. For our case over the interval [-1,1], this gives the first optimal sets of interpolation points as

\left\{0\right\}, \, \left\{-\frac{1}{\sqrt{3}} ,\frac{1}{\sqrt{3}}\right\}, \, \left\{-\sqrt{\frac{3}{5}} ,0,\sqrt{\frac{3}{5}} \right\}, \ldots
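As a quick numerical check, here is a minimal sketch assuming NumPy is available: leggauss returns the roots of the nth Legendre polynomial together with the matching weights, and the resulting n-point quadrature reproduces the exact integral of a random polynomial of order 2n-1 over [-1,1].

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

for n in (1, 2, 3):
    nodes, weights = leggauss(n)                     # roots of the nth Legendre polynomial
    poly = Polynomial(rng.standard_normal(2 * n))    # random polynomial of order 2n-1
    antideriv = poly.integ()
    exact = antideriv(1.0) - antideriv(-1.0)         # true integral over [-1, 1]
    estimate = np.sum(weights * poly(nodes))         # n-point Gaussian quadrature
    print(n, nodes, np.isclose(exact, estimate))     # True for every n
```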

Weighted Quadratures

As a final extension, suppose the function we are integrating is not polynomial, but it is a polynomial times some other function — w(x), say — that is explicitly known. In this case, we wish to estimate

\int_a^b w(x)f(x)\, \textrm{d}x.

For this case, we can proceed as before, except that now we choose polynomials f_i(x) that are orthogonal, in the sense that

\langle f_i,f_j\rangle = \int_a^b w(x)f_i(x)f_j(x)\, \textrm{d}x = 0.

In other words, the known function w(x) becomes a weight function inside the integral, changing our choice of orthogonal polynomials, and of interpolation points. Common examples on the interval [-1,1] are the Jacobi polynomials for w(x) = (1-x)^{\alpha}(1+x)^{\beta}, and the Chebyshev polynomials of the first kind for w(x) = 1/\sqrt{1-x^2}.
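As a sketch of how this works in practice, the Gauss-Chebyshev rule has a convenient closed form: with n points at the roots of the nth Chebyshev polynomial, x_k = \cos((2k-1)\pi/2n), every weight equals \pi/n. The snippet below (assuming NumPy) uses three points to integrate x^4/\sqrt{1-x^2} over [-1,1], whose exact value is 3\pi/8.

```python
import numpy as np

def gauss_chebyshev(f, n):
    """Estimate the integral of f(x) / sqrt(1 - x^2) over [-1, 1] with n nodes."""
    k = np.arange(1, n + 1)
    nodes = np.cos((2 * k - 1) * np.pi / (2 * n))   # roots of the nth Chebyshev polynomial
    weight = np.pi / n                              # all weights are equal for this rule
    return weight * np.sum(f(nodes))

# f(x) = x^4 is order four, so n = 3 nodes (exact up to order 2n-1 = 5) suffice.
estimate = gauss_chebyshev(lambda x: x**4, 3)
print(estimate, 3 * np.pi / 8)   # both are ~1.1780972
```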

That’s about all I have to say. Gaussian quadratures are a reasonably straightforward step-up from basic quadratures, but give a sizeable increase in efficiency if we’re dealing with some function of known type, with an unknown polynomial term.


Quadratures and Orthogonal Polynomials

It’s been about five months since I said this post was in draft, so it’s about time I reined in my perfectionism and published the damn thing.

Since this is a graduate student blog at the moment, it seems reasonable I should write a bit more about what I’m learning at any particular time. Late last year, our department had a talk by Erik van Doorn, from the University of Twente, which looked at birth-death processes, and asymptotic results for how quickly they can be expected to converge to their equilibrium distribution. It was an overview of his paper Representations for the decay parameter of a birth-death process based on the Courant-Fischer theorem, to appear in the Journal of Applied Probability.

A good deal of the talk introduced and used basic results on orthogonal polynomials, so I went to see if any of my books mentioned the subject. It turned out there was a chapter on them in Approximation Theory and Methods by Michael Powell – a book that’s been on my bookshelf for about five years but hardly been read – regarding their use in Gaussian quadratures. The following is mostly spliced together from that chapter, and my undergraduate notes on Numerical Analysis.

Interpolation

Before we talk about quadratures, it’s best if we start with interpolation. Say we have some function over some interval, where we can take a few sample values, with no measurement error, but we have no explicit formula and can’t afford to sample it everywhere. We thus would like to use our sample values to fit an approximating function to the whole interval. One simple way to do this is to try to fit a polynomial through the sample points. We can do this by assigning each sample point a Lagrange polynomial

l_k(x) = \prod_{n \neq k} \frac{x-x_n} {x_k-x_n} ,

with value one at that sample point and zero at all the others. For example, if we take our sample points at -1,-0.5,0,0.5, and 1, then the Lagrange polynomials are those in the plot below. There’s a light grey line at one to help check they are equal to one or zero at the sample points.

Lagrange Polynomials

Our fitted curve will then just be a sum of these Lagrange polynomials, multiplied by their corresponding sample value, so we get a polynomial passing through all the sample points, and estimate the function f(x) as

\hat{f}(x) = \sum_k f(x_k) l_k(x) .

Interpolation Example
This gives a curve that passes through all the interpolation points with the smallest-order polynomial possible. It works well for estimating functions that are, indeed, polynomials, but for other functions it can run into problems. In particular, there are cases where the difference between the interpolation curve and the true function at certain points increases when we increase the number of sample points (Runge’s phenomenon), so we can’t necessarily improve the approximation by adding points. There’s also the question of where to sample the original function, if we have control over that. I’ll pass over these issues, and move on to integration.
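To make this concrete, here is a minimal sketch of the construction in Python, assuming NumPy; the sample points are the ones from the plot above, and cosine is just a stand-in for the unknown function.

```python
import numpy as np

def lagrange_basis(sample_x, k, x):
    """Evaluate the kth Lagrange polynomial for the given sample points at x."""
    terms = [(x - xn) / (sample_x[k] - xn)
             for n, xn in enumerate(sample_x) if n != k]
    return np.prod(terms, axis=0)

def interpolate(sample_x, sample_y, x):
    """Evaluate the fitted curve: each sample value times its Lagrange polynomial."""
    return sum(yk * lagrange_basis(sample_x, k, x)
               for k, yk in enumerate(sample_y))

sample_x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
sample_y = np.cos(np.pi * sample_x)          # stand-in for the function being sampled
x = np.linspace(-1, 1, 201)
f_hat = interpolate(sample_x, sample_y, x)   # passes through every sample point
```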

Quadratures

Now say that, instead of approximating a function with some samples, we want to approximate a function’s integral by sampling its value at a few points, or

\int_a^b f(x) \, \textrm{d} x \simeq \omega_0f(x_0) +\omega_1f(x_1) +\ldots +\omega_nf(x_n) .

If we want to focus on making the integration accurate when f is a low-order polynomial, the quadrature with n+1 sample points is exact for polynomials up to order n if we set the weights as

\omega_k = \int_a^b l_k(x) \, \textrm{d} x.

In other words, a quadrature is equivalent to fitting an interpolation curve, and integrating over it. For example, if we’re integrating a function over the interval [-1,1] , we could simply take one sample, with weight two. This would give a quadrature of 2f(x_0) , which is exact for any zero-order, constant function, regardless of the position of x_0.

We could take samples at the endpoints, to get the quadrature f(-1) + f(1) , and we can set f(x) to be constant, or proportional to x, to see that the result is exact for first-order polynomials.

We could also take the endpoints and the midpoint. Then we have \frac{1} {3} f(-1) +\frac{4} {3} f(0) +\frac{1} {3} f(1) , which is exact for polynomials up to order two.
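To check the weights in the three examples above, we can integrate each Lagrange polynomial directly. A minimal sketch, assuming NumPy’s Polynomial class:

```python
import numpy as np
from numpy.polynomial import Polynomial

def quad_weights(nodes, a=-1.0, b=1.0):
    """Weights w_k: the integral over [a, b] of the kth Lagrange polynomial."""
    weights = []
    for k, xk in enumerate(nodes):
        l_k = Polynomial([1.0])
        for n, xn in enumerate(nodes):
            if n != k:
                l_k = l_k * Polynomial([-xn, 1.0]) / (xk - xn)   # (x - x_n)/(x_k - x_n)
        antideriv = l_k.integ()
        weights.append(antideriv(b) - antideriv(a))
    return np.array(weights)

print(quad_weights([0.0]))             # [2.]               -> 2 f(0)
print(quad_weights([-1.0, 1.0]))       # [1. 1.]            -> f(-1) + f(1)
print(quad_weights([-1.0, 0.0, 1.0]))  # [0.333 1.333 0.333] -> Simpson-style rule
```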

However, occasionally we stumble onto a quadrature that does a little better than expected. For the first quadrature above, since our interval is symmetric around zero, if we let x_0=0 then any first-order term is antisymmetric around this midpoint: it integrates to zero and contributes nothing at x_0=0, so this quadrature is exact for first-order polynomials too. The same happens for the third quadrature: a cubic term is antisymmetric around the midpoint, so that rule is actually exact for polynomials up to order three. The second quadrature gets no such bonus, because a quadratic term is symmetric rather than antisymmetric, so it can only handle first-order polynomials and fails on quadratics.

Considering what happened when we placed the sample point for the first quadrature at zero, we might guess this is something to do with where we place our sample points. If so, how should we place our sample points, and what’s the highest-order polynomial we can exactly integrate with any set number of samples? To answer this, we can use orthogonal polynomials.

Orthogonal polynomials

We say two vectors are orthogonal when their inner product is equal to zero. For example, if the inner product is simply the dot product, then

\langle x,y \rangle = \sum_{k=1}^n x_ky_k =0,

and so vectors are orthogonal if they are perpendicular to each other.

We have a similar definition and example for orthogonal polynomials, but now we choose an inner product that integrates over an interval instead of summing over two vectors’ dimensions:

\langle f,g \rangle = \int_a^b f(x)g(x)\,\textrm{d} x =0.

We can then choose a sequence of polynomials with increasing order that are all orthogonal to each other. For example, we can start the sequence with f_0(x)=1, or some multiple of it. We then seek a first-order polynomial f_1(x)=Ax+B such that

\int_a^b 1\times(Ax+B) \,\textrm{d} x =\frac{A} {2} (b^2-a^2) +B(b-a) =0.

This can be any multiple of x-(b+a) /2 . In many cases we wish the orthogonal polynomials to be orthonormal, i.e. \langle f_k,f_k\rangle =1, so for the above we require

\int_a^b C_0^2 \,\textrm{d}x = C_0^2 (b-a) = 1,

\begin{aligned} \int_a^b C_1^2(x-(b+a)/2)^2 \,\textrm{d}x &= C_1^2 \frac{1}{3} \left[\left(b-\frac{b+a}{2}\right)^3 -\left(a-\frac{b+a}{2}\right)^3\right] \\&= C_1^2 \frac{2} {3} \left(\frac{b-a} {2}\right)^3 \\&=1,\end{aligned}

and so on, giving a condition for the value of each scaling factor C_k. We can then find the next term by looking for a second-order polynomial that is orthogonal to 1 and x, and so on. In the case where a=-1 and b=1 this gives a simple sequence of polynomials that begins with

f_0(x)=\frac{1}{\sqrt{2}} ,\, f_1(x)=\sqrt{\frac{3}{2}} x,

f_2(x)=\sqrt{\frac{5}{2}} (\frac{3}{2} x^2-\frac{1}{2}) ,\, f_3(x) =\sqrt{\frac{7}{2}} (\frac{5}{2}x^3-\frac{3}{2}x),\ldots

This is an orthonormal version of the Legendre polynomials.
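As a sanity check, the same sequence can be built numerically by Gram-Schmidt on the monomials 1, x, x^2, \ldots, using the inner product above. A rough sketch, assuming NumPy’s Polynomial class; the printed coefficients match f_0 to f_3 up to rounding.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(f, g, a=-1.0, b=1.0):
    """<f, g>: the integral of f(x) g(x) over [a, b], for NumPy Polynomials."""
    h = (f * g).integ()
    return h(b) - h(a)

def orthonormal_basis(order, a=-1.0, b=1.0):
    """Gram-Schmidt on 1, x, x^2, ... under the inner product above."""
    basis = []
    for n in range(order + 1):
        p = Polynomial([0.0] * n + [1.0])             # the monomial x^n
        for q in basis:                               # subtract components along earlier terms
            p = p - inner(p, q, a, b) * q
        basis.append(p / np.sqrt(inner(p, p, a, b)))  # normalise so <f_n, f_n> = 1
    return basis

for f in orthonormal_basis(3):
    print(f.coef.round(4))
# coefficients, lowest order first:
# [0.7071], [0. 1.2247], [-0.7906 0. 2.3717], [0. -2.8062 0. 4.6771]
```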

Since any polynomial can then be expressed as a linear combination of members of this sequence, each polynomial in the sequence is also orthogonal to any polynomial with lower order. So, for example, f_3 is orthogonal to all constant, linear, and quadratic polynomials.

To be continued
The next post will explain why these orthogonal polynomials help us decide on interpolation points.

After a few more posts, I’m planning to return to quadratures to talk about something I’ve mentioned on other topics: since the above procedure for quadratures gives a point estimate, we have no idea of how much uncertainty we have in our estimate. I’m therefore going to talk a bit about Bayesian quadratures. In particular, I’m going to start with a 1988 paper by Persi Diaconis called “Bayesian Numerical Analysis”, and fill in the gaps for those of us whose knowledge of integrated Brownian motion isn’t quite up to speed.