There are many effective solvers based on a variety of methods and
implementations for approximating the solution yL(t) of the
initial value problem (IVP) for a system of ordinary differential
equations (ODEs),
| (1) |
| (2) |
| (3) |
Event location arises naturally in many models as a way of dealing with discontinuous behavior. A nice survey of the task and methods that have been proposed for dealing with it form Chapter 6 of the monograph [5] on the dynamics of multibody problems. The task also arises in the solution of delay-differential equations because of discontinuities that are induced by a lack of smoothness and propagated by the effects of delays, c.f. [4,13]. Despite a good deal of attention given to the task, little has been done to show that the numerical procedures proposed actually ``work''. The most general results are those of Mannshardt [12] who investigates one-step methods. He shows that with suitable assumptions, if a one-step method of order p is used to integrate the IVP with constant step size h and events are located accurately enough, the numerical approximation converges uniformly of order p over the whole interval of integration. We prove here similar convergence results with assumptions that describe most modern IVP solvers. Specifically we assume that the solver produces a piecewise polynomial solution y**(t) so that it is practical to locate events as accurately as possible in the precision available. We assume that when given a tolerance t, the solver produces a solution y**(t) that is uniformly accurate of order r(t). We assume only that r(t) is monotone and r(t)® 0 as t® 0. To be concrete, r(t) = t for some popular methods and implementations. Also, for a fixed order p, other implementations have r(t) = t(p+1)/p. Whatever the rate of convergence, we prove that the same rate is achieved in the presence of isolated events. For some methods and implementations it is known that the solver produces a solution with an error at t that to leading order is equal to e(t)r(t) when there are no events. We go on to prove that with such a solver, the error is everywhere proportional to r(t) except in intervals of length O(r(t)) containing the events. These theoretical results are in accord with the numerical results just cited from [6], and we provide numerical results obtained with a wide range of solvers that confirm this behavior.
We assume that the function fL of (1) is as smooth as necessary for the arguments that follow. It is then a standard result that the IVP consisting of (1) and y(a) = A is well-posed. If the first event occurs at t*, yR(t) is defined as the solution of (2) with initial value y(t*) = yL(t*). Here fR is another smooth function. It is worth remark that the initial value for the second integration might be related to yL(t*) in a more complicated way. The classic bouncing ball problem is an example for which the sign of a component is changed and the magnitude is reduced by a coefficient of restitution at the event of the ball rebounding from the ground. Our analysis applies to such problems, but for simplicity we restrict ourselves to initial values obtained by continuity.
We study the solution y(t) of the initial value problem on an interval [a,b] for which there is only one event at t* with a < t* < b by defining y(t) = yL(t) for t £ t* and y(t) = yR(t) otherwise. The assumption that the event is isolated has implications that we take up below. It is necessary that both fL and fR are defined and smooth on an interval containing t* so that yL(t) is defined for some t > t* and yR(t) for some t < t*.
Example 4 of [16] presents a concrete example of an
ill-posed problem with an event function depending on y. It goes
on to give an informal proof that the task is well-posed when the
event function depends only on t. Here we discuss similarly the
general problem. The IVP for yL(t) is well-posed on an interval
containing [a,t*], so small changes to the initial conditions
and to fL lead to a small change d(t) in yL(t). This
induces a small change e in the location t* of the
event that we approximate by expanding about (t*,yL(t*)) the
expression
|
|
This last assumption is a transversality condition to the effect
that the solution yL(t) crosses the hypersurface g(t,y) = 0
at t*. It is illuminating to write the condition in terms of
fL:
|
|
In the unperturbed problem yR(t) is defined for t ³ t* as
the solution of an IVP with initial data (t*,yL(t*)) and
smooth function fR. In the present situation the initial point
is perturbed to t* + e and the initial value to
|
There are some practical difficulties related to well-posedness that merit discussion. It is not unusual that there be more than one event function and sometimes there are many. In both theory and practice the event functions are treated independently and the first event is the one that determines the change in the differential equations. The difficulty is that the task might be well-posed for each event function, but not for the group. Clearly the task might be mathematically ill-posed if an event occurs simultaneously for more than one event function. This seems an unlikely situation, but events are located using approximate solutions y**(t) and it is quite possible that close events occur in a different order when y**(t) is used in place of y(t). In practice we have to assume that events are sufficiently well separated for the tolerance used in their computation that their numerical approximations occur in the correct order.
To understand other difficulties, let us imagine a solution y(t) that is a parabolic arc with maximum value 2 at t = 1. If the event function is g(t,y) = y - g, there are two events near t = 1 for g a little less than 2. These events come together and merge as g® 2. For g = 2 the task is not in the class we treat because the event is not a simple root; it is ill-posed because there is no event for g a little bigger than 2. It is well-posed in principle for g < 2, but in practice it becomes increasing ill-conditioned as g® 2 because events are located using numerical approximations y**(t) to y(t); for a given tolerance it is not possible to distinguish the two events when g is sufficiently close to 2. It is also ill-conditioned in practice even when y(t) is approximated well for a reason we now take up.
An issue of great practical importance is whether the solver ``notices'' an event. A variety of methods have been proposed for locating events [5], but the typical modern code monitors the sign of the event function and searches for an event in the span of a step from an approximation y**(tn) to an approximation at tn+1 = tn + h only when g(tn,y**(tn)) g(tn+1,y**(tn+1)) £ 0. It is very important to appreciate that solvers choose the step size so as to compute y(t) accurately, not to resolve changes in the event function. If the step size is large enough that two simple events occur in [tn,tn+1], the solver will not notice that an event has occurred because there is no sign change. Because of this the assumption that events are isolated is a more serious matter in practice than it might at first seem. Carver [2] suggests an approach to event location that responds to this difficulty. Assuming that the event function can be differentiated conveniently, he adjoins a differential equation for the event function to the system defining y(t). In this way the step size is selected not only to resolve changes in y(t), but also in g(t,y(t)).
A difficult software issue arises when restarting after an event. Solvers use a variety of approaches to avoid reporting the same event more than once. Among them are not checking for the event at the initial point, not checking for events during the first step, and not reporting events which are within a prescribed minimum distance from the initial point. In our experimentation with a wide range of solvers reported in § 4, we encountered some failures for this reason. For a wide range of tolerances, DDRIV2 [9] fails at one or another of the events of the F-2 test problem of [19] by repeatedly reporting the same event. It is a striking demonstration of the care that must be exercised when restarting the integration at an event.
There are few results that assert convergence of numerical solutions of IVPs with event functions. The most general is Mannshardt's investigation [12] of one-step methods. In his approach to the task a one-step method is used to form approximations yn to yL(tn) at mesh points tn = a + nh. The integration continues until a change in the sign of the event function indicates that an event has occurred in the span of the step from tn. The event is located by finding a step size for which a step from (tn,yn) to tn+s = tn + sh results in yn+s for which g(tn+ s,yn+s) » 0. This completes the first part of the integration. In the second part, the function fL is changed to fR and a step of size (1 - s)h is taken from (tn +s,yn+s) to (tn+1,yn+1). The integration then continues with constant step size h. With assumptions essentially the same as ours, he shows that if a Runge-Kutta method of order p is used for the integration and the event is located accurately enough, the numerical solution is convergent of order p on the whole interval.
At the time Mannshardt investigated the task, codes based on one-step methods produced answers only at mesh points. Indeed, this is why most codes with an event location capability were (and still are) based on Adams formulas and backward differentiation formulas (BDFs). These methods represent the solution by polynomials in the span of a step and are properly viewed as producing a piecewise polynomial approximation y**(t) on the whole interval of integration rather than merely approximations at mesh points. In Mannshardt's iterative procedure for locating an event, he actually takes steps of size sh with the one-step method to compute yn+s. With a polynomial y**(t) that is accurate throughout [tn,tn + h], an approximation to yn+s is obtained by merely evaluating a polynomial. This is so much cheaper that it becomes practical to locate an event about as accurately as possible in the precision available. Recent codes based on one-step methods use continuous extensions to obtain the same kind of approximation that is natural with Adams formulas and BDFs.
We analyze the effects of events on the numerical solution of IVPs in a way applicable to most of the solvers in wide use. To this end we assume that when given a tolerance t, the solver produces a numerical solution y**(t) that is piecewise smooth and uniformly accurate of order r(t). We assume of the rate of convergence r(t) only that it is monotone and r(t)® 0 as t® 0. The text [16] discusses a number of convergence results for various methods and controls on the local error. For example, methods implemented with an error per unit step (EPUS) control typically have r(t) = t, and the same is true with error per step and local extrapolation (XEPS). Such codes are common. A method of fixed order p based on error per step (EPS) would have r(t) = t(p+1)/p. The matter is blurred with popular codes of this kind that also vary the order p, but our convergence assumption is sufficiently general that it is an acceptable description of the behavior of most of the popular IVP solvers. We further assume that events are located about as accurately as possible in the precision available. We shall prove that whatever the rate of convergence r(t), the same rate is observed in the presence of well-separated, simple events. Event location has a remarkably small effect on the solution of IVPs: Computation of an event, even as accurately as possible, is relatively inexpensive when a continuous extension is used. The only extra steps due to events are those that arise in restarting the equation after an event. Clearly this cost is necessary because the ODEs change at events. The last step size used before an event is often on-scale for the new integration and this reduces the cost of a restart. If there are few events, all modern implementations start efficiently enough that restarting is a small portion of the overall cost. However, if there are many events, it might be noticeably faster to use one-step methods because they start faster.
For a tolerance t, suppose that an integrator returns
y**L(t) as the numerical solution of (1) with initial
value y(a) = A and that an event is found at t**.
Specifically, t** is the first root of the equation
| (4) |
| (5) |
| (6) |
We shall prove that y**(t) = y(t) + O(r(t)) on all of [a,b]. In the first part of the integration this is just our assumption (5) on the integrator, but what constitutes the first part exposes a complication. For the sake of definiteness, let us suppose that t** < t* . Then the first part of the integration is a £ t £ t**. The remainder of the interval breaks naturally into two parts, [t**, t*] and [t*,b], that have to be treated differently because y(t) is yL(t) on the first and yR(t) on the second.
How well the location t* is approximated by t** is of both
practical and theoretical importance. The point t** is
defined by the equation (4). We are assuming that the
integrator yields an approximation
| ||||||||||
|
| (7) |
| |||||||||||
|
| (9) |
|
|
| (10) |
On the interval [t*,b],
| ||||||||||
With some additional assumptions, Mannshardt [12] extends his
convergence result to an asymptotic expression for the error. It
is not difficult to do the same in the present circumstances.
Instead of merely assuming that the integrator produces a solution
y**L(t) with an error that is O(r(t)) as in
(5), let us assume now that
| (11) |
We must sharpen the result (7) by using the asymptotic
expression (11) instead of the bound (5).
Proceeding as before we find that
|
| (12) |
| |||||||||||
|
|
|
From (2) we have
| ||||||||||
|
|
|
| ||||||||||
|
|
|
Our theoretical results apply to a wide variety of methods and implementations. This is illustrated by the MATLAB ODE Suite [17]. Because event location was a design objective for the Suite, all the methods implemented produce piecewise polynomial approximate solutions. These methods include explicit and implicit Runge-Kutta, Rosenbrock, Adams, and BDF. Events are located about as accurately as possible in the precision available. Because all the solvers of this Suite have exactly the same user interface, changing the method used in an experiment is accomplished by simply changing the name of the solver. Our experiments with this wide range of methods support the theory developed in this article. However, because this Suite is so homogeneous, we have preferred to report here only results obtained with a wide variety of FORTRAN codes popular in general scientific computation.
First we note that all the FORTRAN codes we used are described by our theory because they all produce piecewise polynomial solutions that are used with a root finder to approximate event times as accurately as possible in the precision available. As to the root finder, one (DDRIV2) uses a variant of that described in [18], two (DRDEAM and DRDEBD) use the approach described in [20,21], and the rest (as well as all the solvers of the MATLAB ODE Suite) use variants of the scheme described in [7]. The solvers differ greatly in detail, but the user interfaces are sufficiently similar that we could solve the IVPs of our experiments in the same way. All computations were done in IEEE double precision arithmetic.
We are not aware of a widely-available, general-purpose, explicit Runge-Kutta code in FORTRAN with event location capability. However, two programs for solving delay-differential equations (DDEs), DKLAG5 [13] and DKLAG6 [4], can be used to solve IVPs with event location. These programs are based on (4,5) and (5,6) explicit Runge-Kutta pairs due to Sarafyan and have continuous extensions. DRDEAM [20] is a variant of the well-known DEABM implementation of Adams-Bashforth-Moulton formulas. It supplements the standard scheme of recognizing the occurrence of an event by a change of sign with a second, more sophisticated scheme. DDRIV2 [9] (and its lower level subroutine DDRIV3) has options to use either Adams-Moulton formulas or BDFs and we experimented with both. LSODAR [8,14] is a variant of the well-known LSODE code. In addition to event location, it is unusual in that it switches automatically between Adams-Moulton formulas and BDFs, depending on whether the IVP appears to be nonstiff or stiff. DRDEBD [21] is a variant of the well-known DEBBF implementation of the BDFs. Like DRDEAM, it has an exceptionally strong scheme for recognizing that an event has occurred. DDASRT is a variant of the well-known DASSL [1] program for solving differential-algebraic equations (DAEs). It is based on the BDFs. As with the DDE solvers, this DAE solver can be used to solve IVPs with event location.
In this experiment our theoretical results are illustrated with the numerical solution of (3). We obtained quite similar results with all the codes cited, so report here results for just a few. A mixed error test was used with both relative and absolute tolerances equal to EPS. The accuracy required is indicated in the tables by ITOL = -log10 (EPS ) . The maximum error overrun, ERO1, is the maximum ratio of the error observed to the error allowed by the tolerances. The cost of the integration is measured by the number, NFE1, of evaluations of the functions defining the ODEs. The maximum error in the computed event times is reported as ERT.
The most difficult aspect of designing the experiment was to show clearly what might be expected of the solver when there is no event location. We did this by solving a collection of ten IVPs that approximates (3). Let t0 = 0 and let t1,t2, ¼, t9 be the times of the nine events as determined from the analytical solution of (3). On each [tj,tj+1] we solved an IVP with the ODE of (3) for this interval and with initial value given by the analytical solution of (3) evaluated at tj. Solving this collection of IVPs approximates well the cost of integrating (3) because the same ODEs are integrated over nearly the same subintervals with nearly the same starting values, but there is no event location. Solving the collection does include the cost of repeatedly restarting to integrate a different ODE. The maximum error overrun for solving all these IVPs is reported as ERO2 and the cost as NFE2.
It is observed in the tables that event location has almost no effect on the cost as measured by function evaluations. The discrepancy seen in Table 1 is only apparent: DKLAG5 does 5 function evaluations at each of the 9 events because of bookkeeping associated with the solution of delay-differential equations. Accordingly, the cost of the integration itself is actually the same with and without event location.
The tables show that the error is controlled when events are present just as it is when no events are present. DKLAG5 is a fixed order code for which the theory predicts that the error is proportional to the tolerance and this is seen in Table 1. (This is also true of the computations with DKLAG6 which we do not report.) We have noted that the matter is blurred when the order (formula) is varied. All the other solvers do vary their order and LSODAR varies the formula, too, but it is seen that in all cases the behavior of the error is qualitatively the same whether or not events are present.
| ITOL | NFE1 | NFE2 | ERO1 | ERO2 | ERT |
| 3 | 215 | 170 | 0.577E+00 | 0.459E-02 | 0.115E-02 |
| 4 | 239 | 194 | 0.105E-01 | 0.197E-01 | 0.211E-05 |
| 5 | 329 | 284 | 0.863E-02 | 0.266E-01 | 0.173E-06 |
| 6 | 413 | 368 | 0.111E-01 | 0.307E-01 | 0.223E-07 |
| 7 | 587 | 542 | 0.114E-01 | 0.351E-01 | 0.228E-08 |
| 8 | 857 | 812 | 0.104E-01 | 0.380E-01 | 0.209E-09 |
| 9 | 1319 | 1274 | 0.129E-01 | 0.368E-01 | 0.258E-10 |
| 10 | 1991 | 1946 | 0.126E-01 | 0.372E-01 | 0.252E-11 |
| 11 | 3101 | 3056 | 0.171E-01 | 0.370E-01 | 0.332E-12 |
| ITOL | NFE1 | NFE2 | ERO1 | ERO2 | ERT |
| 3 | 156 | 138 | 0.209E+00 | 0.593E+00 | 0.557E-03 |
| 4 | 216 | 196 | 0.142E+00 | 0.463E+00 | 0.285E-04 |
| 5 | 253 | 231 | 0.143E+01 | 0.270E+01 | 0.359E-04 |
| 6 | 321 | 291 | 0.956E+00 | 0.701E+00 | 0.160E-05 |
| 7 | 379 | 308 | 0.115E+01 | 0.395E+01 | 0.346E-06 |
| 8 | 384 | 405 | 0.505E+01 | 0.288E+01 | 0.102E-06 |
| 9 | 466 | 487 | 0.316E+01 | 0.158E+01 | 0.644E-08 |
| 10 | 514 | 514 | 0.299E+01 | 0.161E+01 | 0.675E-09 |
| 11 | 674 | 626 | 0.157E+01 | 0.305E+01 | 0.290E-10 |
| ITOL | NFE1 | NFE2 | ERO1 | ERO2 | ERT |
| 3 | 335 | 365 | 0.174E+02 | 0.741E+01 | 0.265E-01 |
| 4 | 486 | 496 | 0.300E+02 | 0.700E+01 | 0.451E-02 |
| 5 | 649 | 670 | 0.190E+01 | 0.214E+02 | 0.380E-04 |
| 6 | 793 | 802 | 0.481E+02 | 0.391E+02 | 0.721E-04 |
| 7 | 916 | 904 | 0.611E+02 | 0.309E+02 | 0.917E-05 |
| 8 | 1132 | 1125 | 0.239E+02 | 0.438E+02 | 0.478E-06 |
| 9 | 1335 | 1349 | 0.205E+02 | 0.608E+02 | 0.307E-07 |
| 10 | 1638 | 1576 | 0.239E+02 | 0.341E+02 | 0.358E-08 |
| 11 | 2028 | 2023 | 0.303E+02 | 0.631E+02 | 0.454E-09 |
| ITOL | NFE1 | NFE2 | ERO1 | ERO2 | ERT |
| 3 | 138 | 130 | 0.790E+01 | 0.129E+02 | 0.119E-01 |
| 4 | 168 | 168 | 0.471E+01 | 0.209E+02 | 0.707E-03 |
| 5 | 216 | 216 | 0.401E+01 | 0.236E+02 | 0.602E-04 |
| 6 | 296 | 296 | 0.359E+01 | 0.193E+02 | 0.538E-05 |
| 7 | 384 | 384 | 0.756E+01 | 0.209E+02 | 0.151E-05 |
| 8 | 476 | 476 | 0.143E+02 | 0.320E+02 | 0.286E-06 |
| 9 | 594 | 594 | 0.278E+02 | 0.186E+02 | 0.416E-07 |
| 10 | 624 | 624 | 0.311E+02 | 0.276E+02 | 0.466E-08 |
| 11 | 812 | 812 | 0.328E+02 | 0.242E+02 | 0.492E-09 |
Our theory can be used to prove a convergence result rather like Mannshardt's that we illustrate with DKLAG5 and DKLAG6. Though not ideal for the present purpose, both solvers use a fixed order Runge-Kutta pair and it is possible to make them use a constant step size. We integrated with constant step size, but like Mannshardt, when the code stepped over an event, we located the event, stepped to it, and restarted. The problem (3) was solved for a sequence of step sizes 2-m and the maximum error measured. For a fifth order Runge-Kutta formula as in DKLAG5, the theory predicts that the ratio of errors in successive integrations will be approximately 5, and for a sixth order formula as in DKLAG6, approximately 6. The results given in Table 5 show this to be the case.
| m | DKLAG5 | DKLAG6 |
| 2 | 4.5 | 5.5 |
| 3 | 4.8 | 5.7 |
| 4 | 4.9 | 5.8 |
| 5 | 4.9 | 5.9 |
Modern IVP solvers are so robust that it is often possible for users to ignore events, and many do. In some circumstances [5,6] this is permissible, but as this experiment shows, even when a quality IVP solver reports success, it can be very expensive to ignore events and the answers can be unacceptable. Also, this experiment furnishes examples of two difficulties discussed in § 2, viz. events located in the wrong order and trouble restarting.
The quench front problem of [10] models a
cooled liquid rising on a hot metal rod by ut = uxx + g(u)
for 0 £ x £ 1 and 0 < t . Here
|
|
Event functions gi(t,u1,¼,un) = ui(t) - uc are used to deal with the discontinuity in the definition of g(u). Note that there are as many event functions as ODEs. Initially a switch is set for each component larger than uc; the switch signals the derivative subroutine to use g(u) = -Au for each such component. When the time at which the corresponding ui drops to uc is found, the switch is reset to signal the derivative subroutine to use g(u) = 0 thereafter for this component. In this way, the integrator locates the events without encountering derivative discontinuities and restarts the integration easily. The situation is quite different if event location is not used: the discontinuities in the definition of g(u) result in derivative discontinuities with strong effects on both the efficiency and reliability of the integrator.
Table 6 presents the cost of solving this problem with three of the BDF solvers using banded Jacobians and a local error tolerance of 10-6. The results correspond to n = 50, but the same behavior was observed for other tolerances and other n. NFE1 is the number of function evaluations required when event location is used and NFE2, the number when the events are ignored. As the table shows, ignoring events is very expensive, but what is worse is that when events are ignored, the solution has the wrong shape and is negative at some points.
As discussed in § 2, special precautions must be taken so that DDRIV3 starts properly following an event and to insure that events are located in the correct (in this case, successive) order. Without these precautions, sometimes DDRIV3 repeatedly reports the same event and other times, does not locate events in the correct order. In the latter situation, the final solution has the wrong shape.
| Solver | NFE1 | NFE2 |
| DDRIV3 | 1493 | 11439 |
| DDASRT | 2927 | 13380 |
| DRDEBD | 1687 | 15311 |