ARCH BRIEF ABSTRACT This is a curve fitting contest: find the parabola that looks the most like the (catenary shaped) St. Louis arch. It could be given any time after the definite integral has been developed. GENERAL INFORMATION FileName: ARCH Full title: A competition: Fitting the best parabola to the St. Louis Arch Last Update: 5/22/96 Developers: Aaron Klebanoff, Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute IN 47803 USA. Kimberly Foltz, Mathematics and Computer Science Division, Indiana Academy for Science, Mathematics, & Humanities, Muncie IN 47306 USA Dave Horn, Rogers High School, Michigan City, IN 46363 Contact: Aaron D. Klebanoff, Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute IN 47803 USA. Phone: 812-877-8151. Email: Aaron.Klebanoff@Rose-Hulman.Edu. FAX: 812-877-3198. Support: The production of this material is supported by the National Science Foundation under Division of Undergraduate Education grant DUE-9352849: Development Site for Complex, Technology-Based Problems in Calculus with Applications in Science and Engineering and the Arvin Foundation of Columbus IN. STATEMENT OF PROBLEM The Gateway Arch in St. Louis is in the shape of a catenary -- a function of the form f(x) = fc - A(cosh[c x/L] - 1) where A = fc/[(Qb/Qt) - 1] = 68.7672, c = arccosh(Qb/Qt) = 3.0022, fc = Max. Height of Centroid = 625.0925, Qb = Max cross-section at arch base = 1262.6651, Qt = Min cross-section at arch top = 125.1406, L = half of centroid at arch base = 299.2239. All measurements are given in feet. This is actually the shape of the centroid of the arch -- the set of points going through the center of the arch. The data was provided by the National Park Service. At first glance, the casual observer might mistake the shape of the St. Louis Arch for a parabola. Test the observation by seeing how closely you can fit a parabola to the arch. Defend your notion of close. KEYWORDS Curve Fitting, Catenary, Parabola. TEACHER NOTES ISSUES RELATED TO THE PROBLEM There are many different ways to measure distance. Students must be encouraged to be open minded. Students should also be warned that they should always attempt to quantify their "best fittting" curve. Prerequisites Any time during a calculus class depending on the level of sophistication which you expect. Time allotment - time management A day should be set aside to discuss curve fitting before this assignment is given. Expectations Students should have FUN judging each others methods. Be prepared to deal with students who offer trivial solutions without any mathematical foundation. Future payoffs Gives students experience with curve fitting. Students gain a sense of ownership due to the open nature of the problem. Extensions As a challenge problem, motivated students should attempt to (i) compute the Hausdorff distance of "best" fitting graphs; (ii) determine the "best" fitting parabola with the Hausdorff distance function. The Hausdorff metric or distance is used in the study of fractals to determine whether two sets (which may be quite complicated) look the same. Fortunately, the idea is quite simple. To understand the Hausdorff distance between two curves, you have to define what you mean by the distance between the curves. This in turn is easier to understand if you first define the distance between a point and a curve. 1) The distance between a point and a curve is the length of the shortest line segment between the point and the curve. That is, d(x, B) = min { d (x, y): y in B }. 2) The distance between curve A and curve B is greatest distance among all of the distances between a point on A to the curve B. That is, d(A, B) = max { d(x, B): x in A }. Similarly, the distance between curve B and curve A is the greatest distance among all of the distances between a point B to the curve A. At first, this may seem redundant, but the following example should explain that the definition is not symmetric: Suppose A = unit circle centered at the origin; and B = the circle of radius 2 centered at (5, 0). ParametricPlot[{{Cos[t], Sin[t]}, {2 Cos[t] + 5, 2 Sin[t]}}, {t, 0, 2 Pi}, AspectRatio -> Automatic] Then, the distance from A to B = 4, while the distance from B to A = 6. 3) Finally, the Hausdorff distance is simply the maximum between the distance from A to B and B to A. So, in the case of the catenary vs. parabola, call A = catenary (true shape of the St. Louis Arch); and B = parabola. While it is easy to describe, the Hausdorff distance is typically hard to compute symbolically. However, given an example, one can often exploit the geometry of the sets A and B (which may be sets of points on a curve y = f(x), for example) to find the distance between them. In the following example, we work four examples to compare the Hausdorff distance between the "best fitting" curves found in the Possible Solutions section to the catenary equation for the Arch. Example: (see the Possible Solutions section to see how g1[x], ..., g4[x] were found.) f[x] = the St. Louis arch g1[x] = Least Squares fit g2[x] = Arc Length Fit #1 (touch at vertices) g3[x] = Arc Length Fit #2 (touch at ground) g4[x] = Fit to three points: ground and vertex A = 68.7672; c = 3.0022; L = 299.2239; fc = 625.0925; f[x_] = fc - A(Cosh[c x/L]- 1); g1[x_] = 652.8286904319139 - 0.006330791164684482*x^2; g2[x_] = 625.0925 - 0.007238154230161798*x^2; g3[x_] = 648.0786735587897 - 0.007238154230161798*x^2; g4[x_] = 625.0925 - 0.006981548027223175*x^2; The Hausdorff distance between g1[x] and f[x] is 27.7362: Plot[{f[x], g1[x]}, {x, 0, L + 25}, AspectRatio -> Automatic, PlotRange -> {0, 670}] -Graphics- It's clear from the graph above that the greatest distance between either curve occurs on the y - axis and is easy to compute, so the Hausdorff distance between g1[x] and f[x] is g1[0] - f[0] 27.7362 The Hausdorff distance between g2[x] and f[x] is 34.5409: Plot[{f[x], g2[x]}, {x, 0, L + 10}, AspectRatio -> Automatic, PlotRange -> {0, 670}] -Graphics- In this case, there is clearly a place inside [0, L] where the greatest distance occurs. We find the two distances described above by finding the length of the line perpendicular through one of the curves and measuring the length of the segment between them. The Hausdorff distance is the maximum of both of these. distftog[x_] := Sqrt[(1 + 1/g2'[x]^2)(FindRoot[f[x0] == g2[x] - 1/g2'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] distgtof[x_] := Sqrt[(1 + 1/f'[x]^2)(FindRoot[g2[x0] == f[x] - 1/f'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] Plot[{distgtof[x], distftog[x]}, {x, 0.01, L}] -Graphics- Plot[distftog[x], {x, 160.4, 160.6}] -Graphics- Plot[distgtof[x], {x, 192.2, 192.25}] -Graphics- Since both of the graphs have the same maximum, the Hausdorff distance between f[x] and g2[x] is 34.5409. The Hausdorff distance between g3[x] and f[x] is 25.8622: Plot[{f[x], g3[x]}, {x, 0, L}, AspectRatio -> Automatic, PlotRange -> {0, 660}] -Graphics- In this case, it's not clear if the maximum occurs at the y-axis or in the middle of the interval. We find the two distances described above by finding the length of the line perpendicular through one of the curves and measuring the length of the segment between them. The Hausdorff distance is the maximum of both of these. distftog[x_] := Sqrt[(1 + 1/g3'[x]^2)(FindRoot[f[x0] == g3[x] - 1/g3'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] distgtof[x_] := Sqrt[(1 + 1/f'[x]^2)(FindRoot[g3[x0] == f[x] - 1/f'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] Plot[{distgtof[x], distftog[x]}, {x, 0.01, L}] -Graphics- Now we can tell that the maximum is not on the y-axis. Plot[distftog[x], {x, 178.05, 178.1}] -Graphics- Plot[distgtof[x], {x, 202.2, 202.22}] -Graphics- Since both of the graphs have the same maximum, the Hausdorff distance between f[x] and g2[x] is 25.8622. The Hausdorff distance between g4[x] and f[x] is 31.9065: Plot[{f[x], g4[x]}, {x, 0, L}, AspectRatio -> Automatic, PlotRange -> {0, 640}] -Graphics- In this case, there is clearly a place inside [0, L] where the greatest distance occurs. We find the two distances described above by finding the length of the line perpendicular through one of the curves and measuring the length of the segment between them. The Hausdorff distance is the maximum of both of these. distftog[x_] := Sqrt[(1 + 1/g4'[x]^2)(FindRoot[f[x0] == g4[x] - 1/g4'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] distgtof[x_] := Sqrt[(1 + 1/f'[x]^2)(FindRoot[g4[x0] == f[x] - 1/f'[x] (x0 - x), {x0, x}][[1, 2]] - x)^2] Plot[{distgtof[x], distftog[x]}, {x, 0.01, L}] -Graphics- Plot[distftog[x], {x, 158.5, 158.55}] -Graphics- Plot[distgtof[x], {x, 187.5, 187.7}] -Graphics- Since both of the graphs have the same maximum, the Hausdorff distance between f[x] and g2[x] is 31.9065. So measuring by Hausdorff distance, one could say that g3[x] gives the best fit. References and Sources U.S. National Park Service URL: http://www.st-louis.mo.us/st-louis/arch/catenary.html POSSIBLE SOLUTION(S) Least Squares Regression In[2]:= A = 68.7672; c = 3.0022; L = 299.2239; fc = 625.0925; f = fc - A(Cosh[c x/L]- 1); The parabola which best models the Gateway Arch will have symmetry with respect to the y-axis. Thus, it will be a quadratic function with linear coefficient zero. A general quadratic of this type can be expressed as follows, where p1 and p2 are parameters. In[7]:= g = p1 x^2 + p2; One way of finding the parabolic curve which best fits the catenary is to measure the difference between the curves; i.e., consider f(x) - g(x). However, if the curves intersect, then on some intervals this difference will be positive and on some it will be negative. Even though the differences may be large in magnitude, the opposite signs allow for the possibility that the differences will in effect "cancel". One way to force the difference to be positive is to express the difference as f(x) - g(x) on subintervals where f(x) > g(x) and as g(x) - f(x) on subintervals where g(x)> f(x). This requires finding the point(s) of intersection of the curves (which is not necessarily an easy task.) Another way to eliminate the difficulty of the negative differences is to use the absolute value of the difference f(x) - g(x), but this causes complications in the subsequent integration calculations if using current computer algebra systems. These problems are eliminated by squaring the difference. The expression ( f(x) - g(x) )^2 will give a (positive) measure of the square difference between the two curves at any point. The total difference is given by integrating over the length of the curves. Due to the symmetry in our problem, it is necessary only to integrate from 0 to the right end point. The parabola will best fit the catenary when total square error is minimized. Since we are performing what is know as a least squares regression, we refer to the total square error as SE (for Square Error), and we remark that this quantity depends on the parameters p1 and p2 for the candidate fitting curve. In[8]:= SE = Integrate[(f - g)^2, {x, 0, L}] Out[8]= 7 9 11 2 7.31452 10 - 5.58561 10 p1 + 4.79746 10 p1 - 7 2 277612. p2 + 1.78607 10 p1 p2 + 299.224 p2 To minimize the square error, we find its critical points by finding the zeroes of its derivatives. In[9]:= derSEp1 = D[SE, p1]; derSEp2 = D[SE, p2]; By setting each partial derivative equal to zero and solving the resulting system of equations simultaneously, we find the values of p1 and p2 that minimize the sum of the square differences. In[11]:= sol = Solve[{derSEp1 == 0,derSEp2 == 0}, {p1, p2}] Out[11]= {{p1 -> -0.00633079, p2 -> 652.829}} How does the least squares parabola compare to the actual arch? We will retrieve the values for p1 and p2 from the above solution and then plot each curve to see. The true arch is the solid curve and the approximating curve is dashed. In[12]:= p1star = p1 /. sol[[1,1]]; p2star = p2 /. sol[[1,2]]; bestg = g /. {p1 -> p1star, p2 -> p2star} Out[14]= 2 652.829 - 0.00633079 x In[24]:= Plot[{f, bestg}, {x, -L, L}, AspectRatio -> Automatic, PlotStyle -> {{AbsoluteThickness[2]}, {AbsoluteThickness[3], AbsoluteDashing[{3, 6}]}}, AxesLabel -> {"[ft]", "height [ft]"}]; Its important to note that the fitting curve is only best fitting over the length of the true arch. It's a bit more complicated to get a best fitting curve all the way to the ground in the least squares sense. We offer a solution to that case in ISSUES IN SOLUTIONS. Equal Arc Length (Two sub cases) This "solution" rests on the assumption that the "best" parabola will have the same arc length as the catenary. The first step is to find this length. The following code finds this using the standard formula and taking advantage of the symmetry of the curve. In[41]:= Clear[f]; A = 68.7672; c = 3.0022; L = 299.2239; fc = 625.0925; f[x_] = fc - A (Cosh[c x/L]- 1); lengthcat = NIntegrate[2 Sqrt[1 + (f'[x])^2], {x, 0, L}] Out[47]= 1480.28 Thus, our goal is to find a parabola with arc length of 1480.28 feet. As in the previous solution, we assume symmetry with respect to the y-axis. Thus, our parabola will not have a linear term yielding an expression of the form In[56]:= g = p1 x^2 + p2 Out[56]= 2 p2 + p1 x We want to find the values p1 and p2 that will make the arc length of the parabola 1480.28 feet. In the process of computing the arc length of g, the parameter p2 drops out resulting in In[67]:= lengthparab = Integrate[2 Sqrt[1+(2 p1 x)^2],{x, 0, L}]; p1star = p1 /. FindRoot[lengthparab == lengthcat,{p1, -1}] Out[68]= -0.00723815 Thus, the leading coefficient of the "best" parabola is -0.00723815. However, it is not possible to isolate the second parameter, p2, because the parameter, in the absence of a linear term, determines vertical translation, and vertical translation does not change the arc length. Thus, we are free to choose any value of p2 we wish. There are two obvious choices: 1. Make the vertex of the parabola coincide with the apex of the arch; i.e. let p2 = fc from above. In[69]:= p2star = fc; bestg1 = g /. {p1 -> p1star, p2 -> p2star} Out[70]= 2 625.0925 - 0.00723815 x In[71]:= Plot[{f[x], bestg1}, {x, -L, L}, AspectRatio -> Automatic, PlotStyle -> {{AbsoluteThickness[2]}, {AbsoluteThickness[3], AbsoluteDashing[{3, 6}]}}, AxesLabel -> {"[ft]", "height [ft]"}]; This graph looks pretty good near the top, but to achieve the desired arc lengths, the ends are below the ground. It is questionable whether it is fair to say the curves have the same length since the catenary is entirely above ground but the ends of the parabola are buried. 2. Make the parabola and the arch hit the ground at the same points. For this to occur, the parabola must pass through the point determined by root = FindRoot[f[x] == 0,{x, 300}] Out[72]= {x -> 299.226} In[79]:= xint = x /. root; p2star = p2 /. FindRoot[0 == p1star xint^2 + p2, {p2,600}] Out[80]= 648.079 In[81]:= bestg2 = g /. {p1 -> p1star, p2 -> p2star} Out[81]= 2 648.079 - 0.00723815 x In[82]:= Plot[{f[x], bestg2}, {x, -L, L}, AspectRatio -> Automatic, PlotStyle -> {{AbsoluteThickness[2]}, {AbsoluteThickness[3], AbsoluteDashing[{3, 6}]}}, AxesLabel -> {"[ft]", "height [ft]"}]; Because this parabola is entirely above ground, in one important sense it is better than the previous one. 3 point solution Another choice for the "best" parabola might be the unique parabola that shares three important points with the catenary. These points are the apex and the two intersections with the ground (the x-axis). The equation of this parabola is again of the form g(x) = p1 x^2 + p2 due to the symmetry about the vertical axis. Furthermore, since the vertex is on the vertical axis, we know that p2 = fc which yields the functional form In[83]:= g = p1 x^2 + fc Out[83]= 2 625.0925 + p1 x We choose p1 so that the x-intercept of the parabola is at (L, 0) -- the x-intercept of the catenary. Substituting L into g and setting it equal to zero yields In[87]:= p1star = p1 /. Flatten[Solve[(g /. x -> L) == 0, p1]] Out[87]= -0.00698155 In[88]:= bestg = g /. {p1 -> p1star, p2 -> p2star} Out[88]= 2 625.0925 - 0.00698155 x In[89]:= Plot[{f[x], bestg}, {x, -L, L}, AspectRatio -> Automatic, PlotStyle -> {{AbsoluteThickness[2]}, {AbsoluteThickness[3], AbsoluteDashing[{3, 6}]}}, AxesLabel -> {"[ft]", "height [ft]"}]; Hausdorff distance (see Extensions) It is difficult to obtain the parabola that is closest in the Hausdorff distance to the true arch, but it provides a nice way to judge the other solutions. Here is a summary which is shown in detail in the Extensions section, a subsection of Teacher Notes. From best to worst using Hausdorff distance: (1) Setting the arc lengths equal and matching the curves at the ground (2) Minimizing the distance in the least squares sense (3) Matching curves at the vertex and ground (4) Setting the arc lengths equal and matching at the vertex The four cases are shown below from best to worst fit in the Hausdorff sense. ISSUES IN SOLUTION Probably the most natural measure of distance for the students would be to make the area between the curves as small as possible. However, this requires that we determine the points where the curves intersect which can only be done numerically. This provides a good motivation for the using least squares instead of the absolute value of the difference. Probably the simplest solution is the 3 point fit solution (see Solutions). As shown under Extensions, it's actually not a bad solution and it's not even the worst of the four offered. So, you should also make sure that students should have a way to compare the solutions in other than visual ways. (For example, all solutions can be compared by the criteria of best fit by least squares.) In the Least Squares fit, you do not know at the onset where the "best fitting" curve will hit the ground. A more complicated yet more satisfying solution is offered below. In[97]:= A = 68.7672; c = 3.0022; L = 299.2239; fc = 625.0925; f = fc - A(Cosh[c x/L]- 1); g = -p1 x^2 + p2; In[103]:= zero = x /. Flatten[Solve[g == 0, x]] Out[103]= Sqrt[p2] -(--------) Sqrt[p1] In[104]:= SE = Integrate[(f - g)^2, {x, 0, zero}] Out[104]= 3/2 5/2 -483806. Sqrt[p2] 925.146 p2 0.533333 p2 ----------------- + ------------- - -------------- - Sqrt[p1] Sqrt[p1] Sqrt[p1] 6 2.73247 10 Sqrt[p1] Sqrt[p2] 0.0100333 Sqrt[p2] Cosh[------------------] + Sqrt[p1] 6 0.0100333 Sqrt[p2] 9.5113 10 Sinh[------------------] + Sqrt[p1] 8 0.0100333 Sqrt[p2] 2.7234 10 p1 Sinh[------------------] - Sqrt[p1] 0.0200666 Sqrt[p2] 117831. Sinh[------------------] Sqrt[p1] In[105]:= derSEp1 = D[SE, p1]; derSEp2 = D[SE, p2]; By setting each partial derivative equal to zero and solving the resulting system of equations simultaneously, we find the values of p1 and p2 that minimize the sum of the square differences. In[119]:= sol = FindRoot[{derSEp1==0, derSEp2==0}, {p1, 0.005}, {p2, 652}] Out[119]= {p1 -> 0.00684839, p2 -> 662.632} How does the least squares parabola compare to the actual arch? We will retrieve the values for p1 and p2 from the above solution and then plot each curve to see. The true arch is the solid curve and the approximating curve is dashed. In[123]:= p1star = p1 /. sol[[1]]; p2star = p2 /. sol[[2]]; bestg = g /. {p1 -> p1star, p2 -> p2star} Out[125]= 2 662.632 - 0.00684839 x In[129]:= int = Max[L, -zero /. {p1 -> p1star, p2 -> p2star}] Out[129]= 311.059 In[130]:= Plot[{f, bestg}, {x, -int, int}, AspectRatio -> Automatic, PlotStyle -> {{AbsoluteThickness[2]}, {AbsoluteThickness[3], AbsoluteDashing[{3, 6}]}}, AxesLabel -> {"[ft]", "height [ft]"}];