Root-finding without a calculator

In this article, I will introduce techniques for approximating square and cube roots. The methods used to derive these techniques are very general and can be used to find the roots of any continuous real-valued function.

Square root

A popular technique for approximating the square root of a number is to use the tangent line approximation. Say we wish to find $a_s=\sqrt{a}$, and we know that the square root of a neighbouring number $b$ is $b_s$; then we would use the approximation $\sqrt{a} \approx \sqrt{b} + \dfrac{a-b}{2\sqrt{b}}$. This approximation is called the tangent line approximation because it is derived by using the tangent line of the square root function $\sqrt{x}$ to approximate its values. It can also be derived from the Taylor series of the square root function.

Derivation of the tangent line approximation formula

If we have a function $f(x)$ that we wish to approximate using a tangent line at $x=x_0$, then the tangent line equation must be of the form:

$$y = mx + c$$

as it is a straight line. At the point $x=x_0$ the following equations must be satisfied:

$$\begin{array}{rcl} y &=& f(x) \\ y' &=& f'(x) \end{array}$$

or in other words:

$$\begin{array}{rcl} mx_0 + c &=& f(x_0) \\ m &=& f'(x_0) \end{array}$$

substituting the second of these equations into the first yields:

$$\begin{array}{rcl} f'(x_0)x_0 + c &=& f(x_0) \\ c &=& f(x_0) - f'(x_0) x_0 \\ \therefore y &=& f'(x_0) x + f(x_0) - f'(x_0) x_0 \\ &=& f'(x_0)(x-x_0) + f(x_0). \end{array}$$

Which is the equation of the tangent line. If we let $f(x) = \sqrt{x}$ then:

$$\begin{array}{rcl} f'(x) &=& \dfrac{1}{2\sqrt{x}} \\ y &=& \dfrac{1}{2\sqrt{x_0}} (x-x_0) + \sqrt{x_0} \\ &=& \sqrt{x_0} + \dfrac{x-x_0}{2\sqrt{x_0}}. \end{array}$$

If we let $x_0 = b$, a neighbouring number we know the square root of, and $x=a$, then this gives us:

$$y = \sqrt{b} + \dfrac{a-b}{2\sqrt{b}}.$$

Which is our approximation for $a_s$.
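As a quick sanity check of this formula, here is a minimal sketch in Python (the helper name `tangent_sqrt` is mine, not from any library):

```python
import math

def tangent_sqrt(a, b):
    """Tangent line approximation to sqrt(a), expanded about a
    neighbouring number b whose square root is known."""
    return math.sqrt(b) + (a - b) / (2 * math.sqrt(b))

# Approximating sqrt(26) about the perfect square b = 25:
# 5 + 1/10 = 5.1, versus the true value 5.0990...
print(tangent_sqrt(26, 25))
```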

Limitations of the tangent line approximation

While the tangent line approximation is often satisfactory, it can be substantially off if $a$ and $b$ differ by an amount that is fairly large relative to $b$.

Newton's method

In this case, you can use Newton's method to refine the approximation. Newton's method is a technique in which we essentially use the tangent line to approximate the zeros of a function. If one of that function's roots happens to be the square root you are searching for, applying Newton's method will give you ever-improving approximations to it. One function whose root is $a_s$ and which is easy to compute exactly is $f(x) = x^2-a$. Newton's method then gives us this scheme for approximating $a_s$:

$$\begin{array}{rcl} x_{n+1} &=& x_n - \dfrac{f(x_n)}{f'(x_n)} \\ &=& x_n - \dfrac{x_n^2 - a}{2x_n} \\ &=& x_n - \dfrac{x_n}{2} + \dfrac{a}{2x_n} \\ &=& \dfrac{x_n}{2} + \dfrac{a}{2x_n}. \end{array}$$

Where $x_n$ is our $n$th Newton's method approximation to $a_s$.
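This scheme is easy to run by machine too; a minimal sketch in Python (the function name `newton_sqrt` is mine):

```python
def newton_sqrt(a, x0, iterations=3):
    """Refine x0 towards sqrt(a) via x_{n+1} = x_n/2 + a/(2*x_n)."""
    x = x0
    for _ in range(iterations):
        x = x / 2 + a / (2 * x)
    return x

# Three iterations from the tangent line estimate 3/2 for sqrt(2)
# already agree with sqrt(2) to about 11 decimal places.
print(newton_sqrt(2, 1.5, 3))
```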

Example 1, finding $\sqrt{2}$ to five decimal places

A classic example where the tangent line approximation is pretty poor is when we are trying to find $\sqrt{2}$. The tangent line approximation gives:

$$\begin{array}{rcl} \sqrt{2} &\approx& \sqrt{1} + \dfrac{2-1}{2\sqrt{1}} \\ &=& 1 + \dfrac{1}{2} \\ &=& \dfrac{3}{2} \\ &=& 1.5. \end{array}$$

Of course, $\sqrt{2}$ to five decimal places is $1.41421$, so the above approximation is only accurate to one significant figure (or not even that, if you round it to the nearest integer). We can improve on this approximation using Newton's method. Our first iteration is:

$$\begin{array}{rcl} x_1 &=& \dfrac{x_0}{2} + \dfrac{2}{2x_0} \\ &=& \dfrac{\dfrac{3}{2}}{2} + \dfrac{2}{2\times \dfrac{3}{2}} \\ &=& \dfrac{3}{4} + \dfrac{2}{3} \\ &=& \dfrac{3\times 3}{4\times 3} + \dfrac{2\times 4}{3\times 4} \\ &=& \dfrac{9}{12} + \dfrac{8}{12} \\ &=& \dfrac{17}{12} \\ &\approx& 1.417. \end{array}$$

Now this is clearly not accurate to five decimal places, so another iteration of Newton's method is required.

$$\begin{array}{rcl} x_2 &=& \dfrac{x_1}{2} + \dfrac{2}{2x_1} \\ &=& \dfrac{\dfrac{17}{12}}{2} + \dfrac{1}{\dfrac{17}{12}} \\ &=& \dfrac{17}{24} + \dfrac{12}{17} \\ &=& \dfrac{17 \times 17}{24 \times 17} + \dfrac{12 \times 24}{17 \times 24} \\ &=& \dfrac{289}{408} + \dfrac{288}{408} \\ &=& \dfrac{577}{408} \\ &\approx& 1.4142157. \end{array}$$

This is accurate to four decimal places; the fifth is not, as we would round it up to a two, not a one. Running Newton's method again yields:

$$\begin{array}{rcl} x_3 &=& \dfrac{x_2}{2} + \dfrac{2}{2x_2} \\ &=& \dfrac{\dfrac{577}{408}}{2} + \dfrac{1}{\dfrac{577}{408}} \\ &=& \dfrac{577}{816} + \dfrac{408}{577} \\ &=& \dfrac{577 \times 577}{816 \times 577} + \dfrac{408 \times 816}{577 \times 816} \\ &=& \dfrac{332929}{470832} + \dfrac{332928}{470832} \\ &=& \dfrac{665857}{470832} \\ &\approx& 1.41421356237469. \end{array}$$

Which is an accurate approximation of $\sqrt{2}$ to 11 decimal places.
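The fractions above can be verified exactly with Python's `fractions` module (the helper name is mine):

```python
from fractions import Fraction

def newton_sqrt_step(x, a):
    # One Newton's method step for sqrt(a): x/2 + a/(2x)
    return x / 2 + a / (2 * x)

x0 = Fraction(3, 2)            # the tangent line estimate of sqrt(2)
x1 = newton_sqrt_step(x0, 2)   # 17/12
x2 = newton_sqrt_step(x1, 2)   # 577/408
x3 = newton_sqrt_step(x2, 2)   # 665857/470832
print(x1, x2, x3)
```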

Example 2, approximating $\sqrt{76}$ to five decimal places

The tangent line approximation yields:

$$\begin{array}{rcl} \sqrt{76} &\approx& \sqrt{81} - \dfrac{5}{2\sqrt{81}} \\ &=& 9 - \dfrac{5}{2\times 9} \\ &=& 9 - \dfrac{5}{18} \\ &=& \dfrac{9\times 18}{18} - \dfrac{5}{18} \\ &=& \dfrac{162}{18} - \dfrac{5}{18} \\ &=& \dfrac{157}{18} \\ &\approx& 8.722. \end{array}$$

The actual value of $\sqrt{76}$ is $8.717797887081348$, so this approximation will not suffice. Applying Newton's method yields:

$$\begin{array}{rcl} x_1 &=& \dfrac{x_0}{2} + \dfrac{76}{2x_0} \\ &=& \dfrac{157}{18 \times 2} + \dfrac{76}{2 \times \dfrac{157}{18}} \\ &=& \dfrac{157}{36} + \dfrac{38}{\dfrac{157}{18}} \\ &=& \dfrac{157}{36} + \dfrac{38 \times 18}{157} \\ &=& \dfrac{157 \times 157}{36 \times 157} + \dfrac{684 \times 36}{157 \times 36} \\ &=& \dfrac{24649}{5652} + \dfrac{24624}{5652} \\ &=& \dfrac{49273}{5652} \\ &\approx& 8.717799. \end{array}$$

Which is accurate to five decimal places.

Example 3, approximating $\sqrt{5}$ to five decimal places

The tangent line approximation yields:

$$\begin{array}{rcl} \sqrt{5} &\approx& \sqrt{4} + \dfrac{1}{2\sqrt{4}} \\ &=& 2 + \dfrac{1}{4} \\ &=& \dfrac{9}{4} \\ &=& 2.25. \end{array}$$

The actual value of $\sqrt{5}$ is approximately $2.23606797749979$. The first iteration of Newton's method yields:

$$\begin{array}{rcl} x_1 &=& \dfrac{x_0}{2} + \dfrac{5}{2x_0} \\ &=& \dfrac{\dfrac{9}{4}}{2} + \dfrac{5}{2 \times \dfrac{9}{4}} \\ &=& \dfrac{9}{8} + \dfrac{10}{9} \\ &=& \dfrac{9 \times 9 + 8 \times 10}{9 \times 8} \\ &=& \dfrac{161}{72} \\ &=& 2.236\overline{1}. \end{array}$$

So this is only accurate to three decimal places. The next iteration of Newton's method yields:

$$\begin{array}{rcl} x_2 &=& \dfrac{x_1}{2} + \dfrac{5}{2x_1} \\ &=& \dfrac{\dfrac{161}{72}}{2} + \dfrac{5}{2 \times \dfrac{161}{72}} \\ &=& \dfrac{161}{144} + \dfrac{5 \times 36}{161} \\ &=& \dfrac{161 \times 161 + 5 \times 36 \times 144}{161 \times 144} \\ &=& \dfrac{25921 + 25920}{23184} \\ &=& \dfrac{51841}{23184} \\ &\approx& 2.23606797791. \end{array}$$

Which is accurate to nine decimal places, or eight if we account for the effect of rounding.

Cube root

Similarly, the cube root of a number can be approximated by a tangent line approximation and refined using Newton's method. If we let $f(x) = \sqrt[3]{x}$, then $f'(x) = \dfrac{1}{3(\sqrt[3]{x})^2}$ and:

$$\begin{array}{rcl} y &=& f'(x_0)(x-x_0) + f(x_0) \\ &=& \dfrac{1}{3(\sqrt[3]{x_0})^2} (x-x_0) + \sqrt[3]{x_0} \\ &=& \sqrt[3]{x_0} + \dfrac{x-x_0}{3(\sqrt[3]{x_0})^2}. \end{array}$$

Letting $x_0 = b$, some neighbouring point we know the cube root of exactly, and $x = a$, the number we want to find the cube root of, we get (if $a_c = \sqrt[3]{a}$):

$$a_c \approx \sqrt[3]{b} + \dfrac{a-b}{3(\sqrt[3]{b})^2}.$$

And to refine this approximation we can use Newton's method with $f(x) = x^3 - a$ and hence $f'(x) = 3x^2$. Therefore we use the formula:

$$\begin{array}{rcl} x_{n+1} &=& x_n - \dfrac{f(x_n)}{f'(x_n)} \\ &=& x_n - \dfrac{x_n^3-a}{3x_n^2} \\ &=& x_n - \dfrac{x_n}{3} + \dfrac{a}{3x_n^2} \\ &=& \dfrac{2x_n}{3} + \dfrac{a}{3x_n^2}. \end{array}$$
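The cube root scheme can be sketched in Python in the same way as the square root one (the function name is mine):

```python
def newton_cbrt(a, x0, iterations=2):
    """Refine x0 towards cbrt(a) via x_{n+1} = 2*x_n/3 + a/(3*x_n**2)."""
    x = x0
    for _ in range(iterations):
        x = 2 * x / 3 + a / (3 * x * x)
    return x

# Two iterations from x0 = 4 give cbrt(63) to about 7 decimal places.
print(newton_cbrt(63, 4, 2))
```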

Example 4, approximating $\sqrt[3]{63}$ to five decimal places

Here we use $b=64$, as $4^3 = 64$ and hence we know the cube root of 64 is 4. Therefore our tangent line approximation to the cube root is:

$$\begin{array}{rcl} \sqrt[3]{63} &\approx& \sqrt[3]{64} - \dfrac{1}{3(\sqrt[3]{64})^2} \\ &=& 4 - \dfrac{1}{3\times 16} \\ &=& 4 - \dfrac{1}{48} \\ &=& \dfrac{191}{48} \\ &\approx& 3.9791\overline{6}. \end{array}$$

The correct value of $\sqrt[3]{63}$ is approximately $3.9790572078963917$, so this is not accurate to five decimal places. To improve this estimate we apply Newton's method:

$$\begin{array}{rcl} x_1 &=& \dfrac{2x_0}{3} + \dfrac{63}{3x_0^2} \\ &=& \dfrac{2x_0}{3} + \dfrac{21}{x_0^2} \\ &=& \dfrac{2 \times \dfrac{191}{48}}{3} + \dfrac{21}{\left(\dfrac{191}{48}\right)^2} \\ &=& \dfrac{191}{72} + \dfrac{21 \times 48^2}{191^2} \\ &=& \dfrac{191}{72} + \dfrac{21 \times 2304}{36481} \\ &=& \dfrac{191 \times 36481 + 21 \times 2304 \times 72}{72 \times 36481} \\ &=& \dfrac{6967871 + 3483648}{2626632} \\ &=& \dfrac{10451519}{2626632} \\ &\approx& 3.97905721. \end{array}$$

Which is accurate to seven decimal places, or eight if we round off the remaining digits from there.

General technique

Newton's method can be used for finding the roots of any continuous real-valued function, or even a system of continuous real-valued functions, although it has some shortcomings. One is that it requires an initial guess for the root, and its results can depend heavily on this initial guess. Additionally, it can sometimes fail to converge, such as when the root occurs at a stationary point of the function (a maximum or minimum), as there the derivative is zero. For example, it fails for the polynomial $x^4 + 3x^3 + 4x^2 + 3x + 1$, as its only real root is at $x=-1$, where the function has a minimum. For polynomial equations that have real roots this failure to converge is uncommon, provided the initial guess is reasonably close to the true value of the root.

Newton's method belongs to a class of methods known as Householder's methods. Newton's method is the first order Householder's method, Halley's method is the second order Householder's method, and it is possible to derive the $d$th order Householder's method using the formula:

$$x_{n+1} = x_n + d\,\dfrac{(1/f)^{(d-1)}(x_n)}{(1/f)^{(d)}(x_n)}$$

where the bracketed superscripts denote derivatives of the superscript's order; for example, $(1/f)^{(d)}$ denotes the $d$th derivative of $1/f$. The reason I mention this fact is that sometimes, especially when the lower order derivatives are equal to zero at the root, higher order Householder's methods may be superior.
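To make this concrete: for $f(x) = x^2 - a$, Halley's method (which can be written as $x_{n+1} = x_n - \dfrac{2ff'}{2f'^2 - ff''}$) simplifies to $x_{n+1} = \dfrac{x_n(x_n^2 + 3a)}{3x_n^2 + a}$, and it converges cubically rather than quadratically. A Python sketch of this (the simplification and function name are mine, not from the article's script):

```python
def halley_sqrt(a, x0, iterations=2):
    """Halley's method applied to f(x) = x**2 - a, which simplifies
    to x_{n+1} = x_n*(x_n**2 + 3*a)/(3*x_n**2 + a)."""
    x = x0
    for _ in range(iterations):
        x = x * (x * x + 3 * a) / (3 * x * x + a)
    return x

# Two steps from 1.5 give sqrt(2) to roughly 12 decimal places,
# versus the three Newton's method steps used in Example 1.
print(halley_sqrt(2, 1.5, 2))
```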

Getting the initial guesses for the roots, especially of polynomials that have multiple real roots, can be a challenge. One technique is the bisection method: one takes an interval over which the sign of the function changes from positive to negative or vice versa (which implies that at least one root must lie within this interval) and repeatedly subdivides this interval, re-evaluating the function at the endpoints, until the root is found. This method is very slow, so usually one would only apply it a few times; once the interval is acceptably small, we would apply Newton's method to get the root. One can also evaluate the function at a long list of points within an interval and find where in the interval the sign changes, as that is where a root will be. If the interval is large enough and the function has multiple real roots, there may be multiple sign-change points and hence multiple roots we can converge to using Newton's method. Like Newton's method, however, bisection fails to converge when the root sits at a minimum or maximum of the function, as the sign of the function does not change there.
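The halving step described above can be sketched in a few lines of Python (the function name `bisect_root` is mine):

```python
def bisect_root(f, lo, hi, halvings=20):
    """Repeatedly halve [lo, hi], keeping the half over which
    f changes sign. Assumes f(lo) and f(hi) differ in sign."""
    for _ in range(halvings):
        mid = (lo + hi) / 2
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid    # the sign change is in the upper half
        else:
            hi = mid    # the sign change is in the lower half
    return (lo + hi) / 2

# Bracket sqrt(2) as the positive root of x**2 - 2 on [1, 2];
# 20 halvings pin it down to within about 1e-6, close enough
# to hand off to Newton's method.
print(bisect_root(lambda x: x * x - 2, 1, 2))
```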

These methods can be done by hand, but for many problems they can be quite tedious; as such, I wrote this Julia script that implements the bisection method and Newton's method to find the roots of a given function:

#!/usr/bin/env julia
"""
    bisection(f::Function, N::Integer, a::Number, b::Number)

Uses the bisection method with N subdivisions of the specified domain [a, b]
to find the roots of the function f within said domain. Used to find the
initial guess that Newton's method then uses to get a more precise estimate of
the roots.
"""
function bisection(f, N, a, b)
    x = LinRange(a, b, N+1)
    fval = f.(x)
    xv = x[2:end]
    initGuess = xv[diff(sign.(fval)) .!= 0]
    return initGuess
end

"""
    newtons(f::Function, h::Real, tol::Real, itMax::Integer, initGuess)

Uses Newton's method (and if needed, up to third order Householder's methods)
to refine initGuess, our initial guess of the root(s) of f, until either itMax
iterations have been performed or the relative magnitude of the update Newton's
method provides to the estimate is below tol. h is the step size used to
approximate the derivative.
"""
function newtons(f, h, tol, itMax, initGuess)
    # Central finite difference approximations to the first three derivatives
    function fd(x, h)
        return (f(x+h)-f(x-h))/(2*h)
    end
    function f2d(x, h)
        return (f(x+h)-2*f(x)+f(x-h))/(h^2)
    end
    function f3d(x, h)
        return (f(x+2*h)-2*f(x+h)+2*f(x-h)-f(x-2*h))/(2*h^3)
    end

    # Check to see if (relative) difference in update is below tol
    function tolCheck(diff, sol, tol)
        if !(sol ≈ 0)
            check = abs(diff/sol) > tol
        else
            check = abs(diff) > tol
        end
        return check
    end

    # Initialize sol
    sol = initGuess
    count = zeros(Int64, size(initGuess))
    for j = 1:length(initGuess)
        diff = 1
        while (tolCheck(diff, sol[j], tol) && count[j] < itMax)
            fn = f(sol[j])
            der = fd(sol[j], h)
            der2 = f2d(sol[j], h)
            der3 = f3d(sol[j], h)
            if (der ≈ 0) && (der2 ≈ 0)
                # Third order Householder's method
                diff = (6*fn*der^2 - 3*fn^2*der2)/(6*der^3 - 6*fn*der*der2 + fn^2*der3)
            elseif (der ≈ 0)
                # Second order, Halley's method
                diff = 2*fn*der/(-fn*der2 + 2*der^2)
            else
                # First order, Newton's method
                diff = fn/der
            end
            sol[j] -= diff
            count[j] += 1
        end
        if (count[j] == itMax)
            print("Maximum iterations exceeded and the amount by which ")
            println("Newton's last updated the solution was: ", diff)
        end
    end

    # Delete NaN entries from sol and corresponding count entries.
    # A while loop is used because sol's length can change over this loop;
    # j is only incremented when no deletion occurs, so no entry is skipped.
    j = 1
    while j <= length(sol)
        if isnan(sol[j])
            deleteat!(sol, j)
            deleteat!(count, j)
        else
            j += 1
        end
    end
    return sol, count
end

"""
    findRoot(f::Function, h::Real, tol::Real, itMax::Integer, a::Number, b::Number, N::Integer)

Uses the bisection method to get an initial guess of the root(s) of f on the
domain [a, b] with N subdivisions, then uses Newton's method with a maximum of
itMax iterations and a relative error tolerance of tol. h is the step size
used to approximate the derivative.
"""
function findRoot(f, h, tol, itMax, a, b, N)
    initGuess = bisection(f, N, a, b)
    sol, count = newtons(f, h, tol, itMax, initGuess)
    return sol, count
end

"""
    printSoln(sol::Vector, count::Vector)

Print the solution and the number of iterations used.
"""
function printSoln(sol, count)
    if (length(sol) == 1)
        println("Root = ", sol[1], ", count = ", count[1])
    else
        for i = 1:length(sol)
            println("The $(i)th root = ", sol[i], ", count = ", count[i])
        end
    end
end

# This is where you specify the function you want to
# find the root of
function f(x)
    return x^4 + x^3 - 10x^2 - 4x + 16
end

# Initialize required variables
h = 1e-10
tol = 1e-15
itMax = 1000
a = -100
b = 100
N = 100000
sol, count = findRoot(f, h, tol, itMax, a, b, N)
printSoln(sol, count)