
Newton's Method: An Overview

Josh Kaplan

I was first exposed to Newton's Method in my undergraduate numerical methods course. I've used it a number of times since then in undergraduate and graduate courses, but I didn't begin to gain an appreciation for its power until my recent navigation systems course. To take a deeper dive into Newton's Method and deepen my understanding of it, I spent a day tinkering with various Newton's Method implementations to solve different types of problems.

Let's start with a brief derivation

Suppose we want to draw the tangent line to a function at a given point x_0.

y = f'(x_0)\cdot(x - x_0) + f(x_0)

This is fairly straightforward with some introductory calculus and some algebra. The f'(x_0)\cdot(x - x_0) term says create a line with slope f'(x_0) (the derivative of f(x) evaluated at x_0), shifted to the right by an amount x_0. The f(x_0) term shifts the line up by an amount f(x_0). This is shown graphically below:

[Figure: the tangent line to f(x) at the point x_0]

Okay, no magic here. Just some algebra with a touch of calculus. Now, we set y = 0 and solve the equation for x.

0 = f'(x_0)\cdot(x - x_0) + f(x_0)
-f'(x_0)\cdot(x - x_0) = f(x_0)
x - x_0 = -\frac{f(x_0)}{f'(x_0)}
x = x_0 - \frac{f(x_0)}{f'(x_0)}

What this does is find the x-value where our tangent line crosses zero. In other words, we follow the slope of our function down toward zero. This means that x_0 - \frac{f(x_0)}{f'(x_0)} will give us a result closer to the root of our function than our initial guess x_0. This is Newton's Method. In a more algorithmic notation it can be written as

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

Let's try it out (manually)

Let's start with the function f(x) = 2x^2 - 8. We first need to take its derivative, giving us f'(x) = 4x. We also need to make an initial guess. Now let's apply our formula with a guess of 4.

f(4) = 2(4)^2 - 8 = 24
f'(4) = 4(4) = 16
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = 4 - \frac{24}{16} = 4 - 1.5 = 2.5

We get an answer of 2.5, which is closer to our root of 2 than our initial guess. We could then repeat this process with 2.5 as our guess and get 2.05. We can then repeat again, and so on, until our answer is "close enough".
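To see that convergence without grinding through the algebra by hand, here is a minimal Python sketch of just the bare iteration (a preview of the full program below), printing each improved guess:

x = 4.0                            # Initial guess
for n in range(5):
    x = x - (2*x**2 - 8) / (4*x)   # Newton update: x - f(x)/f'(x)
    print(x)                       # 2.5, 2.05, 2.0006..., approaching the root at 2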

Let's code it up

Let's implement this in Python. I'm choosing Python here because it's free (unlike MATLAB) and is easy to get started with even if you're not familiar with programming. The code below should work on Python version 2.7 or newer. If you're new to Python and/or programming, check out some useful resources here or this course.

In the code below, we define our initial guess, a maximum number of iterations, and a tolerance. The initial guess doesn't have too much of an impact on how fast Newton's Method can find the solution, but there are a couple special cases that can cause problems that we'll touch on later. The maximum number of iterations defines an upper limit to how many times we want our algorithm to run. This helps us stop if we're taking too long to find an accurate enough solution. Finally, the tolerance defines what "close enough" means. We check if the absolute value of f(x) at each iteration is below our tolerance, or "close enough" to zero. If so, we stop.

#!/usr/bin/env python
from __future__ import division, print_function

X = 10      # Initial guess
N = 1000    # Maximum iterations
TOL = 1e-3  # Tolerance

# Newton's Method
for i in range(1, N+1):
    F = 2*X**2 - 8    # Function f(x)
    DF = 4*X          # Derivative f'(x)
    X = X - F/DF      # Improved guess

    if abs(F) < TOL:  # If close enough to the root,
        print('Converged to {:6.4f} in {:d} iterations'.format(X, i))
        break         # exit the loop
else:
    print('Maximum iterations reached. Answer: {:6.4f}'.format(X))

That's it. We can now run our program and find the roots of functions. Simply change the lines containing the function and its derivative to solve a different problem. Adjust the initial guess, maximum iterations, and tolerance accordingly to meet your needs.
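For example, to find the square root of 2, which is the positive root of f(x) = x^2 - 2, only those two lines change:

    F = X**2 - 2   # Function f(x)
    DF = 2*X       # Derivative f'(x)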

Problems with Newton's Method

Your initial guess matters. It doesn't impact the speed of Newton's Method all that much (another topic we'll touch on later), but a bad guess can cause some problems. The first special case is zero. If you choose zero as your initial guess for our example function, you will get a divide-by-zero error. More generally, any guess that causes f'(x) to evaluate to zero will do the same.
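One cheap defense, as a sketch: check the derivative before dividing and stop with a useful message instead of crashing. (The 1e-12 threshold here is an arbitrary choice of mine, not anything standard.)

X = 0.0              # A bad initial guess for f(x) = 2x^2 - 8
F = 2*X**2 - 8       # Function f(x)
DF = 4*X             # Derivative f'(x), which is zero at x = 0
if abs(DF) < 1e-12:  # Guard: the Newton update would divide by (nearly) zero
    raise ValueError('Derivative is zero at x = {}; pick another guess'.format(X))
X = X - F/DF         # Only safe to divide once the guard passes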

Another case where you can run into problems is when there is more than one root. Newton's Method will only find one of them, and your initial guess will impact which one it finds. Take a look at the plots below showing the same function and the roots (shown with a red *) found via Newton's Method.

[Figure A (left) and Figure B (right): the same function, with the root found by Newton's Method marked with a red *]

In the case of Fig. A on the left, an initial guess of 4 was used. In the case of Fig. B on the right, an initial guess of 5 was used. When you have more than one root, you need some idea of what to guess. The closer the roots are to one another, the closer your initial guess needs to be to make sure you find the right one.
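Here's a sketch of that sensitivity using our earlier example f(x) = 2x^2 - 8, which has two roots at -2 and +2 (a stand-in for the function in the figures):

def newton(x, f, df, n=1000, tol=1e-6):
    # Return the root Newton's Method converges to from the guess x
    for i in range(n):
        fx = f(x)
        if abs(fx) < tol:   # Close enough to a root
            break
        x = x - fx/df(x)    # Newton update
    return x

f = lambda x: 2*x**2 - 8    # Roots at x = -2 and x = +2
df = lambda x: 4*x
print(newton( 3.0, f, df))  # ~ 2.0: a positive guess finds the positive root
print(newton(-3.0, f, df))  # ~ -2.0: a negative guess finds the negative root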

Finding local maxima and minima

We can use a version of Newton's Method to find local maxima and minima. To do this, we modify our algorithm to find the root of a function g(x) where g(x) = f'(x). So, our algorithm becomes

g(x) = f'(x)
g'(x) = f''(x)
x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)}

Alternatively, we can write this as

x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}
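As a concrete sketch, here is the same Python loop pointed at f(x) = x^3 - 3x (my example, not one from above), which has a local maximum at x = -1 and a local minimum at x = +1:

X = 2.0              # Initial guess
TOL = 1e-6           # Tolerance
for i in range(1000):
    g = 3*X**2 - 3   # g(x)  = f'(x)
    dg = 6*X         # g'(x) = f''(x)
    if abs(g) < TOL: # f'(x) is (nearly) zero: we found a flat point
        break
    X = X - g/dg     # Newton step on f'(x)
print(X)             # ~ 1.0; f''(1) = 6 > 0, so this is the local minimum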

A more generic method

Above, we derived Newton's Method from the algebraic equation describing the tangent line of the function.

y = f'(x_0)\cdot(x - x_0) + f(x_0)

Previously, we set y equal to zero to find the roots. If we want to find where our function is equal to a particular y value, we can simply leave y in this equation and walk through the same derivation.

f'(x_0)\cdot(x - x_0) = y - f(x_0)
x - x_0 = \frac{y}{f'(x_0)} - \frac{f(x_0)}{f'(x_0)}
x = x_0 + \frac{y}{f'(x_0)} - \frac{f(x_0)}{f'(x_0)}

We now have a more general expression for Newton's Method to solve an equation for a given y value.

x_{n+1} = x_n + \frac{y}{f'(x_n)} - \frac{f(x_n)}{f'(x_n)}
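As a quick sketch in Python, let's solve f(x) = 10 with our earlier f(x) = 2x^2 - 8; the exact answer is x = 3, since 2(3)^2 - 8 = 10:

Y = 10.0    # Target value: solve f(x) = Y
X = 5.0     # Initial guess
TOL = 1e-6  # Tolerance
for i in range(1000):
    F = 2*X**2 - 8        # Function f(x)
    DF = 4*X              # Derivative f'(x)
    if abs(F - Y) < TOL:  # Close enough to the target value
        break
    X = X + Y/DF - F/DF   # Generalized Newton update
print(X)                  # ~ 3.0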

Scalability and Speed

I wanted to get a deeper understanding of how Newton's Method scales. First, I decided to look at how its efficiency scales with function complexity. To do this, I ran Newton's Method with the same initial parameters for functions of increasing polynomial order. For simplicity of implementation, we ignore lower-order terms. My MATLAB implementation is shown below.

close all; clear all; clc;

M = 150;
poly = zeros(M, 1);
time = zeros(M, 1);
for i = 1:M
    X = 100;        % Initial guess
    N = 1000;       % Maximum iterations
    TOL = 1e-6;     % Tolerance
    for j = 1:N
        f = 2*X.^i - 2^(i+1);   % Function f(x)
        df = 2*i*X^(i-1);       % Derivative f'(x)
        X = X - f/df;           % New guess
        if abs(f) < TOL         % If our guess is close enough to zero,
            break               % exit the loop
        end
    end
    poly(i) = i;    % Collect data - polynomial order
    time(i) = j;    % Collect data - iterations to solve
end

figure
hold on
grid on
grid minor
plot(poly, time, 'b.')
xlabel('Polynomial Order')
ylabel('Number of Iterations to Solve')

Output:

[Figure: number of iterations to solve vs. polynomial order]

What we find is that the number of iterations Newton's Method needs grows roughly linearly with polynomial order. This means the more complex the function you want to solve, the longer it takes.

The second thing I wanted to look at was how Newton's Method scales with initial guess error: that is, how far off our initial guess is from the actual solution. We do this with a similar approach to the MATLAB script above, but the main loop in our new script is as follows:

M = 100e3;
fact = zeros(M, 1);
time = zeros(M, 1);
for i = 2:M+1                   % Newton's Method for a range of guesses
    X = i;                      % Initial guess
    N = 1000;                   % Maximum iterations
    TOL = 1e-6;                 % Tolerance
    for j = 1:N
        f = 2*X^2 - 8;          % Function f(x)
        df = 4*X;               % Derivative f'(x)
        X = X - f/df;           % New guess
        if abs(f) < TOL         % If our guess is close enough to zero,
            break               % exit the loop
        end
    end
    fact(i-1) = i - 2;          % Collect data - error of initial guess
    time(i-1) = j;              % Collect data - iterations to solve
end

In this case, we keep our function the same but change our guess with each iteration of our outer loop. What we find is that the number of iterations grows logarithmically with initial guess error. This means we can drastically increase our initial guess error while only increasing the time to solve slightly. This is shown in the output plot below (pay attention to the scale compared to the plot above):

[Figure: number of iterations to solve vs. initial guess error]

What this shows us is that Newton's Method is better at solving a simple problem with a bad guess than a complex problem with a close guess.

Summary

That's it! That's the basics of Newton's Method. It is much more powerful than what is shown here. In fact, it can be scaled to solve problems of two or more variables (which I'd like to do a post on at some point) and tackle much more complex problems.
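As a tiny preview of what that looks like, here is a sketch of the two-variable case in Python with NumPy (my toy system, solving x^2 + y^2 = 4 together with x - y = 0, not anything from a navigation problem), where the derivative becomes a Jacobian matrix and the division becomes a linear solve:

import numpy as np

def F(v):   # The system: x^2 + y^2 - 4 = 0 and x - y = 0
    x, y = v
    return np.array([x**2 + y**2 - 4, x - y])

def J(v):   # Jacobian: the matrix of partial derivatives of F
    x, y = v
    return np.array([[2*x, 2*y], [1.0, -1.0]])

v = np.array([1.0, 2.0])  # Initial guess
for i in range(50):
    Fv = F(v)
    if np.linalg.norm(Fv) < 1e-9:      # Close enough to a solution
        break
    v = v - np.linalg.solve(J(v), Fv)  # Multivariate Newton step
print(v)  # ~ [1.4142, 1.4142], i.e., x = y = sqrt(2)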

For those who have a weaker programming background and/or want examples in other languages, I've created implementations of Newton's Method in a few more languages below.


Code Samples

The following examples implement Newton's Method in various languages for reference.

MATLAB

close all; clear all; clc;

X = 10;      % Initial guess
N = 100;     % Maximum iterations
TOL = 1e-2;  % Tolerance

for i = 1:N
    f = 2*X^2 - 32;   % Function f(x) <= EDIT THIS LINE
    df = 4*X;         % Derivative f'(x) <= EDIT THIS LINE
    X = X - f/df;     % Improved guess

    if abs(f) < TOL   % If close enough,
        break         % exit the loop
    end
end

fprintf('Converged to %6.4f in %d iterations\n', X, i)

C

#include <stdio.h>
#include <math.h>

/**
 * This is our function whose root we want to find.
 */
double f(double x)
{
    return 2.0*pow(x, 2.0) - 32.0;
}

/**
 * This is the derivative of the function whose root we want to find.
 */
double df(double x)
{
    return 4.0*x;
}

int main(void)
{
    int i, N;
    double X, TOL, F, DF;

    X = 10.0;   // Initial guess
    N = 1000;   // Maximum iterations
    TOL = 1e-3; // Tolerance

    for (i = 0; i < N; i++)
    {
        F = f(X);          // Function f(x)
        DF = df(X);        // Derivative f'(x)
        X = X - F/DF;      // New guess
        if (fabs(F) < TOL) // If close enough,
            break;         // exit the loop
    }
    printf("Converged to %6.4f in %d iterations\n", X, i);
    return 0;
}

Fortran

program NewtonsMethod

    integer :: i, N
    real :: tol, x, f, df

    x = 10      ! Initial guess
    N = 100     ! Maximum iterations
    tol = 1e-3  ! Tolerance

    ! Newton's Method
    do i = 1, N
        f = 2*x**2 - 32  ! Function f(x) <= EDIT THIS LINE
        df = 4*x         ! Derivative f'(x) <= EDIT THIS LINE
        x = x - f/df     ! Improved guess

        ! If close enough, exit loop
        if (ABS(f) < tol) then
            exit
        end if
    end do
    print '("Converged to",f6.3," in",i2," iterations")', x, i

end program NewtonsMethod

Julia

#!/usr/bin/env julia
f(x) = 2*x^2 - 32  # EDIT THIS - function f(x)
df(x) = 4*x        # EDIT THIS - derivative f'(x)

# Newton's Method
X = 10      # Initial guess
N = 1000    # Maximum iterations
TOL = 1e-3  # Tolerance

iters = N
for i = 1:N
    global X, iters   # Assign to the globals from inside the loop
    F = f(X)          # Function f(x)
    DF = df(X)        # Derivative f'(x)
    X = X - F/DF      # New guess
    if abs(F) < TOL   # If close enough,
        iters = i     # record the iteration count
        break         # and exit the loop
    end
end
println(string("Converged to ", X, " in ", iters, " iterations"))