Lecure 4: Rootfinding and Interpolation

Topics

Root finding
Interpolation
Curve building

Rootfinding

The Bible: the love of money is the root of all evil
Mark Twain: the lack of money is the root of all evil

Common rootfinding methods

Search the root of a scalar function $f(x) = 0$, common methods:

Bisection
Newton-Ralphson
Secant
Brent

Illustrative Examples:

compute the implied volatility of a call option.
a more challenging problem of $\Phi(x) + .05x - 0.5 = 0$

In [3]:

from scipy.stats import norm
lecture = 4

# black scholes call
def bs_call(r, s, k, sigma, t) :
    d1 = (math.log(s/k) + (r + .5*sigma*sigma)*t)/sigma/math.sqrt(t)
    d2 = d1 - sigma*math.sqrt(t)
    return norm.cdf(d1)*s - norm.cdf(d2)*k*math.exp(-r*t)

# black scholes vega
def bs_vega(r, s, k, sigma, t) :
    d1 = (math.log(s/k) + (r + .5*sigma*sigma)*t)/sigma/math.sqrt(t)
    return math.sqrt(t)*s*norm.pdf(d1)

f = lambda s : bs_call(r=.05, s=100, k=150., sigma=s, t=1) - 1.
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
f2 = lambda x : norm.cdf(x) + .05*x - .5
g2 = lambda x, a, b : norm.pdf(x) + .05

fig = figure(figsize=[12, 3.5])
subplot(1, 2, 1)
sigs = np.arange(.01, .5, .01)
bsv = np.array([f(s) for s in sigs])
plot(sigs, bsv, '-')
axhline(color = 'r');
xlabel('Volatility')
ylabel('Price');
title('Black Scholes Call')
subplot(1, 2, 2)
x2 = np.arange(-5, 5, .1)
plot(x2, f2(x2));
xlabel('$x$')
axhline(color = 'r');
xlim(-5, 5)
title('$\Phi(x) + .05x - 0.5 = 0$');

Bisection

initial gueses must bucket the solution
the function must be continous
converges to the true root linearly (absolute error halves for each iteration).
- $\vert \epsilon_{n+1} \vert \le c \vert \epsilon_{n} \vert $, $c = \frac{1}{2}$ for Bisection
- the term linear is rather confusing, as the error actually reduces exponentially.

In [4]:

import scipy.optimize as opt
a = .1
b = .45

def cumf (x, func, xs) :
    xs.append(x)
    return func(x)

xs = []
nx = opt.bisect(cumf, a, b, args=(f, xs))

def show_converge(f, xs, a, b, maxIter, tag, diagline = True, txty = .5) :
    figure(figsize = [12, 3.5])
    subplot(1, 2, 1)

    xr = np.arange(a*.9, b*1.1, (b-a)/500.)
    bsv = np.array(map(f, xr))

    plot(xr, bsv, '-')
    axhline(color = 'k');

    for i in range(maxIter) :
        plot([xs[i], xs[i]], [0, f(xs[i])])
        scatter(xs[i], txty*(1. if f(xs[i]) < 0 else -1.), 
                s=150, marker = "$%d$"% i)
        if diagline and i< (maxIter-1):
            plot([xs[i], xs[i+1]], [f(xs[i]), 0])

    xlim(a*.9, b*1.1);
    ylim(f(a) - .2, f(b) + .2);
    xlabel('$x$')
    ylabel('$f(x)$')
    title(tag)

    subplot(1, 2, 2)
    es = np.abs(map(f, xs))
    semilogy(es, '.');
    xlabel('Iterations')
    title('Error');
    
show_converge(f, xs, .1, .45, 6, 'Bisection', False)

Newton-Ralphson

converges quadratically: $\vert \epsilon_{n+1} \vert \le c \; \epsilon_n^2$.

In [5]:

xs = []
nx = opt.newton(cumf, b, g, args=(f, xs))

show_converge(f, xs, 0.2, .45, 3, 'Newton-Ralphson', True)

convergence is not guaranteed, it can fail around inflection point
guaranteed convergence if the function is convex everywhere.

In [6]:

xs = []

try : 
    nx = opt.newton(cumf, 2, g2, args=(f2, xs))
except RuntimeError:
    pass

show_converge(f2, xs, -5, 9.5, 3, 'Newton-Ralphson', True, .1)

Secant

Similar to Newton-Raphson, but use previous valuations to approximate the derivatives

In [7]:

xs = []
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
nx = opt.newton(cumf, b, args=(f, xs))

show_converge(f, xs, 0.2, .45, 4, 'Secant', True)

Secant is also not guaranteed to converge.

In [8]:

xs = []
try : 
    nx = opt.newton(cumf, 3, args=(f2, xs))
except RuntimeError:
    pass

show_converge(f2, xs, -11, 11, 5, 'Secant', True, .1)

Brent

An adaptive algorithm.
Guaranteed to find a solution if initial guesses bucket the root
Usually converges very fast
Often the default choice in practice

In [9]:

xs = []
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
nx = opt.brentq(cumf, a, b, args=(f, xs))

show_converge(f, xs, 0.1, .45, 5, 'Brent', False)

The challenging $\Phi(x) + .05x - 0.5 = 0$ converges under Brent:

In [10]:

xs = []
nx = opt.brentq(cumf, -1, 4, args=(f2, xs))

show_converge(f2, xs, -1.5, 5, 5, 'Brent', True, .1)

Interpolation

interpolate: verb, insert (something) between fixed points

Common interpolation methods

From a discrete set of knot values $x_i, y_i$, reconstruct the whole function $f: x \rightarrow y$.

Common interpolation methods:

Linear
Polynomial
Cubic spline

We will cover in detail:

Tension spline
Maximum entropy (future lecture)

Linear interpolation

In [11]:

x = np.array([.1, 1.,2, 3., 5., 10., 25.])
y = np.array([.0025, .01, .016,.02, .025, .03, .035])
plot(x, y, '-o');
xlabel('Year')
ylabel('Rates');
title('Linear Interpolation');

Piecewise constant interpolation

values equals to the observation to the right (or left).
simple (maybe even silly) but useful in curve building
works well with bootstrap

In [12]:

m = dict(zip(x, y))
m.update(dict(zip(x[:-1] + 1e-6, y[1:])))
k, v = zip(*sorted(m.items()))
plot(k, v, 'o-')
title('Piecewise Constant');

Polynomial interpolation

Fit a polynomial function of order $n-1$ to the $n$ observations.
The exact solution is known as the Lagrange polynomials:

$$ p(x) = \sum_{i=0}^n(\prod_{1\le j \le n, j\ne i}\frac{x-x_j}{x_i-x_j})y_i $$

It is rarely used in practice because ...

In [13]:

z = np.polyfit(x, y, len(x)-1)
p = np.poly1d(z)
x1d = arange(0, 26, .1)
plot(x1d, p(x1d))
plot(x, y, 'o');
title('Polynomial');

Spline interpolation

Spline is an elastic string or pipe, used by draftsmen to draw smooth curves through fixed knots before the computer age:

Spline interpolation is a mathematical model for the physical spline.

Cubic spline

Cubic spline uses a 3rd order polynomial for each line section between knots

$$q_i(x) = a_i + b_i x + c_i x^2 + d_i x^3 \;,\;\; x_i < x < x_{i+1}$$

Enforce continuity up to the 2nd derivatives:

\begin{eqnarray} q_{i-1}(x_i) &=& q_{i}(x_i) \\ q'_{i-1}(x_i) &=& q'_{i}(x_i) \\ q''_{i-1}(x_i) &=& q''_{i}(x_i) \end{eqnarray}

Boundary conditions

For $n$ knots, there are $n-1$ line sections with $4(n-1)$ parameters

$n$ constraint for the hitting the $n$ knot values
3 continuity constraints for each intermediate knot: $3(n-2)$
$4n-6$ constraints in total

Two additional constraints required to uniquely determine the cubic spline:

Natural Spline: $s''_1(x_1) = s''_{n-1}(x_{n}) = 0$
End slope conditions: $s'_1(x_1) = c_0, s'_{n-1}(x_{n}) = c_1$

Cubic spline examples

Spline interpolation is very smooth

However, it does not preserve monotonicity and convexity

it could overshoot
can result in arbitrages in many situations

In [14]:

import lin

ts = lin.RationalTension(0.)
ts.build(x, y)

x2 = np.array([.01, .03, .07, .1, .6])
y2 = np.array([.0095, .023, .035, .040, .042])

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(x1d, ts.value(x1d))
plot(x, y, 'o');
title('Cubic Spline');

subplot(1, 2, 2)
ts.build(x2, y2)
x2d = arange(.01, .65, .01)
plot(x2d, ts.value(x2d))
plot(x2, y2, 'o');
xlabel('Detachement')
title('CDO Tranche Expected Loss');

Convexity

Option like instruments' prices are convex functions:

$$ v(s, k) = \mathbb{E}[\max(s - k, 0)] $$

No arbitrage requires $\frac{\partial^2 v}{\partial s^2} > 0$ and $\frac{\partial^2 v}{\partial k^2} > 0 $, because:

$$\frac{\partial^2}{\partial x^2}\mathbb{E}[\max(x, 0)] = \mathbb{E}[\frac{\partial^2}{\partial x^2} \max(x, 0)] = \mathbb{E}[\delta(x)] = p(x) $$

$\delta(x)$ is the Dirac delta function, $\int_{-\infty}^{\infty} f(x) \delta(a) dx = f(a)$
$p(x)$ is the probability density function of $s$
the tranche example is concave because its payoff is $k - \mathbb{E}[\max(s-k, 0)]$

Exponential tension spline

A tension spline is originally defined as: $f^{(4)}(x) - \lambda f''(x) = 0$

it models a physical spline stretchd by a tension force of $\lambda$
it reduces to cubic spline when $\lambda = 0$
with $\lambda \rightarrow \infty$, the spline is streched to piece-wise straight lines
the solution is a combination of exponential and linear functions of $x$

Rational tension spline

Generic tension spline has the following property:

has a scaler tension parameter $\lambda$
when $\lambda = 0$, it reduces to cubic spline
it converges to piecewise linear interpolation uniformly as $\lambda$ increases

Rational tension spline is a convenient form of generic tension spline:

In [15]:

import fmt

a, b, c, d, t = sp.symbols('a b c d t', real = True)
l = sp.symbols('lambda', positive=True)
f = sp.Function('f')
df = sp.Function("\dot{f}")
ddf = sp.Function("\ddot{f}")

r = (a + b*t + c*t**2 + d*t**3)/(1 + l*t*(1-t))
s_x, xi, xii, xi_ = sp.symbols('x, x_i, x_{i+1}, x_{i-1}', real=True)

fmt.displayMath(fmt.joinMath('=', f(t), r), fmt.joinMath('=', t, (s_x-xi)/(xii-xi)))

$$ f{\left (t \right )}=\frac{a + b t + c t^{2} + d t^{3}}{\lambda t \left(- t + 1\right) + 1}\;,\;\;\;t=\frac{x - x_{i}}{- x_{i} + x_{{i+1}}}$$

it is more tractable than the exponential tension spline as only polynomial functions are involved
solvable the same way as cubic spline, $\lambda$ is a known parameter

Constraints

The rationale tension spline's values and derivatives at $t=0$ and $t=1$:

In [16]:

from fmt import math2df
pd.options.display.max_colwidth=300

def lincollect(e, tms) :
    m = sp.collect(sp.expand(e), tms, evaluate=False)
    return [m[k] if k in m else 0 for k in tms]

coefs = sp.Matrix([a, b, c, d])
drt0 = sp.Matrix([lincollect(r.diff(t, i).subs({t:0}), coefs) for i in range(3)])
drt1 = sp.Matrix([lincollect(r.diff(t, i).subs({t:1}), coefs) for i in range(3)])

In [17]:

derivs = sp.Matrix([f(t), df(t), ddf(t)])
fmt.displayMath(fmt.joinMath('=', derivs.subs({t:0}), drt0), coefs, sep="", pre="\\small")
fmt.displayMath(fmt.joinMath('=', derivs.subs({t:1}), drt1), coefs, sep="", pre="\\small")

$$\small \left(\begin{matrix}f{\left (0 \right )}\\\dot{f}{\left (0 \right )}\\\ddot{f}{\left (0 \right )}\end{matrix}\right)=\left(\begin{matrix}1 & 0 & 0 & 0\\- \lambda & 1 & 0 & 0\\2 \lambda^{2} + 2 \lambda & - 2 \lambda & 2 & 0\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)$$

$$\small \left(\begin{matrix}f{\left (1 \right )}\\\dot{f}{\left (1 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)=\left(\begin{matrix}1 & 1 & 1 & 1\\\lambda & \lambda + 1 & \lambda + 2 & \lambda + 3\\2 \lambda^{2} + 2 \lambda & 2 \lambda^{2} + 4 \lambda & 2 \lambda^{2} + 6 \lambda + 2 & 2 \lambda^{2} + 8 \lambda + 6\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)$$

Note the short-handed notation of $\ddot{f}(1) = \frac{d^2f(t)}{d t^2} \rvert _{t=1}$ etc.
Like cubic spline, there are $4(n-1)$ variables and constraints
Solvable as a $4(n-1)$ dimensional linear systems
However, we can do much better.

A better solution

Consider one line section between $[x_i, x_{i+1}]$:

the four coefficients are fully determined by $f(0)$, $f(1)$, $\ddot{f}(0)$, $\ddot{f}(1)$:

In [18]:

s_v = sp.Matrix([f(0), f(1), ddf(0), ddf(1)])
s_a = sp.Matrix([drt0[0, :], drt1[0, :], drt0[2, :], drt1[2, :]])

fmt.displayMath(s_a, fmt.joinMath('=', coefs, s_v), sep="", pre="\\small")

$$\small \left(\begin{matrix}1 & 0 & 0 & 0\\1 & 1 & 1 & 1\\2 \lambda^{2} + 2 \lambda & - 2 \lambda & 2 & 0\\2 \lambda^{2} + 2 \lambda & 2 \lambda^{2} + 4 \lambda & 2 \lambda^{2} + 6 \lambda + 2 & 2 \lambda^{2} + 8 \lambda + 6\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)=\left(\begin{matrix}f{\left (0 \right )}\\f{\left (1 \right )}\\\ddot{f}{\left (0 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)$$

The linear system can be explicity inverted (by sympy!):

In [19]:

ss = sp.simplify(sp.expand(s_a.inv()))
fmt.displayMath(fmt.joinMath('=', coefs, ss), s_v, sep="")

$$ \left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)=\left(\begin{matrix}1 & 0 & 0 & 0\\\lambda - 1 & 1 & - \frac{\lambda + 2}{2 \lambda^{2} + 8 \lambda + 6} & - \frac{1}{2 \lambda^{2} + 8 \lambda + 6}\\- 2 \lambda & \lambda & \frac{\lambda + \frac{3}{2}}{\lambda^{2} + 4 \lambda + 3} & - \frac{\lambda}{2 \lambda^{2} + 8 \lambda + 6}\\\lambda & - \lambda & - \frac{1}{2 \lambda + 6} & \frac{1}{2 \lambda + 6}\end{matrix}\right)\left(\begin{matrix}f{\left (0 \right )}\\f{\left (1 \right )}\\\ddot{f}{\left (0 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)$$

the resulting $a, b, c, d$ ensures that the rational tension spline hits the target $f(t), \ddot{f}(t)$ at both ends of the line section.
therefore, we can express $\dot{f}(x_i)$ as linear combination of $f$ and $\ddot f$ at knots $x_{i-1}, x_i$ or $x_i, x_{i+1}$

Continuity conditions

Thus $\ddot{f}(x_i)$ at the knots are the only $n$ unknowns, which are sovable from the following $n$ constraints:

End condition (natural spline): $\ddot{f}(x_1) = \ddot{f}(x_{n}) = 0$
Continuity in $\dot{f}(x_i)$ for $n-2$ intermediate knots

Note that $f(\cdot)$ and its derivatives are continuous in $x$, not $t$:

$$ \frac{d^k f(t)}{dt^k} = (x_{i+1}-x_i)^k\frac{d^k f(x)}{dx^k} $$

this is because $t = \frac{x-x_i}{x_{i+1} - x_i}$
what if end conditions are given in $\dot{f}(x_1)$ and $\dot{f}(x_{n})$?

Couninuity in $f'(x_i)$

In [20]:

xl, xr = sp.symbols('x_l, x_r')
vm = {xl:(xi-xi_), xr:(xii-xi)}

sl = sp.Matrix([f(xi_), f(xi), ddf(xi_)*xl**2, ddf(xi)*xl**2])
sr = sp.Matrix([f(xi), f(xii), ddf(xi)*xr**2, ddf(xii)*xr**2])
dl = sp.simplify(sp.expand((drt1[1,:]*ss*sl/xl)[0,0]))
dr = sp.simplify(sp.expand((drt0[1,:]*ss*sr/xr)[0,0]))

terms = sp.Matrix([ddf(xi_), ddf(xi), ddf(xii)])
d1 = sp.collect(sp.expand(dl - dr), terms, evaluate=False)

mr = sp.Matrix([sp.simplify(sp.simplify(d1[k]).subs(vm)) for k in terms])
rhs = sp.simplify(-d1[sp.S.One]).subs(vm)
fmt.displayMath(mr.T, terms, sep="")
fmt.displayMath("\;\;\;\;\;\;\; =", rhs, sep="")
#displayMath(*[fmt.joinMath('=', k, v) for k, v in vm.items()])

$$ \left(\begin{matrix}\frac{x_{i} - x_{{i-1}}}{2 \lambda^{2} + 8 \lambda + 6} & \frac{\frac{\lambda x_{{i+1}}}{2} - \frac{\lambda x_{{i-1}}}{2} + x_{{i+1}} - x_{{i-1}}}{\lambda^{2} + 4 \lambda + 3} & \frac{- x_{i} + x_{{i+1}}}{2 \lambda^{2} + 8 \lambda + 6}\end{matrix}\right)\left(\begin{matrix}\ddot{f}{\left (x_{{i-1}} \right )}\\\ddot{f}{\left (x_{i} \right )}\\\ddot{f}{\left (x_{{i+1}} \right )}\end{matrix}\right)$$

$$ \;\;\;\;\;\;\; =- \frac{f{\left (x_{i} \right )}}{x_{i} - x_{{i-1}}} + \frac{f{\left (x_{{i-1}} \right )}}{x_{i} - x_{{i-1}}} - \frac{f{\left (x_{i} \right )}}{- x_{i} + x_{{i+1}}} + \frac{f{\left (x_{{i+1}} \right )}}{- x_{i} + x_{{i+1}}}$$

this linear system can be written as a tri-diagonal matrix

Tri-diagonal system

The following is the linear system of the 1st interpolation example with $\lambda = 2$:

In [21]:

import trid
ts3 = trid.TridiagTension(2.)
ts3.build(x, y)

fmt.displayMath(sp.Matrix(np.round(ts3.a, 3)), fmt.joinMath('=', "\\ddot{f}(\\boldsymbol x)", #ddf(s_x), 
                sp.Matrix(np.round(ts3.b, 4))), sep="", pre="\\small")

$$\small \left(\begin{matrix}1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0\\0.03 & 0.253 & 0.033 & 0.0 & 0.0 & 0.0 & 0.0\\0.0 & 0.033 & 0.267 & 0.033 & 0.0 & 0.0 & 0.0\\0.0 & 0.0 & 0.033 & 0.4 & 0.067 & 0.0 & 0.0\\0.0 & 0.0 & 0.0 & 0.067 & 0.933 & 0.167 & 0.0\\0.0 & 0.0 & 0.0 & 0.0 & 0.167 & 2.667 & 0.5\\0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 1.0\end{matrix}\right)\ddot{f}(\boldsymbol x)=\left(\begin{matrix}0.0\\-0.0023\\-0.002\\-0.0015\\-0.0015\\-0.0007\\0.0\end{matrix}\right)$$

Thomas algorithm:

a sparse version of the LU decomposition
solves the tri-diagonal system in $O(n)$ operations

what is the complexity of regular LU decomposition?

Tension spline examples

In [22]:

def plotTension(x, y, lbds, xd) :
    plot(x, y, 'o');
    title('Tension Spline');

    for lbd in lbds:
        ts = lin.RationalTension(lbd)
        ts.build(x, y)
        plot(xd, ts.value(xd))
    
    legend(['data'] + ['$\lambda = %.f$' % l for l in lbds], loc='best');

lbds = (0, 2, 10, 50)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plotTension(x, y, lbds, x1d)
subplot(1, 2, 2)
plotTension(x2, y2, lbds, x2d)

converges to piece wise flat curve uniformly with increasing $\lambda$

Perturbation locality

Spline interpolation is global:

changing a single knot affects the whole curve
the perbation is more localized with greater tension parameter $\lambda$

In [23]:

def plotPerturb(x, y, yp, xd, lbds) :
    plot(x, yp-y, 'o')
    for lbd in lbds:
        ts = lin.RationalTension(lbd)
        ts.build(x, y)
        tsp = lin.RationalTension(lbd)
        tsp.build(x, yp)

        plot(xd, tsp.value(xd) - ts.value(xd))
        
    xlabel('$x$')
    ylabel('$y$')
    title('Changes in Spline')
    legend(['knots'] + ['$\lambda = %.f$' % l for l in lbds], loc='best');

dy = .01
figure(figsize=[12, 4])
subplot(1, 2, 1)
idx = 2
yp = np.copy(y)
yp[idx] *= 1. + dy

plotPerturb(x, y, yp, x1d, lbds)

subplot(1, 2, 2)
idx = 1
yp = np.copy(y2)
yp[idx] *= 1. + dy
plotPerturb(x2, y2, yp, x2d, lbds)

Integral of tension spline

Though complicated, the integral is analytical nontheless
The integral can be useful in deriving analytical pricing formulae
- also derived using sympy

In [25]:

s_s0, s_s1, s_s2, s_c1, s_c2, s_com = sp.symbols('s_0, s_1, s_2, c_1, c_2, c_0')
short = sp.simplify(sp.Rational(1, 2)/l/s_com*(s_s0 + s_s1*(s_c1 + s_c2) + s_s2*(s_c1 - s_c2)))
fmt.displayMath(fmt.joinMath('=', sp.Integral(r, (t, 0, tt)), short))

com = sp.sqrt(l*(l+4))

s0 = -d*tt**2*s_com - 2*tt*(c+d)*s_com
s1 = sp.log((sp.sqrt(l+4)+sp.sqrt(l))/(sp.sqrt(l+4)+sp.sqrt(l)-2*sp.sqrt(l)*tt))
s2 = sp.log((sp.sqrt(l+4)-sp.sqrt(l))/(sp.sqrt(l+4)-sp.sqrt(l)+2*sp.sqrt(l)*tt))

c1 = d*s_com/l + s_com*(b+c+d)
c2 = l*(2*a+b+c+d)+(2*c+3*d)

#short = sp.simplify(sp.Rational(1, 2)/l/com*(s0 + s1*(c1 + c2) + s2*(c1 - c2))).subs({s_com:com})

fmt.displayMath(fmt.joinMath('=', s_com, com), fmt.joinMath('=', s_s0, s0), pre="\\small ")
fmt.displayMath(fmt.joinMath('=', s_s1, s1), fmt.joinMath('=', s_s2, s2), pre="\\small ")
fmt.displayMath(fmt.joinMath('=', s_c1, c1), fmt.joinMath('=', s_c2, c2), pre="\\small ")

$$ \int_{0}^{\tau} \frac{a + b t + c t^{2} + d t^{3}}{\lambda t \left(- t + 1\right) + 1}\, dt=\frac{1}{2 c_{0} \lambda} \left(s_{0} + s_{1} \left(c_{1} + c_{2}\right) + s_{2} \left(c_{1} - c_{2}\right)\right)$$

$$\small c_{0}=\sqrt{\lambda} \sqrt{\lambda + 4}\;,\;\;\;s_{0}=- c_{0} d \tau^{2} - 2 c_{0} \tau \left(c + d\right)$$

$$\small s_{1}=\log{\left (\frac{\sqrt{\lambda} + \sqrt{\lambda + 4}}{- 2 \sqrt{\lambda} \tau + \sqrt{\lambda} + \sqrt{\lambda + 4}} \right )}\;,\;\;\;s_{2}=\log{\left (\frac{- \sqrt{\lambda} + \sqrt{\lambda + 4}}{2 \sqrt{\lambda} \tau - \sqrt{\lambda} + \sqrt{\lambda + 4}} \right )}$$

$$\small c_{1}=\frac{c_{0} d}{\lambda} + c_{0} \left(b + c + d\right)\;,\;\;\;c_{2}=2 c + 3 d + \lambda \left(2 a + b + c + d\right)$$

Curve Building

Jessica Simpson: I love my curves

Curve building foundamentals

Inputs: prices of benchmark instruments of the same risk factor, at different maturities
Outputs: a single curve for the underlying factor that explains all observed prices to adequate precision.
Curve building is a fundamental problem in Finance, it can become extremely complicated

Common types of curves

Interest rate: OIS, Libor 3M/6M etc,
CDS (credit default swap)
Commodity
FX
Inflation

Curve terminology

Zero rate (or spot rate, yield, internal rate of return):

$$r(t) = -\frac{1}{t}\log(b(t)) \iff b(t) = e^{-r(t) t} $$

Forward rate:

$$f(t) = -\frac{1}{b(t)}\frac{d b(t)}{d t} \iff b(t) = e^{-\int_{0}^t f(s) ds} $$

Swap rate (continous coupon payments):

$$s(t) = \frac{1 - b(t)}{\int_0^t b(s) ds} \iff s(t) \int_0^t b(s) ds + b(t) = 1 $$

$b(t)$ is the price of risk free zero coupon bond maturing at $t$

Relationship between rates

In [26]:

dt = .1
t = np.arange(dt, 30, dt)

tn = np.array([0, 5, 30])
rn = np.array([.04, .06, .055])

cv = lin.RationalTension(2.)
cv.build(tn, rn)

f = cv(t) + t*cv.deriv(t)
b = exp(-t*cv(t))
intb = np.cumsum(b*dt)
s = (1-b)/intb

plot(t, cv(t));
plot(t, f)
plot(t, s)

legend(['Zero Rate', 'Forward Rate', 'Swap Rate'], loc='best');
xlabel('Time');

zero rate is the average of forward rate:

$$ r(t) = \frac{1}{t} \int_0^t f(s) ds $$

zero rate is a rough approximation to the swap rate:

$$ s(t) = \frac{1 - b(t)}{\int_0^t b(s) ds} \approx \frac{1 - b(t)}{\frac{t}{2} (1 + b(t))} \approx \frac{r(t) t}{\frac{t}{2} (1 + b(t))} \approx r(t) $$

Introduction to CDS

No Default	With Default

CDS is an insurance against a reference entity's default risk

protection buyer: makes quarterly payments of notional$\times$spread$\times$daycount.
protection seller: pays the default loss of notional$\times$(1-rec) when the reference entity defaults before CDS maturity.

Long or short ?

Important market convention: long position is always equivalent to owning the underlying bond

receiver swap is long $\iff$ own a fixed coupon risk free bond
sell CDS protection is long $\iff$ own a fixed coupon risky bond

Deltas are always comunicated by perturbing the market to the longer side, ie, the direction where long positions makes money:

interest rate decrease by 1bps
credit spread compress by 1bps
positive deltas corresponds to long positions

Often the risk computation is done by bumping rates +1bps, then flipping the signs before the final reporting.

IR vs credit terminologies

In a very loose way, the following terms are the counterparts between IR and credit/CDS market (all in continuous compounding):

	Interest Rate	Credit
primary state variable	discount factor $b(t)$	survival probability $p(t)$
yield, IRR	zero rate $r(t) = -\frac{1}{t}\log(b(t))$	quoted spread $q(t) \approx -\frac{1 - \text{rec}}{t}\log(p(t))$
forward rate	forward interest rate $f(t) = -\frac{1}{b(t)}\frac{d b(t)}{d t} $	hazard rate $h(t) = -\frac{1}{p(t)}\frac{d p(t)}{d t} $
par swap rate $s(t)$	par IR swap rate	par CDS spread
cumulative yield	$y(t) = -\log(b(t))$	$g(t) = -\log(p(t)) $

CDS benchmark instruments

The folllowing is a representatitve set of CDS quotes observed in the market:

In [27]:

import inst

def flatCurve(rate) :
    return lambda t : np.exp(-rate*t)

disc = flatCurve(.01)

terms = np.array([.25, .5, 1., 2, 3, 4, 5, 6, 7, 8, 10])
qs = np.array([80, 85, 95, 154, 222, 300, 385, 410, 451, 470, 480])

cps = np.ones(len(terms))*100
insts = [inst.CDS(m, c*1e-4, .4) for m, c in zip(terms, cps)]

#compute the upfront PV of the benchmark instruments
ufr = np.array([i.pv01(disc, flatCurve(f*1e-4))*(c-f)*1e-4 for i, f, c in zip(insts, qs, cps)])

mat_tag = 'Maturity (Y)'
pd.set_option('precision', 4)

cds_data = pd.DataFrame({mat_tag:terms, 'Coupon(bps)':cps, 'Quoted Spread(bps)':qs, 'PV (% Points)': ufr*100.}).set_index(mat_tag)
fmt.displayDF(cds_data.T, "4g")
benchmarks = dict(zip(insts, ufr))

Maturity (Y)	0.25	0.5	1.0	2.0	3.0	4.0	5.0	6.0	7.0	8.0	10.0
Coupon(bps)	100	100	100	100	100	100	100	100	100	100	100
PV (% Points)	0.04978	0.07448	0.0494	-1.05	-3.475	-7.356	-12.58	-15.92	-20.25	-23.6	-28.63
Quoted Spread(bps)	80	85	95	154	222	300	385	410	451	470	480

the CDS contract have standardised quarterly maturites, on Mar-20, Jun-20, Sep-20 and Dec-20
it usually costs more per annum for longer term contracts
for most names, there are only a few liquid CDS instruments, referred as the benchmark instruments
quoted spreads is a way to communicate the upfront payment, it loosely corresponds to the zero rate or yield in IR

What to interpolate?

It is important to choose the right quantity to interpolate

There are multiple potential choices, e..g: $f(t), f'(t)$ or $\int f(t) dt$
The criteria is which viariable is a natural fit for tension spline's stretch
Ideally we want the vairable itself to be continous, but can live with the discontinuities in $f'(t)$ when $\lambda \rightarrow \infty$

Industry standard: $$p(t) = e^{-\int_0^t h(s) ds}$$

build piecewise flat curve in hazard rate $h(t)$ for CDS:
where $p(t)$ is the survival probability of the reference entity at time $t$.

this is quite odd, why don't we use a continuous interpolation in $h(t)$?

such as piece wise linear or cubic spline?

Cumulative hazard

The cumulative hazard $g(t)$ is a suitable variable for the tension spline interpolation:

$$g(t) = \int_0^t h(s) ds = -\log(p(t))$$

$g(t)$ is continuous and monotonic
$g(0) = 0$, the boundary condition is well defined
$h(t) = g'(t)$ can be discountinous when $\lambda \rightarrow \infty$, this reverts to the standard method of piecewise flat hazard rate
with finite $\lambda$, hazard rate is smooth and continuous

This argument carries over to the interest rate curve building:

the cumulative yield, $y(t) = -\log(b(t))$, is a suitable variable to interpolate

Bootstrap

Bootstrap is the standard method to build curves in quatitative finance

Solve the knot values one by one, using benchmark instruments in the increasing order of their maturities

def bootstrap(insts, curve, pfunc, bds) :
    def objf(x, inst, pv) : # return the error to observed price
        curve.addKnot(inst.maturity, x)
        return pfunc(inst, curve) - pv

    for inst, pv in sorted(insts.items(), key = lambda x : x[0].maturity) :
        try :
            opt_x = opt.brentq(objf, bds[0], bds[1], args=(inst, pv))
            curve.addKnot(inst.maturity, opt_x)
        except ValueError:
            print "failed to find root for ", inst

    return curve

Piecewise linear bootstrap

The piecewise linear interpolation works well with bootstrapping, result in exact fit to the benchmark prices
This is the standard approach to build CDS curves in the industry

In [41]:

cdspv = inst.cdspv(disc, inst.zero2disc)

hlin = lin.PiecewiseLinear()
hlin.build(terms, terms*.01)
hlin.addKnot(0, 0)
hlin = inst.bootstrap(benchmarks, hlin, cdspv, bds = [-1., 1.])
cds_data['PV Error (bps) - Linear'] = np.round([1e4*(cdspv(i, hlin) - benchmarks[i]) for i in insts], 4)

xs = np.arange(.01, 10, .01)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(xs, hlin(xs))
plot(terms, hlin(terms), 'o')
xlabel('Time')
title('$g(t)$');

subplot(1, 2, 2)
plot(xs, hlin.deriv(xs))
plot(terms, hlin.deriv(terms-1e-4), 'o')
xlabel('Time')
title('$h(t)$');

In [29]:

fmt.displayDF(cds_data.T, "4g")

Maturity (Y)	0.25	0.5	1.0	2.0	3.0	4.0	5.0	6.0	7.0	8.0	10.0
Coupon(bps)	100	100	100	100	100	100	100	100	100	100	100
PV (% Points)	0.04978	0.07448	0.0494	-1.05	-3.475	-7.356	-12.58	-15.92	-20.25	-23.6	-28.63
Quoted Spread(bps)	80	85	95	154	222	300	385	410	451	470	480
PV Error (bps) - Linear	0	0	0	0	0	0	0	0	0	0	0

Bootsrap with tension spline

In [30]:

def plotboot(tsit, lbd, ax, tagsf) :
    xlabel('Time')    
    
    lbd_tag = '$\\lambda=%.f$' % lbd
    df = pd.DataFrame({'$t$':xs}).set_index(['$t$'])
    
    for tag, f in tagsf.items() :
        df[tag] = f(tsit, xs) 
    df.plot(ax = ax, secondary_y = [tagsf.keys()[1]], title = 'Tension Spline ' + lbd_tag)

tagsf = {"$g(t)$" : lambda cv, xs : cv(xs), "$h(t)$": lambda cv, xs : cv.deriv(xs)}

lbds = (0, 10)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=[12, 4])
tsit0, e0 = inst.iterboot(benchmarks, cdspv, lbd=lbds[0], x0=0, its=1)
plotboot(tsit0, lbds[0], axes[0], tagsf)    

tsit1, e1 = inst.iterboot(benchmarks, cdspv, lbd=lbds[1], x0=0, its=1)
plotboot(tsit1, lbds[1], axes[1], tagsf)

Tension spline interpolation is global:

Changes in longer maturity affects the shorter end of the curve
The benchmark CDS won't reprice exactly after the bootstrapping

In [31]:

df = pd.DataFrame(np.vstack((e0, e1))*1e4, index=['Fit Error (bps) $\\lambda$=%g' % l for l in lbds], 
                  columns = ['%gY' % t for t in terms])
fmt.displayDF(df, "4f")

	0.25Y	0.5Y	1Y	2Y	3Y	4Y	5Y	6Y	7Y	8Y	10Y
Fit Error (bps) $\lambda$=0	-0.0000	-0.0000	0.0352	0.7148	1.1093	1.9968	2.1802	2.6090	2.5612	0.8231	0.0000
Fit Error (bps) $\lambda$=10	-0.0000	-0.0000	0.0098	0.2026	0.3602	0.5634	0.6458	0.7598	0.8039	0.3038	-0.0000

Iterative algorithm

In [32]:

figure(figsize=[12, 4])
ax = subplot(1, 2, 1)

tsit, e = inst.iterboot(benchmarks, cdspv, x0=0, lbd = lbds[0], its=6)
tsit1, e1 = inst.iterboot(benchmarks, cdspv, x0=0, lbd = lbds[0], mixf=0.5, its=6)

plotboot(tsit, lbds[0], ax, tagsf)    

subplot(1, 2, 2)
semilogy(range(1, len(e)+1), np.transpose([np.linalg.norm(e, 2, 1)*1e4, np.linalg.norm(e1, 2, 1)*1e4]), 'o-')
legend(['no mixing $m=0$', 'mixing $m=0.5$'], loc='best')

xlabel('Iteration')
ylabel('bps')
title('L-2 norm of error (bps)');

Iteration is a simple and effective technique to reduce fit error

start with the bootstrapped (inaccurate) curve, redo the boostrapping
error goes below 1bps after one extra iteration in this example
simple iteration often outperforms sophisticated optimizers

Mixing the old and new results between iterations often improves stability:

instead of $x^{i+1} = f(x^i)$, do $x^{i+1} = m f(x^i) + \left(1-m\right) x^i$

Effects of tension parameter $\lambda$

No visible difference in $g(t)$
Big difference in the hazard rate $h(t)$
- the smaller $\lambda$, the smoother the $h(t)$

In [33]:

def pv_lbds(bms, cdspv, lbds, x0) :
    cvs = []
    for lbd in lbds:
        cv, e = inst.iterboot(bms, cdspv, x0, lbd, mixf = 0.5)
        cvs.append(cv)
    
    return cvs

In [34]:

lbds = (0, 2, 10, 50)
tags = ['$\\lambda=%.f$' % l for l in lbds]
cv0 = pv_lbds(benchmarks, cdspv, lbds, 0)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(xs, np.array([cv(xs) for cv in cv0]).T);
title('$H(t)$')
xlabel('$t$')

subplot(1, 2, 2)
plot(xs, np.array([cv.deriv(xs) for cv in cv0]).T);
legend(tags, loc='best');
title('$h(t)$');
xlabel('$t$');

Perturbation locality

Changes in hazard rates are more localized with larger $\lambda$

A highly sought after property in practice

In [35]:

def showPerts(bms, bms_ps, cdspv, lbds, x0, pertf) :
    cv0 = pv_lbds(bms, cdspv, lbds, x0=x0)
    cvp1, cvp2 = [pv_lbds(b, cdspv, lbds, x0=x0) for b in bms_ps]
    
    lbd_tags = ['$\\lambda=%.f$' % lbd for lbd in lbds]

    figure(figsize=[12, 4])

    subplot(1, 2, 1)
    plot(xs, 1e4*np.array([pertf(f, g)(xs) for f, g in zip(cv0, cvp1)]).T);
    xlabel('Tenor')
    ylabel('$\Delta h(t)$ (bps)')
    title('1bps Spread Perturbation @t=%.2f' % pts[0])
    legend(lbd_tags, loc='best');

    subplot(1, 2, 2)
    plot(xs, 1e4*np.array([pertf(f, g)(xs) for f, g in zip(cv0, cvp2)]).T);
    xlabel('Tenor')
    ylabel('$\Delta h(t)$ (bps)')
    title('1bps Spread Perturbation @t=%.2f' % pts[1])
    legend(lbd_tags, loc='best');
    
    
pts = [.5, 5]
bms_ps = [{k if k.maturity != pt else inst.CDS(k.maturity, k.coupon-1e-4, k.recovery) : v 
        for k, v in benchmarks.items()} for pt in pts]
    
showPerts(benchmarks, bms_ps, cdspv, lbds, 0, lambda f, g : lambda xs : f.deriv(xs) - g.deriv(xs))

Interpolate the hazard rate?

The result is disastrous:

When $\lambda \rightarrow \infty$, it approaches linear interpolation in the hazard rate
Linearly interpolating $h(t)$ leads to zigs and zags

In [36]:

hazard_pv = inst.cdspv(disc, inst.fwd2disc)
x0 = .01
chartf = {'$h(t)$' : lambda cv, xs : cv(xs), 'H(t)' : lambda cv, xs : cv.integral(xs)}

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=[12, 4])

lbds = (0, 20)
tsit0, e0 = inst.iterboot(benchmarks, hazard_pv, x0, lbds[0], mixf=.5)
plotboot(tsit0, lbds[0], axes[0], chartf)    

tsit1, e1 = inst.iterboot(benchmarks, hazard_pv, x0, lbds[1], mixf=.5)
plotboot(tsit1, lbds[1], axes[1], chartf)

Perturbation in a single tenor generates waves throughout the hazard rate term structure.

In [37]:

showPerts(benchmarks, bms_ps, hazard_pv, lbds, x0, lambda f, g : lambda xs : f(xs) - g(xs))

State variable

Generally, it is always better to interpolate the state variable:

state variable is the primary determinant of the benchmark price, such as the cumulative hazard $g(t) = -\log(p(t))$
the derivatives, like $h(t) = g'(t)$, only changes the prices through its cumulative effects (integral), which leads to zigzags and waves when combined with a continuous interpolation.

bootstrap derivative variables + continuous interpolation = DISASTER

Lecure 4: Rootfinding and Interpolation

Topics

Rootfinding

Common rootfinding methods

Illustrative Examples:

Bisection

Newton-Ralphson

Secant

Brent

Interpolation

Common interpolation methods

Linear interpolation

Piecewise constant interpolation

Polynomial interpolation

Spline interpolation

Cubic spline

Boundary conditions

Cubic spline examples

Convexity

Exponential tension spline

Rational tension spline

Constraints

A better solution

Continuity conditions

Couninuity in $f'(x_i)$

Tri-diagonal system

Tension spline examples

Perturbation locality

Integral of tension spline

Curve Building

Curve building foundamentals

Common types of curves

Curve terminology

Relationship between rates

Introduction to CDS

Long or short ?

IR vs credit terminologies

CDS benchmark instruments

What to interpolate?

Cumulative hazard

Bootstrap

Piecewise linear bootstrap

Bootsrap with tension spline

Iterative algorithm

Effects of tension parameter $\lambda$

Perturbation locality

Interpolate the hazard rate?

State variable

Assignments