Lecture 4: Rootfinding and Interpolation¶

Topics¶

Root finding
Interpolation
Curve building

$\renewcommand{bs}{\boldsymbol}$

Rootfinding¶

The Bible: the love of money is the root of all evil
Mark Twain: the lack of money is the root of all evil
Ayn Rand：money is the root of all good

Common rootfinding methods¶

Search the root of a scalar function $f(x) = 0$, common methods:

Bisection
Secant
Newton-Ralphson
Brent

Illustrative Examples:¶

compute the implied volatility of a call option.
a more challenging problem of $\Phi(x) + .05x - 0.5 = 0$

In [3]:

from scipy.stats import norm
lecture = 4

# black scholes call
def bs_call(r, s, k, sigma, t) :
    d1 = (math.log(s/k) + (r + .5*sigma*sigma)*t)/sigma/math.sqrt(t)
    d2 = d1 - sigma*math.sqrt(t)
    return norm.cdf(d1)*s - norm.cdf(d2)*k*math.exp(-r*t)

# black scholes vega
def bs_vega(r, s, k, sigma, t) :
    d1 = (math.log(s/k) + (r + .5*sigma*sigma)*t)/sigma/math.sqrt(t)
    return math.sqrt(t)*s*norm.pdf(d1)

f = lambda s : bs_call(r=.05, s=100, k=150., sigma=s, t=1) - 1.
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
f2 = lambda x : norm.cdf(x) + .05*x - .5
g2 = lambda x, a, b : norm.pdf(x) + .05

fig = figure(figsize=[12, 3.5])
subplot(1, 2, 1)
sigs = np.arange(.01, .5, .01)
bsv = np.array([f(s) for s in sigs])
plot(sigs, bsv, '-')
axhline(color = 'r');
xlabel('Volatility')
ylabel('Price');
title('Black Scholes Call')
subplot(1, 2, 2)
x2 = np.arange(-5, 5, .1)
plot(x2, f2(x2));
xlabel('$x$')
axhline(color = 'r');
xlim(-5, 5)
title('$\Phi(x) + .05x - 0.5 = 0$');

Bisection¶

initial guess must bucket the solution: find $x\in [a,b]$ such that $f(x) = 0$, given
- $f(a)*f(b) \leq 0$
the function must be continuous
converges to a root linearly (absolute error halves for each iteration).
- $\epsilon_{n+1} \le c \epsilon_n^q, \;\epsilon_n = \left| x_n - x_{root}\right|,\; q=1, \;c=\frac{1}{2}$

In [4]:

import scipy.optimize as opt
a = .1
b = .45

def cumf (x, func, xs) :
    xs.append(x)
    return func(x)

xs = []
nx = opt.bisect(cumf, a, b, args=(f, xs))

def show_converge(f, xs, a, b, maxIter, tag, diagline = True, txty = .5) :
    figure(figsize = [12, 3.5])
    subplot(1, 2, 1)

    xr = np.arange(a*.9, b*1.1, (b-a)/500.)
    bsv = np.array(list(map(f, xr)))

    plot(xr, bsv, '-')
    axhline(color = 'k');

    for i in range(maxIter) :
        plot([xs[i], xs[i]], [0, f(xs[i])])
        scatter(xs[i], txty*(1. if f(xs[i]) < 0 else -1.), 
                s=150, marker = "$%d$"% i)
        if diagline and i< (maxIter-1):
            plot([xs[i], xs[i+1]], [f(xs[i]), 0])

    xlim(a*.9, b*1.1);
    ylim(f(a) - .2, f(b) + .2);
    xlabel('$x$')
    ylabel('$f(x)$')
    title(tag)

    subplot(1, 2, 2)
    es = np.abs(list(map(f, xs)))
    semilogy(es, '.');
    xlabel('Iterations')
    title('Error');
    
show_converge(f, xs, .1, .45, 6, 'Bisection', False)

Secant¶

Uses the secant line to find successive approximation

$$ x_n = x_{n-1} - f(x_{n-1})\frac{x_{n-1} - x_{n-2}}{f(x_{n-1}) - f(x_{n-2})} $$

Converges faster than Bisection method with $\epsilon_{n+1} \le c\epsilon_n^q, \; q\approx 1.618 $
Can be thought of as poor man's choice of the Newton's method --- also referred to as a quasi-Newton method.

In [5]:

xs = []
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
nx = opt.newton(cumf, b, args=(f, xs))

show_converge(f, xs, 0.2, .45, 4, 'Secant', True)

Secant method is not guaranteed to converge.
Suffers for wiggly functions with small/vanishing derivative.

In [6]:

xs = []
try : 
    nx = opt.newton(cumf, 3, args=(f2, xs))
except RuntimeError:
    pass

show_converge(f2, xs, -11, 11, 5, 'Secant', True, .1)

Newton-Ralphson¶

Use first derivative to approximate the root

$$ x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})} $$

converges quadratically: $\epsilon_{n+1} \le c\epsilon_n^q,\; q=2$.

In [7]:

xs = []
nx = opt.newton(cumf, b, g, args=(f, xs))

show_converge(f, xs, 0.2, .45, 3, 'Newton-Ralphson', True)

convergence is not guaranteed, it can fail around inflection point
guaranteed convergence if the function is convex everywhere.

In [8]:

xs = []

try : 
    nx = opt.newton(cumf, 2, g2, args=(f2, xs))
except RuntimeError:
    pass

show_converge(f2, xs, -5, 9.5, 3, 'Newton-Ralphson', True, .1)

Brent¶

An adaptive algorithm---mixes root bracketing, bisection and
Inverse quadratic interpolation (Secant uses linear interpolation)
Guaranteed to find a solution if initial guesses bucket the root
Usually converges very fast $\epsilon_{n+1} \le c\epsilon_n^q, \; q \approx 1.839$.
Often the default choice in practice

In [9]:

xs = []
g = lambda s, foo, bar : bs_vega(r=.05, s=100, k=150., sigma=s, t=1) 
nx = opt.brentq(cumf, a, b, args=(f, xs))

show_converge(f, xs, 0.1, .45, 5, 'Brent', False)

The challenging $\Phi(x) + .05x - 0.5 = 0$ converges under Brent:

In [10]:

xs = []
nx = opt.brentq(cumf, -1, 4, args=(f2, xs))

show_converge(f2, xs, -1.5, 5, 5, 'Brent', True, .1)

Interpolation¶

interpolate: verb, insert (something) between fixed points

Common interpolation methods¶

From a discrete set of knot values $x_i, y_i$, find a function $f: x \rightarrow y$.

that hits all the knot values

Common interpolation methods:

Linear
Polynomial
Cubic spline

We will cover in detail:

Tension spline
Maximum entropy (if time permits)

Linear interpolation¶

In [11]:

x = np.array([.1, 1.,2, 3., 5., 10., 25.])
y = np.array([.0025, .01, .016,.02, .025, .030, .035])
plot(x, y, '-o');
xlabel('Year')
ylabel('Rates');
title('Linear Interpolation');

Piecewise constant interpolation¶

values equals to the observation to the right (or left).
simple (maybe even silly) but useful in curve building
works well with bootstrap

In [12]:

m = dict(zip(x, y))
m.update(dict(zip(x[:-1] + 1e-6, y[1:])))
k, v = zip(*sorted(m.items()))
plot(k, v, 'o-')
title('Piecewise Constant');

Polynomial interpolation¶

Fit a polynomial function of order $n-1$ to the $n$ observations.
The exact solution is known as the Lagrange polynomials:

$$ p(x) = \sum_{i=0}^n(\prod_{1\le j \le n, j\ne i}\frac{x-x_j}{x_i-x_j})y_i $$

It is rarely used in practice because ...

In [13]:

z = np.polyfit(x, y, len(x)-1)
p = np.poly1d(z)
x1d = arange(0, 26, .1)
plot(x1d, p(x1d))
plot(x, y, 'o');
title('Polynomial');

Spline interpolation¶

Spline is an elastic string or pipe, used by draftsmen to draw smooth curves through fixed knots before the computer age:

Spline interpolation is a mathematical model for the physical spline.

Cubic spline¶

Cubic spline uses a 3rd order polynomial for each line section between knots

$$q_i(x) = a_i + b_i x + c_i x^2 + d_i x^3 \;,\;\; x_i < x < x_{i+1}$$

Enforce continuity up to the 2nd derivatives:

\begin{eqnarray} q_{i-1}(x_i) &=& q_{i}(x_i) \\ q'_{i-1}(x_i) &=& q'_{i}(x_i) \\ q''_{i-1}(x_i) &=& q''_{i}(x_i) \end{eqnarray}

Boundary conditions¶

For $n$ knots, there are $n-1$ line sections with $4(n-1)$ parameters

$n$ constraint for the hitting the $n$ knot values
3 continuity constraints for each intermediate knot: $3(n-2)$
$4n-6$ constraints in total

Two additional constraints required to uniquely determine the cubic spline:

Natural Spline: $s''_1(x_1) = s''_{n-1}(x_{n}) = 0$
End slope conditions: $s'_1(x_1) = c_0, s'_{n-1}(x_{n}) = c_1$

Cubic spline examples¶

Spline interpolation is very smooth

However, it does not preserve monotonicity and convexity

it could overshoot
can result in arbitrages in many situations

In [14]:

import lin

ts = lin.RationalTension(0.)
ts.build(x, y)

x2 = np.array([.01, .03, .07, .1, .6])
y2 = np.array([.0095, .023, .035, .040, .042])

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(x1d, ts.value(x1d))
plot(x, y, 'o');
title('Cubic Spline');

subplot(1, 2, 2)
ts.build(x2, y2)
x2d = arange(.01, .65, .01)
plot(x2d, ts.value(x2d))
plot(x2, y2, 'o');
xlabel('Detachement')
title('CDO Tranche Expected Loss');

Convexity¶

Option-like instruments' prices are convex functions:

$$ v(k) = \mathbb{E}[\max(\tilde s - k, 0)] $$

No arbitrage requires $\frac{\partial^2 v}{\partial k^2} > 0 $, because:

$$\small \frac{\partial^2}{\partial k^2}\mathbb{E}[\max(\tilde s-k, 0)] = \mathbb{E}[\frac{\partial^2}{\partial k^2} \max(\tilde s-k, 0)] = \mathbb{E}[\delta(\tilde s-k)] = p(k) $$

$\delta(x)$ is the Dirac delta function: $\int_{-\infty}^{\infty} f(x) \delta(x-a) dx = f(a)$
$p(\cdot)$ is the probability density function of $\tilde s$
The expected loss of a 0 to $k$ tranche is concave because:
- $v(k) = \mathbb{E}[\min(\tilde l, k)] = k - \mathbb{E}[\max(k-\tilde l, 0)]$

Exponential tension spline¶

A tension spline is originally defined as: $f^{(4)}(x) - \lambda f''(x) = 0$

it models a physical spline stretched by a tension force
$\lambda$ is the ratio of the tension force over the rigidity of the spline
it reduces to cubic spline when $\lambda = 0$
with $\lambda \rightarrow \infty$, the spline is stretched to piece-wise straight lines
the solution is a combination of exponential and linear functions of $x$

Rational tension spline¶

Generic tension spline has the following property:

has a scalar tension parameter $\lambda$
when $\lambda = 0$, it reduces to cubic spline
it converges to piecewise linear interpolation uniformly as $\lambda$ increases

Rational tension spline is a convenient form of generic tension spline:

In [15]:

import fmt

a, b, c, d, t = sp.symbols('a b c d t', real = True)
l = sp.symbols('lambda', positive=True)
f = sp.Function('f')
df = sp.Function("\dot{f}")
ddf = sp.Function("\ddot{f}")

r = (a + b*t + c*t**2 + d*t**3)/(1 + l*t*(1-t))
s_x, xi, xii, xi_ = sp.symbols('x, x_i, x_{i+1}, x_{i-1}', real=True)

fmt.displayMath(fmt.joinMath('=', f(t), r), fmt.joinMath('=', t, (s_x-xi)/(xii-xi)))

$$ f{\left (t \right )}=\frac{a + b t + c t^{2} + d t^{3}}{\lambda t \left(- t + 1\right) + 1}\;,\;\;\;t=\frac{x - x_{i}}{- x_{i} + x_{i+1}}$$

it is more tractable than the exponential tension spline as only polynomial functions are involved
solvable the same way as cubic spline, $\lambda$ is a known parameter

Constraints¶

The rationale tension spline's values and derivatives at $t=0$ and $t=1$:

In [16]:

from fmt import math2df
pd.options.display.max_colwidth=300

def lincollect(e, tms) :
    m = sp.collect(sp.expand(e), tms, evaluate=False)
    return [m[k] if k in m else 0 for k in tms]

coefs = sp.Matrix([a, b, c, d])
drt0 = sp.Matrix([lincollect(r.diff(t, i).subs({t:0}), coefs) for i in range(3)])
drt1 = sp.Matrix([lincollect(r.diff(t, i).subs({t:1}), coefs) for i in range(3)])

In [17]:

derivs = sp.Matrix([f(t), df(t), ddf(t)])
fmt.displayMath(fmt.joinMath('=', derivs.subs({t:0}), drt0), coefs, sep="", pre="\\scriptsize")
fmt.displayMath(fmt.joinMath('=', derivs.subs({t:1}), drt1), coefs, sep="", pre="\\scriptsize")

$$\scriptsize \left(\begin{matrix}f{\left (0 \right )}\\\dot{f}{\left (0 \right )}\\\ddot{f}{\left (0 \right )}\end{matrix}\right)=\left(\begin{matrix}1 & 0 & 0 & 0\\- \lambda & 1 & 0 & 0\\2 \lambda^{2} + 2 \lambda & - 2 \lambda & 2 & 0\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)$$

$$\scriptsize \left(\begin{matrix}f{\left (1 \right )}\\\dot{f}{\left (1 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)=\left(\begin{matrix}1 & 1 & 1 & 1\\\lambda & \lambda + 1 & \lambda + 2 & \lambda + 3\\2 \lambda^{2} + 2 \lambda & 2 \lambda^{2} + 4 \lambda & 2 \lambda^{2} + 6 \lambda + 2 & 2 \lambda^{2} + 8 \lambda + 6\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)$$

Note the short-handed notation of $\ddot{f}(1) = \frac{d^2f(t)}{d t^2} \rvert _{t=1}$ etc.
$\lambda$ is a known constant
Like cubic spline, there are $4(n-1)$ variables and constraints
Solvable as a $4(n-1)$ dimensional linear systems ~ $O((4n)^3)$

A better solution¶

Consider one line section between $[x_i, x_{i+1}]$:

the four coefficients are fully determined by $f(0)$, $f(1)$, $\ddot{f}(0)$, $\ddot{f}(1)$:

In [18]:

s_v = sp.Matrix([f(0), f(1), ddf(0), ddf(1)])
s_a = sp.Matrix([drt0[0, :], drt1[0, :], drt0[2, :], drt1[2, :]])

fmt.displayMath(s_a, fmt.joinMath('=', coefs, s_v), sep="", pre="\\scriptsize")

$$\scriptsize \left(\begin{matrix}1 & 0 & 0 & 0\\1 & 1 & 1 & 1\\2 \lambda^{2} + 2 \lambda & - 2 \lambda & 2 & 0\\2 \lambda^{2} + 2 \lambda & 2 \lambda^{2} + 4 \lambda & 2 \lambda^{2} + 6 \lambda + 2 & 2 \lambda^{2} + 8 \lambda + 6\end{matrix}\right)\left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)=\left(\begin{matrix}f{\left (0 \right )}\\f{\left (1 \right )}\\\ddot{f}{\left (0 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)$$

The linear system can be explicitly inverted (by sympy!):

In [19]:

ss = sp.simplify(sp.expand(s_a.inv()))
fmt.displayMath(fmt.joinMath('=', coefs, ss), s_v, sep="\\scriptsize")

$$ \left(\begin{matrix}a\\b\\c\\d\end{matrix}\right)=\left(\begin{matrix}1 & 0 & 0 & 0\\\lambda - 1 & 1 & - \frac{\lambda + 2}{2 \lambda^{2} + 8 \lambda + 6} & - \frac{1}{2 \lambda^{2} + 8 \lambda + 6}\\- 2 \lambda & \lambda & \frac{\lambda + \frac{3}{2}}{\lambda^{2} + 4 \lambda + 3} & - \frac{\lambda}{2 \lambda^{2} + 8 \lambda + 6}\\\lambda & - \lambda & - \frac{1}{2 \lambda + 6} & \frac{1}{2 \left(\lambda + 3\right)}\end{matrix}\right)\scriptsize\left(\begin{matrix}f{\left (0 \right )}\\f{\left (1 \right )}\\\ddot{f}{\left (0 \right )}\\\ddot{f}{\left (1 \right )}\end{matrix}\right)$$

the resulting $a, b, c, d$ ensures that the rational tension spline hits the target $f(t), \ddot{f}(t)$ at both ends of the line section.
$\dot{f}(x_i)$ becomes linear combinations of $f$ and $\ddot f$ for line sections $[x_{i-1}, x_i]$ or $[x_i, x_{i+1}]$
- the $\dot{f}(x_i)$ from the left and right line sections must equal

Continuity conditions¶

Thus $\ddot{f}(x_i)$ at the knots are the only $n$ unknowns, which are solvable from the following $n$ constraints:

End condition (natural spline): $\ddot{f}(x_1) = \ddot{f}(x_{n}) = 0$
Continuity in $\dot{f}(x_i)$ for $n-2$ intermediate knots

Note that $f(\cdot)$ and its derivatives are continuous in $x$, not $t$:

$$ \frac{d^k f(t)}{dt^k} = (x_{i+1}-x_i)^k\frac{d^k f(x)}{dx^k} $$

this is because $t = \frac{x-x_i}{x_{i+1} - x_i}$
what if end conditions are given in $\dot{f}(x_1)$ and $\dot{f}(x_{n})$?

Continuity in $f'(x_i)$¶

In [20]:

xl, xr = sp.symbols('x_l, x_r')
vm = {xl:(xi-xi_), xr:(xii-xi)}

sl = sp.Matrix([f(xi_), f(xi), ddf(xi_)*xl**2, ddf(xi)*xl**2])
sr = sp.Matrix([f(xi), f(xii), ddf(xi)*xr**2, ddf(xii)*xr**2])
dl = sp.simplify(sp.expand((drt1[1,:]*ss*sl/xl)[0,0]))
dr = sp.simplify(sp.expand((drt0[1,:]*ss*sr/xr)[0,0]))

terms = sp.Matrix([ddf(xi_), ddf(xi), ddf(xii)])
d1 = sp.collect(sp.expand(dl - dr), terms, evaluate=False)

mr = sp.Matrix([sp.simplify(sp.simplify(d1[k]).subs(vm)) for k in terms])
rhs = sp.simplify(-d1[sp.S.One]).subs(vm)
fmt.displayMath(mr.T, terms, sep="")
fmt.displayMath("\;\;\;\;\;\;\; =", rhs, sep="")
#displayMath(*[fmt.joinMath('=', k, v) for k, v in vm.items()])

$$ \left(\begin{matrix}\frac{x_{i} - x_{i-1}}{2 \left(\lambda^{2} + 4 \lambda + 3\right)} & \frac{\frac{\lambda x_{i+1}}{2} - \frac{\lambda x_{i-1}}{2} + x_{i+1} - x_{i-1}}{\lambda^{2} + 4 \lambda + 3} & \frac{- x_{i} + x_{i+1}}{2 \left(\lambda^{2} + 4 \lambda + 3\right)}\end{matrix}\right)\left(\begin{matrix}\ddot{f}{\left (x_{i-1} \right )}\\\ddot{f}{\left (x_{i} \right )}\\\ddot{f}{\left (x_{i+1} \right )}\end{matrix}\right)$$

$$ \;\;\;\;\;\;\; =- \frac{f{\left (x_{i} \right )}}{x_{i} - x_{i-1}} + \frac{f{\left (x_{i-1} \right )}}{x_{i} - x_{i-1}} - \frac{f{\left (x_{i} \right )}}{- x_{i} + x_{i+1}} + \frac{f{\left (x_{i+1} \right )}}{- x_{i} + x_{i+1}}$$

this linear system can be written as a tri-diagonal matrix

Tri-diagonal system¶

The following is the linear system of the 1st interpolation example with $\lambda = 2$:

In [21]:

import trid
ts3 = trid.TridiagTension(2.)
ts3.build(x, y)

fmt.displayMath(sp.Matrix(np.round(ts3.a, 3)), fmt.joinMath('=', "\\boldsymbol{\\ddot{f}}", #ddf(s_x), 
                sp.Matrix(np.round(ts3.b, 4))), sep="", pre="\\small")

$$\small \left(\begin{matrix}1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0\\0.03 & 0.253 & 0.033 & 0.0 & 0.0 & 0.0 & 0.0\\0.0 & 0.033 & 0.267 & 0.033 & 0.0 & 0.0 & 0.0\\0.0 & 0.0 & 0.033 & 0.4 & 0.067 & 0.0 & 0.0\\0.0 & 0.0 & 0.0 & 0.067 & 0.933 & 0.167 & 0.0\\0.0 & 0.0 & 0.0 & 0.0 & 0.167 & 2.667 & 0.5\\0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 1.0\end{matrix}\right)\boldsymbol{\ddot{f}}=\left(\begin{matrix}0.0\\-0.0023\\-0.002\\-0.0015\\-0.0015\\-0.0007\\0.0\end{matrix}\right)$$

Thomas algorithm:

a sparse version of the LU decomposition
solves the tri-diagonal system in $O(n)$ operations

What is the complexity of regular LU decomposition?

Tension spline examples¶

In [22]:

def plotTension(x, y, lbds, xd) :
    plot(x, y, 'o');
    title('Tension Spline');

    for lbd in lbds:
        ts = lin.RationalTension(lbd)
        ts.build(x, y)
        plot(xd, ts.value(xd))
    
    legend(['data'] + ['$\lambda = %.f$' % l for l in lbds], loc='best');

lbds = (0, 2, 10, 50)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plotTension(x, y, lbds, x1d)
subplot(1, 2, 2)
plotTension(x2, y2, lbds, x2d)

converges to piece wise flat curve uniformly with increasing $\lambda$

Perturbation locality¶

Spline interpolation is global:

changing a single knot affects the whole curve
the perturbation is more localized with greater tension parameter $\lambda$

In [23]:

def plotPerturb(x, y, yp, xd, lbds) :
    plot(x, yp-y, 'o')
    for lbd in lbds:
        ts = lin.RationalTension(lbd)
        ts.build(x, y)
        tsp = lin.RationalTension(lbd)
        tsp.build(x, yp)

        plot(xd, tsp.value(xd) - ts.value(xd))
        
    xlabel('$x$')
    ylabel('$y$')
    title('Changes in Spline')
    legend(['knots'] + ['$\lambda = %.f$' % l for l in lbds], loc='best');

dy = .01
figure(figsize=[12, 4])
subplot(1, 2, 1)
idx = 3
yp = np.copy(y)
yp[idx] *= 1. + dy

plotPerturb(x, y, yp, x1d, lbds)

subplot(1, 2, 2)
idx = 1
yp = np.copy(y2)
yp[idx] *= 1. + dy
plotPerturb(x2, y2, yp, x2d, lbds)

Integral of tension spline¶

Though complicated, the integral is analytical nonetheless
The integral can be useful in deriving analytical pricing formulae
- derived using sympy

In [25]:

s_s0, s_s1, s_s2, s_c1, s_c2, s_com = sp.symbols('s_0, s_1, s_2, c_1, c_2, c_0')
short = sp.simplify(sp.Rational(1, 2)/l/s_com*(s_s0 + s_s1*(s_c1 + s_c2) + s_s2*(s_c1 - s_c2)))
fmt.displayMath(fmt.joinMath('=', sp.Integral(r, (t, 0, tt)), short), pre="\\scriptsize ")

com = sp.sqrt(l*(l+4))

s0 = -d*tt**2*s_com - 2*tt*(c+d)*s_com
s1 = sp.log((sp.sqrt(l+4)+sp.sqrt(l))/(sp.sqrt(l+4)+sp.sqrt(l)-2*sp.sqrt(l)*tt))
s2 = sp.log((sp.sqrt(l+4)-sp.sqrt(l))/(sp.sqrt(l+4)-sp.sqrt(l)+2*sp.sqrt(l)*tt))

c1 = d*s_com/l + s_com*(b+c+d)
c2 = l*(2*a+b+c+d)+(2*c+3*d)

#short = sp.simplify(sp.Rational(1, 2)/l/com*(s0 + s1*(c1 + c2) + s2*(c1 - c2))).subs({s_com:com})

fmt.displayMath(fmt.joinMath('=', s_com, com), fmt.joinMath('=', s_s0, s0), pre="\\scriptsize ")
fmt.displayMath(fmt.joinMath('=', s_s1, s1), fmt.joinMath('=', s_s2, s2), pre="\\scriptsize ")
fmt.displayMath(fmt.joinMath('=', s_c1, c1), fmt.joinMath('=', s_c2, c2), pre="\\scriptsize ")

$$\scriptsize \int_{0}^{\tau} \frac{a + b t + c t^{2} + d t^{3}}{\lambda t \left(- t + 1\right) + 1}\, dt=\frac{1}{2 c_{0} \lambda} \left(s_{0} + s_{1} \left(c_{1} + c_{2}\right) + s_{2} \left(c_{1} - c_{2}\right)\right)$$

$$\scriptsize c_{0}=\sqrt{\lambda} \sqrt{\lambda + 4}\;,\;\;\;s_{0}=- c_{0} d \tau^{2} - 2 c_{0} \tau \left(c + d\right)$$

$$\scriptsize s_{1}=\log{\left (\frac{\sqrt{\lambda} + \sqrt{\lambda + 4}}{- 2 \sqrt{\lambda} \tau + \sqrt{\lambda} + \sqrt{\lambda + 4}} \right )}\;,\;\;\;s_{2}=\log{\left (\frac{- \sqrt{\lambda} + \sqrt{\lambda + 4}}{2 \sqrt{\lambda} \tau - \sqrt{\lambda} + \sqrt{\lambda + 4}} \right )}$$

$$\scriptsize c_{1}=\frac{c_{0} d}{\lambda} + c_{0} \left(b + c + d\right)\;,\;\;\;c_{2}=2 c + 3 d + \lambda \left(2 a + b + c + d\right)$$

Curve Building in Finance¶

In life, as in art, the beautiful moves in curves.

Curve building fundamentals¶

Inputs: prices of benchmark instruments of the same risk factor, at different maturities
Outputs: a single curve for the underlying factor that explains all observed prices to adequate precision.
Curve building is a fundamental problem in Finance, it can become extremely complicated

Common types of curves¶

Interest rate: OIS, Libor 3M/6M etc,
CDS (credit default swap)
Commodity
FX
Inflation

Curve terminology¶

$b(t)$ is the price of risk free zero coupon bond maturing at $t$:

Zero rate (or spot rate, yield, internal rate of return):

$$r(t) = -\frac{1}{t}\log(b(t)) \iff b(t) = e^{-r(t) t} $$

Forward rate:

$$f(t) = -\frac{1}{b(t)}\frac{d b(t)}{d t} \iff b(t) = e^{-\int_{0}^t f(s) ds} $$

Swap rate (continuous coupon):

$$s(t) = \frac{1 - b(t)}{\int_0^t b(s) ds} \iff s(t) \int_0^t b(s) ds + b(t) = 1 $$

Relationship between rates¶

In [26]:

dt = .1
t = np.arange(dt, 30, dt)

tn = np.array([0, 5, 30])
rn = np.array([.04, .06, .055])

cv = lin.RationalTension(2.)
cv.build(tn, rn)

f = cv(t) + t*cv.deriv(t)
b = exp(-t*cv(t))
intb = np.cumsum(b*dt)
s = (1-b)/intb

plot(t, cv(t));
plot(t, f)
plot(t, s)

legend(['Zero Rate', 'Forward Rate', 'Swap Rate'], loc='best');
xlabel('Time');

zero rate is the average of forward rate:

$$ r(t) = \frac{1}{t} \int_0^t f(s) ds $$

zero rate is an approximation to the swap rate (when $b(t) \approx 1$):

$$ s(t) = \frac{1 - b(t)}{\int_0^t b(s) ds} \approx \frac{1 - e^{-r(t)t}}{t} \approx \frac{r(t) t}{t} = r(t) $$

Introduction to CDS¶

No Default	With Default

CDS is an insurance against a reference entity's default risk

protection buyer: makes quarterly payments of notional$\times$spread$\times$daycount.
protection seller: pays the default loss of notional$\times$(1-rec) when the reference entity defaults before CDS maturity.

Long or short ?¶

Important market convention: long position is always equivalent to owning the underlying bond

receiver swap is long $\iff$ own a fixed coupon risk free bond
sell CDS protection is long $\iff$ own a fixed coupon risky bond

Deltas are always communicated by perturbing the market to the longer side, i.e., the direction where long positions makes money:

interest rate decrease by 1bps
credit spread compress by 1bps
positive deltas always corresponds to long positions

Often the risk computation is done by bumping rates +1bps, then flipping the signs before the final reporting.

IR vs credit terminologies¶

In a very loose way, the following terms are the counterparts between IR and credit/CDS market (all in continuous compounding):

	Interest Rate	Credit
primary state variable	discount factor $b(t)$	survival probability $p(t)$
yield, IRR	zero rate $r(t) = -\frac{1}{t}\log(b(t))$	quoted spread $q(t) \approx -\frac{1 - \text{rec}}{t}\log(p(t))$
forward rate	forward interest rate $f(t) = -\frac{1}{b(t)}\frac{d b(t)}{d t} $	hazard rate $h(t) = -\frac{1}{p(t)}\frac{d p(t)}{d t} $
par swap rate $s(t)$	par IR swap rate	par CDS spread
cumulative yield	$y(t) = -\log(b(t))$	$g(t) = -\log(p(t)) $

CDS benchmark instruments¶

The following is a representative set of CDS quotes observed in the market:

In [27]:

import inst

def flatCurve(rate) :
    return lambda t : np.exp(-rate*t)

disc = flatCurve(.01)

terms = np.array([.25, .5, 1., 2, 3, 4, 5, 6, 7, 8, 10])
qs = np.array([80, 85, 95, 154, 222, 300, 385, 410, 451, 470, 480])

cps = np.ones(len(terms))*100
insts = [inst.CDS(m, c*1e-4, .4) for m, c in zip(terms, cps)]

#compute the upfront PV of the benchmark instruments
ufr = np.array([i.pv01(disc, flatCurve(f*1e-4))*(c-f)*1e-4 for i, f, c in zip(insts, qs, cps)])

mat_tag = 'Maturity (Y)'
pd.set_option('precision', 4)

cds_data = pd.DataFrame({mat_tag:terms, 'Coupon(bps)':cps, 'Quoted Spread(bps)':qs, 'PV (% Points)': ufr*100.}).set_index(mat_tag)
fmt.displayDF(cds_data.T, "4g", fontsize=2)
benchmarks = dict(zip(insts, ufr))

Maturity (Y)	0.25	0.5	1.0	2.0	3.0	4.0	5.0	6.0	7.0	8.0	10.0
Coupon(bps)	100	100	100	100	100	100	100	100	100	100	100
Quoted Spread(bps)	80	85	95	154	222	300	385	410	451	470	480
PV (% Points)	0.04978	0.07448	0.0494	-1.05	-3.475	-7.356	-12.58	-15.92	-20.25	-23.6	-28.63

the CDS contract have standardised semi-annual maturities, on Mar-20 and Sep-20
it usually costs more per annum for longer term contracts
for most names, there are only a few liquid CDS instruments, referred as the benchmark instruments
quoted spreads is a way to communicate the upfront payment, it loosely corresponds to the zero rate or yield in IR

What to interpolate?¶

It is important to choose the right quantity to interpolate

There are many potential choices, e..g: $f(t), f'(t)$ or $\int_0^t f(s) ds$ etc.
Which variable is a natural fit for tension spline's stretch operation?
We want the variable itself to be continuous, but can live with the discontinuities in $f'(t)$ when $\lambda \rightarrow \infty$

Industry standard: $$p(t) = e^{-\int_0^t h(s) ds}$$

build piecewise flat curve in hazard rate $h(t)$ for CDS:
where $p(t)$ is the survival probability of the reference entity at time $t$.

this is quite odd, why don't we use a continuous interpolation in $h(t)$?

such as piece wise linear or cubic spline?

Triangular dependency¶

The objective:

build a curve in hazard rate $\bs h(t)$ that exactly reproduce all observed CDS prices quoted in $\bs q(t)$

	$h(3m)$	$h(6m)$	$h(1Y)$	$h(2Y)$	$h(5Y)$
$q(3m)$	X
$q(6m)$	*	X
$q(1Y)$	*	*	X
$q(2Y)$	*	*	*	X
$q(5Y)$	*	*	*	*	X

Bootstrap is the standard method to build curves with triangular dependency:

Solve the knot values one by one, using benchmark instruments with increasing maturities

Piecewise flat bootstrap¶

This is the standard approach to build CDS curves in the industry
$g(t)$ is the cumulative hazard: $g(t) = \int_0^t h(s) ds = -\log(p(t))$
Piecewise flat in hazard rate $h(t)$ is equivalent to piecewise linear in $g(t)$

In [28]:

cdspv = inst.cdspv(disc, inst.zero2disc)

hlin = lin.PiecewiseLinear()
hlin.build(terms, terms*.01)
hlin.addKnot(0, 0)
hlin = inst.bootstrap(benchmarks, hlin, cdspv, bds = [-1., 1.])
cds_data['PV Error (bps) - Linear'] = np.round([1e4*(cdspv(i, hlin) - benchmarks[i]) for i in insts], 4)

xs = np.arange(.01, 10, .01)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(xs, hlin(xs))
plot(terms, hlin(terms), 'o')
xlabel('Time')
title('$g(t)$');

subplot(1, 2, 2)
plot(xs, hlin.deriv(xs))
plot(terms, hlin.deriv(terms-1e-4), 'o')
xlabel('Time')
title('$h(t)$');

Pros and cons of piecewise flat bootstrap¶

All benchmark prices are exactly matched, after bootstrap with piecewise flat hazard rate.
Discontinuity in hazard rate causes hedging problems, risk profiles jump over the roll dates.

In [29]:

fmt.displayDF(cds_data.T, "4g", fontsize=2)

Maturity (Y)	0.25	0.5	1.0	2.0	3.0	4.0	5.0	6.0	7.0	8.0	10.0
Coupon(bps)	100	100	100	100	100	100	100	100	100	100	100
Quoted Spread(bps)	80	85	95	154	222	300	385	410	451	470	480
PV (% Points)	0.04978	0.07448	0.0494	-1.05	-3.475	-7.356	-12.58	-15.92	-20.25	-23.6	-28.63
PV Error (bps) - Linear	-0	-0	-0	-0	-0	-0	0	0	-0	-0	-0

How to remove the discontinuity in hazard rate?

applying cubic spline to hazard rate seems to be a natural idea, does it work?

Cumulative hazard¶

The cumulative hazard $g(t)$ is a suitable variable for the tension spline interpolation:

$$g(t) = \int_0^t h(s) ds = -\log(p(t))$$

$g(t)$ is continuous and monotonic, 1 to 1 with $p(t)$
$g(0) = 0$, the boundary condition is well defined
$h(t) = g'(t)$ can be discontinuous when $\lambda \rightarrow \infty$, this reverts to the standard method of piecewise flat hazard rate
with finite $\lambda$, hazard rate is smooth and continuous

This argument carries over to the interest rate curve building:

the cumulative yield, $y(t) = -\log(b(t))$, is a suitable variable to apply tension spline

Bootstrap with tension spline¶

In [30]:

def plotboot(tsit, lbd, ax, tagsf) :
    xlabel('Time')    
    
    lbd_tag = '$\\lambda=%.f$' % lbd
    df = pd.DataFrame({'$t$':xs}).set_index(['$t$'])
    
    for tag, f in tagsf :
        df[tag] = f(tsit, xs) 
    
    df.plot(ax = ax, secondary_y = [tagsf[0][0]], title = 'Tension Spline ' + lbd_tag)
    plot(terms, tsit(terms), 'o')

tagsf = [("$g(t)$", lambda cv, xs : cv(xs)), ("$h(t)$", lambda cv, xs : cv.deriv(xs))]

lbds = [0, 2, 10, 100]
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=[12, 8])

es = []

for lbd, ax in zip(lbds, axes.flatten()) :
    tsit, e = inst.iterboot(benchmarks, cdspv, lbd=lbd, x0=0, its=1)
    plotboot(tsit, lbd, ax, tagsf)  
    es.append(e[0])

Bootstrapping with global interpolation¶

The benchmark CDS won't reprice exactly after the bootstrapping with tension spline:

Changes in longer maturity affects the shorter end of the curve
The dependency matrix is no longer triangular, but close

In [31]:

df = pd.DataFrame(np.array(es)*1e4, index=['Fit Error (bps) $\\lambda$=%g' % l for l in lbds], 
                  columns = ['%gY' % t for t in terms])
fmt.displayDF(df, "4f", fontsize=2)

	0.25Y	0.5Y	1Y	2Y	3Y	4Y	5Y	6Y	7Y	8Y	10Y
Fit Error (bps) $\lambda$=0	-0.0000	-0.0000	0.0352	0.7148	1.1093	1.9968	2.1802	2.6090	2.5612	0.8231	0.0000
Fit Error (bps) $\lambda$=2	-0.0000	-0.0000	0.0232	0.4637	0.7908	1.2876	1.4579	1.7260	1.7838	0.6388	0.0000
Fit Error (bps) $\lambda$=10	-0.0000	-0.0000	0.0098	0.2026	0.3602	0.5634	0.6458	0.7598	0.8039	0.3038	-0.0000
Fit Error (bps) $\lambda$=100	-0.0000	-0.0000	0.0013	0.0281	0.0507	0.0782	0.0900	0.1057	0.1128	0.0435	0.0000

Recall how we force matrix to be triangular using QR algorithm?

Iterative algorithm¶

Iteration is an effective method for near triangular dependency:

$$\bs x^{i+1} = f(\bs x^i)$$

where $f(\cdot)$ represent the bootstrap operation, $\bs x^i$ is knot solutions
$\bs x^0$ is a initial guess, e.g. a flat curve
simple and effective, usually outperforms sophisticated optimizers

Iteration with mixing¶

Mixing the old and new results between iterations improves stability:

$$\bs x^{i+1} = m f(\bs x^i) + \left(1-m\right) \bs x^i$$

$m = 0.5$ is often a good choice

In [32]:

figure(figsize=[12, 4])
ax = subplot(1, 2, 1)

tsit, e = inst.iterboot(benchmarks, cdspv, x0=0, lbd = lbds[0], its=6)
tsit1, e1 = inst.iterboot(benchmarks, cdspv, x0=0, lbd = lbds[0], mixf=0.5, its=6)

plotboot(tsit, lbds[0], ax, tagsf)    

subplot(1, 2, 2)
semilogy(range(1, len(e)+1), np.transpose([np.linalg.norm(e, 2, 1)*1e4, np.linalg.norm(e1, 2, 1)*1e4]), 'o-')
legend(['no mixing $m=0$', 'mixing $m=0.5$'], loc='best')

xlabel('Iteration')
ylabel('bps')
title('L-2 norm of error (bps)');

Effects of tension parameter $\lambda$¶

No visible difference in $g(t)$
Big difference in the hazard rate $h(t)$
- the smaller $\lambda$, the smoother the $h(t)$

In [33]:

def pv_lbds(bms, cdspv, lbds, x0) :
    cvs = []
    for lbd in lbds:
        cv, e = inst.iterboot(bms, cdspv, x0, lbd, mixf = 0.5)
        cvs.append(cv)
    
    return cvs

In [34]:

lbds = (0, 2, 10, 50)
tags = ['$\\lambda=%.f$' % l for l in lbds]
cv0 = pv_lbds(benchmarks, cdspv, lbds, 0)

figure(figsize=[12, 4])
subplot(1, 2, 1)
plot(xs, np.array([cv(xs) for cv in cv0]).T);
title('$g(t)$')
xlabel('$t$')

subplot(1, 2, 2)
plot(xs, np.array([cv.deriv(xs) for cv in cv0]).T);
legend(tags, loc='best');
title('$h(t)$');
xlabel('$t$');

Perturbation locality¶

Changes in hazard rates are more localized with larger $\lambda$

A highly desirable property in practice

In [35]:

def showPerts(bms, bms_ps, cdspv, lbds, x0, pertf) :
    cv0 = pv_lbds(bms, cdspv, lbds, x0=x0)
    cvp1, cvp2 = [pv_lbds(b, cdspv, lbds, x0=x0) for b in bms_ps]
    
    lbd_tags = ['$\\lambda=%.f$' % lbd for lbd in lbds]

    figure(figsize=[12, 4])

    subplot(1, 2, 1)
    plot(xs, 1e4*np.array([pertf(f, g)(xs) for f, g in zip(cv0, cvp1)]).T);
    xlabel('Tenor')
    ylabel('$\Delta h(t)$ (bps)')
    title('1bps Spread Perturbation @t=%.2f' % pts[0])
    legend(lbd_tags, loc='best');
    plot(terms, 1e4*np.array([pertf(f, g)(terms) for f, g in zip(cv0, cvp1)]).T, '.');

    subplot(1, 2, 2)
    plot(xs, 1e4*np.array([pertf(f, g)(xs) for f, g in zip(cv0, cvp2)]).T);
    xlabel('Tenor')
    ylabel('$\Delta h(t)$ (bps)')
    title('1bps Spread Perturbation @t=%.2f' % pts[1])
    legend(lbd_tags, loc='best');
    plot(terms, 1e4*np.array([pertf(f, g)(terms) for f, g in zip(cv0, cvp2)]).T, '.');
    
    
pts = [.5, 5]
bms_ps = [{k if k.maturity != pt else inst.CDS(k.maturity, k.coupon-1e-4, k.recovery) : v 
        for k, v in benchmarks.items()} for pt in pts]
    
showPerts(benchmarks, bms_ps, cdspv, lbds, 0, lambda f, g : lambda xs : f.deriv(xs) - g.deriv(xs))

Interpolate the hazard rate?¶

The result is disastrous:

When $\lambda \rightarrow \infty$, it approaches linear interpolation in the hazard rate
Linearly interpolating $h(t)$ leads to zigs and zags

In [36]:

hazard_pv = inst.cdspv(disc, inst.fwd2disc)
x0 = .01
chartf = [('$h(t)$',  lambda cv, xs : cv(xs)), ('H(t)', lambda cv, xs : cv.integral(xs))]

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=[12, 4])

lbds = (0, 20)
tsit0, e0 = inst.iterboot(benchmarks, hazard_pv, x0, lbds[0], mixf=.5)
plotboot(tsit0, lbds[0], axes[0], chartf)    

tsit1, e1 = inst.iterboot(benchmarks, hazard_pv, x0, lbds[1], mixf=.5)
plotboot(tsit1, lbds[1], axes[1], chartf)

Perturbation in a single tenor generates waves throughout the hazard rate term structure.

In [37]:

showPerts(benchmarks, bms_ps, hazard_pv, lbds, x0, lambda f, g : lambda xs : f(xs) - g(xs))

State variable¶

Generally, it is always better to interpolate the state variable:

state variable is the primary determinant of the benchmark price, such as the cumulative hazard $g(t) = -\log(p(t))$
the derivatives, like $h(t) = g'(t)$, only changes the prices through its cumulative effects (integral), which leads to zigzags and waves when combined with a continuous interpolation.

bootstrap derivative variables + continuous interpolation = DISASTER

Lecture 4: Rootfinding and Interpolation¶

Topics¶

Rootfinding¶

Common rootfinding methods¶

Illustrative Examples:¶

Bisection¶

Secant¶

Newton-Ralphson¶

Brent¶

Interpolation¶

Common interpolation methods¶

Linear interpolation¶

Piecewise constant interpolation¶

Polynomial interpolation¶

Spline interpolation¶

Cubic spline¶

Boundary conditions¶

Cubic spline examples¶

Convexity¶

Exponential tension spline¶

Rational tension spline¶

Constraints¶

A better solution¶

Continuity conditions¶

Continuity in $f'(x_i)$¶

Tri-diagonal system¶

Tension spline examples¶

Perturbation locality¶

Integral of tension spline¶

Curve Building in Finance¶

Curve building fundamentals¶

Common types of curves¶

Curve terminology¶

Relationship between rates¶

Introduction to CDS¶

Long or short ?¶

IR vs credit terminologies¶

CDS benchmark instruments¶

What to interpolate?¶

Triangular dependency¶

Piecewise flat bootstrap¶

Pros and cons of piecewise flat bootstrap¶

Cumulative hazard¶

Bootstrap with tension spline¶

Bootstrapping with global interpolation¶

Iterative algorithm¶

Iteration with mixing¶

Effects of tension parameter $\lambda$¶

Perturbation locality¶

Interpolate the hazard rate?¶

State variable¶

Assignments¶