Scientific Computing in Finance

  • MATH-GA 2048, Spring 2015
  • Courant Institute of Mathematical Sciences
  • New York University

Instructors

Hongwei Cheng

  • Head of financial model validation, Prudential
  • Ph.D. Mathematics, New York University

Yadong Li

  • Head of trading book risk modeling, Barclays
  • Ph.D. in Physics, M.S. in Computer Sciences, University of Wisconsin-Madison
  • Master of Financial Engineering, University of California, Berkeley

Teaching Assistant

Maulik Desai

  • Ph.D. EECS, Columbia University

Lecture 1: Introduction

Topics

  • Introduction
  • Software practices
  • Errors and floating point computation

Objectives of this class

  • Teach fundamental principles of scientific computing
  • Teach the most common numerical techniques in quantitative Finance
  • Develop intuitions and essential skills using real world examples
  • Help students to start a career in quantitative Finance

Levels of understanding

  • description: able to talk about the problem
  • insight: understood the dynamics and key drivers
  • abstraction (model): formulated it as a mathematical problem
  • knowledge: solved the mathematical problem
  • experience (skill): applied the theory to real world, understood the practical limitations
  • intuition: able to see through, make connections and sensibly reason about new problems

We expect you

  • Know basic calculus and linear algebra
  • Know basic stochastic calculus and derivative pricing theories
  • Have some prior programming experience
  • Are willing to learn through hard work

Major fields in quantitative finance

  • Derivative pricing and hedging
  • Risk management and regulatory capital
  • Portfolio management
  • Quantitative strategy/Algo trading

Main topics in this class

A rough schedule of the class:

Week    Main Topics                         Practical Problems
1       Introduction and Error Analysis     Software practice
2-3     Linear Algebra                      Portfolio optimization, PCA, least squares
4       Root finding and interpolation      Curve building
5       Derivatives and integration         Hedging, risk transformation
7-8     Optimization                        Model calibration
9-11    Monte Carlo simulation              Exotic pricing, max entropy, allocation
12-13   ODE/PDE                             Pricing

Text book

  • Principles of Scientific Computing (online book)

    D. Bindel and J. Goodman, 2009

References

  • Interest Rate Modeling, Volume I: Foundations and Vanilla Models
  • Interest Rate Modeling, Volume II: Term Structure Models

    L. Andersen and V. Piterbarg, 2010

  • Monte Carlo Methods in Financial Engineering

    P. Glasserman, 2003

Lecture notes and homework

Available online at: http://yadongli.github.io/nyumath2048

Grades

  • Homework 50%
  • Project 30%
  • Final 20%
  • Extra credits for pointing out errors in slides/homework

Software Practices

Modern hardware

  • Processor speed/throughput continues to grow at an impressive rate
    • Moore's law has ruled for a long time
    • recent trends: multi-core, GPU
  • Memory capacity continues to grow, but memory speed only grows at a (relatively) slow rate

Memory hierarchy

Storage     Bandwidth   Latency         Capacity
L-1 Cache   98 GB/s     4 cycles        32KB/core
L-2 Cache   50 GB/s     10 cycles       256KB/core
L-3 Cache   30 GB/s     40-75 cycles    8MB shared
RAM         15 GB/s     60-100 ns       2-128GB
Disk        50 MB/s     ~5 ms           $\infty$
Network     varies      much longer     $\infty$
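
To get a feel for the table above, the short sketch below (array size and variable names are illustrative choices, not part of the course material) sums the same number of doubles from contiguous and from strided memory; the strided version is typically several times slower despite identical arithmetic.

import time
import numpy as np

n = 1000 * 1000
a = np.random.rand(16 * n)                 # ~128MB of doubles, far larger than any cache
b = a.reshape(n, 16)

def timed(f):
    t0 = time.time()
    f()
    return time.time() - t0

# both reductions add exactly n doubles, but the strided one drags 16x more memory
# through the cache (one useful double per 128-byte stride), so it is bandwidth bound
t_contig  = timed(lambda: a[:n].sum())     # contiguous reads: a cache line serves 8 doubles
t_strided = timed(lambda: b[:, 0].sum())   # stride of 16 doubles: mostly wasted bandwidth

print "contiguous sum: %.4fs" % t_contig
print "strided sum   : %.4fs" % t_strided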

Key to high performance

  • Understand whether the problem is computation bound or bandwidth bound
    • Most problems in practice are bandwidth bound
  • Computation bound problems:
    • caching
    • vectorization/parallelization/GPU
    • optimize your "computational kernel"
  • Bandwidth bound problems:
    • optimize cache and memory access
    • require highly specialized skills and low level programming (like in C/C++/Fortran)
  • Premature optimization for speed is the root of all evil
    • Simplicity and generality of the code are often sacrificed
    • Execution speed is usually not the most critical factor in practice
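
As a small illustration of the vectorization point in the list above (a sketch with an arbitrary array size), the same sum of squares is computed with an interpreted Python loop and with a single vectorized numpy call:

import time
import numpy as np

x = np.random.rand(1000000)

t0 = time.time()
s_loop = 0.
for v in x:                  # interpreted loop: one bytecode dispatch per element
    s_loop += v * v
t_loop = time.time() - t0

t0 = time.time()
s_vec = np.dot(x, x)         # vectorized: a single call into optimized C/BLAS code
t_vec = time.time() - t0

print "loop %.3fs, vectorized %.4fs, same answer: %s" % (t_loop, t_vec, np.isclose(s_loop, s_vec))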

What's before performance

Correctness

  • "those who would give up correctness for a little temporary performance deserve neither correctness nor performance"
  • "the greatest performance improvement of all is when system goes from not-working to working"

Modifiability

  • "modification is undesirable but modifiability is paramount"
  • modular design, learn from hardware designers

Simplicity

  • no obvious defects is not the same as obviously no defects
  • simplicity allows humans to correctly reason about systems

Other desirable features:

  • Scalability, robustness, easy support and maintenance, etc.

In practice, what often trumps everything else is:

Time to market !!

Conventional wisdom for programming

The typical advice includes:

  • choose good names for your variables and functions
  • write comments and documents
  • write tests
  • keep coding style consistent

This is all good advice, but quite superficial.

  • all of them are compromised, even in very successful projects
  • they are usually not the critical differentiator between successes and failures

Advice for the real world:

by Paul Phillips (quoting George Orwell): "We're doing it all wrong":

War is peace

  • Fight your war at the boundary so that you can have peace within
  • Challenge your clients, customers and collaborators to limit your deliverable
  • Clearly define a minimal set of interactions to the outside world

Ignorance is strength

  • Design dumb modules that only do one thing, and only one thing
  • Keep it simple and stupid
  • Separation of concerns, modular

Freedom is slavery

  • Must be disciplined in testing, code review, releases etc.
  • Freedom $\rightarrow$ possibilities $\rightarrow$ complexity $\rightarrow$ unmodifiable code $\rightarrow$ slavery

Sun Tzu: "the supreme art of modelling is to solve problems without coding"

Choose the right programming tools

  • Programming paradigm: procedural, OOP, functional
  • Expressiveness: high level vs. low level
  • Ecosystem: library, tools, community, industry support

There is no single right answer

  • Different job calls for different tools
  • But most of the time, the choice is already made (by someone else)

Expressiveness

When you have the chance, choose the more expressive language for the job.

Eric Raymond, "How to Become a Hacker":

"Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

We use Python

Python

  • a powerful dynamic and strongly typed scripting language
  • extremely popular in scientific computing
  • strong momentum in quantitative finance
  • highly productive, low learning curve

Python ecosystem

  • numpy/scipy: high quality matrix and numerical library
  • matplotlib: powerful visualization library
  • IPython: highly productive interactive working environment
  • Pandas: powerful data and time series analysis package

Python resources

  • Enthought Canopy: a commercial Python distribution, free for students
  • Anaconda: another popular and free Python distribution
  • Google and Youtube

Python version:

  • Python 2.7.x : python 3 is not backward compatible
  • With the latest IPython, numpy/pandas/scipy etc.
  • either 32bit or 64bit version

Python setup instructions:

Errors and floating point computation

James Gosling: "95% of the folks out there are completely clueless about floating-point"

Absolute and relative error

If the true value $a$ is approximated by $\hat{a}$:

  • Absolute error: $\hat{a} - a$
  • Relative error: $\frac{\hat{a} - a}{a}$

Both measures are useful in practice.

  • Beware of the absolute values when comparing relative errors
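
For example (a tiny sketch with made-up numbers), an absolute error that looks negligible can still be a large relative error when the true value itself is small:

a_true = 1e-4                  # hypothetical true value
a_hat  = 1.1e-4                # its approximation
print "absolute error = %.1e" % (a_hat - a_true)
print "relative error = %.1f%%" % ((a_hat - a_true)/a_true*100)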

Sources of error

Impreciseness in the theory or model itself

  • George Box: "essentially, all models are wrong, but some are useful"

Errors in numerical algorithm

  • Convergence error
    • Monte Carlo simulation
    • Numerical integration
  • Truncation and discretization
  • Termination of iterations

Error due to machine representation

  • Floating point number
  • Rounding

We cover the last category first.

IEEE 754 standard

First published in 1985, the standard includes:

  • floating point number representations
    • 16bit, 32bit, 64bit etc
    • special representation for NaN, Inf etc
  • rounding
    • default rule: round to nearest, ties to even digit
      • e.g., round to 4 significant digit: $3.1245 \rightarrow 3.124, 43515 \rightarrow 43520$
    • addition and multiplication are still commutative, but not associative
    • intermediate calculations could be done at higher precision to minimize rounding error
      • e.g. Intel CPU's FPU uses 80bit representation for intermediate calculation
  • basic operations
    • arithmetic: addition, multiplication, sqrt, etc.
    • conversion: between formats, to/from string, etc.
  • exception handling
    • invalid operation, divide by zero, overflow, underflow, inexact
    • the operation returns a value (could be NaN, Inf or 0), and set an error flag
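
A one-line illustration of the associativity caveat in the list above (the constants are arbitrary):

# addition is commutative but not associative: the result is rounded after every
# operation, so the grouping of the operands matters
lhs = (0.1 + 0.2) + 0.3
rhs = 0.1 + (0.2 + 0.3)
print "%.17g vs %.17g, equal? %s" % (lhs, rhs, lhs == rhs)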

Significance of IEEE 754

The IEEE 754 is a tremendously successful standard:

  • it allows users to consistently reason about floating point computation across platforms
    • before IEEE 754, floating point calculations behaved differently across hardware/vendors
  • IEEE 754 made compromise for practicality
    • many of them do not make sense in a strict mathematical sense
    • but they are necessary evils for the greater good
  • it establishes a clear interface for hardware designers and software developers
  • its main designer, William Kahan won the Turing award in 1989

The IEEE 754 standard is widely implemented in modern hardware and software,

  • 99% of CPUs are at least 99% IEEE 754 compliant
  • the OS/software situation is more complicated
    • Java's floating point arithmetic is not fully compliant without the strictfp flag (don't use it!)
    • Compiled languages like C++ depend on the hardware's implementation.

IEEE 754 floating point number format

  • IEEE floating point number standard

$$\underbrace{*}_{\text{sign bit}}\;\underbrace{********}_{\text{exponent}}\overbrace{\;1.\;}^{\text{implicit}}\;\underbrace{***********************}_{\text{fraction}} $$

$$(-1)^{\text{sign}}(\text{1 + fraction})2^{\text{exponent} - p}$$

  • 32 bit vs. 64 bit format
Format             Sign   Fraction (incl. the implicit bit)   Exponent   Bias $p$
IEEE 754 - 32 bit  1      24                                  8          127
IEEE 754 - 64 bit  1      53                                  11         1023
  • an implicit leading bit of 1 and the binary point before fraction bits
  • 0 and max exponent reserved for special interpretations: 0, NaN, Inf etc.

Examples of floating numbers

In [26]:
%pylab inline

import bitstring
import pandas as pd
import numpy as np
import fmt
import math, sys

pd.set_option("display.max_colwidth", 80)

def parts2f(sign, e, sig) :
    # reassemble a float from its sign bit, unbiased exponent and significand
    return ((-1)**sign*sig*2.**e)

def f2parts(v, flen) :
    # split a 32 or 64 bit float into its sign, exponent and fraction parts
    ps = {32: 9, 64 : 12}     # number of sign + exponent bits
    ed = ps[flen]
    eoff = 2**(ed - 2) - 1    # exponent bias: 127 for 32 bit, 1023 for 64 bit

    f1 = bitstring.BitArray(float=v, length=flen)
    signb = f1[:1]
    sigb = f1[ed:]
    eb = f1[1:ed]
    sig = (1. + 1.*sigb.uint/(2**(flen-ed))) if eb.uint > 0 else 2.*sigb.uint/(2**(flen-ed))
    e = eb.uint - eoff
    
    bins = np.array([signb.bin, eb.bin, sigb.bin])
    vals = np.array([1-signb.uint*2, e, sig])
    
    fmt.displayDF(pd.DataFrame(np.array([bins, vals]).T, columns=['Bits', 'Decimal'], 
                               index=["Sign", "Exp", "Fraction"]).T)
Populating the interactive namespace from numpy and matplotlib

WARNING: pylab import has clobbered these variables: ['f', 'e']
`%matplotlib` prevents importing * from pylab and numpy

f2parts takes a floating point number and its bit length (32 or 64) and shows its three parts.

In [27]:
f2parts(10.7, 32);
Sign Exp Fraction
Bits 0 10000010 01010110011001100110011
Decimal 1.0 3.0 1.33749997616
In [28]:
f2parts(-23445.25, 64)
Sign Exp Fraction
Bits 1 10000001101 0110111001010101000000000000000000000000000000000000
Decimal -1.0 14.0 1.43098449707

Range and precision

  • The range of floating point numbers is usually adequate in practice
    • the whole universe only has about $10^{80}$ protons ...
  • The machine precision is the more commonly encountered limitation
    • machine precision $\epsilon_m$ is the maximum $\epsilon$ that makes $1 + \epsilon = 1$
In [29]:
prec32 = 2**(-24)
prec64 = 2**(-53)

f32 = [parts2f(0, -126, 1), parts2f(0, 254-127, (2.-prec32*2)), prec32, -np.log10(prec32)]
f64 = [parts2f(0, -1022, 1), parts2f(0, 2046-1023, (2.-prec64*2)), prec64, -np.log10(prec64)]

fmt.displayDF(pd.DataFrame(np.array([f32, f64]), index=['32 bit', '64 bit'], 
                           columns=["Min", "Max", "Machine Precision", "# of Significant Digits"]), "5g")
Min Max Machine Precision # of Significant Digits
32 bit 1.1755e-38 3.4028e+38 5.9605e-08 7.2247
64 bit 2.2251e-308 1.7977e+308 1.1102e-16 15.955
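
The same limits can be read directly from numpy (a quick cross-check; note that numpy's eps convention differs by a factor of two from the half-ulp definition above):

import numpy as np

# np.finfo's eps is the gap between 1 and the next representable float (2**-52 for
# 64 bit), i.e. twice the half-ulp machine precision used in the table above
for t in [np.float32, np.float64]:
    fi = np.finfo(t)
    print "%s: min %.5g, max %.5g, eps %.5g" % (t.__name__, fi.tiny, fi.max, fi.eps)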

Examples of floating point representations

In [30]:
f2parts(-0., 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000000000000
Decimal -1.0 -127.0 0.0
In [31]:
f2parts(np.NaN, 32);
Sign Exp Fraction
Bits 0 11111111 10000000000000000000000
Decimal 1.0 128.0 1.5
In [32]:
f2parts(-np.Inf, 32);
Sign Exp Fraction
Bits 1 11111111 00000000000000000000000
Decimal -1.0 128.0 1.0
In [33]:
f2parts(-1e-43, 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000001000111
Decimal -1.0 -127.0 1.69277191162e-05

Floating point computing is NOT exact

In [34]:
a = 1./3
b = a + a + 1. - 1.
c = 2*a
print "b == c? %s !" % (b == c)
print "b - c = %e" % (b - c)
b == c? False !
b - c = -1.110223e-16

In [35]:
f2parts(b, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010100
Decimal 1.0 -1.0 1.33333333333
In [36]:
f2parts(c, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010101
Decimal 1.0 -1.0 1.33333333333

Can't handle numbers too large or small

In [37]:
x = 1e-308
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-308 is 1.000000e+308

In [38]:
x = 1e-309
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-309 is inf

Limited precision

  • Small numbers are often lost in addition and subtraction
In [39]:
print 1. - 1e-16 == 1.
False

In [40]:
print 1. + 1e-16 == 1.
True

In [41]:
a = math.factorial(30) # in Python's bignum
print "a = ", a
print "maxint = ", sys.maxint
print "a - maxint == a ?\n", (a - sys.maxint) == a

b = float(math.factorial(30))
print "b = ", b
print "b - maxint == b ?", (b - sys.maxint) == b
a =  265252859812191058636308480000000
maxint =  2147483647
a - maxint == a ?
False
b =  2.65252859812e+32
b - maxint == b ? True

Subtle errors

The limited precision can cause subtle errors that are puzzling even for experienced programmers.

In [42]:
a = np.arange(0, 3. - 1., 1.)
print "a = ", a
print "a/3 = ", a/3.
a =  [ 0.  1.]
a/3 =  [ 0.          0.33333333]

In [43]:
b = np.arange(0, 1 - 1./3., 1./3.)
print "b = ", b
b =  [ 0.          0.33333333  0.66666667]

It should become your instinct to always use an error bound when comparing floating point numbers, or to use integer comparisons as the condition for ending loops.

In [44]:
b = arange(0, 1 - 1./3. - 1e-10, 1./3)
print "b = ", b
b =  [ 0.          0.33333333]
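
For instance, a minimal sketch of tolerance-based comparison (the explicit bound of 1e-12 is an arbitrary choice):

import numpy as np

a = 0.1 + 0.2
print a == 0.3                    # False: exact comparison of floats is fragile
print abs(a - 0.3) < 1e-12        # use an explicit error bound ...
print np.isclose(a, 0.3)          # ... or numpy's tolerance-based comparison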

Unexpected twist

Compute $e^x$ via Taylor expansion:

$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} = 1 + x + \frac{1}{2}x^2 + ...$$

Other types of errors may be mistaken for floating point rounding errors:

In [51]:
import math

def exp_taylor(x, n) :
    return sum([1.*x**i/math.factorial(i) for i in np.arange(1, n)]) + 1.

x = 5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, 100))/np.exp(x)*100)
x = -5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, 100))/np.exp(x)*100)

print "exp_taylor(%.f) = %f, exp(%.f) = %f" % (x, exp_taylor(-5, 150), x, np.exp(-5))
rel err for exp(5) is 0.06%
rel err for exp(-5) is 474.72%
exp_taylor(-5) = -0.025249, exp(-5) = 0.006738

Catastrophic cancellation

Dramatic loss of precision can happen when very similar numbers are subtracted

  • Evaluate the following function around 0:

$$f(x) = \frac{1}{x^2}(1-\cos(x))$$

  • Mathematically: $\lim_{x\rightarrow 0}f(x) = \frac{1}{2}$

The straightforward implementation fails spectacularly for small $x$.

In [46]:
def f(x) :
    return (1.-np.cos(x))/x**2

def g(x) :
    return (np.sin(.5*x)**2*2)/x**2

x = np.arange(-4e-8, 4e-8, 1e-9)
df = pd.DataFrame(np.array([f(x), g(x)]).T, index=x, columns=['Numerical f(x)', 'True Value']);
df.plot();

How did it happen?

In [47]:
x = 1.1e-8
print "cos(%.16e) = %.16f" % (x, cos(x))
print "1-cos(%.e) = %.16e" % (x, 1.-cos(x))
print "true value of 1-cos(%.e) = %.16e" % (x, 2*sin(.5*x)**2)
cos(1.0999999999999999e-08) = 0.9999999999999999
1-cos(1e-08) = 1.1102230246251565e-16
true value of 1-cos(1e-08) = 6.0499999999999997e-17

  • Dramatic precision loss when subtracting two very similar numbers
    • All the identical leading significant digits cancel
    • The result has far fewer significant digits
  • Catastrophic cancellation can happen in practice
    • Deltas are often computed by bump and revalue
    • Very small bump sizes should be avoided

A real example in Finance

The CIR process is widely used in quantitative finance, e.g., in short rate models and the Heston model:

$$ dr_t = \kappa(\mu - r_t) dt + \sigma \sqrt{r_t} dW_t $$

the variance of $r_t$ at $t > 0$:

$$\text{Var}[r_t | r_0] = \frac{r_0\sigma^2}{\kappa}(e^{-\kappa t} - e^{-2\kappa t}) + \frac{\mu\sigma^2}{2\kappa}(1-e^{-\kappa t})^2$$

Where does the catastrophic cancellation arise?

In [48]:
k = arange(-3e-8, 3e-8, 1e-9)*1e-7
t = 1
v = (1-exp(-k*t))/k
plot(k, v, '.-')
title(r'$\frac{1}{\kappa}(1-e^{-\kappa t})$')
xlabel(r'$\kappa$');
In [49]:
import sympy as sp
sp.init_printing(use_latex = True)

k = sp.symbols("k", positive = True)
e = (1-sp.exp(-k))/k
sp.limit(e, k, 0)
Out[49]:
$$1$$
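
One common remedy (a sketch, not from the original notes) is to rewrite the difference with a function that evaluates $e^x - 1$ accurately for small $x$, such as numpy's expm1:

import numpy as np

kappa = np.arange(1e-12, 3e-8, 1e-9)
t = 1.
v_naive  = (1. - np.exp(-kappa*t))/kappa    # loses accuracy: 1 - exp(-kappa*t) cancels
v_stable = -np.expm1(-kappa*t)/kappa        # expm1(x) = exp(x) - 1, computed without cancellation
print "max discrepancy: %.3e" % max(abs(v_naive - v_stable))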

Condition number

Consider the relative error of a function with multiple arguments: $f = f(x_1, ..., x_n)$:

\begin{eqnarray} \small df &=& \sum_i \frac{\partial f}{\partial x_i} dx_i \\ \frac{df}{f} &=& \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \frac{dx_i}{x_i} \end{eqnarray}

Assuming the input arguments' relative errors are $\epsilon_i$ (which could be negative):

$$ \small \left| \frac{\Delta f}{f} \right| = \left| \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \epsilon_i \right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \left|\epsilon_i\right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \epsilon \equiv k(f) \epsilon $$

where $\epsilon$ is the maximum relative error of all inputs.

  • $k(f) = \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right|$ is defined as the condition number of a function,
  • it is the maximum growth factor of the relative error.
  • the calculation loses about $\log_{10}(k(f))$ decimal digits of accuracy.

Condition number examples

Condition number is the systematic way to detect potential numerical problems:

  • $f(x_1, x_2) = x_1 - x_2$

    $k(f) = \frac{|x_1| + |x_2|}{|x_1 - x_2|}$, ill conditioned when $x_1 \approx x_2$: catastrophic cancellation

  • $f(x_1, x_2) = x_1 x_2$

    $k(f) = 2$, the multiplication is well conditioned

In practice, it is impossible to run this analysis over complicated calculations:

  • Have to rely upon monitoring and testing
  • Know where to look when numerical instability arises
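
To make the subtraction example concrete, the sketch below (with made-up inputs) perturbs one operand by a single rounding error and checks that the relative error of the output grows by roughly $k(f)$:

import numpy as np

x1, x2 = 1. + 1e-6, 1.                       # nearly equal inputs (made-up numbers)
kf  = (abs(x1) + abs(x2))/abs(x1 - x2)       # condition number of the subtraction
eps = 2.**-53                                # relative rounding error of the inputs
print "k(f) = %.2e, expect to lose ~%.1f digits" % (kf, np.log10(kf))

# perturb one input by a single rounding error and watch the output's relative error
rel_out = abs((x1*(1. + eps) - x2) - (x1 - x2))/abs(x1 - x2)
print "output relative error %.2e vs bound k(f)*eps = %.2e" % (rel_out, kf*eps)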

Well-posed problem can be unstable

Consider the previous example of $f(x) = \frac{1}{x^2}(1-\cos(x))$:

$$ k(f) = \left| 2 + \frac{x \sin{\left (x \right )}}{\cos{\left (x \right )} - 1} \right| $$

  • $k(f) \rightarrow \infty$ near $x = 2\pi$, it is ill-posed near $x = 2\pi$ due to catastrophic cancellation.
  • $k(f) \rightarrow 0$ near $x = 0$, it is well-posed near $x = 0$.

The numerical algorithm can be unstable even if the problem itself is well-posed.

  • A better algorithm is:

    $$f(x) = \frac{2\sin^2(\frac{x}{2})}{x^2}$$

But an ill-posed problem is always numerically unstable.

In [50]:
x, k = sp.symbols('x, k', real = True)
f = (1-sp.cos(x))/x**2
k = sp.simplify(f.diff(x)*x/f)
sp.limit(k, x, 0)
sp.simplify(1-sp.cos(2*x))
Out[50]:
$$2 \sin^{2}{\left (x \right )}$$

Ill-posed numerical algorithm

Ill-posed problems are generally not suitable for numerical solutions.

The Mandelbrot set is a famous example:

  • defined by an iterative algorithm on complex numbers: $z_{i+1} = z_i^2 + c$, starting from $z_0 = 0$
  • the color of each point $c$ is tied to the number of iterations before $|z_i|$ goes above a threshold
  • obviously unstable because of the iterative squaring and addition
  • showing exquisite details of numerical instability (see the sketch below)
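
A minimal escape-time sketch of the iteration (the grid resolution, iteration count and plotting details are arbitrary choices):

import numpy as np
import matplotlib.pyplot as plt

# escape-time iteration z <- z^2 + c over a grid of c values
re, im = np.meshgrid(np.linspace(-2., 1., 500), np.linspace(-1.5, 1.5, 500))
c = re + 1j*im
z = np.zeros_like(c)
count = np.zeros(c.shape)

for i in range(50):
    z = z*z + c
    diverged = np.abs(z) > 2.
    count += ~diverged                # count iterations until |z| exceeds the threshold
    z[diverged] = 2.                  # clamp diverged points to avoid overflow

plt.imshow(count, extent=[-2., 1., -1.5, 1.5])
plt.title('Mandelbrot set: iterations before divergence');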

Rules of engagement for floats

  • Always allow for an error tolerance when comparing numbers
  • Use double precision floating point number by default (unless you know what you are doing)
  • Avoid adding and subtracting numbers of very different magnitudes
  • Avoid subtracting very similar numbers
    • avoid using very small bump sizes when computing delta
  • Take care of the limits, especially around 0
  • Use well designed numerical libraries whenever possible
    • Even simple numerical algorithm can have spectacular pitfalls
  • Test and monitor

Portability

  • Floating point computation is usually not reproducible across different platforms
  • The same code can produce different results with different compilers and hardware
    • The difference can be significant in optimization and iterative methods
  • Design your unit tests to be cross platform by allowing enough error tolerance
  • Testing and monitoring are critical

Assignments

Required Reading:

  • Bindel and Goodman: Chapter 2

Recommended Reading:

  • Review linear algebra (any elementary linear algebra book)
  • Andersen and Piterbarg: Chapter 1

Homework