Scientific Computing in Finance

  • MATH-GA 2048, Spring 2017
  • Courant Institute of Mathematical Sciences, New York University

Instructors

Yadong Li

  • Head of Trading Book Risk and XVA Modelling, Quantitative Analytics, Barclays
  • Ph.D. in Physics, M.S. in Computer Sciences, University of Wisconsin-Madison
  • Master of Financial Engineering, University of California, Berkeley

Wujiang Lou

  • Senior trader, Global Fixed Income, HSBC
  • Ph.D. in Aeronautics and Astronautics, Nanjing University

Lecture 1: Introduction

Topics

  • Introduction
  • Best software development practices
  • Errors and floating point computation

Objectives of this class

  • Teach fundamental principles of scientific computing
  • Teach the most common numerical techniques in quantitative finance
  • Develop intuition and essential skills using real world examples
  • Help students start a career in quantitative finance

We expect you to

  • Know basic calculus and linear algebra
  • Know basic stochastic calculus and derivative pricing theories
  • Have some prior programming experience
  • Be willing to learn through hard work

Major fields in quantitative finance

  • Derivative pricing and hedging
  • Risk management and regulatory capital
  • Portfolio management
  • Quantitative strategy/Algo trading

Main topics in this class

A rough schedule of the class:

Week  | Main Topics                     | Practical Problems                         | Instructor
1     | Introduction and Error Analysis | Error and floating point numbers           | Y. Li
2-3   | Linear Algebra                  | Portfolio optimization, PCA, least squares | Y. Li
4-5   | Rootfinding and interpolation   | Curve building                             | Y. Li
6     | Derivatives and integration     | Hedging, Risk Transformation               | Y. Li
7-8   | Monte Carlo simulation          | Exotic pricing                             | Y. Li
9-10  | Optimization                    | Model calibration                          | W. Lou
11-12 | ODE/PDE                         | Pricing                                    | W. Lou
13    | Entropy and Allocation          | Advanced practical problems                | Y. Li

Textbook

  • Principles of Scientific Computing (online book)

    D. Bindel and J. Goodman, 2009

References

  • Interest Rate Modeling, Volume I: Foundations and Vanilla Models
  • Interest Rate Modeling, Volume II: Term Structure Models

    L. Andersen and V. Piterbarg, 2010

  • Monte Carlo Methods in Financial Engineering

    P. Glasserman, 2003

Lecture notes and homework

Available online at: http://yadongli.github.io/nyumath2048

Grading

  • Homework 50%, may be done in groups of two students
  • Final 50%
  • Extra credits
    • extra credit homework problems
    • for pointing out errors in slides/homework

Best Software Practices

Roberto Waltman: "In the one and only true way. The object-oriented version of 'spaghetti code' is, of course, 'lasagna code' (too many layers)."

Modern hardware

  • Processor speed/throughput continues to grow at an impressive rate
    • Moore's law has ruled for a long time
    • recent trends: multi-core, GPU
  • Memory capacity continues to grow, but memory speed only grows at a (relatively) slow rate

Memory hierarchy

Storage   | Bandwidth | Latency       | Capacity
L-1 Cache | 98 GB/s   | 4 cycles      | 32 KB/core
L-2 Cache | 50 GB/s   | 10 cycles     | 256 KB/core
L-3 Cache | 30 GB/s   | 40-75 cycles  | 8 MB shared
RAM       | 15 GB/s   | 60-100 ns     | 2-128 GB
Disk      | 50 MB/s   | ~5 ms         | $\infty$
Network   | varies    | much longer   | $\infty$

About high performance

  • Understand if the problem is computation or bandwidth bound
    • Most problems in practice are bandwidth bound
  • Computation bound problems:
    • caching
    • vectorization/parallelization/GPU
    • optimize your "computational kernel"
  • Bandwidth bound problems:
    • optimize cache and memory access
    • require highly specialized skills and low level programming (like in C/C++/Fortran)
  • Premature optimization for speed is the root of all evil
    • Simplicity and generality of the code are often sacrificed
    • Execution speed is usually not the most critical factor in practice

Before optimizing speed

Correctness

  • "those who would give up correctness for a little temporary performance deserve neither correctness nor performance"
  • "the greatest performance improvement of all is when system goes from not-working to working"

Modifiability

  • "modification is undesirable but modifiability is paramount"
  • modular design, learn from hardware designers

Simplicity

  • having no obvious defects is not the same as obviously having no defects

Other desirable features:

  • Scalability, robustness, ease of support and maintenance, etc.

In practice, what often trumps everything else is:

Time to market !!

  • First mover has significant advantage: Facebook, Google etc.

Conventional wisdom for programming

Typical advice includes:

  • Choose good names for your variables and functions
  • Write comments and documents
  • Write good tests
  • Keep coding style consistent

These are good pieces of advice, but quite superficial.

  • all of them have been compromised, even in extremely successful projects
  • they are usually not the critical differentiator between success and failure

Best advice for the real world:

Paul Phillips (quoting George Orwell's 1984), "We're doing it all wrong":

War is peace

  • Fight your war at the boundary so that you can have peace within
  • Challenge your clients to reduce the scope of your deliverable
  • Define a minimal set of interactions to the outside world

Ignorance is strength

  • Design dumb modules that only do one thing
  • Keep it simple and stupid
  • Separation of concerns, modular

Freedom is slavery

  • Must be disciplined in testing, code review, releases etc.
  • Limit the interface of your module (or external commitments)
  • Freedom $\rightarrow$ possibilities $\rightarrow$ complexity $\rightarrow$ unmodifiable code $\rightarrow$ slavery

Sun Tzu: "the supreme art of quant modelling is to solve problems without coding"

Choose the right programming tools

  • Programming paradigm: procedural, OOP, functional
  • Expressivity: fewer lines of code for the same functionality
  • Ecosystem: library, tools, community, industry support

There is no single right answer

  • Different jobs call for different tools
  • But most of the time, the choice is already made (by someone else)

Expressivity


When you have the opportunity, choose the more expressive language for the job.

Eric Raymond, "How to Become a Hacker":

"Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

We use Python

Python

  • a powerful, dynamically and strongly typed scripting language
  • extremely popular in scientific computing
  • strong momentum in quantitative finance
  • highly productive, low learning curve

Python ecosystem

  • numpy/scipy: high quality matrix and numerical library
  • matplotlib: powerful visualization library
  • IPython: highly productive interactive working environment
  • Pandas: powerful data and time series analysis package

Python resources

  • Anaconda: a popular Python distribution with essential packages, best choice
  • Enthought Canopy: a commercial Python distribution, free for students
  • Google and Youtube

Python version:

  • Python 2.7.x: Python 3 is not backward compatible
  • With the latest IPython, numpy, pandas, scipy, etc.
  • either the 32-bit or 64-bit version works

Python setup instructions:

Errors and floating point computation

James Gosling: "95% of the folks out there are completely clueless about floating-point"

Absolute and relative error

If the true value $a$ is approximated by $\hat{a}$:

  • Absolute error: $\hat{a} - a$
  • Relative error: $\frac{\hat{a} - a}{a}$

Both measures are useful in practice (a small sketch follows).

  • Beware of small $|a|$ when comparing relative errors
  • Is the error normal or lognormal?
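A minimal sketch of the two definitions (the numbers below are purely illustrative):

a_true = 3.14159265358979   # "true" value
a_hat = 3.14                # its approximation

print "absolute error: %.3e" % (a_hat - a_true)
print "relative error: %.3e" % ((a_hat - a_true) / a_true)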

Sources of error

Imprecision in the theory or model itself

  • George Box: "essentially, all models are wrong, but some are useful"

Errors in numerical algorithms

  • Convergence error
    • Monte Carlo simulation
    • Numerical integration
  • Truncation and discretization
  • Termination of iterations

Error due to machine representation

  • Floating point number
  • Rounding

We cover the last category in this lecture.

IEEE 754 in a nutshell

First published in 1985, the IEEE 754 standard includes:

  • Floating point number representations
    • 16-bit, 32-bit, 64-bit, etc.
    • special representations for NaN, Inf, etc.
  • Rounding rules
  • Basic operations
    • arithmetic: addition, multiplication, square root, etc.
    • conversion: between formats, to/from strings, etc.
  • Exception handling
    • invalid operation, divide by zero, overflow, underflow, inexact
    • the operation returns a value (could be NaN, Inf or 0) and sets an error flag; see the small numpy illustration below
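A small illustration of this behavior through numpy, which follows the IEEE exception model (a sketch; the warning settings are only for demonstration):

import numpy as np

# the operation still returns a value (inf / nan) and raises a flag/warning,
# instead of aborting the computation
with np.errstate(divide='warn', invalid='warn'):
    print np.float64(1.) / np.float64(0.)   # inf, with a "divide by zero" warning
    print np.float64(0.) / np.float64(0.)   # nan, with an "invalid value" warning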

Rounding under IEEE 754

Default rule: round to the nearest, ties to the even digit

  • Example, rounding to 4 significant digits:
    • $3.1245 \rightarrow 3.124$
    • $43515 \rightarrow 43520$
  • Addition and multiplication are commutative under rounding, but not associative (see the example below)
  • Intermediate calculations may be done at a higher precision to minimize numerical error, if supported by the hardware
    • Intel's x87 FPU uses an 80-bit representation for intermediate calculations, even if inputs/outputs are 32-bit floats
    • Early versions of Java (before 1.2) forced intermediate calculations to be in 32/64 bits for float/double for binary compatibility - a poor choice
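A quick sketch of the non-associativity mentioned above:

a, b, c = 0.1, 0.2, 0.3

print "(a + b) + c = %.17f" % ((a + b) + c)
print "a + (b + c) = %.17f" % (a + (b + c))
print "equal?", (a + b) + c == a + (b + c)   # False: each addition rounds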

Importance of IEEE 754

IEEE 754 is a tremendously successful standard:

  • Allows users to consistently reason about floating point computation
    • before IEEE 754, floating point calculations behaved differently across platforms
  • IEEE 754 made compromises for practicality
    • many of them do not make strict mathematical sense
    • but they are necessary evils for the greater good
  • Establishes a clear interface between hardware designers and software developers
  • Its main designer, William Kahan, won the Turing Award in 1989

The IEEE 754 standard is widely adopted in modern hardware and software:

  • The vast majority of general purpose CPUs are IEEE 754 compliant
  • the OS/software situation is more complicated
    • Java's strictfp flag is a poor choice, don't use it!
    • Compiled languages like C++ depend on the hardware's implementation.

IEEE 754 floating point number format

  • IEEE floating point number standard

$$\underbrace{*}_{\text{sign bit}}\;\underbrace{********}_{\text{exponent}}\overbrace{\;1.\;}^{\text{implicit}}\;\underbrace{***********************}_{\text{fraction}} $$

$$(-1)^{\text{sign}}(\text{1 + fraction})2^{\text{exponent} - p}$$

  • 32-bit vs. 64-bit format

Format            | Sign | Fraction | Exponent | $p$
IEEE 754 - 32 bit | 1    | 23       | 8        | 127
IEEE 754 - 64 bit | 1    | 52       | 11       | 1023

  • an implicit leading bit of 1 and the binary point come before the fraction bits
  • exponent values of 0 and the maximum are reserved for special interpretations: 0, NaN, Inf, etc.

Examples of floating point numbers

In [1]:
%pylab inline

import bitstring
import pandas as pd
import numpy as np
import fmt
import math, sys

pd.set_option("display.max_colwidth", 80)

def parts2f(sign, e, sig) :
    # reassemble a float from its sign bit, unbiased exponent and significand
    return ((-1)**sign*sig*2.**e)

def f2parts(v, flen) :
    # split a 32- or 64-bit float into its sign, exponent and fraction bits
    ps = {32: 9, 64 : 12}
    ed = ps[flen]                # 1 sign bit + 8 or 11 exponent bits
    eoff = 2**(ed - 2) - 1       # exponent bias: 127 or 1023

    f1 = bitstring.BitArray(float=v, length=flen)
    signb = f1[:1]
    sigb = f1[ed:]
    eb = f1[1:ed]
    # normal numbers carry an implicit leading 1; denormals (exponent bits all 0) do not
    sig = (1. + 1.*sigb.uint/(2**(flen-ed))) if eb.uint > 0 else 2.*sigb.uint/(2**(flen-ed))
    e = eb.uint - eoff
    
    bins = np.array([signb.bin, eb.bin, sigb.bin])
    vals = np.array([1-signb.uint*2, e, sig])
    
    fmt.displayDF(pd.DataFrame(np.array([bins, vals]).T, columns=['Bits', 'Decimal'], 
                               index=["Sign", "Exp", "Fraction"]).T)
Populating the interactive namespace from numpy and matplotlib

f2parts takes a floating point number and its bit length (32 or 64) and shows its three parts.

In [2]:
f2parts(10.7, 32);
Sign Exp Fraction
Bits 0 10000010 01010110011001100110011
Decimal 1.0 3.0 1.33749997616
In [3]:
f2parts(-23445.25, 64)
Sign Exp Fraction
Bits 1 10000001101 0110111001010101000000000000000000000000000000000000
Decimal -1.0 14.0 1.43098449707

Range and precision

  • The range of floating point numbers is usually adequate in practice
    • the whole universe only has about $10^{80}$ protons ...
  • The machine precision is the more commonly encountered limitation
    • machine precision $\epsilon_m$ is the maximum $\epsilon$ that makes $1 + \epsilon = 1$ (see the numpy.finfo sketch after the table below)
In [4]:
prec32 = 2**(-24)
prec64 = 2**(-53)

f32 = [parts2f(0, -126, 1), parts2f(0, 254-127, (2.-prec32*2)), prec32, -np.log10(prec32)]
f64 = [parts2f(0, -1022, 1), parts2f(0, 2046-1023, (2.-prec64*2)), prec64, -np.log10(prec64)]

fmt.displayDF(pd.DataFrame(np.array([f32, f64]), index=['32 bit', '64 bit'], 
                           columns=["Min", "Max", "Machine Precision", "# of Significant Digits"]), "5g")
Min Max Machine Precision # of Significant Digits
32 bit 1.1755e-38 3.4028e+38 5.9605e-08 7.2247
64 bit 2.2251e-308 1.7977e+308 1.1102e-16 15.955
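These limits can also be queried directly from numpy (a sketch; note that numpy's eps is the gap between 1 and the next representable number, $2^{-23}$ and $2^{-52}$, i.e. twice the $\epsilon_m$ in the table above):

import numpy as np

for t in [np.float32, np.float64]:
    fi = np.finfo(t)
    # fi.tiny is the smallest normal number, fi.max the largest finite number
    print t.__name__, ": min =", fi.tiny, ", max =", fi.max, ", eps =", fi.eps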

Examples of floating point representations

In [5]:
f2parts(-0., 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000000000000
Decimal -1.0 -127.0 0.0
In [6]:
f2parts(np.NaN, 32);
Sign Exp Fraction
Bits 0 11111111 10000000000000000000000
Decimal 1.0 128.0 1.5
In [7]:
f2parts(-np.Inf, 32);
Sign Exp Fraction
Bits 1 11111111 00000000000000000000000
Decimal -1.0 128.0 1.0
In [8]:
f2parts(-1e-43, 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000001000111
Decimal -1.0 -127.0 1.69277191162e-05

Floating point computing is NOT exact

In [9]:
a = 1./3
b = a + a + 1. - 1.
c = 2*a
print "b == c? %s !" % (b == c)
print "b - c = %e" % (b - c)
b == c? False !
b - c = -1.110223e-16

In [10]:
f2parts(b, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010100
Decimal 1.0 -1.0 1.33333333333
In [11]:
f2parts(c, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010101
Decimal 1.0 -1.0 1.33333333333

Can't handle numbers too large or small

In [12]:
x = 1e-308
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-308 is 1.000000e+308

In [13]:
x = 1e-309
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-309 is inf

Limited precision

  • Small numbers (in a relative sense) are lost in addition and subtraction
In [14]:
print 1. - 1e-16 == 1.
False

In [15]:
print 1. + 1e-16 == 1.
True

In [16]:
largeint = 2**32 - 1

print "Integer arithmetic:"
a = math.factorial(30) # in Python's bignum
print "a = ", a
print "a - %d == a ?" % largeint, (a - largeint) == a

print "\nFloating point arithmetic:"
b = float(math.factorial(30))
print "b = ", b
print "b - %d == b ?" % largeint, (b - largeint) == b
Integer arithmetic:
a =  265252859812191058636308480000000
a - 4294967295 == a ? False

Floating point arithmetic:
b =  2.65252859812e+32
b - 4294967295 == b ? True

Subtle errors

The limited precision can cause subtle errors that are puzzling even for experienced programmers.

In [17]:
a = np.arange(0, 3. - 1., 1.)
print "a = ", a
print "a/3 = ", a/3.
a =  [ 0.  1.]
a/3 =  [ 0.          0.33333333]

In [18]:
b = np.arange(0, 1 - 1./3., 1./3.)
print "b = ", b
b =  [ 0.          0.33333333  0.66666667]

Avoid equality test

Equality tests on floating point numbers are always a bad idea:

  • Mathematical equality often fails a floating point equality test because of rounding errors
  • Used as a loop termination condition, it often produces an off-by-one number of iterations

It should become your instinct to:

  • always use an error bound when comparing floating point numbers, e.g. numpy.allclose() (a sketch follows the cell below)
  • try to use integer comparisons as loop termination conditions
    • integer representation and arithmetic in IEEE 754 is exact (up to $2^{53}$ in double precision)
In [19]:
b = arange(0, 1 - 1./3. - 1e-10, 1./3)
print "b = ", b
b =  [ 0.          0.33333333]
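A sketch of a tolerance-based comparison with numpy.allclose (numpy's defaults are rtol=1e-05 and atol=1e-08):

import numpy as np

a = 0.1 + 0.2
b = 0.3
print "exact comparison :", a == b            # False, the two differ by ~5.6e-17
print "with a tolerance :", np.allclose(a, b) # True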

Unexpected twist

Compute $e^x$ via Taylor expansion: $$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} = 1 + x + \frac{1}{2}x^2 + ...$$

Other types of errors may be mistaken for floating point rounding error and escape detection. In the example below, the huge error for $x = -5$ most likely comes not from rounding but from silent integer overflow: np.arange produces fixed-width integers, so the powers x**i wrap around once i gets moderately large (a corrected version follows the cell's output).

In [20]:
import math

def exp_taylor(x, n) :
    return sum([1.*x**i/math.factorial(i) for i in np.arange(1, n)]) + 1.

terms = 100
x = 5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, terms))/np.exp(x)*100)
x = -5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, terms))/np.exp(x)*100)

print "exp_taylor(%.f) = %f, exp(%.f) = %f" % (x, exp_taylor(-5, terms), x, np.exp(-5))
rel err for exp(5) is 0.06%
rel err for exp(-5) is 474.72%
exp_taylor(-5) = -0.025249, exp(-5) = 0.006738
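The culprit above is most likely the integer overflow noted before the cell, not rounding. One possible fix, sketched here, is to build each Taylor term recursively in floating point so that no huge integer powers or factorials are ever formed:

import numpy as np

def exp_taylor_stable(x, n):
    # t_i = t_{i-1} * x / i, accumulated entirely in floating point
    total, term = 1., 1.
    for i in range(1, n):
        term *= float(x) / i
        total += term
    return total

x = -5
print "rel err for exp(%.f) is %.2e" % (x, (np.exp(x) - exp_taylor_stable(x, 100)) / np.exp(x))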

Catastrophic cancellation

Dramatic loss of precision can happen when very similar numbers are subtracted

  • Evaluate the following function around 0:

$$f(x) = \frac{1}{x^2}(1-\cos(x))$$

  • Mathematically: $\lim_{x\rightarrow 0}f(x) = \frac{1}{2}$

The straightforward implementation fails spectacularly for small $x$.

In [22]:
def f(x) :
    return (1.-np.cos(x))/x**2

def g(x) :
    return (np.sin(.5*x)**2*2)/x**2

x = np.arange(-4e-8, 4e-8, 1e-9)
df = pd.DataFrame(np.array([f(x), g(x)]).T, index=x, columns=['Numerical f(x)', 'True Value']);
df.plot();

How did it happen?

In [23]:
x = 1.1e-8
print "cos(%.16e) = %.16f" % (x, cos(x))
print "1-cos(%.e) = %.16e" % (x, 1.-cos(x))
print "true value of 1-cos(%.e) = %.16e" % (x, 2*sin(.5*x)**2)
cos(1.0999999999999999e-08) = 0.9999999999999999
1-cos(1e-08) = 1.1102230246251565e-16
true value of 1-cos(1e-08) = 6.0499999999999997e-17

  • Dramatic precision loss when subtracting numbers that are very similar
    • All the identical leading significant digits cancel
    • The result has far fewer significant digits
  • Catastrophic cancellation can happen in practice
    • Deltas are often computed by bump and revalue
    • Very small bump sizes should be avoided

A real example in Finance

The CIR process is widely used in quantitative finance, e.g. in short rate models and the Heston model:

$$ dr_t = \kappa(\mu - r_t) dt + \sigma \sqrt{r_t} dW_t $$

The variance of $r_t$ at $t > 0$ is:

$$\text{Var}[r_t | r_0] = \frac{r_0\sigma^2}{\kappa}(e^{-\kappa t} - e^{-2\kappa t}) + \frac{\mu\sigma^2}{2\kappa}(1-e^{-\kappa t})^2$$

How does the catastrophic cancellation arise? The factor $\frac{1}{\kappa}(1-e^{-\kappa t})$ appears in both terms and is evaluated below for very small $\kappa$ (a stable evaluation using expm1 is sketched further down):

In [31]:
k = arange(-3e-8, 3e-8, 1e-9)*1e-7
t = 1
v = (1-exp(-k*t))/k
plot(k, v, '.-')
title(r'$\frac{1}{\kappa}(1-\exp(-\kappa))$')
xlabel(r'$\kappa$');
In [25]:
import sympy as sp
sp.init_printing(use_latex = True)

k = sp.symbols("k", positive = True)
e = (1-sp.exp(-k))/k
sp.limit(e, k, 0)
Out[25]:
$$1$$
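One standard remedy (a sketch, not part of the original slides): numpy.expm1 computes $e^x - 1$ accurately for small $x$, so the factor $\frac{1}{\kappa}(1 - e^{-\kappa t})$ can be evaluated without the cancelling subtraction:

import numpy as np

kappa, t = 1e-15, 1.
naive = (1. - np.exp(-kappa * t)) / kappa   # catastrophic cancellation
stable = -np.expm1(-kappa * t) / kappa      # expm1 avoids forming 1 - e^{-x}
print "naive  = %.16f" % naive
print "stable = %.16f" % stable             # tends to t as kappa -> 0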

Condition number

Consider the relative error of a function with multiple arguments: $f = f(x_1, ..., x_n)$:

$$ \small df = \sum_i \frac{\partial f}{\partial x_i} dx_i \iff \frac{df}{f} = \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \frac{dx_i}{x_i} $$

Assuming the input arguments' relative errors are $\epsilon_i$ (they could be negative):

$$ \small \left| \frac{\Delta f}{f} \right| = \left| \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \epsilon_i \right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \left|\epsilon_i\right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \epsilon_m \equiv k(f) \epsilon_m $$

where $\epsilon_m$ is the maximum relative error of all inputs.

  • $k(f) = \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right|$ is defined as the condition number of a function,
  • it is the maximum growth factor of the relative error.
  • the calculation loses about $\log_{10}(k(f))$ decimal digits of accuracy.

Well-posed and ill-posed problems

The condition number is a systematic way to detect potential numerical problems:

  • $f(x_1, x_2) = x_1 - x_2$

    $k(f) = \frac{|x_1| + |x_2|}{|x_1 - x_2|}$, ill conditioned when $x_1 \approx x_2$: catastrophic cancellation (a quick numeric check follows below)

  • $f(x_1, x_2) = x_1 x_2$

    $k(f) = 2$, the multiplication is well conditioned

The problem is ill-posed if its condition number is large.

  • In practice, it is infeasible to run condition number analysis over complicated calculations
  • We have to rely upon monitoring and testing
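A quick numeric check of the subtraction example above (a sketch with made-up inputs): the relative error of the result is roughly $k(f)$ times the relative error of the inputs.

import numpy as np

x1, x2 = 1. + 1e-9, 1.
k = (abs(x1) + abs(x2)) / abs(x1 - x2)   # condition number of x1 - x2, ~2e9
print "k(f) ~ %.1e, digits lost ~ %.1f" % (k, np.log10(k))

eps = np.finfo(float).eps                # ~2.2e-16 relative perturbation of x1
rel_err = abs((x1 * (1. + eps) - x2) - (x1 - x2)) / abs(x1 - x2)
print "relative error of the result: %.1e" % rel_err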

Well-posed problem can be unstable

Consider the previous example of $f(x) = \frac{1}{x^2}(1-\cos(x))$:

$$ k(f) = \left| 2 + \frac{x \sin{\left (x \right )}}{\cos{\left (x \right )} - 1} \right| $$

  • $k(f) \rightarrow \infty$ near $x = 2\pi$, it is ill-posed near $x = 2\pi$ due to catastrophic cancellation.
  • $k(f) \rightarrow 0$ near $x = 0$, so it is well-posed near $x = 0$.

The numerical algorithm can be unstable even if the problem itself is well-posed.

  • A better algorithm near $x=0$ is $f(x) = \frac{2}{x^2}\sin^2(\frac{x}{2})$

But an ill-posed problem is always numerically unstable by definition.

  • Direct numerical calculation is unstable around $x = 2\pi$
  • Does this help? $f(x) \approx \frac{1}{2x^2}(x-2\pi)^2$
In [1]:
import sympy as sp

x = sp.symbols('x', real = True)
f = (1-sp.cos(x))/x**2
k = sp.simplify(f.diff(x)*x/f)     # x*f'(x)/f(x); the condition number is its absolute value
print sp.limit(k, x, 0)            # -> 0: well-posed at x = 0
print sp.simplify(1-sp.cos(2*x))   # identity: 1 - cos(2x) = 2*sin(x)**2

Ill-posed numerical algorithm

Ill-posed problems are generally not suitable for numerical solutions.

The Mandelbrot set is a famous example:

  • defined by an iterative algorithm on complex numbers: $z_{i+1} = z_i^2 + c$, starting from $z_0 = 0$
  • the color of each point $c$ is tied to the number of iterations for $|z_i|$ to go above a threshold
  • obviously unstable because of the repeated squaring and addition
  • the fractal boundary shows exquisite details of the numerical instability (a minimal escape-time sketch follows)
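A minimal escape-time sketch of the iteration described above (the threshold and iteration cap are arbitrary choices):

def escape_time(c, max_iter=100, threshold=2.):
    # iterate z -> z**2 + c from z = 0 and return how many steps it takes
    # for |z| to exceed the threshold (max_iter if it never does)
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > threshold:
            return i
    return max_iter

print escape_time(0.3 + 0.5j), escape_time(-0.75 + 0.1j)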

Rules of engagement for floats

  • Always allow errors when comparing numbers
  • Use double precision floating point number by default
    • unless you know what you are doing (really?)
  • Avoid adding and subtracting numbers of very different magnitudes
    • avoid using very small bump sizes when computing delta
  • Take care of the limits, especially around 0
    • many (careless) numerical implementations break down around 0
  • Use well designed numerical libraries whenever possible
    • Even simple numerical algorithm can have spectacular pitfalls
  • Test and monitor

Portability

  • Floating point computation is usually not reproducible across different platforms
  • The same code can produce different results with different compilers and hardware
    • The difference can be significant in optimization and iterative methods
  • Design your unit tests to be cross-platform by giving enough error allowance (see the sketch below)
  • Testing and monitoring are critical
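A sketch of such a tolerance-based test, reusing the earlier $f(x)$ example (the tolerance is only illustrative): numpy.testing.assert_allclose compares with a relative error bound instead of exact equality.

import numpy as np

def f_naive(x):
    return (1. - np.cos(x)) / x**2

def f_stable(x):
    return 2. * np.sin(.5 * x)**2 / x**2

# exact equality could break when the compiler or hardware changes; a tolerance will not
x = np.linspace(0.5, 3., 50)
np.testing.assert_allclose(f_naive(x), f_stable(x), rtol=1e-10)
print "tolerance-based unit test passed"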

Assignments

Required Reading:

  • Bindel and Goodman: Chapter 2

Recommended Reading:

  • Review linear algebra (any elementary linear algebra book)
  • Andersen and Piterbarg: Chapter 1

Homework