Scientific Computing in Finance

  • MATH-GA 2048, Spring 2015
  • Courant Institute of Mathematical Sciences
  • New York University

Instructors

Hongwei Cheng

  • Head of financial model validation, Prudential
  • Ph.D. Mathematics, New York University

Yadong Li

  • Head of trading book risk modeling, Barclays
  • Ph.D. in Physics, M.S. in Computer Sciences, University of Wisconsin-Madison
  • Master of Financial Engineering, University of California, Berkeley

Teaching Assistant

Maulik Desai

  • Ph.D. EECS, Columbia University

Lecture 1: Introduction

Topics

  • Introduction
  • Software practices
  • Errors and floating point computation

Objectives of this class

  • Teach fundamental principles of scientific computing
  • Teach the most common numerical techniques in quantitative Finance
  • Develop intuitions and essential skills using real world examples
  • Help students to start a career in quantitative Finance

Levels of understanding

  • description: able to talk about the problem
  • insight: understood the dynamics and key drivers
  • abstraction (model): formulated it as a mathematical problem
  • knowledge: solved the mathematical problem
  • experience (skill): applied the theory to real world, understood the practical limitations
  • intuition: able to see through, make connections and sensibly reason about new problems

We expect you

  • Know basic calculus and linear algebra
  • Know basic stochastic calculus and derivative pricing theories
  • Have some prior programming experience
  • Are willing to learn through hard work

Major fields in quantitative finance

  • Derivative pricing and hedging
  • Risk management and regulatory capital
  • Portfolio management
  • Quantitative strategy/Algo trading

Main topics in this class

A rough schedule of the class:

Week    Main Topics                         Practical Problems
1       Introduction and Error Analysis     Software practice
2-3     Linear Algebra                      Portfolio optimization, PCA, least squares
4       Root finding and interpolation      Curve building
5       Derivatives and integration         Hedging, risk transformation
7-8     Optimization                        Model calibration
9-11    Monte Carlo simulation              Exotic pricing, max entropy, allocation
12-13   ODE/PDE                             Pricing

Text book

  • Principles of Scientific Computing (online book)

    D. Bindel and J. Goodman, 2009

References

  • Interest Rate Modeling, Volume I: Foundations and Vanilla Models
  • Interest Rate Modeling, Volume II: Term Structure Models

    L. Andersen and V. Piterbarg, 2010

  • Monte Carlo Methods in Financial Engineering

    P. Glasserman, 2003

Lecture notes and homework

Available online at: http://yadongli.github.io/nyumath2048

Grades

  • Homework 50%
  • Project 30%
  • Final 20%
  • Extra credits for pointing out errors in slides/homework

Software Practices

Modern hardware

  • Processor speed/throughput continues to grow at an impressive rate
    • Moore's law has ruled for a long time
    • recent trends: multi-core, GPU
  • Memory capacity continues to grow, but memory speed only grows at a (relatively) slow rate

Memory hierarchy

Storage     Bandwidth   Latency         Capacity
L-1 Cache   98 GB/s     4 cycles        32KB/core
L-2 Cache   50 GB/s     10 cycles       256KB/core
L-3 Cache   30 GB/s     40-75 cycles    8MB shared
RAM         15 GB/s     60-100 ns       2-128GB
Disk        50 MB/s     ~5 ms           $\infty$
Network     varies      much longer     $\infty$
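
To get a feel for the table above, the short sketch below (array size and variable names are illustrative choices, not part of the course material) sums the same number of doubles from contiguous and from strided memory; the strided version is typically several times slower despite identical arithmetic.

import time
import numpy as np

n = 1000 * 1000
a = np.random.rand(16 * n)                 # ~128MB of doubles, far larger than any cache
b = a.reshape(n, 16)

def timed(f):
    t0 = time.time()
    f()
    return time.time() - t0

# both reductions add exactly n doubles, but the strided one drags 16x more memory
# through the cache (one useful double per 128-byte stride), so it is bandwidth bound
t_contig  = timed(lambda: a[:n].sum())     # contiguous reads: a cache line serves 8 doubles
t_strided = timed(lambda: b[:, 0].sum())   # stride of 16 doubles: mostly wasted bandwidth

print "contiguous sum: %.4fs" % t_contig
print "strided sum   : %.4fs" % t_strided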

Key to high performance

  • Understand whether the problem is computation bound or bandwidth bound
    • Most problems in practice are bandwidth bound
  • Computation bound problems:
    • caching
    • vectorization/parallelization/GPU
    • optimize your "computational kernel"
  • Bandwidth bound problems:
    • optimize cache and memory access
    • require highly specialized skills and low level programming (like in C/C++/Fortran)
  • Premature optimization for speed is the root of all evil
    • Simplicity and generality of the code are often sacrificed
    • Execution speed is usually not the most critical factor in practice
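
As a small illustration of the vectorization point in the list above (a sketch with an arbitrary array size), the same sum of squares is computed with an interpreted Python loop and with a single vectorized numpy call:

import time
import numpy as np

x = np.random.rand(1000000)

t0 = time.time()
s_loop = 0.
for v in x:                  # interpreted loop: one bytecode dispatch per element
    s_loop += v * v
t_loop = time.time() - t0

t0 = time.time()
s_vec = np.dot(x, x)         # vectorized: a single call into optimized C/BLAS code
t_vec = time.time() - t0

print "loop %.3fs, vectorized %.4fs, same answer: %s" % (t_loop, t_vec, np.isclose(s_loop, s_vec))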

What's before performance

Correctness

  • "those who would give up correctness for a little temporary performance deserve neither correctness nor performance"
  • "the greatest performance improvement of all is when system goes from not-working to working"

Modifiability

  • "modification is undesirable but modifiability is paramount"
  • modular design, learn from hardware designers

Simplicity

  • no obvious defects is not the same as obviously no defects
  • simplicity allows humans to correctly reason about systems

Other desirable features:

  • Scalability, robustness, easy support and maintenance, etc.

In practice, what often trumps everything else is:

Time to market !!

Conventional wisdom for programming

The typical advice includes:

  • choose good names for your variables and functions
  • write comments and documents
  • write tests
  • keep coding style consistent

This is all good advice, but quite superficial.

  • all of them are compromised, even in very successful projects
  • they are usually not the critical differentiator between successes and failures

Advice for the real world:

by Paul Phillips (quoting George Orwell): "We're doing it all wrong":

War is peace

  • Fight your war at the boundary so that you can have peace within
  • Challenge your clients, customers and collaborators to limit your deliverable
  • Clearly define a minimal set of interactions to the outside world

Ignorance is strength

  • Design dumb modules that only do one thing, and only one thing
  • Keep it simple and stupid
  • Separation of concerns, modular

Freedom is slavery

  • Must be disciplined in testing, code review, releases etc.
  • Freedom $\rightarrow$ possibilities $\rightarrow$ complexity $\rightarrow$ unmodifiable code $\rightarrow$ slavery

Sun Tzu: "the supreme art of modelling is to solve problems without coding"

Choose the right programming tools

  • Programming paradigm: procedural, OOP, functional
  • Expressiveness: high level vs. low level
  • Ecosystem: library, tools, community, industry support

There is no single right answer

  • Different job calls for different tools
  • But most of the time, the choice is already made (by someone else)

Expressiveness

When you have the chance, choose the more expressive language for the job.

Eric Raymond, "How to Become a Hacker":

"Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

We use Python

Python

  • a powerful dynamic and strongly typed scripting language
  • extremely popular in scientific computing
  • strong momentum in quantitative finance
  • highly productive, low learning curve

Python ecosystem

  • numpy/scipy: high quality matrix and numerical library
  • matplotlib: powerful visualization library
  • IPython: highly productive interactive working environment
  • Pandas: powerful data and time series analysis package

Python resources

  • Enthought Canopy: a commercial Python distribution, free for students
  • Anaconda: another popular and free Python distribution
  • Google and Youtube

Python version:

  • Python 2.7.x : python 3 is not backward compatible
  • With the latest IPython, numpy/pandas/scipy etc.
  • either 32bit or 64bit version

Python setup instructions:

Errors and floating point computation

James Gosling: "95% of the folks out there are completely clueless about floating-point"

Absolute and relative error

If the true value $a$ is approximated by $\hat{a}$:

  • Absolute error: $\hat{a} - a$
  • Relative error: $\frac{\hat{a} - a}{a}$

Both measures are useful in practice.

  • Beware of the absolute values when comparing relative errors
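
For example (a tiny sketch with made-up numbers), an absolute error that looks negligible can still be a large relative error when the true value itself is small:

a_true = 1e-4                  # hypothetical true value
a_hat  = 1.1e-4                # its approximation
print "absolute error = %.1e" % (a_hat - a_true)
print "relative error = %.1f%%" % ((a_hat - a_true)/a_true*100)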

Sources of error

Impreciseness in the theory or model itself

  • George Box: "essentially, all models are wrong, but some are useful"

Errors in numerical algorithm

  • Convergence error
    • Monte Carlo simulation
    • Numerical integration
  • Truncation and discretization
  • Termination of iterations

Error due to machine representation

  • Floating point number
  • Rounding

We cover the last category first.

IEEE 754 standard

First published in 1985, the standard includes:

  • floating point number representations
    • 16bit, 32bit, 64bit etc
    • special representation for NaN, Inf etc
  • rounding
    • default rule: round to nearest, ties to even digit
      • e.g., round to 4 significant digit: $3.1245 \rightarrow 3.124, 43515 \rightarrow 43520$
    • addition and multiplication are still commutative, but not associative
    • intermediate calculations could be done at higher precision to minimize rounding error
      • e.g. Intel CPU's FPU uses 80bit representation for intermediate calculation
  • basic operations
    • arithmetic: addition, multiplication, sqrt, etc.
    • conversion: between formats, to/from string, etc.
  • exception handling
    • invalid operation, divide by zero, overflow, underflow, inexact
    • the operation returns a value (could be NaN, Inf or 0), and set an error flag
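
A one-line illustration of the associativity caveat in the list above (the constants are arbitrary):

# addition is commutative but not associative: the result is rounded after every
# operation, so the grouping of the operands matters
lhs = (0.1 + 0.2) + 0.3
rhs = 0.1 + (0.2 + 0.3)
print "%.17g vs %.17g, equal? %s" % (lhs, rhs, lhs == rhs)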

Significance of IEEE 754

The IEEE 754 is a tremendously successful standard:

  • it allows users to consistently reason about floating point computation across platforms
    • before IEEE 754, floating point calculations behaved differently across hardware/vendors
  • IEEE 754 made compromise for practicality
    • many of them do not make sense in a strict mathematical sense
    • but they are necessary evils for the greater good
  • it establishes a clear interface for hardware designers and software developers
  • its main designer, William Kahan won the Turing award in 1989

The IEEE 754 standard is widely implemented in modern hardware and software,

  • 99% of CPUs are at least 99% IEEE 754 compliant
  • the OS/software situation is more complicated
    • Java's floating point arithmetic is not fully compliant without the strictfp flag (don't use it!)
    • Compiled languages like C++ depend on the hardware's implementation.

IEEE 754 floating point number format

  • IEEE floating point number standard

$$\underbrace{*}_{\text{sign bit}}\;\underbrace{********}_{\text{exponent}}\overbrace{\;1.\;}^{\text{implicit}}\;\underbrace{***********************}_{\text{fraction}} $$

$$(-1)^{\text{sign}}(\text{1 + fraction})2^{\text{exponent} - p}$$

  • 32 bit vs. 64 bit format
Format             Sign   Fraction (incl. the implicit bit)   Exponent   Bias $p$
IEEE 754 - 32 bit  1      24                                  8          127
IEEE 754 - 64 bit  1      53                                  11         1023
  • an implicit leading bit of 1 and the binary point before fraction bits
  • 0 and max exponent reserved for special interpretations: 0, NaN, Inf etc.

Examples of floating numbers

In [26]:
%pylab inline

import bitstring
import pandas as pd
import numpy as np
import fmt
import math, sys

pd.set_option("display.max_colwidth", 80)

def parts2f(sign, e, sig) :
    # reassemble a float from its sign bit, unbiased exponent and significand
    return ((-1)**sign*sig*2.**e)

def f2parts(v, flen) :
    # split a 32 or 64 bit float into its sign, exponent and fraction parts
    ps = {32: 9, 64 : 12}     # number of sign + exponent bits
    ed = ps[flen]
    eoff = 2**(ed - 2) - 1    # exponent bias: 127 for 32 bit, 1023 for 64 bit

    f1 = bitstring.BitArray(float=v, length=flen)
    signb = f1[:1]
    sigb = f1[ed:]
    eb = f1[1:ed]
    sig = (1. + 1.*sigb.uint/(2**(flen-ed))) if eb.uint > 0 else 2.*sigb.uint/(2**(flen-ed))
    e = eb.uint - eoff
    
    bins = np.array([signb.bin, eb.bin, sigb.bin])
    vals = np.array([1-signb.uint*2, e, sig])
    
    fmt.displayDF(pd.DataFrame(np.array([bins, vals]).T, columns=['Bits', 'Decimal'], 
                               index=["Sign", "Exp", "Fraction"]).T)
Populating the interactive namespace from numpy and matplotlib

WARNING: pylab import has clobbered these variables: ['f', 'e']
`%matplotlib` prevents importing * from pylab and numpy

f2parts takes a floating point number and its bit length (32 or 64) and shows its three parts.

In [27]:
f2parts(10.7, 32);
Sign Exp Fraction
Bits 0 10000010 01010110011001100110011
Decimal 1.0 3.0 1.33749997616
In [28]:
f2parts(-23445.25, 64)
Sign Exp Fraction
Bits 1 10000001101 0110111001010101000000000000000000000000000000000000
Decimal -1.0 14.0 1.43098449707

Range and precision

  • The range of floating point numbers is usually adequate in practice
    • the whole universe only has about $10^{80}$ protons ...
  • The machine precision is the more commonly encountered limitation
    • machine precision $\epsilon_m$ is the maximum $\epsilon$ that makes $1 + \epsilon = 1$
In [29]:
prec32 = 2**(-24)
prec64 = 2**(-53)

f32 = [parts2f(0, -126, 1), parts2f(0, 254-127, (2.-prec32*2)), prec32, -np.log10(prec32)]
f64 = [parts2f(0, -1022, 1), parts2f(0, 2046-1023, (2.-prec64*2)), prec64, -np.log10(prec64)]

fmt.displayDF(pd.DataFrame(np.array([f32, f64]), index=['32 bit', '64 bit'], 
                           columns=["Min", "Max", "Machine Precision", "# of Significant Digits"]), "5g")
Min Max Machine Precision # of Significant Digits
32 bit 1.1755e-38 3.4028e+38 5.9605e-08 7.2247
64 bit 2.2251e-308 1.7977e+308 1.1102e-16 15.955
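
The same limits can be read directly from numpy (a quick cross-check; note that numpy's eps convention differs by a factor of two from the half-ulp definition above):

import numpy as np

# np.finfo's eps is the gap between 1 and the next representable float (2**-52 for
# 64 bit), i.e. twice the half-ulp machine precision used in the table above
for t in [np.float32, np.float64]:
    fi = np.finfo(t)
    print "%s: min %.5g, max %.5g, eps %.5g" % (t.__name__, fi.tiny, fi.max, fi.eps)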

Examples of floating point representations

In [30]:
f2parts(-0., 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000000000000
Decimal -1.0 -127.0 0.0
In [31]:
f2parts(np.NaN, 32);
Sign Exp Fraction
Bits 0 11111111 10000000000000000000000
Decimal 1.0 128.0 1.5
In [32]:
f2parts(-np.Inf, 32);
Sign Exp Fraction
Bits 1 11111111 00000000000000000000000
Decimal -1.0 128.0 1.0
In [33]:
f2parts(-1e-43, 32);
Sign Exp Fraction
Bits 1 00000000 00000000000000001000111
Decimal -1.0 -127.0 1.69277191162e-05

Floating point computing is NOT exact

In [34]:
a = 1./3
b = a + a + 1. - 1.
c = 2*a
print "b == c? %s !" % (b == c)
print "b - c = %e" % (b - c)
b == c? False !
b - c = -1.110223e-16

In [35]:
f2parts(b, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010100
Decimal 1.0 -1.0 1.33333333333
In [36]:
f2parts(c, 64);
Sign Exp Fraction
Bits 0 01111111110 0101010101010101010101010101010101010101010101010101
Decimal 1.0 -1.0 1.33333333333

Can't handle numbers too large or small

In [37]:
x = 1e-308
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-308 is 1.000000e+308

In [38]:
x = 1e-309
print "inverse of %e is %e" % (x, 1/x)
inverse of 1.000000e-309 is inf

Limited precision

  • Small numbers are often lost in addition and subtraction
In [39]:
print 1. - 1e-16 == 1.
False

In [40]:
print 1. + 1e-16 == 1.
True

In [41]:
a = math.factorial(30) # in Python's bignum
print "a = ", a
print "maxint = ", sys.maxint
print "a - maxint == a ?\n", (a - sys.maxint) == a

b = float(math.factorial(30))
print "b = ", b
print "b - maxint == b ?", (b - sys.maxint) == b
a =  265252859812191058636308480000000
maxint =  2147483647
a - maxint == a ?
False
b =  2.65252859812e+32
b - maxint == b ? True

Subtle errors

The limited precision can cause subtle errors that are puzzling even for experienced programmers.

In [42]:
a = np.arange(0, 3. - 1., 1.)
print "a = ", a
print "a/3 = ", a/3.
a =  [ 0.  1.]
a/3 =  [ 0.          0.33333333]

In [43]:
b = np.arange(0, 1 - 1./3., 1./3.)
print "b = ", b
b =  [ 0.          0.33333333  0.66666667]

It should become your instinct to always use an error bound when comparing floating point numbers, or to use integer comparisons as the condition for ending loops.

In [44]:
b = arange(0, 1 - 1./3. - 1e-10, 1./3)
print "b = ", b
b =  [ 0.          0.33333333]
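
For instance, a minimal sketch of tolerance-based comparison (the explicit bound of 1e-12 is an arbitrary choice):

import numpy as np

a = 0.1 + 0.2
print a == 0.3                    # False: exact comparison of floats is fragile
print abs(a - 0.3) < 1e-12        # use an explicit error bound ...
print np.isclose(a, 0.3)          # ... or numpy's tolerance-based comparison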

Unexpected twist

Compute $e^x$ via Taylor expansion:

$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} = 1 + x + \frac{1}{2}x^2 + ...$$

Other types of errors may be mistaken for floating point rounding errors:

In [51]:
import math

def exp_taylor(x, n) :
    return sum([1.*x**i/math.factorial(i) for i in np.arange(1, n)]) + 1.

x = 5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, 100))/np.exp(x)*100)
x = -5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, 100))/np.exp(x)*100)

print "exp_taylor(%.f) = %f, exp(%.f) = %f" % (x, exp_taylor(-5, 150), x, np.exp(-5))
rel err for exp(5) is 0.06%
rel err for exp(-5) is 474.72%
exp_taylor(-5) = -0.025249, exp(-5) = 0.006738

Catastrophic cancellation

Dramatic loss of precision can happen when very similar numbers are subtracted

  • Evaluate the following function around 0:

$$f(x) = \frac{1}{x^2}(1-\cos(x))$$

  • Mathematically: $\lim_{x\rightarrow 0}f(x) = \frac{1}{2}$

The straightforward implementation fails spectacularly for small $x$.

In [46]:
def f(x) :
    return (1.-np.cos(x))/x**2

def g(x) :
    return (np.sin(.5*x)**2*2)/x**2

x = np.arange(-4e-8, 4e-8, 1e-9)
df = pd.DataFrame(np.array([f(x), g(x)]).T, index=x, columns=['Numerical f(x)', 'True Value']);
df.plot();

How did it happen?

In [47]:
x = 1.1e-8
print "cos(%.16e) = %.16f" % (x, cos(x))
print "1-cos(%.e) = %.16e" % (x, 1.-cos(x))
print "true value of 1-cos(%.e) = %.16e" % (x, 2*sin(.5*x)**2)
cos(1.0999999999999999e-08) = 0.9999999999999999
1-cos(1e-08) = 1.1102230246251565e-16
true value of 1-cos(1e-08) = 6.0499999999999997e-17

  • Dramatic precision loss when subtracting two very similar numbers
    • All the identical leading significant digits cancel
    • The result has far fewer significant digits
  • Catastrophic cancellation can happen in practice
    • Deltas are often computed by bump and revalue
    • Very small bump sizes should be avoided

A real example in Finance

The CIR process is widely used in quantitative finance, e.g., in short rate models and the Heston model:

$$ dr_t = \kappa(\mu - r_t) dt + \sigma \sqrt{r_t} dW_t $$

the variance of $r_t$ at $t > 0$:

$$\text{Var}[r_t | r_0] = \frac{r_0\sigma^2}{\kappa}(e^{-\kappa t} - e^{-2\kappa t}) + \frac{\mu\sigma^2}{2\kappa}(1-e^{-\kappa t})^2$$

Where does the catastrophic cancellation arise?

In [48]:
k = arange(-3e-8, 3e-8, 1e-9)*1e-7
t = 1
v = (1-exp(-k*t))/k
plot(k, v, '.-')
title(r'$\frac{1}{\kappa}(1-e^{-\kappa t})$')
xlabel(r'$\kappa$');
In [49]:
import sympy as sp
sp.init_printing(use_latex = True)

k = sp.symbols("k", positive = True)
e = (1-sp.exp(-k))/k
sp.limit(e, k, 0)
Out[49]:
$$1$$
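
One common remedy (a sketch, not from the original notes) is to rewrite the difference with a function that evaluates $e^x - 1$ accurately for small $x$, such as numpy's expm1:

import numpy as np

kappa = np.arange(1e-12, 3e-8, 1e-9)
t = 1.
v_naive  = (1. - np.exp(-kappa*t))/kappa    # loses accuracy: 1 - exp(-kappa*t) cancels
v_stable = -np.expm1(-kappa*t)/kappa        # expm1(x) = exp(x) - 1, computed without cancellation
print "max discrepancy: %.3e" % max(abs(v_naive - v_stable))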

Condition number

Consider the relative error of a function with multiple arguments: $f = f(x_1, ..., x_n)$:

\begin{eqnarray} \small df &=& \sum_i \frac{\partial f}{\partial x_i} dx_i \\ \frac{df}{f} &=& \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \frac{dx_i}{x_i} \end{eqnarray}

Assuming the input arguments' relative errors are $\epsilon_i$ (which could be negative):

$$ \small \left| \frac{\Delta f}{f} \right| = \left| \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \epsilon_i \right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \left|\epsilon_i\right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \epsilon \equiv k(f) \epsilon $$

where $\epsilon$ is the maximum relative error of all inputs.

  • $k(f) = \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right|$ is defined as the condition number of a function,
  • it is the maximum growth factor of the relative error.
  • the calculation loses about $\log_{10}(k(f))$ decimal digits of accuracy.

Condition number examples

Condition number is the systematic way to detect potential numerical problems:

  • $f(x_1, x_2) = x_1 - x_2$

    $k(f) = \frac{|x_1| + |x_2|}{|x_1 - x_2|}$, ill conditioned when $x_1 \approx x_2$: catastrophic cancellation

  • $f(x_1, x_2) = x_1 x_2$

    $k(f) = 2$, the multiplication is well conditioned

In practice, it is impossible to run this analysis over complicated calculations:

  • Have to rely upon monitoring and testing
  • Know where to look when numerical instability arises
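
To make the subtraction example concrete, the sketch below (with made-up inputs) perturbs one operand by a single rounding error and checks that the relative error of the output grows by roughly $k(f)$:

import numpy as np

x1, x2 = 1. + 1e-6, 1.                       # nearly equal inputs (made-up numbers)
kf  = (abs(x1) + abs(x2))/abs(x1 - x2)       # condition number of the subtraction
eps = 2.**-53                                # relative rounding error of the inputs
print "k(f) = %.2e, expect to lose ~%.1f digits" % (kf, np.log10(kf))

# perturb one input by a single rounding error and watch the output's relative error
rel_out = abs((x1*(1. + eps) - x2) - (x1 - x2))/abs(x1 - x2)
print "output relative error %.2e vs bound k(f)*eps = %.2e" % (rel_out, kf*eps)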

Well-posed problem can be unstable

Consider the previous example of $f(x) = \frac{1}{x^2}(1-\cos(x))$:

$$ k(f) = \left| 2 + \frac{x \sin{\left (x \right )}}{\cos{\left (x \right )} - 1} \right| $$

  • $k(f) \rightarrow \infty$ near $x = 2\pi$, it is ill-posed near $x = 2\pi$ due to catastrophic cancellation.
  • $k(f) \rightarrow 0$ near $x = 0$, it is well-posed near $x = 0$.

The numerical algorithm can be unstable even if the problem itself is well-posed.

  • A better algorithm is:

    $$f(x) = \frac{2\sin^2(\frac{x}{2})}{x^2}$$

But an ill-posed problem is always numerically unstable.

In [50]:
x, k = sp.symbols('x, k', real = True)
f = (1-sp.cos(x))/x**2
k = sp.simplify(f.diff(x)*x/f)
sp.limit(k, x, 0)
sp.simplify(1-sp.cos(2*x))
Out[50]:
$$2 \sin^{2}{\left (x \right )}$$

Ill-posed numerical algorithm

Ill-posed problems are generally not suitable for numerical solutions.

The Mandelbrot set is a famous example:

  • defined by an iterative algorithm on complex numbers: $z_{i+1} = z_i^2 + c$, starting from $z_0 = 0$
  • the color of each point $c$ is tied to the number of iterations before $|z_i|$ goes above a threshold
  • obviously unstable because of the iterative squaring and addition
  • showing exquisite details of numerical instability (see the sketch below)
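
A minimal escape-time sketch of the iteration (the grid resolution, iteration count and plotting details are arbitrary choices):

import numpy as np
import matplotlib.pyplot as plt

# escape-time iteration z <- z^2 + c over a grid of c values
re, im = np.meshgrid(np.linspace(-2., 1., 500), np.linspace(-1.5, 1.5, 500))
c = re + 1j*im
z = np.zeros_like(c)
count = np.zeros(c.shape)

for i in range(50):
    z = z*z + c
    diverged = np.abs(z) > 2.
    count += ~diverged                # count iterations until |z| exceeds the threshold
    z[diverged] = 2.                  # clamp diverged points to avoid overflow

plt.imshow(count, extent=[-2., 1., -1.5, 1.5])
plt.title('Mandelbrot set: iterations before divergence');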

Rules of engagement for floats

  • Always allow for an error tolerance when comparing numbers
  • Use double precision floating point number by default (unless you know what you are doing)
  • Avoid adding and subtracting numbers of very different magnitudes
  • Avoid subtracting very similar numbers
    • avoid using very small bump sizes when computing delta
  • Take care of the limits, especially around 0
  • Use well designed numerical libraries whenever possible
    • Even simple numerical algorithm can have spectacular pitfalls
  • Test and monitor

Portability

  • Floating point computation is usually not reproducible across different platforms
  • The same code can produce different results with different compilers and hardware
    • The difference can be significant in optimization and iterative methods
  • Design your unit tests to be cross platform by allowing enough error tolerance
  • Testing and monitoring are critical

Assignments

Required Reading:

  • Bindel and Goodman: Chapter 2

Recommended Reading:

  • Review linear algebra (any elementary linear algebra book)
  • Andersen and Piterbarg: Chapter 1

Homework