Scientific Computing in Finance

MATH-GA 2048, Spring 2016
Courant Institute of Mathematical Sciences
New York University

Instructors

Hongwei Cheng

Head of Research and CRO, Mill Hill Capital
Ph.D. Mathematics, New York University

Yadong Li

Head of Trading Book Risk Modelling, Quantitative Analytics, Barclays
Ph.D. in Physics, M.S. in Computer Sciences, University of Wisconsin-Madison
Master of Financial Engineering, University of California, Berkeley

Teaching Assitant

Maulik Desai

Ph.D. EECS, Columbia University

Lecture 1: Introduction

Topics

Introduction
Best software development pratices
Errors and floating point computation

Objectives of this class

Teach fundamental principles of scientific computing
Teach the most common numerical techniques in quantitative Finance
Develop intuitions and essential skills using real world examples
Help students to start a career in quantitative Finance

We expect you

Know basic calculus and linear algebra
Know basic stochastic calculus and derivative pricing theories
Have some prior programming experience
Are willing to learn through hard work

Major fields in quantitative finance

Derivative pricing and hedging
Risk management and regulatory capital
Portfolio management
Quantitative strategy/Algo trading

Main topics in this class

A rough schedule of the class:

Week	Main Topics	Practical Problems	Instructor
1	Introduction and Error Analysis	Error and floating point numbers	Y. Li
2-3	Linear Algebra	Portfolio optimization, PCA, least square	Y. Li
4-5	Rootfinding and interpolation	Curve building	Y. Li
6	Derivatives and integration	Hedging, Risk Transformation	Y. Li
7-8	Monte Carlo simulation	Exotic pricing	Y. Li
9-10	Optimization	Model calibration	H. Cheng
11-12	ODE/PDE	Pricing	H. Cheng
13	Entropy and Allocation	Advanced practical problems	Y. Li

Text book

Principles of scientific computing online book

D Bindel and J Goodman, 2009

References

Interest rate modelling, volume I: Foundations and Vanilla Models
Interest rate modelling, volume II: Vanilla Models

L Anderson and V. Piterbarg, 2010
Monte Carlo method in Financial Engineering

P. Glasserman 2010

Lecture notes and homework

Available online at: http://yadongli.github.io/nyumath2048

Grades (tentative)

Homework 30%
Project (tentative 20%, we may cancel project and move this portion to the homework)
Final 50%
Extra credits
- extra credit homework problems
- for pointing out errors in slides/homework

Best Software Practices

Roberto Waltman: In the one and only true way. The object-oriented version of "Spaghetti code" is, of course, "Lasagna code" (too many layers).

Modern hardware

Processor speed/throughput continues to grow at an impressive rate
- Moore's law has ruled for a long time
- recent trends: multi-core, GPU
Memory capacity continues to grow, but memory speed only grows at a (relatively) slow rate

Memory hierarchy

Storage	Bandwidth	Latency	Capacity
L-1 Cache	98GB/s	4 cycles	32K/core
L-2 Cache	50GB/s	10 cycles	256K/core
L-3 Cache	30GB/s	40 - 75 cycles	8MB shared
RAM	15GB/s	60-100 ns	2 - 128GB
Disk	50MB/s	~5ms	$\infty$
Network	vary	much longer	$\infty$

About high performance

Understand if the problem is computation or bandwidth bound
- Most problems in practice are bandwidth bound
Computation bound problems:
- caching
- vectorization/parallelization/GPU
- optimize your "computational kernel"
Bandwidth bound problems:
- optimize cache and memory access
- require highly specialized skills and low level programming (like in C/C++/Fortran)
Premature optimization for speed is the root of all evil
- Simplicity and generality of the code are often sacrificed
- Execution speed is usually not the most critical factor in practice

Before optimizing speed

Correctness

"those who would give up correctness for a little temporary performance deserve neither correctness nor performance"
"the greatest performance improvement of all is when system goes from not-working to working"

Modifiability

"modification is undesirable but modifiability is paramount"
modular design, learn from hardware designers

Simplicity

no obvious defects is not the same as obviously no defects

Other desirable features:

Scalability, robustness, Easy support and maintenanc etc.

In practice, what often trumps everything else is:

Time to market !!

Conventional wisdom for programming

Typical advice includes:

Choose good names for your variables and functions
Write comments and documents
Write good tests
Keep coding style consistent

These are good advice, but quite superficial.

all of them have been compromised, even in extremely successful projects
they are usually not the critical differentiator between successes and failures

Best advice for the real world:

by Paul Phillips (quoting George Orwell's 1984) "We're doing it all wrong":

War is peace

Fight your war at the boundary so that you can have peace within
Challenge your clients to reduce the scope of your deliverable
Define a minimal set of interactions to the outside world

Ignorance is strength

Design dumb modules that only do one thing
Keep it simple and stupid
Separation of concerns, modular

Freedom is slavery

Must be disciplined in testing, code review, releases etc.
Freedom $\rightarrow$ possibilities $\rightarrow$ complexity $\rightarrow$ unmodifiable code $\rightarrow$ slavery

Sun Tzu: "the supreme art of quant modelling is to solve problems without coding"

Choose the right programming tools

Programming paradigm: procedural, OOP, functional
Expressivity: fewer lines of code for the same functionaltiy
Ecosystem: library, tools, community, industry support

There is no single right answer

Different job calls for different tools
But most of the time, the choice is already made (by someone else)

Expressivity

$$ $$

When you have the opportunity, choose the more expressive language for the job.

Eric Raymond, "How to Become a Hacker":

"Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

We use Python

Python

a powerful dynamic and strongly typed scripting language
extremely popular in scientific computing
strong momentum in quantitative finance
highly productive, low learning curve

Python ecosystem

numpy/scipy: high quality matrix and numerical library
matplotlib: powerful visualization library
IPython: highly productive interactive working environment
Pandas: powerful data and time series analysis package

Python resources

Enthought Canopy: a commercial Python distribution, free for students
Anaconda: another popular and free Python distribution
Google and Youtube

Python version:

Python 2.7.x : python 3 is not backward compatible
With the latest IPython, numpy/pandas/scipy etc.
either 32bit or 64bit version works

Python setup instructions:

available at: http://yadongli.github.io/nyumath2048

Errors and floating point computation

James Gosling: "95% of the folks out there are completely clueless about floating-point"

Absolute and relative error

If the true value $a$ is approximated by $\hat{a}$:

Absolute error: $\hat{a} - a$
Relative error: $\frac{\hat{a} - a}{a}$

Both measure are useful in practice.

Beware of small $|a|$ when comparing relative errors
Is the error normal or lognormal?

Sources of error

Imprecision in the theory or model itself

George Box: "essentially, all models are wrong, but some are useful"

Errors in numerical algorithm

Convergence error
- Monte Carlo simulation
- Numerical integration
Truncation and discretization
Termination of iterations

Error due to machine representation

Floating point number
Rounding

We cover the last category in this lecture.

IEEE 754 in a nutshell

First published in 1985, the IEEE 754 standard includes:

Floating point number representations
- 16bit, 32bit, 64bit etc
- special representation for NaN, Inf etc
Rounding rules
Basic operations
- arithmetic: add, multiplication, sqrt etc
- conversion: btw formats, to/from string etc
Exception handling
- invalid operation, divide by zero, overflow, underflow, inexact
- the operation returns a value (could be NaN, Inf or 0), and set an error flag

Rounding under IEEE754

Default rule: round to the nearest, ties to even digit

Example, round to 4 significant digit:
- $3.1245 \rightarrow 3.124$
- $43515 \rightarrow 43520$
Addition and multiplication are commutative with rounding, but not associative
Intermediate calculations could be done at higher precision to minimize numerical error, if supported by the hardware
- Intel CPU's FPU uses 80bit representation for all intermediate calculation, even if input/outputs are in 32 bit floats
- Early version of Java (before v1.2) enforces intermediate calculation to be in 32/64bits for float/double for binary compatibility - a poor choice

Importance of IEEE 754

IEEE 754 is a tremendously successful standard:

Allows users to consistently reason about floating point computation
- before IEEE 754, floating point number calculation behaves differently across different platforms
IEEE 754 made compromise for practicality
- many of them do not make strict mathematical sense
- but they are necessary evils for the greater goods
Establishes a clear interface between hardware designers and software developers
Its main designer, William Kahan won the Turing award in 1989

The IEEE 754 standard is widely adopted in modern hardware and software,

The vast majority of general purpose CPUs are IEEE 754 compliant
the OS/software situation is more complicated
- Java strictfp flag is a poor choice, don't use it!
- Compiled language like C++ depends on the hardware's implementation.

IEEE 754 floating point number format

IEEE floating point number standard

$$\underbrace{*}_{\text{sign bit}}\;\underbrace{********}_{\text{exponent}}\overbrace{\;1.\;}^{\text{implicit}}\;\underbrace{***********************}_{\text{fraction}} $$

$$(-1)^{\text{sign}}(\text{1 + fraction})2^{\text{exponent} - p}$$

32 bit vs. 64 bit format

Format	Sign	Fraction	Exponent	$p$
IEEE 754 - 32 bit	1	23	8	127
IEEE 754 - 64 bit	1	52	11	1023

an implicit leading bit of 1 and the binary point before fraction bits
0 and max exponent reserved for special interpretations: 0, NaN, Inf etc.

Examples of floating numbers¶

f2parts takes a floating number and its bit length (32 or 64) and shows the three parts.

In [2]:

f2parts(10.7, 32);

	Sign	Exp	Fraction
Bits	0	10000010	01010110011001100110011
Decimal	1.0	3.0	1.33749997616

In [3]:

f2parts(-23445.25, 64)

	Sign	Exp	Fraction
Bits	1	10000001101	0110111001010101000000000000000000000000000000000000
Decimal	-1.0	14.0	1.43098449707

Range and precision

The range of floating point number are usually adequte in practice
- the whole universe only have $10^{80}$ protons ...
The machine precision is the more commonly encountered limitation
- machine precision $\epsilon_m$ is the maximum $\epsilon$ that makes $1 + \epsilon = 1$

In [4]:

prec32 = 2**(-24)
prec64 = 2**(-53)

f32 = [parts2f(0, -126, 1), parts2f(0, 254-127, (2.-prec32*2)), prec32, -np.log10(prec32)]
f64 = [parts2f(0, -1022, 1), parts2f(0, 2046-1023, (2.-prec64*2)), prec64, -np.log10(prec64)]

fmt.displayDF(pd.DataFrame(np.array([f32, f64]), index=['32 bit', '64 bit'], 
                           columns=["Min", "Max", "Machine Precision", "# of Significant Digits"]), "5g")

	Min	Max	Machine Precision	# of Significant Digits
32 bit	1.1755e-38	3.4028e+38	5.9605e-08	7.2247
64 bit	2.2251e-308	1.7977e+308	1.1102e-16	15.955

Examples of floating point representations¶

In [5]:

f2parts(-0., 32);

	Sign	Exp	Fraction
Bits	1	00000000	00000000000000000000000
Decimal	-1.0	-127.0	0.0

In [6]:

f2parts(np.NaN, 32);

	Sign	Exp	Fraction
Bits	0	11111111	10000000000000000000000
Decimal	1.0	128.0	1.5

In [7]:

f2parts(-np.Inf, 32);

	Sign	Exp	Fraction
Bits	1	11111111	00000000000000000000000
Decimal	-1.0	128.0	1.0

In [8]:

f2parts(-1e-43, 32);

	Sign	Exp	Fraction
Bits	1	00000000	00000000000000001000111
Decimal	-1.0	-127.0	1.69277191162e-05

Floating point computing is NOT exact¶

In [9]:

a = 1./3
b = a + a + 1. - 1.
c = 2*a
print "b == c? %s !" % (b == c)
print "b - c = %e" % (b - c)

b == c? False !
b - c = -1.110223e-16

In [10]:

f2parts(b, 64);

	Sign	Exp	Fraction
Bits	0	01111111110	0101010101010101010101010101010101010101010101010100
Decimal	1.0	-1.0	1.33333333333

In [11]:

f2parts(c, 64);

	Sign	Exp	Fraction
Bits	0	01111111110	0101010101010101010101010101010101010101010101010101
Decimal	1.0	-1.0	1.33333333333

Can't handle numbers too large or small¶

In [12]:

x = 1e-308
print "inverse of %e is %e" % (x, 1/x)

inverse of 1.000000e-308 is 1.000000e+308

In [13]:

x = 1e-309
print "inverse of %e is %e" % (x, 1/x)

inverse of 1.000000e-309 is inf

Limited precision

Small number (in relative sense) is lost in addition and subtraction

In [14]:

print 1. - 1e-16 == 1.

False

In [15]:

print 1. + 1e-16 == 1.

True

In [16]:

largeint = 2**32 - 1

print "Integer arithmetic:"
a = math.factorial(30) # in Python's bignum
print "a = ", a
print "a - %d == a ?" % largeint, (a - largeint) == a

print "\nFloating point arithmetic:"
b = float(math.factorial(30))
print "b = ", b
print "b - %d == b ?" % largeint, (b - largeint) == b

Integer arithmetic:
a =  265252859812191058636308480000000
a - 4294967295 == a ? False

Floating point arithmetic:
b =  2.65252859812e+32
b - 4294967295 == b ? True

Subtle errors

The limited precision can cause subtle errors that are puzzling even for experiened programmers.

In [17]:

a = np.arange(0, 3. - 1., 1.)
print "a = ", a
print "a/3 = ", a/3.

a =  [ 0.  1.]
a/3 =  [ 0.          0.33333333]

In [18]:

b = np.arange(0, 1 - 1./3., 1./3.)
print "b = ", b

b =  [ 0.          0.33333333  0.66666667]

Avoid equality test

Equality test in floating point number is always a bad idea:

Mathematical equality often fails floating point equality test because of rounding errors
Often produce off-by-one number of loops when used to test end loop conditions

It should become your instinct to:

always use error bound when comparing floating numbers, such as numpy.allclose()
try to use integer comparison as the end condition for loops
- integer representation and arithmetic in IEEE 754 is exact

In [19]:

b = arange(0, 1 - 1./3. - 1e-10, 1./3)
print "b = ", b

b =  [ 0.          0.33333333]

Unexpected twist

Compute $e^x$ via Taylor expansion: $$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} = 1 + x + \frac{1}{2}x^2 + ...$$

Other types of errors may be mistaken as floating point rounding error and escape detection:

In [20]:

import math

def exp_taylor(x, n) :
    return sum([1.*x**i/math.factorial(i) for i in np.arange(1, n)]) + 1.

terms = 100
x = 5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, terms))/np.exp(x)*100)
x = -5
print "rel err for exp(%.f) is %.2f%%" % (x, (np.exp(x) - exp_taylor(x, terms))/np.exp(x)*100)

print "exp_taylor(%.f) = %f, exp(%.f) = %f" % (x, exp_taylor(-5, terms), x, np.exp(-5))

rel err for exp(5) is 0.06%
rel err for exp(-5) is 474.72%
exp_taylor(-5) = -0.025249, exp(-5) = 0.006738

Catastrophic cancellation

Dramatic loss of precision can happen when very similar numbers are subtracted

Value the follwing function around 0:

$$f(x) = \frac{1}{x^2}(1-\cos(x))$$

Mathematically: $\lim_{x\rightarrow 0}f(x) = \frac{1}{2}$

The straight forward implementation failed spectacularly for small $x$.

In [22]:

def f(x) :
    return (1.-np.cos(x))/x**2

def g(x) :
    return (np.sin(.5*x)**2*2)/x**2

x = np.arange(-4e-8, 4e-8, 1e-9)
df = pd.DataFrame(np.array([f(x), g(x)]).T, index=x, columns=['Numerical f(x)', 'True Value']);
df.plot();

How did it happen?¶

In [23]:

x = 1.1e-8
print "cos(%.16e) = %.16f" % (x, cos(x))
print "1-cos(%.e) = %.16e" % (x, 1.-cos(x))
print "true value of 1-cos(%.e) = %.16e" % (x, 2*sin(.5*x)**2)

cos(1.0999999999999999e-08) = 0.9999999999999999
1-cos(1e-08) = 1.1102230246251565e-16
true value of 1-cos(1e-08) = 6.0499999999999997e-17

Dramatic precision loss when subtracting numbers that are very similar
- All the identical leading significants cancels
- The result has much fewer number of significant digits
Catestrophic cancellation could happen in practice
- Deltas are often computed by bump and re-value
- Very small bump sizes should be avoided

A real example in Finance

the CIR process is widely used in quant finance, eg, in the short rate and Heston model:

$$ dr_t = \kappa(\mu - r) dt + \sigma \sqrt{r_t} dW_t $$

the variance of $r_t$ at $t > 0$:

$$\text{Var}[r_t | r_0] = \frac{r_0\sigma^2}{\kappa}(e^{-\kappa t} - e^{-2\kappa t}) + \frac{\mu\sigma^2}{2\kappa}(1-e^{-\kappa t})^2$$

how the catastrophic cancellation arises?

In [31]:

k = arange(-3e-8, 3e-8, 1e-9)*1e-7
t = 1
v = (1-exp(-k*t))/k
plot(k, v, '.-')
title(r'$\frac{1}{\kappa}(1-\exp(-\kappa))$')
xlabel(r'$\kappa$');

Condition number

Consider the relative error of a function with multiple arguments: $f = f(x_1, ..., x_n)$:

$$ \small df = \sum_i \frac{\partial f}{\partial x_i} dx_i \iff \frac{df}{f} = \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \frac{dx_i}{x_i} $$

Assuming input argument's relative errors are $\epsilon_i$ (could be negative):

$$ \small \left| \frac{\Delta f}{f} \right| = \left| \sum_i \frac{x_i}{f}\frac{\partial f}{\partial x_i} \epsilon_i \right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \left|\epsilon_i\right| \le \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right| \epsilon_m \equiv k(f) \epsilon_m $$

where $\epsilon_m$ is the maximum relative error of all inputs.

$k(f) = \sum_i \left| \frac{x_i}{f}\frac{\partial f}{\partial x_i} \right|$ is defined as the condition number of a function,
it is the maximum growth factor of the relative error.
the calculation loses about $\log_{10}(k(f))$ decimal digits of accuracy.

Well-posed and ill-posed problems

Condition number is the systematic approach to detect potential numerical problems:

$f(x_1, x_2) = x_1 - x_2$

$k(f) = \frac{|x_1| + |x_2|}{|x_1 - x_2|}$, ill conditioned when $x_1 \approx x_2$, catestrophic cancellation
$f(x_1, x_2) = x_1 x_2$

$k(f) = 2$, the multiplication is well conditioned

The problem is ill-posed if its condition number is large.

In practice, it is impossible to run condition number analysis over complicated calculations:
We have to rely upon monitoring and testing

Well-posed problem can be unstable

Consider the previous example of $f(x) = \frac{1}{x^2}(1-\cos(x))$:

$$ k(f) = \left| 2 + \frac{x \sin{\left (x \right )}}{\cos{\left (x \right )} - 1} \right| $$

$k(f) \rightarrow \infty$ near $x = 2\pi$, it is ill-posed near $x = 2\pi$ due to catastrophic cancellation.
$k(f) \rightarrow 2$ near $x = 0$, it is well-posed near $x = 0$.

The numerical algorithm can be unstable even if the problem itself is well-posed.

A better algorithm near $x=0$ is $f(x) = \frac{2}{x^2}\sin^2(\frac{x}{2})$

But an ill-posed problem is always numerically unstable by definition.

Direct numerical calculation is unstable around $x = 2\pi$
Does this help? $f(x) \approx \frac{1}{2x^2}(x-2\pi)^2$

Ill-posed numerical algorithm

Ill-posed problem are generally not suitable for numerical solutions.

The Mandelbrot set is a famous example:

defined an iterative algorithm in complex number: $z_{i+1} = z_i^2 + c$
the color of $z_0$ is tied to the number of iterations for $|z_i|$ to go above a threshold
obviously unstable because of the iterative additions
showing exquisite details of numerical instability

Rules of engagement for floats

Always allow errors when comparing numbers
Use double precision floating point number by default
- unless you know what you are doing (really?)
Avoid adding and subtracting numbers of very different magnitudes
- avoid using very small bump sizes when computing delta
Take care of the limits, especially around 0
- many (careless) numerical implementations broke down around 0
Use well designed numerical libraries whenever possible
- Even simple numerical algorithm can have spectacular pitfalls
Test and monitor

Portability

Floating point computation is usually not reproducible across different platforms
Same code will produce different results with different compiler and hardware
- The difference can be significant in optimization and iterative methods
Design your unit test to be cross platform by giving enough error allowance
Test and monitoring are critical

Assignments

Required Reading:

Bindel and Goodman: Chapter 2