Dissertation Alvaro Leitao Rodriguez
DOI
10.4233/uuid:fc4d5fc5-cee9-44ef-bca1-e69250c1480f
Publication date
2017
Document Version
Final published version
Citation (APA)
Leitao Rodriguez, A. (2017). Hybrid Monte Carlo methods in computational finance. [Dissertation (TU Delft),
Delft University of Technology]. https://doi.org/10.4233/uuid:fc4d5fc5-cee9-44ef-bca1-e69250c1480f
Important note
To cite this publication, please use the final published version (if applicable).
Please check the document version above.
Copyright
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent
of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Takedown policy
Please contact us and provide details if you believe this document breaches copyrights.
We will remove access to the work immediately and investigate your claim.
To my parents;
Monte Carlo methods are highly appreciated and intensively employed in computa-
tional finance in the context of financial derivatives valuation or risk management. The
method offers valuable advantages like flexibility, easy interpretation and straightfor-
ward implementation. Furthermore, the dimensionality of the financial problem can
be increased without reducing the efficiency significantly. The latter feature of Monte
Carlo methods is important since it represents a clear advantage over other competing
numerical methods. Furthermore, in the case of option valuation problems in multiple
dimensions (typically more than five), the Monte Carlo method and its variants become
the only possible choices. Basically, the Monte Carlo method is based on simulating
possible scenarios of an underlying process and then aggregating their values into a
final solution. Pricing derivatives on equity and interest rates, risk assessment or
portfolio valuation are some of the representative examples in finance, where Monte Carlo
methods perform very satisfactorily. The main drawback attributed to these methods is
the rather poor balance between computational cost and accuracy, given the theoretical
rate of Monte Carlo convergence. Based on the central limit theorem, the Monte Carlo
method requires a hundred times more scenarios to reduce the error by one order of
magnitude, i.e. the rate of convergence is O(n^(-1/2)).
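This O(n^(-1/2)) behaviour is easy to observe empirically. The sketch below (illustrative only, not taken from the thesis) estimates E[X²] for X ~ N(0,1), whose exact value is 1, and reports the standard error, which shrinks by one order of magnitude per hundredfold increase in the number of samples:

```python
import math
import random

def mc_estimate(n, seed=1):
    """Plain Monte Carlo estimate of E[f(X)] with f(x) = x^2, X ~ N(0, 1).

    The exact value is 1, so the estimation error can be observed directly.
    """
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    mean = sum(samples) / n
    # Unbiased sample variance of f(X), used for the standard error.
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    std_error = math.sqrt(var / n)  # decays as O(n^(-1/2))
    return mean, std_error

for n in (100, 10_000, 1_000_000):
    est, se = mc_estimate(n)
    print(f"n = {n:>9}: estimate = {est:.4f}, standard error = {se:.5f}")
```

With each hundredfold increase of n, the printed standard error drops by roughly a factor of ten, matching the theoretical rate.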
However, the application of basic Monte Carlo methods, usually called plain Monte
Carlo methods, is often not sufficient to solve modern problems appearing in the finan-
cial sector. Problems like the accurate computation of sensitivities and early-exercise fi-
nancial derivatives in multiple dimensions are particularly interesting examples. Except
for the simplest asset price stochastic models, the generation of the scenarios is not al-
ways a trivial task. Although the so-called Taylor-based discretization schemes, like the
Euler-Maruyama scheme, are heavily used, more involved mechanisms are required to
achieve the goal of accuracy and efficiency in a number of cases. For the computation
of the sensitivities and for early-exercise valuation, the use of plain Monte Carlo is either
not sufficient or computationally unaffordable.
Therefore, hybrid solution methods, i.e., combining Monte Carlo methods with some
other methodologies, can be developed in order to overcome the issues and provide fast,
accurate and consistent computational results. This can be done by relying on mathe-
matical, computational approaches, or combinations of both. Hybrid solution methods
are becoming essential tools to deal with the increasing complexity in quantitative fi-
nance.
In this thesis, we propose Monte Carlo-based hybrid solution methods for some cut-
ting edge problems in computational finance. From the mathematical side, we employ
Fourier-based techniques that, in combination with the Monte Carlo methods, result
in efficient methodologies to simulate and evaluate sensitivities under stochastic local
volatility models, that are nowadays used for foreign-exchange (FX) rate options. We also
employ some recent advances in scientific computing, like high-performance
computing on GPUs, to extend the applicability of a Monte Carlo method for pricing
multi-dimensional early-exercise options and thus enable the computation of very high-
dimensional problems.
In Chapter 1, a general overview of the methodologies that commonly appear in
computational finance is presented. By considering the solution of partial differential
equations (PDEs) provided by the Feynman-Kac theorem and its variants, mathematical
techniques like Fourier inversion, Monte Carlo methods and PDE approaches have been
successfully connected to solve financial problems like option pricing and risk manage-
ment. We provide a brief explanation of the techniques. Other recent developments like
data-driven approaches, where a pricing model is derived solely based on the available
financial data, or parallel GPU computing are also briefly described, as they are consid-
ered in the second part of this thesis.
In Chapter 2, we propose an efficient technique to simulate exactly a representative
stochastic local volatility model, i.e. the Stochastic Alpha Beta Rho (SABR) model, con-
sidered well-established in FX and interest rate markets. We aim to employ only one
time-step in a Monte Carlo procedure. Particularly, we focus on the derivation of the
conditional distribution of the appearing time-integrated variance, an element which is
required in the simulation of the SABR asset dynamics. For that purpose, we compute
the marginal distribution by means of Fourier inversion and then use a copula to obtain a
multivariate distribution which resembles the conditional time-integrated variance dis-
tribution. Being a one time-step method, the method is suitable for the valuation of
European options and, due to the presented improvements in terms of performance,
can be particularly employed for calibration purposes.
The method introduced in Chapter 2 is generalized towards a multiple time-step ver-
sion for the simulation of the SABR model in Chapter 3. Although the multiple time-
step procedure is conceptually similar, this version entails a new challenge regarding the
computational cost. We deal with this challenge by introducing an advanced numeri-
cal interpolation which is based on a stochastic collocation technique. The proposed
method, which we call the mSABR method, allows for an efficient simulation with large
time-steps, and it can be employed to accurately price non-standard (exotic) options
under the SABR dynamics. This simulation scheme also performs highly satisfactorily
even in the context of negative interest rates, which are presently observed in the financial
world.
In Chapter 4, we develop a data-driven Fourier-based option valuation method that
appears beneficial when calculating the sensitivities of financial derivatives in a gen-
eral framework. The method is based on the assumption that only asset data samples
are available. The resulting method is a data-based generalization of the COS method,
which is a Fourier inversion method that employs the characteristic function of the rel-
evant stochastic asset variables. In our method, which we call the data-driven COS (dd-
COS) method, the use of the characteristic function is avoided. Instead, we recover the
probability density function from the available data samples, by using concepts from the
statistical learning theory. Since the ddCOS method is a data-driven procedure it is, thus,
generally applicable and can be employed, for example, in combination with the above
mentioned SABR simulation method, as described in Chapters 2 and 3.
In Chapter 5, we consider another component in our work, the efficient computation
on Graphics Processing Units (GPUs). We present a parallel version of an efficient
Monte Carlo-based method for pricing multi-dimensional early-exercise derivatives. The
parallelization and the use of the GPU result in an exceptional reduction of the computing
time, which allows us to explore new applications of the method. Very high-dimensional
problems become tractable, and generalizations regarding the early-exercise decision can
be developed thanks to our GPU implementation.
under stochastic local volatility models, which are nowadays widely used for options
in the foreign-exchange (FX) market. We also use some recent technological developments,
such as the fast execution of complex computations on the graphics card, to extend the
applicability of Monte Carlo simulation to the valuation of multi-dimensional options
with early-exercise features. This makes the numerical solution of very high-dimensional
problems possible.
In Chapter 1, a general overview is given of methods that commonly appear in
financial mathematics. By considering the solution of partial differential equations
(PDEs) provided by the Feynman-Kac theorem and its variants, mathematical techniques
such as Fourier inversion, Monte Carlo simulation and PDE approaches have been
successfully combined to solve financial problems, such as option pricing and risk
management. We give a brief explanation of these techniques. Other recent developments,
which we consider in the second part of this thesis, are also briefly described in this
chapter. Data-driven approaches, in which a pricing model is formed solely from the
available financial data, and the execution of parallel computations on a graphics card
are examples of such developments.
In Chapter 2, we propose an efficient technique to simulate a representative
stochastic local volatility model exactly, namely the Stochastic Alpha Beta Rho (SABR)
model, which is often used in the FX and interest rate markets. We aim to use only one
step in a Monte Carlo procedure. In particular, we focus on the derivation of the
conditional distribution of the time-integrated variance, an element that is needed in
the simulation of the underlying SABR process. To that end, we compute the marginal
distribution by means of Fourier techniques and use a copula to obtain a multivariate
distribution that resembles the conditional distribution of the time-integrated variance.
Since the method needs only one time-step, it is suitable for pricing European options.
In addition, we have proposed improvements that also make the method usable for
calibration purposes.
The method introduced in Chapter 2 is generalized in Chapter 3 to a version with
multiple time-steps for the simulation of the SABR model. Although the use of multiple
time-steps is conceptually similar, this version poses a new challenge with respect to
the computation time. For this, we introduce an advanced numerical interpolation that
is based on stochastic collocation. The proposed method, which we call the mSABR
method, provides an efficient simulation with large time-steps and can be used to
accurately price non-standard (exotic) options under the SABR model. Even under
negative interest rates, which are currently observed in the financial world, this
simulation scheme performs very well.
In Chapter 4, we develop a data-driven option pricing method based on Fourier
techniques. This method turns out to be beneficial in a general setting when computing
the sensitivities of financial derivatives. The method is based on the assumption that
only data of the underlying asset are available. The resulting method is a data-based
generalization of the COS method, a Fourier method that uses the characteristic function
of the underlying process. In our method, which we call the data-driven COS (ddCOS)
method, the use of the characteristic function is avoided. Instead, we approximate the
probability density function from the available data, making use of concepts from
statistical learning theory. Since the ddCOS method is a data-driven technique, it is
generally applicable and can, for example, be used in combination with the SABR
simulation method described in Chapters 2 and 3.
In Chapter 5, we consider another component of our work, efficient computations on
the graphics card, also called the Graphics Processing Unit (GPU). We present a parallel
version of an efficient Monte Carlo-based method for pricing multi-dimensional
derivatives with early-exercise features. The parallelization and the use of the GPU
result in an exceptional saving of computation time, which allows us to investigate new
applications of the method. Very high-dimensional problems now become manageable, and
generalizations regarding the early-exercise decision can be developed thanks to our
GPU implementation.
The work in this thesis is based on articles that were published or submitted during
the doctoral research.
Contents
Summary vii
Samenvatting xi
1 Introduction 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Stochastic processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2.1 The SABR model . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Pricing techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Monte Carlo methods . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.2 Fourier inversion methods . . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 PDE methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.4 Hybrid solution techniques . . . . . . . . . . . . . . . . . . . . . 9
1.3.5 Data in computational finance. . . . . . . . . . . . . . . . . . . . 10
1.4 GPU computing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Multi-ITN-STRIKE project . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.6 Outline of the thesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2 One time-step Monte Carlo simulation of the SABR model 15
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 The SABR model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 SABR Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . 17
2.3 CDF of SABR’s time-integrated variance . . . . . . . . . . . . . . . . . . 19
2.3.1 ChF of Y M . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.2 Treatment of integral I M . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.3 Error and performance analysis . . . . . . . . . . . . . . . . . . . 26
2.4 Simulation of Y (T )|σ(T ): copula approach . . . . . . . . . . . . . . . . . 28
2.4.1 Families of copulas. . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.2 Correlation: log Y (T ) and log σ(T ) . . . . . . . . . . . . . . . . . . 30
2.4.3 Sampling Y (T )|σ(T ) . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.4 Simulation of S(T) given S(0), σ(T) and ∫_0^T σ²(s)ds . . . . . . . . . 33
2.5 Numerical experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.5.1 Copula approach analysis . . . . . . . . . . . . . . . . . . . . . . 36
2.5.2 Pricing European options by the one-step SABR method . . . . . . 38
2.6 Conclusions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3 Multiple time-step Monte Carlo simulation of the SABR model 43
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 ‘Almost exact’ SABR simulation . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.1 SABR Monte Carlo simulation . . . . . . . . . . . . . . . . . . . . 46
3.2.2 Stochastic Collocation Monte Carlo sampler . . . . . . . . . . . . . 47
AV Antithetic Variates
bp Basis points
CDF Cumulative Distribution Function
CEV Constant Elasticity of Variance
ChF Characteristic function
CPU Central Processing Unit
CUDA Compute Unified Device Architecture
ES Expected Shortfall
FX Foreign-eXchange
GBM Geometric Brownian Motion
GOF Goodness-of-fit
GPGPU General-Purpose computing on Graphics Processing Units
GPU Graphics Processing Unit
MCFD Monte Carlo Finite Difference
MISE Mean Integrated Squared Error
MSE Mean Squared Error
PDE Partial Differential Equation
PDF Probability Density Function
PIDE Partial Integro-Differential Equation
RE Relative Error
SABR Stochastic Alpha Beta Rho
SCMC Stochastic Collocation Monte Carlo
SCvM Smirnov-Cramér-von Mises
SDE Stochastic Differential Equation
SGBM Stochastic Grid Bundling Method
SV Stochastic Volatility
SLV Stochastic Local Volatility
VaR Value-at-Risk
CHAPTER 1
Introduction
1.1. Introduction
Risk management and valuation problems as they appear in quantitative finance can be
very challenging from both a mathematical and a computational point of view.
The underlying financial assets are typically governed by random and uncertain
movements, which are mathematically modelled by means of stochastic processes and
corresponding Stochastic Differential Equations (SDEs). Except for the most basic and
classical models, closed-form expressions of the probability densities governing these
stochastic quantities are not available. For many relevant asset price processes, however, we
know the characteristic function (ChF), which is the Fourier transform of the probabil-
ity density function (PDF) of interest. By Fourier inversion we can thus approximate the
governing PDFs.
Pricing financial derivatives like options is one of the most common research areas
in computational finance. A derivative is a financial product whose valuation depends
on another underlying asset such as stocks, commodities or interest rates. In this thesis,
we focus on option valuation. Except for particular and simple cases, pricing an option
is not a trivial task, depending on the type of the option and/or the stochastic model for
the asset movements.
For any financial institution, risk management forms an essential part of their busi-
ness. Some of the risks appearing in finance include credit risk, market risk and coun-
terparty risk. Important issues in risk management are described in the Basel Accords,
whose requirements must be satisfied by the banks and other financial institutions. Once
the risk is identified, quantification is necessary. The estimation of accurate risk mea-
sures facilitates the interpretation of particular risk factors.
dX(t) = a(t, X(t))dt + b(t, X(t))dW(t),
with W a Brownian motion and a(·) and b(·) predictable and integrable functions.
The stochastic processes appearing in computational finance are often Itô’s pro-
cesses. The most basic and extensively used example is the geometric Brownian motion
(GBM), where the functions a(·) and b(·) are defined as a(t, X) = µX(t) and
b(t, X) = σX(t), with µ and σ constants. GBM is used to model the underlying asset prices in the
classical Black-Scholes model, leading to log-normal asset dynamics. However, the as-
sumed simplifications (like the constant parameters among others) have often shown to
be inconsistent with the financial market behaviour.
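Under GBM, the SDE has the exact solution S(T) = S(0)·exp((µ − σ²/2)T + σW(T)), so terminal values can be sampled without discretization. The following sketch (an illustration, not part of the thesis) draws such samples and checks the log-normal mean E[S(T)] = S(0)e^{µT}:

```python
import math
import random

def simulate_gbm_terminal(s0, mu, sigma, T, n_paths, seed=7):
    """Sample S(T) from the exact GBM solution
    S(T) = s0 * exp((mu - sigma^2/2)*T + sigma*W(T)), W(T) ~ N(0, T)."""
    rng = random.Random(seed)
    return [
        s0 * math.exp((mu - 0.5 * sigma**2) * T
                      + sigma * math.sqrt(T) * rng.gauss(0.0, 1.0))
        for _ in range(n_paths)
    ]

paths = simulate_gbm_terminal(s0=100.0, mu=0.05, sigma=0.2, T=1.0, n_paths=200_000)
empirical_mean = sum(paths) / len(paths)
print(empirical_mean, 100.0 * math.exp(0.05))  # both close to 105.13
```

Because the log-normal density of S(T) is known in closed form, GBM is one of the few models where Monte Carlo results can be verified against exact formulas.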
An important indicator of the market inconsistency of GBM and the Black-Scholes
equation is found in the implied volatility smile, which means that the volatility deduced
from option quotes in the market is not constant, but appears as a function of the op-
tion strike prices and of time. A first attempt to capture this effect was made with the
local volatility models, which reproduce the stochastic behaviour of the volatility
observed in the market already within the parameter calibration process. Parametric and non-parametric local
volatility processes exist, where the parametric version is governed by a functional form
with a few free parameters to be calibrated, and the non-parametric version is solely
based on option market data. One of the first parametric volatility processes was the con-
stant elasticity of variance (CEV) model [31]. The model aims to reproduce the stochastic
behaviour of the volatility observed in the market by including an elasticity parameter.
Another particularly interesting class of asset processes is formed by the exponential
Lévy processes. Lévy processes are governed by the properties of independent and stationary
increments and continuity in the probabilistic sense. A useful formula in the context of Lévy
processes is the celebrated Lévy-Khintchine formula that gives us a representation of the
process by its ChF. The ChFs of Lévy processes and of regular affine diffusion processes
are known, or can be computed, and by these, many relevant asset price processes can
be represented, like jump processes, jump-diffusion or stochastic volatility processes.
With these processes, the market observed volatility smile can be modelled.
The adoption of multi-dimensional stochastic processes forms another successful
attempt in the context of modelling implied volatility smiles, especially by considering
the volatility as a stochastic process. This leads to the stochastic volatility (SV)
models, or even stochastic local volatility (SLV) models. One of the representative examples
is the Stochastic Alpha Beta Rho (SABR) model in the Foreign-exchange (FX) and interest
rates markets. The SABR model is intensively studied throughout this thesis; therefore,
in the following section we briefly describe its main features.
It is important to notice that particularly these SLV models do not immediately result
in a ChF or PDF, and as such it is not trivial to perform a rapid calibration of the model
for a wide range of model and pricing problem parameters. By means of perturbation
theory, closed-form solutions for implied volatilities can be obtained in the form of an
analytic expression for a range of parameter values but, as for many solutions
originating from asymptotic theory, this expression is not generally valid. Often in practice, contracts are
defined whose parameters lie outside the range of validity of the theory. In that case, it is
really difficult to perform a fast and accurate calibration.
based on, besides stochasticity in the volatility, a parametric local volatility component
in terms of a model parameter β, appearing as a power in S^β. The SABR model is defined
by the following system of SDEs,

dS(t) = σ(t)S^β(t)dW_S(t),  S(0) = S_0,
dσ(t) = ασ(t)dW_σ(t),  σ(0) = σ_0,

with S(t) = S̄(t) exp(r(T − t)) the forward value of the underlying asset S̄(t), r the interest
rate, S_0 the spot price at t_0 = 0, and T the contract's expiry time. The second stochastic
process σ(t) represents the stochastic volatility, with σ(0) = σ_0. The Brownian motions
W_S(t) and W_σ(t) are correlated with constant correlation coefficient ρ, i.e.
dW_S(t)dW_σ(t) = ρdt. The model parameters are α > 0 (the volatility of the volatility), 0 ≤ β ≤ 1
(the elasticity) and ρ (the correlation coefficient).
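The SABR dynamics can be simulated naively with a small-step Euler scheme; this is precisely the kind of scheme the exact one time-step method of Chapter 2 is designed to replace. The sketch below is illustrative only (names and the absorbing-boundary treatment are our choices): the volatility, being driftless GBM, is stepped exactly, while the asset uses an Euler step with absorption at zero.

```python
import math
import random

def sabr_euler_paths(s0, sigma0, alpha, beta, rho, T, n_steps, n_paths, seed=11):
    """Euler discretization of the SABR dynamics
        dS = sigma * S^beta * dW_S,   dsigma = alpha * sigma * dW_sigma,
    with corr(dW_S, dW_sigma) = rho and absorption of S at zero."""
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    terminal = []
    for _ in range(n_paths):
        s, v = s0, sigma0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
            if s > 0.0:
                s = max(s + v * s**beta * sq * z1, 0.0)  # absorbing boundary at zero
            # sigma is driftless GBM, so its log-normal step is exact.
            v *= math.exp(-0.5 * alpha**2 * dt + alpha * sq * z2)
        terminal.append(s)
    return terminal

paths = sabr_euler_paths(0.05, 0.05, 0.4, 0.5, -0.7, T=1.0, n_steps=50, n_paths=20_000)
print(sum(paths) / len(paths))  # close to S_0 = 0.05: the forward is a martingale
```

Since the forward S(t) has no drift, the sample mean of S(T) should reproduce S_0 up to Monte Carlo and discretization error, a useful sanity check for any SABR simulation scheme.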
In the original SABR model paper, the authors provided a closed-form formula for
the implied volatility under the SABR dynamics based on asymptotic expansion. After a
correction made by Obloj [100], the expression of the so-called Hagan formula reads
σ_imp(K, S_0, T) = [α log(S_0/K)] / [x(z)(1 + ((1 − β)²/24) log²(S_0/K) + ((1 − β)⁴/1920) log⁴(S_0/K) + ···)]
· [1 + ((1 − β)²σ_0²/(24(KS_0)^{1−β}) + ρβασ_0/(4(KS_0)^{(1−β)/2}) + ((2 − 3ρ²)/24)α²)·T + ···],

with

z = α(S_0^{1−β} − K^{1−β})/(σ_0(1 − β)),   x(z) = log[(√(1 − 2ρz + z²) + z − ρ)/(1 − ρ)].

Some terms are very small and can then be neglected, resulting in the following
simplified implied volatility formula,

σ_imp(K, S_0, T) = (1/ω)(1 + B_1 log(K/S_0) + B_2 log²(K/S_0) + B_3 T),

with

B_1 = −(1/2)(1 − β − ραω),
B_2 = (1/12)((1 − β)² + 3((1 − β) − ραω) + (2 − 3ρ²)α²ω²),
B_3 = (1 − β)²/(24ω²) + ρβα/(4ω) + ((2 − 3ρ²)/24)α²,   ω = S_0^{1−β}/σ_0.
The Hagan formula is highly appreciated since it facilitates the model calibration,
i.e., the determination of the model parameters such that the implied volatility given by
the model matches the market implied volatility.
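A direct implementation of the Obloj-corrected Hagan formula might look as follows. This is an illustrative sketch: the function name and argument order are ours, the at-the-money limit log(S_0/K) → 0 is handled by a separate branch, and β < 1 is assumed (β = 1 requires taking a limit in z).

```python
import math

def hagan_implied_vol(K, S0, T, sigma0, alpha, beta, rho):
    """Obloj-corrected Hagan implied volatility for the SABR model (beta < 1 assumed)."""
    one_b = 1.0 - beta
    # Maturity correction term, common to both branches.
    corr = (one_b**2 * sigma0**2 / (24.0 * (K * S0) ** one_b)
            + rho * beta * alpha * sigma0 / (4.0 * (K * S0) ** (one_b / 2.0))
            + (2.0 - 3.0 * rho**2) * alpha**2 / 24.0)
    log_m = math.log(S0 / K)
    if abs(log_m) < 1e-12:
        # ATM limit: alpha*log(S0/K)/x(z) -> sigma0 / S0^(1 - beta).
        pre = sigma0 / S0**one_b
    else:
        z = alpha * (S0**one_b - K**one_b) / (sigma0 * one_b)
        xz = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        series = 1.0 + one_b**2 / 24.0 * log_m**2 + one_b**4 / 1920.0 * log_m**4
        pre = alpha * log_m / (xz * series)
    return pre * (1.0 + corr * T)

# Parameter set of Figure 1.1.
for K in (0.03, 0.05, 0.08):
    print(K, hagan_implied_vol(K, S0=0.05, T=7.0, sigma0=0.05, alpha=0.4, beta=0.5, rho=-0.7))
```

For the Figure 1.1 configuration this reproduces the downward-sloping smile of the figure; as discussed next, the formula nevertheless breaks down for very low strikes.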
(a) Hagan's implied volatility. (b) Implied density.
Figure 1.1: SABR parameter configuration: S_0 = 0.05, σ_0 = 0.05, α = 0.4, β = 0.5, ρ = −0.7 and T = 7.
However, the use of an asymptotic approximation makes the Hagan formula for the
implied volatility not generally applicable. SABR model parameter configurations exist
for which the formula gives inaccurate or even incorrect values. It is well-known that
the Hagan formula gives rise to problems when the asset quotes are close to zero, where a
so-called absorbing boundary condition must be imposed. This issue can be seen by
constructing the so-called implied density, i.e. the PDF associated with the implied
volatility given by the Hagan formula. In Figure 1.1, an example of such an incorrect
density is shown. In Figure 1.1(a), the implied volatilities given by the Hagan formula,
as well as numerically computed values obtained by means of a Monte Carlo method, are
depicted for different strike prices K. We can see that the Hagan formula is not
accurate for very low strikes. The problem becomes even clearer when representing the
implied density, as in Figure 1.1(b), where the PDF exhibits negative values close to
the boundary at zero. This cannot be correct, and it implies arbitrage opportunities.
The valuation of European and exotic options under the SABR dynamics is generally
not a trivial task. When reliable, the implied volatility given by the Hagan formula can
be used in the Black formula2 to obtain European option prices. However, a problem
remains in the situations where the Hagan formula is not applicable. Then, Monte Carlo
methods are sometimes employed even though an "exact Monte Carlo simulation" of
the SABR model, which allows for simulation with only a few large time-steps, is not
straightforward. The use of Fourier inversion techniques is limited since the ChF is not
available. A PDE formulation for the option price under SABR dynamics has also been
developed, enabling the use of PDE numerical techniques. The two-dimensional SABR
PDE reads
(1/2)σ²S̄^{2β}D^{2−2β} ∂²V/∂S̄² + ρασ²S̄^{β}D^{1−β} ∂²V/(∂S̄∂σ) + (1/2)α²σ² ∂²V/∂σ² + ∂V/∂t + rS̄ ∂V/∂S̄ − rV = 0,
SLV models in general, and the SABR model in particular, are very well-suited to model
so-called forward starting options. Forward starting options can be seen as European
options, but starting at a predefined time in the future (at t > t_0). In that case, the
initial asset value is not known, but still the contract premium and hedge strategy must
be set in advance (at today's date). The SLV models are accurate for modelling the
so-called forward volatility, which helps to determine accurate option prices and hedge strategies for
forward starting options. In contrast, plain local volatility models are typically not able to
accurately model the forward volatility, as volatility is flattening out in time under these
models, and, thus, the volatility smile "flattens".
∂V/∂t + µ(t, S) ∂V/∂S + (1/2)σ²(t, S) ∂²V/∂S² − rV = 0,   (1.1)
with final condition V(T, S) = h(T, S). The solution for V(t, S) at time t_0 < T is then given by
V(t_0, S) = e^{−r(T−t_0)} E^Q[h(T, S(T)) | F(t_0)],
where the expectation is taken under measure Q, with respect to a process S, defined by
dS(t) = µ(t, S)dt + σ(t, S)dW^Q(t).
The expectation in Theorem 1.3.1 can be written in integral form, resulting in the
risk-neutral valuation formula,
V(t_0, S) = e^{−r(T−t_0)} ∫_R h(T, y) f(y|F(t_0)) dy,   (1.2)
with r the risk-free rate, T the maturity time, and f (·) the PDF of the underlying process.
Many numerical techniques are based on the solution provided by the Feynman-Kac
theorem and, particularly, by the risk-neutral valuation formula.
∂V/∂t + µ(t, S) ∂V/∂S + (1/2)σ²(t, S) ∂²V/∂S² − g(t, S, V, σ ∂V/∂S) = 0,   (1.3)
V(T, S) = h(T, S),
where g can be a function of V and its first derivative. This PDE can also be expressed
in a probabilistic representation by means of the following forward and backward SDEs,
dS(t) = µ(t, S)dt + σ(t, S)dW(t),   S(0) = S_0,
−dY(t) = g(t, S, Y, Z)dt + Z dW(t),   Y(T) = h(T, S(T)),
with a final condition for the backward SDE. The solution consists of a pair of processes
(Y, Z), with
Y(t) = V(t, S),   Z(t) = σ(t, S) ∂V/∂S(t, S).
Many generalizations, towards multi-dimensional linear PDEs, partial integro-differential
equations (PIDEs), and also fully non-linear PDEs, have been given in the literature.
Ī_n := (1/n) Σ_{j=1}^{n} f(X_j).
While Ī_n → I as n → ∞, the error of the Monte Carlo estimate, I − Ī_n, is, by means of
the central limit theorem, approximately normally distributed with mean 0 and standard
deviation s_f/√n.
The order of convergence of the plain Monte Carlo method is O(1/√n), which is
considered slow for many applications.
However, Monte Carlo methods are highly appreciated in computational finance due
to their simplicity, flexibility and immediate implementation. One of the most impor-
tant advantages is that the methods are easily extended to multi-dimensional problems,
while keeping the order of convergence. Moreover, the method's simplicity makes it easy
to explore different convergence improvement techniques, like variance reduction techniques
[59] or multi-level Monte Carlo acceleration [57].
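As an illustration of variance reduction (not a scheme from the thesis), the sketch below prices a European call under driftless GBM with and without antithetic variates, which pair every normal draw Z with −Z; for a monotone payoff this reduces the variance at the same cost:

```python
import math
import random

def call_payoff_mc(n_pairs, antithetic, seed=3):
    """MC estimate of a European call price under GBM (S0 = K = 100, r = 0,
    sigma = 0.2, T = 1), optionally using antithetic variates."""
    rng = random.Random(seed)
    s0, k, sigma, T = 100.0, 100.0, 0.2, 1.0
    payoffs = []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        # Antithetic pair (z, -z) versus two independent draws.
        zs = (z, -z) if antithetic else (z, rng.gauss(0.0, 1.0))
        pair = [max(s0 * math.exp(-0.5 * sigma**2 * T + sigma * math.sqrt(T) * zi) - k, 0.0)
                for zi in zs]
        payoffs.append(sum(pair) / 2.0)  # average each pair
    n = len(payoffs)
    mean = sum(payoffs) / n
    var = sum((p - mean) ** 2 for p in payoffs) / (n - 1)
    return mean, math.sqrt(var / n)

plain = call_payoff_mc(50_000, antithetic=False)
av = call_payoff_mc(50_000, antithetic=True)
print(plain, av)  # comparable prices (about 7.97), smaller standard error with antithetic variates
```

Both estimators use the same number of payoff evaluations, so the smaller standard error of the antithetic version is a genuine efficiency gain.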
In the context of the Monte Carlo approximation of an expectation, like in the so-
lution of the PDE given by the Feynman-Kac theorem, an essential part of the method
is to generate sample paths. Random number generators (RNG) form the basis of these
Monte Carlo paths, and they have been studied for many years. Broadly, they can be
subdivided into “true”, pseudo- and quasi-random generators, and usually generate uni-
formly distributed samples. This is key, because when uniform samples between 0 and
1 are available, samples from any distribution can be obtained as long as the quantile
function, i.e., the inverse of the cumulative distribution function (CDF), is known. The
procedure is then as follows,
F_Z(Z) =_d U, and thus Z_i = F_Z^{−1}(U_i),
where F_Z is the CDF, =_d means equality in the distributional sense, U ∼ U([0, 1]) and U_i
is a sample from U ([0, 1]). The computational efficiency highly depends on the cost of
calculating F Z−1 . We will require this distribution inversion approach in the Chapters 2
and 3 to simulate the SABR model by means of exact Monte Carlo simulation.
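The inversion procedure above can be sketched for a distribution whose quantile function is available in closed form. For the exponential distribution Exp(λ), F(z) = 1 − e^{−λz} inverts to F^{−1}(u) = −log(1 − u)/λ (an illustrative example, not from the thesis):

```python
import math
import random

def sample_exponential(lam, n, seed=5):
    """Sample Exp(lam) by inverting its CDF F(z) = 1 - exp(-lam*z):
    F^{-1}(u) = -log(1 - u)/lam, applied to uniform samples u in [0, 1)."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

samples = sample_exponential(lam=2.0, n=100_000)
print(sum(samples) / len(samples))  # close to the exact mean 1/lam = 0.5
```

When, as for the SABR time-integrated variance, no closed-form quantile function exists, the CDF must first be approximated numerically, and the cost of its inversion drives the overall efficiency.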
When the “exact” sample generation is not possible, either because the distribution
is not available or the inversion is computationally unaffordable, intermediate steps in
the numerical scenario simulation can be introduced by means of a time discretization
of the associated SDE. The Taylor-based discretization schemes are widely employed in
quantitative finance. The most representative example is the Euler-Maruyama method,
the generalization of the Euler method to SDEs in the context of Itô calculus. Another
well-known Taylor-based simulation scheme is the Milstein method, which exhibits a
superior order of convergence compared to the Euler-Maruyama method.
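A minimal Euler-Maruyama sketch for GBM is given below (illustrative, with our own names): by reusing the same Brownian increments in the exact GBM solution, the strong discretization error E|S_Euler(T) − S(T)| can be measured directly and is seen to shrink as the time-step is refined.

```python
import math
import random

def euler_vs_exact_gbm(s0, mu, sigma, T, n_steps, n_paths, seed=13):
    """Euler-Maruyama for dS = mu*S dt + sigma*S dW, sharing the Brownian
    increments with the exact GBM solution to expose the discretization error."""
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    errors = []
    for _ in range(n_paths):
        s_euler, w = s0, 0.0
        for _ in range(n_steps):
            dw = sq * rng.gauss(0.0, 1.0)
            s_euler += mu * s_euler * dt + sigma * s_euler * dw  # Euler-Maruyama step
            w += dw
        s_exact = s0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w)
        errors.append(abs(s_euler - s_exact))
    return sum(errors) / n_paths  # estimate of the strong error E|S_euler(T) - S(T)|

coarse = euler_vs_exact_gbm(100.0, 0.05, 0.2, 1.0, n_steps=10, n_paths=5_000)
fine = euler_vs_exact_gbm(100.0, 0.05, 0.2, 1.0, n_steps=100, n_paths=5_000)
print(coarse, fine)  # the strong error shrinks as the time-step is refined
```

A Milstein step would add the correction (1/2)σ²S(ΔW² − Δt) to each update, raising the strong order from 1/2 to 1 for this SDE.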
The application of Monte Carlo methods to European-style options is rather straight-
forward and well-established. However, the use of Monte Carlo methods for other fi-
nancial contracts is usually non-trivial. Some representative examples are the efficient
Monte Carlo computation of the option sensitivities, the Greeks, [56] and the valuation
of early-exercise option contracts [94].
where ψ_k(·) are orthonormal functions and the expansion coefficients A_k can be
obtained by
A_k := ⟨f, ψ_k⟩ = ∫_a^b f(z)ψ_k(z) dz.
By using Parseval's identity, i.e., ⟨f, ψ_k⟩ = ⟨f̂, ψ̂_k⟩, where f̂ and ψ̂_k denote the
Fourier transforms of f and ψ_k, respectively, the expansion coefficients can be
computed based on the ChF.
Fourier inversion methods usually provide a high accuracy at limited computational
cost. However, their applicability strongly relies on the availability of the ChF. In some
cases when a ChF is not directly available, the ChF can be approximated, but this may
hamper the efficiency of the methodology. The curse of dimensionality in the case of
multi-dimensional problems will also be problematic when Fourier techniques are em-
ployed.
Several pricing techniques have been developed based on a Fourier approach, like
the Carr and Madan method [19], Boyarchenko and Levendorskii [13], Lewis [92], Fang
and Oosterlee [49] and Ortiz-Gracia and Oosterlee [102].
In this thesis, we mainly employ the COS method [49], which is based on a Fourier
cosine series expansion of the PDF and the efficient computation of the expansion co-
efficients. An important advantage of this method is that closed-form solutions can be
derived for many types of option contracts.
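The core of the COS method can be sketched for a distribution with a known ChF; we use the standard normal, $\hat{f}(u) = e^{-u^2/2}$, as a stand-in (a simplified illustration, not the full pricing machinery of [49]):

```python
import cmath
import math

def cos_density(chf, x, a, b, N):
    """Recover the PDF at x on [a, b] from the ChF via a Fourier cosine expansion."""
    total = 0.0
    for k in range(N):
        u = k * math.pi / (b - a)
        # Expansion coefficient: F_k = (2/(b-a)) * Re( chf(u) * exp(-i*u*a) )
        Fk = (2.0 / (b - a)) * (chf(u) * cmath.exp(-1j * u * a)).real
        if k == 0:
            Fk *= 0.5                      # first term of the summation is halved
        total += Fk * math.cos(u * (x - a))
    return total

phi = lambda u: cmath.exp(-0.5 * u * u)    # ChF of N(0, 1)
approx = cos_density(phi, 0.0, -10.0, 10.0, 64)
exact = 1.0 / math.sqrt(2.0 * math.pi)     # N(0, 1) density at 0
```

With only 64 terms the recovered density matches the exact value essentially to machine precision, reflecting the exponential convergence of the expansion for smooth densities.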
ods. However, as we are dealing with involved problems, the application of the basic version of the Monte Carlo method is not sufficient to provide a satisfactory balance between accuracy and efficiency. We therefore combine the Monte Carlo method components with other techniques, like copulas, stochastic collocation interpolation, Fourier inversion, dynamic programming recursion or regression techniques. An interesting aspect of Monte Carlo methods is the independence of the different computations, which makes the method highly parallelizable and facilitates the use of high-performance computing. In this thesis we have considered GPU computing as well.
in the latest generations) of small cores and a multi-level memory hierarchy give an extraordinary performance on embarrassingly parallel computations.
Different technologies and platforms have appeared in the framework of GPU computing. The Compute Unified Device Architecture (CUDA) [35] from NVIDIA is probably the best-known and most widespread technology. Its main components are the GPU drivers, the programming languages (C, C++ and Fortran) and a complete and mature ecosystem including scientific libraries (cuBLAS, cuSOLVER, cuRAND, etc.), editors (Nsight) and debugging and analysis tools (CUDA-GDB, CUDA-MEMCHECK, Visual Profiler, Tau). Furthermore, extensions that allow CUDA to be used in combination with other languages (Python, Java, .NET, etc.) and scientific platforms (Matlab, Mathematica) have been developed. OpenCL represents the non-proprietary alternative to CUDA, in which NVIDIA is also involved. OpenACC is a directive-based programming standard that allows non-expert programmers to achieve significant acceleration of their scientific codes with reduced programming effort.
GPU computing appears appropriate for financial calculations since these are typi-
cally compute-bound and not memory-bound. GPUs have an extraordinary computation
power, but a limited amount of memory with respect to other high-performance sys-
tems (multi-core, multi-node, etc.). A particularly interesting application is formed by
the Monte Carlo method. Due to characteristics such as the availability of a huge number of independent scenarios, the Monte Carlo estimator fits perfectly in the GPU architecture
philosophy. However, when hybrid solution methods are considered, like in this thesis,
the corresponding parallelization, and therefore their implementation on GPUs may not
be a trivial task. Some techniques are intrinsically sequential and must be adapted or re-
formulated to take advantage of the GPU architecture. In certain financial applications,
the amount of memory is the limiting factor and an efficient management of the mem-
ory resources becomes essential to obtain a gain in terms of performance. The use of the
highly optimized libraries available on GPUs is strongly recommended when possible.
During the period of the thesis work, we have had access to the Dutch national supercomputer Cartesius [20], whose GPU facilities are among the most powerful architectures in the world. This allowed us to push our implementations to the limit, exploiting the memory resources and the speed provided by the latest and most powerful generations of GPUs installed in the Cartesius system. For prototyping purposes, we also employed the Little Green Machine, a low-power and environmentally friendly system equipped with fast and compact GPUs.
pricing financial derivatives and related financial products. This has been accomplished by means of financial modelling, mathematical analysis and numerical simulation, optimal control techniques and validation of models. Twelve early-stage researchers (PhD students) and five experienced researchers (post-doctoral researchers) have successfully followed the high-level education provided within the project, building up strong collaborations with the network partners.
In the problems studied, the challenge lies in the non-linearity of the financial problem formulation. On the one hand, non-linear generalizations of the
Black-Scholes equation (Theorem 1.3.1) have been introduced by considering non-linear
payoffs and market frictions. On the other hand, more complicated asset models have
been studied in ITN-STRIKE, like the already mentioned SV or SLV models.
The modelling research in the ITN-STRIKE project included transaction costs into
the Black-Scholes model, Lévy models [16], Lie Group approaches [11], optimal control
and Fokker-Planck equations [54]. The numerical methods investigated in ITN-STRIKE
included finite difference schemes for PDEs [28, 46, 71, 73] and also hybrid Monte Carlo
methods [88, 89].
Besides the research performed in the project, many scientific events have taken
place, providing a complete educational training program which ranged from advanced
mathematical concepts (Newton methods, optimal control, Lie Group methods, etc.)
to transferable skills (project management, effective writing, cultural awareness, plagia-
rism) and to computational skills (high-performance computing, GPU programming).
Furthermore, the project has provided many networking opportunities in both academic
(ECMI 2014, SCF2015, ICCF 2015, ALGORITMY 2016) and industrial (Postbank, MathFi-
nance) environments.
that only asset data samples are available; the ChF is then not available and no longer required. While the order of convergence provided by the method is in line with the convergence of Monte Carlo methods, the method is beneficial for the accurate and efficient computation of some option sensitivities. Therefore, the method appears suitable for risk management. We use the ddCOS method for the computation of the Greeks, together with the SABR simulation scheme presented in Chapters 2 and 3, in the context of the Delta-Gamma approach.
In Chapter 5, we develop a parallel version of the Stochastic Grid Bundling Method
(SGBM), a method to price multi-dimensional early-exercise option contracts. The par-
allel implementation follows the GPGPU paradigm. A new grid point bundling scheme
is proposed, which is highly suitable for parallel systems, increasing the efficiency in the
use of memory and workload distribution. This parallel version of SGBM generalizes its
applicability, allowing us to solve very high-dimensional problems (up to fifty dimen-
sions). Thanks to the performance of the parallel SGBM, a generally applicable way of
computing the early-exercise policy is proposed.
In Chapter 6, the conclusions of the presented work are summarized. We also include
an outlook for possible future research lines.
CHAPTER 2
One time-step Monte Carlo simulation of the SABR model
In this chapter, we propose a one time-step Monte Carlo method for the SABR model. We
base our approach on an accurate approximation of the CDF for the time-integrated vari-
ance (conditional on the SABR volatility), using Fourier techniques and a copula. The result is a fast simulation algorithm which can be employed to price European options under the SABR dynamics. Our approach can thus be seen as an alternative to Hagan's analytic formula for short maturities, and may be employed for model calibration purposes.
2.1. I NTRODUCTION
The SABR model [67] is an established SDE system which is often used for interest rates
and FX modelling in practice. The model belongs to the SLV models. The idea behind
SLV models is that the modelling of volatility is partly done by a local volatility and partly
by a stochastic volatility contribution, aiming to preserve the advantages and minimize
the disadvantages of the individual models.
In the original paper [67], the authors have provided a closed-form approximation
formula for the implied volatility under SABR dynamics. This is important for practi-
tioners, as it can be used highly efficiently within the calibration exercise. However, the
closed-form expression is derived by perturbation theory and its applicability is thus not
general. The formula is for example not always accurate for small strike values or for
long time to maturity options. In [100], Oblój provided a correction to the Hagan for-
mula but the same drawbacks remained. In order to improve the approximation for the
SABR implied volatility, different techniques have been proposed in the literature, based
on either perturbation theory [69, 136], heat kernel expansions [72, 104] or model map
techniques [4].
The objective of a bank is typically to employ the SABR model for pricing exotic
derivatives. Before this pricing can take place, the model needs to be calibrated to Euro-
pean options, and, if possible, to some exotic options as well. It is market practice to use
the Hagan formula to calibrate the SABR model; however, since it is only an approximation, it will not always result in a sufficient fit to the market. In other words, a SABR Monte Carlo simulation may give different implied volatilities than the Hagan formula.
In this chapter, we aim to develop a one time-step Monte Carlo method which is accurate for European derivative contracts with maturities up to two years. This kind of contract is often
traded in FX markets. In particular, we focus on the efficient simulation of SABR’s time-
integrated variance, conditional on the volatility dynamics. An analytical expression of
This chapter is based on the article “On a one time-step Monte Carlo simulation approach of the SABR model:
Application to European options”. Published in Applied Mathematics and Computation, 293:461–479, 2017
[88].
where $S(t) = \bar{S}(t)\exp(r(T-t))$ denotes the forward value of the underlying asset $\bar{S}(t)$, with $r$ the interest rate, $S_0$ the spot price and $T$ the contract's final time. Quantity $\sigma(t)$ denotes the stochastic volatility, with $\sigma(0) = \sigma_0$; $W_S(t)$ and $W_\sigma(t)$ are two correlated Brownian motions with constant correlation coefficient $\rho$ (i.e. $\mathbb{E}[W_S(t)W_\sigma(t)] = \rho t$). The open parameters of the model are $\alpha > 0$ (the volatility of the volatility), $0 \le \beta \le 1$ (the elasticity) and $\rho$ (the correlation coefficient).
A useful aspect for practitioners is that each model parameter can be assigned to a specific feature of the market-observed implied volatility when it is represented against the strike prices, commonly known as a volatility smile or skew. The α parameter mainly affects the smile curvature. Exponent β is connected to the forward process distribution, being normal for β = 0 and log-normal for β = 1. In terms of the volatility smile, β also affects the curvature. Correlation parameter ρ rotates the implied volatility curve around the at-the-money point, producing "more pronounced smile" or "more pronounced skew" shapes.
where
$$a = \frac{1}{\nu(t)}\left(\frac{S(0)^{1-\beta}}{1-\beta} + \frac{\rho}{\alpha}\left(\sigma(t) - \sigma(0)\right)\right)^2, \qquad c = \frac{K^{2(1-\beta)}}{(1-\beta)^2\,\nu(t)},$$
$$b = 2 - \frac{1 - 2\beta - \rho^2(1-\beta)}{(1-\beta)(1-\rho^2)}, \qquad \nu(t) = (1-\rho^2)\int_0^t \sigma^2(s)\,ds,$$
and $\chi^2(x;\delta,\lambda)$ is the non-central chi-square CDF.
It should be noted that this formula is exact for the case ρ = 0 and results in an approximation otherwise. A shifted process with an approximated initial condition is employed in the derivation for ρ ≠ 0. Based on Equation (2.2), an "exact" Monte Carlo simulation scheme for the SABR model can be defined by inverting the conditional SABR CDF, once the conditional time-integrated variance and the volatility have been simulated. This forms the basis of the one time-step SABR approach, where exact simulation is required for accuracy reasons when using only one large time-step (meaning no time discretization). Furthermore, it is well known that the convergence rate of the Monte Carlo method is $1/\sqrt{n}$, with $n$ the number of paths or scenarios.
According to Equation (2.2), in order to apply an “exact” one time-step Monte Carlo
simulation for the SABR dynamics, we need to perform the following steps (with the
terminal time T ):
• Simulation of SABR’s volatility. From Equation (2.1), we observe that the volatility
process of the SABR model is governed by a log-normal distribution. As is well-
known, the solution follows a GBM, i.e. the exact simulation of the volatility at
terminal time, σ(T ), reads
1
σ(T ) ∼ σ(0) exp(αW̃σ (T ) − α2 T ), (2.3)
2
• Simulation of SABR’s forward asset process. We can simulate the forward dynam-
ics by inverting the CDF in Equation (2.2). For this, the conditional SABR’s time-
integrated variance (besides the volatility) is required. Concerning the forward
asset simulation, there is no analytic expression for the inverse SABR distribu-
tion so the inversion has to be calculated by means of some numerical approxi-
mation. Another possibility is to employ SDE discretization schemes, like Taylor-
based schemes (Euler-Maruyama, log-Euler, Milstein or full-truncation) or more
involved ones (low-bias or QE schemes). However, the use of such schemes gives
rise to several issues that have to be managed, like the discretization error, nega-
tivity in the asset values and computational cost.
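The first of the steps above can be sketched directly from Equation (2.3) (parameter values are illustrative; since the volatility process is a driftless GBM, the sample mean should remain close to σ(0)):

```python
import math
import random

def sample_sigma_T(sigma0, alpha, T, n_paths, rng):
    """One-step exact sampling of the SABR volatility, Eq. (2.3)."""
    sq_T = math.sqrt(T)
    # sigma(T) = sigma0 * exp(alpha * W(T) - 0.5 * alpha^2 * T), W(T) ~ N(0, T)
    return [sigma0 * math.exp(alpha * rng.gauss(0.0, sq_T) - 0.5 * alpha**2 * T)
            for _ in range(n_paths)]

rng = random.Random(0)
sig = sample_sigma_T(0.3, 0.4, 2.0, 100_000, rng)
mean_sig = sum(sig) / len(sig)  # E[sigma(T)] = sigma(0) = 0.3 (driftless GBM)
```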
In Algorithm 1, a schematic representation of the complete one time-step Monte
Carlo simulation procedure for the SABR model is presented. We include references to
the sections of this chapter where the respective parts of the SABR simulation are dis-
cussed.
In the above steps of the exact SABR simulation, the challenging part is the simulation of the time-integrated variance, $\int_0^T \sigma^2(s)\,ds$, conditional on $\sigma(T)$ (and $\sigma(0)$). Sometimes moment-matching techniques can be used. Here, however, we propose a computationally efficient approximation, based on a copula multi-variate distribution, to simulate the conditional distribution of $\int_0^T \sigma^2(s)\,ds$ given the volatility $\sigma(T)$. Simulation by means of a copula technique requires the CDFs of the involved marginal distributions. Although the distribution of $\sigma(T)$ is known, the distribution of $\int_0^T \sigma^2(s)\,ds$ needs to be approximated. We propose a method based on a Fourier technique, which was already employed for Asian options in [137]. In Section 2.3, this derivation is presented in detail.
2.3. CDF OF SABR'S TIME-INTEGRATED VARIANCE
Hereafter, for notational convenience, we will use $Y(T) := \int_0^T \sigma^2(s)\,ds$.
with $f_{\log\tilde{Y}(T)}$ the PDF of $\log\tilde{Y}(T)$. $f_{\log\tilde{Y}(T)}$ is, in turn, found by approximating the associated ChF, $\hat{f}_{\log\tilde{Y}(T)}$, and applying a Fourier inversion procedure. The ChF and the desired PDF of $\log\tilde{Y}(T)$ form a so-called Fourier pair. Based on the work in [8] and [137], we develop a recursive procedure to recover the ChF associated with $f_{\log\tilde{Y}(T)}$. We start by defining the sequence,
$$R_j = \log\left(\frac{\sigma^2(t_j)}{\sigma^2(t_{j-1})}\right) = \log\left(\sigma^2(t_j)\right) - \log\left(\sigma^2(t_{j-1})\right), \qquad (2.6)$$
$$\sigma^2(t_j) = \sigma^2(t_0)\exp(R_1 + R_2 + \cdots + R_j), \qquad (2.8)$$
because
$$\sigma^2(t_j) = \sigma^2(t_0)\exp\left(\log\left(\frac{\sigma^2(t_1)}{\sigma^2(t_0)}\right) + \log\left(\frac{\sigma^2(t_2)}{\sigma^2(t_1)}\right) + \cdots + \log\left(\frac{\sigma^2(t_j)}{\sigma^2(t_{j-1})}\right)\right) = \sigma^2(t_0)\exp\left(\log\left(\frac{\sigma^2(t_j)}{\sigma^2(t_0)}\right)\right).$$
1 These time points are not to be confused with the Monte Carlo time-steps. We will have only one Monte Carlo
time-step. M is the number of points for the discrete approximation of Y (T ).
$$Y_1 = R_M, \qquad Y_j = R_{M+1-j} + Z_{j-1}, \quad j = 2, \ldots, M, \qquad (2.9)$$
where Equation (2.7) is used in both expressions and the independence of $R_{M+1-j}$ and $Z_{j-1}$ is also employed.
After the computation of $\hat{f}_{Y_M}$, and thus of $\hat{f}_{\log\tilde{Y}(T)}$, the next step is to compute the PDF of $\log\tilde{Y}(T)$ by means of the COS method [49]. For that, we need to determine the support of $\log\tilde{Y}(T)$, which is denoted by the interval $[\tilde{a}, \tilde{b}]$. It can be calculated by employing the relation
$$\log(\tilde{Y}(T)) = \log\left(\Delta t\,\sigma_0^2\exp(Y_M)\right) = \log\left(\Delta t\,\sigma_0^2\right) + Y_M.$$
An alternative to the COS method is the so-called SWIFT method [102], based on Shannon wavelets. This technique is especially efficient when long time horizons are considered. For the present application, however, the COS method appears superior in terms of computational time. The use of the one-step SABR simulation will be restricted to exercise times up to two years (details in Subsection 2.4.2).
We recover $f_{\log\tilde{Y}(T)}$ by employing the COS expression, as follows
$$f_{\log\tilde{Y}(T)}(x) \approx \frac{2}{\tilde{b}-\tilde{a}} \sum_{k=0}^{N-1}{}' C_k \cos\left(k\pi\frac{x-\tilde{a}}{\tilde{b}-\tilde{a}}\right), \qquad (2.13)$$
with
$$C_k = \Re\left(\hat{f}_{\log\tilde{Y}(T)}\left(\frac{k\pi}{\tilde{b}-\tilde{a}}\right)\exp\left(-i\frac{\tilde{a}k\pi}{\tilde{b}-\tilde{a}}\right)\right),$$
and
$$\hat{f}_{\log\tilde{Y}(T)}\left(\frac{k\pi}{\tilde{b}-\tilde{a}}\right) = \exp\left(i\frac{k\pi}{\tilde{b}-\tilde{a}}\log\left(\Delta t\,\sigma_0^2\right)\right)\hat{f}_{Y_M}\left(\frac{k\pi}{\tilde{b}-\tilde{a}}\right) \approx \exp\left(i\frac{k\pi}{\tilde{b}-\tilde{a}}\log\left(\Delta t\,\sigma_0^2\right)\right)\hat{f}^*_{Y_M}\left(\frac{k\pi}{\tilde{b}-\tilde{a}}\right),$$
where $N$ is the number of COS terms and the prime and $\Re$ symbols in (2.13) indicate division of the first term in the summation by two and taking the real part of the complex-valued expression in the brackets, respectively.
The CDF $F_{\log\tilde{Y}(T)}$, as in Equation (2.5), is obtained by integrating the corresponding approximated PDF from Equation (2.13), as follows
$$F_{\log\tilde{Y}(T)}(x) = \int_{-\infty}^{x} f_{\log\tilde{Y}(T)}(y)\,dy \approx \int_{\tilde{a}}^{x} \frac{2}{\tilde{b}-\tilde{a}} \sum_{k=0}^{N-1}{}' C_k \cos\left(k\pi\frac{y-\tilde{a}}{\tilde{b}-\tilde{a}}\right)dy. \qquad (2.14)$$
2.3.1. ChF OF $Y_M$
In the previous section, we defined a procedure to compute an approximation of the CDF of $Y(T)$. This computation relies on the efficient calculation of the ChF of $Y_M$. In this section, we explain the recursive procedure for approximating $\hat{f}_{Y_M}$ based on the initial ChF, $\hat{f}_{Y_1}$, the ChF of the recursive process, $\hat{f}_R$, and the equality $\hat{f}_{Y_j}(u) = \hat{f}_R(u)\hat{f}_{Z_{j-1}}(u)$ (see Equations (2.6) and (2.12)). Note that this iterative computation needs to be done only once. By definition, the ChF of $Z_{j-1}$ reads
$$\hat{f}_{Z_{j-1}}(u) := \int_{-\infty}^{\infty} (\exp(x)+1)^{iu} f_{Y_{j-1}}(x)\,dx.$$
PDF $f_{Y_{j-1}}$ is not known. To approximate it, the Fourier cosine series expansion of $f_{Y_{j-1}}$ is applied. We assume a bounded integration range $[a,b]$. The calculation of the integration boundaries $a$ and $b$ follows the cumulant-based approach described in [137] and [49]. After truncation of the integral, we have
$$\hat{f}^*_{Z_{j-1}}(u) = \int_a^b (\exp(x)+1)^{iu} f_{Y_{j-1}}(x)\,dx. \qquad (2.15)$$
$$\hat{f}^*_{Z_{j-1}}(u) \approx \frac{2}{b-a}\sum_{l=0}^{N-1}{}' B_l \int_a^b (\exp(x)+1)^{iu}\cos\left(l\pi\frac{x-a}{b-a}\right)dx, \qquad (2.16)$$
with
$$B_l = \Re\left(\hat{f}^*_{Y_{j-1}}\left(\frac{l\pi}{b-a}\right)\exp\left(-i\,a\frac{l\pi}{b-a}\right)\right),$$
where $N$ is the number of expansion terms, $\hat{f}^*_{Y_1}(u) = \hat{f}_R(u)$ and $\hat{f}^*_{Y_j}(u) = \hat{f}_R(u)\hat{f}^*_{Z_{j-1}}(u)$.
$$\Phi_{j-1} = \mathbf{M} A_{j-1}, \qquad (2.17)$$
where
$$\Phi_{j-1} = \left[\Phi_{j-1}(k)\right]_{k=0}^{N-1}, \qquad \Phi_{j-1}(k) = \hat{f}^*_{Z_{j-1}}(u_k),$$
$$\mathbf{M} = \left[M(k,l)\right]_{k,l=0}^{N-1}, \qquad M(k,l) = \int_a^b (\exp(x)+1)^{iu_k}\cos\left((x-a)u_l\right)dx, \qquad (2.18)$$
$$A_j = \left[A_j(l)\right]_{l=0}^{N-1}, \qquad A_j(l) = \frac{2}{b-a}\,\Re\left(\hat{f}^*_{Y_j}(u_l)\exp(-i\,a u_l)\right),$$
with
$$u_z = \frac{z\pi}{b-a}. \qquad (2.19)$$
By the recursion procedure in Equation (2.17), we obtain the approximation $\hat{f}^*_{Y_M}$. Before using this approximation, we must compute matrix $\mathbf{M}$ in Equation (2.18) highly efficiently. This matrix has to be computed only once. Its elements are given by
$$I_M := \int_a^b (\exp(x)+1)^{iu_k}\cos\left((x-a)u_l\right)dx. \qquad (2.20)$$
The efficient computation of the integral in Equation (2.20) is a key aspect for the performance of the whole procedure. The number of elements in matrix $\mathbf{M}$ can be large, as it corresponds to the squared number of terms in the cosine series expansion, i.e. $N^2$. The efficiency also depends on the required accuracy. We discuss the numerical treatment in Subsection 2.3.2.
$$I_M = \frac{1}{2}\left(I^+ + I^-\right), \qquad I^\pm = \int_a^b \exp\left(i\left(u_k\,g(x) \pm (x-a)u_l\right)\right)dx.$$
$$I^\pm = \mp\frac{i\,e^{\pm i u_l (x-a)}}{u_l}\,{}_2F_1\left(-iu_k, \pm iu_l;\ 1 \pm iu_l;\ -\exp(x)\right).$$
Unfortunately, multiple evaluations of the hypergeometric function make the com-
putation rather time-consuming.
where the matrix element $D(k,m)$ and the vectors $v$ and $w$ can be computed by the expressions
$$D(k,m) = \frac{2}{n_q}\cos\left(\frac{(m-1)(k-1)\pi}{n_q/2}\right)\cdot\begin{cases} 1/2 & \text{if } m \in \{1,\ n_q/2+1\}, \\ 1 & \text{otherwise}, \end{cases}$$
$$v := \left(1,\ \frac{2}{1-4},\ \frac{2}{1-16},\ \cdots,\ \frac{2}{1-(n_q-2)^2},\ \frac{2}{1-n_q^2}\right)^T,$$
$$w_m := h\left(\cos\left(\frac{(m-1)\pi}{n_q}\right)\right) + h\left(-\cos\left(\frac{(m-1)\pi}{n_q}\right)\right),$$
$$h(x) = \cos\left(\left(\frac{b-a}{2}x + \frac{a+b}{2} - a\right)u_l\right)\left(\exp\left(\frac{b-a}{2}x + \frac{a+b}{2}\right) + 1\right)^{iu_k},$$
with $k, m = 1, \ldots, (n_q/2+1)$ and with $n_q$ the number of terms in the quadrature. This was the approach taken in [137], with computational complexity $O(n_q N^2)$.
$$g(x) \approx c_{1,j} + c_{2,j}\,x, \qquad x \in [a_j, b_j].$$
$$\tilde{I}_M = \sum_{j=1}^{L} \exp(iu_k c_{1,j}) \underbrace{\int_{a_j}^{b_j} \exp(iu_k c_{2,j}\,x)\cos\left((x-a)u_l\right)dx}_{I_L}. \qquad (2.23)$$
The optimal integration points $a_j$ and $b_j$ in (2.23) are found by differentiation of $g(x)$ w.r.t. $x$, i.e.
$$G(x) := \frac{\partial g(x)}{\partial x} = \frac{\exp(x)}{\exp(x)+1} = \frac{1}{1+\exp(-x)},$$
giving a relation between the derivative of g (x) and the logistic distribution. This repre-
sentation indicates that G(x) is the CDF of the logistic distribution with a null location
parameter and a scale parameter s = 1. In order to determine the optimal integration
points so that their positions correspond to the logistic distribution, we need to com-
pute the quantiles. The quantiles of the logistic distribution are analytically available
and given by
$$q(p) = \log\left(\frac{p}{1-p}\right).$$
The algorithm for the optimal integration points, given integration interval [a, b] and
the number of intervals L, is as follows:
• Divide the interval $P_g$ equally into $L$ parts: $P_g^j = [p_j, p_{j+1}]$, $j = 0, \ldots, L-1$.
The algorithm above ensures that integration points are optimally distributed. In
Figures 2.1(a) and 2.1(b), we present the optimal integration points for L = 10 and L = 25,
respectively. The points are well-positioned in the region of interest, as expected.
However, for our specific problem, the integration interval can be rather large. As the
integration points are densely concentrated in the function’s curved part, the errors in
the outer sub-intervals dominate the overall error and more points at the beginning and
at the end of the interval are required to reduce this error. A more balanced distribution
of the integration points can be achieved by introducing a scale parameter, s 6= 1, which
[Figure 2.1: Optimal integration points over the function g(x): (a) L = 10, s = 1; (b) L = 25, s = 1; (c) and (d) the corresponding point distributions for s = 3.]
Table 2.1: MSE of the piecewise linear approximation on the interval [−10, 30] with N = 150, for different numbers of sub-intervals L and integration point placements.

                  L = 25          L = 50          L = 75          L = 100
Equidistant       2.4075 × 10⁻³   2.6743 × 10⁻⁴   3.1064 × 10⁻⁴   2.9547 × 10⁻⁶
Optimal (s = 1)   1.3273 × 10⁻²   2.5343 × 10⁻³   9.5959 × 10⁻⁴   4.7953 × 10⁻⁴
Optimal (s = 3)   3.4837 × 10⁻⁵   1.1445 × 10⁻⁶   2.1506 × 10⁻⁷   6.7114 × 10⁻⁸
implies that the integration points correspond to a CDF with increased variance. In Fig-
ures 2.1(c) and 2.1(d), the distributions of the points for s = 3 are shown. In the case of
$s \neq 1$, the expression for the quantiles reads
$$q(p) = s\log\left(\frac{p}{1-p}\right).$$
The impact of the optimal point redistribution can be seen in Table 2.1. The mean
squared error (MSE), in absolute value, of the piecewise linear approximation is pre-
sented for a fixed interval [−10, 30] and a fixed number of elements in the cosine expan-
sion, N = 150, considering different numbers of sub-intervals and different integration
point locations (equidistant, optimal with s = 1 and optimal with s = 3). We observe that,
for the problem at hand, the optimal distribution with s > 1 gives better accuracy.
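The point-placement procedure can be sketched as follows (function names are ours; probabilities are spaced equally between G(a) and G(b) and mapped back through the quantile function with scale s):

```python
import math

def logistic_quantile(p, s=1.0):
    """Quantile of the logistic distribution with location 0 and scale s."""
    return s * math.log(p / (1.0 - p))

def optimal_points(a, b, L, s=1.0):
    """Endpoints of L sub-intervals of [a, b], distributed as logistic quantiles."""
    G = lambda x: 1.0 / (1.0 + math.exp(-x / s))   # logistic CDF with scale s
    pa, pb = G(a), G(b)
    probs = [pa + j * (pb - pa) / L for j in range(L + 1)]
    return [logistic_quantile(p, s) for p in probs]

pts = optimal_points(-10.0, 30.0, 25, s=3.0)   # the setting of Table 2.1
```

By construction the first and last points coincide with a and b, and the interior points are densest where g(x) is most curved.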
Error $\epsilon_D$. This error occurs because the SABR time-integrated variance is approximated by its discrete equivalent, i.e. $Y(T) \approx \tilde{Y}(T)$. Note that the process $Y(t) = \int_0^t \sigma^2(s)\,ds$
is the solution of the SDE
dY (s) = σ2 (s)ds, Y (0) = 0,
which, by employing the same time discretization as for Ỹ (T ) in Equation (2.4), can be
written as
Ȳ (t j ) − Ȳ (t j −1 ) = σ2 (t j −1 )∆t .
$$\sum_{j=1}^{M} \sigma^2(t_{j-1})\Delta t = \sum_{j=1}^{M} \left(\bar{Y}(t_j) - \bar{Y}(t_{j-1})\right) = \bar{Y}(t_M) - \bar{Y}(0) = \tilde{Y}(T).$$
[Figure 2.2(a): Strong error ε_D of log Ỹ(T) versus the number of discrete time points, M.]
The error (or convergence) of the Euler-Maruyama discretization scheme has been intensively studied in the literature (see [83], for example). Two errors can be considered when a stochastic process is discretized: the strong and the weak error. It was proved by Kloeden and Platen in [83] that, under some regularity conditions, the Euler-Maruyama scheme has strong convergence of order γ = 1/2 and weak convergence of order γ = 1. In our particular case, we focus on the strong error between $Y(T)$ and $\tilde{Y}(T)$. The error $\epsilon_D$ can then be computed and bounded by employing the usual strong convergence expression as
$$\epsilon_D := \mathbb{E}\left[\left|\tilde{Y}(T) - Y(T)\right|\right] \le C\left(\frac{T}{M}\right)^{\gamma},$$
for some constant $C > 0$.
In our actual experiments, we consider log Ỹ (T ) and log Y (T ). This will reduce the
error further. In order to numerically show the order of convergence of our approxima-
tion, in Figure 2.2(a) the strong error2 of log Ỹ (T ) with respect to the number of discrete
time points, M, is depicted. Additionally, "upper" (C = 1 and γ = 1/2) and "lower" (C = 1 and γ = 1) bounds are presented as references. In order to further test the convergence,
the resulting CDFs, F log Ỹ (T ) , are presented in Figure 2.2(b), again varying the number of
discrete time points, M . Both experiments perform as expected.
Error $\epsilon_T$. The use of cosine expansions and the COS method implies a truncation of the integration range to the interval $[a,b]$. This gives rise to a truncation error, which is defined by
$$\epsilon_T(H) := \int_{\mathbb{R}\setminus[a,b]} f_H(y)\,dy,$$
where $H$ corresponds to $\log\tilde{Y}(T)$ and $Y_j$, according to Equations (2.14) and (2.15), respectively. By choosing a sufficiently large integration range $[a,b]$, this error can be reduced so that it does not dominate the overall error.
2 As log Y (T ) is unknown, the reference value is computed by using a very fine discretization, i.e. M = 5000.
28 2. O NE TIME - STEP M ONTE C ARLO SIMULATION OF THE SABR MODEL
Error $\epsilon_C$. The convergence due to the number of terms $N$ in the cosine series expansion was derived in [49]. For any $f(y|x) \in C^\infty[a,b]$, $\epsilon_C$ can be bounded by
$$|\epsilon_C| \le P^*(N)\exp(-(N-1)\nu),$$
with $\nu > 0$ a constant and $P^*(N)$ a term which varies less than exponentially with respect to $N$. As the PDF of the time-integrated variance is a very smooth function, $\epsilon_C$ will decay exponentially with respect to $N$.
PERFORMANCE ANALYSIS
We perform an efficiency analysis considering the terms that control the evaluated errors. Error $\epsilon_T$ is not taken into account here, as we always employ a sufficiently large integration interval.
Error $\epsilon_D$ can be reduced by increasing M in Equation (2.4). We find that the value
of M hardly affects the computational cost, so it can be increased without a significant
impact on the performance.
Error $\epsilon_C$ is governed by the number of series expansion terms, $N$. As $\epsilon_C$ exhibits exponential decay w.r.t. $N$, a relatively small value ($N \approx 100$) already results in highly accurate approximations. Larger $N$ values have a great impact on the performance because $I_M$ is calculated $N \times N$ times. As this calculation dominates the computational cost, the execution time increases as $O(N^2)$.
Error $\epsilon_M$ can be controlled by the number of quadrature terms, $n_q$, in the case of the Clenshaw-Curtis method, or by the number of sub-intervals, $L$, when using the piecewise linear approximation. Again, the impact can be significant because $I_M$ is computed many times. To achieve a similar order of accuracy ($\sim 10^{-7}$) on a generic interval $[a,b]$, the piecewise linear approximation is approximately three times faster than the Clenshaw-Curtis quadrature. Therefore, we will use the piecewise linear approximation in the following.
copula is a joint distribution function on the unit hyper-cube $[0,1]^d$ with $d$ marginal distributions. A distribution with a copula contains essentially the same information as the
joint distribution, like the marginal properties and the dependence structure, but this
information can be decoupled. The importance of copulas is represented by two parts
of the Sklar theorem, introduced by Abe Sklar in [122]. The first part states that a copula
can be derived from any joint distribution function, and the second part states that any
copula can be combined with any set of marginal distributions to result in a multivariate
distribution function. More formally, Sklar’s theorem reads
$$F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d)),$$
for all x j ∈ [−∞, ∞]. Moreover, if the margins are continuous, then C is unique;
otherwise C is uniquely determined on Ran(F 1 ) × · · · × Ran(F d ), where Ran(F j ) =
F j ([−∞, ∞]) is the range of F j .
• The converse is also true. That is, if C is a copula and F 1 , . . . , F d are univariate dis-
tribution functions, then the multivariate function defined by the preceding equa-
tion is a joint distribution function with marginal distributions F j , j = 1, . . . , d .
where $t_\nu$ and $t_{\mathbf{R},\nu}$ are the univariate and the $d$-variate Student t distributions, respectively.
30 2. O NE TIME - STEP M ONTE C ARLO SIMULATION OF THE SABR MODEL
$$C_\theta(F_1, \ldots, F_d) = \psi_\theta^{-1}\left(\psi_\theta(F_1) + \cdots + \psi_\theta(F_d)\right),$$
à à !1/θ !
d ¡ ¢θ
C θ (F 1 . . . , F d ) = exp −
X
− log(F i ) . (2.26)
i =1
In order to calibrate parameter θ in (2.24), (2.25) or (2.26), a relation between θ and the rank correlation coefficient, Kendall's τ, can be employed. This relation is available in closed form for the Clayton copula, i.e. θ = 2τ/(1 − τ), and the Gumbel copula, i.e. θ = 1/(1 − τ). There is no analytic expression of θ in terms of τ for the Frank copula.
To apply any of these copulas, an approximation for the correlation between the involved distributions needs to be determined. Here, Pearson's correlation coefficient, $P$, is chosen since it is employed directly in the Gaussian and Student t copulas and a relation with Kendall's τ exists:
$$P = \sin\left(\frac{\pi}{2}\tau\right).$$
In Subsection 2.4.2, an approximation for Pearson’s coefficient between log Y (T ) and
log σ(T ) of the SABR dynamics is derived.
where the logarithm and the integral are interchanged. Since the log function is concave, this approximation forms a lower bound for the true value. This can be seen by applying Jensen's inequality, i.e.
$$\log \int_0^T \sigma^2(s)\,ds \ge \int_0^T \log \sigma^2(s)\,ds.$$
We will numerically show that, under our model settings with an option contract time to
maturity of less than two years, this is a highly satisfactory approximation. The correlation coefficient can now be computed as
$$P_{\log Y(T), \log\sigma(T)} \approx \frac{\operatorname{cov}\left[\int_0^T \log\sigma(s)\,ds,\ \log\sigma(T)\right]}{\sqrt{\operatorname{Var}\left[\int_0^T \log\sigma(s)\,ds\right]\operatorname{Var}\left[\log\sigma(T)\right]}}.$$
The quantity $\operatorname{Var}[\log\sigma(T)]$ is known from Equation (2.3), and we derive the expressions
$$\log\sigma(T) = \log\sigma_0 - \frac{1}{2}\alpha^2 T + \alpha W(T) =: \eta(T) + \alpha W(T),$$
and
$$\int_0^T \log\sigma(s)\,ds = T\log\sigma_0 - \frac{1}{4}\alpha^2 T^2 + \alpha\int_0^T W(s)\,ds =: \gamma(T) + \alpha\int_0^T (T-s)\,dW(s).$$
In order to determine the variance of $\int_0^T \log\sigma(s)\,ds$, we compute
$$\mathbb{E}\left[\left(\int_0^T \log\sigma(s)\,ds\right)^2\right] = \mathbb{E}\left[\left(\gamma(T) + \alpha\int_0^T (T-s)\,dW(s)\right)^2\right] = \gamma^2(T) + \alpha^2\int_0^T (T-s)^2\,ds = \gamma^2(T) + \frac{1}{3}\alpha^2 T^3,$$
and the variance reads
$$\operatorname{Var}\left[\int_0^T \log\sigma(s)\,ds\right] = \mathbb{E}\left[\left(\int_0^T \log\sigma(s)\,ds\right)^2\right] - \left(\mathbb{E}\left[\int_0^T \log\sigma(s)\,ds\right]\right)^2 = \frac{1}{3}\alpha^2 T^3.$$
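The variance just derived can be verified with a small Monte Carlo experiment (a sketch with illustrative parameters; the right-endpoint Riemann sum used for the time integral introduces a small discretization bias):

```python
import math
import random

def var_int_log_sigma(sigma0, alpha, T, n_steps, n_paths, rng):
    """MC estimate of Var[ int_0^T log(sigma(s)) ds ], with
    log(sigma(s)) = log(sigma0) - 0.5*alpha^2*s + alpha*W(s)."""
    dt = T / n_steps
    sq_dt = math.sqrt(dt)
    vals = []
    for _ in range(n_paths):
        w = 0.0
        integral = 0.0
        for i in range(n_steps):
            w += rng.gauss(0.0, sq_dt)            # W(t_{i+1})
            t = (i + 1) * dt
            integral += (math.log(sigma0) - 0.5 * alpha**2 * t + alpha * w) * dt
        vals.append(integral)
    m = sum(vals) / n_paths
    return sum((v - m) ** 2 for v in vals) / (n_paths - 1)

rng = random.Random(1)
est = var_int_log_sigma(0.3, 0.4, 2.0, 100, 10_000, rng)
theory = 0.4**2 * 2.0**3 / 3.0   # alpha^2 * T^3 / 3
```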
The numerator follows from $\mathbb{E}\left[W(T)\int_0^T (T-s)\,dW(s)\right] = \int_0^T (T-s)\,ds = \frac{1}{2}T^2$, so that
$$P_{\log Y(T),\log\sigma(T)} \approx \frac{\gamma(T)\eta(T) + \frac{1}{2}\alpha^2 T^2 - \gamma(T)\eta(T)}{\sqrt{\left(\frac{1}{3}\alpha^2 T^3\right)\left(\alpha^2 T\right)}} = \frac{\frac{1}{2}\alpha^2 T^2}{\frac{1}{\sqrt{3}}\alpha^2 T^2} = \frac{\sqrt{3}}{2}.$$
The approximation obtained is a constant value. We will show numerically that, for
the problems at hand, this appears to be a very reasonable approximation. The cor-
relation between log Y (T ) and log σ(T ) is affected by the maturity time T , and by the
volatility-of-volatility parameter α. In Figure 2.3, the empirical and approximated (red
grid) correlations are depicted for typical values of T and α. Because we focus on short
maturity options, we restrict T ∈ [0, 2] and α ∈ (0, 1]. The experiment shows that, only in
rather extreme cases (α > 0.7), the differences between the empirical and approximated
correlation increase, but remain within five basis points. The approximation is therefore
sufficiently accurate for our purposes, since a small error in the correlation (of less than
ten basis points) does not affect the performed simulation significantly.
3. Generate correlated uniform samples, U_{log σ(T)} and U_{log Ỹ(T)}, by means of the bivariate copula distribution.

4. From U_{log σ(T)} and U_{log Ỹ(T)}, invert the original marginals, F_{log σ(T)} and F_{log Ỹ(T)}. The resulting quantiles should be correlated under the chosen correlation coefficient.
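For illustration, step 3 can be sketched for the Gaussian copula, where correlated uniforms are obtained by correlating two standard normals and mapping them through the standard normal CDF. This is a minimal sketch, not the implementation used in the experiments; the function name is illustrative, and the constant correlation √3/2 (the approximation of Subsection 2.4.2) is only an example value.

```python
import math
import random

def gaussian_copula_pairs(rho, n, seed=42):
    """Correlated uniform pairs via the Gaussian copula (illustrative)."""
    rng = random.Random(seed)
    norm_cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # step 3: (Phi(z1), Phi(z2)) is a uniform pair with Gaussian-copula dependence
        pairs.append((norm_cdf(z1), norm_cdf(z2)))
    return pairs

samples = gaussian_copula_pairs(rho=math.sqrt(3.0) / 2.0, n=10000)
```

The resulting uniforms are then passed to the marginal inversions of step 4.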
2.4. S IMULATION OF Y (T )|σ(T ): COPULA APPROACH 33
Figure 2.3: Pearson’s correlation coefficient: Empirical (surface) vs. approximation (red grid).
The inversion procedure in the fourth step is straightforward in the case of σ(T). The case of Ỹ(T) is more involved, and we propose a direct inversion procedure based on linear interpolation. The insight is that, by rotating the CDF F_{log Ỹ(T)}, we can interpolate over probabilities instead of x-values. This is possible since F_{log Ỹ(T)} is a smoothly increasing function. The interpolation polynomial provides the quantiles of the original distribution, F_{log Ỹ(T)}, from the probabilities U_{log Ỹ(T)}. This inversion procedure is very fast (at almost no cost) and sufficiently accurate.
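A minimal sketch of this rotated-CDF inversion follows, with the standard normal CDF standing in for F_{log Ỹ(T)} (which in the method is tabulated via the COS expansion); the grid sizes are illustrative.

```python
import numpy as np
from math import erf, sqrt

# Tabulate the CDF on an x-grid (standard normal as a stand-in).
x_grid = np.linspace(-6.0, 6.0, 2001)
cdf_grid = 0.5 * (1.0 + np.vectorize(erf)(x_grid / sqrt(2.0)))

def quantiles(u):
    """'Rotate' the table: interpolate x as a function of the probability."""
    # np.interp needs increasing abscissae; a CDF is increasing, so the
    # (probability, x) table is a valid rotated version of (x, probability).
    return np.interp(u, cdf_grid, x_grid)

q = quantiles(np.array([0.025, 0.5, 0.975]))
```

One vectorized interpolation call then yields the quantiles for all copula samples at once.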
2.4.4. SIMULATION OF S(T) GIVEN S(0), σ(T) AND ∫₀ᵀ σ²(s) ds
We complete the one-step SABR simulation by the conditional sampling of S(T ). The
most commonly used techniques can be classified into two categories: direct inversion
of SABR’s CDF given in Equation (2.2) and moment-matching approaches. The direct
inversion procedure has a higher computational cost because of the evaluation of the
non-central χ2 distribution. However, some recent developments make this computa-
tion affordable. In [21], Chen et al. proposed a forward asset simulation based on a
combination of moment-matching (quadratic Gaussian) and enhanced direct inversion
procedures. We employ this technique also here. Note, however, that, for some specific values of β, the simulation of the conditional S(T) given S(0), σ(T) and ∫₀ᵀ σ²(s) ds gives rise to analytic expressions.
34 2. O NE TIME - STEP M ONTE C ARLO SIMULATION OF THE SABR MODEL
C ASE β = 0
For β = 0, it is easily seen from Equation (2.1) that the asset process (in integral form) becomes

S(T) = S(0) + ρ ∫₀ᵀ σ(s) dW̃_σ(s) + √(1 − ρ²) ∫₀ᵀ σ(s) dW̃_S(s),

where dW̃_S(t) and dW̃_σ(t) are independent Brownian motions, with

∫₀ᵀ σ(s) dW̃_σ(s) = (1/α)(σ(T) − σ(0)),   (2.27)

and

∫₀ᵀ σ(s) dW̃_S(s) | σ(T) ∼ N( 0, ∫₀ᵀ σ²(s) ds ).   (2.28)
C ASE β = 1
In the case of β = 1, the asset dynamics in Equation (2.1) are log-normal and the solution is given by

S(T) = S(0) exp( −½ ∫₀ᵀ σ²(s) ds + ρ ∫₀ᵀ σ(s) dW̃_σ(s) + √(1 − ρ²) ∫₀ᵀ σ(s) dW̃_S(s) ),

i.e.

log( S(T)/S(0) ) = −½ ∫₀ᵀ σ²(s) ds + ρ ∫₀ᵀ σ(s) dW̃_σ(s) + √(1 − ρ²) ∫₀ᵀ σ(s) dW̃_S(s).   (2.29)

By employing Equation (2.29), the asset dynamics S(T) can be sampled from

S(T) ∼ S(0) exp( −½ ∫₀ᵀ σ²(s) ds + (ρ/α)(σ(T) − σ(0)) + X √( (1 − ρ²) ∫₀ᵀ σ²(s) ds ) ),   (2.30)

with X a standard normal random variable.
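The two analytic cases can be sketched as follows; the function names are illustrative, IV denotes a sample of ∫₀ᵀ σ²(s) ds, and X a standard normal draw.

```python
import math

def sample_S_beta0(S0, sigma0, sigmaT, IV, rho, alpha, X):
    """beta = 0: S(T) is conditionally normal, per Equations (2.27)-(2.28)."""
    return (S0 + (rho / alpha) * (sigmaT - sigma0)
            + math.sqrt((1.0 - rho * rho) * IV) * X)

def sample_S_beta1(S0, sigma0, sigmaT, IV, rho, alpha, X):
    """beta = 1: log-normal conditional sample, per Equation (2.30)."""
    drift = -0.5 * IV + (rho / alpha) * (sigmaT - sigma0)
    return S0 * math.exp(drift + math.sqrt((1.0 - rho * rho) * IV) * X)

# example draw with X = 0.1
s_example = sample_S_beta1(1.0, 0.5, 0.6, 0.3, -0.5, 0.4, 0.1)
```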
C ASE β 6= 0, β 6= 1
As mentioned before, for the generic case of β ∈ (0, 1), we employ the enhanced inver-
sion of the SABR asset price distribution in Equation (2.2) as introduced in [21]. We
briefly summarize this approach. The asset simulation is performed either by a moment-
matched quadratic Gaussian approximation or by an enhanced direct inversion. The au-
thors in [21] proposed a threshold value to choose the suitable technique among these
two. The moment-matching approach relies on the fact that, for S(0) ≫ 0, the distribution function in Equation (2.2) can be approximated by

Pr( S(t) ≤ K | S(0) > 0, σ(t), ∫₀ᵗ σ²(s) ds ) ≈ χ²(c; 2 − b, a),
with

ψ := κ²/μ²,   e² = 2ψ⁻¹ − 1 + √(2ψ⁻¹) √(2ψ⁻¹ − 1) ≥ 0,   d = μ/(1 + e²),

where μ and κ² denote the conditional mean and variance employed in the moment matching.
Applying this to our case in Equation (2.2), we can sample the conditional asset dynamics, T > 0, by

S(T) = ( (1 − β)² (1 − ρ²) d (e + X)² ∫₀ᵀ σ²(s) ds )^{1/(2(1−β))},

with X a standard normal sample.
The moment-matching approximation is accurately applicable when 0 < ψ ≤ 2 and µ ≥ 0
[21]. Otherwise, a direct inversion procedure must be employed, based on a root-finding Newton method. In order to reduce the number of Newton iterations (and the expensive evaluations of the non-central chi-square CDF), an initial guess is first computed from an approximation of the non-central chi-square distribution in terms of the normal CDF, derived by Sankaran in [114]. The result is a significant reduction in
the number of iterations and, hence, in the computational cost. Furthermore, this root-
finding procedure consists of only basic operations, so that the whole procedure can be
easily vectorized, leading to a further efficiency improvement. The resulting sampling
procedure is robust and efficient.
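The idea of combining a cheap approximate formula with Newton iterations can be illustrated on a stand-in problem; this is not the exact scheme of [21], only its structure. Here the standard normal CDF replaces the non-central chi-square CDF, and a logistic quantile approximation replaces Sankaran's formula.

```python
import math

def cdf(x):
    """'Expensive' CDF (standard normal stands in for the non-central chi-square)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pdf(x):
    """Its derivative, used by the Newton update."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def invert(u, x0, tol=1e-12, max_iter=50):
    """Newton iterations for cdf(x) = u, starting from x0."""
    x, n = x0, 0
    while abs(cdf(x) - u) > tol and n < max_iter:
        x -= (cdf(x) - u) / pdf(x)
        n += 1
    return x, n

u = 0.975
x0 = math.log(u / (1.0 - u)) / 1.702      # cheap closed-form initial guess
x_warm, n_warm = invert(u, x0)            # warm start: few expensive CDF calls
x_cold, n_cold = invert(u, 0.0)           # cold start needs more iterations
```

The warm start converges in fewer iterations, which is exactly the saving exploited in the enhanced direct inversion.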
M ARTINGALE CORRECTION
As pointed out in [2] and [21], the exact simulation of the asset price can, in certain
SV models, give rise to loss of the martingale property (because of the approximation
of a continuous process by its discrete equivalent). This is especially seen, when the
size of the time-step is large and for certain configurations of the SABR parameters, like
small β values, close-to-zero initial asset values S 0 , high vol-vol parameter α or large
initial volatility σ0 . As we deal with one large time-step, a martingale correction needs
to be applied. We incorporate a simple but effective numerical martingale correction, as
follows,

S(T) = S(T) − (1/n) Σ_{p=1}^{n} S_p(T) + E[S(T)]
     = S(T) − (1/n) Σ_{p=1}^{n} S_p(T) + S₀,

where S_p(·) represents the forward asset value for the p-th Monte Carlo path.
Note, however, that, for practical SABR parameters, the martingale error is very small.
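A minimal sketch of this correction (the function name is illustrative, and the slightly biased path set is synthetic):

```python
import numpy as np

def martingale_correction(S_T, S0):
    """Shift all simulated forwards so that the sample mean equals E[S(T)] = S0."""
    return S_T - S_T.mean() + S0

rng = np.random.default_rng(0)
S0 = 1.0
S_T = S0 * np.exp(rng.normal(-0.02, 0.2, size=100000))  # slightly biased paths
S_T_corr = martingale_correction(S_T, S0)
```

The correction is a constant shift across paths, so the shape of the simulated distribution is preserved while the martingale property is restored in the sample mean.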
            S₀      σ₀     α      β      ρ       T
Set I       1.0     0.5    0.4    0.7    0.0     2
Set II      0.05    0.1    0.4    0.0    −0.8    0.5
Set III     0.04    0.4    0.8    1.0    −0.5    2

Table 2.2: Data sets.
The experiments were performed on a computer with an Intel Core i7-4720HQ CPU at 2.6 GHz and 16 GB of RAM. The employed software package was Matlab r2015b.
Figure 2.4: Graphical GOF test: λ̂(u) for the Clayton, Frank and Gumbel copulas vs. the empirical λ(u), for (a) Set I, (b) Set II and (c) Set III.
with n the number of samples and x_{d,i} the i-th sample of the d-th marginal distribution. Since we deal with a bivariate case, we have d = 2. The graphical procedure of the GOF test consists of comparing the distance λ̂(u) := u − K̃(u) with the empirical distance, λ(u), obtained by using an empirical estimation of K(u).
In Figure 2.4, the distances λ̂(u) between three Archimedean copulas (Clayton, Frank
and Gumbel) for the graphical GOF test are depicted. The empirical λ(u) (black line) is
employed as the reference. The experiment is performed for each data set in Table 2.2.
As the measurable quantity, the MSE of λ̂(u) − λ(u) is presented in Table 2.3. We use
n = 100, 000 simulations.
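The ingredients of the graphical test can be sketched as follows, assuming pseudo-observations are used for the empirical Kendall function K(u); λ(u) = u log(u)/θ is the Kendall distance of the Gumbel copula with parameter θ (θ = 1 reduces to independence). Function names are illustrative.

```python
import numpy as np

def empirical_lambda(x, y, u_grid):
    """Empirical lambda(u) = u - K_n(u) from pseudo-observations."""
    n = len(x)
    # V_i: fraction of sample points strictly dominated by point i
    V = np.array([np.mean((x < x[i]) & (y < y[i])) for i in range(n)])
    K = np.array([np.mean(V <= u) for u in u_grid])  # empirical Kendall function
    return u_grid - K

def gumbel_lambda(u_grid, theta):
    """Theoretical lambda(u) of the Gumbel copula; theta = 1 is independence."""
    return u_grid * np.log(u_grid) / theta
```

Plotting both curves on a u-grid (and reporting the MSE of their difference) reproduces the structure of the test behind Figure 2.4 and Table 2.3.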
From the GOF results for the Archimedean copulas, we find that the Gumbel copula
fits best in our framework. Once we have determined the most suitable Archimedean
copula, we perform a GOF test including the Gaussian, Student t and Gumbel copu-
las. For an objective comparison, also a measurable GOF technique is applied. Among
the several families of available generic GOF tests in the literature, in [9], Berg suggests
that methods that rely on the so-called Deheuvels or empirical copula [42] give reliable
results. Hence, we will perform a test which is based on the distances between the De-
heuvels copula, C d , and the analyzed copula C (see [45]). By using the discrete L 2 norm,
this GOF measure reads

D_d(C_d, C) = ‖C_d − C‖_{L²},

where d is the number of random variables to build the copula, d = 2. For this GOF
test, we reduce the number of samples to n = 1, 000 due to the computational (time and
memory) cost of the Deheuvels copula calculations. The other parameters remain the
same. In Table 2.4, the obtained distances, D 2 , between the tested copulas (Gaussian,
Student t and Gumbel) and the Deheuvels copula are shown.
According to the GOF results, the three copulas perform very similarly. When longer maturities are considered, the Gumbel copula exhibits the smallest errors in the GOF tests. In terms of speed, the Gaussian copula is around three times faster than the Gumbel copula, although the impact on the overall method is very small. The Student t copula is discarded since its accuracy and performance are very similar to those of the Gaussian copula, and the calibration of the ν parameter adds extra complexity. Therefore, as a general strategy, we conclude that the Gumbel copula is the most robust choice. When short maturities are considered, the Gaussian copula may be a satisfactory alternative.
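For completeness, Gumbel copula samples can be generated with the standard Marshall-Olkin (frailty) construction, using a positive stable variate V whose Laplace transform is the Gumbel generator inverse exp(−t^{1/θ}). This is a generic textbook algorithm, sketched with illustrative names; it is not necessarily the sampler used in the experiments.

```python
import numpy as np

def sample_gumbel_copula(theta, n, seed=1):
    """Marshall-Olkin sampling of the Gumbel copula, theta >= 1."""
    rng = np.random.default_rng(seed)
    a = 1.0 / theta
    # positive stable variate with Laplace transform exp(-s**a)
    # (Chambers-Mallows-Stuck / Kanter construction)
    t = rng.uniform(0.0, np.pi, n)
    w = rng.exponential(1.0, n)
    V = (np.sin(a * t) / np.sin(t) ** (1.0 / a)) \
        * (np.sin((1.0 - a) * t) / w) ** ((1.0 - a) / a)
    e = rng.exponential(1.0, (2, n))
    # U_i = psi(E_i / V), with generator inverse psi(s) = exp(-s**(1/theta))
    u, v = np.exp(-(e / V) ** a)
    return u, v

u, v = sample_gumbel_copula(theta=2.0, n=1500)
```

The copula parameter relates to Kendall's τ through τ = 1 − 1/θ, which is how θ is calibrated from the correlation approximation.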
As usual, the option prices are obtained by averaging the maximum of the forward asset values at maturity minus the strike and zero, i.e. max(S(T) − K_i(T), 0) for a call option. Subsequently, the Black-Scholes formula is inverted to determine the implied volatilities. Note that, once the forward asset is simulated by our method, we can price options at multiple strikes simultaneously.
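The inversion to implied volatilities can be sketched with the Black formula on the forward (discounting omitted) and a simple bisection on the monotone price-volatility map; names are illustrative.

```python
import math

def black_call(F, K, T, vol):
    """Black (forward) call price; discounting omitted."""
    if vol <= 0.0:
        return max(F - K, 0.0)
    s = vol * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    d2 = d1 - s
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return F * N(d1) - K * N(d2)

def implied_vol(price, F, K, T, lo=1e-8, hi=5.0, tol=1e-10):
    """Bisection: the call price is increasing in the volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(F, K, T, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

iv = implied_vol(black_call(1.0, 1.1, 2.0, 0.4), 1.0, 1.1, 2.0)
```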
2.5. N UMERICAL EXPERIMENTS 39
[Figure: implied volatility against the number of time-steps for n = 100, n = 10,000 and n = 1,000,000 samples, compared with the reference (two panels).]
Table 2.5: Convergence in n: mean and standard deviation of the error (basis points) and time (sec.).
Strikes K1 K2 K3 K4 K5 K6 K7
Set I (Reference: Antonov [4])
Hagan 55.07 52.34 50.08 N/A 47.04 46.26 45.97
MC 23.50 21.41 19.38 N/A 16.59 15.58 14.63
Gaussian 16.23 20.79 24.95 N/A 33.40 37.03 40.72
Gumbel 11.59 15.57 19.12 N/A 25.41 28.66 31.79
Set II (Reference: Korn [86])
Hagan −558.82 −492.37 −432.11 −377.47 −327.92 −282.98 −242.22
MC 5.30 6.50 7.85 9.32 10.82 12.25 13.66
Gaussian 9.93 9.98 10.02 10.20 10.57 10.73 11.04
Gumbel −9.93 −9.38 −8.94 −8.35 −7.69 −6.83 −5.79
Set III (Reference: MC Milstein)
Hagan 287.05 252.91 220.39 190.36 163.87 141.88 126.39
Gaussian 16.10 16.76 16.62 15.22 13.85 12.29 10.67
Gumbel 6.99 3.79 0.67 −2.27 −5.57 −9.79 −14.06
Table 2.6: One-step SABR method with Gaussian and Gumbel copulas - Implied volatility: differences in basis
points.
alternative for pricing European options up to maturities of two years under the SABR model. Our one-step approach works particularly well in the case of low strike values and higher volatilities, which pose problems for the Hagan implied volatility formula.
2.6. CONCLUSIONS
In this chapter, an efficient method to obtain samples of the SABR dynamics based on the
time-integrated variance has been developed. The technique employs a Fourier method
to derive the CDF of the time-integrated variance. By means of copulas, the conditional
sampling technique is obtained. Its application gives us a fast and accurate one time-
step Monte Carlo method for the SABR model simulation. By numerical experiments, we
have shown that our method does not suffer from well-known problems of the Hagan
formula in the case of small strike values and higher volatilities.
A generalization of our one-step SABR method to a multiple time-step version will be
presented in the next chapter. This will allow us to deal with problems with longer ma-
turities (more than two years) and also with more involved exotic options (early-exercise
and path-dependent options). We need to introduce new method components to deal
with the increasing computational complexity in that case.
The hybrid aspect of the method proposed in this chapter is represented by the combined
use of Monte Carlo and Fourier techniques. This fact has allowed us to develop an accu-
rate and highly efficient one time-step simulation scheme that can be used for calibration
purposes, where Monte Carlo-based methods are typically discarded.
CHAPTER 3
Multiple time-step Monte Carlo simulation of the SABR
model
In this chapter, we will present the multiple time-step Monte Carlo simulation technique for pricing options under the SABR model. The proposed method is an extension of the one time-step Monte Carlo method that we proposed in the previous chapter for pricing European options in the context of model calibration. A highly efficient method results, with several interesting and non-trivial components (Fourier inversion for the sum of log-normals, stochastic collocation, the Gumbel copula and a correlation approximation) that have not previously been seen in combination within a Monte Carlo simulation. The present multiple time-step Monte Carlo method is especially useful for long-term options and for exotic options.
3.1. I NTRODUCTION
The SABR model [67] is an established SDE model which, in practice, is often used for
interest rates and FX modelling. It is based on a parametric local volatility component in
terms of a model parameter, β, and reads

dS(t) = σ(t) S^β(t) dW_S(t),
dσ(t) = α σ(t) dW_σ(t).   (3.1)

Here S(t) = S̄(t) exp(r(T − t)) denotes the forward price of the underlying asset S̄(t), with r an interest rate, S₀ the spot price and T the maturity. Further, σ(t) represents a stochastic volatility process, with σ(0) = σ₀, and W_S(t) and W_σ(t) are correlated Brownian motions with constant correlation coefficient ρ (i.e. dW_S(t) dW_σ(t) = ρ dt). The parameters of the SABR model are α > 0 (the volatility of volatility, vol-vol), 0 ≤ β ≤ 1 (the variance elasticity) and ρ (the correlation coefficient).
By the one time-step SABR method in Chapter 2, we can efficiently calculate accurate implied volatilities for any strike, however only for short times to maturity (less than two years). This fits well with the context of FX markets. Here, we extend the one time-step
Monte Carlo method and propose a multiple time-step Monte Carlo simulation tech-
nique for pricing options under the SABR dynamics. We call it the mSABR method. With
this new simulation approach, we can price options with longer maturities, payoffs re-
lying on the transitional distributions and exotic options (like path-dependent options)
under the SABR dynamics.
This chapter is based on the article “On an efficient multiple time-step Monte Carlo simulation of the SABR
model”. Published in Quantitative Finance, 2017 [89].
44 3. M ULTIPLE TIME - STEP M ONTE C ARLO SIMULATION OF THE SABR MODEL
Because a robust and efficient SABR simulation scheme may be involved and quite expensive, we aim for using as few time-steps as possible. This fact, together with the long time horizon assumption, implies that we focus on larger time-steps. In this context, the important research line is called exact simulation, where, rather than using Taylor-based simulation techniques, the PDF of the SDE under consideration is approximated highly accurately. The point of departure is Broadie and Kaya's exact simulation for the Heston SV model [15]. That method is, however, time-consuming because of multiple evaluations of Bessel functions and the use of Fourier inversion techniques on each Monte Carlo path. Inspired by this approach, computational improvements were proposed in [124] and [130]. Furthermore, Andersen in [2] presented two efficient alternatives to the Broadie and Kaya scheme, the so-called Truncated Gaussian (TG) and the Quadratic Exponential (QE) schemes.
For the SABR model, exact Monte Carlo simulation is nontrivial because the forward
process in (3.1) is governed by CEV dynamics [31, 117]. In [75], the connection between the CEV process and a squared Bessel process was explained, and an analytic approximation
for the SABR conditional distribution based on the non-central χ2 distribution was pre-
sented. The expression relies on the stochastic volatility dynamics, but also on the time-
integrated variance process (which itself also depends on the volatility).
An (approximately) exact SABR simulation can then be subdivided in the simulation
of the volatility process, the simulation of the time-integrated variance (conditional on
the volatility process) and the simulation of the underlying CEV process (conditional on
the volatility and the time-integrated variance processes). Based on this, Chen et al. pro-
posed in [21] a low-biased Monte Carlo scheme with moment-matching (following the
techniques of the QE scheme by Andersen in [2]) and a direct inversion approach. For
the simulation of the time-integrated variance, they also employed moment-matching
based on conditional moments that were approximated by a small disturbance expan-
sion. In [78], Kennedy and Pham provided the moments of the time-integrated vari-
ance for specific SABR models (like the normal, log-normal and displaced diffusion SABR
models) and derived an approximation of the SABR distribution. In [95], a summary of
different approaches was presented.
In this chapter, we approximate1 the distribution of the time-integrated variance
conditional on the volatility dynamics by joining the two involved marginal distribu-
tions, i.e. the stochastic volatility and time-integrated variance distributions, by means
of a copula technique. With both marginal distributions available, we use the Gumbel
copula to define a multivariate distribution which approximates the conditional distri-
bution of the time-integrated variance given the stochastic volatility.
An approximation of the CDF of the time-integrated variance can be obtained by a
recursive procedure, as described in [137], originally employed to price arithmetic Asian
options. This iterative technique is based on the derivation of the ChF of the time-
integrated variance process and Fourier inversion to recover the PDF.
The fact that we need to apply recursion and compute the corresponding ChF, PDF
and CDF of the time-integrated variance for each Monte Carlo sample at each time-step,
makes this approach however computationally very expensive (even with the improve-
ments already made in Chapter 2), like the Heston exact simulation. In order to reduce
1 An analytical expression exists but is not practically tractable.
3.2. ‘A LMOST EXACT ’ SABR SIMULATION 45
the use of this procedure as much as possible, highly efficient sampling from the CDF of
the time-integrated variance is proposed, which is based on the so-called Stochastic Col-
location Monte Carlo sampler (SCMC), see [65]. The technique employs a sophisticated
interpolation (based on Lagrange polynomials and collocation points) of the CDF under
consideration.
As will be seen, the resulting ‘almost exact’ Monte Carlo SABR simulation method
that we present here (the mSABR method), contains several interesting (and not com-
monly used) components, like the Gumbel copula, a recursion plus Fourier inversion
to approximate the CDF of the time-integrated variance, and efficient interpolation by
means of the SCMC sampler. The proposed mSABR scheme allows for fast and accu-
3
rate option pricing under the SABR dynamics, providing a better ratio of accuracy and
computational cost than Taylor-based simulation schemes. Compared to the other ap-
proaches mentioned in this introduction, we provide a highly accurate SABR simulation
scheme, based on only a few time-steps.
The chapter is organized as follows. In Section 3.2, SABR model simulation is briefly
introduced. In Section 3.3, the different parts of the multiple time-step copula-based
technique are described, including the derivation of the marginal distributions and the
application of the copula. Some numerical experiments are presented in Section 3.4. We
conclude in Section 3.5.
The forward dynamics are governed by a CEV process. Based on this fact and the
works by Schroder [117] and Islah [75], an analytic approximation for the CDF of the
SABR conditional process has been obtained. For some generic time interval [s, t], 0 ≤ s < t ≤ T, assuming S(s) > 0, the conditional cumulative distribution for the forward S(t) with an absorbing boundary at S(t) = 0, given σ(s), σ(t) and ∫ₛᵗ σ²(z) dz, is given by

Pr( S(t) ≤ K | S(s) > 0, σ(s), σ(t), ∫ₛᵗ σ²(z) dz ) = 1 − χ²(a; b, c),   (3.3)

where

a = (1/ν(t)) ( S(s)^{1−β}/(1 − β) + (ρ/α)(σ(t) − σ(s)) )²,   c = K^{2(1−β)} / ( (1 − β)² ν(t) ),

b = 2 − (1 − 2β − ρ²(1 − β)) / ( (1 − β)(1 − ρ²) ),   ν(t) = (1 − ρ²) ∫ₛᵗ σ²(z) dz,

and χ²(x; δ, λ) is the non-central chi-square CDF.
• Simulation of the SABR volatility process, σ(t) given σ(s). By Equation (3.2), the stochastic volatility process of the SABR model exhibits a log-normal distribution. The solution is a GBM, i.e. the exact simulation of σ(t)|σ(s) reads

σ(t) ∼ σ(s) exp( α W̃_σ(t) − ½ α²(t − s) ).   (3.4)
• Simulation of the SABR time-integrated variance process, ∫ₛᵗ σ²(z) dz | σ(s), σ(t). This conditional distribution is not efficiently tractable. We will therefore derive an approximation of the conditional distribution of the SABR time-integrated variance given σ(t) and σ(s). The time-integrated variance sampling can then be done by simply inverting this distribution.
• Simulation of the SABR forward price process. The forward price S(t ) can be sim-
ulated by inverting the CDF in Equation (3.3). By this, we avoid negative forward
prices in the simulation, as an absorbing boundary at zero is considered. There is
no analytic expression for the inverse distribution and therefore this inversion has
to be computed by means of some numerical approximation.
The multiple time-step Monte Carlo simulation for the SABR model is thus summa-
rized in Algorithm 2. Input parameters are the initial conditions (S 0 and σ0 ), the maturity
time, T , the SABR parameters α, β and ρ, the number of Monte Carlo samples (n) and
the number of time-steps (m).
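As an illustration of the first component, the following sketch simulates the volatility leg of the algorithm exactly over m large time-steps, writing the update of Equation (3.4) in terms of independent normal increments. The conditional time-integrated variance and CEV forward steps are only indicated by a comment, since they require the copula and CDF-inversion machinery of Section 3.3; all names are illustrative.

```python
import numpy as np

def simulate_volatility(sigma0, alpha, T, n, m, seed=7):
    """Exact log-normal update of sigma over m steps and n paths."""
    rng = np.random.default_rng(seed)
    dt = T / m
    sigma = np.full(n, sigma0)
    paths = [sigma]
    for _ in range(m):
        Z = rng.standard_normal(n)
        sigma = sigma * np.exp(alpha * np.sqrt(dt) * Z - 0.5 * alpha ** 2 * dt)
        paths.append(sigma)
        # here the conditional time-integrated variance and the CEV forward
        # update would be sampled (copula + CDF inversion, Section 3.3)
    return np.array(paths)

vol_paths = simulate_volatility(sigma0=0.5, alpha=0.4, T=2.0, n=200000, m=10)
```

Since the update is exact, the simulated volatility retains its martingale property, E[σ(t)] = σ₀, regardless of the step size.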
In Section 3.3.4, we describe the procedure to sample ∫ₛᵗ σ²(z) dz | σ(s), σ(t). This represents the main difference with respect to the technique presented in Chapter 2, where the conditioning is solely on the final volatility (the initial volatility is constant). The double condition implies that the marginal distributions that will be employed within the copula are conditional distributions themselves. Although this is, conceptually, a minor change, it has a drastic impact on the efficiency of the procedure, increasing the computational cost.
As a copula approach is again employed, the CDF of the time-integrated variance
given the initial volatility, σ(s), (as a marginal distribution) must be derived. In Section
3.3.1, a recursive technique to obtain an approximation of this CDF is presented. Because we need to apply this recursion to approximate the ChF, the PDF and the CDF of ∫ₛᵗ σ²(z) dz | σ(s) for each sample of σ(s) at each time-step, this multiple time-step approach is very expensive in terms of execution time. To overcome this drawback, an efficient alternative will be employed here, based on Lagrange interpolation, as in the SCMC sampler [65]. In Section 3.2.2, this methodology is briefly described.
where =ᵈ means equality in distributional sense, U ∼ U([0, 1]) and u_n are samples from U([0, 1]). The computational cost of this approach highly depends on the cost of the inversion F_Y⁻¹, which is assumed to be expensive.

We therefore consider another, 'cheap', random variable X, whose inversion, F_X⁻¹, is computationally much less expensive. In this framework, the following relation holds

F_Y(Y) =ᵈ U =ᵈ F_X(X).

The samples y_n of Y, and x_n of X, are thus related via the following inversion,

y_n = F_Y⁻¹(F_X(x_n)).

However, this does not yet imply any improvement since the number of expensive inversions F_Y⁻¹ remains the same. The goal is to compute y_n by using a function g(·) = F_Y⁻¹(F_X(·)), such that

F_X(x) = F_Y(g(x)) and Y =ᵈ g(X),
where evaluations of the function g(·) do not require many inversions F_Y⁻¹(·).

In [65], the function g(·) is approximated by means of Lagrange interpolation, which is a well-known interpolation also used in the Uncertainty Quantification (UQ) context. The result is a polynomial, g_{N_Y}(·), which approximates the function g(·) = F_Y⁻¹(F_X(·)), and the samples y_n can be obtained by employing g_{N_Y}(·) as

y_n ≈ g_{N_Y}(x_n) = Σ_{i=1}^{N_Y} y_i ℓ_i(x_n),   ℓ_i(x_n) = Π_{j=1, j≠i}^{N_Y} (x_n − x̄_j)/(x̄_i − x̄_j),

where x_n is a vector of samples from X, and x̄_i and x̄_j are the so-called collocation points. N_Y represents the number of collocation points and y_i the exact inversion value of F_Y at the collocation point x̄_i, i.e. y_i = F_Y⁻¹(F_X(x̄_i)). By applying this interpolation, the number
of inversions is reduced and, with only NY expensive inversions F Y−1 (F X (x̄ i )), we can
generate any number of samples by evaluating the polynomial g NY (x n ). This constitutes
the 1D version of SCMC sampler in [65].
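A 1D sketch of the sampler follows, where Y is taken log-normal so that the exact map g(x) = F_Y⁻¹(F_X(x)) = exp(x) is available for comparison; the collocation points for X ∼ N(0, 1) are √2 times the Gauss-Hermite nodes. Names and the choice of seven points are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def scmc_sample(g_exact, n_coll, x_samples):
    """1D SCMC: expensive inversions only at the collocation points."""
    nodes, _ = hermgauss(n_coll)
    x_bar = np.sqrt(2.0) * nodes          # Gauss points for X ~ N(0,1)
    y_bar = g_exact(x_bar)                # the only 'expensive' inversions
    y = np.zeros_like(x_samples)
    for i in range(n_coll):               # evaluate the Lagrange interpolant
        li = np.ones_like(x_samples)
        for j in range(n_coll):
            if j != i:
                li *= (x_samples - x_bar[j]) / (x_bar[i] - x_bar[j])
        y += y_bar[i] * li
    return y

rng = np.random.default_rng(3)
x = rng.standard_normal(10000)
y = scmc_sample(np.exp, 7, x)             # samples of Y = exp(X), X ~ N(0,1)
```

Any number of samples can thus be generated from only seven exact inversions, which is the source of the computational gain.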
According to the definition of the SCMC technique, a crucial aspect for the computa-
tional cost is parameter NY : the fewer collocation points are required, the more efficient
will be the sampling procedure. The collocation points must be optimally chosen in
a way to minimize their number. For any random variable X , the optimal collocation
points will be based on the moments of X . The optimal collocation points are here cho-
sen to be Gauss quadrature points that are defined as the zeros of the related orthogonal
polynomial. This approach leads to a stable interpolation under the probability distri-
bution of X . The complete procedure to compute the collocation points is described in
[65].
In Section 3.3.2, the application of the SCMC technique to generate samples from the CDF of ∫ₛᵗ σ²(z) dz | σ(s) is presented. Since we deal with a conditional distribution, the 2D version of SCMC needs to be used.
but the multiple time-step version is somewhat more involved. We approximate Y(s, t) by its discrete equivalent, i.e.

Y(s, t) := ∫ₛᵗ σ²(z) dz ≈ Δt Σ_{j=1}^{M} σ²(t_j) =: Ỹ(s, t),   (3.5)

and denote by f_{log Ỹ | log σ(s)} the PDF of log Ỹ(s, t) | log σ(s). The density f_{log Ỹ | log σ(s)} is, in turn, recovered by approximating the associated ChF, f̂_{log Ỹ | log σ(s)}, and applying a Fourier inversion procedure.
σ²(t_j) = σ²(t₀) exp(R₁ + R₂ + ··· + R_j).   (3.9)

Y₁ = R_M,   Y_j = R_{M+1−j} + Z_{j−1},   j = 2, . . . , M,   (3.10)

with Z_j := log(1 + exp(Y_j)). By Equations (3.9) and (3.10), the discrete time-integrated variance can be expressed as

Ỹ(s, t) = Σ_{i=1}^{M} σ²(t_i) Δt = Δt σ²(s) exp(Y_M).   (3.11)
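The identity behind Equations (3.9)-(3.11) can be checked numerically: the backward recursion, with the standard recursion variable Z_j = log(1 + exp(Y_j)) of the iterative technique from [137], reproduces the direct sum of the discrete variances. The parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
M, dt, sigma_s = 8, 0.25, 0.5
R = rng.normal(-0.01, 0.1, M)                  # log-increments R_1, ..., R_M

# direct computation: Ytilde = dt * sum_j sigma^2(t_j)
sigma2 = sigma_s ** 2 * np.exp(np.cumsum(R))
direct = dt * sigma2.sum()

# backward recursion (3.10): Y_1 = R_M, Y_j = R_{M+1-j} + Z_{j-1}
Y = R[M - 1]
for j in range(2, M + 1):
    Z = np.log1p(np.exp(Y))                    # Z_{j-1} = log(1 + exp(Y_{j-1}))
    Y = R[M - j] + Z
recursive = dt * sigma_s ** 2 * np.exp(Y)      # Equation (3.11)
```

The point of the recursion is not this deterministic identity but that the ChF of Y_M can be built step by step, which is what the next derivation exploits.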
From Equation (3.11) and by applying the definition of the ChF, we determine f̂_{log Ỹ | log σ(s)} as follows,

f̂_{log Ỹ | log σ(s)}(u) = E[ exp(i u log Ỹ(s, t)) | log σ(s) ].

The PDF is then recovered by the COS expansion,

f_{log Ỹ | log σ(s)}(x) ≈ (2/(b̃ − ã)) Σ′_{k=0}^{N−1} C_k cos( kπ (x − ã)/(b̃ − ã) ),   (3.13)

with

C_k = ℜ( f̂_{log Ỹ | log σ(s)}( kπ/(b̃ − ã) ) exp( −i ãkπ/(b̃ − ã) ) ),

and

f̂_{log Ỹ | log σ(s)}( kπ/(b̃ − ã) ) = exp( i (kπ/(b̃ − ã)) log(Δt σ²(s)) ) f̂_{Y_M}( kπ/(b̃ − ã) )
                                  ≈ exp( i (kπ/(b̃ − ã)) log(Δt σ²(s)) ) f̂*_{Y_M}( kπ/(b̃ − ã) ),
where N is the number of COS terms, [ã, b̃] is the support² of log Ỹ(s, t) | log σ(s), and the prime (′) and ℜ symbols in Equation (3.13) denote division of the first term in the summation by two and taking the real part of the complex-valued expression in the brackets, respectively. The CDF F_{log Ỹ | log σ(s)} can be obtained by integration, plugging the approximated f_{log Ỹ | log σ(s)} from Equation (3.13) into Equation (3.6).
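The COS reconstruction of Equation (3.13) can be sketched generically: below, the density of a standard normal (whose ChF is known in closed form) is recovered on [ã, b̃] = [−10, 10]; in the method the input would be f̂_{log Ỹ | log σ(s)}. Names and truncation values are illustrative.

```python
import numpy as np

def cos_density(chf, a, b, N, x):
    """Recover a PDF on [a, b] from its ChF by the COS expansion."""
    k = np.arange(N)
    u = k * np.pi / (b - a)
    C = (chf(u) * np.exp(-1j * u * a)).real
    C[0] *= 0.5                                 # the 'primed' sum halves k = 0
    return (2.0 / (b - a)) * (C[:, None] * np.cos(np.outer(u, x - a))).sum(axis=0)

chf_normal = lambda u: np.exp(-0.5 * u ** 2)    # ChF of N(0, 1)
x = np.linspace(-3.0, 3.0, 7)
f = cos_density(chf_normal, -10.0, 10.0, 128, x)
```

For a sufficiently wide support the expansion converges very quickly in N, which is why the COS method is attractive inside the Monte Carlo loop.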
The ChF f̂_{log Ỹ | log σ(s)} and PDF f_{log Ỹ | log σ(s)} need to be derived for each sample of log σ(s), and this makes the computation of F_{log Ỹ | log σ(s)} (and also its subsequent inversion) very expensive. In order to overcome this issue, the SCMC technique (see Section 3.2.2) will be employed, approximating these rather expensive computations by means of an accurate interpolation.
² It can be calculated given the support of Y_M, [a, b].
3.3. C OMPONENTS OF THE M SABR METHOD 51
where x_n are the samples from X ∼ N(0, 1), which is used as the cheap variable, and v_n are the samples of log σ(s); x̄_i and v̄_j are the collocation points for approximating the variables log Ỹ and log σ(s), respectively. The Lagrange polynomials ℓ_i and ℓ_j are defined by

ℓ_i(x_n) = Π_{k=1, k≠i}^{N_Ỹ} (x_n − x̄_k)/(x̄_i − x̄_k),   ℓ_j(v_n) = Π_{k=1, k≠j}^{N_σ} (v_n − v̄_k)/(v̄_j − v̄_k).
The collocation points x̄_i and v̄_j are calculated based on the moments of the random variables. The two variables involved in the application of the 2D SCMC technique are the collocation variable, X ∼ N(0, 1), and log σ(s) ∼ N(log σ₀ − ½α²s, α√s). The moments of a zero-mean normal variable X with standard deviation s_X are analytically available, and are given by

E[Xᵖ] = 0 if p is odd,   E[Xᵖ] = s_X^p (p − 1)!! if p is even,

where p is the moment order and !! represents the double factorial. In order to compute the optimal collocation points, v̄_j, the collocation points precalculated by means of the log σ(s) moments need to be shifted according to the mean, i.e. log σ₀ − ½α²s.
We first test numerically the accuracy of the SCMC technique for our problem. In
Figure 3.1, two experiments are presented. In Figure 3.1(a), we compare the samples ob-
tained by direct inversion and by the SCMC technique. The fit is highly accurate, already
with only seven collocation points. In Figure 3.1(b), the empirical CDFs are depicted,
where the excellent match is confirmed.
In order to show the performance of the SCMC technique for the simulation of the
distribution of log Ỹ | log σ(s), the execution times of generating different numbers of
samples are presented in Table 3.1.
Figure 3.1: (a) Points cloud log Ỹ | log σ(s) by direct inversion (DI) sampling (red dots) and SCMC sampling
(black circles). (b) Empirical CDFs of log Y | log σ(s) with and without using SCMC.
The approximations made in Sections 3.3.1 and 3.3.2 are analysed next, indicating how to reduce or bound the errors. The sources of error in the approximation include ε_D, the discretization of Y(s, t) in Equation (3.5); ε_T, the truncation in Equations (2.16) and (3.13); ε_C, due to the cosine expansion in Equations (2.16) and (3.13); ε_M, the numerical computation of I_M in Equation (2.23); and ε_S, due to the application of SCMC in Equation (3.14). The errors ε_T, ε_C and ε_M were already studied in Section 2.3.3 of Chapter 2. The error ε_D is updated according to the multiple time-step version.
which, by employing the same time discretization as for Ỹ(s, t) in Equation (3.5), can be written as

Ȳ(s, t_j) − Ȳ(s, t_{j−1}) = σ²(t_{j−1}) Δt.

Then, the error ε_D can be computed and bounded by employing the usual strong convergence expression as

ε_D := E[ | Ỹ(s, t) − Y(s, t) | ] ≤ C ( (t − s)/M )^γ,

for some constant C > 0.
In the actual experiments, we consider log Ỹ (s, t ) and log Y (s, t ). This will reduce the
error further.
Error ²S . This error appears in the application of the interpolation-based SCMC tech-
nique to get samples of log Ỹ | log σ(s). In [65], where the SCMC sampler was proposed,
an error analysis of the technique was carried out. Here we summarize the most impor-
tant results related to our context and refer the reader to the original paper for further
details.
To measure the error which results from the collocation method in a more general
case (see Section 3.2.2), we can consider either the difference between functions g (X )
and g N (X ) (X the cheap variable) or the error associated with the approximated CDF.
The first type of error is due to Lagrange interpolation, so the corresponding error estimate is well-known. The error ε_X(ξ_n) with N_Y collocation points, given by standard
Lagrange interpolation, reads

$$\epsilon_X(\xi_n) = \left|g(\xi_n) - g_{N_Y}(\xi_n)\right| = \left|\frac{1}{N_Y!}\left.\frac{\partial^{N_Y} g(x)}{\partial x^{N_Y}}\right|_{x=\xi}\prod_{i=1}^{N_Y}(\xi_n - \bar{x}_i)\right|, \qquad (3.15)$$
where x̄_i are the collocation points, ξ̃ ∈ [min(x), max(x)] and x = (x̄_1, ..., x̄_{N_Y})^T. The error
can be bounded by defining ξ as the point where |∂^{N_Y} g(x)/∂x^{N_Y}| has its maximum. A
small probability of large errors in the tails can be observed by deriving the error ε_U(u_n),
by substituting the uniformly distributed random variable u_n in the previous equation,
using ξ_n = F_X^{-1}(u_n),

$$\epsilon_U(u_n) = \left|g(F_X^{-1}(u_n)) - g_{N_Y}(F_X^{-1}(u_n))\right| = \left|\frac{1}{N_Y!}\left.\frac{\partial^{N_Y} g(x)}{\partial x^{N_Y}}\right|_{x=\xi}\prod_{i=1}^{N_Y}(F_X^{-1}(u_n) - \bar{x}_i)\right|.$$
A deeper analysis can be found in the original paper of SCMC [65], including a theo-
retical convergence in L 2 .
3. MULTIPLE TIME-STEP MONTE CARLO SIMULATION OF THE SABR MODEL
We have chosen X to be standard normal, and Y = log Ỹ | log σ(s) can be seen as the
logarithm of a sum of log-normal variables. Therefore, it is a “close-to-normal” distribution which can be efficiently approximated by a linear combination of standard normal distributions. In order to measure the error when SCMC is applied to the sampling
of ∫_s^t σ²(z)dz | σ(s) (see Section 3.3.2), we can consider the difference between the functions g(x) = F^{-1}_{log Ỹ | log σ(s)}(F_X(x)) and g_{N_Ỹ,N_σ}(x) in Equation (3.14). By following the same
derivation as in Equation (3.15), we have
$$\epsilon_S := \left|g(\xi_n) - g_{N_{\tilde{Y}},N_\sigma}(\xi_n)\right| = \left|\frac{1}{N_{\tilde{Y}}!}\left.\frac{\partial^{N_{\tilde{Y}}} g(x)}{\partial x^{N_{\tilde{Y}}}}\right|_{x=\xi}\prod_{i=1}^{N_{\tilde{Y}}}(\xi_n - \bar{x}_i)\prod_{j=1}^{N_\sigma}(v_n - \bar{v}_j)\right|,$$
where NỸ and Nσ are the number of collocation points in each dimension and v̄ j are the
collocation points corresponding to the conditional random variable log σ(s).
3.3.4. COPULA-BASED SIMULATION OF ∫_s^t σ²(z)dz | σ(t), σ(s)
In this section we present the algorithm for the simulation of the time-integrated vari-
ance given σ(t ) and σ(s) by means of a copula. In order to obtain the joint distribution,
we require the marginal distributions. In our case, the required CDFs are F log Y | log σ(s)
and F_{log σ(t) | log σ(s)}. In Section 3.3.1, an approximate CDF of log Y | log σ(s), F_{log Ỹ | log σ(s)},
has been derived. Since σ(t) follows a log-normal distribution, by definition, log σ(t) is
normally distributed, and the conditional process log σ(t) given log σ(s) follows a normal distribution,
$$\log\sigma(t)\,|\log\sigma(s) \sim \mathcal{N}\left(\log\sigma(s) - \frac{1}{2}\alpha^2(t-s),\; \alpha\sqrt{t-s}\right). \qquad (3.16)$$
where the logarithm and the integral are interchanged. Since the log function is concave,
this approximation forms a lower bound for the true value. This can be seen by applying
Jensen’s inequality, i.e.
$$\log\int_s^t \sigma^2(z)\,\mathrm{d}z \geq \int_s^t \log\sigma^2(z)\,\mathrm{d}z.$$
3.3. COMPONENTS OF THE MSABR METHOD
It has been numerically shown that, under the model settings with an interval size t − s
less than two years, the results based on this approximation are highly satisfactory (fur-
ther details in Chapter 2, Section 2.4.2). The correlation coefficient can then be approxi-
mated by
$$P_{\log Y(T),\,\log\sigma(t)} \approx \frac{\mathrm{cov}\left[\int_s^t \log\sigma(z)\,\mathrm{d}z,\, \log\sigma(t)\right]}{\sqrt{\mathrm{Var}\left[\int_s^t \log\sigma(z)\,\mathrm{d}z\right]\mathrm{Var}\left[\log\sigma(t)\right]}}.$$

In order to compute the covariance, we need E[(∫_s^t log σ(z)dz) log σ(t)], E[∫_s^t log σ(z)dz]
and E[log σ(t)]. From Equation (3.4), the last expectation as well as Var[log σ(t)] follow from

$$\log\sigma(t) = \log\sigma_0 - \frac{1}{2}\alpha^2 t + \alpha W(t) =: \eta(t) + \alpha W(t),$$
and

$$\int_s^t \log\sigma(z)\,\mathrm{d}z = \log\sigma_0\,(t-s) - \frac{1}{4}\alpha^2(t^2 - s^2) + \alpha\int_s^t W(z)\,\mathrm{d}z =: \gamma(s,t) + \alpha\left(tW(t) - sW(s) - \int_s^t z\,\mathrm{d}W(z)\right).$$
For the variance of ∫_s^t log σ(z)dz, we first compute

$$\mathbb{E}\left[\left(\int_s^t \log\sigma(z)\,\mathrm{d}z\right)^2\right] = \mathbb{E}\left[\left(\gamma(s,t) + \alpha\left(tW(t) - sW(s) - \int_s^t z\,\mathrm{d}W(z)\right)\right)^2\right]$$
$$= \gamma^2(s,t) + \alpha^2\,\mathbb{E}\left[(tW(t) - sW(s))^2\right] + \alpha^2\,\mathbb{E}\left[\left(\int_s^t z\,\mathrm{d}W(z)\right)^2\right] - 2\alpha^2\,\mathbb{E}\left[(tW(t) - sW(s))\int_s^t z\,\mathrm{d}W(z)\right],$$
where

$$\mathbb{E}\left[(tW(t) - sW(s))^2\right] = t^2\,\mathbb{E}\left[(W(t))^2\right] + s^2\,\mathbb{E}\left[(W(s))^2\right] - 2st\,\mathbb{E}\left[W(s)W(t)\right] = t^3 + s^3 - 2s^2 t,$$

$$\mathbb{E}\left[\left(\int_s^t z\,\mathrm{d}W(z)\right)^2\right] = \int_s^t z^2\,\mathrm{d}z = \frac{1}{3}(t^3 - s^3),$$

$$\mathbb{E}\left[(tW(t) - sW(s))\int_s^t z\,\mathrm{d}W(z)\right] = t\,\mathbb{E}\left[W(t)\int_s^t z\,\mathrm{d}W(z)\right] - s\,\mathbb{E}\left[W(s)\int_s^t z\,\mathrm{d}W(z)\right]$$
$$= t\,\mathbb{E}\left[\int_s^t \mathrm{d}W(z)\int_s^t z\,\mathrm{d}W(z)\right] - s\,\mathbb{E}\left[\int_0^s \mathrm{d}W(z)\int_s^t z\,\mathrm{d}W(z)\right] = \frac{1}{2}t(t^2 - s^2),$$
and the variance reads

$$\mathrm{Var}\left[\int_s^t \log\sigma(z)\,\mathrm{d}z\right] = \mathbb{E}\left[\left(\int_s^t \log\sigma(z)\,\mathrm{d}z\right)^2\right] - \left(\mathbb{E}\left[\int_s^t \log\sigma(z)\,\mathrm{d}z\right]\right)^2$$
$$= \alpha^2\left(t^3 + s^3 - 2s^2 t + \frac{1}{3}(t^3 - s^3) - t(t^2 - s^2)\right) = \alpha^2\left(\frac{1}{3}t^3 + \frac{2}{3}s^3 - s^2 t\right).$$
An approximation of the Pearson correlation coefficient is then obtained as

$$P_{\log Y,\,\log\sigma(t)} \approx \frac{\eta(t)\gamma(s,t) + \frac{1}{2}\alpha^2(t^2 - s^2) - \eta(t)\gamma(s,t)}{\sqrt{\alpha^2\left(\frac{1}{3}t^3 + \frac{2}{3}s^3 - s^2 t\right)\alpha^2 t}} = \frac{t^2 - s^2}{2\sqrt{\frac{1}{3}t^4 + \frac{2}{3}t s^3 - t^2 s^2}}. \qquad (3.17)$$
Some copulas do not employ, in their definition, Pearson's coefficient as the correlation measure. For example, Archimedean copulas employ the Kendall's τ rank correlation coefficient. However, a relation between both Pearson's and Kendall's τ coefficients
is available,

$$P = \sin\left(\frac{\pi}{2}\tau\right).$$
GUMBEL COPULA
In this chapter, we will use the Archimedean Gumbel copula. The Gaussian copula may
seem a natural choice, as σ(t) follows a log-normal distribution and Y(T) = ∫_s^t σ²(z)dz
can be seen as a summation of squared log-normal processes. However, in Section 2.5.1,
we have empirically shown that the Gumbel copula performs most accurately in this
context for a variety of SABR parameter values. The formal definition of the Archimedean
Gumbel copula, considering F_1, ..., F_d ∈ [0,1]^d as the marginal distributions, reads

$$C_\theta(F_1, \ldots, F_d) = \exp\left(-\left(\sum_{i=1}^d \left(-\log(F_i)\right)^\theta\right)^{1/\theta}\right), \qquad (3.18)$$
where (−log(·))^θ, θ ≥ 1, is the so-called generator function of the Gumbel copula. In order to calibrate the parameter θ, a relation between θ and the rank correlation coefficient
Kendall's τ can be employed, i.e. θ = 1/(1 − τ).
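Putting Equation (3.17) and the two relations above together gives a direct calibration of θ from the interval [s, t]. A small sketch (the function names are ours):

```python
import math

def pearson_rho(s, t):
    """Approximate Pearson coefficient of (log Y, log sigma(t)), Equation (3.17)."""
    return (t * t - s * s) / (2.0 * math.sqrt(t ** 4 / 3.0
                                              + 2.0 * t * s ** 3 / 3.0
                                              - t ** 2 * s ** 2))

def gumbel_theta(s, t):
    """Calibrate the Gumbel parameter: Pearson rho -> Kendall tau -> theta."""
    tau = (2.0 / math.pi) * math.asin(pearson_rho(s, t))  # invert P = sin(pi*tau/2)
    return 1.0 / (1.0 - tau)
```

For instance, with s = 0 and t = 1 the formula gives ρ = √3/2, hence τ = 2/3 and θ = 3.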
So, we have derived approximations for all the components required to apply the
copula-based technique for the time-integrated variance simulation. The algorithm to
sample ∫_s^t σ²(z)dz given σ(t) and σ(s) then consists of the following steps:
4. Generate correlated uniforms, U_{log σ(t)|log σ(s)} and U_{log Ỹ|log σ(s)}, from the Gumbel
copula by Equation (3.18).
5. From U_{log σ(t)|log σ(s)}, invert F_{log σ(t)|log σ(s)} to get the samples σ̄_n of log σ(t)|log σ(s).
This procedure is straightforward since the normal distribution inversion is analytically available.
6. From U_{log Ỹ|log σ(s)}, invert F_{log Ỹ|log σ(s)} to get the samples ȳ_n of log Ỹ|log σ(s). We
propose an inversion procedure based on linear interpolation. First we evaluate
the CDF at some discrete points. The insight is that, by rotating the CDF under
consideration, we can interpolate over probabilities. This is possible when the
CDF is a smoothly increasing function. The interpolation polynomial provides the
quantiles of the original distribution from some given probabilities. Since F_{log Ỹ|log σ(s)}
is indeed a smooth and increasing function, the interpolation-based inversion is
applicable. This procedure, together with the use of the 2D SCMC sampler (see
Section 3.3.2), results in a fast and accurate inversion.
7. The samples σ_n of σ(t)|σ(s) and y_n of Y(s,t) = ∫_s^t σ²(z)dz | σ(t), σ(s) are obtained
by simply taking exponentials, i.e.

$$\sigma_n = \exp(\bar{\sigma}_n), \qquad y_n = \exp(\bar{y}_n).$$
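The Gumbel-copula step itself (step 4) can be sketched with the Marshall-Olkin algorithm, which is not described in the chapter but is a standard way to sample Archimedean copulas; all names are ours, and the positive-stable variable is drawn with the Kanter (Chambers-Mallows-Stuck-type) representation:

```python
import math
import random

def gumbel_copula_pair(theta, rng):
    """One sample (U1, U2) from the Gumbel copula, theta >= 1.
    Marshall-Olkin: draw a positive stable S with Laplace transform
    exp(-t^(1/theta)) (Kanter's representation), then set
    U_i = exp(-(E_i / S)^(1/theta)) with E_i ~ Exp(1)."""
    alpha = 1.0 / theta
    Th = rng.uniform(1e-9, math.pi - 1e-9)   # avoid the endpoints of (0, pi)
    W = rng.expovariate(1.0)
    S = (math.sin(alpha * Th) / math.sin(Th) ** (1.0 / alpha)) \
        * (math.sin((1.0 - alpha) * Th) / W) ** ((1.0 - alpha) / alpha)
    return tuple(math.exp(-(rng.expovariate(1.0) / S) ** alpha) for _ in range(2))
```

With θ = 1/(1 − τ), the pair has Kendall's τ as its rank correlation; the two correlated uniforms then feed the inversions in steps 5 and 6.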
3.3.5. SIMULATION OF S(t) GIVEN S(s), σ(s), σ(t) AND ∫_s^t σ²(z)dz
Once the samples from the time-integrated variance are obtained, the simulation of the
conditional asset dynamics represents the final step of the mSABR method. For that, we
follow the same approach as in Section 2.4.4 of the previous chapter, distinguishing the
cases β = 0, β = 1, and β ≠ 0, β ≠ 1.
However, as the mSABR method is a multiple time-step scheme, a generic interval
[s, t ] (instead of the fixed interval [0, T ]) is employed and some extra considerations must
be taken into account.
             S0      σ0     α      β      ρ      T
Set I [65]   0.5     0.5    0.4    0.5    0.0    4
Set II [21]  0.04    0.2    0.3    1.0   −0.5    5
Set III [4]  1.0     0.25   0.3    0.6   −0.5   20
Set IV [5]   0.0056  0.011  1.080  0.167  0.999  1

When β ≠ 1, i.e. when the S(t) process does not follow log-normal dynamics, negative
asset prices can be obtained. This contradicts one of the main assumptions for the validity of the CDF expression for the ‘exact’ SABR simulation in Equation (3.3). Therefore,
in the case of β ≠ 1, the simulation of S(t) must be corrected to be consistent within the
whole simulation procedure. If the price at the initial time, s, is less than or equal to zero, i.e.
S(s) ≤ 0, the price S(t) must remain zero for all time. Otherwise, the absorbing boundary
at zero needs to be prescribed to avoid negative prices as
If negative values are permitted, this must be handled in a different way, as has
been recently studied in [68] or [5], for example.
As was also mentioned in Section 2.4.4, a loss of the martingale property can appear
when a continuous process is approximated by its discrete equivalent, especially if the
discretization steps are large. In the case of the SABR model, this issue appears for
small β and/or S_0 values and high α and/or σ_0. Note however that, in most
situations appearing in practice, this martingale error is very small and can easily be
controlled by increasing the number of time-steps.
Since our goal is to use as few time-steps as possible, the application of a martingale
correction becomes again essential and needs to be considered in each time-step of the
simulation. We use the same martingale correction as proposed in Chapter 2.
Strikes     K1       K2       K3       K4       K5       K6       K7
Antonov     73.34%   71.73%   70.17%   N/A      67.23%   65.87%   64.59%
m = T/4     73.13%   71.75%   70.41%   69.11%   67.85%   66.64%   65.48%
Error(bp)  −21.51     2.54    24.38    N/A      61.71    76.66    89.26
m = T/2     73.30%   71.78%   70.29%   68.86%   67.49%   66.17%   64.93%
Error(bp)   −4.12     4.94    12.71    N/A      25.48    30.40    34.73
m = T       73.25%   71.67%   70.14%   68.66%   67.24%   65.89%   64.62%
Error(bp)   −9.56    −5.93    −2.79    N/A       0.92     2.21     3.17
m = 2T      73.32%   71.71%   70.16%   68.65%   67.22%   65.85%   64.55%
Error(bp)   −2.08    −1.56    −1.20    N/A      −1.65    −2.35    −3.36
m = 4T      73.34%   71.73%   70.18%   68.67%   67.24%   65.87%   64.58%
Error(bp)    0.15     0.58     0.78    N/A       0.43     0.04    −0.48
The experiments were performed on a computer with CPU Intel Core i7-4720HQ 2.6 GHz
and RAM memory of 16 Gigabytes. The employed software package was Matlab r2016b.
As usual, the European option prices are obtained by averaging the payoffs at maturity,
i.e. max(S(T) − K_i, 0) for a call option. Subsequently, the Black-Scholes formula is
inverted to determine the implied volatilities. Note that, once the forward asset is
simulated by the mSABR method, we can price options at multiple strikes simultaneously.
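The inversion of the Black-Scholes formula mentioned above can be done robustly by bisection, since the price is monotone in the volatility. A minimal sketch for an undiscounted call on the forward (the function names are ours):

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF

def black_call(F, K, T, vol):
    """Undiscounted Black (forward) price of a European call."""
    if vol <= 0.0:
        return max(F - K, 0.0)
    d1 = (math.log(F / K) + 0.5 * vol * vol * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return F * Phi(d1) - K * Phi(d2)

def implied_vol(price, F, K, T, lo=1e-8, hi=5.0, tol=1e-10):
    """Invert the Black formula for the volatility by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(F, K, T, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to each averaged Monte Carlo price yields the implied-volatility curves reported in the tables.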
Strikes   K1         K2         K3         K4         K5         K6         K7
Antonov   73.34%     71.73%     70.17%     N/A        67.23%     65.87%     64.59%
n = 10²   67.29%     65.55%     63.84%     62.20%     60.63%     59.01%     57.65%
RE        8.24×10⁻²  8.61×10⁻²  9.01×10⁻²  N/A        9.82×10⁻²  1.04×10⁻¹  1.07×10⁻¹
n = 10⁴   73.41%     71.87%     70.36%     68.91%     67.51%     66.19%     64.94%
RE        9.65×10⁻⁴  1.94×10⁻³  2.75×10⁻³  N/A        4.08×10⁻³  4.93×10⁻³  5.48×10⁻³
n = 10⁶   73.34%     71.73%     70.18%     68.67%     67.24%     65.87%     64.58%
RE        2.04×10⁻⁵  8.08×10⁻⁵  1.11×10⁻⁴  N/A        6.39×10⁻⁵  6.07×10⁻⁶  7.43×10⁻⁵
(Figure 3.2: implied volatility against the number of samples n. (a) Strike K2. (b) Strike K6.)
presented. By the relative error (RE), we can see that, as expected, the error is approximately reduced by a factor of 1/√n. Furthermore, we wish to show how the number
of samples affects the variance of the mSABR method. In Figure 3.2, the standard deviations for several choices of n are presented. Two strikes are evaluated, one in-the-money
strike, K2, and one out-of-the-money strike, K6. An identical behaviour can be observed
for all other strikes as well. Again, we see a fast convergence and variance reduction in
terms of n.
pare the performance with other, less involved, techniques. We therefore also present a
comparison between the mSABR method and two techniques where the time-integrated
variance is either approximated by ∫_s^t σ²(z)dz ≈ σ²(s)(t − s) (left rectangular rule) or
∫_s^t σ²(z)dz ≈ ½(σ²(t) + σ²(s))(t − s) (trapezoidal rule). We denote these alternatives as
the Y-Euler and Y-trpz schemes, respectively. The simulation of S(t) is, in both cases,
carried out as explained in Section 3.3.5. In Table 3.5, we present execution times (in seconds) of the MC Euler, Y-Euler, Y-trpz and mSABR schemes, to achieve a certain level of
accuracy (measured as the error of the implied volatilities in basis points). In addition,
the required number of time-steps is presented in parentheses to provide insight into
the efficiency. Furthermore, the speedup obtained by our method is included in Table
3.6. We observe that the mSABR method is superior in terms of the ratio between computational cost and accuracy. The accuracy of Y-Euler or Y-trpz can be improved by
adding intermediate time-steps, but then the computational cost also increases.
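The difference between the two simple rules is easy to reproduce on a single exactly-simulated volatility path. A small sketch (the parameter values and names are ours, chosen for illustration):

```python
import math
import random

def sigma_path(sigma0, alpha, t_grid, rng):
    """Exact simulation of the log-normal SABR volatility on a time grid."""
    path = [sigma0]
    for i in range(len(t_grid) - 1):
        dt = t_grid[i + 1] - t_grid[i]
        z = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp(-0.5 * alpha ** 2 * dt
                                        + alpha * math.sqrt(dt) * z))
    return path

rng = random.Random(3)
s, t, n_fine = 0.0, 1.0, 2000
grid = [s + (t - s) * i / n_fine for i in range(n_fine + 1)]
sig = sigma_path(0.5, 0.4, grid, rng)
dt = (t - s) / n_fine
y_ref = sum(x * x * dt for x in sig[:-1])              # fine Riemann sum of int sigma^2
y_euler = sig[0] ** 2 * (t - s)                        # left rectangle: the Y-Euler rule
y_trpz = 0.5 * (sig[0] ** 2 + sig[-1] ** 2) * (t - s)  # trapezoid: the Y-trpz rule
```

Both one-step rules use only the endpoint volatilities, which is why their accuracy can only be improved by adding intermediate time-steps, at extra cost.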
Strikes    K1       K2       K3       K4       K5       K6       K7
ρ = −0.5
MC         22.17%   21.25%   20.38%   19.57%   18.88%   18.33%   17.95%
mSABR      22.21%   21.28%   20.39%   19.58%   18.88%   18.32%   17.94%
Error(bp)   3.59     2.86     1.78     0.95    −0.19    −0.96    −1.10
ρ = 0.0
MC         21.35%   20.96%   20.71%   20.63%   20.71%   20.96%   21.34%
mSABR      21.35%   20.95%   20.69%   20.60%   20.68%   20.93%   21.32%
Error(bp)   0.04    −1.04    −2.51    −3.02    −3.33    −3.19    −2.56
ρ = 0.5
MC         19.66%   20.04%   20.61%   21.34%   22.20%   23.14%   24.16%
mSABR      19.59%   19.96%   20.54%   21.28%   22.15%   23.11%   24.11%
Error(bp)  −6.93    −7.36    −6.77    −5.53    −4.35    −3.76    −4.05
Table 3.7: Implied volatility, varying ρ: Euler Monte Carlo vs. mSABR. Set II.
option reads

$$V_i(T, S, B) = \exp(-rT)\,\mathbb{E}\left[(S(T) - K_i)\,\mathbb{1}\!\left(\max_{0 < t_k \le T} S(t_k) < B\right)\right],$$

where 1(·) represents the indicator function and the t_k are the predefined times where the
barrier condition (whether or not the barrier is hit) is checked.
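A discretely monitored up-and-out call of this form is straightforward to price by plain Monte Carlo. The sketch below uses GBM asset dynamics as a simple stand-in for the SABR dynamics of the chapter; the function name and parameters are ours:

```python
import math
import random

def barrier_call_mc(S0, K, B, r, vol, T, m, n, seed=11):
    """Up-and-out call, barrier monitored at m equidistant dates, priced by
    plain GBM Monte Carlo (a simple stand-in for the SABR asset dynamics)."""
    rng = random.Random(seed)
    dt = T / m
    total = 0.0
    for _ in range(n):
        S, knocked = S0, False
        for _ in range(m):
            z = rng.gauss(0.0, 1.0)
            S *= math.exp((r - 0.5 * vol ** 2) * dt + vol * math.sqrt(dt) * z)
            if S > B:          # the indicator in the payoff: knocked out above B
                knocked = True
        if not knocked:
            total += max(S - K, 0.0)
    return math.exp(-r * T) * total / n
```

As in the tables that follow, raising the barrier level B lets more paths survive, so the price increases with B.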
As a reference, a Monte Carlo method with very fine Euler-Maruyama time discretiza-
tion is employed, as in the previous experiment. The number of time-steps for the mSABR
scheme is again set to m = 4T . In Tables 3.8, 3.9 and 3.10, the obtained prices are pre-
sented for the parameter sets I, II and III, respectively. The prices are multiplied by a
factor of 100. Also, the MSE is shown. We define the MSE as
$$\mathrm{MSE} = \frac{1}{7}\sum_{i=1}^{7}\left(V_i^{MC}(T,S,B) - V_i^{mSABR}(T,S,B)\right)^2,$$

where V_i^{MC}(T,S,B) and V_i^{mSABR}(T,S,B) are the barrier option prices provided by the
Monte Carlo method and by the mSABR method, respectively.
The resulting accuracy is very satisfactory. As expected, higher option prices are ob-
tained for bigger barrier levels, B , and/or lower strikes, K i .
Strikes K1 K2 K3 K4 K5 K6 K7
B = 1.2
MC 6.2868 5.4577 4.6330 3.8263 3.0551 2.3358 1.6869
mSABR 6.3264 5.4890 4.6566 3.8436 3.0677 2.3452 1.6953
MSE 5.3196 × 10−8
B = 1.5
MC 10.2271 9.1568 8.0718 6.9856 5.9142 4.8753 3.8899
mSABR 10.2249 9.1507 8.0609 6.9702 5.8960 4.8572 3.8737
MSE 1.8884 × 10−8
B = 1.8
MC 13.5740 12.3571 11.1118 9.8513 8.5910 7.3477 6.1403
mSABR 13.5915 12.3686 11.1174 9.8507 8.5851 7.3380 6.1276
MSE 1.0851 × 10−8
Table 3.8: Pricing barrier options with mSABR: Vi (T, S, B ) × 100. Set I.
Strikes K1 K2 K3 K4 K5 K6 K7
B = 0.08
MC 1.1702 0.9465 0.7268 0.5215 0.3423 0.1996 0.0987
mSABR 1.1724 0.9486 0.7285 0.5226 0.3428 0.1997 0.0986
MSE 1.8910 × 10−10
B = 0.1
MC 1.3099 1.0766 0.8462 0.6290 0.4367 0.2794 0.1626
mSABR 1.3092 1.0761 0.8456 0.6282 0.4355 0.2782 0.1618
MSE 7.5542 × 10−11
B = 0.12
MC 1.3521 1.1168 0.8841 0.6644 0.4695 0.3093 0.1891
mSABR 1.3518 1.1166 0.8838 0.6639 0.4686 0.3080 0.1880
MSE 6.3648 × 10−11
Table 3.9: Pricing barrier options with mSABR: Vi (T, S, B ) × 100. Set II.
Strikes K1 K2 K3 K4 K5 K6 K7
B = 2.0
MC 29.1174 23.4804 17.2273 10.7825 5.0203 1.1750 0.0036
mSABR 29.2346 23.5828 17.3086 10.8327 5.0385 1.1805 0.0036
MSE 4.8146 × 10−7
B = 2.5
MC 41.3833 34.5497 26.8311 18.6089 10.7281 4.4893 0.9434
mSABR 41.3394 34.5097 26.7948 18.5747 10.6943 4.4546 0.9320
MSE 1.2131 × 10−7
B = 3.0
MC 48.5254 41.1652 32.7980 23.7807 14.9344 7.5364 2.6692
mSABR 48.5008 41.1515 32.7888 23.7655 14.9097 7.5117 2.6549
MSE 3.6201 × 10−8
Table 3.10: Pricing barrier options with mSABR: Vi (T, S, B ) × 100. Set III.
where θ > 0 is a displacement, or shift, in the underlying. The volatility process, σ(t),
remains invariant (see Equation (3.1)). The parameter θ can be seen as the maximum level of
negativity that is expected in the rates. This generalization of the SABR model is widely
used by market practitioners due to its simplicity and, in particular, the advantage of
keeping the existing simulation schemes. In mSABR, the assumption S(t) > 0 is required to
employ the method.
In order to test the mSABR scheme in the context of negative interest rates, we employ set IV. The SABR model parameters were obtained by Antonov et al. [5] after a
calibration to real data⁵ under the shifted SABR model. Note that this configuration corresponds to an extreme situation, with high values of the correlation, ρ, and
the vol-vol, α. As the shift parameter, θ = 0.02 is chosen. We compare the normal implied
volatilities⁶ obtained by employing Monte Carlo with a very fine (1000T) time discretization (as a reference) and the mSABR method with m = 4T. In Figure 3.3(a), these curves
are depicted and a perfect fit is observed. The strikes, K_i, are chosen to permit negative
values. We perform an even more extreme experiment by setting S_0 = 0. The resulting
normal implied volatilities are presented in Figure 3.3(b), again with an excellent fit.
Due to the complexity of the negative interest rates experiment and this particular set
of parameters (ρ ≈ 1), we wish to test the performance of our method under these con-
ditions. In Tables 3.11 and 3.12 the execution times obtained are presented. Again, the
small number of time-steps due to the “almost” exact simulation of the mSABR method
provides an important gain in terms of performance.
⁵ 1Y15Y Swiss franc swaption from February 10, 2015.
⁶ Determined by using the normal Bachelier model instead of the log-normal Black-Scholes model.

Figure 3.3: The mSABR method in the context of negative interest rates: normal implied volatilities against the strike K (in bp). (a) Set IV. (b) Set IV with S_0 = 0.

3.5. CONCLUSIONS
In this chapter, we have proposed an accurate and robust multiple time-step Monte
Carlo method to simulate the SABR model with only a few time-steps. The mSABR
method employs several nontrivial components that have not been seen before in combination within a Monte Carlo simulation. A copula methodology to generate samples from
the conditional SABR’s time-integrated variance process has been used. The marginal
distribution of the time-integrated variance has been derived by employing Fourier tech-
niques. We have dealt with an increasing computational complexity, due to the multi-
ple time-steps and the almost exact simulation, by applying an interpolation based on
stochastic collocation. This has resulted in an efficient SABR simulation scheme. We
have numerically shown the convergence and the stability of the method. In terms of
convergence, the mSABR method requires only very few time-steps to achieve high accuracy, due to the focus on almost exact simulation (in contrast to the low-bias SABR
scheme [21]). This impacts the performance, yielding an excellent ratio between
accuracy and computational cost. As a multiple time-step method, it can be employed
to price path-dependent options. As an example, a pricing experiment for barrier op-
tions has been carried out. Furthermore, we have shown that the mSABR scheme is also
suitable in the context of negative interest rates, in combination with a shifted model.
In this chapter, we consider a completely different hybrid solution method, which relies on
a combination of Monte Carlo methods and Fourier inversion techniques, while employing
a data-based approach. We therefore present the data-driven COS method, ddCOS. It is a
financial option valuation method which assumes the availability of asset data samples:
a ChF of the underlying asset is not required. As such, the method represents a generalization of the well-known COS method [49]. Convergence with respect to the number of
asset samples follows the convergence of Monte Carlo methods for pricing financial
derivatives. The ddCOS method is particularly interesting for PDF recovery and also for
the efficient computation of the option sensitivities Delta and Gamma. These are often
used in risk management, and can be obtained at higher accuracy with ddCOS than
with plain Monte Carlo methods.
4.1. I NTRODUCTION
In quantitative finance, statistical distributions are commonly used for the valuation of
financial derivatives and within risk management. The underlying assets are often mod-
elled by means of SDEs. Except for the classical and most simple asset models, the cor-
responding PDF and CDF are typically not known and need to be approximated.
In order to compute derivative prices, and to approximate statistical distributions,
Fourier-based methods are often-used numerical techniques. They are based on the
connection between the PDF and the ChF, which is the Fourier transform of the PDF. The
ChF is often available, sometimes even in closed form, for the broad class of regular diffusions and also for Lévy processes. Some representative efficient Fourier pricing
methods include those by Carr and Madan [19], Boyarchenko and Levendorskii [13],
Lewis [92] and Fang and Oosterlee [49]. Here, we focus on the COS method from [49],
which is based on an approximation of the PDF by means of a cosine series expansion.
Still, however, the class of asset dynamics for which the ChF is known is not exhaustive,
and for many relevant asset price processes we do not have such information to recover
the PDF. In recent years several successful attempts have been made to employ Fourier
pricing methods without the explicit knowledge of the ChF. In Grzelak and Oosterlee [64],
for example, a hybrid model with stochastic volatility and stochastic interest rate was
linearized by means of expectation operators to cast the approximate system of SDEs in
the framework of affine diffusions. Ruijter and Oosterlee [111] discretized the governing
asset SDEs first and then worked with the ChF of the discrete asset process, within the
framework of the COS method. Borovykh et al. [12] used the Taylor expansion to derive
This chapter is based on the article “On the data-driven COS method”. Submitted for publication, 2017 [91].
4. THE DATA-DRIVEN COS METHOD
a ChF for which they could even price Bermudan options highly efficiently. In this work,
we extend the applicability of the COS method to the situation where only data (like
samples from an unknown underlying risk-neutral asset distribution) are available.
The density estimation problem, using a data-driven PDF, has been intensively stud-
ied in the last decades, particularly since it is a component in the machine learning
framework [10]. Basically, density estimators can be classified into parametric and non-
parametric estimators. The first type relies on the fact that prior knowledge is available
(like moments) to determine the relevant parameters, while for non-parametric estima-
tors the parameters need to be determined solely from the samples themselves. Within
this second type of estimators we can find histograms, kernel density and orthogonal
series estimators. A thorough description of these estimators is provided in [120]. More
recently, some applications in finance have also appeared, see [48, 97, 118], for example.
For the valuation of financial derivatives, we will combine density estimators with
Fourier-based methods, so orthogonal series form a natural basis. We will focus on the
framework of statistical learning, see [131]. In statistical learning, a regularization is em-
ployed to derive an expression for the data-driven empirical PDF. By representing the
unknown PDF as a cosine series expansion, a closed-form solution of the regularization
problem is known [131], which forms the basis of the data-driven COS method (ddCOS).
The use of the COS method gives us expressions for option prices and, in particular,
for the option sensitivities or Greeks. These option Greeks are the derivatives of option
price with respect to a variable or parameter. The efficient computation of the Greeks is
a challenging problem when only asset samples are available. Existing approaches are
based on Monte Carlo techniques, like finite differences (bump-and-revalue),
pathwise or likelihood ratio techniques, for which details can be found in [59], Chapter 7.
Several extensions and improvements of these approaches have appeared, for example,
based on adjoint formulations [56], the ChF [60, 61], Malliavin calculus [41, 53], algorith-
mic differentiation [17, 43] or combinations of these [18, 22, 58]. All in all, the computa-
tion of the Greeks can be quite involved. The ddCOS method is not directly superior to
Monte Carlo methods for option valuation, but it is competitive for the computation of
the corresponding sensitivities. We derive simple expressions for the Greeks Delta and
Gamma.
The importance of Delta and Gamma in dynamic hedging and risk management is
well-known. A useful application is found in the Delta-Gamma approach [14] to quantify
market risk. The approximation of risk measures like Value-at-Risk (VaR) and Expected
Shortfall (ES) under the Delta-Gamma approach is still nontrivial. Next to Monte Carlo
methods, Fourier techniques have been employed in this context, when the ChF of the
change in the value of the option portfolio is known (see [23, 121]). For example, the
COS method has been applied in [101] to efficiently compute the VaR and ES under the
Delta-Gamma approach. The ddCOS method may generalize the applicability to the
case where only data is available.
This chapter is organized as follows. The ddCOS method, and its origins in statistical
learning and Fourier-based option pricing, are presented in Section 4.2. Variance reduction techniques can also be used within the ddCOS method, providing an additional convergence improvement. We provide insight into, and determine values for, the method's open
parameters in Section 4.3. Numerical experiments, with a focus on the option Greeks,
4.2. THE DATA-DRIVEN COS METHOD
with r the risk-free rate, T the maturity time, f(y|x) the PDF of the underlying process, and h(T, y) the option value at maturity time, i.e. the payoff function.
Typically, x and y are chosen to be the scaled variables

$$x := \log\left(\frac{S(0)}{K}\right) \quad \text{and} \quad y := \log\left(\frac{S(T)}{K}\right),$$
where S(t ) is the underlying asset process at time t , and K is the strike price.
f(y|x) is unknown in most cases and in the COS method it is approximated, on a
finite interval [a, b], by a cosine series expansion, i.e.,

$$f(y|x) = \frac{1}{b-a}\left(A_0 + 2\sum_{k=1}^{\infty} A_k(x)\cos\left(k\pi\frac{y-a}{b-a}\right)\right),$$
$$A_0 = 1, \qquad A_k(x) = \int_a^b f(y|x)\cos\left(k\pi\frac{y-a}{b-a}\right)\mathrm{d}y, \quad k = 1, 2, \ldots.$$
By substituting this expression in Equation (4.1), interchanging the summation and
integration operators using Fubini's theorem, and introducing the definition

$$H_k := \frac{2}{b-a}\int_a^b h(T, y)\cos\left(k\pi\frac{y-a}{b-a}\right)\mathrm{d}y,$$

we find that the option value is given by

$$V(t, x) \approx e^{-r(T-t)}{\sum_{k=0}^{\infty}}' A_k(x)\,H_k, \qquad (4.2)$$
where the prime in Σ′ indicates that the first term of the summation is divided by two. So, the product of two real-valued
functions in Equation (4.1) is transformed into the product of their cosine expansion
coefficients, A_k and H_k. The coefficients A_k can be computed from the ChF, and H_k is known
analytically (for many types of options).
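As an illustration of Equation (4.2), the following sketch prices a European call under geometric Brownian motion, where the ChF of y = log(S(T)/K) is Gaussian. For simplicity, H_k is computed here by numerical quadrature of the payoff rather than with the analytic coefficients used in practice; all names and parameter defaults are ours:

```python
import cmath
import math

def cos_call_price(S0, K, r, vol, T, N=64, L=10.0, M=2000):
    """COS price of a European call under GBM. A_k comes from the Gaussian
    ChF of y = log(S(T)/K); H_k is computed by midpoint quadrature of the
    payoff K*(exp(y) - 1)^+ on [a, b]."""
    mu = math.log(S0 / K) + (r - 0.5 * vol ** 2) * T
    sd = vol * math.sqrt(T)
    a, b = mu - L * sd, mu + L * sd        # truncation range of the density
    h = (b - a) / M
    ys = [a + (j + 0.5) * h for j in range(M)]
    pay = [K * max(math.exp(y) - 1.0, 0.0) for y in ys]
    price = 0.0
    for k in range(N):
        u = k * math.pi / (b - a)
        # A_k = Re( phi(u) * exp(-i*u*a) ), with phi the Gaussian ChF of y
        A_k = (cmath.exp(1j * u * mu - 0.5 * u * u * sd * sd)
               * cmath.exp(-1j * u * a)).real
        H_k = 2.0 / (b - a) * h * sum(p * math.cos(u * (y - a))
                                      for p, y in zip(pay, ys))
        price += (0.5 if k == 0 else 1.0) * A_k * H_k   # the primed sum
    return math.exp(-r * T) * price
```

The result agrees with the Black-Scholes price to the quadrature tolerance, illustrating the rapid decay of the coefficient products A_k H_k.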
Closed-form expressions for the option Greeks can also be derived. From the COS
option value formula, Δ and Γ are obtained by

$$\Delta = \frac{\partial V(t,x)}{\partial S} = \frac{1}{S(0)}\frac{\partial V(t,x)}{\partial x} \approx e^{-r(T-t)}{\sum_{k=0}^{\infty}}'\frac{\partial A_k(x)}{\partial x}\frac{H_k}{S(0)},$$

$$\Gamma = \frac{\partial^2 V(t,x)}{\partial S^2} = \frac{1}{S^2(0)}\left(-\frac{\partial V(t,x)}{\partial x} + \frac{\partial^2 V(t,x)}{\partial x^2}\right) \approx e^{-r(T-t)}{\sum_{k=0}^{\infty}}'\left(-\frac{\partial A_k(x)}{\partial x} + \frac{\partial^2 A_k(x)}{\partial x^2}\right)\frac{H_k}{S^2(0)}. \qquad (4.3)$$

Due to the rapid decay of the coefficients, V(t,x), Δ and Γ can be approximated with
high accuracy by truncating the infinite summation in Equations (4.2) and (4.3) to N
terms. Under suitable assumptions, exponential convergence is proved and numerically
observed.
The empirical CDF built from the n available samples X_j reads

$$F_n(x) = \frac{1}{n}\sum_{j=1}^{n}\eta(x - X_j), \qquad (4.5)$$

where η(·) is a step function. This approximation converges to the “true CDF” with rate
O(1/√n). Rewriting Equation (4.4) as a linear operator equation gives us

$$Cf = F \approx F_n,$$

where the operator (Ch)(x) := ∫_{−∞}^x h(z)dz.
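The empirical CDF F_n is cheap to build and evaluate. A minimal sketch (the function names are ours):

```python
def ecdf(samples):
    """Return F_n as a callable: F_n(x) = (1/n) * #{X_j <= x}."""
    xs = sorted(samples)
    n = len(xs)
    def F(x):
        # binary search: first index whose sample exceeds x = count of X_j <= x
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi) // 2
            if xs[mid] <= x:
                lo = mid + 1
            else:
                hi = mid
        return lo / n
    return F
```

Each evaluation costs O(log n) after the one-off O(n log n) sort, so F_n can be queried densely when building the risk functional.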
As explained in [131], this operator equation represents an ill-posed problem, and a
risk functional should be constructed, with a regularization term, as follows:

$$R_{\gamma_n}(f, F_n) = L^2_H(Cf, F_n) + \gamma_n W(f), \qquad (4.6)$$

where L_H is a metric of the space H and γ_n > 0 is a parameter which gives a weight to the
regularization term W(f), with γ_n → 0 as n → ∞. The solution of Cf = F_n belongs to D,
the domain of definition of W(f). The functional W(f) takes real non-negative values in D.
$$n\gamma_n \to \infty \quad \text{and} \quad \frac{n}{\log n}\,\gamma_n \to \infty \quad \text{as } n \to \infty. \qquad (4.7)$$
Denoting by f̂(u), F̂_n(u) and K̂(u) the Fourier transforms of f(x), F_n(x) and K(x),
respectively, an expression for F̂_n(u) can be derived by applying the Fourier transform
to Equation (4.5),

$$\hat{F}_n(u) = \frac{1}{2\pi}\int_{\mathbb{R}} F_n(x)e^{-iux}\,\mathrm{d}x = \frac{1}{2n\pi}\int_{\mathbb{R}}\sum_{j=1}^{n}\eta(x - X_j)e^{-iux}\,\mathrm{d}x = \frac{1}{n}\sum_{j=1}^{n}\frac{\exp(-iuX_j)}{iu},$$

where i = √−1 is the imaginary unit.
By employing the convolution theorem and Parseval's identity, Equation (4.8) can be
rewritten as

$$R_{\gamma_n}(f, F_n) = \left\|\frac{\hat{f}(u) - \frac{1}{n}\sum_{j=1}^{n}\exp(-iuX_j)}{iu}\right\|^2_{L^2} + \gamma_n\left\|\hat{\mathcal{K}}(u)\hat{f}(u)\right\|^2_{L^2}.$$

The minimum of this functional is attained when

$$\frac{\hat{f}(u)}{u^2} - \frac{1}{nu^2}\sum_{j=1}^{n}\exp(-iuX_j) + \gamma_n\hat{\mathcal{K}}(u)\hat{\mathcal{K}}(-u)\hat{f}(u) = 0,$$
When the kernel K is the p-th derivative of the Dirac delta function, i.e., K(x) = δ^{(p)}(x),
and the desired PDF, f(x), belongs to the class of functions whose p-th derivative (p ≥ 0)
belongs to L²(0, π), the risk functional becomes

$$R_{\gamma_n}(f, F_n) = \int_0^\pi\left(\int_0^x f(y)\,\mathrm{d}y - F_n(x)\right)^2\mathrm{d}x + \gamma_n\int_0^\pi\left(f^{(p)}(x)\right)^2\mathrm{d}x. \qquad (4.10)$$
The solution of the regularization problem can then be represented as the series

$$f_n(\theta) = \frac{1}{\pi} + \frac{2}{\pi}\sum_{k=1}^{\infty}\tilde{A}_k\,\psi_k(\theta), \qquad (4.11)$$
Using cosine series expansions, i.e., ψ_k(θ) = cos(kθ), it is well-known that ψ̂_k(u) =
½(δ(u−k) + δ(u+k)). This facilitates the computation of the series coefficients, Ã_k, avoiding the calculation of the integral. Thus, the minimum of the functional using cosine
series expansions is obtained when

$$\tilde{A}_k = \frac{1}{2n}\left(\frac{1}{1 + \gamma_n(-k)^2\hat{\mathcal{K}}(-k)\hat{\mathcal{K}}(k)}\sum_{j=1}^{n}\exp(ik\theta_j) + \frac{1}{1 + \gamma_n k^2\hat{\mathcal{K}}(k)\hat{\mathcal{K}}(-k)}\sum_{j=1}^{n}\exp(-ik\theta_j)\right)$$
$$= \frac{1}{1 + \gamma_n k^2\hat{\mathcal{K}}(k)\hat{\mathcal{K}}(-k)}\,\frac{1}{n}\sum_{j=1}^{n}\cos(k\theta_j) = \frac{1}{1 + \gamma_n k^{2(p+1)}}\,\frac{1}{n}\sum_{j=1}^{n}\cos(k\theta_j), \qquad (4.12)$$

where θ_j ∈ (0, π) are the given samples of the unknown distribution. In the last step, K̂(u) =
(iu)^p is used.
Assuming that the samples are given, the solution contains two free parameters: reg-
ularization parameter γn , and smoothing parameter p.
In Section 4.3, we will discuss the impact of the regularization parameter on the con-
vergence to the PDF in terms of the number of samples. We will use p = 0 here.
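The resulting estimator, Equations (4.11)-(4.12), is only a few lines of code. The sketch below uses p = 0 and the heuristic choice γ_n = n^{-3/4}, which satisfies the conditions in (4.7) but is our assumption, not the chapter's calibrated value:

```python
import math

def ddcos_density(theta_samples, N=40, gamma=None, p=0):
    """Data-driven density on (0, pi), Equations (4.11)-(4.12):
    f_n(t) = 1/pi + (2/pi) * sum_k A_k cos(k t), with
    A_k = (1 / (1 + gamma_n * k^(2(p+1)))) * (1/n) * sum_j cos(k theta_j)."""
    n = len(theta_samples)
    if gamma is None:
        gamma = n ** -0.75   # heuristic: gamma_n -> 0 while n * gamma_n -> infinity
    A = []
    for k in range(1, N + 1):
        damp = 1.0 / (1.0 + gamma * k ** (2 * (p + 1)))
        A.append(damp * sum(math.cos(k * t) for t in theta_samples) / n)
    def f(t):
        return 1.0 / math.pi + (2.0 / math.pi) * sum(
            Ak * math.cos((k + 1) * t) for k, Ak in enumerate(A))
    return f
```

By construction the estimate integrates to one on (0, π), and the damping factor suppresses the high-frequency sample noise that an unregularized cosine estimator would amplify.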
(Figure: the regularized density estimate for p = 0, 1, 2, 3 against the true PDF.)
Given asset samples S_j(T), the scaled log-samples read

$$Y_j := \log\left(\frac{S_j(T)}{K}\right).$$
Before employing these samples in the regularization approach, and because the solution is defined on (0, π), we need to transform the samples by the following change of
variables,

$$\theta_j = \pi\,\frac{Y_j - a}{b - a},$$

where the boundaries a and b are defined as

$$a := \min_{1 \le j \le n}(Y_j), \qquad b := \max_{1 \le j \le n}(Y_j).$$
As in the original COS method, we must truncate the infinite sum in Equation (4.13) to a finite number of terms N, i.e.,

$$\tilde{V}(t, x) = e^{-r(T-t)} \, {\sum_{k=0}^{N}}' \, \tilde{A}_k H_k, \tag{4.14}$$
Taking derivatives in Equation (4.14) w.r.t. the samples, $Y_j$, and following the COS expression for the sensitivities in Equation (4.3), the data-driven Greeks, $\tilde{\Delta}$ and $\tilde{\Gamma}$, can be obtained by

$$\tilde{\Delta} = e^{-r(T-t)} \, {\sum_{k=0}^{N}}' \, \tilde{B}_k \cdot \left( -\frac{k\pi}{b-a} \right) \cdot \frac{H_k}{S(0)},$$

$$\tilde{\Gamma} = e^{-r(T-t)} \, {\sum_{k=0}^{N}}' \, \left( \tilde{B}_k \cdot \left( -\frac{k\pi}{b-a} \right) - \tilde{A}_k \cdot \left( \frac{k\pi}{b-a} \right)^2 \right) \cdot \frac{H_k}{S^2(0)}.$$
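Given precomputed coefficient arrays, the two summations translate directly into code. In the sketch below, the coefficients Ã_k and B̃_k and the payoff coefficients H_k are assumed to be available (their computation is not shown), and the dummy input values are placeholders that merely exercise the formulas:

```python
import numpy as np

def ddcos_delta_gamma(A, B, H, a, b, r, tau, S0):
    # Primed sums: the k = 0 term is weighted by 1/2, as in the COS method.
    N = len(A) - 1
    k = np.arange(N + 1)
    w = np.ones(N + 1)
    w[0] = 0.5
    u = k * np.pi / (b - a)
    disc = np.exp(-r * tau)
    delta = disc * np.sum(w * B * (-u) * H) / S0
    gamma = disc * np.sum(w * (B * (-u) - A * u ** 2) * H) / S0 ** 2
    return delta, gamma

# Dummy inputs, only to exercise the summation formulas:
A = np.array([0.5, 0.3, 0.1])
B = np.array([0.0, 0.2, 0.05])
H = np.array([1.0, 0.5, 0.25])
delta, gamma = ddcos_delta_gamma(A, B, H, a=-2.0, b=2.0, r=0.1, tau=2.0, S0=100.0)
print(delta, gamma)
```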
of AV is not possible. Therefore, if we assume that antithetic samples, $Y'_i$, to the original samples $Y_i$, can be computed without any serious computational effort, a new estimator for the coefficients can be defined as

$$\bar{A}_k := \frac{1}{2}\left( \tilde{A}_k + \tilde{A}'_k \right),$$

where we denote by $\tilde{A}'_k$ the corresponding "antithetic coefficients", obtained from the $Y'_i$. By a similar derivation as for the standard AV technique, it can be proved that the use of the coefficients $\bar{A}_k$ gives a variance reduction compared to using the $\tilde{A}_k$ coefficients. Other variance reduction techniques may also be considered for the ddCOS method under the assumption of i.i.d. samples.
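For instance, when the samples stem from a symmetric driver (a standard normal here, purely for illustration), the antithetic set is obtained by flipping the sign of the driver; mapping both sets onto (0, π) with a common (a, b) then allows averaging the two coefficient estimates:

```python
import numpy as np

def cos_coeffs(theta, N, gamma_n=1e-4, p=0):
    # Regularized sample cosine coefficients, as in Eq. (4.12)
    k = np.arange(1, N + 1)
    return np.cos(np.outer(k, theta)).mean(axis=1) / (1.0 + gamma_n * k ** (2 * (p + 1)))

rng = np.random.default_rng(2)
Z = rng.standard_normal(2_000)
Y, Y_av = Z, -Z                                  # antithetic pair of sample sets
a = min(Y.min(), Y_av.min())
b = max(Y.max(), Y_av.max())
th = np.pi * (Y - a) / (b - a)                   # common mapping onto (0, pi)
th_av = np.pi * (Y_av - a) / (b - a)
A_bar = 0.5 * (cos_coeffs(th, 40) + cos_coeffs(th_av, 40))  # averaged coefficients
print(A_bar[:4])
```

For this symmetric construction the two sample sets mirror each other around π/2, so the averaged odd-order coefficients cancel, which is consistent with the symmetry of the underlying density.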
In order to reduce the variance of any estimator, additional information may be introduced. A well-known property to fulfil is the martingale property. To preserve this property, a simple transformation of the samples can be made by

$$\bar{S}(T) = S(T) - \frac{1}{n}\sum_{j=1}^{n} S_j(T) + \mathbb{E}[S(T)] = S(T) - \frac{1}{n}\sum_{j=1}^{n} S_j(T) + S(0)\exp(rT).$$

As this modification is performed on the samples, it can also be used in the context of the ddCOS method.
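A sketch of this moment-matching shift for GBM samples (parameter values are placeholders):

```python
import numpy as np

rng = np.random.default_rng(3)
S0, r, sigma, T, n = 100.0, 0.1, 0.3, 2.0, 100_000
Z = rng.standard_normal(n)
S_T = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * Z)  # GBM samples

# Shift every sample so the empirical mean matches E[S(T)] = S(0) e^{rT} exactly.
S_T_adj = S_T - S_T.mean() + S0 * np.exp(r * T)
print(S_T_adj.mean() - S0 * np.exp(r * T))   # ~0, up to round-off
```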
of samples. For that purpose we can exploit the relation between the empirical CDF, $F_n(x)$, and the unknown CDF, $F(x)$. This relation can be modelled by means of different statistical laws. Some examples include the Kolmogorov-Smirnov, Anderson-Darling, Kuiper and Smirnov-Cramér-von Mises laws, by which a measure of the distance, or goodness-of-fit, between $F_n(x)$ and $F(x)$ can be defined.
We are interested in a statistic whose distribution is independent of the actual CDF and of the number of samples n, and consider the Smirnov-Cramér-von Mises (SCvM) statistic [32, 123], defined as

$$\omega^2 := n \int_{\mathbb{R}} \left( F(x) - F_n(x) \right)^2 \mathrm{d}F(x).$$
The regularization parameter is then chosen such that

$$\int_{\mathbb{R}} \left( F_{\gamma_n}(x) - F_n(x) \right)^2 \mathrm{d}F(x) = \frac{m_{\omega^2}}{n},$$
where $m_{\omega^2}$ is the mean of the SCvM statistic, $\omega^2$. In the one-dimensional case, a simplified expression can be derived [3, 123], i.e.,

$$\sum_{j=1}^{n} \left( F_{\gamma_n}(\bar{X}_j) - \frac{j - 0.5}{n} \right)^2 = m_{\omega^2} - \frac{1}{12n}, \tag{4.16}$$

where the $\bar{X}_j$ denote the samples sorted in increasing order.
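A possible numerical implementation of the SCvM rule (4.16) scans candidate values of γn and picks the one for which the left- and right-hand sides of (4.16) are closest. Below, the CDF implied by the cosine expansion (4.11) is obtained by term-wise integration, m_ω² = 1/6 is taken as the classical mean of the Cramér-von Mises statistic, and the Beta-distributed samples are placeholders:

```python
import numpy as np

def scvm_gap(gamma, theta_sorted, N=50, p=0):
    # LHS - RHS of Eq. (4.16), with F_gamma from the cosine expansion (4.11)
    n = len(theta_sorted)
    k = np.arange(1, N + 1)
    A = np.cos(np.outer(k, theta_sorted)).mean(axis=1) / (1 + gamma * k ** (2 * (p + 1)))
    # Term-wise integration of (4.11): F(theta) = theta/pi + (2/pi) sum_k A_k sin(k theta)/k
    F = theta_sorted / np.pi + (2 / np.pi) * ((A / k) @ np.sin(np.outer(k, theta_sorted)))
    lhs = np.sum((F - (np.arange(1, n + 1) - 0.5) / n) ** 2)
    return lhs - (1.0 / 6.0 - 1.0 / (12 * n))

rng = np.random.default_rng(4)
theta = np.sort(np.pi * rng.beta(2, 2, size=2_000))   # placeholder samples in (0, pi)
gammas = np.logspace(-8, 2, 41)
gaps = np.array([scvm_gap(g, theta) for g in gammas])
gamma_scvm = gammas[np.argmin(np.abs(gaps))]
print(gamma_scvm)
```

A simple grid scan is used here instead of a root finder, since the gap function is cheap to evaluate.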
In our case, the MISE can be decomposed into two terms (see [87]), as follows,

$$\mathbb{E}\left[ \int_0^{\pi} \left( f_n(x) - f(x) \right)^2 \mathrm{d}x \right] = \sum_{k=1}^{N} \mathrm{Var}\left[ \tilde{A}_k \right] + \sum_{k=N+1}^{\infty} A_k^2,$$

where the $A_k$ are the "true" coefficients from Equation (4.2), and the $\tilde{A}_k$ are from Equation (4.12). The MISE, as it is defined, is the summation of the variance and the (squared) bias of the estimator. In the Fourier cosine expansion context, an increasing N implies a smaller bias (but a bigger variance). The opposite also holds, i.e., a small N produces more bias and less variance.
4.3. C HOICE OF PARAMETERS IN DD COS M ETHOD 77
Working out the variance of the coefficients $\tilde{A}_k$, the MISE becomes

$$\mathrm{MISE} = \frac{1}{n} \sum_{k=1}^{N} \frac{\frac{1}{2} + \frac{1}{2}A_{2k} - A_k^2}{\left( 1 + \gamma_n k^{2(p+1)} \right)^2} + \sum_{k=N+1}^{\infty} A_k^2. \tag{4.17}$$
EXAMPLE
The error measure defined in Equation (4.17) is employed to analyse the influence of the
regularization. We use the standard normal distribution as a reference test case. The
coefficients that are based on the available analytic solution are replaced by the corre-
sponding data-driven coefficients that depend on γn .
In Figure 4.2(a), we present the convergence results for different regularization pa-
rameters. Next to the rules suggested by Equations (4.15) and (4.16), we also include
the case γn = 0 to highlight the benefits of employing the regularization approach. The
obtained results confirm the improvements provided by both γn -rules, with the almost
optimal γn given by the SCvM statistic.
A second aspect which is influenced by γn is the accuracy with respect to the number
of expansion terms N in Equation (4.14). For this, in Figure 4.2(b) we present the MISE
for the standard normal distribution when different coefficients A k are employed: A k by
the γn -rule (4.15) (red lines), A k by the SCvM (4.16) (blue lines) and, as a reference, the
A k -coefficients obtained by the ChF (black dashed line). We notice that, when γn = 0
is used in the MISE formula (all dashed lines), for increasing values of N , the approxi-
mation deteriorates, resulting in increasing approximation errors. In contrast, when the
corresponding γn is used (regular lines), the error stabilizes. Since the number of expan-
sion coefficients is typically chosen high, this property of the regularization approach is
useful.
Figure 4.2: Influence of γn on the convergence w.r.t. the number of samples n and the number of terms N.
OPTIMAL N-VALUES
As mentioned, with an increasing number of series expansion coefficients N , the ap-
proximation based on the regularization approach does not improve any further. This
fact indicates that we need to determine an optimal value of N , i.e. the smallest value
of N for which the MISE stabilizes, see Figure 4.2(b). Using small N -values is important
within the data-driven methodology, since parameter N considerably affects the perfor-
mance of the method.
We propose an empirical procedure to compute the optimal value of N . The MISE
in Equation (4.17) depends on the number of samples n, the number of coefficients N ,
and the coefficients themselves A k . Since we wish to compute no more coefficients than
necessary, we focus on the parameters n and N . Regarding the influence of the number
of samples, see also the curves in Figure 4.3(a), higher n-values also require higher N -
values to ensure stabilizing errors in the MISE curve. To determine a relation between n
and N , we need to simplify the MISE formula as we desire a closed-form expression. We
discard the second part in Equation (4.17), as it goes to zero when N increases. Within the first part, we approximate the quantity $\frac{1}{2} + \frac{1}{2}A_{2k} - A_k^2 \approx \frac{1}{2}$. Then, the approximate formula for the MISE is found to be
$$\mathrm{MISE} \approx \frac{1}{n} \sum_{k=1}^{N} \frac{\frac{1}{2}}{\left( 1 + \gamma_n k^{2(p+1)} \right)^2} =: \mathrm{MISE}_N,$$
where n and N are directly connected, but also γn appears. It is possible to prove that
the above MISE proxy, MISEN , is an upper bound for the first part in Equation (4.17).
From Figure 4.3(b), we observe two important facts: the MISE proxy provides a highly
satisfactory approximation for the first addend of the MISE, which converges to the MISE
when N increases. By combining these two observations, we will employ the MISE proxy
to determine the optimal number of terms N . Since the computation of γn by Equation
(4.16) involves N , we use the case where γn is determined by Equation (4.15) (which only
depends on n). Figure 4.3(b) shows that the MISEN (to a different level of accuracy) is
very similar in both cases, where the γn rule appears conservative, i.e. bigger N -values
(a) MISE for several values of n. (b) Comparison between MISE and MISE_N.
Figure 4.3: Influence of the number of samples n and of the number of coefficients N on the MISE and on the MISE proxy, MISE_N.
are required to reach the non-decreasing error region. The proposed procedure iteratively determines whether or not we have reached error stabilization by checking the differences in MISE_N between two consecutive N-values. When this difference is less than a predefined tolerance, ε, we have approximated the optimal N-value. Since N should grow with n, we propose to use ε := 1/√n, i.e. the expected order of accuracy for the PDF approximation should be given in terms of the number of samples.
By collecting all described components, the approximately optimal N -value becomes
a function only of n. The iterative methodology is described in Algorithm 3. In Figure 4.4,
we observe that the resulting optimal N function is an increasing staircase function (with
a predefined floor of N = 5).
Algorithm 3: Optimal N-values.
  Data: n, γn
  N_min = 5
  N_max = ∞
  ε = 1/√n
  MISE* = ∞
  for N = N_min : N_max do
      MISE_N = (1/n) Σ_{k=1}^{N} (1/2) / (1 + γn k^{2(p+1)})²
      ε_N = |MISE_N − MISE*| / |MISE_N|
      if ε_N > ε then
          N_op = N
      else
          break
      MISE* = MISE_N
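Algorithm 3 translates almost line by line into code; the γn value fed in below follows a hypothetical rule, purely for illustration (the actual γn-rule of Equation (4.15) is not reproduced here):

```python
import numpy as np

def optimal_N(n, gamma_n, p=0, N_min=5, N_max=10_000):
    # Grow N until the relative change of MISE_N falls below eps = 1/sqrt(n).
    eps = 1.0 / np.sqrt(n)
    mise_star = np.inf
    N_op = N_min
    for N in range(N_min, N_max + 1):
        k = np.arange(1, N + 1)
        mise_N = np.sum(0.5 / (1.0 + gamma_n * k ** (2 * (p + 1))) ** 2) / n
        eps_N = abs(mise_N - mise_star) / abs(mise_N)
        if eps_N > eps:
            N_op = N
        else:
            break
        mise_star = mise_N
    return N_op

for n in (100, 1_000, 10_000):
    print(n, optimal_N(n, gamma_n=n ** (-0.8)))   # hypothetical gamma_n(n) rule
```

Since both MISE_N and its increments scale with 1/n, the stopping ratio depends on n only through ε, so a larger n (smaller ε) indeed yields a larger optimal N, reproducing the staircase behaviour.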
Now, we have described the techniques to determine values for the regularization
[Figure 4.4: the optimal N-value as an increasing staircase function of the number of samples n.]
[Figure 4.5: convergence in n of the option value estimators by the ddCOS and plain Monte Carlo methods, with and without AV.]
provement in terms of precision as when it is applied to the plain Monte Carlo method. Another observation is that, under this particular setting, the estimators (both ddCOS and Monte Carlo) of the put option value exhibit smaller variances than the call option estimators. In terms of accuracy, it is thus worth computing the put value and then using the put-call parity formula for call options. In addition, the use of the put together with the put-call parity is recommended, since call payoff functions grow exponentially and may give rise to cancellation errors.
We have empirically shown in Figure 4.5 that the ddCOS method converges to the true price with the expected convergence rate O(1/√n), which resembles the plain Monte Carlo convergence. However, with the ddCOS method, not only the option value but also the sensitivities can readily be obtained. This is an advantage w.r.t. Monte Carlo-based methods for estimating sensitivities, where, often, additional simulations, intermediate time-steps or prior knowledge are required.
Thus, a similar convergence test is performed for the ∆ and Γ sensitivities, see Figure 4.6. As the Monte Carlo-based method for the Greeks calculation, we consider the finite-difference method (bump and revalue, denoted by MCFD). We have chosen MCFD for the comparison because it is flexible and does not require prior knowledge. MCFD may require one or two extra simulations, starting at S(0) + ν and S(0) − ν, and the choice of the optimal shift parameter, ν, may not be trivial. The reference Delta and Gamma values are
given by the Black-Scholes formula. In Figure 4.6 we observe the expected convergence
and the reduction in the variance due to the use of AV. In both experiments, while the ∆
is very well approximated by the ddCOS and MCFD methods, the second derivative, Γ,
appears more complicated for the MCFD method. This fact was already pointed out by
Glasserman in [59]. The ddCOS estimator, however, is accurate and stable as it is based
on the data-driven PDF and the ddCOS machinery.
Using n = 100, 000, in Table 4.1 we now compare the ∆ and Γ estimations obtained
under the GBM dynamics for several strikes. The performance of the ddCOS method is
very satisfactory as it is accurate, with small RE (averaged over K ) and reproduces the
[Figure 4.6: convergence in n of the ∆ and Γ estimators by the ddCOS and MCFD methods, with and without AV.]
reference values very well. The difficulties of MCFD in estimating Γ are clearly visible.
We wish to test the ddCOS method in a more complex situation, by adding jumps in the form of a Merton jump-diffusion asset price process. Accurately computing the option sensitivities in this case is difficult for Monte Carlo-based methods.
We perform a similar experiment as before, where now the underlying asset follows the
Merton jump-diffusion model, and the obtained ∆ and Γ are presented in Table 4.2. In
this case, the reference value is provided by the COS method at a high accuracy.
In terms of computational cost, the ddCOS method is a competitive alternative, as
additional simulations are not needed. Notice that in these latter experiments AV tech-
niques are not employed. Under the Merton dynamics, the ddCOS method takes 0.1813
seconds and MCFD 0.3149 seconds. In this case, the use of the ddCOS method reduces
the computational costs, as the cost of an individual simulation by the Merton model is
significantly higher than for GBM asset dynamics.
The SABR model reads

$$\mathrm{d}S(t) = \sigma(t)\, S(t)^{\beta}\, \mathrm{d}W_S(t), \qquad \mathrm{d}\sigma(t) = \alpha\, \sigma(t)\, \mathrm{d}W_\sigma(t),$$

where $S(t) = \bar{S}(t)\exp(r(T - t))$ is the forward value of the underlying $\bar{S}(t)$, with r the interest rate, $S_0$ the spot price and T the maturity time. The stochastic volatility process is denoted by σ(t), with σ(0) = σ_0, and $W_S(t)$ and $W_\sigma(t)$ are two correlated Brownian motions with correlation ρ (i.e. $\mathbb{E}[W_S(t) W_\sigma(t)] = \rho t$). The parameters of the SABR model are α > 0 (the volatility of the volatility), 0 ≤ β ≤ 1 (the elasticity) and ρ (the correlation coefficient).
4.4. A PPLICATIONS OF THE DD COS METHOD 83
Table 4.1: GBM option Greeks: Call, S(0) = 100, r = 0.1, σ = 0.3 and T = 2.
Table 4.2: Merton jump-diffusion option Greeks: Call, S(0) = 100, r = 0.1, σ = 0.3, µ_j = −0.2, σ_j = 0.2, λ = 8 and T = 2.
Table 4.3: Greek ∆ under the SABR model: Call, S(0) = 100, r = 0, σ0 = 0.3, α = 0.4, β = 0.6, ρ = −0.25 and T = 2.
Table 4.4: Greek ∆ under SABR model: Call, S(0) = 0.04, r = 0.0, σ0 = 0.4, α = 0.8, β = 1.0, ρ = −0.5 and T = 2.
The calculation of the Greeks under the SABR model becomes challenging but can
be addressed by the ddCOS method. To employ the method, we need samples of the
underlying asset at time T . Here, we make use of the one time-step SABR Monte Carlo
simulation introduced in Chapter 2. This one time-step SABR simulation is based on the
expression for the CDF of the conditional SABR process [75]. Thus, the ddCOS method
will be combined with the one time-step SABR simulation to efficiently compute ∆ and
Γ under the SABR dynamics.
For the numerical experiments, we consider two parameter settings. First of all, a
basic parameter set is taken, where the Hagan formula by [67] is valid and can be used
as a reference. The results are presented in Table 4.3. For the second test we use a more
difficult set of parameters (i.e., Set III in Table 2.2 of Chapter 2), where the Hagan formula does not provide accurate results anymore. In Table 4.4, we observe that ddCOS provides accurate ∆-values in this case, without any problems. The reference value
has been computed by the MCFD in combination with the mSABR simulation scheme
from Chapter 3, with a large number of Monte Carlo paths (n = 10, 000, 000) and time
steps (4T ). The convergence in n of the ddCOS ∆ estimator under the SABR dynamics
is shown in Figure 4.7(a). The calculation of Γ when the underlying is governed by the
SABR model is again involved and the MCFD estimation is not reliable. In Figure 4.7(b),
the convergence of the ddCOS Γ estimator is presented, where we observe convergence,
with impressive variance reduction.
[Figure 4.7: (a) convergence in n of the ddCOS ∆ estimator under SABR, with the reference value; (b) convergence of the ddCOS Γ estimator.]
factors denoted by S and a time horizon ∆t, we define the change in S at time ∆t by ∆S. The variation in S directly affects the value of a portfolio Π(t, S), containing derivatives of S. We denote the changes in the value of the portfolio by ∆Π, in terms of which the loss in the interval [t, t + ∆t] is defined. The VaR is then implicitly given by

$$\mathbb{P}\left( \Delta\Pi \le \mathrm{VaR}(q) \right) = q,$$

with q a predefined confidence level, whereas, given the VaR, the ES measure is computed as

$$\mathrm{ES} := \mathbb{E}\left[ \Delta\Pi \mid \Delta\Pi > \mathrm{VaR}(q) \right].$$

So, VaR is given as a quantile of the loss distribution, while ES is the average of the largest possible losses.
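Given samples of ∆Π, both measures have straightforward empirical estimators. A sketch (the standard normal losses below are placeholders):

```python
import numpy as np

def var_es(losses, q=0.99):
    # VaR: empirical q-quantile; ES: average of the losses beyond the VaR.
    var = np.quantile(losses, q)
    es = losses[losses > var].mean()
    return var, es

rng = np.random.default_rng(5)
losses = rng.standard_normal(100_000)   # placeholder loss samples
var, es = var_es(losses, q=0.99)
print(var, es)   # near 2.33 and 2.67 for a standard normal at q = 99%
```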
Although simple in definition, the practical computation of these risk measures is a challenging and computationally expensive problem, especially when the changes in Π cannot be assumed linear in S. Then, VaR and ES estimation is often performed by means of a Monte Carlo method. In order to find a balance between accuracy and tractability, one of the most employed methodologies is the Delta-Gamma approximation, which combines Monte Carlo path generation, a second-order Taylor expansion and the sensitivities to reduce the computational cost and capture the non-linearity in portfolio changes. The Delta-Gamma approximation of ∆Π (in the case of only one risk factor) is given by

$$\Delta\Pi \approx \sum_{i=1}^{M} w_i \frac{\partial \Pi_i}{\partial S} \Delta S + \frac{1}{2} \sum_{i=1}^{M} w_i \frac{\partial^2 \Pi_i}{\partial S^2} (\Delta S)^2,$$
[Figure 4.8: densities of ∆Π recovered by the COS and ddCOS methods for the two portfolios.]
with M the number of assets depending on the risk factor S, and $w_i$ and $\Pi_i$ the amount and the value of asset i, respectively. The partial derivatives are evaluated at the initial time t. In the case of option contracts, these partial derivatives correspond to the ∆ and Γ sensitivities. It is usually assumed that the distribution of ∆S is known (normal, Student's t, etc.). Then, by applying the Delta-Gamma approach, the distribution of the losses, $F_L$, and therefore the VaR and the ES are easily calculated.
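A compact sketch of the Delta-Gamma pipeline for a single risk factor (the weights and sensitivities below are placeholders; in the ddCOS setting, ∆ and Γ would come from the data-driven estimators):

```python
import numpy as np

rng = np.random.default_rng(6)
w = np.array([1.0, 0.5])           # amounts w_i (placeholder portfolio)
delta = np.array([0.55, -0.45])    # dPi_i/dS per instrument (placeholder)
gamma = np.array([0.02, 0.02])     # d2Pi_i/dS2 per instrument (placeholder)

dS = rng.normal(0.0, 1.0, size=50_000)                # assumed distribution of dS
dPi = (w @ delta) * dS + 0.5 * (w @ gamma) * dS ** 2  # Delta-Gamma approximation
# Following the text's convention, the risk measures act on the tail of dPi:
VaR = np.quantile(dPi, 0.99)
ES = dPi[dPi > VaR].mean()
print(VaR, ES)
```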
The use of the ddCOS method in the context of the Delta-Gamma approach generalizes its applicability. Since the use of the ChF is no longer required, we can assume non-trivial dynamics for ∆S, for which the use of Fourier inversion methods (as in [101]) would be a limitation (it may be impossible to obtain a ChF). As we have seen, by employing the ddCOS method, ∆ and Γ can be computed at once and can, therefore, be directly employed within the Delta-Gamma approximation. The ddCOS method can thus be used to recover the distribution of ∆Π whenever samples are available. This can be useful when historical data is employed and no particular distribution is assumed.
In order to show the performance of the ddCOS method within the Delta-Gamma approach, we first repeat the experiments from [101]. Two portfolios are considered, both with the same composition (one European call and half a European put on the same underlying asset, maturity 60 days and strike K = 101) but different time horizons, i.e. 1 day and 10 days. We denote them by Portfolio 1 and Portfolio 2, respectively. The underlying asset follows a GBM with S(0) = 100, r = 0.1 and σ = 0.3. The change ∆S is assumed to be normally distributed.
In Figure 4.8 the densities recovered by the COS and ddCOS methods are depicted. An almost perfect fit is observed, with the expected small-sized oscillations in the data-driven approach. Since the computational domain is also driven by the data, the ddCOS recovered PDF remains within the defined domain, avoiding incorrect estimations outside the domain (see the COS curve in Figure 4.8(b)).
We also employ the ddCOS method to compute the risk measures VaR and ES. The convergence of VaR and ES in terms of n is presented in Figure 4.9, with reference values provided by [101]. As expected, the convergence rate for both estimators is O(1/√n).
[Figure 4.9: convergence in n of the ddCOS VaR and ES estimators for the two portfolios.]
[Figure: densities recovered by the COS method and by the ddCOS method with p = 1 and with a filter.]
[Figure: ddCOS estimates of the loss density f_L and CDF F_L, with and without filter, and the convergence of the ddCOS VaR and ES estimates.]
Figure 4.11: Delta-Gamma approach under the SABR model. Setting: S(0) = 100, K = 100, r = 0.0, σ0 = 0.4,
α = 0.8, β = 1.0, ρ = −0.5, T = 2, q = 99% and ∆t = 1/365.
Table 4.5: VaR and ES under SABR model. Setting: S(0) = 100, K = 100, r = 0.0, σ0 = 0.4, α = 0.8, β = 1.0,
ρ = −0.5, T = 2, and ∆t = 1/365.
In Table 4.5, the VaR and ES under the SABR model are presented for several choices of q, ranging from 10% to 90%. Again, the results appear to be consistent.
4.5. CONCLUSIONS
In this chapter, the ddCOS method has been introduced. The method extends the applicability of the COS method to cases where only data samples of the underlying asset are available. The method exploits a closed-form solution, in terms of Fourier cosine expansions, of a PDF. The use of the COS machinery in combination with density estimation allowed us to develop a data-driven method which can be employed for option pricing and risk management. The ddCOS method results in a particularly efficient method for the computation of the ∆ and Γ sensitivities, based solely on the samples. Therefore, it can be employed within the Delta-Gamma approximation for calculating risk measures. Through several numerical examples, we have empirically shown the convergence of our method. In some cases, in order to get monotonic densities, it may be beneficial to add a filter term to the ddCOS method.
A possible future extension may be the use of other basis functions. Haar wavelets
are for example interesting since they provide positive densities and allow an efficient
treatment of dynamic data.
In this chapter, we move on to yet another type of algorithm. Here we focus on a parallel GPU version of the Monte Carlo-based SGBM for pricing multi-dimensional early-exercise options. To extend the method's applicability, the problem dimensions will be increased drastically. This makes SGBM very expensive in terms of computational costs on conventional hardware systems based on CPUs. A parallelization strategy for the method is developed, and the GPGPU paradigm is used to reduce the execution time. An improved technique for bundling asset paths, which is more efficient on parallel hardware, is introduced. Thanks to the performance of the GPU version of SGBM, a general approach for computing the early-exercise policy is proposed. Comparisons between the sequential and GPU parallel versions are presented.
5.1. I NTRODUCTION
In recent years, different techniques for pricing early-exercise option contracts, as they
appear in computational finance, have been developed. In the wake of the recent fi-
nancial crisis, accurately modelling and pricing these kinds of options gained additional
importance, as they also form the basis for incorporation of the so-called counterparty
risk premium to an option value, which can be seen as an option covering the fact that a
counterparty may go into default before the end of a financial contract, and thus cannot
pay possible contractual obligations.
The early-exercise pricing methods can be classified according to different kinds of
techniques depending on whether they are based on simulation and regression, transi-
tion probabilities or duality approaches. Focusing on the first class of methods, Monte
Carlo path generation and dynamic programming to determine an optimal early-exercise
policy and the option price are typically combined. Some representative methods were
developed by Longstaff and Schwartz [94] and Tsitsiklis and Van Roy [128].
The pricing problem becomes significantly more complex when the early-exercise
options are multi- or even high-dimensional. One of the recent simulation-regression
pricing techniques for early-exercise (Bermudan) options with several underlying assets,
on which we will focus, is the SGBM, proposed by Jain and Oosterlee in [76]. The method
is a hybrid of regression and bundling approaches, as it employs regressed value func-
tions, together with bundling of the state space to approximate continuation values at
different time-steps. A high biased direct estimator and an early-exercise policy are first
computed in SGBM. The early-exercise policy is then used to determine a lower bound to
This chapter is based on the article “GPU Acceleration of the Stochastic Grid Bundling Method for Early-
Exercise options”. Published in International Journal of Computer Mathematics, 92(12):2433–2454, 2015 [90].
the true option price which is called the path estimator. The early-exercise policy com-
putation involves the computation of expectations of certain basis functions. Usually,
these basis functions are chosen by experience in such a way that an analytic solution
for the expectations appearing in the pricing method is available.
In this chapter, we extend SGBM’s applicability by drastically increasing the problem
dimensionality, the number of bundles and, hence, the number of Monte Carlo asset
paths. As the method becomes much more time-consuming then, we parallelize SGBM
taking advantage of the GPGPU paradigm. It is known that GPGPU is very well suited for
financial purposes and Monte Carlo techniques. In the case of pricing early-exercise op-
tions, several contributions regarding GPU implementations of different techniques ap-
peared recently, Abbas-Turki and Lapeyre [1], Benguigui and Baude [7], Cvetanoska and
Stojanovski [38], Dang et al. [39], Fatica and Phillips [52] and Pagès and Wilbertz [103].
All these papers are based on a combination of a Monte Carlo method and dynamic pro-
gramming, except the Dang et al. paper, which uses PDE-based pricing methods and the
Pagès and Wilbertz paper, which employs Monte Carlo simulation together with a quan-
tization method. Our GPU version of SGBM is based on parallelization in two steps,
according to the method's stages, i.e., forward in time (the Monte Carlo path generator) followed by the backward stage (the early-exercise policy computation). This is a novelty with
respect to other methods since, although the parallelization of a Monte Carlo method is
immediate, the parallelization of the backward stage is not trivial for other methods, for
example not for the Least Squares Monte Carlo Method (LSM) by Longstaff and Schwartz
[94] where some load balancing issues for parallel systems can appear as only the in-the-
money paths are considered. Due to some timing and distribution issues observed in the
original bundling procedure caused by an increasing number of bundles, we present an-
other bundling technique for SGBM which is much more efficient and more suitable on
parallel hardware.
Thanks to the performance of our SGBM GPU version, we can explore different possi-
bilities to compute the expectations appearing within the SGBM formulation, generalize
towards different option contracts and, in particular, towards more general underlying
asset models. A general approach to compute the expectations needed for the early-
exercise policy is proposed and evaluated. This approach is based on the computation
of a ChF for the discretized underlying stochastic differential system. Once this ChF of
the discretized asset dynamics is determined, we can use it for the expectations calcula-
tion, for example, for multi-dimensional local volatility asset models.
The chapter is organized as follows. In Section 5.2, the Bermudan option pricing
problem is introduced. Section 5.3 describes the SGBM. The new approach to compute
the expectations for the early-exercise policy is explained in Section 5.4. Section 5.5 gives
details about the GPU implementation. In Section 5.6, numerical results and CPU/GPU
time comparisons are shown. Finally, we conclude in Section 5.7.
$$C = \begin{pmatrix} 1 & \rho_{1,2} & \cdots & \rho_{1,d} \\ \rho_{1,2} & 1 & \cdots & \rho_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1,d} & \rho_{2,d} & \cdots & 1 \end{pmatrix},$$

and the Cholesky decomposition of C reads

$$L = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \rho_{1,2} & L_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1,d} & L_{d,2} & \cdots & L_{d,d} \end{pmatrix}. \tag{5.2}$$
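For instance (with a hypothetical 3-dimensional correlation matrix), the factor L can be obtained with a standard Cholesky routine; note that, because C has unit diagonal, the first column of L indeed equals the first column of C, as displayed in Equation (5.2):

```python
import numpy as np

d = 3
C = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])       # hypothetical correlation matrix

L = np.linalg.cholesky(C)             # lower-triangular factor of Eq. (5.2)
Z = np.random.default_rng(7).standard_normal((d, 100_000))
W = L @ Z                             # correlated standard normal increments
print(L[:, 0], np.corrcoef(W)[0, 1])  # first column (1, 0.5, 0.3); corr ~ 0.5
```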
Let $h_t := h(S_t)$ be an adapted process representing the intrinsic value of the option, i.e. the holder of the option receives $\max(h_t, 0)$ if the option is exercised at time t. With the risk-free savings account process $B(t) = \exp\left( \int_0^t r_s \,\mathrm{d}s \right)$, where $r_t$ denotes the instantaneous risk-free rate, we define

$$D_{t_m} := \frac{B(t_{m-1})}{B(t_m)}.$$

We consider the special case where $r_t = r$ is constant. The problem is then to compute

$$V(t_0, S_{t_0}) = \max_{\tau}\, \mathbb{E}\left[ \frac{h(S_\tau)}{B(\tau)} \right],$$

where τ is a stopping time, taking values in the finite set $\{0, t_1, \ldots, T\}$. At each exercise date $t_{m-1}$, the option value is the maximum of the intrinsic value and the continuation value, $Q_{t_{m-1}}$,

$$V(t_{m-1}, S_{t_{m-1}}) = \max\left( h\left( S_{t_{m-1}} \right), Q_{t_{m-1}}\left( S_{t_{m-1}} \right) \right).$$

We are interested in finding the value of the option at the initial state $S_{t_0}$, i.e. $V(t_0, S_{t_0})$. The value of the option at the terminal time T is equal to the option's payoff,

$$V(t_M, S_{t_M}) = \max\left( h(S_{t_M}), 0 \right),$$

with $\max(h(S_{t_M}), 0)$ the payoff function.
This relation is used to compute the option value for all grid points at the final time-
step.
The following steps are subsequently performed for each time-step, t m , m ≤ M , re-
cursively, moving backwards in time, starting from t M .
STEP IV: MAPPING HIGH-DIMENSIONAL STATE SPACE TO A LOW-DIMENSIONAL SPACE
Corresponding to each bundle $\mathcal{B}_{t_{m-1}}(\beta)$, β = 1, ..., ν, a parametrized value function $Z: \mathbb{R}^d \times \mathbb{R}^b \mapsto \mathbb{R}$, which assigns values $Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right)$ to states $S_{t_m}$, is computed. Here, $\alpha^{\beta}_{t_m} \in \mathbb{R}^b$ is a vector of free parameters. The objective is then to choose, for each $t_m$ and β, a parameter vector $\alpha^{\beta}_{t_m}$ so that

$$Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right) \approx V(t_m, S_{t_m}).$$

After some approximations, to be discussed in Section 5.3.2, $Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right)$ can be computed by using ordinary least squares regression.
The continuation value is approximated by

$$\hat{Q}_{t_{m-1}}(S_{t_{m-1}}(n)) = \mathbb{E}\left[ Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right) \Big| S_{t_{m-1}}(n) \right].$$

The option value is then given by

$$\hat{V}(t_{m-1}, S_{t_{m-1}}(n)) = \max\left( h\left( S_{t_{m-1}}(n) \right), \hat{Q}_{t_{m-1}}\left( S_{t_{m-1}}(n) \right) \right).$$

Steps III, IV and V are repeated backward in time, until the initial time $t_0$ is reached. Thus, the value $V(t_0, S_{t_0})$ of the option is approximately computed.
5.3.1. BUNDLING
One of the techniques proposed in [76] to partition the asset path data into ν non-overlapping sets at each time $t_{m-1}$ is the k-means clustering technique. It is an iterative refinement procedure where, given an initial guess of the cluster means, each data item is first assigned to one specific set (i.e. bundle) by calculating the distance between the item and the cluster mean (under some measure), after which the sets and the cluster means are updated. This process is repeated until some stopping condition is satisfied. The procedure needs a pre-bundling step to set
nique can be very irregular, i.e. the number of data items within each bundle may vary
significantly. This fact can be a problem when many bundles are considered, and parallel
load-balancing can be an issue too.
Since our goal is to drastically increase the number of bundles used and, in particu-
lar, the problem dimensionality, the k-means algorithm becomes too expensive in terms
of computational time and memory usage in high dimensions because the iterative pro-
cess of searching new sets (of high dimension) takes much time and d -dimensional data
points for each Monte Carlo path and for each time-step have to be stored. In addition,
it may happen that some bundles do not contain a sufficient number of data points to
compute an accurate regression function when the number of bundles increases. In or-
der to overcome these two problems of the k-means clustering technique, we propose
another bundling technique in the SGBM context which is much more efficient when
taking into account our goal of efficiency in high dimensions: it does not involve an
iterative process, distributes the data equally and does not require the storage of the d -
dimensional points. The details are given in the next subsection.
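The sorting criterion is problem-dependent; as a minimal sketch (using the sum of the asset values per path as a hypothetical scalar criterion), the sort and split steps of the new bundling technique can be written as:

```python
import numpy as np

def sort_split_bundles(criterion, nu):
    # Sort the path indices by the scalar criterion, then split them into
    # nu consecutive, (nearly) equal-sized bundles.
    order = np.argsort(criterion)
    return np.array_split(order, nu)

rng = np.random.default_rng(8)
paths = rng.lognormal(size=(8_192, 5))         # 8192 paths, d = 5 (placeholder)
bundles = sort_split_bundles(paths.sum(axis=1), nu=64)
print(len(bundles), len(bundles[0]))           # 64 bundles of 128 paths each
```

Unlike k-means, this procedure is non-iterative and yields equally sized bundles by construction, which avoids the load-balancing issues mentioned above.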
$$V(t_m, S_{t_m}) \approx Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right).$$

As in the original SGBM paper, we follow the approach of Tsitsiklis and Van Roy [128] and use basis functions to approximate the values of the options. Hence, two important decisions have to be made: the form of the function Z and the choice of the basis functions. For each particular problem we define several basis functions, $\phi_1, \phi_2, \ldots, \phi_b$, that are typically chosen based on experience, as in the case of the LSM method [94], aiming to represent relevant properties of a given state, $S_{t_m}$. In Section 5.3.2 and in Section 5.4, we present two possible choices for the basis functions, one specific and the other one more generally applicable. In our case, $Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right)$ depends on $S_{t_m}$ only through $\phi_k(S_{t_m})$. Hence, for some function $f: \mathbb{R}^b \times \mathbb{R}^b \mapsto \mathbb{R}$, we can write $Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right) = f\left( \phi_k(S_{t_m}), \alpha^{\beta}_{t_m} \right)$, where

$$Z\left( S_{t_m}, \alpha^{\beta}_{t_m} \right) = \sum_{k=1}^{b} \alpha^{\beta}_{t_m}(k)\, \phi_k(S_{t_m}). \tag{5.4}$$
An exact computation of the vector of free parameters, α_{t_m}^β, is generally not feasible. Hence, Equation (5.4) can be approximated by
\[
Z\big(S_{t_m}, \hat{\alpha}_{t_m}^{\beta}\big) = \sum_{k=1}^{b} \hat{\alpha}_{t_m}^{\beta}(k)\,\phi_k(S_{t_m}), \qquad (5.5)
\]
satisfying
\[
\operatorname*{arg\,min}_{\hat{\alpha}_{t_m}^{\beta}} \sum_{n=1}^{|\mathcal{B}_{t_{m-1}}(\beta)|} \left( V(t_m, S_{t_m}(n)) - \sum_{k=1}^{b} \hat{\alpha}_{t_m}^{\beta}(k)\,\phi_k(S_{t_m}(n)) \right)^{2},
\]
for the corresponding bundle B_{t_{m-1}}(β). The parametrized function in Equation (5.5) is computed by using ordinary least squares regression, so that
\[
V(t_m, S_{t_m}(n)) = Z\big(S_{t_m}(n), \hat{\alpha}_{t_m}^{\beta}\big) + \epsilon_{t_m}^{\beta},
\]
where S_{t_{m-1}}(n) ∈ B_{t_{m-1}}(β). The regression error, ε_{t_m}^β, can be neglected here, as we have a sufficiently large number of paths in each bundle.
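As a hedged illustration of the per-bundle regression, the sketch below fits the coefficients for a one-dimensional monomial basis φ_k(S) = S^(k−1) via the normal equations; the helper name is hypothetical, and a production code would use a library least-squares solver:

```python
def fit_bundle_alphas(S, V, b):
    """Least-squares fit of V(t_m, S) ~ sum_k alpha[k] * phi_k(S), with
    monomial basis phi_k(S) = S**(k-1), via the normal equations A^T A x = A^T V."""
    # A^T A (b x b) and A^T V (b) for the design matrix A[n, k] = S[n]**k.
    ata = [[sum(s ** (i + j) for s in S) for j in range(b)] for i in range(b)]
    atv = [sum((s ** i) * v for s, v in zip(S, V)) for i in range(b)]
    # Solve the small b x b system by Gaussian elimination with partial pivoting.
    for col in range(b):
        piv = max(range(col, b), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atv[col], atv[piv] = atv[piv], atv[col]
        for r in range(col + 1, b):
            f = ata[r][col] / ata[col][col]
            for c in range(col, b):
                ata[r][c] -= f * ata[col][c]
            atv[r] -= f * atv[col]
    alpha = [0.0] * b
    for r in range(b - 1, -1, -1):
        alpha[r] = (atv[r] - sum(ata[r][c] * alpha[c]
                                 for c in range(r + 1, b))) / ata[r][r]
    return alpha

# If V is exactly linear in S, the recovered coefficients are exact.
S = [30.0, 35.0, 40.0, 45.0, 50.0]
V = [2.0 + 0.5 * s for s in S]
alpha = fit_bundle_alphas(S, V, b=2)
print(alpha)  # close to [2.0, 0.5]
```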
The continuation value is then approximated as
\[
\hat{Q}_{t_{m-1}}(S_{t_{m-1}}(n)) = D_{t_{m-1}}\, \mathbb{E}\left[ Z\big(S_{t_m}, \hat{\alpha}_{t_m}^{\beta}\big) \,\middle|\, S_{t_{m-1}} = S_{t_{m-1}}(n) \right],
\]
where S_{t_{m-1}}(n) ∈ B_{t_{m-1}}(β). Using Equation (5.5), this can be written as
\[
\hat{Q}_{t_{m-1}}(S_{t_{m-1}}(n)) = D_{t_{m-1}}\, \mathbb{E}\left[ \left( \sum_{k=1}^{b} \hat{\alpha}_{t_m}^{\beta}(k)\,\phi_k(S_{t_m}) \right) \middle|\, S_{t_{m-1}} = S_{t_{m-1}}(n) \right] = D_{t_{m-1}} \sum_{k=1}^{b} \hat{\alpha}_{t_m}^{\beta}(k)\, \mathbb{E}\big[ \phi_k(S_{t_m}) \,\big|\, S_{t_{m-1}} = S_{t_{m-1}}(n) \big]. \qquad (5.6)
\]
The continuation value will give us a reference value to compute the early-exercise
policy and it will be used by the estimators in Section 5.3.3.
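The continuation value above can be made concrete in a deliberately simplified setting: a single (d = 1) GBM asset with basis {1, S}, for which the conditional expectations in Equation (5.6) are known in closed form. This is a hedged sketch with an illustrative function name, not the chapter's multi-dimensional implementation:

```python
import math

def continuation_value_1d(s, alpha_hat, r, q, dt):
    """Q(s) = D * sum_k alpha_hat[k] * E[phi_k(S_{t_m}) | S_{t_{m-1}} = s]
    for a single GBM asset with basis {1, S}: the conditional expectations
    are 1 and s*exp((r - q)*dt), and D = exp(-r*dt) is the discount factor."""
    disc = math.exp(-r * dt)
    cond_expectations = [1.0, s * math.exp((r - q) * dt)]
    return disc * sum(a * e for a, e in zip(alpha_hat, cond_expectations))

q_hat = continuation_value_1d(40.0, [2.0, 0.5], r=0.05, q=0.0, dt=0.1)
print(round(q_hat, 4))  # ≈ 21.99
```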
The basis functions should be chosen such that the expectations in Equation (5.6) are known in closed form or otherwise have analytic approximations. The intrinsic value of the option, h(·), is
usually an important and useful basis function, especially in high problem dimensions.
In this section, we describe the choices of basis functions for different Bermudan basket options with the underlying processes following GBM; we thus choose, in Equation (5.1),
\[
\mu_\delta(S_t) = (r_t - q_\delta)\, S_\delta(t), \qquad \sigma_\delta(S_t) = \sigma_\delta\, S_\delta(t),
\]
where r_t is the risk-free rate and q_δ and σ_δ, δ = 1, 2, ..., d, are the dividend yield rates and the volatilities, respectively.
In the case of a geometric Bermudan basket option, the intrinsic value of the option is defined by
\[
h\big(S_{t_m}\big) = \left( \prod_{\delta=1}^{d} S_\delta(t_m) \right)^{1/d} - K, \qquad h\big(S_{t_m}\big) = K - \left( \prod_{\delta=1}^{d} S_\delta(t_m) \right)^{1/d},
\]
for call or put options, respectively, where K is the strike value of the option contract.
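In code, the corresponding exercise values can be sketched as below; note that the exercise decision uses the positive part of the intrinsic value defined above (the function names are illustrative):

```python
def geometric_basket_put_exercise(S, K):
    """Positive part of h(S) = K - (prod_d S_d)**(1/d), the exercise value
    of a geometric basket put (the intrinsic value h is the plain difference)."""
    g = 1.0
    for s in S:
        g *= s
    return max(K - g ** (1.0 / len(S)), 0.0)

def arithmetic_basket_put_exercise(S, K):
    """Positive part of K minus the arithmetic average of the assets."""
    return max(K - sum(S) / len(S), 0.0)

print(geometric_basket_put_exercise([36.0, 49.0], 45.0))   # 45 - sqrt(36*49) = 3.0
print(arithmetic_basket_put_exercise([36.0, 49.0], 45.0))  # 45 - 42.5 = 2.5
```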
Then, basis functions that make sense are given by
\[
\phi_k(S_{t_m}) = \left( \prod_{\delta=1}^{d} S_\delta(t_m) \right)^{\frac{k-1}{d}}, \qquad k = 1, \ldots, b. \qquad (5.7)
\]
where
\[
P_{t_{m-1}} = \left( \prod_{\delta=1}^{d} S_\delta(t_{m-1}) \right)^{1/d}, \quad \bar{\mu} = \frac{1}{d} \sum_{\delta=1}^{d} \left( r_t - q_\delta - \frac{\sigma_\delta^2}{2} \right), \quad \bar{\sigma}^2 = \frac{1}{d^2} \sum_{i=1}^{d} \left( \sum_{j=1}^{d} L_{ij} \right)^{2},
\]
where
\[
\binom{k}{k_1, k_2, \ldots, k_d} = \frac{k!}{k_1!\, k_2! \cdots k_d!},
\]
which can be computed in a straightforward way by Equation (5.8).
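The multinomial coefficient itself is easy to evaluate with integer arithmetic; a small self-contained sketch:

```python
from math import factorial

def multinomial(ks):
    """Multinomial coefficient k! / (k_1! k_2! ... k_d!), with k = sum(ks)."""
    out = factorial(sum(ks))
    for k in ks:
        out //= factorial(k)  # exact integer division, no rounding error
    return out

print(multinomial([2, 1, 1]))  # 4! / (2! 1! 1!) = 12
print(multinomial([1, 1]))     # 2! / (1! 1!) = 2
```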
\[
\mathbb{E}[\hat{V}(t_0, S_{t_0})] = \frac{1}{N} \sum_{n=1}^{N} \hat{V}(t_0, S_{t_0}(n)).
\]
The direct estimator corresponds to Step V of the initial description in Section 5.3.
Once the optimal early-exercise policy has been obtained, the path estimator, which
is typically biased low, can be developed based on the early-exercise policy. The result-
ing confidence interval is useful, because, depending on the problem at hand, some-
times the path estimator and sometimes the direct estimator appears superior in terms
of accuracy. The obtained confidence intervals are generally small, indicating accurate
SGBM results. In order to compute the low-biased estimates, we generate a new set of
paths, as is common for duality-based Monte Carlo methods, S(n) = {St1 (n), . . . , St M (n)},
n = 1, . . . , NL , using the same scheme as followed for generating the paths in the case
of the direct estimator and the bundling stage is performed considering the new set of
paths. Along each path, the approximate optimal policy exercises at the first time at which the immediate payoff is at least the continuation value,
\[
\hat{\tau}(S(n)) = \min\left\{ t_m : h(S_{t_m}(n)) \geq \hat{Q}_{t_m}(S_{t_m}(n)) \right\}.
\]
Algorithm 4: SGBM.
Data: S_{t_0}, K, μ_δ, σ_δ, ρ_{i,j}, T, N, M
    Pre-Bundling (only in the k-means case).
    Generation of the grid points (Monte Carlo). Step I.
    Option value at terminal time t = M. Step II.
    for time t = (M − 1) ... 1 do
        Bundling. Step III.
        for bundle β = 1 ... ν do
            Exercise policy (regression). Step IV.
            Continuation value. Step V.
    Direct estimator. Step V.

The ChF of the Euler-discretized process reads
\[
\begin{aligned}
\psi_{S_{t_{m+1}}}\big(u_1, u_2, \ldots, u_d \,|\, S_{t_m}\big)
&= \mathbb{E}\left[ \exp\Big( i \sum_{j=1}^{d} u_j S_j(t_{m+1}) \Big) \,\middle|\, S_{t_m} \right] \\
&= \mathbb{E}\left[ \exp\Big( i \sum_{j=1}^{d} u_j \Big( S_j(t_m) + \mu_j(S_{t_m})\Delta t + \sigma_j(S_{t_m}) \sum_{k=1}^{j} L_{k,j}\, \Delta \widetilde{W}_k(t_{m+1}) \Big) \Big) \,\middle|\, S_{t_m} \right] \\
&= \exp\Big( i \sum_{j=1}^{d} u_j \big( S_j(t_m) + \mu_j(S_{t_m})\Delta t \big) \Big) \cdot \prod_{k=1}^{d} \mathbb{E}\left[ \exp\Big( i \sum_{j=k}^{d} u_j L_{k,j}\, \sigma_j(S_{t_m})\, \Delta \widetilde{W}_k(t_{m+1}) \Big) \right] \\
&= \exp\Big( i \sum_{j=1}^{d} u_j \big( S_j(t_m) + \mu_j(S_{t_m})\Delta t \big) \Big) \cdot \prod_{k=1}^{d} \psi_{\mathcal{N}(0,\Delta t)}\Big( \sum_{j=k}^{d} u_j L_{k,j}\, \sigma_j(S_{t_m}) \Big),
\end{aligned}
\]
where i is the imaginary unit and ψ_{N(μ,σ²)}(u) = exp(iμu − 0.5σ²u²) is the ChF of a normal distribution. The joint moments of the product of several random variables can be obtained by evaluating the following expression (more details in [85] or [129]), based on the ChF of the discrete process,
\[
\mathbb{E}\left[ \prod_{j=1}^{d} S_j(t_{m+1})^{k_j} \,\middle|\, S_{t_m} \right] = i^{-(k_1 + \cdots + k_d)} \left. \frac{\partial^{\,k_1 + \cdots + k_d}}{\partial u_1^{k_1} \partial u_2^{k_2} \cdots \partial u_d^{k_d}}\, \psi_{S_{t_{m+1}}}\big(u \,|\, S_{t_m}\big) \right|_{u = 0}, \qquad (5.10)
\]
with u = (u_1, u_2, ..., u_d).
If we take the basis functions, as used in Equation (5.5), to be a product of the underlying processes to some power, i.e.,
\[
\phi_k(S_{t_m}) = \left( \prod_{\delta=1}^{d} S_\delta(t_m) \right)^{k-1}, \qquad k = 1, \ldots, b, \qquad (5.11)
\]
the expected value in Equation (5.6) can be easily computed by means of Equation (5.10).
The approximation obtained in this way is, in general, worse than the analytic value
because the ChF on which the expectations are based is related to the Euler-Maruyama
SDE discretization and thus less accurate. However, since we can increase the number of
bundles and, in particular, time-steps drastically (due to our GPU implementation), we
can employ a suitable combination of these to price products without analytic solution
for the ChF. More involved models for the underlying asset dynamics can be chosen, for
which we do not have analytic expressions for these expectations. In Section 5.6.4, more
details are given.
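The moments-from-ChF idea can be checked numerically in a simplified one-dimensional setting: for a single Euler step, the ChF is that of a normal distribution, and the first two moments recovered by differentiating it at u = 0 must match the known mean and second moment. The sketch below approximates the derivatives by central finite differences (the chapter itself evaluates the derivatives symbolically, with Mathematica); all names are illustrative:

```python
import cmath

def chf_euler_step(u, s, mu, sigma_loc, dt):
    """ChF of one Euler-Maruyama step S' = s + mu*dt + sigma_loc*dW,
    i.e. of a normal with mean s + mu*dt and variance sigma_loc**2 * dt."""
    mean = s + mu * dt
    var = sigma_loc * sigma_loc * dt
    return cmath.exp(1j * u * mean - 0.5 * var * u * u)

def moment_from_chf(k, chf, h=1e-4):
    """k-th moment via E[X^k] = i^{-k} d^k psi/du^k at u = 0,
    approximating the derivative by central finite differences (k = 1, 2)."""
    if k == 1:
        d1 = (chf(h) - chf(-h)) / (2.0 * h)
        return (d1 / 1j).real
    if k == 2:
        d2 = (chf(h) - 2.0 * chf(0.0) + chf(-h)) / (h * h)
        return -d2.real  # i^{-2} = -1
    raise NotImplementedError("sketch handles k = 1, 2 only")

# GBM drift/diffusion at s: mu(s) = (r - q)*s, sigma(s) = sigma*s.
s, r, q, sigma, dt = 40.0, 0.05, 0.0, 0.2, 0.1
chf = lambda u: chf_euler_step(u, s, (r - q) * s, sigma * s, dt)
m1 = moment_from_chf(1, chf)  # ≈ s*(1 + (r - q)*dt) = 40.2
m2 = moment_from_chf(2, chf)  # ≈ 40.2**2 + (sigma*s)**2 * dt
print(m1, m2)
```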
PROGRAMMING MODEL
The basic operational units in CUDA are the CUDA threads, i.e. processes that are executed in parallel. Threads are grouped into blocks of the same size, and blocks of threads are grouped into grids. This organization eases the adaptability to different problems. Each grid executes special functions called kernels, so that all threads of the grid execute the same instructions in parallel.
MEMORY HIERARCHY
CUDA threads may access data from multiple memory spaces during their execution.
Each thread manages its own registers and has private local memory. Each thread block
has shared memory visible to all threads of the block and with the same lifetime as the
block. All threads have access to the same global memory. There are also two additional
read-only memory spaces accessible by all threads: the constant and texture memory
spaces. In terms of speed, the registers represent the fastest memory (but also the small-
est) while the global and local memories are the slowest. In recent GPU architectures
(Kepler GK110 and GK110B [79], for example), several levels of cache are included to
improve the accessibility.
MEMORY TRANSFERS
A key aspect in each parallel system is the memory transfer. Specifically in a GPU frame-
work, the transfers between the CPU main memory and the GPU memory space are of
great importance for the performance of the implementation. The number of transfers
and the amount of data for each transfer are important. CUDA provides different ways to
allocate and move data. In that sense, one of the CUDA 5.5 features is Unified Virtual Addressing (UVA), which allows asynchronous transfers and page-locked memory accesses by using a single address space for both CPU and GPU.
[Figure: (b) bundling stage using k-means; (c) bundling stage using equal-partitioning.]
The amount of data to be transferred grows with the increase of the number of Monte Carlo paths. Data has to be moved to CPU memory
between the stages of parallelization, Monte Carlo path simulation and bundle compu-
tations. For example, in case of the Monte Carlo scenario generator, we need to store in
and move from GPU global memory to CPU memory N × M × d doubles1 (8 bytes).
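The transfer size is a simple product; for instance, with the paper-scale N = 2^22 paths and d = 5 (the value M = 10 below is a made-up illustration, not a configuration from the chapter):

```python
def transfer_bytes(n_paths, n_steps, dim, bytes_per_double=8):
    """Bytes moved between GPU global memory and CPU memory for the full
    Monte Carlo grid: N paths x M time-steps x d dimensions, in doubles."""
    return n_paths * n_steps * dim * bytes_per_double

size = transfer_bytes(2 ** 22, 10, 5)
print(size, size / 2 ** 30)  # 1677721600 bytes, i.e. 1.5625 GiB
```

Even a modest configuration therefore moves gigabytes of data, which is why minimizing transfers matters for the overall performance.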
In the following subsections, we will show more specific details of the CUDA imple-
mentation of the two SGBM versions, the original one (with k-means bundling) and the
new one (with equal-partitioning).
BUNDLING SCHEMES
For the k-means clustering the computations of the distances between the cluster means
and all of the stochastic grid points have been parallelized. However, the other parts
must be performed sequentially. The very large number of bundles makes this very ex-
pensive, since the bundling must be done in each time-step.
As mentioned, equal-partitioning bundling involves two operations: sorting and split-
ting. It is well known that efficient sorting algorithms are a challenging research topic
(see [84], for example). In parallel systems, like GPUs, this is even more important. In
recent years, a great effort has taken place, see, for example, [77], [98], [115] or [126], to adapt classical algorithms to new parallel systems and to develop new parallel techniques for sorting. Several libraries for GPU programming have appeared in this field, like
Thrust [127], CUB [33], Modern GPU [99], clogs [27] or VexCL [132].
In our work, we take advantage of the CUDA Data-Parallel Primitives Library (CUDPP),
described in [36]. CUDPP is a library of data-parallel algorithm primitives that are important building blocks for a wide variety of data-parallel algorithms, including sorting.
1 Double-precision floating-point format.
The sorting algorithms are based on the work of Satish, Harris and Garland [115]. The
authors showed the performance of several sorting algorithms implemented on GPUs.
Following the results of their work, we choose the parallel Radix sort which is included
in version 2.1 of CUDPP. In addition, CUDPP provides a kernel-level API2 , allowing us to
avoid the transfers between stages of parallelization.
Once the sorting stage has been performed, the splitting stage is immediate since
the size of the bundles is known, i.e. N /ν. Each CUDA thread manages a pointer which
points at the beginning of the corresponding region of global memory for each bundle.
The global memory allocation is made for all bundles at once, which means that the bundles' memory areas are adjacent and the accesses are faster.
ESTIMATORS
When the bundling stage is done, the exercise policy and the final option values can be
computed by means of direct and path estimators. In order to use the GPU memory
efficiently, the obtained Monte Carlo paths (once bundled) are sorted with respect to the bundles and stored in that way. We minimize the global memory accesses, and the sorted version is much more efficient because, again, the bundles' memory areas are contiguous. For this purpose, we use the CUDPP 2.1 library again. Note that the data is already sorted in the case of equal-partitioning bundling.
For the direct estimator, one CUDA thread per bundle is launched at each time-step.
For each bundle, the regression and option values are calculated on the GPU. All threads
collaborate in order to compute the continuation value which determines the early-
exercise policy. As we mentioned before, T /∆t stages of parallelization are performed,
i.e. one per time-step. Hence, in each stage of parallelization, we need to transfer only
the data corresponding to the current time-step from CPU memory to GPU global mem-
ory. This data cannot be transferred at once because we wish to take advantage of the
optimized parallel sorting libraries for the sorting step after the bundling stage. This
allows us to minimize and improve the global memory accesses when the sequential
bundling is employed, i.e. considering k-means clustering. This fact does not imply
a reduction of the performance since the total transferred amount is the same. Note
again that, employing equal-partitioning bundling, these transfers are avoided since this
bundling technique is fully parallelizable.
Once the early-exercise policy is determined, the path estimator can be executed.
For that, a new set of grid points has to be generated following the procedure described
in Section 5.5.2. In the case of the path estimator, the parallelization can be done over
paths because the early-exercise policy is already known (given by the previous compu-
tation of the direct estimator) and it is not needed to perform the regression again. One
CUDA thread per path is launched and it computes the optimal exercise time and the
cash flows according to the policy. Another sorting stage is needed to assign the paths
to the corresponding bundle. Again, we use the sorting functions of CUDPP 2.1 for that
purpose.
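The per-path logic of the path estimator can be sketched sequentially as follows (on the GPU, each loop iteration over paths is one CUDA thread); the data layout and names here are illustrative, not the chapter's implementation:

```python
import math

def path_estimator(payoffs, continuation, r, dt):
    """Low-biased path estimator: exercise each path at the first date where
    the immediate payoff reaches the continuation value of the fixed policy,
    then discount the resulting cash flow to t0 and average over paths.
    payoffs[n][m-1], continuation[n][m-1]: path n, exercise date m = 1..M."""
    total = 0.0
    for h_path, c_path in zip(payoffs, continuation):
        cash, t_ex = h_path[-1], len(h_path)  # default: exercise value at maturity
        for m, (h, c) in enumerate(zip(h_path, c_path), start=1):
            if h >= c:  # the policy says: exercise now
                cash, t_ex = h, m
                break
        total += math.exp(-r * t_ex * dt) * cash
    return total / len(payoffs)

# Two toy paths with M = 2 exercise dates.
payoffs = [[3.0, 1.0], [0.0, 2.0]]
continuation = [[2.5, 0.0], [1.0, 0.0]]
pe = path_estimator(payoffs, continuation, r=0.05, dt=0.5)
print(round(pe, 4))  # ≈ 2.4142
```

Because the policy is fixed before this pass, no regression is needed here, which is exactly why the parallelization can be done over paths.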
In the final stage for both estimators, a summation is necessary to determine the final
option price. We take advantage of the Thrust library [127] to perform this reduction on the GPU.
2 Application Programming Interface.
We present a schematic representation of parallel SGBM in Algorithm 5, highlight-
ing the parts that have been parallelized and considering the equal-partitioning version.
The variables payoffData, expData and critData correspond to three matrices in which we store the necessary computations for the pricing, regression and sorting stages, respectively, for each path and each exercise time. The α_t^β values are computed in the regression step and determine the early-exercise policy, which is later used in the path estimator calculation.
Algorithm 5: Parallel SGBM (equal-partitioning version).
    // Generation of the grid points (Monte Carlo). Step I.
    // Option value at terminal time t = M. Step II.
    [payoffData, critData, expData] = MonteCarloGPU(S_{t_0}, K, μ_δ, σ_δ, ρ_{i,j}, T, N, M);
    for time t = M ... 1 do
        // Bundling. Step III.
        SortingGPU(critData[t-1]);
    return DE;
    begin CUDAThread per path n = 1 ... N
        // Continuation value. Step V.
        CV[n] = ContinuationValue(α_t^β, expData[t-1]);
        // Path estimator. Step V.
        PE[n] = PathEstimator(CV[n], payoffData[t-1]);
    return PE;
5.6. RESULTS
Experiments were performed on the Accelerator Island system of the Cartesius Super-
computer (more information in [20]) with the following main characteristics:
All computations are done in double precision, because a high accuracy is required
both in the bundling as well as in the regression computations. We consider the d -
dimensional problem of pricing Bermudan basket options with the following charac-
teristics:
• Strike: K = 40.
• Volatility: σδ = 0.2, δ = 1, 2, . . . , d .
• Correlation: ρ i , j = 0.25, j = 2, . . . , d , i = 1, . . . , j .
• Maturity: T = 1.0.
Specifically, geometric and arithmetic Bermudan basket put options are chosen in
order to show the accuracy and performance. The number of basis functions is taken as
b = 3. As the stochastic asset model, we first choose the multi-dimensional GBM, and
for the discretization scheme we employ the Euler-Maruyama SDE discretization.
Figure 5.3: Convergence of the direct and path estimators with the equal-partitioning bundling technique for an increasing number of bundles ν; (a) geometric basket put option, (b) arithmetic basket put option. Test configuration: N = 2^18 and ∆t = T/M.
Table 5.1: SGBM stage times (s) for the C and CUDA versions. Test configuration: N = 2^22, ∆t = T/M, d = 5 and ν = 2^10.
Table 5.2: SGBM total time (s) for the C and CUDA versions. Test configuration: N = 2^22, ∆t = T/M and ν = 2^10.
In Table 5.2, the total execution times for the d = 5, d = 10 and d = 15 problems are
presented. We observe a significant acceleration of the CUDA versions for both bundling
techniques, with a special improvement in the case of the equal-partitioning. This is
because the iterative process of k-means bundling penalizes parallelism and memory
transfers, especially when the dimensionality increases, while equal-partitioning han-
dles these issues in a more efficient way.
Table 5.3: Option price for a high-dimensional problem with equal-partitioning. Test configuration: N = 2^20, ∆t = T/M and ν = 2^10.
Table 5.4: SGBM total time (s) for a high-dimensional problem with equal-partitioning. Test configuration: N = 2^20 and ∆t = T/M.
Figure 5.4: CEV model convergence, γ = 1.0; (a) geometric basket put option, (b) arithmetic basket put option, showing the direct and path estimators against the reference price for an increasing number of Monte Carlo steps T/∆t. Test configuration: N = 2^16, ν = 2^10 and d = 2.
For the CEV local volatility model we choose, in Equation (5.1),
\[
\mu_\delta(S_t) = (r_t - q_\delta)\, S_\delta(t), \qquad \sigma_\delta(S_t) = \sigma_\delta\, S_\delta(t)^{\gamma}, \qquad (5.12)
\]
where γ ∈ [0, 1] is a free parameter called the variance elasticity, and q_δ and σ_δ, δ = 1, 2, ..., d, are the constant dividend yield and volatility, respectively.
The computation of the derivatives in Equation (5.10) is performed by using Wolfram
Mathematica 8 [135].
In Figure 5.4, a convergence test regarding the number of Monte Carlo time-steps is
presented. A 2-dimensional problem is considered and γ = 1.0 is used in Equation (5.12)
to compare our approximation with the reference price given by the COS method (in
the case of a geometric Bermudan basket put option) and the original SGBM following
GBM (for an arithmetic Bermudan basket put option). The figure shows that we can get
improved approximations of the option price by increasing the number of Monte Carlo
time-steps.
Figure 5.5: CEV model convergence, γ = 1.0; (a) geometric basket put option, (b) arithmetic basket put option, showing the direct and path estimators against the reference price for an increasing number of Monte Carlo steps T/∆t. Test configuration: N = 2^16, ν = 2^10 and d = 5.
Table 5.5: CEV option pricing. Test configuration: N = 2^16, ∆t = T/4000, ν = 2^10 and d = 2.
In Figure 5.5, the same convergence test as in the d = 2 case is presented for a 5-dimensional
problem. Again, we can see that the approximations are better when the number of
Monte Carlo time-steps is increased. In both cases, d = 2 and d = 5, the direct estimator
gives a better approximation compared to the reference price. It is also observed that the
direct and path estimators for the new approach perform similarly as in the case of the
original SGBM and they can provide a confidence interval for the option price.
Once we find an appropriate combination of the number of Monte Carlo paths, bun-
dles and time-steps, we can use our parallel SGBM method to price products under local
volatility dynamics. In Tables 5.5 and 5.6, the results of pricing 2-dimensional and 5-
dimensional geometric and arithmetic Bermudan basket put options are presented. We
take different values of γ in the CEV model here. Note that we did not encounter any
problems with the Euler-Maruyama discretization of the CEV dynamics in our test cases.
The execution times are around 120 s and 150 s for the 2-dimensional and 5-dimensional problems, respectively.
5.7. CONCLUSIONS
In this chapter, we have presented an efficient implementation of the SGBM on a GPU
architecture. Through the GPU parallelism, we could speed up the execution times when
Table 5.6: CEV option pricing. Test configuration: N = 2^16, ∆t = T/4000, ν = 2^10 and d = 5.
the number of bundles and the dimensionality increase drastically. In addition, we have
proposed a parallel bundling technique which is more efficient in terms of memory use
and more suitable on parallel systems. These two improvements enable us to extend
the method’s applicability and explore more general ways to compute the continuation
values. A general approach for approximating expectations based on the ChF of the discrete process was presented. This approach allowed us to use SGBM for underlying asset
models for which the continuous ChF is not available.
Several results under the CEV local volatility model were presented to show the per-
formance and the accuracy of the new proposed techniques.
Compared with other GPU parallel implementations of early-exercise option pricing
methods, our parallel SGBM is very competitive in terms of computational time and can
solve very high-dimensional problems. Furthermore, we provided a new way to paral-
lelize the backward stage, according to the bundles, which gave us a remarkable perfor-
mance improvement.
We think that the parallel SGBM technique presented forms a fine basis to deal with
Credit Valuation Adjustment of portfolios with financial derivatives in the near future.
The hybrid component in this chapter is represented by the SGBM itself (it combines Monte
Carlo, dynamic programming, bundling and local regression approaches) and the effi-
cient parallel implementation of the method on GPU-based systems. The application of
parallel computing to early-exercise option pricing methods is very challenging. Here, we
have taken advantage of the particular features of SGBM to develop an efficient parallel
GPU version, allowing the use of the method for very high-dimensional problems and un-
der more involved dynamics. In the latter, another interesting component is employed, i.e.
the ChF of the discrete process.
CHAPTER 6
Conclusions and Outlook
6.1. CONCLUSIONS
In this thesis, several numerical methods have been presented to deal with some of the
challenging problems appearing in computational finance. We have based our compu-
tational techniques on the combination of Monte Carlo methods and other advanced
methodologies, resulting in robust and efficient hybrid methods. We have proposed an
“almost exact” simulation technique for the SABR model, which is a standard model in
the FX and interest rate markets. The data-driven COS (ddCOS) method has been intro-
duced in the thesis, which may be a useful method for applications in risk management.
We have also developed a parallel GPU version of the Stochastic Grid Bundling Method,
which opens up the applicability towards very high-dimensional problems. In the fol-
lowing paragraphs, we briefly summarize the most important findings in the thesis.
In Chapter 2, a one time-step Monte Carlo method to simulate the SABR dynam-
ics has been developed. It is based on an efficient computational procedure for the
time-integrated variance. The technique employs a Fourier method to approximate the
marginal distribution of the time-integrated variance. Then, we have used a copula to
approximate the conditional distribution (time-integrated variance conditional on the
initial volatility). The result is a fast and accurate one time-step Monte Carlo method for the SABR model, which may be used in the context of pricing European options with a
contract maturity of up to two years. By several numerical tests, we have shown that our
technique does not suffer from the well-known issues of the closed-form Hagan formula.
The latter formula is not accurate when small strike values and high volatilities are con-
sidered. As an alternative to this formula, our method may be suitable for calibration
purposes.
In Chapter 3, we have defined an accurate and robust multiple time-step Monte
Carlo method to simulate the SABR model with only a few time-steps, the so-called
mSABR method. The technique represents an extension of the method presented in
Chapter 2. Within the mSABR method we have incorporated several non-trivial method
components, like the use of a copula, the stochastic collocation interpolation technique
and a correlation approximation. The copula is employed to approximate the condi-
tional time-integrated variance process. The marginal distribution of the time-integrated
variance has been derived by means of Fourier techniques and numerical integration.
Due to the use of multiple time-steps, the straightforward generation of Monte Carlo
paths becomes computationally unaffordable. To deal with this computational cost is-
sue, we have applied an efficient sampling technique based on stochastic collocation
interpolation. The mSABR method appears to be an attractive SABR simulation technique with an impressive ratio of accuracy and performance. The convergence and the stability of the method have been numerically assessed, whereby the mSABR method requires only a few time-steps to achieve high accuracy. By the performed experiments, we have shown that the mSABR scheme reduces computational costs compared to other
Monte Carlo-based approaches when accurate results are required. As a multiple time-
step Monte Carlo method, it can be applied to pricing path-dependent options, like bar-
rier options. Furthermore, we have considered the extreme situation of negative interest
rates, for which the mSABR method (in combination with a shifted model) also performs
highly satisfactory.
6.2. OUTLOOK
We think that the challenges in computational finance will necessarily require hybrid
and interdisciplinary computational approaches to come up with robust, accurate and
efficient solutions.
In Chapters 2 and 3, we have chosen components of the COS methodology as the
Fourier inversion technique to approximate the cumulative distribution function of the
time-integrated variance. Many alternatives can also be found in the literature. A par-
ticular interesting alternative is the so-called SWIFT method [102], which is based on
Shannon wavelets. The locality features of these basis functions make the technique
particularly suitable for the case of options with long maturities. The use of the SWIFT
method within the mSABR method should, for example, give fast and accurate approxi-
mations. By this, CPU time of the mSABR method may be further reduced.
The use of the clever interpolation based on stochastic collocation already gave us a
remarkable improvement in terms of computational cost in Chapter 3. In order to fur-
ther reduce execution times, the use of high-performance computing also appears to be
an interesting option in this context. In Chapter 5, GPU computing has been success-
fully employed. The parallelization may also be directly applied to the SCMC method or
to the sample generation within the mSABR method.
In the ddCOS method presented in Chapter 4, we have exploited the relation between
the Fourier cosine expansions and the density estimation from the statistical learning
framework. Again, the use of other basis functions, like wavelets, may form a possible extension. Specifically, Haar wavelets have particular features like locality and compact support, guaranteeing positive densities and allowing an efficient treatment of dynamic data.
Another natural extension for the ddCOS method would be to consider density esti-
mation in higher dimensions. While in the context of the regularization approach, this
can be a challenging task, the connection between the ddCOS method and the empirical
ChF may form an interesting line of research to explore.
As mentioned, the ddCOS method has been derived in the framework of statistical
learning for density estimation. The density estimation problem has been intensively
studied from different points of view in recent years. Particularly interesting is the
aspect of the machine learning approach and the algorithms originating from it, like
support vector machines, neural networks or deep learning. It would be highly interest-
ing to develop the ddCOS method further in the context of machine learning and data
science, especially because so much data is generated within the financial world.
In Chapter 5, finally, we have mainly worked with very basic dynamics for the under-
lying asset, i.e., geometric Brownian motion (GBM). In [29], the authors have considered
the Merton jump-diffusion model, deriving analytic formulas for the early-exercise com-
putation. Due to the increased asset path generation complexity, the gain provided by
a GPU parallel version should be even higher than in the GBM case. The inclusion of
several underlying models may give the opportunity of applying parallel SGBM towards
more complex problems, like the problem of Credit Valuation Adjustment of a financial
portfolio in the context of counterparty credit risk management.
The future for hybrid solution methods, data-driven technologies and parallel com-
puting seems bright within finance, insurance and economics.
References
[1] Lokman A. Abbas-Turki and Bernard Lapeyre. American options pricing on multi-
core graphic cards. In 2009 International Conference on Business Intelligence and
Financial Engineering, pages 307–311. IEEE, 2009.
[2] Leif B. G. Andersen. Efficient simulation of the Heston stochastic volatility model.
Journal of Computational Finance, 11(3):1–22, 2008.
[4] Alexandre Antonov, Michael Konikov, and Michael Spector. SABR spreads its
wings. Risk Magazine, pages 58–63, August 2013.
[5] Alexandre Antonov, Michael Konikov, and Michael Spector. The free boundary
SABR: natural extension to negative rates. Risk Magazine, pages 58–63, August
2015.
[6] Philippe Barbe, Christian Genest, Kilani Ghoudi, and Bruno Rémillard. On
Kendall’s process. Journal of Multivariate Analysis, 58(2):197–229, 1996.
[7] Michael Benguigui and Françoise Baude. Fast American basket option pricing on a
multi-GPU cluster. In Proceedings of the 22nd High Performance Computing Sym-
posium, pages 1–8, Tampa, FL, USA, Apr 2014. 8 pages.
[8] Eric Benhamou. Fast Fourier Transform for discrete Asian options. Journal of
Computational Finance, 6(1):49–68, 2002.
[9] Daniel Berg. Copula goodness-of-fit testing: an overview and power comparison.
The European Journal of Finance, 15(7-8):675–701, 2009.
[12] Anastasia Borovykh, Andrea Pascucci, and Cornelis W. Oosterlee. Pricing Bermu-
dan Options Under Local Lévy Models with Default. Journal of Mathematical
Analysis and Applications, 450(2):929–953, 2017.
[13] Svetlana Boyarchenko and Sergei Levendorskii. Efficient variations of the Fourier
transform in applications to option pricing. Journal of Computational Finance,
18(2):57–90, 2014.
[15] Mark Broadie and Özgür Kaya. Exact simulation of stochastic volatility and other
affine jump diffusion processes. Operations Research, 54(2):217–231, March 2006.
[16] Nicola Cantarutti, João Guerra, Manuel Guerra, and Maria do Rosário Grossinho.
Option pricing in exponential Lévy models with transaction costs, 2016. Available
at arXiv: https://arxiv.org/abs/1611.00389v3.
[18] Luca Capriotti and Michael B. Giles. Algorithmic differentiation: adjoint Greeks
made easy. Risk Magazine, 25(10).
[19] Peter Carr and Dilip B. Madan. Option valuation using the Fast Fourier transform.
Journal of Computational Finance, 2:61–73, 1999.
[21] Bin Chen, Cornelis W. Oosterlee, and Hans van der Weide. A low-bias simulation
scheme for the SABR stochastic volatility model. International Journal of Theoret-
ical and Applied Finance, 15(2):1250016–1 – 1250016–37, 2012.
[22] Nan Chen and Paul Glasserman. Malliavin Greeks without Malliavin calculus.
Stochastic Processes and their Applications, 117(11):1689–1723, 2007. Recent De-
velopments in Mathematical Finance: Special issue based on the CCCP Meeting,
April 2006, New York, NY.
[23] Rongda Chen and Lean Yu. A novel nonlinear Value-at-Risk method for modeling
risk of option portfolio with multivariate mixture of normal distributions. Eco-
nomic Modelling, 35:796–804, 2013.
[24] Umberto Cherubini and Elisa Luciano. Value-at-Risk trade-off and capital alloca-
tion with copulas. Economic notes, 30(2):235–256, 2001.
[25] Umberto Cherubini and Elisa Luciano. Bivariate option pricing with copulas. Ap-
plied Mathematical Finance, 9(2):69–85, 2002.
[26] Umberto Cherubini, Sabrina Mulinacci, Fabio Gobbi, and Silvia Romagnoli. Dy-
namic copula methods in finance, volume 625. John Wiley & Sons, 2011.
[28] Rafael Company, Vera Egorova, Lucas Jódar, and Carlos Vázquez. Computing
American option price under regime switching with rationality parameter. Com-
puters and Mathematics with Applications, 72(3):741 – 754, 2016.
121
[29] Fei Cong and Cornelis W. Oosterlee. Pricing Bermudan options under Merton
jump-diffusion asset dynamics. International Journal of Computer Mathematics,
92(12):2406–2432, 2015.
[30] Shane Cook. CUDA programming: a developer’s guide to parallel computing with
GPUs. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1st edition,
2013.
[31] John C. Cox. Note on option pricing I: Constant Elasticity of Variance diffusions.
Journal of Portfolio Management, 22:15–17, 1996.
[38] Verche Cvetanoska and Toni Stojanovski. Using high performance computing and
Monte Carlo simulation for pricing American options. CoRR, abs/1205.0106, 2012.
[41] Mark H.A. Davis and Martin P. Johansson. Malliavin Monte Carlo Greeks for jump
diffusions. Stochastic Processes and their Applications, 116(1):101–129, 2006.
[43] Jacques du Toit and Uwe Naumann. Adjoint Algorithmic Differentiation Tool
Support for Typical Numerical Patterns in Computational Finance. To appear
in Journal of Computational Finance, 2017. Available at NAG Technical Reports:
https://www.nag.co.uk/doc/techrep/pdf/tr3_14.pdf.
[44] Daniel J. Duffy. Finite difference methods in financial engineering - a partial differ-
ential equation approach. John Wiley & Sons Ltd, 2006.
122 R EFERENCES
[45] Valdo Durrleman, Ashkan Nikeghbali, and Thierry Roncalli. Which copula is the
right one?, 2000. Available at SSRN: http://ssrn.com/abstract=1032545.
[46] Vera N. Egorova, Shi-Hau Tan, Choi-Hong Lai, Rafael Company, and Lucas Jó-
dar. Moving boundary transformation for American call options with transaction
cost: finite difference methods and computing. International Journal of Computer
Mathematics, 94(2):345–362, 2017.
[47] Paul Embrechts, Alexander McNeil, and Daniel Straumann. Correlation: pitfalls
and alternatives. Risk Magazine, 12:69–71, 1999.
[48] Bert Van Es, Peter Spreij, and Harry Van Zanten. Nonparametric volatility density
estimation. Bernoulli, 9(3):451–465, 2003.
[49] Fang Fang and Cornelis W. Oosterlee. A novel pricing method for European op-
tions based on Fourier-cosine series expansions. SIAM Journal on Scientific Com-
puting, 31:826–848, 2008.
[50] Fang Fang and Cornelis W. Oosterlee. Pricing early-exercise and discrete barrier
options by Fourier-cosine series expansions. Numerische Mathematik, 114(1):27–
62, 2009.
[51] Rob Farber. CUDA Application, design and development. Elsevier, 2011.
[52] Massimiliano Fatica and Everett Phillips. Pricing American options with least
squares Monte Carlo on GPUs. In Proceedings of the 6th Workshop on High Per-
formance Computational Finance, WHPCF ’13, pages 5:1–5:6, New York, NY, USA,
2013. ACM.
[53] Eric Fournié, Jean-Michel Lasry, Jérôme Lebuchoux, Pierre-Louis Lions, and Nizar
Touzi. Applications of Malliavin calculus to Monte Carlo methods in finance. Fi-
nance and Stochastics, 3(4):391–412, 1999.
[54] Beatrice Gaviraghi, Andreas Schindele, Mario Annunziato, and Alfio Borzì. On
optimal sparse-control problems governed by jump-diffusion processes. Applied
Mathematics, 7:1978–2004, 2016.
[55] Christian Genest and Louis-Paul Rivest. Statistical inference procedures for bi-
variate Archimedean copulas. Journal of the American Statistical Association,
88(423):1034–1043, 1993.
[56] Michael B. Giles. Smoking adjoints: fast Monte Carlo Greeks. Risk Magazine,
19(1):88–92.
[57] Michael B. Giles. Multi-level Monte Carlo path simulation. Operations Research,
56(3):607–617, 2008.
[58] Michael B. Giles. Vibrato Monte Carlo Sensitivities. In Pierre L’ Ecuyer and Art B.
Owen, editors, Monte Carlo and Quasi-Monte Carlo Methods 2008, pages 369–382.
Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.
123
[60] Paul Glasserman and Zongjian Liu. Sensitivity estimates from characteristic func-
tions. Operations Research, 58(6):1611–1623, 2010.
[61] Paul Glasserman and Zongjian Liu. Estimating greeks in simulating Lévy-driven
models. Journal of Computational Finance, 14(2):3–56, 2011.
[62] Emmanuel Gobet, José G. López-Salas, Plamen Turkedjiev, and Carlos Vázquez.
Stratified regression Monte-Carlo scheme for semilinear PDEs and BSDEs with
large scale parallelization on GPUs. SIAM Journal on Scientific Computing,
38(6):C652–C677, 2016.
[63] Lech A. Grzelak and Cornelis W. Oosterlee. On the Heston model with stochastic
interest rates. SIAM Journal on Financial Mathematics, 2(1):255–286, 2011.
[64] Lech A. Grzelak and Cornelis W. Oosterlee. An equity-interest rate hybrid model
with stochastic volatility and the interest rate smile. Journal of Computational
Finance, 15(4):45–77, 2012.
[66] Tinne Haentjens and Karel J. In ’t Hout. Alternating direction implicit finite differ-
ence schemes for the Heston–Hull–White partial differential equation. Journal of
Computational Finance, 16:83–110, 2012.
[67] Patrick S. Hagan, Deep Kumar, Andrew S. Lesniewski, and Diana E. Woodward.
Managing smile risk. Wilmott Magazine, pages 84–108, 2002.
[68] Patrick S. Hagan, Deep Kumar, Andrew S. Lesniewski, and Diana E. Woodward.
Arbitrage-Free SABR. Wilmott Magazine, (69):60–75, 2014.
[70] Martin B. Haugh and Leonid Kogan. Pricing American options: a duality approach.
Operations Research, 52(2):258–270, 2004.
[71] Christian Hendricks, Christof Heuer, Matthias Ehrhardt, and Michael Günther.
High-order ADI finite difference schemes for parabolic equations in the combina-
tion technique with application in finance. Journal of Computational and Applied
Mathematics, 316:175 – 194, 2017.
[74] Kazuyuki Ishiyama. Methods for evaluating density functions of exponential func-
tionals represented as integrals of geometric Brownian motion. Methodology and
Computing in Applied Probability, 7(3):271–283, 2005.
[75] Othmane Islah. Solving SABR in exact form and unifying it with LIBOR market
model, 2009. Available at SSRN: http://ssrn.com/abstract=1489428.
[76] Shashi Jain and Cornelis W. Oosterlee. The Stochastic Grid Bundling Method: Ef-
ficient pricing of Bermudan options and their Greeks. Applied Mathematics and
Computation, 269:412–431, 2015.
[77] Bilal Jan, Bartolomeo Montrucchio, Carlo Ragusa, Fiaz Gul Khan, and Omar Khan.
Fast parallel sorting algorithms on GPUs. International Journal of Distributed and
Parallel systems, 14(1), 2012.
[78] Joanne E. Kennedy and Duy Pham. On the Approximation of the SABR with
mean reversion model: a Probabilistic Approach. Applied Mathematical Finance,
21(5):451–481, 2014.
[80] Soňa Kilianova and Daniel Ševčovič. A transformation method for solving the
Hamilton-Jacobi-Bellman equation for a constrained dynamic stochastic optimal
allocation problem. The ANZIAM Journal, 55(01):14–38, 2013.
[82] David B. Kirk and Wen-mei W. Hwu. Programming massively parallel processors: a
hands-on approach. Elsevier, 2010.
[83] Peter E. Kloeden and Eckhard Platen. Numerical solution of stochastic differential
equations. Springer–Verlag, 1992.
[84] Donald E. Knuth. The art of computer programming, volume 3: (2nd Ed.) Sorting
and searching. Addison Wesley Longman Publishing Co., Inc., Redwood City, CA,
USA, 1998.
[85] Hisashi Kobayashi, Brian L. Mark, and William Turin. Probability, random pro-
cesses, and statistical analysis. Cambridge, 2012.
[86] Ralf Korn and Songyin Tang. Exact analytical solution for the normal SABR model.
Wilmott Magazine, 2013(66):64–69, 2013.
[87] R. A. Kronmal and M. E. Tarter. The estimation of probability densities and cumu-
latives by Fourier series methods. Journal of the American Statistical Association,
63(323):925–952, 1968.
125
[88] Álvaro Leitao, Lech A. Grzelak, and Cornelis W. Oosterlee. On a one time-step
Monte Carlo simulation approach of the SABR model: application to European
options. Applied Mathematics and Computation, 293:461–479, 2017.
[89] Álvaro Leitao, Lech A. Grzelak, and Cornelis W. Oosterlee. On an efficient multiple
time step Monte Carlo simulation of the SABR model. Quantitative Finance, 2017.
[90] Álvaro Leitao and Cornelis W. Oosterlee. GPU Acceleration of the Stochastic Grid
Bundling Method for Early-Exercise options. International Journal of Computer
Mathematics, 92(12):2433–2454, 2015.
[91] Álvaro Leitao, Cornelis W. Oosterlee, Luis Ortiz-Gracia, and Sander M. Bohte. On
the data-driven COS method. Submitted for publication, 2017.
[93] David X. Li. On default correlation: a copula function approach. Journal of Fixed
Income, 9:43–54, 2000.
[94] Francis A. Longstaff and Eduardo S. Schwartz. Valuing American options by simu-
lation: a simple least-squares approach. Review of Financial Studies, 14(1):113–47,
2001.
[95] Roger Lord and Adam Farebrother. Fifty shades of SABR simulation, 2014. Presen-
tation at 10th Fixed Income Conference, WBS Training, Barcelona, Spain. Available
at http://www.rogerlord.com/fiftyshadessabrwbs.pdf.
[97] Paul Malliavin and Maria Elvira Mancino. A Fourier transform method for non-
parametric estimation of multivariate volatility. Annals of Statistics, 37(4):1983–
2010, 2009.
[98] Marko J. Misic and Milo V. Tomasevic. Data sorting using Graphics Processing
Units. Telfor Journal, 4(1), 2012.
[100] Jan Obloj. Fine-tune your smile: Correction to Hagan et al, 2008. Available at arXiv:
http://arxiv.org/abs/0708.0998v3.
[101] Luis Ortiz-Gracia and Cornelis W. Oosterlee. Efficient VaR and Expected Shortfall
computations for nonlinear portfolios within the delta-gamma approach. Applied
Mathematics and Computation, 244:16–31, 2014.
126 R EFERENCES
[102] Luis Ortiz-Gracia and Cornelis W. Oosterlee. A highly efficient Shannon wavelet
inverse Fourier technique for pricing European options. SIAM Journal on Scientific
Computing, 38(1):118–143, 2016.
[103] Gilles Pagès and Benedikt Wilbertz. GPGPUs in computational finance: massive
parallel computing for American style options. Concurrency and Computation:
Practice and Experience, 24(8):837–848, 2012.
[104] Louis Paulot. Asymptotic implied volatility at the second order with application to
the SABR model, 2009. Available at SSRN: http://ssrn.com/abstract=1413649.
[105] Vladimir Piterbarg. A multi-currency model with FX volatility skew, 2005. Available
at SSRN: http://ssrn.com/abstract=685084.
[107] Nicolas Privault and Jiadong Yu. Stratified approximations for the pricing of op-
tions on average. Journal of Computational Finance, 19(4):95–113, 2016.
[111] Maria J. Ruijter and Cornelis W. Oosterlee. Numerical Fourier Method and Second-
order Taylor Scheme for Backward SDEs in Finance. Applied Numerical Mathe-
matics, 103(C):1–26, May 2016.
[112] Maria J. Ruijter, Mark Versteegh, and Cornelis W. Oosterlee. On the application
of spectral filters in a Fourier option pricing technique. Journal of Computational
Finance, 19(1):75–106, 2014.
[113] Jason Sanders and Edward Kandrot. CUDA by example: an introduction to general-
purpose GPU programming. Addison-Wesley, 2011.
[115] Nadathur Satish, Mark Harris, and Michael Garland. Designing efficient sorting
algorithms for manycore GPUs. In Proceedings of the 23rd IEEE International Par-
allel and Distributed Processing Symposium, 2009.
[116] Sebastian Schlenkrich, André Miemiec, and Tilman Wolff-Siemssen. Low strike
extrapolation for SABR. Wilmott Magazine, 2014.
127
[117] Mark Schroder. Computing the Constant Elasticity of Variance option pricing for-
mula. Journal of Finance, 44(1):211–219, 1989.
[118] Rituparna Sen and Changie Ma. Forecasting density function: application in fi-
nance. Mathematical Finance, 5:433–447, 2015.
[120] Bernard W. Silverman. Density estimation for statistics and data analysis. Chap-
man & Hall, London, 1986.
[121] Johannes V. Siven, Jeffrey T. Lins, and Anna Szymkowiak-Have. Value-at-Risk com-
putation by Fourier inversion with explicit error bounds. Finance Research Letters,
6(2):95–105, 2009.
[124] Robert D. Smith. An almost exact simulation method for the Heston model. Jour-
nal of Computational Finance, 11(1):115–125, 2007.
[125] Maria Suárez-Taboada and Carlos Vázquez. Numerical solution of a PDE model
for a ratchet-cap pricing with BGM interest rate dynamics. Applied Mathematics
and Computation, 218(9):5217–5230, 2012.
[126] Krishnahari Thouti and S.R. Sathe. An OpenCL method of parallel sorting algo-
rithms for GPU architecture. International Journal of Experimental Algorithms,
3(1), 2012.
[128] John N. Tsitsiklis and Benjamin Van Roy. Regression methods for pricing complex
American-style options. IEEE transactions on Neural networks, 12(4):694–703, July
2001.
[130] Alexander van Haastrecht and Antoon A.J. Pelsser. Efficient, almost exact simula-
tion of the Heston stochastic volatility model. International Journal of Theoretical
and Applied Finance, 13(01):1–43, 2010.
[133] Paul Wilmott, Sam Howison, and Jeff Dewynne. The mathematics of financial
derivatives : a student introduction. Cambridge University Press, Cambridge, UK,
New York, 1995.
128 R EFERENCES
[134] Nicholas Wilt. The CUDA handbook: a comprehensive guide to GPU programming.
Addison-Wesley, 2013.
[136] Qi Wu. Series expansion of the SABR joint density. Mathematical Finance,
22(2):310–345, 2012.
[137] Bowen Zhang and Cornelis W. Oosterlee. Efficient pricing of European-style Asian
options under exponential Lévy processes based on Fourier cosine expansions.
SIAM Journal on Financial Mathematics, 4(1):399–426, 2013.
Curriculum Vitæ
E DUCATION
1997–2001 Secondary school
I.E.S. Illa de Sarón, Xove, Spain
2001–2003 Baccalaureate
I.E.S. María Sarmiento, Viveiro, Spain
2011–2013 Researcher
M2NICA research group, Department of Mathematics
University of A Coruña, A Coruña, Spain
List of Publications
Journal articles:
1. A. Leitao and C. W. Oosterlee, GPU Acceleration of the Stochastic Grid Bundling Method for
Early-Exercise options, International Journal of Computer Mathematics, 92(12):2433–2454,
2015.
Chapters in books:
2. A. Leitao, L. A. Grzelak and C. W. Oosterlee, A highly efficient numerical method for the
SABR model, STRIKE - Novel Methods in Computational Finance. To be published in 2017.
1. A. Leitao and C. W. Oosterlee, Modern Monte Carlo methods and GPU computing, STRIKE
- Novel Methods in Computational Finance. To be published in 2017.
List of Attended Conferences
Presentations:
7. Summer school on Quantitative methods for Risk management in Finance and Insurance,
A Coruña, Spain, July 2016.
6. The 19th European Conference on Mathematics for Industry, Santiago de Compostela, Spain,
June 2016.
3. Stochastics & Computational Finance: from academia to industry, Lisbon, Portugal, July
2015.
1. The 18th European Conference on Mathematics for Industry, Taormina, Italy, June 2014.
Conference papers:
1. A. Leitao and C. W. Oosterlee, On a GPU acceleration of the Stochastic Grid Bundling Method,
In Proceedings of the 18th European Conference on Mathematics for Industry, 2014, Springer
Heidelberg.
Acknowledgement
First and foremost, my most sincere gratitude goes to my supervisor and promotor prof.
Kees Oosterlee. His guidance and advice have been essential for the successful completion
of this thesis. Kees never hesitates to share his knowledge and experience with his
students, keeping motivation and the interest in learning alive throughout the research
process. Furthermore, Kees offered me this opportunity directly, for which I will always
be grateful.
The last four years have been one of the nicest experiences of my life, one that I had the
pleasure of sharing with many excellent people. The (former) students and collaborators
of Kees form a very friendly and wonderful group, not only cooperating and helping each
other but also having fun together. Many thanks to Andrea, Anton, Fei, Gemma, Ki Wai,
Marjon, Peiyao, Prashant, Qian, Sander, Shashi, Shuaiqiang and Zaza (thank you for
the Dutch translation!), and the Spanish community, Luis, María, Marta and Paco.
Thanks also to my colleagues at CWI, with whom I have enjoyed informal discussions
about several, sometimes crazy, topics during lunches and next to the coffee machine:
Anne, Barry, Bart, Benjamin, Bram, Chris, Daan, Daniel, Debarati, Folkert, Fredrik, Inti,
Jan, Jan Willem, Jeroen (×2), Joost, Keith, Krzysztof, Laurent, Nick, Sangeetika, Sirshendu,
Svetlana, Wander, Willem Jan, Xiaodong, Yous and Zhichao. With many of them I have
also enjoyed playing basketball, football, squash or table tennis. The help of the CWI
staff, particularly Nada and Duda, is much appreciated.
My PhD was partially carried out in the framework of a European Multi-ITN project
called STRIKE. I would especially like to thank the coordinators, Matthias Ehrhardt and
Jan ter Maten, for their impressive work. I have spent a very nice time with my colleagues
in the project, the “STRIKErs” Zuzana, Zé, Pedro, Vera, Walter, Nicola, Ivan, Lara, Giang,
Shih-Hau, Beatrice, Radoslav, Christof, Silvie and Fazlollah, during the countless project
events. I also had a great time with Anastasia, Enrico, José Germán, Maarten, Matthieu,
Sidy and Zuzana at different conferences around Europe. It was really a pleasure to have
met all of them. I want to particularly express my gratitude to prof. Carlos Vázquez
Cendón, who encouraged me to take this important step in my career.
I would also like to thank the members of my defence committee for reading my
dissertation and participating in my defence. Special thanks to Lech Grzelak for his
constant support and helpful comments during my PhD research.
Finally, I would like to express my most special gratitude to my family.
Thanks, first of all, to my parents for the upbringing they gave me, of which I am
very proud. They always encouraged and supported me at the important moments,
giving me the best advice. Many thanks to my mother for all her affection. And many
thanks to my father, whom, after nine years without him, I remember every day.
And of course, my deepest thanks to Mela. She has been my great support during the
last four years (and the eleven before them), in which we managed to enjoy and learn
from an exceptional situation. Her help, support and understanding have been essential
to reaching this achievement. Together we will face whatever the future holds for us.