Essentials of Excel VBA, Python, and R
Volume II: Financial Derivatives, Risk Management and Machine Learning
Second Edition
John Lee • Jow-Ran Chang • Lie-Jane Kao • Cheng-Few Lee
John Lee, Center for PBBEF Research, Morris Plains, NJ, USA
Jow-Ran Chang, Department of Quantitative Finance, National Tsing Hua University, Hsinchu, Taiwan
Lie-Jane Kao, College of Finance, Takming University of Science and Technology, Taipei City, Taiwan
Cheng-Few Lee, Rutgers School of Business, The State University of New Jersey, North Brunswick, NJ, USA
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and
regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed
to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty,
expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been
made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional
affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
In the new edition of this book, there are 49 chapters, and they are divided into two volumes.
Volume I, entitled “Microsoft Excel VBA, Python, and R For Financial Statistics and Portfolio
Analysis,” contains 26 chapters. Volume II, entitled “Microsoft Excel VBA, Python, and R
For Financial Derivatives, Financial Management, and Machine Learning,” contains 23
chapters. Volume I is divided into two parts. Part I Financial Statistics contains 21 chapters.
Part II Portfolio Analysis contains five chapters. Volume II is divided into five parts. Part I
Excel VBA contains three chapters. Part II Financial Derivatives contains six chapters. Part III
Applications of Python, Machine Learning for Financial Derivatives, and Risk Management
contains six chapters. Part IV Financial Management contains four chapters, and Part V
Applications of R Programs for Financial Analysis and Derivatives contains three chapters.
Part I of this volume discusses advanced applications of Microsoft Excel Programs.
Chapter 2 introduces Excel programming, Chap. 3 introduces VBA programming, and Chap. 4
discusses professional techniques used in Excel and Excel VBA techniques. There are six
chapters in Part II. Chapter 5 discusses the decision tree approach for the binomial option
pricing model, Chap. 6 discusses the Microsoft Excel approach to estimating alternative option
pricing models, Chap. 7 discusses how to use Excel to estimate implied variance, Chap. 8
discusses Greek letters and portfolio insurance, Chap. 9 discusses portfolio analysis and option
strategies, and Chap. 10 discusses simulation and its application.
There are six chapters in Part III, which describe applications of Python, machine learning for
financial analysis, and risk management. These six chapters are Linear Models for Regression
(Chap. 11), Kernel Linear Model (Chap. 12), Neural Networks and Deep Learning (Chap. 13),
Applications of Alternative Machine Learning Methods for Credit Card Default Forecasting
(Chap. 14), An Application of Deep Neural Networks for Predicting Credit Card Delinquencies
(Chap. 15), and Binomial/Trinomial Tree Option Pricing Using Python (Chap. 16).
Part IV shows how Excel can be used to perform financial management. Chapter 17 shows
how Excel can be used to perform financial ratio analysis, Chap. 18 shows how Excel can be
used to perform time value of money analysis, Chap. 19 shows how Excel can be used to perform
capital budgeting under certainty and uncertainty, and Chap. 20 shows how Excel can be used
for financial planning and forecasting. Finally, Part V discusses applications of R programs for
financial analysis and derivatives. Chapter 21 discusses the theory and application of hedge
ratios. In this chapter, we show how the R program can be used for hedge ratios in terms of
three econometric methods. Chapter 22 discusses applications of a simultaneous equation in
finance research in terms of the R program. Finally, Chap. 23 discusses how to use the R
program to estimate the binomial option pricing model and the Black and Scholes option
pricing model.
In this volume, Chap. 14 was contributed by Huei-Wen Teng and Michael Lee. Chapter 15
was contributed by Ting Sun, and Chap. 22 was contributed by Fu-Lai Lin.
There are two possible applications of this volume:
A. to supplement financial derivative and risk management courses.
B. to teach students how to use Excel VBA, Python, and R to analyze financial derivatives
and perform risk management.
In sum, this book can be used in academic courses and by practitioners in the financial industry. Finally, we appreciate the extensive help of our assistants Xiaoyi Huang and Natalie Krawczyk.
Contents

1 Introduction
  1.1 Introduction
  1.2 Brief Description of Chap. 1 of Volume 1
  1.3 Structure of This Volume
    1.3.1 Excel VBA
    1.3.2 Financial Derivatives
    1.3.3 Applications of Python, Machine Learning for Financial Derivatives, and Risk Management
    1.3.4 Financial Management
    1.3.5 Applications of R Programs for Financial Analysis and Derivatives
  1.4 Summary
time value of money in mortgage payment in an investment decision.

We discuss capital budgeting under certainty and uncertainty in Chap. 19. More specifically, we discuss the capital budgeting process; the cash-flow evaluation of alternative investment projects; NPV and IRR methods; capital-rationing decisions with Excel; the statistical distribution method with Excel; the decision tree method for investment decisions with Excel; and simulation methods with Excel.

Financial planning and forecasting are discussed in Chap. 20. We talk about procedures for financial planning and analysis; the algebraic simultaneous equations approach to financial planning and analysis; and the procedure of using Excel for financial planning and forecasting.

1.3.5 Applications of R Programs for Financial Analysis and Derivatives

Lastly, Part V contains three chapters, which show how R programming can be useful for financial analysis and derivatives. In Chap. 21 of this part, we discuss theories and applications of hedge ratios. We talk about alternative theories for deriving the optimal hedge ratio; alternative methods for estimating the optimal hedge ratio; using OLS, GARCH, and CECM models to estimate the optimal hedge ratio; and hedging horizon, maturity of futures contract, data frequency, and hedging effectiveness.

In Chap. 22, we first discuss the simultaneous equation model for investment, financing, and dividend decisions. Then we show how the R program can be used to estimate the empirical results of investment, financing, and dividend decisions in terms of two-stage least squares, three-stage least squares, and generalized method of moments.

In Chap. 23, we review binomial, trinomial, and American option pricing models, which were previously discussed in Chaps. 5 and 6. We then show how the R program can be used to estimate the binomial option pricing model and the Black–Scholes option pricing model.

1.4 Summary

In this volume, we have shown how Excel VBA can be used to evaluate binomial, trinomial, and American option models. In addition, we also showed how implied variance in terms of the Black–Scholes and CEV models can be estimated. Option strategy and portfolio analysis are also explored in some detail. We have also shown how Excel can be used to perform different simulation models.

We also showed how Python can be used for regression analysis and credit analysis in this volume. In addition, the application of Python in estimating binomial and trinomial option pricing models is also discussed in some detail.

The application of the R language to estimate hedge ratios and investigate the relationship among investment, financing, and dividend policy is also discussed in this volume. We also show how the R language can be used to estimate binomial option trees. Finally, in Part V we also show how the R language can be used to estimate option pricing for individual stocks, stock indices, and currency options.
Part I
Excel VBA
2 Introduction to Excel Programming and Excel 365 Only Features
Next, highlight the words above before we start using Excel's macro recorder to generate the VBA code. To highlight the list, first select the word “John,” then press the Shift key on the keyboard, and while pressing the Shift key, press the down-arrow key on the keyboard three times. The result is shown below.
2.2 Excel's Macro Recorder
Now let's turn on Excel's macro recorder. To do this, we would choose Developer → Record Macro. The steps to do this are shown below.
Choosing the Record Macro menu item would result in the Record Macro dialog box shown below. Next, type “FormatWords” in the Macro name: option to indicate the name of our macro. After doing this, press the OK button.
Let’s first bolden the words by pressing Ctrl + B key
combination on the keyboard or press the B button under the
Home tab. The result of this action is shown below.
Next, italicize the words by pressing the Ctrl + I key combination on the keyboard or pressing the I button under the Home tab. The result of this action is shown below.

Next, underline the words by pressing the Ctrl + U key combination on the keyboard or pressing the U button under the Home tab. The result of this action is shown below.
Next, center the words by pressing the Center button under the Home tab. The result of this action is shown below.
The next thing to do is stop Excel's macro recorder by clicking on the Stop Recording button under the Developer tab. The result of this action is shown below.
Let’s look at the resulting VBA code that Excel created by pressing the Alt + F8 key combination on the keyboard or
clicking on the Macro button on the Developer tab.
Clicking on the Macro button will result in the Macro dialog box shown below.
The Macro dialog box shows all the available macros in a workbook. Currently, the Macro dialog box shows one macro, the macro that we created. Let's now look at the “FormatWords” macro that we created. To look at this macro, highlight the macro name and then press the Edit button on the Macro dialog box. Pressing the Edit button will open the Microsoft Visual Basic Editor (VBE). The below shows the VBA code created by Excel's macro recorder.
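The recorded listing appears in a figure that is not reproduced here, but a macro recorded for the bold, italicize, underline, and center steps above typically looks like the following sketch (the recorder's actual output can differ slightly by Excel version, for example by wrapping the font settings in a With block):

```vba
' Sketch of the recorded "FormatWords" macro: applies the four
' formatting actions to whatever cells are currently selected.
Sub FormatWords()
    Selection.Font.Bold = True
    Selection.Font.Italic = True
    Selection.Font.Underline = xlUnderlineStyleSingle
    Selection.HorizontalAlignment = xlCenter
End Sub
```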
2.3 Excel's Visual Basic Editor
The previous section recorded the macro “FormatWords.” This section will show how to run that macro. Before we do this,
we will need to set up the worksheet “Sheet2.” The “Sheet2” format is shown below.
We will use the “FormatWords” macro to format the names in worksheet “Sheet2.” To do this, we will need to select the names as shown above and then choose Developer → Macros or press the Alt + F8 key combination.
2.4 Running an Excel Macro
Choosing the Macros menu item will display the Macro dialog box shown below.
The Macro dialog box shows all the macros available for use. Currently, the Macro dialog box shows only the macro that we created. To run the macro that we created, select the macro and then press the Run button as shown above. The below shows the end result after pressing the Run button.
Let’s now add another macro called “FormatWords2” to the workbook shown above. The first thing that we need to do is to
go to the VBE editor by pressing the key combination Alt + F11. Let’s put this macro in another module. Click on the menu
item Module in the menu Insert.
In “Module2,” type in the macro “FormatWords2.” The above shows the two modules and the macro “FormatWords2” in
the VBE. The below also indicates that “Module2” is the active component in the project.
2.5 Adding Macro Code to a Workbook
When the VBA program gets larger, it might make sense to give the modules more meaningful names. In the bottom left of the VBE window, there is a properties window for “Module2.” Shown in the properties window (bottom left corner) is the name property for “Module2.” Let's change the name to “Format.” The below shows the end result. Notice that the project window now shows a “Format” module.
Now let’s go back and look at the Macro dialog box. The below shows the Macro dialog box after typing in the macro
“FormatWords2” into the VBE editor.
The Macro dialog box now shows the two macros that
were created.
In the sections above, we used menu items to run macros. In this section, we will use macro buttons to execute a specific
macro. Macro buttons are used when a specific macro is used frequently. Before we illustrate macro buttons, let’s set up the
worksheet “Sheet3,” as shown below.
To create a macro button, go to the Developer tab and click on the Form Controls button in the Insert menu item, as
shown below.
2.6 Macro Button
After that, click on the cell where we want the button to be located, and the Assign Macro dialog box will be displayed.
The Assign Macro dialog box shows all the available macros to be assigned to the button. Choose the macro “FormatWords2” as shown above and press the OK button. Pressing the OK button will assign the macro “FormatWords2” to the button. The end result is shown below.
Next, select cell A1, move the mouse cursor over the button “Button 1,” and click the left mouse button. This action will result in cell A1 being formatted. The end result is shown below.
The name “Button 1” for the button is probably not a good name. To change the name, move the mouse pointer over the button. After doing this, click the right mouse button to display a shortcut menu for the button. Select Edit Text from the shortcut menu. Change the name to “Format.” The end result is shown below.
2.8 Message Box and Programming Help
It is not necessary, as indicated in the previous section, to go to the Macro dialog box to run the “Hello” subprocedure
shown above. To run this macro, place the cursor inside the procedure and press the F5 key on the keyboard. Pressing the F5
key will result in the following.
Notice that in the message box above, the title of the message box is “Microsoft Excel.” Suppose we want the title of the
message box to be “Hello.” The below shows the VBA code to accomplish this.
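The code figure is not reproduced here, but a sketch of a “Hello” subprocedure that sets the title through the third (Title) argument of MsgBox looks like this (the message text is illustrative):

```vba
' Sketch: the third argument of MsgBox sets the dialog title, so the
' title bar reads "Hello" instead of the default "Microsoft Excel".
Sub Hello()
    MsgBox "Hello World", vbOKOnly, "Hello"
End Sub
```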
The below shows the result of running the above code. Notice that the title of the message box is “Hello.”
The MsgBox command can do a lot of things. But one problem is remembering how to program all the features. The VBE editor is very good at dealing with this specific issue. Notice in the above code that commas separate the arguments of the MsgBox command. This then brings up the question: how many arguments does the VBA MsgBox have? The below shows how the VBE editor assists the programmer in programming the MsgBox command.
We see that after typing the first comma, the VBE editor shows two things. The first is a horizontal list that shows and names all the arguments of the MsgBox command. In that list, the argument that is currently being entered is shown in bold. The second is a vertical list of all the possible values of the argument that we are currently working on. A list is only shown when an argument has a set of predefined values.

If the above two features are insufficient in aiding in how to program the MsgBox command, we can place the cursor on the MsgBox command as shown below and press the F1 key on the keyboard.
The F1 key launches the web browser and navigates to the URL https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/msgbox-function.
2.9 Excel 365 Only Features

2.9.1 Dynamic Arrays

Dynamic arrays are a powerful new feature that is only available in Excel 365. Dynamic arrays return array values to neighboring cells. The URL https://www.ablebits.com/office-addins-blog/2020/07/08/excel-dynamic-arrays-functions-formulas/ defines dynamic arrays as

“resizable arrays that calculate automatically and return values into multiple cells based on a formula entered in a single cell.”

We will demonstrate dynamic arrays on a table that shows the performance of every component of the S&P 500. We will first demonstrate how to retrieve the performance of every component of the S&P 500.

2.9.1.1 Year to Date Performance of S&P 500 Components

We will use Power Query to retrieve the year-to-date performance of every component of the S&P 500 from the URL https://www.slickcharts.com/sp500/performance.

Step 1 is to click on the From Web button on the Data tab.

Step 2 is to enter the URL https://www.slickcharts.com/sp500/performance and then press the OK button.
Step 3 is to click on Table 0 and then click on the Transform Data button.
Step 4 is to right-mouse click on Table 0, and click on the Rename menu item.
Step 5 is to click on Close & Load to load the S&P 500 YTD returns to Microsoft Excel.
The Power Query result is saved in an Excel table, and the Excel table has the same name as the query SP500YTD. When
a cell is inside an Excel table, the Table Design menu appears.
2.9.1.3 FILTER Function

The FILTER function is a new Excel 365 function to handle and filter dynamic arrays. The following FILTER function shows all S&P 500 companies that start with the letter “G.”
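The book's formula appears in a figure that is not reproduced here. As a sketch, assuming the query result is loaded as a table named SP500YTD with a column named Company (the column name is an assumption), such a FILTER call could look like:

```
=FILTER(SP500YTD, LEFT(SP500YTD[Company], 1) = "G", "No matches")
```

The second argument is the include condition evaluated row by row, and the optional third argument is returned when no rows match.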
2.9.2.1.1 Stock
The below steps demonstrate the retrieval of stock attributes.
Step 1. Select tickers and then click on the Stocks button.
The below shows some of the attributes available for the Stock data type.
2.9.3 STOCKHISTORY Function

The Stocks data type returns only the current price of an instrument. Use the STOCKHISTORY function to return a range of prices for an instrument. Historical data is returned as a dynamic array. This is indicated by the blue border around the historical data.
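As an illustrative sketch (the ticker and dates are assumptions, not the book's example), a STOCKHISTORY call that spills daily dates and closing prices looks like:

```
=STOCKHISTORY("MSFT", DATE(2023,1,1), DATE(2023,12,31))
```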
To know more about the STOCKHISTORY function, click on the Insert Function icon to get the Function Arguments
dialog box.
By default, the historical data shown by the STOCKHISTORY function is in date ascending order. Often it is preferred in date descending order. To accomplish this, use the SORT function to show the historical data in date descending order.
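A sketch of wrapping STOCKHISTORY in SORT for this purpose (ticker and dates are illustrative): SORT's second argument picks the sort column (1 = the date column) and the third sets the order (-1 = descending).

```
=SORT(STOCKHISTORY("MSFT", DATE(2023,1,1), DATE(2023,12,31)), 1, -1)
```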
The range command above is used to reference specific cells of a worksheet. So, if the worksheet “Sheet1” is the active
worksheet, cell A5 of worksheet “Sheet1” will be populated with the value of 100. This is shown below.
Cell A5 in worksheet “Sheet1” has the value of 100, and not cell A5 in the other worksheets of the workbook. But if we run the above macro when worksheet “Sheet2” is active, cell A5 in worksheet “Sheet2” will be populated with the value of 100. To solve this issue, experienced programmers will rewrite the above VBA procedure as shown below.
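The listings themselves are in figures not reproduced here; sketches consistent with the discussion (procedure names follow the text's "Example1"/"Example2") are:

```vba
' Unqualified: writes to cell A5 of whichever worksheet is active.
Sub Example1()
    Range("A5").Value = 100
End Sub

' Fully qualified: always writes to Sheet1 of this workbook,
' regardless of which worksheet is active when the macro runs.
Sub Example2()
    ThisWorkbook.Worksheets("Sheet1").Range("A5").Value = 100
End Sub
```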
3.2 Excel's Object Model
Notice that the VBA code line is longer in the procedure “Example2” than in the procedure “Example1.” To understand why, we will need to look at Excel's object model. We can think of Excel's object model as an upside-down tree. A lot of Excel VBA programming is basically traversing the tree. In VBA programming, moving from one level of the tree to another level is indicated by a period. The VBA code in the procedure “Example2” traverses Excel's object model through three levels.

Among all Microsoft Office products, Excel has the most detailed object model. When we talk about object models, we are talking about concepts that a professional programmer would talk about. When we are talking about object models, there are three words that even a novice must know. Those three words are objects, properties, and methods. These words can take up chapters or even books to explain. A very crude but somewhat effective way to think about what these words mean is to think about English grammar. We can crudely equate objects with nouns, properties with adjectives, and methods with adverbs. In Excel, some examples of objects are worksheets, workbooks, and charts. These objects have properties that describe them and methods that act on them.

In the Excel object model, there is a parent and child relationship between objects. The topmost object is the Excel object. A frequently used object and a child of the Excel object is the workbook object. Another frequently used object and a child of the workbook object is the worksheet object. Another frequently used object and a child of the worksheet object is the range object. If we look at the Excel object model, we will be able to see the relationship between the Excel object, the workbook object, the worksheet object, and the range object.

We can use the help in the VB Editor (VBE) to look at the Excel object model. To do this, we would need to choose Help → Microsoft Visual Basic for Applications Help.
3 Introduction to VBA Programming
Intellisense is a great aid to the VBA programmer in dealing with the methods, properties, and child objects of each object.
Another tool to aid the VBA programmer in dealing with the Excel object model is the Object Browser. To view the Object Browser, choose View → Object Browser. This is shown below.
The below shows how to view the Excel object model from the Object Browser.
The below shows the objects, properties, and methods for the Worksheet object.
3.4 Object Browser
The list above indicates the object models used by the Visual Basic Editor. Of all the object models shown above, the
VBA object model is used most after the Excel object model. The below shows the VBA object model in the object browser.
The main reason that an Excel VBA programmer uses the VBA object model is that the VBA object model provides a lot
of useful functions. Professional programmers will say that the functions of an object model are properties of an object
model. For example, for the Left function shown above, we can say that the Left function is a property of the VBA object
model. The below shows an example of using the property Left of the VBA object model.
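The example itself is in a figure not reproduced here; a sketch of qualifying the Left function with the VBA object model (the string literal is illustrative) is:

```vba
' VBA.Left is the Left function accessed through the VBA object model;
' it returns the leftmost n characters of a string.
Sub Example_Left()
    MsgBox VBA.Left("Essentials", 3)   ' shows "Ess"
End Sub
```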
Many times, an Excel VBA programmer will write macros that use both Microsoft Excel and Microsoft Access. To do this, we would need to set up the VBE so that it can also use Access's object model. To do this, we would first have to choose Tools → References in the VBE. This is shown below.
In the above References dialog box, the Excel object model is selected. The bottom of the dialog box shows the location
of the file that contains Excel’s object model. The file that contains an object model is called a type library.
To program Microsoft Access while programming Excel, we will need to find the type library for Microsoft Access. The
below shows the Microsoft Access object model being selected.
If we press the OK button and go back to the References dialog box, we will see the following.
Notice that the References dialog box now shows all the selected object libraries on the top. We now should be able to see
Microsoft Access’s object model in the object browser. The below shows that Microsoft Access’s object model is included in
the object browser’s list.
The below shows Microsoft Access’s object model in the object browser.
The Excel object model does not have a method to make the PC make a beep sound. Fortunately, it turns out that the
Access object does have a method to make the PC make a beep sound. The below is a macro that will make the PC make a
beep sound. The Access keyword indicates that we are using the Access object model. The keyword DoCmd is a child object of the Access object. The keyword Beep is a method of the DoCmd object.
It turns out that in the VBA object model, there is also a beep method. The below shows a macro using the VBA object
model to make the PC make a beep sound.
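The two macros described above appear in figures not reproduced here; sketches consistent with the description (procedure names are illustrative) are:

```vba
' Beep via the Access object model. Requires the Microsoft Access
' type library to be checked under Tools -> References in the VBE.
Sub BeepWithAccess()
    Access.DoCmd.Beep
End Sub

' Beep via the Beep method of the VBA object model.
Sub BeepWithVBA()
    VBA.Interaction.Beep
End Sub
```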
3.5 Variables
In VBA programming, variables are used to store and manipulate data during macro execution. When processing data, it is often useful to deal only with a specific type of data. In VBA, it is possible to define a specific type for specific variables. Below is a summary of the different types available in VBA. This list was obtained from the URL https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/data-type-summary.
There are a lot of things happening in the macro “Example7”:

1. In this macro, we used the keyword Dim to define one variable to hold an integer data type, one variable to hold a string data type, and one variable to hold a long data type.
2. In this macro, we used the keyword InputBox to prompt the user for data.
3. We used the single apostrophe to tell the VBE to ignore everything to the right of it. Programmers use the single apostrophe to comment on the VBA code.
4. Double quotes are used to hold string values.
5. “&” is used to put together two strings.
6. The character “_” is used to indicate that the VBA command line is continued on the next line.
7. We calculated the data we received and put the calculated result in ranges A1–A3.

We will now show why data-typing a variable is important. The first input box requested an integer. The number four will be added to the inputted number. Suppose that by accident, we enter a word instead. The below shows what happens when we do this.
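The actual listing of “Example7” is in a figure not reproduced here. A sketch consistent with the seven points above (variable names other than "iNum", prompts, and cell contents are assumptions) is:

```vba
' Sketch of macro "Example7": typed variables, InputBox prompts,
' comments, string concatenation, and a continued line.
Sub Example7()
    Dim iNum As Integer     ' holds an integer
    Dim sName As String     ' holds a string
    Dim lSalary As Long     ' holds a long

    iNum = InputBox("Enter an integer")     ' non-numeric input here
                                            ' causes a type mismatch
    sName = InputBox("Enter your name")
    lSalary = InputBox("Enter your salary")

    ' "&" joins strings; "_" continues the statement on the next line
    Range("A1").Value = "Number plus 4: " & _
                        (iNum + 4)
    Range("A2").Value = "Name: " & sName
    Range("A3").Value = "Salary: " & lSalary
End Sub
```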
The above shows that the VBE will complain about having the wrong data type for the variable “iNum.” There are VBA techniques to handle this type of situation so the user will not have to see the above VBA error message.

From the data type list, it is important to note that the variant data type is a data type that can be any type. The type of a variable is determined during run time (when the macro is running). The macro “Example7” can be rewritten as follows.
In VBA programming, it is actually possible to use variables without first defining them, but good programming practice dictates that every variable should be defined. Excel VBA has the two keywords Option Explicit to indicate that every variable must be declared. The below shows what happens when Option Explicit is used and a variable is not defined when trying to run a macro.
Notice that using the Option Explicit keywords results in the following:

1. The variable that is not defined is highlighted.
2. A message indicating that a variable is not defined is displayed.

When a new module is inserted into a project, the keywords Option Explicit by default are not inserted into the new module. This can cause problems, especially in bigger macros. The VBE has a feature where the keywords Option Explicit are automatically included in a new module. To do this, choose Tools → Options. This is shown below.
3.7 Object Variables
Choose the Require Variable Declaration option in the Editor tab of the Options dialog box to set it so the keywords Option Explicit are included with every new module. It is important to note that by default the Require Variable Declaration option is not selected.
The data type Object is used to define a variable to “point” to objects in the Excel object model. Like the data type Variant,
the specific object data type for the data type Object is determined at run time. The macro below will set the cell A5 in the
worksheet “Sheet2” to the value “VBA Programming.” This macro is not sensitive to which worksheet is active.
The below rewrites the macro “Example9” by defining the variable “ws” as a worksheet data type and the variable
“rRange” as a range data type.
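The rewritten listing is in a figure not reproduced here; a sketch consistent with the description (the procedure name is an assumption) is:

```vba
' "ws" declared as a Worksheet and "rRange" as a Range, instead of
' the generic Object data type; the behavior is unchanged.
Sub Example9_Typed()
    Dim ws As Worksheet
    Dim rRange As Range

    Set ws = ThisWorkbook.Worksheets("Sheet2")
    Set rRange = ws.Range("A5")
    rRange.Value = "VBA Programming"
End Sub
```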
Next, in the Insert Function dialog box, select User Defined in the category drop-down box and then press the OK button. Pressing the OK button will result in the following.
We will now show how to make it so there is a description for our TenPercentInterest function in the Insert Function dialog box. The first thing that we will need to do is to choose Developer → Macros as shown below.
3.9 Adding a Function Description
The resulting Macro dialog box is shown below. The next thing to do would be to press the Options button of the Macro dialog box to get the Macro Options dialog box shown below.
When you create a custom function in VBA, Excel, by default, puts the function in the User Defined category of the Insert
Function dialog box. In this section, we will show how through VBA to set it so that the function CDInterest shows up in the
“financial” category of the Insert Function dialog box.
Below is the VBA procedure to set it so that the CDInterest function will be categorized in the “financial” category.
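The procedure itself is in a figure not reproduced here; a sketch of such an Auto_Open procedure (the description string is an assumption) is:

```vba
' Runs automatically when the workbook opens. MacroOptions assigns
' CDInterest to the built-in Financial category (category number 1)
' of the Insert Function dialog box and sets its description.
Sub Auto_Open()
    Application.MacroOptions Macro:="CDInterest", _
        Description:="Interest earned on a certificate of deposit", _
        Category:=1
End Sub
```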
The MacroOptions method of the Application object puts the function CDInterest in the “Financial” category of the Insert Function dialog box. The MacroOptions method must be executed every time we open the workbook that contains the function CDInterest. This task is done by the procedure Auto_Open because VBA will execute the procedure called “Auto_Open” when a workbook is opened. The below shows the function CDInterest in the “Financial” category in the Insert Function dialog box.
3.11 Conditional Programming with the IF Statement
Up to this point, the VBA code that we have been writing is executed sequentially from top to bottom. When the VBA code
reaches the bottom, it stops. We will now look at looping, the concept of where VBA code is executed more than once. The
first looping code that we will look at is the For loop. The For loop is used when the number of times the
loop should run can be determined in advance. To demonstrate the For loop, we will extend the CD program from the
previous section. We will add the procedure below to ask how many CDs we want to calculate.
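Since this volume also covers Python, a count-controlled loop of the same shape can be sketched there; the principal amounts, rate, and term below are made-up inputs, not the values from the CD dialog:

```python
# Hypothetical sketch of the CD loop in Python: the number of CDs is
# known up front, so a count-controlled for loop fits naturally.
def cd_values(principals, rate, years):
    """Return the maturity value of each CD after annual compounding."""
    values = []
    for principal in principals:   # runs once per CD, a known count
        values.append(principal * (1 + rate) ** years)
    return values

print(cd_values([1000, 2000, 3000], 0.05, 3))
```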
The following shows how to assign 15,000 to the 20th salary item:

Salary(20) = 15000

Suppose we need to calculate, every 2 weeks, the income tax to be withheld from 30 employees. This situation is very
similar to our example of calculating the interest on certificates of deposit. When we calculated the certificates of
deposit, we prompted the user for the principal amount. This process is very time-consuming and very tedious. In the
business world, it is common that the information of interest is already in an application. The procedure would then be to
extract the information to a file to be processed. For our salary example, we will extract the salary data to a CSV file
format. A CSV file format is basically a text file in which values are separated by commas. A common application to read
CSV files is Microsoft Windows Notepad. The below shows the “salary.csv” that we are interested in processing.
The thing to note about the csv file is that the first row is usually the header. The first row is the row that describes the
columns of a dataset. In the salary file above, we can say that the header contains two fields. One field is the date field, and
the other field is the salary field.
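The header-then-rows layout is easy to see in code as well. Below is a small Python sketch using the standard `csv` module; the two data rows are invented stand-ins for the salary file’s contents:

```python
# Read a CSV whose first row is the header; the sample contents here
# are made up to mimic the salary.csv layout (a date and a salary field).
import csv
import io

data = "Date,Salary\n01/01/2020,5000\n01/15/2020,5200\n"
reader = csv.reader(io.StringIO(data))
header = next(reader)   # first row: describes the columns
rows = list(reader)     # remaining rows: the data itself

print(header)   # ['Date', 'Salary']
print(rows)
```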
3.14 Arrays
Pushing the Calculate Tax button will result in the following workbook.
When most people think about lists, they usually start with
the number 1. Programmers, however, often begin a list with
the number 0. In VBA programming, the beginning of an
array index is 0 by default. To set the beginning of the array
index to 1, we use the statement “Option Base 1.” This was
done in the procedure “SalaryTax” shown earlier.
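For comparison, Python has no equivalent of Option Base: its lists are always zero-based, so code like the salary example reads as follows (the 15,000 value echoes the assignment shown earlier):

```python
# Python lists are always zero-based; there is no "Option Base 1".
# A 21-slot list gives indices 0 through 20, like Dim Salary(20) in VBA.
salary = [0] * 21
salary[20] = 15000   # assign the last slot
print(salary[0], salary[20])
```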
3.16 Collections
In VBA programming, there is a lot of programming with a group of like items. Groups of like items are called Collections.
Examples are collections of workbooks, worksheets, cells, charts, and names. There are two ways to reference a collection.
The first way is through an index. The second way is by name. For example, suppose we have the following workbook that
contains three worksheets.
It is important to note the effect that removing an item from a collection has on VBA code. The below shows the
workbook without the worksheet “John.”
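The index-versus-name distinction can be sketched outside VBA as well. The Python fragment below (using the worksheet names from the example workbook) shows how a positional reference silently shifts after a deletion while a named reference does not:

```python
# Positional (index) references shift when an item is removed;
# named references do not. Worksheet names follow the example above.
sheets_by_index = ["Mary", "John", "Bob"]
sheets_by_name = {"Mary": 1, "John": 2, "Bob": 3}

sheets_by_index.remove("John")
del sheets_by_name["John"]

print(sheets_by_index[1])     # now "Bob" -- index 1 no longer means "John"
print(sheets_by_name["Bob"])  # still found unambiguously by name
```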
The above demonstrates that referencing an item in a collection by name is preferable when there are additions or
deletions to a collection.

3.17 Summary

In this chapter, we discussed Excel’s object model, the Intellisense menu, and the object browser. We also looked at
variables and talked about Option Explicit. We discussed object variables and functions. We discussed adding a
function description and then discussed specifying a function category. We discussed conditional programming with
the IF statement, the For loop, and the While loop. We also talked about arrays. We talked about Option Base 1 and
collections.

References

https://www.excelcampus.com/vba/intellisense-keyboard-shortcuts/
https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/data-type-summary
4 Professional Techniques Used in Excel and VBA
4.1 Introduction
Notice that the current region area contains the header, or row 1. Many times when data is imported, we will want to
exclude the header row. To solve this problem, we will look at the offset property of the range object in the next section.

4.3 Offset Property of the Range Object

The offset property is one of those properties and methods that are usually mentioned only in passing in most books. The
offset property has two arguments. The first argument is for the row offset. The second argument is for the column offset.
Below is a procedure that illustrates the offset property.

'/*************************************************************
'/Purpose: To find the data range of an imported file
'/*************************************************************
Sub CurrentRegionOffset()
    Dim rCD As Range
    Dim wbCD As Workbook
    On Error Resume Next
    'Open CD file. It is assumed in same location as this workbook
    Set wbCD = Workbooks.Open(ThisWorkbook.Path & "\" & "CD.csv")
    If wbCD Is Nothing Then
        MsgBox "Could not find the file CD.csv in the path " _
            & ThisWorkbook.Path, vbCritical
        End
    End If
    'Figure out the data range
    'CurrentRegion method will find rows and columns that are
    'completely surrounded by blank cells
    Set rCD = ActiveSheet.Cells(1).CurrentRegion
    'Offset the current region by one row.
    'The offset property has a row offset argument and a column
    'offset argument
    Set rCD = rCD.Offset(rowoffset:=1, columnoffset:=0)
    rCD.Select
    MsgBox "The address of the data is " & rCD.Address
    wbCD.Close False
End Sub
Notice that when we used the offset property, we shifted the whole current region by one row. As shown above,
offsetting the current region by one row causes the blank row 16 to be included. To solve this problem, we will use the
resize property of the range object. The resize property is discussed in the next section.

4.4 Resize Property of the Range Object

Like the offset property, the resize property is one of those properties and methods that are usually mentioned only in
passing in most books. The resize property has two arguments. The first argument resizes the row dimension to a certain
size. The second argument resizes the column dimension to a certain size. Below is a procedure that illustrates the
resize property.

'/*************************************************************
'/Purpose: To find the data range of an imported file
'/*************************************************************
Sub CurrentRegionOffsetResize()
    Dim rCD As Range
    Dim wbCD As Workbook
    On Error Resume Next
    'Open CD file. It is assumed in same location as this workbook
    Set wbCD = Workbooks.Open(ThisWorkbook.Path & "\" & "CD.csv")
    If wbCD Is Nothing Then
        MsgBox "Could not find the file CD.csv in the path " _
            & ThisWorkbook.Path, vbCritical
        End
    End If
    'Figure out the data range
    'CurrentRegion method will find rows and columns that are
    'completely surrounded by blank cells
    Set rCD = ActiveSheet.Cells(1).CurrentRegion
    'Offset the current region by one row.
    'The offset property has a row offset argument and a column
    'offset argument
    Set rCD = rCD.Offset(rowoffset:=1, columnoffset:=0)
    'Resize the range to one row fewer than before
    'Resize the columns to the same number of columns as before
    Set rCD = rCD.Resize(rowsize:=rCD.Rows.Count - 1, _
        columnsize:=rCD.Columns.Count)
    rCD.Select
    MsgBox "The address of the data is " & rCD.Address
    wbCD.Close False
End Sub
4.5 UsedRange Property of the Range Object
Below shows what happens after pushing the Select UsedRange button.
Below shows what happens after pushing the Select CurrentRegion button.
4.6 Go to Special Dialog Box of Excel
Sub IntoArrayColumnNoTransPose()
    Dim vNum As Variant
    vNum = Worksheets("Column").Range("a1").CurrentRegion
End Sub
The first thing to notice is that the first line of the pro-
cedure is highlighted in yellow in the code window. The
yellow highlighted line is shown above. The other thing to
note is the Locals window, which shows the value of all the
variables. At this point, it indicates that the variable “vNum”
is Empty, meaning it has not yet been assigned a value.
The next thing that we need to do now is to press the F8
key on the keyboard to move to the next VBA line. Below
shows what happens after pressing the F8 key.
4.7 Importing Column Data into Arrays
Pressing the F5 key will run the VBA code until the
breakpoint. Below shows the state of the VBE after pressing
the F5 key.
Sub IntoArrayRow()
    Dim vNum As Variant
    vNum = WorksheetFunction.Transpose(WorksheetFunction. _
        Transpose(Worksheets("Row"). _
        Range("a1").CurrentRegion.Value))
End Sub
        .CurrentRegion.ClearContents
        .Resize(1, 4) = (v)
    End With
End Sub

Sub TransferToColumn()
    Dim v As Variant
    v = Array(1, 2, 3, 4)
    With ActiveSheet.Range("a1")
        .CurrentRegion.ClearContents
        .Resize(4, 1) = WorksheetFunction.Transpose(v)
    End With
End Sub
Sub Listfiles()
    Dim FSO As New FileSystemObject
    Dim objFolder As Folder
    Dim objFile As File
    Dim strPath As String
    Dim NextRow As Long
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim wsMain As Worksheet
    Set wsMain = ThisWorkbook.Worksheets("Main")
    'Specify the path of the folder
    strPath = wsMain.Range("Directory")
    If Not FSO.FolderExists(strPath) Then
        MsgBox "The folder " & strPath & " does not exist."
        Exit Sub
    End If
    'Create the object of this folder
    Set objFolder = FSO.GetFolder(strPath)
    'Check if the folder is empty or not
    If objFolder.Files.Count = 0 Then
        MsgBox "No files were found ...", vbExclamation
        Exit Sub
    End If
    Set wb = Workbooks.Add
    Set ws = wb.Worksheets(1)
    ws.Cells(2, 1).Select
    ActiveWindow.FreezePanes = True
    'Adding Column names
    ws.Cells(1, "A").Value = "File Name"
    ws.Cells(1, "B").Value = "Size"
    ws.Cells(1, "C").Value = "Modified Date/Time"
    ws.Cells(1, "D").Value = "User Name"
    ws.Cells(1, 1).Resize(1, 4).Font.Bold = True
    'Find the next available row
    NextRow = ws.Cells(2, 1).Row
    'Loop through each file in the folder
    For Each objFile In objFolder.Files
        'List the name of the current file
        ws.Cells(NextRow, 1).Value = objFile.Name
        ws.Cells(NextRow, 2).Value = Format(objFile.Size, "#,##0")
        ws.Cells(NextRow, 3).Value = Format(objFile.DateLastModified, "mmm-dd-yyyy")
        ws.Cells(NextRow, 4).Value = Application.UserName
        'Find the next row
        NextRow = NextRow + 1
    Next objFile
    With ws
        .Cells.EntireColumn.AutoFit
    End With
End Sub
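A Python counterpart of the Listfiles procedure, for readers following along in that language, can be sketched with the standard library; the demo writes into a throwaway temporary folder rather than a user-supplied path:

```python
# List each file's name, formatted size, and modified date in a folder,
# mirroring the columns the VBA procedure writes to the worksheet.
import datetime
import os
import tempfile

def list_files(path):
    rows = []
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_file():
            info = entry.stat()
            rows.append((entry.name,
                         f"{info.st_size:,}",
                         datetime.date.fromtimestamp(info.st_mtime)
                                 .strftime("%b-%d-%Y")))
    return rows

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "a.txt"), "w") as f:
        f.write("hello")
    print(list_files(d))
```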
For example, a call option on an IBM stock with an exercise price of $100 when the stock price of an IBM stock is $110
is worth $10. The reason it is worth $10 is because a holder of the call option can buy the IBM stock at $100 and then sell
the IBM stock at the prevailing price of $110 for a profit of $10. Also, a call option on an IBM stock with an exercise
price of $100 when the stock price of an IBM stock is $90 is worth $0.

A put option gives the owner the right, but not the obligation, to sell the underlying security at a specified price. A put
option becomes valuable when the exercise price is more than the current price of the underlying stock. For example, a
put option on an IBM stock with an exercise price of $100 when the stock price of an IBM stock is $90 is worth $10. The
reason it is worth $10 is because a holder of the put option can buy the IBM stock at the prevailing price of $90 and then
sell the IBM stock at the put price of $100 for a profit of $10. Also, a put option on an IBM stock with an exercise price
of $100 when the stock price of the IBM stock is $110 is worth $0.
Below are the charts showing the value of call and put options of the above IBM stock at varying prices:

[Chart: call option value (−30 to 30) versus stock price, $90 to $135]
[Chart: put option value (−30 to 30) versus stock price, $60 to $105]

5.3 Option Pricing—One Period

Stock price   Call value   Put value
110           10           0
100           ??           ??
90            0            10
Let’s first consider the issue of pricing a call option. Using a one-period decision tree, we can illustrate the price of a
stock if it goes up and the price of a stock if it goes down. Since we know the possible ending values of the stock, we can
derive the possible ending values of a call option. If the stock price increases to $110, the price of the call option will
then be $10 ($110 − $100). If the stock price decreases to $90, the value of the call option will be worth $0 because it
would be below the exercise price of $100. We have just discussed the possible ending values of a call option in period 1.
But what we are really interested in is the value of the call option now, knowing the two resulting values of the call
option.

To help determine the value of a one-period call option, it is useful to know that it is possible to replicate the resulting
two states of the value of the call option by buying a combination of stocks and bonds. Below is the formula to
replicate the situation where the price increases to $110. We will assume that the interest rate for the bond is 7%.

110S + 1.07B = 10,
90S + 1.07B = 0.

We can use simple algebra to solve for both S and B. The first thing that we need to do is to rearrange the second
equation as follows:

1.07B = −90S.

With the above equation, we can rewrite the first equation as

110S + (−90S) = 10,
20S = 10,
S = 0.5.

We can solve for B by substituting the value 0.5 for S in the first equation as follows:

110(0.5) + 1.07B = 10,
55 + 1.07B = 10,
1.07B = −45,
B = −42.05607.

Therefore, from the above simple algebraic exercise, we should at period 0 buy 0.5 shares of IBM stock and borrow
42.05607 at 7 percent to replicate the payoff of the call option. This means the value of a call option should be
0.5 * 100 − 42.05607 = 7.94393.

If this were not the case, there would then be arbitrage profits. For example, if the call option were sold for $8, there
would be a profit of 0.05607. This would result in an increase in the selling of the call option. The increase in the
supply of call options would push the price down for the call options. If the call option were sold for $7, there would
be a saving of 0.94393. This saving would result in increased demand for the call option. This increased demand would
result in the price of the call option increasing. The equilibrium point would be 7.94393.

Using the above-mentioned concept and procedure, Benninga (2000) derived a one-period call option model as

C = qu Max[S(1 + u) − X, 0] + qd Max[S(1 + d) − X, 0],   (5.1)

where

qu = (i − d) / [(1 + i)(u − d)],
qd = (u − i) / [(1 + i)(u − d)],

u = increase factor,
d = down factor,
i = interest rate.

If we let i = r, p = (r − d)/(u − d), 1 − p = (u − r)/(u − d), R = 1/(1 + r), Cu = Max[S(1 + u) − X, 0] and
Cd = Max[S(1 + d) − X, 0], then we have

C = R[p Cu + (1 − p) Cd].

5.4 Put Option Pricing—One Period

Like the call option, it is possible to replicate the resulting two states of the value of the put option by buying a
combination of stocks and bonds. Below is the formula to replicate the situation where the price decreases to $90:

110S + 1.07B = 0,
90S + 1.07B = 10.

We will use simple algebra to solve for both S and B. The first thing we will do is to rewrite the second equation as
follows:

1.07B = 10 − 90S.

The next thing to do is to substitute the above equation into the first put option equation. Doing this results in the
following:

110S + 10 − 90S = 0.

The following solves for S:

20S + 10 = 0,
S = −0.5.
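The one-period replication algebra can be checked numerically. The Python sketch below solves the two-equation system for the replicating portfolio and also evaluates Eq. (5.1) with i = 7%, u = +10%, d = −10%; both routes reproduce the call value 7.94393 derived above:

```python
# Replicating-portfolio solution of the one-period model, plus the
# closed-form q_u / q_d version of Eq. (5.1) as a cross-check.
def replicate(up_price, down_price, up_payoff, down_payoff, r, spot):
    shares = (up_payoff - down_payoff) / (up_price - down_price)
    bond = (up_payoff - up_price * shares) / (1 + r)
    return shares * spot + bond      # option value today

def one_period(S, X, i, u, d, payoff):
    qu = (i - d) / ((1 + i) * (u - d))
    qd = (u - i) / ((1 + i) * (u - d))
    return qu * payoff(S * (1 + u), X) + qd * payoff(S * (1 + d), X)

call_rep = replicate(110, 90, 10, 0, 0.07, 100)
call_qs = one_period(100, 100, 0.07, 0.10, -0.10, lambda s, x: max(s - x, 0))
print(call_rep, call_qs)   # both approximately 7.94393
```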
5.5 Option Pricing―Two Period

We now will look at pricing options for two periods. Below shows the stock price decision tree based on the parameters
indicated in the last section.

Stock Price
Period 0   Period 1   Period 2
                       121
            110
                       99
100
                       99
            90
                       81

This decision tree was created based on the assumption that the stock price increases by 10% in an up state and
decreases by 10% in a down state.

We cannot calculate the value of the call and put options in period 1 the same way we did in period 2 because it is not
the ending value of the stock. In period 1, there are two possible call values: one when the stock price increases and
one when the stock price decreases. The call option decision tree shows two possible values for a call option in
period 1. If we just focus on the value of a call option when the stock price increases from period 1, we will notice
that it is like the decision tree for a call option for one period. This is shown below.

Call Option
Period 0   Period 1   Period 2
                       21.00
            16.68
                       0.00

In the same fashion, we can price the value of a call option when a stock price decreases. The price of a call option
when a stock price decreases from period 0 is $0. The resulting decision tree is shown below.

Call Option
Period 0   Period 1   Period 2
                       0.00
            0.00
                       0.00

In the same fashion, we can price the value of a call option in period 0. The resulting decision tree is shown below.

Call Option
Period 0   Period 1   Period 2
                       21.00
            16.68
                       0.00
13.25
                       0.00
            0.00
                       0.00

We can calculate the value of a put option in the same manner as we did in calculating the value of a call option.
The decision tree for a put option is shown below.

[Put option decision tree: period 0 value 0.60; period 1 values 0.14 (up) and 3.46 (down); period 2 values 0.00,
1.00, and 19.00]

5.6 Option Pricing—Four Period

We now will look at pricing options for three periods. Below shows the stock price decision tree based on the
parameters indicated in the last section.

Stock Price
Period 0   Period 1   Period 2   Period 3
                                  133.1
                       121
                                  108.9
            110
                                  108.9
                       99
                                  89.1
100
                                  108.9
                       99
                                  89.1
            90
                                  89.1
                       81
                                  72.9

From the above stock price decision tree, we can figure out the values for the call and put options for period 3. The
values for the call and put options are shown below.

Period 3 stock price   Call value   Put value
133.1                  33.10001     0
108.9                  8.900002     0
89.1                   0            10.9
72.9                   0            27.10001
5.7 Using Microsoft Excel to Create the Binomial Option Call Trees
Call and put option values at each node (call value / put value):

Period 0 (100): 18.95538 / 0.585163
  Up, Period 1 (110): 22.87034 / 0.214211
    Up, Period 2 (121): 27.54206 / 0
      Up, Period 3 (133.1): 33.10001 / 0
      Down, Period 3 (108.9): 8.900002 / 0
    Down, Period 2 (99): 7.070095 / 1.528038
      Up, Period 3 (108.9): 8.900002 / 0
      Down, Period 3 (89.1): 0 / 10.9
  Down, Period 1 (90): 5.616431 / 2.960303
    Up, Period 2 (99): 7.070095 / 1.528038
      Up, Period 3 (108.9): 8.900002 / 0
      Down, Period 3 (89.1): 0 / 10.9
    Down, Period 2 (81): 0 / 12.45795
      Up, Period 3 (89.1): 0 / 10.9
      Down, Period 3 (72.9): 0 / 27.10001
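The node values in these trees can be reproduced by backward induction. The following Python sketch uses the text’s parameters (S = X = 100, u = +10%, d = −10%, i = 7%) and recovers the period-0 values of about 18.955 for the call and 0.585 for the put:

```python
# Backward induction over an n-period binomial tree with the text's
# parameters; payoff is passed in so calls and puts share the code.
def tree_price(S, X, u, d, i, n, payoff):
    p = (i - d) / (u - d)   # risk-neutral probability of an up move
    values = [payoff(S * (1 + u) ** k * (1 + d) ** (n - k), X)
              for k in range(n + 1)]
    for step in range(n, 0, -1):   # roll the tree back one period at a time
        values = [(p * values[k + 1] + (1 - p) * values[k]) / (1 + i)
                  for k in range(step)]
    return values[0]

call = tree_price(100, 100, 0.10, -0.10, 0.07, 3, lambda s, x: max(s - x, 0))
put = tree_price(100, 100, 0.10, -0.10, 0.07, 3, lambda s, x: max(x - s, 0))
print(call, put)
```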
Pushing the binomial option button shown above will bring up the dialog box shown below. The dialog box shows the
parameters for the binomial option pricing model. These parameters are changeable. The dialog box shows the default
values. Pushing the European Option button produces four binomial option decision trees.
The table at the beginning of this section indicated that 31 calculations were required to create a decision tree that has
four periods. This section showed four decision trees. Therefore, the Excel file did 31 * 4 = 124 calculations to
create the four decision trees. Benninga (2000, p. 260) defined the price of a call option in a binomial option pricing
model with n periods as

C = Σ (i = 0 to n) [n! / (i!(n − i)!)] qu^i qd^(n−i) max[S(1 + u)^i (1 + d)^(n−i) − X, 0]   (5.5)
C = (1/R^n) Σ (k = 0 to n) [n! / (k!(n − k)!)] p^k (1 − p)^(n−k) max[0, (1 + u)^k (1 + d)^(n−k) S − X].   (5.7)

The definition of the price of a put option in a binomial option pricing model with n periods would then be

P = (1/R^n) Σ (k = 0 to n) [n! / (k!(n − k)!)] p^k (1 − p)^(n−k) max[0, X − (1 + u)^k (1 + d)^(n−k) S].
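Equation (5.7) can be evaluated directly as a sum. The sketch below does so in Python and, with n = 3, p = 0.85, R = 1.07 and the ±10% factors used in the decision trees, reproduces the period-0 call value of about 18.955:

```python
# Direct evaluation of the n-period binomial call sum in Eq. (5.7).
from math import comb

def binomial_call(S, X, u, d, p, R, n):
    total = sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
                * max(0.0, (1 + u) ** k * (1 + d) ** (n - k) * S - X)
                for k in range(n + 1))
    return total / R ** n

print(binomial_call(100, 100, 0.10, -0.10, 0.85, 1.07, 3))
```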
5.9 Alternative Tree Methods

In this section, we will introduce three binomial tree methods and one trinomial tree method to price option values.
The three binomial tree methods are those of Cox, Ross, and Rubinstein (1979), Jarrow and Rudd (1983), and Leisen and
Reimer (1996). These methods generate different kinds of underlying asset trees to represent different trends of asset
movement. Kamrad and Ritchken (1991) extended the binomial tree method to multinomial approximation models. The
trinomial tree method is one of the multinomial models.

In the Cox, Ross, and Rubinstein (CRR) tree, the two parameters, u and d, depend only on the volatility σ and on dt,
not on the drift, as shown below:

u = e^(σ√dt)
d = 1/u

To offset the absence of a drift component in u and d, the probability of an up move in the CRR tree is usually greater
than 0.5 to ensure that the expected value of the price increases by a factor of exp[(r − q)dt] on each step. The
formula for p is

p = (e^((r − q)dt) − d) / (u − d).
We can see the call option value at time zero is equal to 3.244077 in cell C12. We can also write a VBA function to
price the call option. Below is the function:

' Returns CRR Binomial Option Value
Function CRRBinCall(S, X, r, q, T, sigma, Nstep)
    Dim dt, erdt, ermqdt, u, d, p
    Dim i As Integer, j As Integer
    Dim vvec() As Variant
    ReDim vvec(Nstep)
    dt = T / Nstep
    erdt = Exp(r * dt)
    ermqdt = Exp((r - q) * dt)
    u = Exp(sigma * Sqr(dt))
    d = 1 / u
    p = (ermqdt - d) / (u - d)
    For i = 0 To Nstep
        vvec(i) = Application.Max(S * (u ^ i) * (d ^ (Nstep - i)) - X, 0)
    Next i
    For j = Nstep - 1 To 0 Step -1
        For i = 0 To j
            vvec(i) = (p * vvec(i + 1) + (1 - p) * vvec(i)) / erdt
        Next i
    Next j
    CRRBinCall = vvec(0)
End Function

Using this function and putting the parameters into it, we can get the call option value under different numbers of
steps. This result is shown below.
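For readers working in Python, the same backward-induction logic ports over almost line for line. The sketch below is an illustrative translation of the CRRBinCall logic (not the book’s VBA); for a large number of steps its value approaches the Black–Scholes price, which is computed alongside as a check:

```python
# Python port of the CRR binomial call logic, checked against the
# closed-form Black-Scholes price on a fine time grid.
from math import erf, exp, log, sqrt

def crr_bin_call(S, X, r, q, T, sigma, nstep):
    dt = T / nstep
    u = exp(sigma * sqrt(dt))
    d = 1 / u
    p = (exp((r - q) * dt) - d) / (u - d)
    v = [max(S * u ** i * d ** (nstep - i) - X, 0.0) for i in range(nstep + 1)]
    for j in range(nstep - 1, -1, -1):
        v = [(p * v[i + 1] + (1 - p) * v[i]) * exp(-r * dt) for i in range(j + 1)]
    return v[0]

def bs_call(S, X, r, T, sigma):
    N = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
    d1 = (log(S / X) + (r + sigma ** 2 / 2) * T) / (sigma * sqrt(T))
    return S * N(d1) - X * exp(-r * T) * N(d1 - sigma * sqrt(T))

print(crr_bin_call(100, 100, 0.05, 0.0, 1.0, 0.2, 500))
print(bs_call(100, 100, 0.05, 1.0, 0.2))   # the two values nearly agree
```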
The function in cell B12 is

= CRRBinCall(B3, B4, B5, B6, B8, B7, B10)

We can see the result in B12 is equal to C12.

5.9.2 Trinomial Tree

Because binomial tree methods are computationally expensive, Kamrad and Ritchken (1991) propose multinomial models.
The new multinomial models include existing models as special cases, and the more general models are shown to be
computationally more efficient. Expressed algebraically, the trinomial tree parameters are

u = e^(λσ√dt)
d = 1/u

The formulas for the probabilities are given as follows:

pu = 1/(2λ²) + (r − σ²/2)√dt / (2λσ)
pm = 1 − 1/λ²
pd = 1 − pu − pm

If the parameter λ is equal to 1, then the trinomial tree model reduces to a binomial tree model. Below is the
underlying asset price pattern based on the trinomial tree model.
The call option value at time zero is 3.269028 in cell C12. In addition, we can also write a function to price a call
option based on the trinomial tree model. The function is shown below.

' Returns Trinomial Option Value
Function TriCall(S, X, r, q, T, sigma, Nstep, lamda)
    Dim dt, erdt, ermqdt, u, d, pu, pm, pd
    Dim i As Integer, j As Integer
    Dim vvec() As Variant
    ReDim vvec(2 * Nstep)
    dt = T / Nstep
    erdt = Exp(r * dt)
    ermqdt = Exp((r - q) * dt)
    u = Exp(lamda * sigma * Sqr(dt))
    d = 1 / u
    pu = 1 / (2 * lamda ^ 2) + (r - sigma ^ 2 / 2) * Sqr(dt) / (2 * lamda * sigma)
    pm = 1 - 1 / (lamda ^ 2)
    pd = 1 - pu - pm
    For i = 0 To 2 * Nstep
        vvec(i) = Application.Max(S * (d ^ Nstep) * (u ^ i) - X, 0)
    Next i
    For j = Nstep - 1 To 0 Step -1
        For i = 0 To 2 * j
            vvec(i) = (pu * vvec(i + 2) + pm * vvec(i + 1) + pd * vvec(i)) / erdt
        Next i
    Next j
    TriCall = vvec(0)
End Function

Using similar data in this function, we obtain the same call option value at today’s price.
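The trinomial logic ports to Python the same way. The sketch below mirrors the TriCall logic (an illustrative translation, not the book’s VBA; as in the VBA, the up-probability uses r − σ²/2, so it is shown here with no dividend yield); for a fine grid the value should approach the Black–Scholes price of roughly 10.45 for the parameters shown:

```python
# Python port of the Kamrad-Ritchken trinomial call logic (q = 0).
from math import exp, sqrt

def tri_call(S, X, r, T, sigma, nstep, lam):
    dt = T / nstep
    u = exp(lam * sigma * sqrt(dt))
    pu = 1 / (2 * lam ** 2) + (r - sigma ** 2 / 2) * sqrt(dt) / (2 * lam * sigma)
    pm = 1 - 1 / lam ** 2
    pd = 1 - pu - pm
    # 2*nstep + 1 terminal nodes: price S * u^(i - nstep) at node i
    v = [max(S * u ** (i - nstep) - X, 0.0) for i in range(2 * nstep + 1)]
    for j in range(nstep - 1, -1, -1):
        v = [(pu * v[i + 2] + pm * v[i + 1] + pd * v[i]) * exp(-r * dt)
             for i in range(2 * j + 1)]
    return v[0]

print(tri_call(100, 100, 0.05, 1.0, 0.2, 400, 1.5))   # near 10.45
```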
Simultaneously, this paper demonstrated, with the aid of Microsoft Excel and decision trees, the binomial option
model in a less mathematical fashion. This paper allowed the reader to focus more on the concepts by studying the
associated decision trees, which were created by Microsoft Excel. This paper also demonstrates that using Microsoft
Excel releases the reader from the computational burden of the binomial option model.

This paper also published the Microsoft Excel VBA code that created the binomial option decision trees. This allows
those who are interested to study the many advanced Microsoft Excel VBA programming concepts that were used to
create the decision trees. One major computer science programming concept used by the Microsoft Excel VBA code is
recursive programming. Recursive programming is the idea of a procedure calling itself many times. Inside the
procedure, there are statements to decide when not to call itself.

Appendix 5.1: EXCEL CODE—Binomial Option Pricing Model

'/*************************************************************
'/Essentials of Microsoft Excel 2013 VBA, SAS
'/ and MINITAB 17
'/ for Statistical and Financial Analysis
'/*************************************************************
Option Explicit
Dim mwbTreeWorkbook As Workbook
Dim mwsTreeWorksheet As Worksheet
Dim mwsCallTree As Worksheet
Dim mwsPutTree As Worksheet
Dim mwsBondTree As Worksheet
Dim mdblPFactor As Double
Dim mBinomialCalc As Long
Dim mOptionType As String
'/*************************************************************
'/Purpose: Keep track of the number of binomial calculations
'/*************************************************************
Property Let OptionType(t As String)
    mOptionType = t
End Property
Property Get OptionType() As String
    OptionType = mOptionType
End Property
Property Let BinomialCalc(l As Long)
    mBinomialCalc = l
End Property
Property Get BinomialCalc() As Long
    BinomialCalc = mBinomialCalc
End Property
Property Set TreeWorkbook(wb As Workbook)
    Set mwbTreeWorkbook = wb
End Property
Property Get TreeWorkbook() As Workbook
    Set TreeWorkbook = mwbTreeWorkbook
End Property
Property Set TreeWorksheet(ws As Worksheet)
    Set mwsTreeWorksheet = ws
End Property
Property Get TreeWorksheet() As Worksheet
    Set TreeWorksheet = mwsTreeWorksheet
End Property
Property Set CallTree(ws As Worksheet)
    Set mwsCallTree = ws
End Property
Property Get CallTree() As Worksheet
    Set CallTree = mwsCallTree
End Property
Property Set PutTree(ws As Worksheet)
    Set mwsPutTree = ws
End Property
Property Get PutTree() As Worksheet
    Set PutTree = mwsPutTree
End Property
Property Set BondTree(ws As Worksheet)
    Set mwsBondTree = ws
End Property
Property Get BondTree() As Worksheet
    Set BondTree = mwsBondTree
End Property
Property Let PFactor(r As Double)
    Dim dRate As Double
    dRate = ((1 + r) - Me.txtBinomialD) / (Me.txtBinomialU - Me.txtBinomialD)
    Let mdblPFactor = dRate
End Property
Property Get PFactor() As Double
    Let PFactor = mdblPFactor
End Property
Private Sub cmdCalculate_Click()
    Me.Hide
    BinomialOption
    Unload Me
End Sub
Private Sub cmdCalculateAmerican_Click()
    Me.Hide
    Me.OptionType = "American"
    BinomialOption
    Unload Me
End Sub
6.1 Introduction

This chapter shows how Microsoft Excel can be used to estimate call and put options for (a) the Black–Scholes model
for individual stocks, (b) the Black–Scholes model for stock indices, and (c) the Black–Scholes model for currencies. In
addition, we are going to present how an Excel program can be used to estimate American options. Section 6.2 presents
an option pricing model for individual stocks, Sect. 6.3 presents an option pricing model for stock indices, Sect. 6.4
presents an option pricing model for currencies, Sect. 6.5 presents the bivariate normal distribution approach to
calculating American call options, Sect. 6.6 presents Black’s approximation method for calculating American call
options, Sect. 6.7 presents how to evaluate an American call option when the dividend yield is known, and Sect. 6.9
summarizes this chapter. Appendix 6.1 defines the bivariate normal probability density function and Appendix 6.2
presents the Excel program to calculate the American call option when dividend payments are known.

6.2 Option Pricing Model for Individual Stock

The call option formula for an individual stock can be defined as

C = S N(d1) − X e^(−rT) N(d2),   (6.1)

where

d1 = [ln(S/X) + (r + σ²/2)T] / (σ√T)
d2 = [ln(S/X) + (r − σ²/2)T] / (σ√T) = d1 − σ√T

C = price of the call option.
S = current price of the stock.
X = exercise price of the option.
e = 2.71828…
r = short-term interest rate (T-Bill rate) = Rf.
T = time to expiration of the option, in years.
N(di) = value of the cumulative standard normal distribution (i = 1, 2).
σ² = variance of the stock rate of return.

The put option formula can be defined as

P = X e^(−rT) N(−d2) − S N(−d1),   (6.2)

where

P = price of the put option.

The other notations have been defined in Eq. (6.1).
Assume S = 42, X = 40, r = 0.1, σ = 0.2, and T = 0.5. The following shows how to set up Microsoft Excel to solve
the problem:
This chapter was written by Professor Cheng F. Lee and Dr. Ta-Peng
Wu of Rutgers University.
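As a cross-check on the spreadsheet, Eqs. (6.1) and (6.2) can be evaluated directly; the Python sketch below uses the worked example’s inputs (S = 42, X = 40, r = 0.1, σ = 0.2, T = 0.5):

```python
# Black-Scholes call (Eq. 6.1) and put (Eq. 6.2) for the worked example.
from math import erf, exp, log, sqrt

def black_scholes(S, X, r, T, sigma):
    N = lambda x: 0.5 * (1 + erf(x / sqrt(2)))   # cumulative standard normal
    d1 = (log(S / X) + (r + sigma ** 2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = S * N(d1) - X * exp(-r * T) * N(d2)
    put = X * exp(-r * T) * N(-d2) - S * N(-d1)
    return call, put

call, put = black_scholes(42, 40, 0.1, 0.5, 0.2)
print(round(call, 2), round(put, 2))   # 4.76 0.81
```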
Fig. 6.1 The inputs and Excel functions of European call and put options
d2 = [ln(S/X) + (r − rf − σ²/2)T] / (σ√T) = d1 − σ√T

The following shows the answer to the problem in Microsoft Excel (Fig. 6.4). From the Excel output, we find that the
prices of a call option and a put option are $59.26 and $5.01, respectively.
Fig. 6.3 The inputs and Excel functions of European call and put options
where

P = the price of the put option.

The other notations have been defined in Eq. (6.5).
Assume that S = 130, X = 125, r = 0.06, rf = 0.02, σ = 0.15, and T = 4/12. The following shows how to set up
Microsoft Excel to solve the problem. The following shows the answer to the problem in Microsoft Excel (Fig. 6.6).
From the Excel output, we find that the prices of a call option and a put option are $8.43 and $1.82, respectively.

6.5 Futures Options

Black (1976) showed that the original call option formula for stocks can be easily modified to price call options on
futures. The formula is

C(T, F, σ², X, r) = e^(−rT) [F N(d1) − X N(d2)],   (6.6)

d1 = [ln(F/X) + σ²T/2] / (σ√T),   (6.7)
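A quick sanity check on the Black (1976) formula is put–call parity for futures options, C − P = e^(−rT)(F − X). The Python sketch below uses assumed inputs (F = 100, X = 95, r = 0.05, σ = 0.2, T = 0.5), not an example from the text:

```python
# Black (1976) futures-option call and put, with a put-call parity check.
from math import erf, exp, log, sqrt

def black76(F, X, r, T, sigma):
    N = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
    d1 = (log(F / X) + sigma ** 2 * T / 2) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = exp(-r * T) * (F * N(d1) - X * N(d2))
    put = exp(-r * T) * (X * N(-d2) - F * N(-d1))
    return call, put

c, p = black76(100, 95, 0.05, 0.5, 0.2)
print(c - p, exp(-0.05 * 0.5) * (100 - 95))   # parity: the two sides match
```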
Following Chap. 19 of Lee et al. (2013), the call option formula for American options on a stock that pays a dividend,
when there is at least one known dividend, can be defined as

C(S, T, X) = Sx [N1(b1) + N2(a1, −b1; −√(t/T))] − X e^(−rT) [N1(b2) e^(r(T−t)) + N2(a2, −b2; −√(t/T))]
             + D e^(−rt) N1(b2),   (6.9)

where

a1 = [ln(Sx/X) + (r + σ²/2)T] / (σ√T),   a2 = a1 − σ√T   (6.10)
b1 = [ln(Sx/St*) + (r + σ²/2)t] / (σ√t),   b2 = b1 − σ√t   (6.11)
Sx = S − D e^(−rt).   (6.12)

The critical ex-dividend stock price St* satisfies

C(St*, T − t) = St* + D − X.

Both N1(b1) and N1(b2) represent the cumulative univariate normal density function. N2(a, b; ρ) is the cumulative
bivariate normal density function with upper integral limits a and b and correlation coefficient ρ = −√(t/T).

If we want to calculate the call option value of the American option, we first need to calculate a1 and b1. For
calculating a1 and b1, we need to first calculate Sx and St*. The calculation of Sx is given in Eq. (6.12). The
calculation will be explained in the following example from Chap. 19 of Lee et al. (2013).

An American call option whose exercise price is $48 has an expiration time of 90 days. Assume the risk-free rate of
interest is 8% annually, the underlying price is $50, the standard deviation of the rate of return of the stock is 20%,
and the stock pays a dividend of $2 in exactly 50 days. (a) What is the European call value? (b) Can the early
exercise be predicted? (c) What is the value of the American call?
6.6 Using Bivariate Normal Distribution Approach to Calculate American Call Options
Fig. 6.5 The inputs and Excel functions of European Call and Put options
(a) The current stock price net of the present value of the promised dividend is

Sx = 50 − 2e^(−0.08(50/365)) = 48.0218.

The European call value can be calculated as

C = (48.0218) N(d1) − 48 e^(−0.08(90/365)) N(d2),

where

d1 = [ln(48.0218/48) + (0.08 + 0.5(0.20)²)(90/365)] / (0.20 √(90/365)) = 0.25285
d2 = 0.25285 − 0.09931 = 0.15354.

From the standard normal table, we obtain

N(0.25285) = 0.5 + 0.099809 = 0.599809
N(0.15354) = 0.5 + 0.061014 = 0.561014.

So the European call value is

C = (48.0218)(0.599809) − 48(0.980)(0.561014) = 2.40123.

(b) The present value of the interest income that would be earned by deferring exercise until expiration is

X(1 − e^(−r(T−t))) = 48(1 − e^(−0.08(90−50)/365)) = 48(1 − 0.991) = 0.432.
Since D = 2 > 0.432, the early exercise is not precluded.

(c) The value of the American call is now calculated as

C = 48.0218 [N1(b1) + N2(a1, −b1; −√(50/90))]
    − 48 e^(−0.08(90/365)) [N1(b2) e^(0.08(40/365)) + N2(a2, −b2; −√(50/90))]
    + 2 e^(−0.08(50/365)) N1(b2),   (6.13)

since both b1 and b2 depend on the critical ex-dividend stock price St*, which can be determined by

C(St*, 40/365; 48) = St* + 2 − 48.

By using trial and error, we find that St* = 46.9641. An Excel program used to calculate this value is presented in
Fig. 6.7.

Substituting Sx = 48.0218, X = $48, and St* into Eqs. (6.10) and (6.11), we can calculate a1, a2, b1, and b2:

a1 = d1 = 0.25285
a2 = d2 = 0.15354.
Similarly,

b1 = [ln(48.0218/46.9641) + (0.08 + 0.2^2/2)(50/365)] / (0.20*√(50/365)) = 0.4859,

b2 = 0.485931 - 0.074023 = 0.4119.

In addition, we also know ρ = -√(50/90) = -0.7454.

From the above information, we now calculate the related normal probabilities. Each bivariate term is evaluated with the decomposition N2(a, b; ρ) = N2(a, 0; ρab) + N2(b, 0; ρba) - δ, where δ = (1 - sgn(a)*sgn(b))/4 and

ρab = (ρa - b)*sgn(b)/√(a^2 - 2ρab + b^2),   ρba = (ρb - a)*sgn(a)/√(a^2 - 2ρab + b^2).

For a = 0.25285, b = -0.4859, and ρ = -0.7454:

ρab = [(-0.7454)(0.25285) + 0.4859](-1) / √((0.25285)^2 - 2(0.7454)(0.25285)(0.4859) + (0.4859)^2) = -0.87002,

ρba = [(0.7454)(0.4859) - 0.25285] / √((0.25285)^2 - 2(0.7454)(0.25285)(0.4859) + (0.4859)^2) = 0.31979.
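The trial-and-error search for the critical ex-dividend price S*t can likewise be sketched in Python, solving C(S*, 40/365; 48) = S* + 2 - 48 by bisection (our stand-in for the Excel search in Fig. 6.7; function names are ours):

```python
from math import log, sqrt, exp, erf

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def critical_price(K, D, tau, r, sigma, lo=1e-6, hi=500.0):
    """Solve c(S*, tau; K) = S* + D - K for S* by bisection.
    f is monotone decreasing, so the root is unique."""
    f = lambda S: bs_call(S, K, tau, r, sigma) - (S + D - K)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

S_star = critical_price(K=48, D=2, tau=40/365, r=0.08, sigma=0.20)
print(round(S_star, 4))   # about 46.96
```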
By using the same data as in the bivariate normal distribution example (from Sect. 6.4), we will show how Black's approximation method can be used to calculate the value of an American option. The first step is to calculate the stock price minus the present value of the dividend and then calculate d1 and d2 to obtain the call price at time T (the time of maturity). The present value of the dividend is

2e^(-0.13699(0.08)) = 1.9782,

so the dividend-adjusted stock price is 50 - 1.9782 = 48.0218.

• The option price can therefore be calculated from the Black-Scholes formula with S0 = 48.0218, K = 48, r = 0.08, σ = 0.2, and T = 0.24658. We have

d1 = [ln(48.0218/48) + (0.08 + 0.2^2/2)(0.24658)] / (0.2*√0.24658) = 0.2529,
d2 = [ln(48.0218/48) + (0.08 - 0.2^2/2)(0.24658)] / (0.2*√0.24658) = 0.1535.

• From the normal table, N(d1) = 0.5998 and N(d2) = 0.5610, so the call price to maturity is

48.0218(0.5998) - 48e^(-0.08(0.24658))(0.5610) = $2.40.

You then calculate the call price at time t (the time of the dividend payment) using the full current stock price:

d1 = [ln(50/48) + (0.08 + 0.2^2/2)(0.13699)] / (0.2*√0.13699) = 0.7365,
d2 = [ln(50/48) + (0.08 - 0.2^2/2)(0.13699)] / (0.2*√0.13699) = 0.6625.

• We can get from the normal table

N(d1) = 0.7693,   N(d2) = 0.7462.

• And the call price is

50(0.7693) - 48e^(-0.08(0.13699))(0.7462) = $3.04.

Taking the greater of the two call option values shows whether it is worth waiting until the time-to-maturity or exercising at the dividend payment; here Black's approximation gives max(2.40, 3.04) = $3.04.
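A compact Python sketch of the two-leg Black's approximation above (our translation; names are ours):

```python
from math import log, sqrt, exp, erf

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def black_approx_american_call(S, D, t_div, K, T, r, sigma):
    """Black's approximation: take the larger of (i) a European call to
    maturity on S net of the dividend's present value and (ii) a European
    call expiring just before the dividend date on the full stock price."""
    to_maturity = bs_call(S - D * exp(-r * t_div), K, T, r, sigma)
    to_dividend = bs_call(S, K, t_div, r, sigma)
    return max(to_maturity, to_dividend)

price = black_approx_american_call(50, 2, 50/365, 48, 90/365, 0.08, 0.20)
print(round(price, 2))   # about 3.04
```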
To find the critical stock price S*, it is necessary to solve

C(S, t) = c(S, t) + A2*(S/S*)^γ2   when S < S*,
C(S, t) = S - K                    when S ≥ S*,

where

A2 = (S*/γ2)*{1 - e^(-q(T-t))*N[d1(S*)]},

γ2 = [-(β - 1) + √((β - 1)^2 + 4α/h)] / 2,

d1 = [ln(S/K) + (r - q + σ^2/2)(T - t)] / (σ*√(T - t)),

α = 2r/σ^2,   β = 2(r - q)/σ^2,   h = 1 - e^(-r(T-t)).

The critical price S* satisfies

S* - K = c(S*, t) + (S*/γ2)*{1 - e^(-q(T-t))*N[d1(S*)]}.

Since this cannot be done directly, an iterative procedure must be developed.

6.8.2 VBA Program for Calculating American Option When Dividend Yield is Known

We can use the Excel Goal Seek tool to develop the iterative process. We set Cell F7 equal to zero by changing Cell B3 to find S*. The function in Cell F7 is

= B12 + (1 - EXP(-B6*B8)*NORMSDIST(B9)) * B3/F6 - B3 + B4.

After running the iterative procedure, the result shows that S* is equal to 44.82072.
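The Goal Seek iteration can be mimicked in Python. The sketch below (our code, with illustrative parameters rather than the chapter's spreadsheet cells) solves the critical-price equation by bisection and then applies the quadratic approximation:

```python
from math import log, sqrt, exp, erf

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, q, sigma):
    d1 = (log(S / K) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * exp(-q * T) * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def american_call_quadratic(S, K, T, r, q, sigma):
    """Quadratic-approximation American call with a known dividend yield q."""
    h = 1.0 - exp(-r * T)
    alpha = 2.0 * r / sigma**2
    beta = 2.0 * (r - q) / sigma**2
    gamma2 = (-(beta - 1) + sqrt((beta - 1)**2 + 4 * alpha / h)) / 2.0

    def premium(Ss):   # (S*/gamma2)*(1 - e^(-qT) N(d1(S*)))
        d1 = (log(Ss / K) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
        return (Ss / gamma2) * (1.0 - exp(-q * T) * N(d1))

    # critical-price equation: c(S*) + premium(S*) - S* + K = 0
    f = lambda Ss: bs_call(Ss, K, T, r, q, sigma) + premium(Ss) - Ss + K
    lo, hi = K, 50.0 * K   # f(lo) > 0 and f(hi) < 0 when q > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    S_star = 0.5 * (lo + hi)
    A2 = premium(S_star)
    if S < S_star:
        return bs_call(S, K, T, r, q, sigma) + A2 * (S / S_star)**gamma2
    return S - K

C = american_call_quadratic(S=100, K=100, T=0.5, r=0.08, q=0.12, sigma=0.20)
```

The early-exercise premium A2*(S/S*)^γ2 is always nonnegative, so the approximated American value sits above the European value, as it should.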
T) * Application.NormSDist(d1)) * a / gamma2 - a + X
If ya * yc < 0 Then
b=c
Else
a=c
End If
Loop
Sa = (a + b) / 2
End If
d1 = (Log(Sa / X) + (r - q + 0.5 * sigma ^ 2) * T) / (sigma * Sqr(T))
A2 = (Sa / gamma2) * (1 - Exp(-q * T) * Application.NormSDist(d1))
If S < Sa Then
    AmericanCall = BSCall(S, X, r, q, T, sigma) + A2 * (S / Sa) ^ gamma2
Else
AmericanCall = S - X
End If
End Function
f(x′i, x′j) = exp[a1(2x′i - a1) + b1(2x′j - b1) + 2ρ(x′i - a1)(x′j - b1)].

The pairs of weights (w) and corresponding abscissa values (x′) are:

i, j   w            x′
1      0.24840615   0.10024215
(continued)

Appendix 6.2: Excel Program to Calculate the American Call Option When Dividend Payments are Known

The following is a Microsoft Excel program which can be used to calculate the price of an American call option using the bivariate normal distribution method (Table B1).
Table 6.1 Microsoft Excel program for calculating the American call options
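The bivariate normal probabilities N2(a, b; ρ) used in this chapter can be cross-checked in Python with the Drezner (1978) quadrature behind the weights table above. The first weight/abscissa pair matches the table; the remaining pairs (not shown in this excerpt) are assumed here to be the standard published Drezner values:

```python
from math import sqrt, exp, asin, pi

# 5-point Drezner (1978) weights and abscissas; the first pair matches the
# table above, the rest are the standard published values (our assumption).
W = [0.24840615, 0.39233107, 0.21141819, 0.03324666, 0.00082485334]
Y = [0.10024215, 0.48281397, 1.06094980, 1.77972940, 2.66976040]

def phi2(a, b, rho):
    """Drezner approximation of N2(a, b; rho) for a <= 0, b <= 0, rho <= 0.
    (The remaining sign cases reduce to this one via Drezner's identities.)"""
    ap = a / sqrt(2.0 * (1.0 - rho**2))
    bp = b / sqrt(2.0 * (1.0 - rho**2))
    total = 0.0
    for i in range(5):
        for j in range(5):
            total += W[i] * W[j] * exp(
                ap * (2 * Y[i] - ap) + bp * (2 * Y[j] - bp)
                + 2 * rho * (Y[i] - ap) * (Y[j] - bp))
    return sqrt(1.0 - rho**2) / pi * total

# sanity check against the closed form N2(0, 0; rho) = 1/4 + arcsin(rho)/(2*pi)
print(abs(phi2(0.0, 0.0, -0.5) - (0.25 + asin(-0.5) / (2 * pi))))
```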
7.1 Introduction

In this chapter, we will introduce how to use Excel to estimate implied volatility. First, we use an approximate linear function to derive the volatility implied by the Black–Merton–Scholes model. Second, we use nonlinear methods, which include the Goal Seek and bisection methods, to calculate implied volatility. Third, we demonstrate how to obtain the volatility smile using IBM data. Fourth, we introduce the constant elasticity of variance (CEV) model and use the bisection method to calculate the implied volatility of the CEV model. Finally, we calculate the 52-week historical volatility of a stock, using the Excel function WEBSERVICE to retrieve the 52 weekly historical stock prices.

This chapter is broken down into the following sections. In Sect. 7.2, we use Excel to estimate the implied variance with the Black–Scholes option pricing model. In Sect. 7.3, we discuss the volatility smile, and in Sect. 7.4 we use Excel to estimate the implied variance with the CEV model. Section 7.5 looks at the WEBSERVICE Excel function. In Sect. 7.6, we look at retrieving a stock price for a specific date. In Sect. 7.7, we look at a calculated holiday list, and in Sect. 7.8 we calculate historical volatility. Finally, in Sect. 7.9, we summarize the chapter.

7.2 Excel Program to Estimate Implied Variance with Black–Scholes Option Pricing Model

7.2.1 Black, Scholes, and Merton Model

d = [ln(S/K) + (r - q + σ^2/2)T] / (σ√T),

where the stock price, exercise price, interest rate, dividend yield, and time until option expiration are denoted by S, K, r, q, and T, respectively. The instantaneous standard deviation of the log stock price is represented by σ, and N(·) is the standard normal distribution function. If we can obtain the parameters in the model, we can calculate the option price. The Black–Scholes formula in the spreadsheet is shown below. For a call option on a stock, the Black–Scholes formula in cell B12 is

= B3*EXP(-B6*B8)*NORMSDIST(B9) - B4*EXP(-B5*B8)*NORMSDIST(B10),
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 157
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_7
158 7 Alternative Methods to Estimate Implied Variance
d2 = d1 - sigma * Sqr(T)
Nd1 = Application.NormSDist(d1)
Nd2 = Application.NormSDist(d2)
End Function
The user-defined VBA function in cell C12 is

= BSCall(B3, B4, B5, B6, B8, B7).

The call value in cell C12 is 5.00, which is equal to the value in B12 calculated by the spreadsheet.

7.2.2 Approximating Linear Function for Implied Volatility

All model parameters except the log stock price standard deviation are directly observable from market data. This allows a market-based estimate of a stock's future price volatility to be obtained by inverting Eq. (7.1), thereby yielding an implied volatility. Unfortunately, there is no closed-form solution for an implied standard deviation from Eq. (7.1); we have to solve a nonlinear equation. Corrado and Miller (1996) suggested an analytic formula that produces an approximation for the implied volatility. They start by approximating N(z) as a linear function:

N(z) ≈ 1/2 + (1/√(2π))*(z - z^3/6 + z^5/40 - ...).

Substituting the expansions of the normal cumulative probabilities N(d) and N(d - σ√T) into the Black–Scholes call option price gives

c = S*e^(-qT)*(1/2 + d/√(2π)) - X*e^(-rT)*(1/2 + (d - σ√T)/√(2π)).

After solving the quadratic equation and some approximations, we can get
σ = [√(2π/T)/(M + K)] * [c - (M - K)/2 + √((c - (M - K)/2)^2 - (M - K)^2/π)],

where M = S*e^(-qT) and K = X*e^(-rT). In the VBA implementation:
M = S * Exp(-q * T)
K = X * Exp(-r * T)
p = Application.Pi()
BSIVCM = -1
Else
End If
End Function
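The Corrado–Miller closed-form approximation can also be sketched in Python (our translation of the formula above; accuracy is best near the money):

```python
from math import log, sqrt, exp, erf, pi

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, X, r, q, T, sigma):
    d1 = (log(S / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * exp(-q * T) * N(d1) - X * exp(-r * T) * N(d1 - sigma * sqrt(T))

def cm_implied_vol(c, S, X, r, q, T):
    """Corrado-Miller (1996) approximate implied volatility."""
    M = S * exp(-q * T)          # discounted stock price
    K = X * exp(-r * T)          # discounted strike
    half_spread = 0.5 * (M - K)
    disc = (c - half_spread)**2 - (M - K)**2 / pi
    if disc < 0:                 # approximation breaks down far from the money
        return float("nan")
    return sqrt(2 * pi / T) / (M + K) * (c - half_spread + sqrt(disc))

true_sigma = 0.30
c = bs_call(100, 100, 0.05, 0.0, 0.5, true_sigma)
print(round(cm_implied_vol(c, 100, 100, 0.05, 0.0, 0.5), 4))   # about 0.30
```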
The Corrado and Miller implied volatility formula in G6 is

= BSIVCM(B3, B4, B5, B6, B8, F12).

The approximation value in G6 is 0.3614, which is equal to F6.

7.2.3 Nonlinear Method for Implied Volatility

There are two nonlinear methods for implied volatility. The first one is the Newton–Raphson method. The second one is bisection. Using the slope to improve the accuracy of subsequent guesses is known as the Newton–Raphson method.

7.2.3.1 Newton–Raphson Method

The Newton–Raphson method is a method for finding successively better approximations to the roots of a nonlinear function, x : f(x) = 0. The method in one variable is accomplished as follows: given a function f(x) and its derivative f′(x), we begin with a first guess x0 for a root of the function f. The process is iterated as

x_(n+1) = x_n - f(x_n)/f′(x_n)

until a sufficiently accurate value is approached. In order to use Newton–Raphson to estimate implied volatility, we need f′(·), which in the option pricing model is vega:

ν = ∂C/∂σ = S*e^(-qT)*√T*N′(d1).

Goal Seek is a procedure in Excel. It uses the Newton–Raphson method to solve for the root of a nonlinear equation. In the figure given below, we show how to use the Goal Seek procedure to find the implied volatility. The details of our vanilla option are set out in cells B3–B8. Suppose the observed call option market value is 5.00. Our task is to choose a succession of volatility estimates in cell B6 until the BSM call option value in cell B11 equals the observed price, 5.00. This can be done by applying the Goal Seek command in the Data part of Excel's menu:

[Data] → [What If Analysis] → [Goal Seek]
yb = BSCall(S, X, r, q, T, b) - callprice
ya = BSCall(S, X, r, q, T, a) - callprice
If yb * ya > 0 Then
BSIVBisection = CVErr(xlErrValue)
Else
yc = BSCall(S, X, r, q, T, c) - callprice
ya = BSCall(S, X, r, q, T, a) - callprice
If ya * yc < 0 Then
b = c
Else
a = c
End If
Loop
BSIVBisection = (a + b) / 2
End If
End Function
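The bisection routine above translates directly into Python (our code; names mirror the VBA):

```python
from math import log, sqrt, exp, erf

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, X, r, q, T, sigma):
    d1 = (log(S / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * exp(-q * T) * N(d1) - X * exp(-r * T) * N(d1 - sigma * sqrt(T))

def bs_iv_bisection(callprice, S, X, r, q, T, a=1e-4, b=4.0, tol=1e-8):
    """Bisect on sigma until the model price matches the observed price."""
    err = lambda sig: bs_call(S, X, r, q, T, sig) - callprice
    if err(a) * err(b) > 0:
        raise ValueError("implied volatility not bracketed")
    while b - a > tol:
        c = 0.5 * (a + b)
        if err(a) * err(c) <= 0:
            b = c
        else:
            a = c
    return 0.5 * (a + b)

iv = bs_iv_bisection(bs_call(100, 100, 0.05, 0.0, 0.5, 0.30), 100, 100, 0.05, 0.0, 0.5)
print(round(iv, 4))   # recovers 0.3
```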
bias = 0.0001
iv = initial
Do
    ya = BSCall(S, X, r, q, T, iv) - callprice   ' f(sigma): pricing error at the current guess
    ydasha = BSVega(S, X, r, q, T, iv)           ' f'(sigma): vega (assumed defined elsewhere in the workbook)
    iv = iv - ya / ydasha                        ' Newton-Raphson update
Loop While Abs(ya) > bias
BSIVNewton = iv
End Function
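A Python version of the Newton–Raphson update, using vega as f′(σ) (our code; the initial guess and tolerance are illustrative):

```python
from math import log, sqrt, exp, erf, pi

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def npdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(S, X, r, q, T, sigma):
    d1 = (log(S / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * exp(-q * T) * N(d1) - X * exp(-r * T) * N(d1 - sigma * sqrt(T))

def vega(S, X, r, q, T, sigma):
    d1 = (log(S / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return S * exp(-q * T) * sqrt(T) * npdf(d1)

def bs_iv_newton(callprice, S, X, r, q, T, iv=0.2, bias=1e-10):
    for _ in range(100):
        ya = bs_call(S, X, r, q, T, iv) - callprice   # pricing error
        if abs(ya) < bias:
            break
        iv -= ya / vega(S, X, r, q, T, iv)            # Newton-Raphson step
    return iv

target = bs_call(100, 100, 0.05, 0.0, 0.5, 0.35)
print(round(bs_iv_newton(target, 100, 100, 0.05, 0.0, 0.5), 4))   # 0.35
```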
[Figure: F(σ) = C_BS - C_market plotted against volatility σ (vertical axis -5 to 40; σ from 0.01 to 6.51)]
[Figure: convergence of the bisection method; error on a log scale (1 down to 10^-7) versus iteration (up to 20)]

[Figure: convergence of the Newton–Raphson method; error on a log scale (10^-1 down to 10^-13) versus iteration (2 to 4)]
We find that the bisection method needs 20 iterations to reduce the error to around 10^-6, whereas the Newton–Raphson method needs only four iterations to reach an error of around 10^-13. This difference mattered more in the past; today's computers are efficient enough that it is rarely a practical concern.

7.3 Volatility Smile

The existence of the volatility smile is due to the fact that the Black–Scholes formula cannot precisely evaluate either call or put option values. The main reason is that the Black–Scholes formula assumes the stock price per share is log-normally distributed. If we introduce extra distribution parameters into the option pricing formula, we can obtain the constant elasticity of variance (CEV) option pricing formula, which can be found in Sect. 7.4 of this chapter. Lee et al. (2004) show that the CEV model performs better than the Black–Scholes model in evaluating either call or put option values.

A plot of the implied volatility of an option as a function of its strike price is known as a volatility smile. Now we use IBM's data to show the volatility smile. The call option data listed in the table given below can be found on Yahoo Finance: http://finance.yahoo.com/q/op?s=IBM&date=1450396800. We use the IBM option contract with expiration date on July 30.
In this table, there are many inputs, including the dividend payment, current stock price per share, exercise price per share, risk-free interest rate, volatility of the stock, and time-to-maturity. The dividend yield is calculated as the dividend payment divided by the current stock price. Given the market price of the call option, we can calculate the implied volatility using the different methods discussed in Sect. 7.2, namely Corrado and Miller's formula and the bisection method. In this example, we use $135 as the exercise price for the call option, and the corresponding market ask price is $4.85. The implied volatilities calculated by these two methods are 0.3399 and 0.3410, respectively.

Now we calculate the implied volatility by using different exercise prices and the corresponding market prices.
7.4 Excel Program to Estimate Implied Variance with CEV Model 169
[Figure: probability density of the noncentral chi-square distribution with df = 5 for noncentrality parameters ncp = 0, 2, 4, and 6]
Under the theory in this chapter, we can write a call option price under the CEV model. The figure to do this is given below. Hence, the formula for the CEV call option in B14 is

= IF(B9 < 1,
     B3*EXP(-B6*B8)*(1 - ncdchi(B11, B12+2, B13)) - B4*EXP(-B5*B8)*ncdchi(B13, B12, B11),
     B3*EXP(-B6*B8)*(1 - ncdchi(B13, B12, B11)) - B4*EXP(-B5*B8)*ncdchi(B11, 2-B12, B13)).
Dim v As Double
Dim aa As Double
Dim bb As Double
Dim cc As Double
- 1))
bb = 1 / (1 - alpha)
Else
End If
End Function
If yb * ya > 0 Then
CEVIVBisection = CVErr(xlErrValue)
Else
c = (a + b) / 2
If ya * yc < 0 Then
b = c
Else
a = c
End If
Loop
CEVIVBisection = (a + b) / 2
End If
End Function
if (alpha ~= 1)
v = (sigma^2)*T;
a = K.^(2*(1-alpha))./(v*(1-alpha)^2);
b = ones(size(K)).*(1/(1-alpha));
c = (F.^ (2 *(1-alpha)))./(v*(1-alpha)^2);
% Multiplying the call price by KK enables us to scale back
% if (0 < alpha && alpha < 1)
if (alpha < 1)
call = KK.*( F.*( ones(size(K)) - ncx2cdf( a,b + 2,c)) -
K.*(ncx2cdf(c,b,a))).*exp(-r.*T);
elseif (alpha > 1)
call = KK.*( F.*ncx2cdf(c,-b,a) - K.*ncx2cdf(a,2-b,c)).*exp(-r.*T);
end
else
call = 0; % function not defined for alpha < 0 or alpha = 1
end
end
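The same CEV futures-call formula can be sketched in Python. Since the noncentral chi-square CDF is not in the Python standard library, the sketch below builds it from a Poisson-weighted mixture of central chi-square CDFs (our implementation; the parameters in the example are illustrative):

```python
from math import exp, log, lgamma, sqrt

def gammainc_lower(s, x):
    """Regularized lower incomplete gamma P(s, x) by series expansion."""
    if x <= 0.0:
        return 0.0
    term = exp(s * log(x) - x - lgamma(s + 1.0))
    total = term
    n = 0
    while term > total * 1e-16 and n < 10000:
        n += 1
        term *= x / (s + n)
        total += term
    return min(total, 1.0)

def ncx2_cdf(x, df, nc):
    """Noncentral chi-square CDF as a Poisson mixture of central
    chi-square CDFs (chi2_cdf(x, k) = P(k/2, x/2))."""
    lam = 0.5 * nc
    w = exp(-lam)
    total = 0.0
    for j in range(int(lam + 10.0 * sqrt(lam) + 60.0) + 1):
        if j > 0:
            w *= lam / j
        total += w * gammainc_lower(0.5 * df + j, 0.5 * x)
    return total

def cev_futures_call(F, K, T, r, sigma, alpha):
    """CEV call on a futures price F, alpha < 1 branch (mirrors the
    MATLAB listing above)."""
    v = sigma**2 * T
    a = K**(2 * (1 - alpha)) / (v * (1 - alpha)**2)
    b = 1.0 / (1 - alpha)
    c = F**(2 * (1 - alpha)) / (v * (1 - alpha)**2)
    return exp(-r * T) * (F * (1 - ncx2_cdf(a, b + 2, c)) - K * ncx2_cdf(c, b, a))

# illustrative: F=100, T=0.5, r=5%, alpha=0.5, with sigma chosen so the
# instantaneous volatility sigma*F^(alpha-1) is 20% at F=100
c90, c100, c110 = (cev_futures_call(100, K, 0.5, 0.05, 2.0, 0.5) for K in (90, 100, 110))
print(c90 > c100 > c110)   # True: call value decreases in the strike
```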
e^F_(i,n,t) = C^F_(i,n,t) - Ĉ^F_(i,n,t)(δ0, α0).   (7.5)
function STradingTM=cevslpine(TradingTM,TM)
sigma=[0.1:0.05:0.7];
alpha=[-0.5; -0.3; -0.1; 0.1; 0.3; 0.5; 0.7; 0.9];
LA=length(alpha);
LB=length(sigma);
L=length(TradingTM);
Tn=ones(L,1);
Tr=ones(L,1);
y=ones(L,length(alpha),length(sigma));
a=ones(L,1);
b=ones(L,1);
iniError=ones(L,1);
inisigmaplace=ones(L,1);
inialphaplace=ones(L,1);
inisigma=ones(L,1);
inialpha=ones(L,1);
for i=1:L
Tn(i)=Tr(i)+TradingTM(i,1)-1;
if(i<L) Tr(i+1)=Tn(i)+1; end
end
for k=1:L
for i=1:LA
for j=1:LB
y(k,i,j)= sum(abs(TM(Tr(k):Tn(k),2)-CevFCall(TM(Tr(k):Tn(k),3), TM(Tr(k):Tn(k),1),
TM(Tr(k):Tn(k),4)/360.0, TM(Tr(k):Tn(k),5), sigma(j), alpha(i))));
end
end
[~,b]=min(y(k,:,:));
[iniError(k),inisigmaplace(k)]=min(min(y(k,:,:)));
inialphaplace(k)=b(inisigmaplace(k));
inisigma(k)=sigma(inisigmaplace(k));
inialpha(k)=alpha(inialphaplace(k));
disp(sprintf('iteration %d contract %d alpha and %d sigma', k, i,j));
end
(2) For each date t, we can obtain the optimal parameters in each group by solving for the minimum value of the absolute pricing errors (minAPE) as

minAPE_(i,t) = min over (δ0, α0) of Σ_(n=1..N) |e^F_(i,n,t)|,   (7.6)

where N is the total number of option contracts in group i at time t.

(3) We use the optimization function in MATLAB to find a minimum value of the unconstrained multivariable function. The function code is given below:

[x, fval] = fminunc(fun, x0),   (7.7)

where x is the optimal parameters of the CEV model, fval is the local minimum value of minAPE, fun is the specified MATLAB function of Eq. (7.4), and x0 is the initial point of the optimization.
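The grid-then-refine scheme of Steps (1)–(3) can be illustrated with a toy Python sketch. The pricing function here is a deliberately simplified stand-in (p = δ·ratio^(α-1), echoing the implied-volatility scaling used later in this appendix), not the chapter's CevFCall:

```python
# Toy illustration of the grid-then-refine calibration in steps (1)-(3).
# "Market" prices come from a known (delta, alpha); the search recovers them.
ratios = [0.85, 0.95, 1.00, 1.05, 1.15]
true_delta, true_alpha = 0.30, 0.90
model = lambda delta, alpha, ratio: delta * ratio ** (alpha - 1.0)
market = [model(true_delta, true_alpha, x) for x in ratios]

def ape(delta, alpha):
    """Sum of absolute pricing errors, as in Eq. (7.6)."""
    return sum(abs(m - model(delta, alpha, x)) for m, x in zip(market, ratios))

def grid_min(deltas, alphas):
    return min(((d, a) for d in deltas for a in alphas), key=lambda p: ape(*p))

# step (1): coarse grid, roughly the ranges used in the text
d0, a0 = grid_min([0.01 + 0.05 * i for i in range(17)],   # delta in [0.01, 0.81]
                  [0.81 + 0.10 * i for i in range(7)])    # alpha in about [0.81, 1.39]
# steps (2)-(3): finer local search around the coarse optimum (a stand-in for fminunc)
d_hat, a_hat = grid_min([d0 - 0.10 + 0.005 * i for i in range(41)],
                        [a0 - 0.25 + 0.01 * i for i in range(51)])
print(round(d_hat, 3), round(a_hat, 2))   # recovers 0.3 0.9
```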
Appendix 7.1: Application of CEV Model to Forecasting Implied Volatilities for Options on Index Futures 183
if (alpha ~= 1)
v = (sigma^2)*T;
a = K.^(2*(1-alpha))./(v*(1-alpha)^2);
b = ones(size(K)).*(1/(1-alpha));
c = (F.^ (2 *(1-alpha)))./(v*(1-alpha)^2);
% Multiplying the call price by KK enables us to scale back
% if (0 < alpha && alpha < 1)
if (alpha < 1)
call = ( F.*( ones(size(K)) - ncx2cdf( a,b + 2,c)) -
K.*(ncx2cdf(c,b,a))).*exp(-r.*T);
elseif (alpha > 1)
call =( F.*ncx2cdf(c,-b,a) - K.*ncx2cdf(a,2-b,c)).*exp(-r.*T);
end
else
call = 0; % function not defined for alpha < 0 or alpha = 1
end
end
L=Ini_ed-Ini_id+1;
Tr=STradingTM(:,3);
Tn=STradingTM(:,4);
x_1=STradingTM(:,5);
x_2=STradingTM(:,6);
EstCev=ones(L,9);
CIVAPE=ones(L,1);
CIAAPE=ones(L,1);
CErrorAPE=ones(L,1);
CIVPPE=ones(L,1);
CIAPPE=ones(L,1);
CErrorPPE=ones(L,1);
CIVSSE=ones(L,1);
CIASSE=ones(L,1);
CErrorSSE=ones(L,1);
%countforloop=0;
fileID=fopen('EstCev.txt', 'w');
%parfor i=1:L
parfor i=1:L
Id_global=Ini_id+i-1;
APE=@(x) sum(abs(TM(Tr(i):Tn(i),2)-CevFCall(TM(Tr(i):Tn(i),3), TM(Tr(i):Tn(i),1),
TM(Tr(i):Tn(i),4)/360.0, TM(Tr(i):Tn(i),5), x(1), x(2))));
CIVAPE(i)=x(1);
CIAAPE(i)=x(2);
CErrorAPE(i)=fval;
CErrorPPE(i)=abs(sum((TM(Tr(i):Tn(i),2)-CevFCall(TM(Tr(i):Tn(i),3),
TM(Tr(i):Tn(i),1), TM(Tr(i):Tn(i),4)/360.0,TM(Tr(i):Tn(i),5), x(1),
x(2)))./TM(Tr(i):Tn(i),2)));
CErrorSSE(i)=sum(abs(TM(Tr(i):Tn(i),2)-CevFCall(TM(Tr(i):Tn(i),3),
TM(Tr(i):Tn(i),1), TM(Tr(i):Tn(i),4)/360.0,TM(Tr(i):Tn(i),5), x(1), x(2))).^2);
end
disp(sprintf('parfor loop is over'));
fclose(fileID);
EstCev=[CIVAPE CIAAPE CErrorAPE CIVPPE CIAPPE CErrorPPE CIVSSE CIASSE CErrorSSE];
%matlabpool close
end
The data are the options on S&P 500 index futures which expired within January 1, 2010 to December 31, 2013 and which are traded at the Chicago Mercantile Exchange (CME).2 The reason for using options on S&P 500 index futures instead of the S&P 500 index is to eliminate non-simultaneous price effects between options and their underlying assets (Harvey and Whaley 1991). The option and futures markets close at 3:15 p.m. Central Time (CT), while the stock market closes at 3 p.m. CT. Therefore, using closing option prices to estimate the volatility of the underlying stock return is problematic even when the correct option pricing model is used. In addition to avoiding the non-synchronous price issue, the underlying assets, S&P 500 index futures, do not need to be adjusted for discrete dividends. Therefore, we can reduce the pricing error because no dividend adjustment is needed. Following the suggestions in Harvey and Whaley (1991, 1992a, 1992b), we select simultaneous index option prices and index futures prices for the empirical analysis.

The risk-free rate is based on the 1-year Treasury Bill from the Federal Reserve Bank of St. Louis.3 Daily closing prices and trading volumes of options on S&P 500 index futures and their underlying asset can be obtained from Datastream.

The futures options that expired in March, June, and September of both 2010 and 2011 are selected because they have over one year of trading dates (above 252 observations), while other options only have around 100 observations. Studying futures option contracts with the same expiration months in 2010 and 2011 allows the examination of IV characteristics and movements over time as well as the effects of different market climates.

In order to ensure reliable estimation of IV, we estimate market volatility by using multiple option transactions instead of a single contract. For comparing the prediction power of the Black model and the CEV model, we use all futures options expiring between 2010 and 2013 to generate the implied volatility surface. Here we exclude data based on the following criteria:

(1) IV cannot be computed by the Black model.
(2) Trading volume is lower than 10, to exclude minuscule transactions.
(3) Time-to-maturity is less than 10 days, to avoid liquidity-related biases.
(4) Quotes not satisfying the arbitrage restriction: we exclude an option contract if its price is larger than the difference between the S&P 500 index futures price and the exercise price.
(5) Deep-in/out-of-the-money contracts, where the ratio of the S&P 500 index futures price to the exercise price is either above 1.2 or below 0.8.

After arranging the data based on these criteria, we still have 30,364 observations of futures options which expired within the period 2010–2013. The period of option prices is from March 19, 2009 to November 5, 2013.

2 Nowadays, the Chicago Mercantile Exchange (CME), Chicago Board of Trade (CBOT), New York Mercantile Exchange (NYMEX), and Commodity Exchange (COMEX) are merged and operate as designated contract markets (DCM) of the CME Group, which is the world's leading and most diverse derivatives marketplace. Website of the CME Group: http://www.cmegroup.com/.
3 Website of the Federal Reserve Bank of St. Louis: http://research.stlouisfed.org/.
To deal with moneyness- and maturity-related biases, we use the "implied volatility matrix" to find proper parameters in the CEV model. The option contracts are divided into nine categories by moneyness and time-to-maturity. Option contracts are classified by moneyness level as at-the-money (ATM), out-of-the-money (OTM), or in-the-money (ITM) based on the ratio of the underlying asset price, S, to the exercise price, K. If an option contract's S/K ratio is between 0.95 and 1.01, it belongs to the ATM category. If its S/K ratio is higher (lower) than 1.01 (0.95), the option contract belongs to the ITM (OTM) category.

Because of the large number of observations in the ATM and OTM categories, we divide the moneyness grouping into five levels: ratio above 1.01, ratio between 0.98 and 1.01, ratio between 0.95 and 0.98, ratio between 0.90 and 0.95, and ratio below 0.90. By expiration day, we classify option contracts into short term (less than 30 trading days), medium term (between 30 and 60 trading days), and long term (more than 60 trading days).

In Fig. 7.1, we find that the IV of each option on an index futures contract estimated by the Black model varies across moneyness and time-to-maturity. This graph shows the volatility skew (or smile) in options on S&P 500 index futures, i.e., the implied volatilities decrease as the strike price increases (the moneyness level decreases). Even though the implied volatility surface changes every day, this characteristic still exists. Therefore, in accordance with this character, we divide futures option contracts into a six-by-four matrix based on moneyness and time-to-maturity levels when we estimate the implied volatilities of futures options in the CEV model framework. The whole option sample expiring within the period 2010–2013 contains 30,364 observations. The whole period of option prices is from March 19, 2009 to November 5, 2013. The observations for each group are presented in Table 7.1.

The lengths of the periods in the groups vary; the range of lengths is from 260 (the group with ratio below 0.90 and time-to-maturity within 30 days) to 1,100 (the whole sample). Since most trades are in the futures options with short time-to-maturity, the estimated implied volatility of the option samples in 2009 may be significantly biased because we did not collect the futures options that expired in 2009. Therefore, we only use option prices in the period between January 1, 2010 and November 5, 2013 to estimate the parameters of the CEV model. In order to find a global optimum instead of a local minimum of the absolute pricing errors, the ranges for searching suitable δ0 and α0 are set as δ0 ∈ [0.01, 0.81] with interval 0.05 and α0 ∈ [0.81, 1.39] with interval 0.1, respectively. We find the values of the parameters (δ̂0, α̂0) within these ranges that minimize the absolute pricing errors in Eq. (7.5). Then we use this pair of parameters (δ̂0, α̂0) as the optimal initial estimates in the procedure of estimating the local minimum minAPE based on Steps (1)–(3). The initial parameter setting of the CEV model is presented in Table 7.2.

The sample period of option prices is from January 1, 2010 to November 5, 2013. During the estimating procedure for the initial parameters of the CEV model, the volatility for S&P 500 index futures equals δ0*S^(α0-1).
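The moneyness and maturity bucketing described above can be written as a small Python helper (our code; the bucket labels are ours, while the cutoffs follow the text):

```python
def classify(s_over_k, days_to_maturity):
    """Assign an option to the moneyness level and maturity bucket
    used for the implied-volatility matrix."""
    if s_over_k > 1.01:
        level = "ratio > 1.01"            # ITM
    elif s_over_k >= 0.98:
        level = "0.98 <= ratio <= 1.01"   # ATM
    elif s_over_k >= 0.95:
        level = "0.95 <= ratio < 0.98"    # ATM
    elif s_over_k >= 0.90:
        level = "0.90 <= ratio < 0.95"    # OTM
    else:
        level = "ratio < 0.90"            # OTM
    if days_to_maturity < 30:
        term = "short"
    elif days_to_maturity <= 60:
        term = "medium"
    else:
        term = "long"
    return level, term

print(classify(1.02, 10))   # ('ratio > 1.01', 'short')
```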
In Table 7.2, the average sigma values are almost the same, while the average alpha value in each group and in the whole sample is less than one. This evidence implies that the alpha of the CEV model can capture the negative relationship between S&P 500 index futures prices and their volatilities shown in Fig. 7.1. The instantaneous volatility of the S&P 500 index futures price equals δ0*S^(α0-1), where S is the S&P 500 index futures price and δ0 and α0 are the parameters in the CEV model. The estimated parameters in Table 7.2 are similar across time-to-maturity levels but volatile across moneyness.

Because of the implementation and computational costs, we select the sub-period from January 2012 to November 2013 to analyze the performance of the CEV model. The total number of observations and the length of trading days in each group are presented in Table 7.3. Since the estimated parameters in Table 7.2 are similar across time-to-maturity levels but volatile across moneyness, we investigate the performance of all groups except the groups in the bottom row of Table 7.3. The performance of the models can be measured by either the implied volatility graph or the average absolute pricing errors (AveAPE). The implied volatility graph should be flat across different moneyness levels and times-to-maturity. We use a subsample, as Bakshi et al. (1997) and Chen et al. (2009) did, to test implied volatility consistency among moneyness-maturity categories.

Using the subsample data from January 2012 to May 2013 to test in-sample fitness, the average daily implied volatility of both the CEV and Black models and the average alpha of the CEV model are computed in Table 7.4. The fitness performance is shown in Table 7.5. The implied volatility graphs for both models are shown in Fig. 7.2. In Table 7.4, we estimate the optimal parameters of the CEV model by using a more efficient program. In this efficient program, we scale the strike price and the futures price to speed up the program, and the implied volatility of the CEV model equals δ*ratio^(α-1), where ratio is the moneyness level and δ and α are the optimal parameters of the program, which are not the parameters of the CEV model in Eq. (7.4). In Table 7.5, we find that the CEV model performs well in the in-the-money group.

The subsample period of option prices is from January 1, 2012 to November 5, 2013. The total number of observations is 13,434. The lengths of the periods in the groups vary; the range is from 47 (the group with ratio below 0.90 and time-to-maturity within 30 days) to 1,100 (the whole sample). The range of daily observations is from 1 to 30.

Figure 7.2 shows the IV computed by the CEV and Black models. Although their implied volatility graphs are similar in each group, the reasons that cause the volatility smile are totally different. In the Black model, the constant volatility setting is misspecified. The volatility parameter of the Black model in
Table 7.3 Total number of observations and trading days in each group
Time-to-maturity (TM) TM < 30 30 ≦ TM ≦ 60 TM > 60 All TM
Moneyness (S/K ratio) Days Total Obs Days Total Obs Days Total Obs Days Total Obs
S/K ratio > 1.01 172 272 104 163 81 122 249 557
0.98 ≦ S/K ratio≦ 1.01 377 1,695 354 984 268 592 448 3,271
0.95 ≦ S/K ratio < 0.98 362 1,958 405 1,828 349 1,074 457 4,860
0.9 ≦ S/K ratio < 0.95 315 919 380 1,399 375 1,318 440 3,636
S/K ratio < 0.9 32 35 40 73 105 173 134 281
All ratio 441 4,879 440 4,447 418 3,279 461 12,605
Moneyness (S/K ratio) CEV Black Obs CEV Black Obs CEV Black Obs CEV Black Obs
S/K ratio > 1.01 1.65 1.88 202 1.81 1.77 142 5.10 5.08 115 5.80 6.51 459
0.98 ≦ S/K ratio ≦ 1.01 6.63 7.02 1,290 4.00 4.28 801 4.59 4.53 529 18.54 18.90 2,620
0.95 ≦ S/K ratio < 0.98 2.38 2.34 1,560 4.25 4.14 1,469 3.96 3.89 913 14.25 14.15 3,942
0.9 ≦ S/K ratio < 0.95 0.69 0.68 710 1.44 1.43 1,094 3.68 3.62 1,131 7.08 7.10 2,935
S/K ratio < 0.9 0.01 0.01 33 0.13 0.18 72 0.61 0.60 171 0.69 0.68 276
Fig. 7.2b varies across moneyness and time-to-maturity levels, while the IV in the CEV model is a function of the underlying price and the elasticity of variance (the alpha parameter). Therefore, we can expect the prediction power of the CEV model to be better than that of the Black model because of the explicit functional form of IV in the CEV model. We can use alpha to measure the sensitivity of the relationship between the option price and its underlying asset. For example, in Fig. 7.2c, the in-the-money futures options near the expiration date show a significantly negative relationship between the futures price and its volatility.

The in-sample period of option prices is from January 1, 2012 to May 30, 2013. In the in-sample estimating procedure, the CEV implied volatility for S&P 500 index futures (CEV IV) equals δ*(S/K ratio)^(α-1), in order to reduce computational costs. The optimization settings for finding CEV IV and Black IV use the same criteria.

The better performance of the CEV model may result from an overfitting issue that would hurt the forecastability of the CEV model. Therefore, we use out-of-sample data from June 2013 to November 2013 to compare the prediction power of the Black and CEV models. We use the estimated parameters of the previous day as the current day's input variables of the model. Then the theoretical option price computed by either the Black or the CEV model can be used to calculate the bias between the theoretical price and the market price. Thus, we can calculate the average absolute pricing errors (AveAPE) for both models. The lower the value of a model's AveAPE, the higher the pricing
prediction power of the model. The pricing errors of the out-of-sample data are presented in Table 7.6. Here we find that the CEV model can predict options on S&P 500 index futures more precisely than the Black model. Based on the better performance in both in-sample and out-of-sample tests, we claim that the CEV model can describe the options on S&P 500 index futures more precisely than the Black model.

With regard to generating an implied volatility surface to capture the whole prediction of the futures option market, the CEV model is a better choice than the Black model because it not only captures the skewness and kurtosis effects of options on index futures but also has lower computational costs than other jump-diffusion stochastic volatility models. In sum, we show that the CEV model performs better than the Black model in terms of both in-sample fitness and out-of-sample prediction. The setting of the CEV model is more reasonable for depicting the negative relationship between the S&P 500 index futures price and its volatilities. The elasticity of variance parameter in the CEV model captures the level of this characteristic. The stable volatility parameter of the CEV model in our empirical results implies that the instantaneous volatility of the index future is mainly determined by the current futures price and the level of the elasticity of variance parameter.

References

Bakshi, G., C. Cao, and Z. Chen. 1997. "Empirical performance of alternative option pricing models." Journal of Finance, 52, 2003–
Chen, R., C.F. Lee, and H. Lee. 2009. "Empirical performance of the constant elasticity variance option pricing model." Review of Pacific Basin Financial Markets and Policies, 12(2), 177–217.
Corrado, C.J., and T.W. Miller Jr. 1996. "A note on a simple, accurate formula to compute implied standard deviations." Journal of Banking & Finance, 20(3), 595–603.
Cox, J.C. 1975. "Notes on option pricing I: Constant elasticity of variance diffusions." Working paper, Stanford University.
Cox, J.C., and S.A. Ross. 1976. "The valuation of options for alternative stochastic processes." Journal of Financial Economics, 3, 145–166.
Harvey, C.R., and R.E. Whaley. 1991. "S&P 100 index option volatility." Journal of Finance, 46, 1551–1561.
Harvey, C.R., and R.E. Whaley. 1992a. "Market volatility prediction and the efficiency of the S&P 100 index option market." Journal of Financial Economics, 31, 43–73.
Harvey, C.R., and R.E. Whaley. 1992b. "Dividends and S&P 100 index option valuation." Journal of Futures Markets, 12, 123–137.
Jackwerth, J.C., and M. Rubinstein. 2001. "Recovering stochastic processes from option prices." Working paper, London Business School.
Larguinho, M., J.C. Dias, and C.A. Braumann. 2013. "On the computation of option prices and Greeks under the CEV model." Quantitative Finance, 13(6), 907–917.
Lee, C.F., T. Wu, and R. Chen. 2004. "The constant elasticity of variance models: New evidence from S&P 500 index options." Review of Pacific Basin Financial Markets and Policies, 7(2), 173–190.
Lee, Cheng Few, and John C. Lee, eds. 2020. Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning (in 4 volumes). World Scientific.
MacBeth, J.D., and L.J. Merville. 1980. "Tests of the Black–Scholes and Cox call option valuation models." Journal of Finance, 35, 285–301.
Merton, R.C. 1973. "Theory of rational option pricing." Bell Journal of Economics and Management Science, 4(1), 141–183.
Pun, C.S., and H.Y. Wong. 2013. "CEV asymptotics of American
2049. options.” Journal of Mathematical Analysis and Applications, 403
Beckers, S. 1980. “The constant elasticity of variance model and its (2), 451–463.
implicationsfor option pricing.” Journal of Finance, 35, 661–673. Singh, V.K. and N. Ahmad. 2011. “Forecasting performance of
Black, Fischer, and Myron Scholes. “The pricing of options and constant elasticity of variance model: empirical evidence from
corporate liabilities.” Journal of political economy 81.3 (1973): India.” International Journal of Applied Economics and Finance, 5,
637–654. 87–96.
8 Greek Letters and Portfolio Insurance

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R, https://doi.org/10.1007/978-3-031-14283-3_8

8.2 Delta

The delta of an option, $\Delta$, is defined as the rate of change of the option price with respect to the price of the underlying asset:

$$\Delta = \frac{\partial P}{\partial S},$$

where, as before,

$$d_2 = \frac{\ln(S_t/X) + \left(r - \frac{\sigma_s^2}{2}\right)\tau}{\sigma_s\sqrt{\tau}} = d_1 - \sigma_s\sqrt{\tau}, \qquad \tau = T - t,$$

and $N(\cdot)$ is the cumulative distribution function of the standard normal distribution,

$$N(d_1) = \int_{-\infty}^{d_1} f(u)\,du = \int_{-\infty}^{d_1} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\,du.$$

For a European call option on a non-dividend-paying stock, delta can be shown to be

$$\Delta = N(d_1).$$

For a European put option on a non-dividend-paying stock, delta can be shown to be

$$\Delta = N(d_1) - 1.$$

If the underlying asset is a dividend-paying stock providing a dividend yield at rate $q$, the Black–Scholes formulas for the prices of a European call option and a European put option on the dividend-paying stock are

$$C_t = S_t e^{-q\tau} N(d_1) - X e^{-r\tau} N(d_2), \qquad P_t = X e^{-r\tau} N(-d_2) - S_t e^{-q\tau} N(-d_1),$$

where

$$d_1 = \frac{\ln(S_t/X) + \left(r - q + \frac{\sigma_s^2}{2}\right)\tau}{\sigma_s\sqrt{\tau}}, \qquad d_2 = \frac{\ln(S_t/X) + \left(r - q - \frac{\sigma_s^2}{2}\right)\tau}{\sigma_s\sqrt{\tau}} = d_1 - \sigma_s\sqrt{\tau}.$$

For a European call option on a dividend-paying stock, delta can be shown to be

$$\Delta = e^{-q\tau} N(d_1).$$

For a European put option on a dividend-paying stock, delta can be shown to be

$$\Delta = e^{-q\tau}\left[N(d_1) - 1\right].$$
The formula for delta of a call option in Cell E3 is

= BSCallDelta(B3, B4, B5, B6, B8, B7)

8.2.3 Application of Delta

Figure 8.1 shows the relationship between the price of a call option and the price of its underlying asset. The delta of this call option is the slope of the line at point A, corresponding to the current price of the underlying asset.

By calculating the delta ratio, a financial institution that sells an option to a client can construct a delta-neutral position to hedge the risk of changes in the underlying asset price. Suppose that the current stock price is $100, the call option price is $10, and the current delta of the call option is 0.4. A financial institution has sold 10 call options to its client, so the client has the right to buy 1,000 shares at maturity. To construct a delta hedge position, the financial institution should buy 0.4 × 1,000 = 400 shares of stock. If the stock price goes up by $1, the option price will go up by $0.40. In this situation, the financial institution has …
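The delta formula and the hedging arithmetic above can be sketched in Python (a minimal illustration; the function name and parameter order are our own, not the book's Excel BSCallDelta function):

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    # cumulative distribution function of the standard normal
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_delta(S, X, r, sigma, tau, q=0.0):
    """Delta of a European call under Black-Scholes, with an optional
    continuous dividend yield q: delta = exp(-q*tau) * N(d1)."""
    d1 = (log(S / X) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return exp(-q * tau) * norm_cdf(d1)

# The hedge in the text: short 10 calls (on 1,000 shares) with delta 0.4
delta = 0.4
shares_to_buy = delta * 1_000  # 400 shares make the position delta-neutral
```

For an at-the-money call the function returns a delta slightly above one half, and delta rises toward one as the option moves deeper into the money, matching the slope interpretation in Fig. 8.1.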
8.3 Theta

The theta of an option, $\Theta$, is defined as the rate of change of the option price with respect to the passage of time:

$$\Theta = \frac{\partial P}{\partial t} = \frac{\partial P}{\partial \tau}\frac{\partial \tau}{\partial t} = (-1)\frac{\partial P}{\partial \tau},$$

where $\tau = T - t$ is the time-to-maturity. For the derivation of theta for the various kinds of stock options, we use this definition of the negative derivative with respect to time-to-maturity.

8.3.2 Excel Function of Theta of the European Call Option

The function of theta for a European call option in Cell E4 is

= BSCallTheta(B3, B4, B5, B6, B8, B7)

8.3.3 Application of Theta

The value of an option is the combination of time value and stock value. As time passes, the time value of the option decreases. Thus, the rate of change of the option price with respect to the passage of time, theta, is usually negative. Because the passage of time on an option is certain rather than uncertain, we do not need to build a theta hedge portfolio against the effect of the passage of time. However, theta is still a useful parameter, because it is a proxy for gamma in a delta-neutral portfolio. We will discuss this in detail in the following sections.

8.4 Gamma

The gamma of an option, $\Gamma$, is defined as the rate of change of delta with respect to the change of the underlying asset price:

$$\Gamma = \frac{\partial \Delta}{\partial S} = \frac{\partial^2 P}{\partial S^2}.$$
The function of gamma for a European call option in Cell E5 is

= BSCallGamma(B3, B4, B5, B6, B8, B7)

Using the relation

Change in Portfolio Value = (−Duration × P) × (Change in interest rate),

we can calculate the value changes of the portfolio. The above relation corresponds to the previous discussion of the delta measure: we want to know how the price of the portfolio changes given a change in the interest rate. Similar to delta, modified duration only gives the first-order approximation of the change in value. To account for the nonlinear relation between the interest rate and the portfolio value, we need a second-order approximation similar to the gamma measure; this is the convexity measure. Convexity is the interest rate gamma divided by price, as given below:

Convexity = Γ / P,

and this measure captures the nonlinear part of the price changes due to interest rate changes. Using modified duration and convexity together allows us to develop a first- as well as a second-order approximation of the price changes, similar to the previous discussion:

Change in Portfolio Value ≈ −Duration × P × (change in rate) + ½ × Convexity × P × (change in rate)².

If we only consider the first three terms, the approximation is then …

… of −2,400. To make a delta-neutral and gamma-neutral portfolio, we should add a long position of 2,400/1.2 = 2,000 options (each with a gamma of 1.2 and a delta of 0.7) and a short position of 2,000 × 0.7 = 1,400 shares in the original portfolio.

8.5 Vega

The vega of an option, $\nu$, is defined as the rate of change of the option price with respect to the volatility of the underlying asset:

$$\nu = \frac{\partial P}{\partial \sigma},$$

where $P$ is the option price and $\sigma$ is the volatility of the stock price. We next show the derivation of vega for various kinds of stock options.

8.5.1 Formula of Vega for Different Kinds of Stock Options

For a European call option on a non-dividend-paying stock, vega can be shown to be

$$\nu = S_t \sqrt{\tau}\, N'(d_1).$$
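The duration–convexity approximation above translates directly into code; a brief sketch (the bond's price, duration, and convexity below are made-up illustration values, not figures from the text):

```python
def bond_price_change(price, duration, convexity, dr):
    """Second-order approximation of a bond price change for a rate shift dr:
    the first-order (duration) term plus the convexity correction."""
    return -duration * price * dr + 0.5 * convexity * price * dr ** 2

# Hypothetical bond: price 100, modified duration 7, convexity 60,
# and rates rising by 100 basis points
change = bond_price_change(100.0, 7.0, 60.0, 0.01)  # -7 + 0.3 = -6.7
```

The convexity term partially offsets the duration loss when rates rise, mirroring the second-order correction in the formula above.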
The function of vega for a European call option in Cell E5 is

= BSCallVega(B3, B4, B5, B6, B8, B7)

8.5.3 Application of Vega

Suppose a delta-neutral and gamma-neutral portfolio has a vega equal to $\nu$ and the vega of a particular option is $\nu_o$. Similar to gamma, we can add a position of $-\nu/\nu_o$ in the option to make a vega-neutral portfolio. To maintain delta neutrality, we should then adjust the underlying asset position. However, when we change the option position, the new portfolio is no longer gamma-neutral. Generally, a portfolio with one option cannot maintain gamma neutrality and vega neutrality at the same time. If we want a portfolio to be both gamma-neutral and vega-neutral, we should include at least two kinds of options on the same underlying asset in our portfolio.

For example, a delta-neutral and gamma-neutral portfolio contains option A, option B, and the underlying asset. The gamma and vega of this portfolio are −3,200 and −2,500, respectively. Option A has a delta of 0.3, a gamma of 1.2, and a vega of 1.5. Option B has a delta of 0.4, a gamma of 1.6, and a vega of 0.8. The new portfolio will be both gamma-neutral and vega-neutral when adding $x_A$ of option A and $x_B$ of option B to the original portfolio:

Gamma neutral: $-3{,}200 + 1.2\,x_A + 1.6\,x_B = 0$,
Vega neutral: $-2{,}500 + 1.5\,x_A + 0.8\,x_B = 0$.

From the two equations shown above, we get the solution $x_A = 1{,}000$ and $x_B = 1{,}250$. The delta of the new portfolio is 1,000 × 0.3 + 1,250 × 0.4 = 800. To maintain …
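The two neutrality conditions form a 2 × 2 linear system in $x_A$ and $x_B$; a NumPy sketch using the numbers from this example:

```python
import numpy as np

# gamma and vega contributed per unit of option A and option B
coeffs = np.array([[1.2, 1.6],
                   [1.5, 0.8]])
# amounts needed to offset the portfolio's gamma (-3,200) and vega (-2,500)
targets = np.array([3200.0, 2500.0])

x_a, x_b = np.linalg.solve(coeffs, targets)  # x_a = 1,000, x_b = 1,250
new_delta = 0.3 * x_a + 0.4 * x_b            # 800, to be offset with shares
```

Restoring delta neutrality then requires shorting about 800 shares of the underlying asset against the added option positions.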
The function in Cells B4:B5 is

= MMULT(MINVERSE(A2:B3), C2:C3)

Because this is a matrix function, we need to press [Ctrl] + [Shift] + [Enter] to get the result.

8.6.1 Formula of Rho for Different Kinds of Stock Options

For a European call option on a non-dividend-paying stock, rho can be shown to be

$$\rho = X\tau e^{-r\tau} N(d_2).$$

The function of rho in Cell E7 is

= BSCallRho(B3, B4, B5, B6, B8, B7)

… the interest rate and the volatility of the stock are 5% and 30% per annum, respectively. The rho of this European call can be calculated as follows: …
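A Python sketch of the rho calculation using the closed form $\rho = X\tau e^{-r\tau}N(d_2)$; since the example's stock price, strike, and maturity are not shown above, the values used here are our own assumptions, keeping only the 5% rate and 30% volatility from the text:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    # cumulative distribution function of the standard normal
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_rho(S, X, r, sigma, tau):
    """Rho of a European call on a non-dividend-paying stock."""
    d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return X * tau * exp(-r * tau) * norm_cdf(d2)

# Assumed inputs: S = X = 100, six months to maturity, r = 5%, sigma = 30%
rho = bs_call_rho(100.0, 100.0, 0.05, 0.30, 0.5)
```

Rho here is the change in the option price per unit change in the interest rate; dividing by 100 gives the change per percentage point.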
8.7 Formula of Sensitivity for Stock Options with Respect to Exercise Price

For a European call option on a non-dividend-paying stock, the sensitivity can be shown to be

$$\frac{\partial C_t}{\partial X} = -e^{-r\tau} N(d_2).$$

For a European put option on a non-dividend-paying stock, the sensitivity can be shown to be

$$\frac{\partial P_t}{\partial X} = e^{-r\tau} N(-d_2).$$

For a European call option on a dividend-paying stock, the sensitivity can be shown to be

$$\frac{\partial C_t}{\partial X} = -e^{-r\tau} N(d_2).$$

For a European put option on a dividend-paying stock, the sensitivity can be shown to be

$$\frac{\partial P_t}{\partial X} = e^{-r\tau} N(-d_2),$$

where in the dividend-paying case $d_2$ is evaluated with the dividend yield $q$ as in Sect. 8.2.

… the other one with negative gamma ($\Gamma < 0$), and they both have a value of $1 ($P = 1$). The trade-off can be written as

$$\Theta + \frac{1}{2}\sigma^2 S^2 \Gamma = rP.$$

For the first portfolio, if gamma is positive and large, then theta is negative and large. When gamma is positive, changes in stock prices result in a higher value of the option. This means that when there is no change in the stock price, the value of the option declines as we approach the expiration date; as a result, theta is negative. On the other hand, when gamma is negative and large, changes in stock prices result in a lower option value. This means that when there is no stock price change, the value of the option increases as we approach expiration, and theta is positive. This gives us a trade-off between gamma and theta, and they can be used as proxies for each other in a delta-neutral portfolio.

8.9 Portfolio Insurance
In this chapter, we have shown the partial derivatives of a stock option with respect to five variables. Delta ($\Delta$), the rate of change of the option price with respect to a change in the price of the underlying asset, is derived first. After delta is obtained, gamma ($\Gamma$) can be derived as the rate of change of delta with respect to the underlying asset price. Another two risk measures are theta ($\Theta$) and rho ($\rho$); they measure the change in option value with respect to the passage of time and the interest rate, respectively. Finally, one can also measure the change in option value with respect to the volatility of the underlying asset, which gives us vega ($\nu$). The applications of these Greek letters in portfolio management have also been discussed. In addition, we used the Black–Scholes PDE to show the relationship between these risk measures. In sum, risk management is one of the important topics in finance for both academics and practitioners. Given the recent credit crisis, one can observe that it is crucial to properly measure the risk related to ever more complicated financial assets. The comparative static analysis of option pricing models gives an introduction to portfolio risk management.

References

Bjork, T. Arbitrage Theory in Continuous Time. New York: Oxford University Press, 1998.
Boyle, P. P. and D. Emanuel. "Discretely Adjusted Option Hedges." Journal of Financial Economics, v. 8(3) (1980), pp. 259–282.
Duffie, D. Dynamic Asset Pricing Theory. Princeton, NJ: Princeton University Press, 2001.
Fabozzi, F. J. Fixed Income Analysis, 2nd Edn. New York: Wiley, 2007.
Figlewski, S. "Options Arbitrage in Imperfect Markets." Journal of Finance, v. 44(5) (1989), pp. 1289–1311.
Galai, D. "The Components of the Return from Hedging Options against Stocks." Journal of Business, v. 56(1) (1983), pp. 45–54.
Hull, J. Options, Futures, and Other Derivatives, 8th Edn. Upper Saddle River, NJ: Pearson, 2011.
Hull, J. and A. White. "Hedging the Risks from Writing Foreign Currency Options." Journal of International Money and Finance, v. 6(2) (1987), pp. 131–152.
Karatzas, I. and S. E. Shreve. Brownian Motion and Stochastic Calculus. Berlin: Springer, 2000.
Klebaner, F. C. Introduction to Stochastic Calculus with Applications. London: Imperial College Press, 2005.
McDonald, R. L. Derivatives Markets, 2nd Edn. Boston, MA: Addison-Wesley, 2005.
Shreve, S. E. Stochastic Calculus for Finance II: Continuous-Time Models. New York: Springer, 2004.
Tuckman, B. Fixed Income Securities: Tools for Today's Markets, 2nd Edn. New York: Wiley, 2002.
9 Portfolio Analysis and Option Strategies

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R, https://doi.org/10.1007/978-3-031-14283-3_9

The simplest method for solving a system of linear equations is to repeatedly eliminate variables. This method can be described as follows:

1. In the first equation, solve for one of the variables in terms of the others.
2. Substitute this expression into the remaining equations. This yields a system of equations with one fewer equation and one fewer unknown.
3. Continue until you have reduced the system to a single linear equation.
4. Solve this equation and then back-substitute until the entire solution is found.

9.2.2 Cramer's Rule

Explicit formulas exist for small systems (Reference: Wikipedia). Consider the linear system

$$\begin{cases} a_1 x + b_1 y = c_1 \\ a_2 x + b_2 y = c_2 \end{cases}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.$$

Assume $a_1 b_2 - b_1 a_2$ is nonzero. Then $x$ and $y$ can be found with Cramer's rule as

$$x = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{c_1 b_2 - b_1 c_2}{a_1 b_2 - b_1 a_2}$$

and
$$y = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{a_1 c_2 - c_1 a_2}{a_1 b_2 - b_1 a_2}.$$

The rules for 3 × 3 matrices are similar. Given

$$\begin{cases} a_1 x + b_1 y + c_1 z = d_1 \\ a_2 x + b_2 y + c_2 z = d_2 \\ a_3 x + b_3 y + c_3 z = d_3 \end{cases}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix},$$

the values of $x$, $y$, and $z$ can be found as follows:

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}.$$

These require determinant calculations. The determinant of a 3 × 3 matrix is defined by

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a\begin{vmatrix} e & f \\ h & i \end{vmatrix} - b\begin{vmatrix} d & f \\ g & i \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg) = aei + bfg + cdh - ceg - bdi - afh.$$

We use the same example as we did in the first method:

$$x = \frac{\begin{vmatrix} 5 & 3 & -2 \\ 7 & 5 & 6 \\ 8 & 4 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 3 & -2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} 1 & 5 & -2 \\ 3 & 7 & 6 \\ 2 & 8 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 3 & -2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{vmatrix}}, \qquad z = \frac{\begin{vmatrix} 1 & 3 & 5 \\ 3 & 5 & 7 \\ 2 & 4 & 8 \end{vmatrix}}{\begin{vmatrix} 1 & 3 & -2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{vmatrix}};$$

$$x = \frac{5\cdot 5\cdot 3 + 3\cdot 6\cdot 8 + (-2)\cdot 7\cdot 4 - (-2)\cdot 5\cdot 8 - 3\cdot 7\cdot 3 - 5\cdot 6\cdot 4}{1\cdot 5\cdot 3 + 3\cdot 6\cdot 2 + (-2)\cdot 3\cdot 4 - (-2)\cdot 5\cdot 2 - 3\cdot 3\cdot 3 - 1\cdot 6\cdot 4} = \frac{75 + 144 - 56 + 80 - 63 - 120}{15 + 36 - 24 + 20 - 27 - 24} = \frac{60}{-4} = -15,$$

$$y = \frac{21 + 60 - 48 + 28 - 45 - 48}{-4} = \frac{-32}{-4} = 8, \qquad z = \frac{40 + 42 + 60 - 50 - 72 - 28}{-4} = \frac{-8}{-4} = 2.$$

9.2.3 Matrix Method

Using the example in the last two sections above, we can derive the following matrix equation:

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 3 & -2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 5 \\ 7 \\ 8 \end{bmatrix}.$$

The inverse of matrix A is, by definition,

$$A^{-1} = \frac{1}{\det A}(\mathrm{Adj}\,A),$$

where the adjoint of A is the transpose of the cofactor matrix. First we need to calculate the cofactor matrix of A. Suppose the cofactor matrix is

$$\text{cofactor matrix} = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix},$$

with

$$A_{11} = \begin{vmatrix} 5 & 6 \\ 4 & 3 \end{vmatrix} = -9, \quad A_{12} = -\begin{vmatrix} 3 & 6 \\ 2 & 3 \end{vmatrix} = 3, \quad A_{13} = \begin{vmatrix} 3 & 5 \\ 2 & 4 \end{vmatrix} = 2,$$

$$A_{21} = -\begin{vmatrix} 3 & -2 \\ 4 & 3 \end{vmatrix} = -17, \quad A_{22} = \begin{vmatrix} 1 & -2 \\ 2 & 3 \end{vmatrix} = 7, \quad A_{23} = -\begin{vmatrix} 1 & 3 \\ 2 & 4 \end{vmatrix} = 2,$$

$$A_{31} = \begin{vmatrix} 3 & -2 \\ 5 & 6 \end{vmatrix} = 28, \quad A_{32} = -\begin{vmatrix} 1 & -2 \\ 3 & 6 \end{vmatrix} = -12, \quad A_{33} = \begin{vmatrix} 1 & 3 \\ 3 & 5 \end{vmatrix} = -4.$$

Therefore,

$$\text{Cofactor matrix} = \begin{bmatrix} -9 & 3 & 2 \\ -17 & 7 & 2 \\ 28 & -12 & -4 \end{bmatrix}.$$
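The Cramer's-rule arithmetic above is easy to check numerically; a small NumPy sketch of the same 3 × 3 example:

```python
import numpy as np

A = np.array([[1.0, 3.0, -2.0],
              [3.0, 5.0,  6.0],
              [2.0, 4.0,  3.0]])
d = np.array([5.0, 7.0, 8.0])

det_A = np.linalg.det(A)  # -4

# Cramer's rule: replace one column of A at a time with d
sol = []
for j in range(3):
    Aj = A.copy()
    Aj[:, j] = d
    sol.append(np.linalg.det(Aj) / det_A)
# sol is approximately [-15, 8, 2], matching np.linalg.solve(A, d)
```

In practice one would call `np.linalg.solve` directly; the column-replacement loop is shown only to mirror the hand calculation.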
Then, we can get the adjoint of A:

$$\mathrm{Adj}\,A = \begin{bmatrix} -9 & -17 & 28 \\ 3 & 7 & -12 \\ 2 & 2 & -4 \end{bmatrix}.$$

The determinant of A we have already calculated in Cramer's rule:

$$\det A = \begin{vmatrix} 1 & 3 & -2 \\ 3 & 5 & 6 \\ 2 & 4 & 3 \end{vmatrix} = -4,$$

so

$$A^{-1} = \frac{1}{-4}\begin{bmatrix} -9 & -17 & 28 \\ 3 & 7 & -12 \\ 2 & 2 & -4 \end{bmatrix} = \begin{bmatrix} \tfrac{9}{4} & \tfrac{17}{4} & -7 \\ -\tfrac{3}{4} & -\tfrac{7}{4} & 3 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & 1 \end{bmatrix}.$$

Therefore,

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \tfrac{9}{4} & \tfrac{17}{4} & -7 \\ -\tfrac{3}{4} & -\tfrac{7}{4} & 3 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & 1 \end{bmatrix}\begin{bmatrix} 5 \\ 7 \\ 8 \end{bmatrix} = \begin{bmatrix} \tfrac{9}{4}\cdot 5 + \tfrac{17}{4}\cdot 7 - 7\cdot 8 \\ -\tfrac{3}{4}\cdot 5 - \tfrac{7}{4}\cdot 7 + 3\cdot 8 \\ -\tfrac{1}{2}\cdot 5 - \tfrac{1}{2}\cdot 7 + 1\cdot 8 \end{bmatrix} = \begin{bmatrix} -15 \\ 8 \\ 2 \end{bmatrix}.$$

The Excel matrix inversion and multiplication method discussed in this section is identical to the method discussed in the previous section.

9.3 Markowitz Model for Portfolio Selection

The Markowitz model of portfolio selection is a mathematical approach for deriving optimal portfolios. There are two methods to obtain the optimal weights for portfolio selection: (a) the least risk for a given level of expected return, and (b) the greatest expected return for a given level of risk.

How does a portfolio manager apply these techniques in the real world? The process would normally begin with a universe of securities available to the fund manager. These securities would be determined by the goals and objectives of the mutual fund. For example, a portfolio manager who runs a mutual fund specializing in health-care stocks would be required to select securities from the universe of health-care stocks. This greatly reduces the analysis required of the fund manager by limiting the number of securities available.

The next step in the process would be to determine the proportions of each security to be included in the portfolio. To do this, the fund manager would begin by setting a target rate of return. The final step in the process would be for the fund manager to find the portfolio with the lowest variance given the target rate of return.
$$\begin{bmatrix} 0.0910 & 0.0018 & 0.0008 & 1 & 0.0053 \\ 0.0036 & 0.1228 & 0.0020 & 1 & 0.0055 \\ 0.0008 & 0.0020 & 0.1050 & 1 & 0.0126 \\ 1 & 1 & 1 & 0 & 0 \\ 0.0053 & 0.0055 & 0.0126 & 0 & 0 \end{bmatrix}\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0.00106 \end{bmatrix} \tag{9.5}$$

When matrix A is properly inverted and post-multiplied by K, the solution vector $A^{-1}K$ is derived:

$$\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = A^{-1}K = \begin{bmatrix} 0.9442 \\ 0.6546 \\ -0.5988 \\ -0.1937 \\ 20.1953 \end{bmatrix} \tag{9.6}$$

… model. The monthly rates of return for these three companies from 2016 to 2020 can be found in Appendix 9.1. The means, variances, and variance–covariance matrices for these three companies are presented in Fig. 9.1. By using the Excel program, we can calculate the optimal Markowitz portfolio, and its results are presented in Fig. 9.2.

In Fig. 9.2, the top portion is the equation system used to calculate the optimal weights, which was discussed previously. Then we use the input data and calculate the related information for the equation system, as presented in Step 1. Step 2 presents the procedure for calculating the optimal weights. Finally, in the lower portion of the figure, we present the expected rate of return and the variance for this optimal portfolio.

There is a special case of the Markowitz model: the minimum variance model. The only difference between the two models is that we exclude the expected return constraint, that is,

$$\sum_{i=1}^{n} W_i E(R_i) = E.$$
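Equation (9.5) can be solved directly in Python; a sketch that reproduces the weights in (9.6), with NumPy standing in for Excel's MINVERSE/MMULT step:

```python
import numpy as np

# coefficient matrix A and right-hand side K from Eq. (9.5)
A = np.array([
    [0.0910, 0.0018, 0.0008, 1.0, 0.0053],
    [0.0036, 0.1228, 0.0020, 1.0, 0.0055],
    [0.0008, 0.0020, 0.1050, 1.0, 0.0126],
    [1.0,    1.0,    1.0,    0.0, 0.0   ],
    [0.0053, 0.0055, 0.0126, 0.0, 0.0   ],
])
K = np.array([0.0, 0.0, 0.0, 1.0, 0.00106])

w1, w2, w3, lam1, lam2 = np.linalg.solve(A, K)
# w is approximately (0.9442, 0.6546, -0.5988): the weights sum to one
# and meet the 0.00106 expected-return target
```

The fourth and fifth equations enforce the budget and expected-return constraints, so the solved weights automatically satisfy both.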
…price at the expiration time T, and the strike price, respectively. Given X(E) = $140, ST (you can find the values for ST in the first column of the table in Fig. 9.4), and the premiums of $2.04 for the call option and $0.68 for the put option, Fig. 9.4 shows the values of the long straddle at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.5 for the calculations of the numbers in Fig. 9.4. The profit profile of the long straddle position is constructed in Fig. 9.6. The Break-even point is where the profit equals zero. The Upper Break-even point is calculated as (Strike Price of Long Call + Net Premium Paid) and the Lower Break-even point as (Strike Price of Long Put − Net Premium Paid). For this example, the upper break-even point is $142.72 and the lower break-even point is $137.28.

Fig. 9.5 Excel formula for calculating the value of a long straddle position at option expiration

[Fig. 9.6 Profit profile of the long straddle position]

9.4.2 Short Straddle

Contrary to the long straddle strategy, an investor will use a short straddle, via a short call and a short put on IBM stock with the same exercise price of $150, when he or she expects little or no movement in the price of IBM stock. Given X(E) = $150, ST (you can find the values for ST in the first column of the table in Fig. 9.7), and the premiums of $4.35 for the call option and $4.15 for the put option, Fig. 9.7 shows the values of the short straddle at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.8 for the calculations of the numbers in Fig. 9.7. The profit profile of the short straddle position is constructed in Fig. 9.9. The Upper Break-even point for the Short Straddle is calculated as (Strike Price of Short Call + Net Premium Received) and the Lower Break-even point as (Strike Price of Short Put − Net Premium Received). For this example, the upper break-even point is $158.50 and the lower break-even point is $141.50.

Fig. 9.8 Excel formula for calculating the value of a short straddle position at option expiration

[Fig. 9.9 Profit profile of the short straddle position]

9.4.3 Long Vertical Spread

This strategy combines a long call (or put) with a low strike price and a short call (or put) with a high strike price. For example, an investor purchases a call with the exercise price of $155 and sells a call with the exercise price of $150. Given X1(E1) = $155, X2(E2) = $150, ST (you can find the values for ST in the first column of the table in Fig. 9.10), and premiums of $1.97 for the long call option and $4.60 for the short call option, Fig. 9.10 shows the values of the Long Vertical Spread at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.11 for the calculations of the numbers in Fig. 9.10. The profit profile of the Long Vertical Spread is constructed in Fig. 9.12. The Break-even point for the Long Vertical Spread is calculated as (Strike Price of Long Call + Net Premium Paid). For this example, the break-even point is $152.63.

Fig. 9.11 Excel formula for calculating the value of a long vertical spread position at option expiration

[Fig. 9.12 Profit profile of the long vertical spread position]

9.4.4 Short Vertical Spread

Contrary to a long vertical spread, this strategy combines a long call (or put) with a high strike price and a short call (or put) with a low strike price. For example, an investor purchases a call with the exercise price of $150 and sells a call with the exercise price of $155. Given X1(E1) = $150, X2(E2) = $155, ST (you can find the values for ST in the first column of the table in Fig. 9.13), and premiums of $4.35 for the long call option and $2.13 for the short call option, Fig. 9.13 shows the values of the short vertical spread at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.14 for the calculations of the numbers in Fig. 9.13. The profit profile of the short vertical spread is constructed in Fig. 9.15. The Break-even point for the Short Vertical Spread is calculated as (Strike Price of Short Call + Net Premium Received). For this example, the break-even point is $152.22.

Fig. 9.14 Excel formula for calculating the value of a short vertical spread position at option expiration

9.4.5 Protective Put

Assume that an investor wants to invest in the IBM stock on March 9, 2011, but does not desire to bear any potential loss for prices below $150. The investor can purchase IBM stock and at the same time buy the put option with a strike price of $150. Given the current stock price S0 = $155.54, the exercise price X(E) = $150, ST (you can find the values for ST in the first column of the table in Fig. 9.16), and the premium for the put option of $4.40 (the ask price), Fig. 9.16 shows the values of the Protective Put at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.17 for the calculations of the numbers in Fig. 9.16. The profit profile of the Protective Put position is constructed in Fig. 9.18. The Break-even point for the Protective Put is calculated as (Purchase Price of Underlying + Premium Paid). For this example, the break-even point is $159.94.

Fig. 9.17 Excel formula for calculating the value of a protective put position at option expiration

[Fig. 9.18 Profit profile of the protective put position]

9.4.6 Covered Call

This strategy involves investing in a stock and selling a call option on the stock at the same time. The value at the expiration of the call will be the stock value minus the value of the call. The call is "covered" because the potential obligation of delivering the stock is covered by the stock held in the portfolio. In essence, the sale of the call sells the claim to any stock value above the strike price in return for the initial premium. Suppose a manager of a stock fund holds a share of IBM stock on October 12, 2015, and she plans to sell the IBM stock if its price hits $155. Then she can write a share of a call option with a strike price of $155 to establish the position. She shorts the call and collects the premium. Given the current stock price S0 = $151.14, X(E) = $155, ST (you can find the values for ST in the first column of the table in Fig. 9.19), and the premium for the call option of $1.97 (the bid price), Fig. 9.19 shows the values of the covered call at different stock prices at time T. For details, you can find the Excel functions in Fig. 9.20 for the calculations of the numbers in Fig. 9.19. The profit profile of the covered call position is constructed in Fig. 9.21. It can be shown that the payoff pattern of a covered call is exactly equal to that of shorting a put. Therefore, the covered call has frequently been used to replace shorting a put in dynamic hedging practice. The Break-even point for a Covered Call is calculated as (Purchase Price of Underlying − Premium Received). For this example, the break-even point is $149.17.

Fig. 9.20 Excel formula for calculating the value of a covered call position at option expiration

[Fig. 9.21 Profit profile of the covered call position]

9.4.7 Collar

A collar combines a protective put and a short call option to bracket the value of a portfolio between two bounds. For example, an investor holds the IBM stock selling at $151.10. Buying a protective put using the put option with an exercise price of $150 places a lower bound of $150 on the value of the portfolio. At the same time, the investor can write a call option with an exercise price of $155. You can find the values for ST in the first column of the table in Fig. 9.22. The call and the put sell at $1.97 (the bid price) and $4.40 (the ask price), respectively, making the net outlay for the two options only $2.43. Figure 9.22 shows the values of the collar position at different stock prices at time T.

Fig. 9.23 Excel formula for calculating the value of a collar position at option expiration

[Profit profile of the collar position]
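The straddle arithmetic of Sect. 9.4.1 can be sketched in Python, using the $140-strike example with call and put premiums of $2.04 and $0.68 (the same quotes appear for the 140 strike in Appendix 9.2; the function name is ours):

```python
def long_straddle_profit(st, strike, call_prem, put_prem):
    """Profit at expiration of a long straddle: long one call and one put
    at the same strike, net of the premiums paid."""
    call_payoff = max(st - strike, 0.0)
    put_payoff = max(strike - st, 0.0)
    return call_payoff + put_payoff - (call_prem + put_prem)

# Break-even points: strike plus/minus the net premium paid
net_prem = 2.04 + 0.68                          # 2.72
upper, lower = 140 + net_prem, 140 - net_prem   # 142.72 and 137.28
```

At either break-even point the profit is zero, and between them (for example at ST = 140) the long straddle loses money, consistent with the profit profile in Fig. 9.6.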
Appendix 9.2: Options Data for IBM (Stock Price = 141.34) on July 23, 2021
Contract name | Strike price | Last | Bid | Ask | Change | % Change (%) | Volume | Open interest | Implied volatility
IBM210730C00139000 139 2.79 2.64 2.94 0.06 +2.20 10 242 0.2073
IBM210730C00140000 140 2.04 1.98 2.16 0.39 +23.64 601 777 0.1929
IBM210730C00141000 141 1.44 1.39 1.47 0.26 +22.03 1,199 477 0.179
IBM210730C00142000 142 0.94 0.89 1.07 0.14 +17.50 997 601 0.1897
IBM210730C00143000 143 0.61 0.54 0.59 0.13 +27.08 291 437 0.1716
IBM210730C00144000 144 0.32 0.32 0.37 0.05 +18.52 437 739 0.1763
IBM210730C00145000 145 0.2 0.17 0.2 0.03 +17.65 616 1066 0.1738
IBM210730C00146000 146 0.11 0.1 0.12 0.02 +22.22 254 585 0.1797
IBM210730C00147000 147 0.07 0.06 0.08 −0.02 −22.22 65 252 0.1904
IBM210730C00148000 148 0.05 0.04 0.06 0 – 40 515 0.2041
IBM210730C00149000 149 0.05 0.03 0.05 0 – 9 132 0.2207
IBM210730C00150000 150 0.04 0.03 0.04 0.01 +33.33 82 1161 0.2344
IBM210730C00152500 152.5 0.03 0.02 0.03 −0.01 −25.00 34 690 0.2774
IBM210730C00155000 155 0.02 0.02 0.03 0 – 25 328 0.3262
IBM210730C00157500 157.5 0.02 0.02 0.03 −0.01 −33.33 2 961 0.375
IBM210730C00160000 160 0.02 0.01 0.03 0 – 66 138 0.4219
IBM210730C00162500 162.5 0.01 0.01 0.16 −0.04 −80.00 3 75 0.5391
IBM210730C00165000 165 0.01 0 0.02 −0.02 −66.67 6 50 0.4844
IBM210730P00125000 125 0.02 0 0 0 – 18 0 0.25
IBM210730P00128000 128 0.02 0 0 0 – 39 0 0.25
IBM210730P00129000 129 0.06 0 0 0 – 6 0 0.25
IBM210730P00130000 130 0.03 0 0 0 – 74 0 0.125
IBM210730P00131000 131 0.04 0 0 0 – 17 0 0.125
IBM210730P00132000 132 0.05 0 0 0 – 17 0 0.125
IBM210730P00133000 133 0.06 0 0 0 – 88 0 0.125
IBM210730P00134000 134 0.07 0 0 0 – 11 0 0.125
IBM210730P00135000 135 0.09 0 0 0 – 95 0 0.125
IBM210730P00136000 136 0.12 0 0 0 – 89 0 0.0625
IBM210730P00137000 137 0.14 0 0 0 – 70 0 0.0625
IBM210730P00138000 138 0.25 0 0 0 – 390 0 0.0625
IBM210730P00139000 139 0.41 0 0 0 – 193 0 0.0313
IBM210730P00140000 140 0.68 0 0 0 – 431 0 0.0313
IBM210730P00141000 141 0.97 0 0 0 – 284 0 0.0078
IBM210730P00142000 142 1.64 0 0 0 – 85 0 0
IBM210730P00143000 143 2.12 0 0 0 – 37 0 0
IBM210730P00144000 144 2.87 0 0 0 – 207 0 0
IBM210730P00145000 145 3.87 0 0 0 – 17 0 0
IBM210730P00146000 146 4.73 0 0 0 – 33 0 0
IBM210730P00147000 147 6.13 0 0 0 – 2 0 0
IBM210730P00148000 148 6.75 0 0 0 – 2 0 0
IBM210730P00149000 149 8.14 0 0 0 – 1 0 0
IBM210730P00150000 150 8.68 0 0 0 – 10 0 0
IBM210730P00152500 152.5 11.25 0 0 0 – 10 0 0
10 Simulation and Its Application
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 227
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_10
At the heart of the Monte Carlo simulation for option valuation is the stochastic process that generates the share price. The stochastic equation for the underlying share price at time T, when the option on the share expires, was given as follows:

S_T = S_0 exp[(μ − 0.5σ²)T + σ√T·ε].

The associated European option payoff depends on the expectation of S_T in the risk-neutral world. Thus, the stochastic equation for S_T for risk-neutral valuation takes the following form:

S_T = S_0 exp[(r − q − 0.5σ²)T + σ√T·ε].

The share price process outlined above is the same as that assumed for binomial tree valuation. RAND gives random numbers uniformly distributed in the range [0, 1]. Regarding its outputs as cumulative probabilities, the NORMSINV function converts them into standard normal variate values, mostly between −3 and 3. The random normal samples (the values of ε) are then used to generate share prices and the corresponding option payoffs.

In European option pricing, we need to estimate the expected value of the discounted payoff of the option:

f = e^{−rT} E(f_T)
  = e^{−rT} E[max(S_T − X, 0)]
  = e^{−rT} E[max(S_0 exp[(r − q − 0.5σ²)T + σ√T·ε] − X, 0)].

Using the Excel RAND() function, we can generate a uniform random number, as in cell E8. We simulate 100 random numbers in the sheet. Next, we use the NORMSINV function to transform a uniform random number into a standard normal random number in cell F8. Then the random normal samples are used to generate stock prices, as in G8. The stock price formula in G8 is

=$B$3*EXP(($B$5-$B$6-0.5*$B$7^2)*$B$8+$B$7*SQRT($B$8)*F8)

Finally, the corresponding call option payoff in cell H8 is

=MAX(G8-$B$4,0)

The discounted value of the average of the 100 simulated option payoffs is the call value estimated by the Monte Carlo simulation. Pressing F9 in Excel, we can generate a further 100 trials and another Monte Carlo estimate. The formula for the call option estimated by Monte Carlo simulation in H3 is

=EXP(-$B$5*$B$8)*AVERAGE(H8:H107)

The value in H3 is 5.49. Compared with the true Black–Scholes call value, there are some differences. To improve the precision of the Monte Carlo estimate, the number of simulation trials has to be increased. We can write a function for crude Monte Carlo simulation.
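The book implements this function in VBA (MCCall). As a cross-check, here is a minimal Python sketch of the same crude Monte Carlo estimator next to the analytic Black–Scholes benchmark; the function names and the parameter values used below are illustrative, not the worksheet's:

```python
import math
import random

def bs_call(S0, X, r, q, sigma, T):
    # Black-Scholes-Merton call value, used as the benchmark.
    d1 = (math.log(S0 / X) + (r - q + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return S0 * math.exp(-q * T) * N(d1) - X * math.exp(-r * T) * N(d2)

def mc_call(S0, X, r, q, sigma, T, n_repl, seed=0):
    # Crude Monte Carlo: draw eps ~ N(0,1), form the terminal price
    # S_T = S0*exp((r - q - 0.5*sigma^2)*T + sigma*sqrt(T)*eps),
    # then average the discounted payoffs max(S_T - X, 0).
    rng = random.Random(seed)
    drift = (r - q - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_repl):
        ST = S0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(ST - X, 0.0)
    return math.exp(-r * T) * total / n_repl
```

With enough trials the estimate settles near the analytic value; halving the standard error requires roughly four times as many trials.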
The Monte Carlo simulation formula for the European call option in K3 is

=MCCall(B3, B4, B5, B6, B8, B7, 1000)

In this case, we replicate 1000 times to get the call option value. The value in K3 is 5.2581, which is closer to the Black–Scholes value, 5.34.
10.3 Antithetic Variables

We can directly use this function in the worksheet to get the estimate of the antithetic method. After changing the number of replications, we can get the option prices for different numbers of replications.
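For reference, a minimal Python sketch of the antithetic-variates estimator (the book's MCCallAnti is a VBA worksheet function; the name and parameter values below are illustrative): each normal draw eps is paired with its mirror image -eps, and the two discounted payoffs are averaged, which typically reduces the variance of the estimate.

```python
import math
import random

def mc_call_antithetic(S0, X, r, q, sigma, T, n_repl, seed=0):
    # Antithetic variates: each draw eps is paired with -eps, and the two
    # payoffs are averaged before discounting.
    rng = random.Random(seed)
    drift = (r - q - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_repl):
        eps = rng.gauss(0.0, 1.0)
        pay_plus = max(S0 * math.exp(drift + vol * eps) - X, 0.0)
        pay_minus = max(S0 * math.exp(drift - vol * eps) - X, 0.0)
        total += 0.5 * (pay_plus + pay_minus)
    return math.exp(-r * T) * total / n_repl
```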
The formula for the call value of the antithetic variates method in K4 is

=MCCallAnti(B3, B4, B5, B6, B8, B7, 1000)

The value in K4 is closer to the Black–Scholes value in K5 than the K3 value estimated by Monte Carlo with 100 replications.

10.4 Quasi-Monte Carlo Simulation

Quasi-Monte Carlo simulation is another way to improve the efficiency of Monte Carlo. It solves problems using low-discrepancy sequences (also called quasi-random sequences or sub-random sequences). This is in contrast to regular Monte Carlo simulation, which is based on sequences of pseudorandom numbers. To generate U(0, 1) variables, the standard method is based on linear congruential generators (LCGs). An LCG starts from an initial value z_0 and generates each subsequent number through the formula

z_i = (a·z_{i−1} + c) mod m.

For example, 15 mod 6 = 3 (the remainder of integer division). The uniform random number is then

U_i = z_i / m.

There is nothing random in this sequence. First, it must start from an initial number z_0, the seed. Secondly, the generator is periodic.

The inverse transform is a general approach to transform uniform variates into normal variates. Since no analytical form for the inverse of the normal distribution function is known, we cannot invert it efficiently. One old-fashioned possibility, which is still suggested in some textbooks, is to exploit the central limit theorem and generate a normal random number by summing a suitable number of uniform variates; computational efficiency restricts how many uniform variates can be used. An alternative method is the Box–Muller approach. Consider two independent variables X, Y ~ N(0, 1), and let (R, θ) be the polar coordinates of the point with Cartesian coordinates (X, Y) in the plane, so that

d = R² = X² + Y²
θ = tan⁻¹(Y/X)

The Box–Muller algorithm can be represented as follows:

1. Generate two independent uniform random variates U1 and U2 ~ U(0, 1).
2. Set R² = −2·log(U1) and θ = 2π·U2.
3. Set X = R·cos θ and Y = R·sin θ; then X ~ N(0, 1) and Y ~ N(0, 1) are independent standard normal variates.

Here is the VBA function to generate Box–Muller normal random numbers:
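The VBA listing is not reproduced in this excerpt; an equivalent Python sketch of the Box–Muller transform, following the three steps above, looks like this:

```python
import math
import random

def box_muller(rng=random):
    # Map two independent U(0,1) draws to two independent N(0,1) draws.
    u1 = 1.0 - rng.random()             # in (0, 1], so log(u1) is finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))  # step 2: R^2 = -2*log(U1)
    theta = 2.0 * math.pi * u2          # step 2: theta = 2*pi*U2
    return r * math.cos(theta), r * math.sin(theta)  # step 3
```

Averaging many such draws should give a sample mean near 0 and a sample variance near 1.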
The Halton sequence is built from the base-b digit expansion of the integer index n:

n = (… d₄d₃d₂d₁d₀)_b = Σ_{k=0}^{m} d_k·b^k.
Using this function in the worksheet, we can get a sequence of numbers generated by the Halton function. In addition, we can change the prime number to get Halton numbers from a different base. The formula for the Halton number in B4 is

=halton(A4, 2)

which is the 16th number under base 2. We can change the base to 7, as shown in C4.

Two independent sequences generated by a Halton or random generator can be paired to construct a joint distribution. The results are shown in the figures below. We can see that the numbers generated from Halton's sequence are more evenly spread over the unit square (i.e., have lower discrepancy) than the numbers generated from the random generator in Excel.
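A minimal Python version of such a halton(n, base) function, using the radical-inverse construction from the base-b expansion shown above (the digits of n are reflected about the radix point):

```python
def halton(n, base=2):
    # Radical inverse of integer n in the given base: write
    # n = (... d2 d1 d0)_base and return 0.d0 d1 d2 ... in that base.
    h, f = 0.0, 1.0 / base
    while n > 0:
        n, d = divmod(n, base)  # peel off the lowest digit d
        h += d * f
        f /= base
    return h
```

For base 2 the first values are 1/2, 1/4, 3/4, 1/8, 5/8, …, filling the unit interval ever more finely.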
[Figure: joint scatter plots on the unit square of paired Halton numbers (base 2 vs. base 7) and of two Excel random sequences (rand1 vs. rand2).]
The Halton sequence has a desirable property: the error in any estimate based on the samples is proportional to 1/M rather than 1/√M, where M is the number of samples. We compare the Monte Carlo, antithetic variates, and quasi-Monte Carlo estimates for different simulation numbers. In the table below, we use different replication numbers, 100, 200, …, 2000, to price the option. The following figure shows the result.
10.5 Application
As an example, consider a down-and-out put option with strike price X, expiring in T time units, with a barrier set to S_b; S_0, r, q, and σ have their usual meanings. The relevant quantities are

a = (S_b/S_0)^(−1 + 2r/σ²)
b = (S_b/S_0)^(1 + 2r/σ²)
d1 = [ln(S_0/X) + (r − q + σ²/2)T] / (σ√T)
d2 = d1 − σ√T
d3 = [ln(S_0/S_b) + (r − q + σ²/2)T] / (σ√T)
d4 = d3 − σ√T
d5 = [ln(S_0/S_b) − (r − q − σ²/2)T] / (σ√T)
d6 = d5 − σ√T
d7 = [ln(S_0·X/S_b²) − (r − q − σ²/2)T] / (σ√T)
d8 = d7 − σ√T

To accomplish this, we can use the code below to generate a function:
The formula for the down-and-out put option in cell E5 is

=DOPut($B$3,$B$4,$B$5,$B$6,$B$8,E4,$E$2)

As volatility increases, the price of a down-and-out put option may decrease, because the stock can more easily drop across the barrier. We can see this effect in the figure below: as the volatility increases from 0.1 to 0.2, the barrier option price increases; however, as the volatility increases from 0.2 to 0.3, the barrier option price decreases.

However, the continuously monitored barrier option is theoretical. In practice, we can only monitor a down-and-out put option periodically, under the assumption that the barrier is checked at the end of each trading day. In order to price this barrier option, we have to generate a stock price process, not only the maturity price. Below are the functions to generate asset price processes using random numbers and Halton's sequence:
Here NSteps is the number of time intervals from now to option maturity, and NRepl is the number of replications to simulate. After we input the parameters, we can get the stock price process. Below we replicate three stock price processes for each method, each with 20 time intervals.
Because the output of this function is a matrix, we should follow the steps below to generate the outcome. First, select the range of cells in which you want to enter the array formula, in this example, D1:F21. Second, enter the formula that you want to use, in this example, AssetPaths(B3,B5,B6,B8,B7,20,3). Finally, press Ctrl + Shift + Enter.

Now, we can use a Monte Carlo simulation to compute the price of the down-and-out put option. The following function can help us accomplish this task:

Using the above function, we can get two outcomes in the cells H5:I5. H5 is the down-and-out put option value, and I5 is the number of times that the price crosses the barrier. The formula for the option price and the crossing count in cells H5:I5 is

=DOPutMC2(B3, B4, B5, B6, B8, B7, B17, B15, B16)

We should mark the range H5:I5, type the formula, and finally press Ctrl + Shift + Enter. Then we can get the result.

In order to see different crossing counts, we set two barriers S_b. In the first case, S_b is equal to 5. Because the barrier S_b in this case is 5, far below the exercise and stock prices, no price crosses the barrier.
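A hedged Python sketch of the same idea as the DOPutMC2 worksheet function: Monte Carlo pricing of the discretely monitored down-and-out put, returning both the price estimate and the number of paths that crossed the barrier. The function name and the parameter values used below are illustrative, not the workbook's.

```python
import math
import random

def do_put_mc(S0, X, r, q, sigma, T, Sb, n_steps, n_repl, seed=0):
    # Simulate full price paths; a path that touches the barrier Sb is
    # knocked out and pays zero, otherwise the payoff is max(X - S_T, 0).
    rng = random.Random(seed)
    dt = T / n_steps
    drift = (r - q - 0.5 * sigma**2) * dt
    vol = sigma * math.sqrt(dt)
    payoff_sum, n_crossed = 0.0, 0
    for _ in range(n_repl):
        S, crossed = S0, False
        for _ in range(n_steps):
            S *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if S <= Sb:
                crossed = True
                break
        if crossed:
            n_crossed += 1
        else:
            payoff_sum += max(X - S, 0.0)
    return math.exp(-r * T) * payoff_sum / n_repl, n_crossed
```

With a barrier far below the stock and strike prices, no path is knocked out and the estimate coincides with a plain European put estimate, matching the observation above.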
References

Boyle, Phelim P. "Options: A Monte Carlo Approach." Journal of Financial Economics, v. 4 (1977), pp. 323–338.
Boyle, Phelim, Mark Broadie, and Paul Glasserman. "Monte Carlo Methods for Security Pricing." Journal of Economic Dynamics and Control, v. 21 (1997), pp. 1267–1321.
Hull, John C. Options, Futures, and Other Derivatives. Prentice Hall, 2015.
Joy, Corwin, Phelim P. Boyle, and Ken Seng Tan. "Quasi-Monte Carlo Methods in Numerical Finance." Management Science, v. 42 (1996), pp. 926–938.
Wilmott, Paul. Paul Wilmott on Quantitative Finance. John Wiley & Sons, 2013.

On the Web

http://roth.cs.kuleuven.be/wiki/Main_Page
Part III
Applications of Python, Machine Learning for Financial Derivatives and Risk Management
11 Linear Models for Regression
The goal of regression is to predict the target value y as a function f(x) of the d-dimensional input variables x, where the underlying function f is unknown (Altman and Krzywinski 2015). Examples include predicting GDP from the inflation x, or predicting cancer or not (y = 0, 1) from a patient's X-ray image x. The former example is a regression problem with a continuous target variable y, while the second is a classification problem. In either case, our objective is to choose a specific function f(x) for each input x. A polynomial is one specific example of a broad class of functions used to proxy the underlying function f. A more useful class of functions, linear combinations of a set of basis functions, which are linear in the parameters but nonlinear with respect to the input variables, gives simple analytical properties for estimation and prediction purposes.

In choosing f(x) for the underlying function, we incur a loss L[y, f(x)], and the optimal function f(x) is the one that minimizes the loss function. However, the loss function L depends on whether the problem is a regression with a continuous target variable or a classification (Altman and Krzywinski 2015). In the following, we will start from a regression problem with a continuous target variable y, in which the underlying function f is modeled as a linear combination of a set of basis functions.

This chapter is broken down into the following sections. Section 11.2 discusses loss functions and least squares, Sect. 11.3 discusses regularized least squares—Ridge and Lasso regression, and Sect. 11.4 discusses logistic regression for classification: a discriminative model. Section 11.5 talks about K-fold cross-validation, and Sect. 11.6 discusses the types of basis functions. Section 11.7 looks at accuracy measures in classification, and Sect. 11.8 is a Python programming example. Finally, Sect. 11.9 summarizes the chapter.

Consider a training dataset of N examples with the inputs {x_i | i = 1, …, N} ⊂ R^D; the target is the sum of the model function f(x_i) and the noise e_i, i.e.,

y_i = f(x_i) + e_i    (11.1)

where 1 ≤ i ≤ N, and e_1, …, e_N are i.i.d. Gaussian noises with mean zero and variance γ⁻¹. In many practical applications, the d-dimensional x is preprocessed to yield features expressed in terms of a set of basis functions φ(x) = [φ_0(x), …, φ_M(x)]′, and the model output is

f(x_i) = Σ_{j=0}^{M} φ_j(x_i)·w_j = φ(x_i)′w    (11.2)

where φ(x_i) = [φ_0(x_i), …, φ_M(x_i)]′ is the set of basis functions {φ_j(x_i) | j = 0, …, M}, and w = [w_0, …, w_M]′ are the corresponding weight parameters. Typically, φ_0(x) = 1, so that w_0 acts as a bias. Popular basis functions are given in Sect. 11.6. To find an estimator ŷ of the target variable y, one often considers the squared-error loss function

L[y, ŷ(x)] = (y − ŷ(x))².

Suppose the estimator ŷ is the one that minimizes the expected loss function given by

E(L) = ∬ L[y, ŷ(x)] p(x, y) dx dy    (11.3)

where p(x, y) is the joint probability function of x and y. As the noises e_1, …, e_N in (11.1) are i.i.d. Gaussian with mean zero and variance γ⁻¹, it can be shown that the estimator ŷ(x) that minimizes the expected squared-error loss function E(L) in (11.3) is simply the conditional mean

ŷ(x) = E(y | x) = f(x).

Therefore, like all forms of regression analysis, the focus is on the conditional probability distribution p(y | x) rather
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 249
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_11
than on the joint probability distribution p(x, y). In the following section, with the model function f(x) given in the form of (11.2), we discuss the procedure to obtain estimates of the weight parameters w, and thus an estimate of the model f(x).

11.3 Regularized Least Squares—Ridge and Lasso Regression

(1) Ridge Regression: The modified sum-of-squares error function is

E_r1(w) = Σ_{i=1}^{N} (y_i − φ(x_i)w)² + λ Σ_{j=0}^{M} w_j²    (11.5a)

(2) Lasso Regression: The modified sum-of-squares error function is

E_r2(w) = Σ_{i=1}^{N} (y_i − φ(x_i)w)² + λ Σ_{j=0}^{M} |w_j|    (11.5b)
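Both the plain sum-of-squares criterion and the ridge criterion (11.5a) have closed-form minimizers. Here is a compact NumPy sketch using a polynomial basis as an example; the Lasso penalty (11.5b) has no closed form and needs an iterative solver, so it is omitted. Function names are mine, for illustration only.

```python
import numpy as np

def design_matrix(x, M):
    # Polynomial basis: phi_j(x) = x**j for j = 0..M (phi_0 = 1 is the bias).
    return np.vander(x, M + 1, increasing=True)

def ols_fit(Phi, y):
    # Least squares: w minimizing sum_i (y_i - phi(x_i)' w)^2.
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def ridge_fit(Phi, y, lam):
    # Ridge (11.5a): w = (lam*I + Phi'Phi)^{-1} Phi'y.
    k = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(k) + Phi.T @ Phi, Phi.T @ y)
```

Setting λ = 0 recovers the ordinary least-squares solution, while larger λ shrinks the weight vector toward zero.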
Thus, one has f(x_i) = φ(x_i)w, 1 ≤ i ≤ N, as the log odds ratio

φ(x_i)w = ln[p_i / (1 − p_i)]

where p_i = p(C_1 | x_i), 1 ≤ i ≤ N. For this reason, (11.6) is termed logistic regression. For a training dataset (x_1, y_1), …, (x_N, y_N), the likelihood function is

l = Π_{i=1}^{N} p_i^{y_i} (1 − p_i)^{1 − y_i}

By taking the negative logarithm of the likelihood l, we obtain the error function in the cross-entropy form

E(l) = −Σ_{i=1}^{N} {y_i ln(p_i) + (1 − y_i) ln(1 − p_i)}    (11.7)

There is no closed-form solution minimizing the cross-entropy error function in (11.7), due to the nonlinearity of the logistic sigmoid function σ in (11.6). However, as the cross-entropy error function (11.7) is convex, a unique minimum exists, and an efficient iterative technique, the Newton–Raphson optimization scheme based on the gradient of the error function in (11.7) with respect to w, can be applied.

To extend the two-class classifier to K > 2 classes, we can use either of the following algorithms:

(1) One-versus-the-rest classifier: Use (K − 1) two-class classifiers, each of which solves the two-class classification problem of separating class C_k from the other classes, 1 ≤ k ≤ K.
(2) One-versus-one classifier: Use K(K − 1)/2 two-class classifiers, one for every possible pair of classes.

Cross-validation is a popular method because it is simple to understand and generally results in less biased estimates of model skill than other methods, such as a simple train/test split. The general procedure is as follows:

1. Shuffle the dataset randomly;
2. Split the dataset into K groups;
3. For each group:
   (a) Take the group as a hold-out or test dataset, and the remaining groups as a training dataset;
   (b) Fit a model on the training set and evaluate it on the test set;
   (c) Retain the evaluation score and discard the model;
   (d) Summarize the skill of the model using the sample of model evaluation scores.

The K value must be chosen carefully for your data sample. A poorly chosen value for K may give a misrepresentative idea of the model's skill, such as a score with a high variance or a high bias. Three common tactics for choosing a value for K are as follows:

• Representative: The value for K is chosen such that each train/test group of data samples is large enough to be statistically representative of the broader dataset.
• K = 10: The value for K is fixed to 10, a value that has been found through experimentation to generally result in a model skill estimate with low bias and a modest variance.
• K = n: The value for K is fixed to n, the size of the dataset, to give each test sample an opportunity to be used in the hold-out dataset. This approach is called leave-one-out cross-validation.

The results of a K-fold cross-validation run are often summarized with the mean of the model skill scores. It is also good practice to include a measure of the variance of the scores, such as the standard deviation or standard error.
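The K-fold splitting procedure above can be sketched in a few lines of Python (a hand-rolled version for illustration; scikit-learn's KFold class does the same job in practice):

```python
import random

def kfold_indices(n, k, seed=0):
    # Shuffle the indices 0..n-1 and split them into k nearly equal folds;
    # each fold serves once as the hold-out (test) set while the remaining
    # indices form the training set.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]
```

Every sample appears in exactly one test fold, and the train/test sets of each split are disjoint.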
The kernel concept was introduced into the field of pattern recognition by Aizerman et al. (1964). It was re-introduced into machine learning in the context of large margin classifiers by Boser et al. (1992). The kernel concept allows us to build interesting extensions of many well-known algorithms. These algorithms require the raw data to be explicitly transformed into representations via a user-specified feature map. Kernel methods, instead, require only a user-specified similarity function over pairs of data points in raw representation. This dual representation of raw data gives rise to the kernel trick, which enables these methods to operate in a high-dimensional, implicit feature space without ever computing the coordinates of the data in that space, but rather by simply computing the inner products between the images of all pairs of data in the feature space. Any linear model can be turned into a nonlinear model by applying the kernel trick to the model: replacing its features (predictors) by a kernel function.

Algorithms capable of operating with kernels include kernel regression, Gaussian process regression, support vector machines, principal components analysis (PCA), spectral clustering, linear adaptive filters, and many others. In the following, the ideas of the kernel approach and its applications will be given.

The sections of this chapter are as follows. Section 12.2 discusses constructing kernels. Section 12.3 discusses the Nadaraya–Watson model of kernel regression, Sect. 12.4 talks about relevance vector machines, and Sect. 12.5 talks about the Gaussian process for regression. Section 12.6 discusses support vector machines, and Sect. 12.7 talks about Python programming.

A kernel function corresponds to a scalar product in some feature space. For models based on a fixed nonlinear feature-space mapping φ(x), the corresponding kernel function is the inner product

k(x, x′) = φ(x)ᵀφ(x′).

Obviously, a kernel function is symmetric in its arguments, i.e., k(x, x′) = k(x′, x). Some examples include:

1. Linear Kernel—k(x, x′) = xᵀx′.
2. Polynomial Kernel—k(x, x′) = (xᵀx′ + 1)^d, where d is the degree of the polynomial.

There are many other forms of kernel functions in common use. One type is known as stationary kernels, which satisfy k(x, x′) = κ(x − x′); in other words, stationary kernels are functions of the difference between the arguments only, and thus are invariant to translations in input space. Another type involves radial basis functions, which depend only on the magnitude of the distance (typically Euclidean) between the arguments, so that k(x, x′) = κ(‖x − x′‖). The most well-known example is the Gaussian kernel:

3. Gaussian Kernel—k(x, x′) = exp(−γ‖x − x′‖²).

12.3 Kernel Regression (Nadaraya–Watson Model)
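Before developing kernel regression, note that the three example kernels listed in Sect. 12.2 above translate directly into Python (a plain-list version, kept deliberately simple):

```python
import math

def k_linear(x, xp):
    # Linear kernel: k(x, x') = x . x'
    return sum(a * b for a, b in zip(x, xp))

def k_poly(x, xp, d=2):
    # Polynomial kernel: k(x, x') = (x . x' + 1)^d
    return (k_linear(x, xp) + 1.0) ** d

def k_gauss(x, xp, gamma=1.0):
    # Gaussian kernel: k(x, x') = exp(-gamma * ||x - x'||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xp)))
```

Each is symmetric in its arguments, and the Gaussian kernel attains its maximum value 1 when x = x′.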
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 261
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_12
Given a training dataset of N examples, {x_i | i = 1, …, N} are the inputs and {y_i | i = 1, …, N} are the corresponding target values. The goal is to find a smooth function f(x) that fits every target value as closely as possible, which can be achieved by expressing f(x) as a linear combination of radial basis functions, one centered on every data point:

f(x) = Σ_{j=1}^{N} φ(x − x_j)·y_j    (12.1)

where φ is a radial basis function. As the inputs {x_i | i = 1, …, N} are noisy, the kernel is constructed from a component density h(x, y), with g(x) = ∫ h(x, y) dy. The resulting estimator (12.2) is known as the Nadaraya–Watson model, or kernel regression (Nadaraya 1964; Watson 1964). For a localized kernel function, it has the property of giving more weight to the data points that are close to x. An example of the component density h(x, y) is the standard normal density. A more general joint density p(x, y) involves a Gaussian mixture model, in which the number of components in the mixture model can be smaller than the number of training set points, resulting in a model that is faster to evaluate for test data points.

12.4 Relevance Vector Machines

The Relevance Vector Machine (RVM), a Bayesian sparse kernel technique for regression and classification, was introduced by Tipping (2001). As a Bayesian approach, it produces sparse solutions using an improper hierarchical prior and optimizing over hyper-parameters. More specifically, given a training dataset of N examples with inputs {x_i | i = 1, …, N} ⊂ R^D, the target is the sum of the model output f(x_i) and the noise e_i, i.e.,

y_i = f(x_i) + e_i

where 1 ≤ i ≤ N, and the model output is

f(x_i) = Σ_{j=1}^{N} φ_j(x_i)·w_j = φ(x_i)w    (12.4)

The posterior distribution p(w|y), which is proportional to the product of the prior p(w|A) and the likelihood (12.6), is given by

p(w|y) ~ N(m, S_N)    (12.8)

where m = βS_N Φ′y and S_N = [A + βΦ′Φ]⁻¹ are the posterior mean and covariance of w, respectively.

In the process of estimating α_1, …, α_N and β, a proportion of the hyper-parameters {α_i} are driven to large values, so the weight parameters w_i, 1 ≤ i ≤ N, corresponding to the large α_i have posterior distributions with mean and variance both zero. Thus those parameters w_i and the corresponding basis functions φ_i(x), 1 ≤ i ≤ N, are removed from the model, play no role in making predictions for new inputs, and are ultimately responsible for the sparsity property. On the other hand, the examples x_i associated with nonzero weights w_i
are termed "relevance" vectors. In other words, the RVM satisfies the principle of automatic relevance determination (ARD) via the hyper-parameters α_i, 1 ≤ i ≤ N (Tipping 2001).

With the posterior distribution p(w|y), the predictive distribution p(y*|x*, y) of y* at a new test input x*, obtained by integrating the likelihood p(y*|x*) over the posterior distribution p(w|y), can be formulated as

p(y*|x*, y) ~ N(m′φ(x*), σ*²)    (12.9)

where the variance of the predictive distribution is

σ*² = 1/β + φ′(x*) S_N φ(x*).    (12.10)

Here S_N is the posterior covariance given in (12.7). If the N basis functions φ(x) = [φ_1(x), …, φ_N(x)] are localized, with centers at the inputs {x_i | i = 1, …, N} of the training dataset, then as the test input x* moves away from the N centers, the contribution from the second term in (12.10) gets smaller, leaving only the noise contribution 1/β. In other words, the model becomes very confident in its predictions when extrapolating outside the region occupied by the N centers of the training dataset, which is generally undesirable behavior. For this reason, in the following we consider a more appropriate model, namely Gaussian process regression, which avoids this undesirable behavior of the RVM.

The Gaussian process predictive mean and variance are

m_G = k*′[K + βI_N]⁻¹ y    (12.12)

σ*² = k[x*, x*] − k*′[K + βI_N]⁻¹ k*    (12.13)

Here k*′ is the row vector k*′ = (k[x_1, x*], …, k[x_N, x*]), and I_N is the N × N identity matrix. Suppose the N × N covariance matrix K is degenerate, i.e., K can be expanded by a finite set of basis functions, namely

K = ΦRΦ′

where Φ is the N × M matrix with (i, j)th entry Φ_ij = φ_j(x_i), 1 ≤ i ≤ N, 1 ≤ j ≤ M; {φ_1(x), …, φ_M(x)} is a set of M basis functions; and R is an M × M diagonal matrix. It can then be shown that the predictive variance (12.13) is smaller when k*′ lies in the direction of the eigenvectors corresponding to zero eigenvalues of the covariance matrix K; that is, the predictive variance (12.13) is smaller when Φ′k* = 0. If the basis functions in Φ are localized basis functions, the same problem is met as in the RVM: the model becomes very confident in its predictions when extrapolating outside the region occupied by the basis functions. For the above reasons, when adopting Gaussian process regression, a covariance matrix K based on a non-degenerate kernel function is considered.

Gaussian process regression lacks the mechanism of automatic relevance determination (ARD); moreover, its main limitation is that memory requirements and computational demands grow as the square and the cube, respectively, of the number of training examples N. To overcome the computational limitations, numerous authors have suggested a wealth of sparse approximations (Csató and Opper 2002; Seeger et al. 2003; Quiñonero-Candela and Rasmussen 2005; Snelson and Ghahramani 2006).

12.6 Support Vector Machines

In a support vector machine, the model takes the form

f(x_i) = φ(x_i)w + b

where φ(x) denotes a fixed feature-space transformation and b is the bias parameter. The N data points x_1, …, x_N are labeled.
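A small NumPy sketch of the Gaussian process predictive equations (12.12)–(12.13) above, using the Gaussian kernel. Here the scalar `noise` stands in for the β term multiplying the identity, and all names and values are illustrative:

```python
import numpy as np

def gp_predict(X, y, x_star, gamma=0.5, noise=1e-6):
    # Predictive mean  m_G = k*' (K + noise*I)^{-1} y       (cf. 12.12)
    # Predictive var   s2  = k(x*,x*) - k*' (K + noise*I)^{-1} k*  (cf. 12.13)
    def k(a, b):
        return float(np.exp(-gamma * np.sum((a - b) ** 2)))  # Gaussian kernel
    n = len(X)
    K = np.array([[k(X[i], X[j]) for j in range(n)] for i in range(n)])
    ks = np.array([k(X[i], x_star) for i in range(n)])
    A = K + noise * np.eye(n)
    alpha = np.linalg.solve(A, y)   # (K + noise*I)^{-1} y
    v = np.linalg.solve(A, ks)      # (K + noise*I)^{-1} k*
    return ks @ alpha, k(x_star, x_star) - ks @ v
```

At a training input with a tiny noise term, the predictive mean nearly reproduces the observed target and the predictive variance is close to zero.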
264 12 Kernel Linear Model
many hyperplanes that might classify the two classes of the N data points. The best is the one that represents the largest separation, or margin, between the two classes of data points. If such a hyperplane exists, it is known as the maximum-margin hyperplane, and the linear classifier it defines is known as a maximum-margin classifier or, equivalently, the perceptron of optimal stability. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training-data point of any class (the so-called functional margin).

More formally, suppose the hyperplane that separates the two classes of data points is given by f(x) = 0; then the perpendicular distance of a data point x from the hyperplane f(x) = 0 takes the form

    |f(x)| / \|w\| = y[\phi(x) w + b] / \|w\|    (12.14)

where y is the label of the data point x. Now the margin is defined as the perpendicular distance to the closest data point from the data set, say, x_n, 1 ≤ n ≤ N. The parameters w and b are those that maximize the margin in (12.14). The optimization problem is equivalent to minimizing ||w||², subject to the constraint that

    y_i [\phi(x_i)' w + b] \geq 1    (12.15)

for all 1 ≤ i ≤ N. In the case the equality holds, the constraints are said to be active, whereas for the remainder they are said to be inactive. Any data point for which the equality holds is called a support vector, and the remaining data points play no role in making predictions for new data points. By definition, there will always be at least one active constraint, because there will always be a closest point, and once the margin has been maximized there will be at least two active constraints. The dual representation of the maximum-margin problem in (12.15) is to maximize

    \sum_{i=1}^{N} a_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} a_i a_j y_i y_j k(x_i, x_j)    (12.16)

subject to the constraints a_i ≥ 0 for all 1 ≤ i ≤ N, and

    \sum_{i=1}^{N} a_i y_i = 0    (12.17)

where the kernel function k(x_i, x_j) = φ(x_i)φ(x_j). To solve the maximization problem (12.16)–(12.17), a quadratic programming technique is required. Once the maximization problem (12.16)–(12.17) is solved, the weight parameters are given by w = \sum_{i=1}^{N} a_i y_i \phi(x_i).

In order to classify a new data point x using the trained model, we evaluate the sign of wφ(x) + b: since ŷ[wφ(x) + b] ≥ 0 for correctly classified points, we set ŷ = 1 if wφ(x) + b ≥ 0 and ŷ = −1 otherwise.

Whereas above we considered a linear hyperplane, it often happens that the sets to discriminate are not linearly separable in that space. In addition to linear classification, the formulation of the objective function (12.16) allows SVMs to efficiently perform a nonlinear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. It was proposed that the original finite-dimensional space be mapped into a much higher-dimensional space, presumably making the separation easier in that space. To keep the computational load reasonable, the mappings are designed so that the dot products of pairs of input data points are defined by a kernel function to suit the problem.

12.7 Python Programming

Consider the dataset of credit card holders' payment data in October 2005, from a bank (a cash and credit card issuer) in Taiwan. Among the total 25,000 observations, 5529 observations (22.12%) are the cardholders with default payment. Thus the target variable y is the default payment (Yes = 1, No = 0), and the explanatory variables are the following 23 variables:

• X1: Amount of the given credit (NT dollar): it includes both the individual consumer credit and his/her family (supplementary) credit.
• X2: Gender (1 = male; 2 = female).
• X3: Education (1 = graduate school; 2 = university; 3 = high school; 4 = others).
• X4: Marital status (1 = married; 2 = single; 3 = others).
• X5: Age (year).
• X6–X11: History of past payment from September to April 2005. (The measurement scale for the repayment status is: −1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above.)
• X12–X17: Amount of bill statement from September to April 2005.
• X18–X23: Amount of previous payment (NT dollar) from September to April 2005.
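Before turning to the exercises, the workflow described above — maximize the margin, solve the dual (12.16)–(12.17), and classify by the sign of wφ(x) + b — is what scikit-learn's SVC does internally; the following hedged sketch applies it to made-up ring-shaped data to show the effect of the kernel trick:

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data that are not linearly separable:
# class 1 lies inside the unit circle, class 0 outside
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (np.sqrt((X ** 2).sum(axis=1)) < 1.0).astype(int)

# A linear kernel versus a Gaussian (RBF) kernel; the RBF kernel
# implicitly maps the inputs into a high-dimensional feature space
linear = SVC(kernel='linear').fit(X, y)
rbf = SVC(kernel='rbf', gamma=1.0).fit(X, y)

# The support vectors are the data points with active constraints in (12.15)
print(linear.score(X, y), rbf.score(X, y), len(rbf.support_))
```

On such data the RBF kernel typically attains much higher training accuracy than the linear kernel, which can do little better than predicting the majority class.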
12.8 Kernel Linear Model and Support Vector Machines 265
Question 1. Plot the pairwise relationships of the selected fields, with the default status as hue

sns.set_context('talk')
sns.set_palette('dark')
sns.set_style('white')

fields = list(data.columns[7:9])
X = data[fields]
X['Default'] = data["Y"]  # Add the last column "Default"
sns.pairplot(X, hue='Default')

Question 2a. Get the "correlations" between X1–X11 and y, and plot the bar plot

fields = list(data.columns[0:11])
y = data.Y
correlations = data[fields].corrwith(y)
ax = correlations.plot(kind='bar')
ax.set(ylim=[-1, 1], ylabel='pearson correlation')
Question 3. Scale the two features most correlated (in absolute value) with y, fit a LinearSVC model, and plot its decision boundary

# imports assumed earlier in the chapter: numpy as np, pandas as pd,
# matplotlib.pyplot as plt, seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import LinearSVC

correlations.sort_values(inplace=True)
correlationsAbs = correlations.map(abs).sort_values()
fields = correlationsAbs.iloc[-2:].index
X = data[fields]
scaler = MinMaxScaler()
X = pd.DataFrame(scaler.fit_transform(X), columns=fields)

LSVC = LinearSVC()
LSVC.fit(X, y)

X_default = X.sample(900, random_state=45)
y_default = y.loc[X_default.index]
y_color = y_default.map(lambda r: 'red' if r == 1 else 'blue')

x_axis, y_axis = np.arange(0, 1, .005), np.arange(0, 1, .005)
xx, yy = np.meshgrid(x_axis, y_axis)
xx_ravel = xx.ravel()
yy_ravel = yy.ravel()
X_grid = pd.DataFrame([xx_ravel, yy_ravel]).T
y_grid_predictions = LSVC.predict(X_grid)
y_grid_predictions = y_grid_predictions.reshape(xx.shape)

ax = plt.axes()
ax.contourf(xx, yy, y_grid_predictions, alpha=0.3)  # decision regions
ax.scatter(X_default[fields[0]], X_default[fields[1]],
           color=y_color, alpha=1)
ax.set(
    xlabel=fields[0],
    ylabel=fields[1],
    xlim=[0, 1],
    ylim=[0, 1])
Question 4. Fit a Gaussian kernel SVC and see how the decision boundary changes

• Consolidate the code snippets in Question 3 into one function which takes in an estimator, X, and y, and produces the final plot with decision boundary. The steps are
  1. fit model
  2. get a sample of 900 records from X and the corresponding y's
  3. create grid, predict, plot using ax.contourf
  4. add on the scatter plot
• After copying and pasting code, make sure the finished function uses your input estimator and not the LinearSVC model you built.
• For the following values of gamma, create a Gaussian Kernel SVC and plot the decision boundary: gammas = [10, 20, 100, 200]
• Holding gamma constant, for various values of C, plot the decision boundary. You may try Cs = [0.1, 1, 10, 50]

def plot_decision_boundary(estimator, X, y):
    estimator.fit(X, y)
    X_default = X.sample(900, random_state=45)
    y_default = y.loc[X_default.index]
    y_color = y_default.map(lambda r: 'red' if r == 1 else 'blue')
    x_axis, y_axis = np.arange(0, 1, .005), np.arange(0, 1, .005)
    xx, yy = np.meshgrid(x_axis, y_axis)
    xx_ravel = xx.ravel()
    yy_ravel = yy.ravel()
    X_grid = pd.DataFrame([xx_ravel, yy_ravel]).T
    y_grid_predictions = estimator.predict(X_grid)
    y_grid_predictions = y_grid_predictions.reshape(xx.shape)
    ax = plt.axes()
    ax.contourf(xx, yy, y_grid_predictions, alpha=0.3)   # step 3
    ax.scatter(X_default[fields[0]], X_default[fields[1]],
               color=y_color, alpha=1)                   # step 4
    ax.set(
        xlabel=fields[0],
        ylabel=fields[1],
        title=str(estimator))

plot_decision_boundary(SVC_Gaussian, X, y)
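Because the full plotting pipeline needs the credit card data, the effect of gamma can also be checked numerically on made-up data; in this hedged sketch the number of support vectors indicates how flexible each fitted boundary is (the toy features and labels are assumptions of the sketch):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 2))     # stand-in for the two scaled features
y = (X[:, 0] + 0.1 * rng.normal(size=300) > X[:, 1]).astype(int)

counts = []
for gamma in [10, 20, 100, 200]:
    # One Gaussian (RBF) kernel SVC per gamma; a larger gamma means a
    # narrower kernel and therefore a more wiggly decision boundary
    SVC_Gaussian = SVC(kernel='rbf', gamma=gamma).fit(X, y)
    counts.append(SVC_Gaussian.n_support_.sum())
print(counts)
```

Passing each fitted `SVC_Gaussian` to `plot_decision_boundary` would produce the corresponding plot.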
Question 5. Fit a Polynomial kernel SVC with degree 5 and see how the decision boundary changes

• Use the plot_decision_boundary function from the previous question and try the Polynomial Kernel SVC
• For various values of C, plot the decision boundary. You may try Cs = [0.1, 1, 10, 50]
• Try to find out a C value that gives the best possible decision boundary

Cs = [0.1, 1, 10, 50]
for C in Cs:
    SVC_Polynomial = SVC(kernel='poly', degree=5, C=C)
    plot_decision_boundary(SVC_Polynomial, X, y)
• Get the mean and standard deviation on the set for the various combinations of gammas = [10, 20, 100, 200] and Cs = [0.1, 1, 10, 100]
• Print the best parameters in the training set

from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix, accuracy_score

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

coeff_labels_gamma = ['gamma=10', 'gamma=20', 'gamma=100', 'gamma=200']

y_pred = list()
for lab, gamma in zip(coeff_labels_gamma, [10, 20, 100, 200]):
    clf = SVC(kernel='rbf', gamma=gamma)
    lr = clf.fit(X_train, y_train)
    y_pred.append(pd.Series(lr.predict(X_test), name=lab))
metrics = list()
cm = dict()
for lab, pred in zip(coeff_labels_gamma, y_pred):
    precision, recall, fscore, _ = precision_recall_fscore_support(y_test, pred, average='weighted')
    accuracy = accuracy_score(y_test, pred)
    cm[lab] = confusion_matrix(y_test, pred)
    metrics.append(pd.Series({'precision': precision, 'recall': recall,
                              'fscore': fscore, 'accuracy': accuracy},
                             name=lab))
metrics
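The grid of gamma and C values can also be searched in one step with scikit-learn's GridSearchCV; the following sketch uses made-up data in place of the credit card features (the data and the cv setting are assumptions):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(300, 2))          # stand-in features
y = (X[:, 0] > X[:, 1]).astype(int)           # stand-in target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

param_grid = {'gamma': [10, 20, 100, 200], 'C': [0.1, 1, 10, 100]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=3)
grid.fit(X_train, y_train)

# Mean and standard deviation of the scores for every gamma-C combination,
# and the best parameters found on the training set
means = grid.cv_results_['mean_test_score']
stds = grid.cv_results_['std_test_score']
print(grid.best_params_)

y_pred = grid.predict(X_test)
precision, recall, fscore, _ = precision_recall_fscore_support(
    y_test, y_pred, average='weighted')
accuracy = accuracy_score(y_test, y_pred)
```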
fig, axList = plt.subplots(2, 2)
axList = axList.flatten()
fig.set_size_inches(10, 10)
axList[-1].axis('on')
# axList[:] will list all the 4 confusion tables; axList[:-1] list the first three confusion tables
for ax, lab in zip(axList[:], cm):
    sns.heatmap(cm[lab], ax=ax, annot=True, fmt='d')  # plotting call reconstructed
    ax.set(title=lab)
# Repeat the evaluation for the C values (gamma held constant; its value is assumed)
coeff_labels_C = ['C=0.1', 'C=1', 'C=10', 'C=100']
y_pred = list()
for lab, C in zip(coeff_labels_C, [0.1, 1, 10, 100]):
    lr = SVC(kernel='rbf', gamma=20, C=C).fit(X_train, y_train)
    y_pred.append(pd.Series(lr.predict(X_test), name=lab))
metrics = list()
cm = dict()
for lab, pred in zip(coeff_labels_C, y_pred):
    precision, recall, fscore, _ = precision_recall_fscore_support(y_test, pred, average='weighted')
    cm[lab] = confusion_matrix(y_test, pred)
    metrics.append(pd.Series({'precision': precision, 'recall': recall,
                              'fscore': fscore, 'accuracy': accuracy_score(y_test, pred)}, name=lab))
metrics
fig, axList = plt.subplots(2, 2)
axList = axList.flatten()
fig.set_size_inches(10, 10)
axList[-1].axis('on')
# axList[:] will list all the 4 confusion tables; axList[:-1] list the first three confusion tables
for ax, lab in zip(axList[:], cm):
    sns.heatmap(cm[lab], ax=ax, annot=True, fmt='d')  # plotting call reconstructed
    ax.set(title=lab)
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 279
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_13
280 13 Neural Networks and Deep Learning Algorithm
    a_{n,k}^{(l)} = \sum_{j=0}^{H_{l-1}} w_{k,j}^{(l)} z_{n,j}^{(l-1)}    (13.6a)

    z_{n,k}^{(l)} = h\big(a_{n,k}^{(l)}\big)    (13.6b)

Note the activation function of the Lth layer is the identity function, thus

    z_{n,k}^{(L)} = \sum_{j=0}^{H_{L-1}} w_{k,j}^{(L)} z_{n,j}^{(L-1)}

Consider the sum of squared errors for the K outputs y_n = (y_{n,1}, …, y_{n,K})′ of the nth example:

    Error_n(w) = \frac{1}{2} \sum_{k=1}^{K} \big(d_{n,k}^{(L)}\big)^2   for 1 ≤ n ≤ N

where d_{n,k}^{(L)} = y_{n,k} − z_{n,k}^{(L)}, 1 ≤ k ≤ K.

Of interest is the derivative of Error_n(w) w.r.t. w_{k,j}^{(l)}, 1 ≤ k ≤ H_l, 1 ≤ j ≤ H_{l−1}, and 1 ≤ l ≤ L. In order to evaluate these derivatives, we need to calculate the value of δ for each hidden and output node in the network, where δ for the kth hidden node in the lth layer, 1 ≤ l ≤ L−1, is defined as

    \delta_{n,k}^{(l)} = \frac{\partial Error_n(w)}{\partial a_{n,k}^{(l)}} = \sum_{j=0}^{H_{l+1}} \frac{\partial Error_n(w)}{\partial a_{n,j}^{(l+1)}} \frac{\partial a_{n,j}^{(l+1)}}{\partial a_{n,k}^{(l)}}    (13.7)

In (13.7), a_{n,j}^{(l+1)} is the input to the jth hidden node in the (l + 1)th layer, given by a_{n,j}^{(l+1)} = \sum_{k=0}^{H_l} w_{j,k}^{(l+1)} h(a_{n,k}^{(l)}), so that

    \delta_{n,k}^{(l)} = h'\big(a_{n,k}^{(l)}\big) \sum_{j=0}^{H_{l+1}} \delta_{n,j}^{(l+1)} w_{j,k}^{(l+1)}    (13.8)

Equation (13.8) indicates that the value of δ for a particular hidden node can be obtained by propagating the δ's backwards from the nodes in the next layer in the network. The backpropagation procedure can therefore be implemented as follows:

1. The inputs and activations of all of the hidden and output nodes in the network are calculated by (13.6a) and (13.6b).
2. At the output layer, i.e., the Lth layer, evaluate the derivative

    \frac{\partial Error_n(w)}{\partial w_{k,j}^{(L)}} = -d_{n,k}^{(L)} z_{n,j}^{(L-1)}   for 1 ≤ k ≤ H_L, 0 ≤ j ≤ H_{L−1}    (13.9)

3. For the lth hidden layer with H_l hidden units, 1 ≤ l ≤ L−1, the derivative of Error_n(w) w.r.t. w_{k,j}^{(l)}, 1 ≤ k ≤ H_l, 1 ≤ j ≤ H_{l−1}, is

    \frac{\partial Error_n(w)}{\partial w_{k,j}^{(l)}} = \frac{\partial Error_n(w)}{\partial a_{n,k}^{(l)}} \frac{\partial a_{n,k}^{(l)}}{\partial w_{k,j}^{(l)}} = \delta_{n,k}^{(l)} z_{n,j}^{(l-1)}
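The three steps of the backpropagation procedure can be sketched in NumPy for a single hidden layer; the tanh activation, the layer sizes, and the random data below are assumptions of the sketch, and the analytic gradient is checked against a numerical derivative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # one input example z^{(0)}
y = rng.normal(size=2)            # targets y_{n,k}
W1 = rng.normal(size=(4, 3))      # hidden-layer weights w^{(1)}
W2 = rng.normal(size=(2, 4))      # output-layer weights w^{(L)}, L = 2

# Step 1: forward pass, (13.6a)-(13.6b); the output activation is the identity
a1 = W1 @ x
z1 = np.tanh(a1)
z2 = W2 @ z1                      # network outputs z^{(L)}

# Step 2: output error d^{(L)} = y - z^{(L)} and gradient (13.9)
d2 = y - z2
grad_W2 = -np.outer(d2, z1)       # -d^{(L)}_k z^{(L-1)}_j

# Step 3: propagate the deltas backwards as in (13.8); here delta^{(L)} = -d^{(L)},
# and h'(a) = 1 - tanh(a)^2 for the tanh activation
delta1 = (1 - z1 ** 2) * (W2.T @ (-d2))
grad_W1 = np.outer(delta1, x)     # delta^{(l)}_k z^{(l-1)}_j

# Numerical check of one entry of grad_W1 against Error = 0.5 * sum d^2
def error(W):
    return 0.5 * np.sum((y - W2 @ np.tanh(W @ x)) ** 2)

eps = 1e-6
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 0] += eps
Wm[0, 0] -= eps
numeric = (error(Wp) - error(Wm)) / (2 * eps)
```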
independent dataset, generally called a validation set, often shows a decrease at first, followed by an increase as the network starts to overfit. Training can therefore be stopped at the point of smallest error with respect to the validation data set to obtain a network model with good generalization performance. Early stopping is similar to weight decay by the quadratic regularizer in (13.11).

13.6 Deep Feedforward Network Versus Deep Convolutional Neural Networks

A neural network with a very large number of hidden layers and/or nodes with no feedback connections is called a deep feedforward network. Due to its high degrees of freedom in the numbers of hidden layers and nodes, the deep feedforward neural network can be trained to learn high-dimensional and nonlinear mappings, which makes it a candidate for complex tasks. However, there are still problems with the deep feedforward neural network for complex tasks such as image recognition, as images are large, often with several hundred variables (pixels). A deep feedforward network with, say, one hundred hidden units in the first layer would already contain several tens of thousands of weights. Such a large number of parameters increases the capacity of the system and therefore requires a larger training dataset. In addition, images have a strong 2D local structure: variables (or pixels) that are spatially or temporally nearby are highly correlated. Local correlations are the reasons for the well-known advantages of extracting and combining local features before recognizing spatial or temporal objects, because configurations of neighboring variables can be classified into a small number of categories (e.g., edges, corners, …). Another deficiency of a feedforward network is the lack of built-in invariance with respect to translations or local distortions of the inputs.

Convolutional neural networks (CNN) were developed with the idea of local connectivity and shared weights, so the shift invariance is automatically obtained by forcing the replication of weight configurations across space. In each layer of the convolutional neural network, the input is convolved with the weight matrix (also called the filter) to create a feature map. In other words, the weight matrix slides over the input and computes the dot product between the input and the weight matrix. Note that, as opposed to regular neural networks, all the values in the output feature map share the same weights. This means that all the nodes in the output detect exactly the same pattern. The local connectivity and shared weights aspect of CNNs reduces the total number of learnable parameters, resulting in more efficient training. The intuition behind a convolutional neural network is thus to learn in each layer a weight matrix that will be able to extract the necessary, translation-invariant features from the input.

Consider the inputs x_0, …, x_{N−1}. In the first layer, the input is convolved with a set of H_1 filters (weights) {w_h^{(1)}, 1 ≤ h ≤ H_1} and the output is

    z_h^{(1)}(i) = h\Big( \sum_{j=1}^{k} w_h^{(1)}(j)\, x_{i-j} \Big)    (13.12)

where w_h^{(1)} is k-dimensional; here k is the filter size that controls the receptive field of each output node, and 1 ≤ i ≤ N−1. In a convolutional neural network, the receptive field of node a is defined as the set of nodes from the previous layer whose outputs act as the inputs of node a.

Now the output feature map z^{(1)} is (N−k+1) × H_1, which is convolved with a set of H_2 filters (weights) {w_h^{(2)}, 1 ≤ h ≤ H_2} and becomes the input of the 2nd layer. Similar to the first layer, a nonlinear transformation is applied to the inputs to produce the output feature map. Repeating the same procedure, the output feature map of the lth layer, 2 ≤ l ≤ L, is

    z_h^{(l)}(i) = h\Big( \sum_{j=1}^{k} \sum_{m=1}^{H_{l-1}} w_h^{(l)}(j, m)\, z_m^{(l-1)}(i - j) \Big)    (13.13)

where w_h^{(l)} is k × H_{l−1}, and the output feature map z^{(l)} is N_l × H_l, with N_l = N_{l−1} − k + 1. The local connectivity is achieved by replacing the weighted sums from the neural network with convolutions over a local region of each node in the CNN. The locally connected region of a node is referred to as the receptive field of the node.

For time series inputs x_0, …, x_{N−1}, to learn the long-term dependencies within the time series, stacked layers of dilated convolutions are used:

    z_h^{(l)}(i) = h\Big( \sum_{j=1}^{k} \sum_{m=1}^{H_{l-1}} w_h^{(l)}(j, m)\, z_m^{(l-1)}(i - d \cdot j) \Big)    (13.14)

In this way, the filter is applied to every dth element in the input vector, allowing the model to learn connections between far-apart data elements. In addition to dilated convolutions, for time series inputs x_0, …, x_{N−1}, it is convenient to pad the input with zeros around the border. The size of this zero-padding depends on the size of the receptive field.
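A minimal sketch of the dilated convolution in (13.14) for a single channel and a single filter (the filter values, toy input, and tanh nonlinearity are made up for the illustration):

```python
import numpy as np

def dilated_conv_layer(z_prev, w, d=1, h=np.tanh):
    """One dilated convolution layer in the spirit of (13.14):
    z(i) = h( sum_j w(j) * z_prev(i - d*j) ), evaluated only where the
    whole receptive field exists (no zero-padding in this sketch)."""
    k, N = len(w), len(z_prev)
    start = d * (k - 1)          # first index with a full receptive field
    out = np.empty(N - start)
    for i in range(start, N):
        out[i - start] = h(sum(w[j] * z_prev[i - d * j] for j in range(k)))
    return out

x = np.sin(np.linspace(0, 4 * np.pi, 64))   # toy time-series input
w = np.array([0.5, -0.5])                   # made-up k = 2 filter
z1 = dilated_conv_layer(x, w, d=1)          # ordinary convolution (d = 1)
z2 = dilated_conv_layer(z1, w, d=2)         # doubled dilation widens the reach
print(len(x), len(z1), len(z2))
```

Stacking layers while doubling d at each layer lets the receptive field grow exponentially with depth, which is how long-term dependencies become reachable.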
Consider the dataset of credit card holders' payment data in October 2005, from a bank (a cash and credit card issuer) in Taiwan. Among the total 25,000 observations, 5529 observations (22.12%) are the cardholders with default payment. Thus the target variable y is the default payment (Yes = 1, No = 0), and the explanatory variables are the following 23 variables:

• X1: Amount of the given credit (NT dollar): it includes both the individual consumer credit and his/her family (supplementary) credit.
• X2: Gender (1 = male; 2 = female).
• X3: Education (1 = graduate school; 2 = university; 3 = high school; 4 = others).
• X4: Marital status (1 = married; 2 = single; 3 = others).
• X5: Age (year).
• X6–X11: History of past payment from September to April 2005. (The measurement scale for the repayment status is: −1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above.)
• X12–X17: Amount of bill statement from September to April 2005.
• X18–X23: Amount of previous payment (NT dollar) from September to April 2005.
14 Alternative Machine Learning Methods for Credit Card Default Forecasting*

By Huei-Wen Teng, National Yang Ming Chiao Tung University, Taiwan
*This chapter is a revised and extended version of the paper: Huei-Wen Teng and Michael Lee. Estimation procedures of using five alternative machine learning methods for predicting credit card default. Review of Pacific Basin Financial Markets and Policies, 22(03):1950021, 2019. doi: https://doi.org/10.1142/S0219091519500218

Based upon the concept and methodology of machine learning and deep learning, which have been discussed in Chaps. 12 and 13, this chapter shows how five alternative machine learning methods can be used to forecast credit card default. This chapter is organized as follows. Section 14.1 is the introduction, and Sect. 14.2 reviews the literature. Section 14.3 introduces the credit card data set. Section 14.4 reviews five supervised learning methods. Section 14.5 gives the study plan to find the optimal parameters and compares the learning curves among the five methods. A summary and concluding remarks are provided in Sect. 14.6. Python codes are given in Appendix 14.1.

14.1 Introduction

Following de Mello and Ponti (2018), Bzdok et al. (2018), and others, we can define machine learning as a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention. Machine learning is one of the most important tools for financial technology. Machine learning is particularly useful when the usual linearity assumption does not hold for the data. Under equilibrium conditions and when the standard assumptions of normality and linearity hold, machine learning and parametric methods, such as OLS, tend to generate similar results. Since machine learning methods are essentially search algorithms, there is the usual problem of finding the global minimum that minimizes some function.

Machine learning can generally be classified as (i) supervised learning, (ii) unsupervised learning, and (iii) others (reinforcement learning, semi-supervised, and active learning). Supervised learning includes (i) regression (lasso, ridge, logistic, loess, KNN, and spline) and (ii) classification (SVM, random forest, and deep learning). Unsupervised learning includes (i) clustering (K-means, hierarchical tree clustering) and (ii) factor analysis (principal component analysis, etc.). K nearest neighbors (KNN) is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., distance functions). KNN has long been used in statistical estimation and pattern recognition.

14.2 Literature Review

Machine learning is a subset of artificial intelligence that often uses general and intuitive methodology to give computers (machines) the ability to learn with data so that the performance on a specific task is improved, without being explicitly programmed (Samuel 1959). Because of its flexibility and generality, machine learning has been successfully applied in many fields, including email filtering, detection of network intruders or malicious intruders working towards a data breach, optical character recognition, learning to rank, informatics, and computer vision (Mitchell 1997; Mohri et al. 2012; De Mello and Ponti 2018). In recent years, machine learning has had fruitful applications in financial technology, such as fraud prevention, risk management, portfolio management, investment predictions, customer service, digital assistants, marketing, sentiment analysis, and network security.

Machine learning is closely related to statistics (Bzdok et al. 2018). Indeed, statistics is a sub-field of mathematics, whereas machine learning is a sub-field of computer science. To explore the data, statistics starts with a probability model, fits the model to the data, and verifies if this model is adequate using residuals analysis. If the model is not adequate,
residuals analysis can be used to refine the model. Once the model is shown to be adequate, statistical inference about the parameters in the model can furthermore be used to determine if a factor of interest is significant. The ability to explain if a factor really matters makes statistics widely used in almost all disciplines.

In contrast, machine learning focuses more on prediction accuracy than on model interpretability. In fact, machine learning uses general-purpose algorithms and aims at finding patterns with minimal assumptions about the data-generating system. Classic statistical methods together with machine learning techniques lead to a combined field, called statistical learning (James et al. 2013).

The application domain of machine learning can be roughly divided into unsupervised learning and supervised learning (Hastie et al. 2008). Unsupervised learning refers to the situations where one has just predictors, and attempts to extract features that represent the most distinct and striking features in the data. Supervised learning refers to the situations where one has predictors (also known as input, explanatory, or independent variables) and responses (also known as output, or dependent variables), and attempts to extract important features in the predictors that best predict the responses. Using input–output pairs, supervised learning learns a function from the data set to map an input to an output using samples (Russell and Norvig 2010).

In financial technology (FinTech), machine learning has received extensive attention in recent years. For example, Heaton et al. (2017) apply deep learning for portfolio optimization. With the rapid development of high-frequency trading, intra-day algorithmic trading has become a popular trading device, and machine learning is a fundamental analytic tool for predicting returns of the underlying asset: Putra and Kosala (2011) use neural networks and validate the validity of the associated trading strategies in the Indonesian stock market; Borovykh et al. (2018) propose a convolutional neural network to predict time series of the S&P 500 index. Lee (2020) and Lee and Lee (2020) have discussed the relationship between machine learning and financial econometrics, mathematics, and statistics.

In addition to the above applications, machine learning is also applied to other canonical problems in finance. For example, Solea et al. (2018) identify the next emerging countries using statistical learning techniques. To measure asset risk premia in empirical asset pricing, Gu et al. (2018) perform a comparative analysis of methods using machine learning, including generalized linear models, dimension reduction, boosted regression trees, random forests, and neural networks.

To predict the delinquency of a credit card holder, a credit scoring model provides a model-based estimate of the default probability of a credit card customer. The predictive models for the default probability have been developed using machine learning classification algorithms for binary outcomes (Hand and Henley 1997). There have been extensive studies examining the accuracy of alternative machine learning algorithms or classifiers. Recently, Lessmann et al. (2015) provide comprehensive classifier comparisons to date and divide machine learning algorithms into three divisions: individual classifiers, homogeneous ensembles, and heterogeneous ensembles.

Individual classifiers are those using a single machine learning algorithm, for example, the k-nearest neighbors, decision trees, support vector machine, and neural network. Butaru et al. (2016) test decision tree, regularized logistic regression, and random forest models with a unique large data set from six large banks. It is found that no single model applies to all banks, which suggests the need for a more customized approach to the supervision and regulation of financial institutions, in which parameters such as capital ratios and loss reserves should be specified for each bank according to its credit risk model exposures and forecasts. Sun and Vasarhelyi (2018) demonstrate the effectiveness of a deep neural network based on clients' personal characteristics and spending behaviors over logistic regression, naïve Bayes, traditional neural networks, and decision trees in terms of better prediction performance with a data set of size 711,397 collected in Brazil.

Novel machine learning methods to incorporate complex features of the data are proposed as well. For example, Fernandes and Artes (2016) incorporate spatial dependence as inputs into the logistic regression, and Maldonado et al. (2017) propose support vector machines for simultaneous classification and feature selection that explicitly incorporate attribute acquisition costs. Addo et al. (2018) provide binary classifiers based on machine and deep learning models on real data in predicting loan default probability. It is observed that tree-based models are more stable than neural network-based methods.

On the other hand, the ensemble method contains two steps: model development and forecast combination. It can be divided into homogeneous ensemble classifiers and heterogeneous ensemble classifiers. The former uses the same classification algorithm, whereas the latter uses different classification algorithms. Finlay (2011) and Paleologo et al. (2010) have shown that homogeneous ensemble classifiers increase predictive accuracy. Two types of homogeneous ensemble classifiers are bagging and boosting. Bagging derives independent base models from bootstrap samples of the original data (Breiman 1996), and boosting iteratively adds base models to avoid the errors of current ensembles (Freund and Schapire 1996).

Heterogeneous ensemble methods create these models using different classification algorithms, which have different views on the same data and may complement each other. In addition to base model development and forecast combination, heterogeneous ensembles need a third step to
search the space of available base models. Static approaches search the base model once, and dynamic approaches repeat the selection step for every case (Ko et al. 2008; Woloszynski and Kurzynski 2011). For static approaches, the direct method maximizes predictive accuracy (Caruana et al. 2006) and the indirect method optimizes the diversity among base models (Partalas et al. 2010).

14.3 Description of the Data

We apply the machine learning techniques to the default of credit card clients data set. There are 29,999 instances in the credit card data set. The default of credit card clients data set can be found at http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients and was initially analyzed by Yeh and Lien (2009). This data set is the payment data of credit card holders in October 2005, from a major cash and credit card issuer in Taiwan. This data set contains 23 different attributes to determine whether or not a person would default on their next credit card payment. It contains the amount of given credit, gender, education, marital status, age, and history of past payments, including how long it took someone to pay the bill, the amount of the bill, and how much they actually paid for the previous six months.

The response variable is

• Y: Default payment next month (1 = default; 0 = not default).

We use the following 23 variables as explanatory variables:

• X1: Amount of the given credit (NT dollar),
• X2: Gender (1 = male, 2 = female),
• X3: Education (1 = graduate school; 2 = university; 3 = high school; 4 = others),
• X4: Marital status (1 = married; 2 = single; 3 = others),
• X5: Age (year),
• X6–X11: History of past monthly payment traced back from September 2005 to April 2005 (−1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above),
• X12–X17: Amount of past monthly bill statement (NT dollar) traced back from September 2005 to April 2005,
• X18–X23: Amount of past payment (NT dollar) traced back from September 2005 to April 2005.

This data set is interesting because it contains two "sorts" of attributes. The first sort is about categorical attributes like education, marital status, and age. These attributes have a very small range of possible values, and if there was a high correlation between these categorical attributes then the classification algorithms would be able to easily identify them and produce high accuracies. The second sort of attribute is the past payment information. These attributes are just integers without clear differentiation of categories and have much larger possible ranges of how much money was paid. Especially, if there was not a strong correlation between education, marital status, age, etc., and defaulting on payments, it could be more difficult to algorithmically predict the outcome from past payment details, except for the extremes where someone never pays their bills or always pays their bills. Figure 14.1 plots the heatmap to show pairwise correlations between attributes. It is shown that most correlations are about zero, but high correlations exist in features of past monthly payments (X6, …, X11) and past monthly bill statements (X12, …, X17).

14.4 Alternative Machine Learning Methods

Let X = (X_1, …, X_p) denote the p-dimensional input vector, and let Y = (Y_1, …, Y_d) denote the d-dimensional output vector. In its simplest form, a learning machine is an input–output mapping, Y = F(X). In statistics, F(·) is usually a simple function, such as a linear or polynomial function. In contrast, the form of F(·) in machine learning may not be represented by simple functions.

In the following, we introduce the spirit of five machine learning methods: k-nearest neighbors, decision tree, boosting, support vector machine, and neural network, with illustrative examples. Rigorous formulations for each machine learning method will not be covered here because they are out of the scope of this chapter.

14.4.1 k-Nearest Neighbors

The k-Nearest Neighbors (KNN) method is intuitive and easy to implement. First, a distance metric (such as the Euclidean distance) needs to be chosen to identify the KNNs for a sample of unknown category. Second, a weighting scheme (uniform weighting or distance weighting) to summarize the score of each category needs to be decided. The uniform weighting scheme gives equal weight to all neighbors regardless of their distance to the sample of unknown category, whereas the distance weighting scheme weights distant neighbors less. Third, the score for each category is summed over these KNNs. Finally, the predicted category of this sample is the category yielding the highest score.

An example is illustrated in Fig. 14.2. Suppose there are two classes (category A and category B) for the output and two features (x1 and x2). A sample of unknown category is plotted as a solid circle. KNN predicts the category of this sample as follows. To start, we choose the Euclidean distance and uniform distance weighting. If K = 3, among the three nearest neighbors to the unknown sample, there are one sample of
288 14 Alternative Machine Learning Methods for Credit Card Default Forecasting*
category A and two samples of category B. Because there are more samples of category B, KNN predicts the unknown sample to be of category B. If K = 6, among the six nearest neighbors to the sample of unknown category, there are four samples of category A and two samples of category B. Because category A occurs more frequently than category B, KNN predicts the sample to be of category A.

In addition to the distance metric and weighting scheme, the number of neighbors K needs to be decided. Indeed, the performance of KNN is highly sensitive to the size of K. There is no strict rule for selecting K. In practice, the selection of K can be done by observing the predicted accuracies for various K and selecting the one that reaches the highest training scores and cross-validation scores. Detailed descriptions of how to calculate these scores are given in Sect. 14.5.

14.4.2 Decision Trees

A decision tree is also called a classification tree when the target output variable is categorical. For a decision tree, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. A decision tree is usually constructed top-down, by choosing a mapping of feature variables at each step that best splits the set of items. Different algorithms choose different metrics for measuring the homogeneity of the target variables within the subsets. These metrics are applied to each candidate subset, and the resulting values are combined to provide a quality of the split. Common metrics include the Gini Index and the Information Gain, based on the concept of entropy.

Figure 14.3 depicts the structure of a decision tree: the decision tree starts with a root node and consists of internal decision nodes and leaf nodes. The decision nodes and leaf nodes stem from the root node and are connected by branches. Each decision node represents a test function with discrete outcomes labeling the branches. The decision tree grows along these branches into different depths of internal decision nodes. At each step, the data is classified by a different test function of the attributes, leading the data either to a deeper internal decision node or finally to a leaf node.
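The KNN procedure of Sect. 14.4.1 can be sketched with scikit-learn (assumed available); the two-feature toy points below are invented stand-ins for the categories of Fig. 14.2, not the chapter's credit card data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented two-feature training set standing in for Fig. 14.2.
X = np.array([[1.0, 1.0], [1.5, 1.8], [2.0, 1.2],   # category A
              [5.0, 5.0], [5.5, 4.6], [6.0, 5.5]])  # category B
y = np.array(["A", "A", "A", "B", "B", "B"])

# Steps 1-2: Euclidean distance, uniform weights; steps 3-4: score and vote.
knn3 = KNeighborsClassifier(n_neighbors=3, weights="uniform",
                            metric="euclidean").fit(X, y)

unknown = np.array([[5.2, 4.9]])   # sample of unknown category
print(knn3.predict(unknown)[0])    # its three nearest neighbors are all B
```

Switching `weights="distance"` gives the distance-weighted scheme described in the text.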
Figure 14.3 illustrates a simple example. Suppose an interviewee is classified as "decline offer" or "accept offer". The tree starts with a root node. The root node is a test function checking whether the salary is at least $50,000. If data with answer "no" decline the offer, the branch ends with declining the offer and hence is represented as a leaf node indicating "decline". If the answer is yes, the remaining data contain samples of both declining and accepting the offer. Therefore, this branch results in a second decision node checking whether the interviewee needs a commuting time of more than 1 hour, and the outcome could be "yes" or "no". If data with answer "yes" decline the offer, then this branch ends with a leaf node indicating "decline". Data with answer "no" contain both declining and accepting the offer, so the branch ends with another decision node checking whether parental leave is provided. Again, the outcome is yes or no. For data with outcome no, all data decline the offer, so this branch ends with a leaf node indicating "decline". Data with answer "yes" accept the offer, so this branch ends with a leaf node indicating "accept".

To apply the decision tree algorithm, we use the training data set to build a decision tree. For a sample with unknown category, we simply employ the decision tree to figure out at which leaf node the sample of unknown category will end up.

Different algorithms choose different metrics for measuring the homogeneity of the target variables within the subsets. These metrics are applied to each candidate subset, and the resulting values are combined to provide a quality of the split. Common metrics include the Gini Index and
Information Gain. The major difference between the Information Gain and the Gini Index is that the former may produce multiple nodes, whereas the latter only produces two nodes (TRUE and FALSE, or binary classification).

The representative decision tree using the Gini Index (also known as the Gini Split and Gini Impurity) to generate the next lower node is the classification and regression tree (CART), which indeed allows both classification and regression. Because CART is not limited in the types of response and independent variables, it is widely popular. Suppose we would like to build up a next lower node, and the possible classification labels are i, for i = 1, ..., c. Let $p_i$ represent the proportion of the number of samples in the lower node classified as i. The Gini Index is defined as

$$\text{Gini Index} = 1 - \sum_{i=1}^{c} (p_i)^2 \qquad (14.1)$$

The attribute used to build the next node is the one that yields the largest reduction in the Gini Index (i.e., the purest split).

The Information Gain is precisely the measure used by the decision trees ID3 and C4.5 to select the best attribute or feature when building the next lower node (Mitchell 1997). Let f denote a candidate feature, let D denote the data at the current node, and let $D_i$ denote the data classified as label i at the lower node, for i = 1, ..., c. $N = |D|$ is the number of samples at the current node, and $N_i = |D_i|$ is the number of samples classified as label i at the lower node. Then, the Information Gain is defined as

$$IG(D, f) = I(D) - \sum_{i=1}^{c} \frac{N_i}{N} I(D_i) \qquad (14.2)$$

14.4.3 Boosting

There are many boosting algorithms, such as AdaBoost (Adaptive Boosting), Gradient Tree Boosting, and XGBoost. Here, we focus on AdaBoost. In an iterative process, boosting yields a sequence of weak learners which are generated by assuming different distributions for the sample. To choose the distribution, boosting proceeds as follows:

• Step 1: The base learner (or the first learning algorithm) assigns equal weight to each observation.
• Step 2: The weights of observations which are incorrectly predicted are increased to modify the distribution of the observations, so that a second learner is obtained.
• Step 3: Iterate Step 2 until the limit of the base learning algorithm is reached, or a higher accuracy is reached.

With the above procedure, a sequence of weak learners is obtained. The prediction of a new sample is based on the average (or weighted average) of the weak learners, or on the class receiving the highest vote from all these weak learners.

14.4.4 Support Vector Machines

A support vector machine (SVM) is a recently developed technique originally used for pattern classification. The idea of SVM is to find a maximal margin hyperplane to separate data points of different categories. Figure 14.4 shows how the SVM separates the data into two categories with hyperplanes.
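The maximal-margin idea can be sketched with scikit-learn's SVC on two invented, linearly separable point clouds (a hedged illustration only; the chapter's credit card data is not used here).

```python
import numpy as np
from sklearn.svm import SVC

# Two invented, linearly separable categories in two dimensions.
X = np.array([[1.0, 1.0], [1.5, 0.5], [2.0, 1.5],
              [6.0, 6.0], [6.5, 5.5], [7.0, 6.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel searches for the maximal-margin separating hyperplane.
svm = SVC(kernel="linear").fit(X, y)

print(svm.predict([[1.2, 0.8], [6.8, 6.2]]))  # one point on each side
print(len(svm.support_vectors_))              # the points that pin the margin
```

The support vectors reported in the second line are exactly the boundary points that determine the margin, which is the geometric idea behind Fig. 14.4.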
If the classification problem cannot be separated by a linear hyperplane, the input features have to be mapped into a higher-dimensional feature space by a mapping function, which is calculated through an a priori chosen kernel function. Kernel functions include linear, polynomial, sigmoid, and the radial basis function (RBF). Yang (2007) and Kim and Sohn (2010) apply SVM to the credit scoring problem and show that SVM outperforms other techniques in terms of higher accuracy.

14.4.5 Neural Networks

A neural network (NN), or an artificial neural network, has the advantage of strong learning ability without any assumptions about the relationships between input and output variables. Recent studies using an NN or its variants in credit risk analysis can be found in Desai et al. (1996), Malhotra and Malhotra (2002), and Abdou et al. (2008). NN links the input–output paired variables with simple functions called activation functions. A simple standard structure for an NN includes an input layer, a hidden layer, and an output layer. If an NN contains more than one hidden layer, it is also called a deep neural network (or deep learning neural network).

Suppose that there are L hidden layers in an NN. The original input layer and the output layer are also called the zeroth layer and (L + 1)th layer, respectively. The name of hidden layers implies that they are originally invisible in the data and are built artificially. The number of layers L is called the depth of the architecture. See Fig. 14.5 for an illustration of the structure of a neural network.

Each layer is composed of nodes (also called neurons) representing a nonlinear transformation of information from the previous layer. The nodes in the input layer receive the input features $X = (X_1, \ldots, X_p)$ of each training sample and transmit the weighted outputs to the hidden layer. The d nodes in the output layer represent the output features $Y = (Y_1, \ldots, Y_d)$. Let $l \in \{1, 2, \ldots, L\}$ denote the index of the layers from 1 to L. An NN trains a model on data to make predictions by passing learned features of the data through the different layers via L nonlinear transformations applied to the input features. We explicitly describe a deep learning architecture as follows. For a hidden layer, various activation functions, such as logistic, sigmoid, and radial basis function (RBF), can be applied. We summarize some activation functions and their definitions in Table 14.1.

Let $f^{(0)}, f^{(1)}, \ldots, f^{(L)}$ be given univariate activation functions for these layers. For notational simplicity, let f be a given activation. Suppose $U = (U_1, \ldots, U_k)^{T}$ is a k-dimensional input. We abbreviate f(U) by

$$f(U) = (f(U_1), \ldots, f(U_k))^{T}$$
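The componentwise convention $f(U) = (f(U_1), \ldots, f(U_k))^{T}$ simply means the activation is applied to each coordinate. A minimal NumPy sketch, with the logistic activation chosen for illustration:

```python
import numpy as np

def f(u):
    """Logistic (sigmoid) activation, one of the choices in Table 14.1."""
    return 1.0 / (1.0 + np.exp(-u))

U = np.array([-1.0, 0.0, 1.0])   # a k-dimensional input, here k = 3
print(f(U))                      # f applied coordinate by coordinate
```

Because NumPy broadcasts elementwise, the same one-line definition realizes both the scalar activation and its vector form f(U).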
Once the architecture of the deep neural network (i.e., L and $N_l$ for $l = 1, \ldots, L$) and the activation functions $f^{(l)}$ for $l = 1, \ldots, L$ are decided, we need to solve the training problem to find the learning parameters $W = (W^{(0)}, W^{(1)}, \ldots, W^{(L)})$ and $b = (b^{(0)}, \ldots, b^{(L)})$, so that the solutions $\hat{W}$ and $\hat{b}$ satisfy

$$(\hat{W}, \hat{b}) = \arg\min_{W,b} \frac{1}{n} \sum_{i=1}^{n} L\left(Y^{(i)}, F_{W,b}\left(X^{(i)}\right)\right)$$

14.5.2 Tuning Optimal Parameters

The optimal combination of parameters is decided based on criteria such as testing scores and cross-validation scores. To calculate the testing score, we split the data set randomly into a 70% training set and a 30% testing set. When fitting the algorithm, we only use the training set. Then, we use the remaining 30% testing set to calculate the percentage of correct classifications of the method, which is also the prediction accuracy or testing score.
14.5 Study Plan 293
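The 70/30 split and testing-score computation of Sect. 14.5.2 can be sketched as follows; scikit-learn is assumed available, and a synthetic data set stands in for the 23-feature credit card data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the 23-feature credit card data set.
X, y = make_classification(n_samples=1000, n_features=23, random_state=0)

# Random 70% training / 30% testing split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)

model = KNeighborsClassifier().fit(X_tr, y_tr)        # fit on the training set only
score = accuracy_score(y_te, model.predict(X_te))     # testing score (accuracy)
print(round(score, 3))
```

Replacing `KNeighborsClassifier` with any of the other four estimators reuses the same split-and-score recipe.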
Fig. 14.6 Validation curves of the k-nearest neighbors against K with uniform weight and distance weight using training data and cross-validation of the credit card dataset
Fig. 14.8 Validation curves of boosting against the number of estimators with tree maximum depths of one and two using training data and cross-validation of the credit card dataset
better performance, and this data set does not benefit from having more estimators.

Figure 14.9 compares the testing and cross-validation scores of the SVM using both the polynomial and RBF kernels, with maximum iterations of 1100, 1600, 2100, and 2600. Our experiments suggest using the RBF kernel because it performs much better than the polynomial kernel, and it also runs faster. In addition, we use a maximum iterations value of 2100, as no further improvement in the testing scores can be found with larger maximum iterations values.
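The kernel and maximum-iterations choices above map directly onto scikit-learn's SVC parameters. A hedged sketch on synthetic data (the credit card data is not loaded here):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Compare the two kernels with the solver's iterations capped, as in the text.
for kernel in ("rbf", "poly"):
    clf = SVC(kernel=kernel, max_iter=2100)
    scores = cross_val_score(clf, X, y, cv=5)     # cross-validation scores
    print(kernel, round(scores.mean(), 3))
```

Sweeping `max_iter` over the grid 1100, 1600, 2100, 2600 and plotting the mean scores reproduces the kind of comparison shown in Fig. 14.9.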
We use the ReLU function as the activation function. For the neural network, we decide to tune the number of hidden layers and the number of neurons in each hidden layer. Figure 14.10 compares the testing and cross-validation scores of neural networks. The upper panel varies the number of hidden layers and suggests selecting three hidden layers. With three hidden layers, the lower panel varies the number of neurons in each layer and suggests 15 neurons as a suitable size for each layer.

Fig. 14.10 Validation curves of neural network against the number of hidden layers and the number of neurons in each hidden layer, in the upper and lower panels, respectively, using training data and cross-validation of the credit card data set
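The selected configuration (three hidden layers of 15 ReLU neurons each) can be sketched with scikit-learn's MLPClassifier; synthetic data again stands in for the credit card set, and `max_iter` is an illustrative choice.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=23, random_state=0)

# Three hidden layers with 15 neurons each and ReLU activation, as selected above.
mlp = MLPClassifier(hidden_layer_sizes=(15, 15, 15), activation="relu",
                    max_iter=500, random_state=0)
mlp.fit(X, y)
print(round(mlp.score(X, y), 3))   # training accuracy
```

Varying `hidden_layer_sizes` over depths and widths is how the validation curves of Fig. 14.10 can be generated.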
14.6 Summary and Concluding Remarks

In this chapter, we introduce five machine learning methods: k-nearest neighbors, decision tree, boosting, support vector machine, and neural network, to predict the default of credit card holders. For illustration, we conduct data analysis using a data set of 29,999 instances with 23 features and provide Python scripts for implementation. It is shown in our study that the decision tree performs best in predicting the default of credit card holders in terms of learning curves.

As the risk management of personal debt is of considerable importance, the following directions are worth studying in future research. One limitation of this paper is that we only use one data set. According to Butaru et al. (2016), multiple data sets should be used to illustrate the robustness of a machine learning algorithm, and pairwise comparisons should be conducted to verify which machine learning algorithm outperforms the others (Demšar 2006; García and Herrera 2008).

This chapter only uses accuracy as a measure to compare different machine learning methods. Indeed, in addition to standard measures such as precision, recall, F1-score, and AUC, it is interesting to consider cost-sensitive frameworks or profit measures to compare different machine learning algorithms, as in Verbraken et al. (2014), Bahnsen et al. (2015), and Garrido et al. (2018).

Along with the availability of voluminous data in recent years, Moeyersoms and Martens (2015) address high-cardinality attributes in churn prediction in the energy sector. In addition, it is also interesting to predict over a longer horizon or to predict the default time (using survival analysis). Last but not least, it is of considerable importance to develop methods for extremely rare events. All of the above-mentioned issues are worthy of future study. In the next chapter, we will discuss how deep neural networks can be used to predict credit card delinquency.

Appendix 14.1: Python Codes
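The appendix's original scripts are not reproduced in this excerpt. As a hedged stand-in, the sketch below shows how the five methods of Sect. 14.4 might be fit and scored in one loop; the data is synthetic, whereas the actual appendix code loads the credit card data set.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 23-feature credit card data set.
X, y = make_classification(n_samples=600, n_features=23, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)

# One estimator per method introduced in Sect. 14.4.
models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "tree": DecisionTreeClassifier(random_state=0),
    "boosting": AdaBoostClassifier(random_state=0),
    "svm": SVC(kernel="rbf"),
    "nn": MLPClassifier(hidden_layer_sizes=(15, 15, 15), max_iter=500,
                        random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, round(model.score(X_te, y_te), 3))   # testing score
```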
References

Abdou, H., Pointon, J., and Masry, A.E. (2008). Neural Nets Versus Conventional Techniques in Credit Scoring in Egyptian Banking. Expert Systems with Applications 35(2), 1275–1292.
Addo, P.M., Guegan, D., and Hassani, B. (2018). Credit Risk Analysis Using Machine and Deep Learning Models. Risks 6(2), 38.
Bahnsen, A.C., Aouada, D., and Ottersten, B. (2015). A Novel Cost-sensitive Framework for Customer Churn Predictive Modeling. Decision Analytics 2(5), 1–15.
Borovykh, A., Bothe, S., and Oosterlee, C. (2018). Conditional Time Series Forecasting with Convolutional Neural Networks. https://arxiv.org/abs/1703.04691v4 (retrieved June 15, 2018).
Breiman, L. (1996). Bagging Predictors. Machine Learning 24, 123–140.
Butaru, F., Chen, Q., Clark, B., Das, S., Lo, A.W., and Siddique, A. (2016). Risk and Risk Management in the Credit Card Industry. Journal of Banking and Finance 72, 218–239.
Bzdok, D., Altman, N., and Krzywinski, M. (2018). Statistics Versus Machine Learning. Nature Methods 15(4), 233–234.
Caruana, R., Munson, A., and Niculescu-Mizil, A. (2006). Getting the Most Out of Ensemble Selection. Proceedings of the 6th International Conference on Data Mining (pp. 828–833). Hong Kong, China: IEEE Computer Society.
De Mello, R.F. and Ponti, M.A. (2018). Machine Learning: A Practical Approach on the Statistical Learning Theory. Springer.
Demšar, J. (2006). Statistical Comparisons of Classifiers Over Multiple Data Sets. Journal of Machine Learning Research 7, 1–30.
Desai, V.S., Crook, J.N., and Overstreet, G.A. (1996). A Comparison of Neural Networks and Linear Scoring Models in the Credit Union Environment. European Journal of Operational Research 95(1), 24–47.
Fernandes, G.B. and Artes, R. (2016). Spatial Dependence in Credit Risk and its Improvement in Credit Scoring. European Journal of Operational Research 249, 517–524.
Finlay, S. (2011). Multiple Classifier Architectures and Their Application to Credit Risk Assessment. European Journal of Operational Research 210, 368–378.
Freund, Y. and Schapire, R.E. (1996). Experiments with a New Boosting Algorithm. In L. Saitta (Ed.), Proceedings of the 13th International Conference on Machine Learning (pp. 148–156). Bari, Italy: Morgan Kaufmann.
García, S. and Herrera, F. (2008). An Extension on "Statistical Comparisons of Classifiers over Multiple Data Sets" for all Pairwise Comparisons. Journal of Machine Learning Research 9, 2677–2694.
Garrido, F., Verbeke, W., and Bravo, C. (2018). A Robust Profit Measure for Binary Classification Model Evaluation. Expert Systems with Applications 92, 154–160.
Gu, S., Kelly, B., and Xiu, D. (2018). Empirical Asset Pricing via Machine Learning. Technical Report No. 18-04, Chicago Booth Research Paper.
Hand, D.J. and Henley, W.E. (1997). Statistical Classification Models in Consumer Credit Scoring: A Review. Journal of the Royal Statistical Society: Series A (General) 160, 523–541.
Hastie, T., Tibshirani, R., and Friedman, J. (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York.
Heaton, J.B., Polson, N.G., and Witte, J.H. (2017). Deep Learning for Finance: Deep Portfolios. Applied Stochastic Models in Business and Industry 33(3), 3–12.
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications in R. Springer.
Kim, H.S. and Sohn, S.Y. (2010). Support Vector Machines for Default Prediction of SMEs Based on Technology Credit. European Journal of Operational Research 201(3), 838–846.
Ko, A.H.R., Sabourin, R., and Britto, J.A.S. (2008). From Dynamic Classifier Selection to Dynamic Ensemble Selection. Pattern Recognition 41, 1735–1748.
Kumar, P.R. and Ravi, V. (2007). Bankruptcy Prediction in Banks and Firms via Statistical and Intelligent Techniques: A Review. European Journal of Operational Research 180, 1–28.
Lee, C.F. (2020). Financial Econometrics, Mathematics, Statistics, and Financial Technology: An Overall View. Review of Quantitative Finance and Accounting. Forthcoming.
Lee, C.F. and Lee, J. (2020). Handbook of Financial Econometrics, Mathematics, Statistics, and Machine Learning. World Scientific, Singapore. Forthcoming.
Lessmann, S., Baesens, B., Seow, H.-V., and Thomas, L.C. (2015). Benchmarking State-of-the-Art Classification Algorithms for Credit Scoring: An Update of Research. European Journal of Operational Research 247, 124–136.
Maldonado, S., Pérez, J., and Bravo, C. (2017). Cost-Based Feature Selection for Support Vector Machines: An Application in Credit Scoring. European Journal of Operational Research 261, 656–665.
Malhotra, R. and Malhotra, D.K. (2002). Differentiating Between Good Credits and Bad Credits Using Neuro-Fuzzy Systems. European Journal of Operational Research 136(1), 190–211.
Mitchell, T. (1997). Machine Learning. McGraw-Hill.
Moeyersoms, J. and Martens, D. (2015). Including High-cardinality Attributes in Predictive Models: A Case Study in Churn Prediction in the Energy Sector. Decision Support Systems 72, 72–81.
Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2012). Foundations of Machine Learning. MIT Press.
Paleologo, G., Elisseeff, A., and Antonini, G. (2010). Subagging for Credit Scoring Models. European Journal of Operational Research 201, 490–499.
Partalas, I., Tsoumakas, G., and Vlahavas, I. (2010). An Ensemble Uncertainty Aware Measure for Directed Hill Climbing Ensemble Pruning. Machine Learning 81, 257–282.
Putra, E.F. and Kosala, R. (2011). Application of Artificial Neural Networks to Predict Intraday Trading Signals. In Proceedings of the 10th WSEAS International Conference on E-Activity, Jakarta, Island of Java, pp. 174–179.
Raschka, S. (2015). Python Machine Learning. Packt, Birmingham, UK.
Russell, S. and Norvig, P. (2010). Artificial Intelligence: A Modern Approach, 3rd Edition. Prentice-Hall.
Samuel, A.L. (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development 3(3), 210–229.
Solea, E., Li, B., and Slavković, A. (2018). Statistical Learning on Emerging Economies. Journal of Applied Statistics 45(3), 487–507.
Sun, T. and Vasarhelyi, M.A. (2018). Predicting Credit Card Delinquencies: An Application of Deep Neural Network. Intelligent Systems in Accounting, Finance and Management 25, 174–189.
Verbraken, T., Bravo, C., Weber, R., and Baesens, B. (2014). Development and Application of Consumer Credit Scoring Models Using Profit-based Classification Measures. European Journal of Operational Research 238(2), 505–513.
Woloszynski, T. and Kurzynski, M. (2011). A Probabilistic Model of Classifier Competence for Dynamic Ensemble Selection. Pattern Recognition 44, 2656–2668.
Yang, Y.X. (2007). Adaptive Credit Scoring with Kernel Learning Methods. European Journal of Operational Research 183(3), 1521–1536.
Yeh, I.-C. and Lien, C.-H. (2009). The Comparisons of Data Mining Techniques for the Predictive Accuracy of Probability of Default of Credit Card Clients. Expert Systems with Applications 36, 2473–2480.
15 Deep Learning and Its Application to Credit Card Delinquency Forecasting
By Ting Sun, The College of New Jersey
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 299
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_15
predictors, including information about the credit card holder's basic demographic data, historical payment record, the amount of bill statements, as well as the amount of previous payments. They compare the performance of deep learning models with various activation functions to ensemble-learning techniques: bagging, random forest, and boosting. The results show that boosting has the strongest predictive power and that the performance of deep learning models relies on the choice of activation function (i.e., Tanh and ReLU), the number of hidden layers, and the regularization method (i.e., Dropout).

As a simple application of deep learning, Zhang et al. (2017) also analyze a dataset from the UCI machine learning repository and develop a prediction model for credit card default. Their data represent Taiwan's credit card defaults in 2005 and consist of 22 predictors, including age, education, marriage, and financial account characteristics. The result of the developed deep learning model is compared to those of linear regression and support vector machine. They find that deep learning outperforms the other models in terms of processing ability, which makes it suitable for large, complex financial data.

Using a dataset of 29,999 observations with 23 predictors from a major bank in Taiwan obtained from the UCI machine learning repository, Teng and Lee (2019) examine the predictive capabilities of five techniques, the nearest neighbors, decision trees, boosting, support vector machine, and neural networks, for credit card default. Their work shows a result inconsistent with prior ones: the decision tree performs best among the others in terms of validation curves.

Albanesi and Domonkos (2019) claim that the deep learning approach is "specifically designed for prediction in environments with high dimensional data and complicated nonlinear patterns of interaction among factors affecting the outcome of interest, for which standard regression approaches perform poorly." A deep learning-based prediction model is proposed for consumer default using anonymized credit file data from the Experian credit bureau. The data comprise more than 200 variables for 1 million households, describing information on credit cards, bank cards, other revolving credit, auto loans, installment loans, business loans, etc. For the proposed model, they apply dropout to each layer and ReLU at all neurons. Their results show that the proposed model consistently outperforms conventional credit scoring models.

15.3 The Methodology

15.3.1 Deep Learning in a Nutshell

Research on neural networks had not achieved solid progress until the early 2000s, when deep learning was first introduced by Hinton et al. (2006) in a paper named "A Fast Learning Algorithm for Deep Belief Nets." In their paper, Hinton and his colleagues develop a deep neural network capable of classifying handwritten digits with high accuracy. Since then, scholars have explored this technique and demonstrated that deep learning is capable of achieving state-of-the-art results in various areas, such as self-driving cars, the game of Go, and Natural Language Processing (NLP).

A DNN consists of a number of layers of artificial neurons which are fully connected to one another. The central idea of a DNN is that layers of those neurons automatically learn from massive amounts of observational data, recognize the underlying pattern, and classify the data into different categories. As shown in Fig. 15.1, a simple DNN consists of interconnected layers of neurons (represented by circles in Fig. 15.1). It contains one input layer, two hidden layers, and one output layer. The input layer receives the raw data, identifies the most basic elements of the data, and passes them to the hidden layers. The hidden layer further analyzes, extracts data representations, and sends the output to the next layer. After receiving the data representations from its predecessor layer, the output layer categorizes the data into predefined classes (e.g., students' grades A, B, and C). Within each layer, complex nonlinear computations are executed by the neurons, and each output is assigned a weight. The weighted outputs are then combined through a transformation and transferred to the next layer. As the data is processed and transmitted from one layer to another, a DNN extracts higher-level data representations defined in terms of other, lower-level representations (Bengio 2012a, b; Goodfellow et al. 2016; Sun and Vasarhelyi 2017).

Fig. 15.1 Architecture of a simplified deep neural network. Adapted from Marcus (2018)

15.3.2 Deep Learning Versus Conventional Machine Learning Approaches

A DNN is a special case of a traditional artificial neural network with deeper hierarchical layers of neurons. Today's large quantity of available data and tremendous increase in computing power make it possible to train neural networks with deep hierarchical layers. With the great depth of layers and the massive number of neurons, a DNN has much greater representational capability than a traditional one with only one or two hidden layers. In a DNN, with each iteration of model training, the final classification result provided by the output layer is compared to the actual observation to compute the error, and the DNN gradually "learns" from the data by updating the weights and other parameters in the next rounds of training. After numerous rounds of model training, the algorithm iterates through the data until the error cannot be reduced any further (Sun and Vasarhelyi 2017). Then the validation data is used to examine overfitting, and the selected model is used to predict the holdout data, which is the out-of-sample test. The paper will discuss the concepts of weights, iterations, overfitting, and the out-of-sample test in the next section.

A key feature of deep learning is that it performs well in terms of feature engineering. While traditional machine learning usually relies on human experts' knowledge to identify critical data features in order to reduce the complexity of the data and eliminate the noise created by irrelevant attributes, deep learning automatically learns highly abstract features from the data itself without human intervention (Sun and Vasarhelyi 2017). For example, a convolutional neural network (CNN) trained for face recognition can identify basic elements such as pixels and edges in the first and second layers, then parts of faces in successive layers, and finally a high-level representation of a face as the output. This characteristic of DNNs is seen as "a major step ahead of traditional Machine Learning" (Shaikh 2017). Another important difference between deep learning and other machine learning techniques is its performance as the scale of data increases. Deep learning algorithms learn from past examples. As a result, they need a sufficiently large amount of data to understand the underlying complex pattern. A DNN may not perform better than traditional machine learning algorithms like decision trees when the dataset is small or simple. But its performance will significantly improve as the data scale increases (Shaikh 2017). Therefore, deep learning performs excellently for unstructured data analysis and has produced remarkable results.

15.3.3 The Structure of a DNN and the Hyper-Parameters

(1) Layers and neurons
As mentioned earlier, a DNN is composed of layers containing neurons. To construct a DNN, one first needs to determine the number of layers and neurons. There are many types of DNN, for example, the multi-layer perceptron (MLP), the convolutional neural network (CNN), the recursive neural network, and the recurrent neural network (RNN). The architecture of a DNN is as below:
a. The input layer
There is only one input layer, whose goal is to receive the data. The number of neurons comprising the layer is typically equal to the number of variables in the data (sometimes, one additional neuron is included as a bias neuron).
b. The output layer
Similar to the input layer, a DNN has exactly one output layer. The number of neurons in the output layer is determined by the objective of the model. If the model is a regressor, the output layer has a single neuron, while the number of neurons for a classifier is determined by the number of class labels of the dependent variable.
c. The hidden layers
There are no "rules of thumb" for choosing the number of hidden layers and the number of neurons in each layer. It depends on the complexity of the problem and the nature of the data. For many problems, one starts with a single hidden layer, examines the prediction accuracy, and keeps adding more layers until the test error does not improve anymore (Bengio 2012a, b). Likewise, the choice of the number of neurons is based on "trial and error." This paper starts with a minimum number of neurons and increases the size until the model achieves its optimal performance. In other words, it stops adding neurons when the model starts to overfit the training set.
(2) Other hyper-parameters
a. Weight and bias
From the prior discussion, we learned that, in a neural network, inputs are received by the neurons in the input layer and then are transmitted between layers of neurons which are fully connected to each other. The input in a predecessor layer must be strong enough to be passed to the successor layer. To make the input data transmittable between layers, a weight along with a bias term is applied to the input data to control the strength of the connection between layers. That is, the weight affects the amount of influence the input will have on the output. Initially, a neural network is assigned random weights and biases before training begins. As training continues, the weights and biases are adjusted on the basis of "trial and error" until the model achieves its best predictive performance, that is, until the difference between the desired value and the model output (as represented by the cost function, which will be discussed later) is minimized.4
Bias is a constant term added to the product of inputs and weights, with the objective of shifting the output toward the positive or negative side to reduce its variance. Suppose you want a DNN to return 1 when all the inputs are 0s. Since the weighted sum of the inputs is then 0, you may add a bias value of 1 to ensure the output is 1. What will happen if you do not include the bias? The DNN simply performs a matrix multiplication on the inputs and weights. This could easily introduce an overfitting issue (Malik 2019).
b. Cost function
A cost function is a measure of the performance of a neural network with respect to its given training sample and the expected output. An example of a cost function is the Mean Squared Error (MSE), which simply takes the squared difference between every output and its true value and averages them. Other more complex examples include the cross-entropy cost, exponential cost, Hellinger distance, Kullback–Leibler divergence, and so on.
c. Activation function
The activation function is a mathematical function applied between the input that is received in the current neuron and the output that is transmitted to the neuron in the next layer.5 Specifically, the activation function is used to introduce nonlinearity to the DNN. It is a nonlinear transformation performed over the input data, and the transformed output will then be passed to the next layer as the input data (Radhakrishnan 2017). Activation functions help the neural network learn complex data and provide accurate predictions. Without the activation function, the weights of the neural network would simply execute a linear transformation, and even a deep stack of layers would be equivalent to a single layer, which is too simple to learn complex data (Gupta 2017). In contrast, "a large enough DNN with nonlinear activations can theoretically approximate any continuous function" (Géron 2019). Some frequently used nonlinear activation functions include Sigmoid (also called Logistic), TanH (Hyperbolic Tangent), ReLU (Rectified Linear Unit), Leaky ReLU, Parametric ReLU, Softmax, and Swish. Each of them has its own advantages and disadvantages, and the choice of activation function relies on trial and error. A classification MLP often uses ReLU in its hidden layers and Softmax or Sigmoid in the output layer (Géron 2019).
Figure 15.2 is a diagram describing the inner working of a neural network. In a neural network, a neuron is a basic processing unit performing two functions: collecting inputs and producing the output. Once received by a neuron, each input is multiplied by a weight, the products are summed and added with biases, and then an activation function is applied to produce an output, as shown in Fig. 15.2 (Mohamed 2019).
d. Learning rate, batch, iteration, and epoch
Since machine learning projects typically use a limited size of data, to optimize the learning, this study employs an iterative process of continuously adjusting the values of the model weights and biases. This strategy is called Gradient Descent (Rumelhart et al. 1986; Brownlee 2016b). Explicitly, updating the parameters once is not enough, as it will lead to underfitting (Sharma 2017). Hence the entire training data needs to be passed through (forward and backward) and learned

4 For more information about weights and biases, read https://deepai.org/machine-learning-glossary-and-terms/weight-artificial-neural-network and https://docs.paperspace.com/machine-learning/wiki/weights-and-biases.
5 For more information about activation functions, read https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/.

Fig. 15.2 The inner working of a neural network. Adopted from Mohamed (2019)
by the algorithm multiple times until it reaches the global minimum of the cost function. Each time the entire data is passed through the algorithm is called one epoch. As the number of epochs increases, the parameters are updated a greater number of times in the neural network, and the training accuracy as well as the validation accuracy will increase.6 Because it is impossible to pass the entire dataset into the algorithm at once, the dataset is divided into a number of parts called batches. The number of batches needed to complete one epoch is called the number of iterations. The learning rate is the extent to which the parameters are updated during the learning process. A lower learning rate requires more epochs, as a smaller adjustment is made to the parameters at each update, and vice versa (Ding et al. 2020).
e. Overfitting and regularization
A very complex model may cause an overfitting issue, which means that the model performs excellently on the training set but has a low predictive accuracy on the testing set. This is because a complex model such as a DNN can detect idiosyncratic patterns in the training set. If the data contains lots of noise (or if it is too small), the model actually detects patterns in the noise itself, instead of generalizing to the testing set (Géron 2019). To avoid overfitting, one can employ a regularization constraint to make the model simpler and reduce the generalization error. One tunes regularization parameters to control the strength of regularization applied during the learning process.
There are several regularization techniques, such as L1 and L2 regularization, dropout, and early stopping. L1 or L2 regularization works by applying a penalty term to the cost function to limit the capacity of models. The strength of regularization is controlled by the value of its parameters (e.g., lambda). By adding the regularized term, the values of the weight matrices decrease, which in turn reduces the complexity of the model (Kumar 2019). Dropout is one of the most frequently used regularization techniques in DNNs. At every iteration of learning, it randomly removes some neurons and all of their incoming and outgoing connections. Dropout can be applied to both the input layer and hidden layers. This approach can be considered an ensemble technique, as it allows each iteration to have a different set of neurons resulting in a different set of outputs. A parameter, the probability, is used to control the number of neurons that will be deleted (Jain 2018). Early stopping is a cross-validation strategy where we partition one part of the training set as the validation set. We learn the data patterns with the training set to construct a model and assess the performance of the model on the validation set. Specifically, the study monitors the model's predictive errors on the validation set. If the performance of the model on the validation set is not improving while the training error is decreasing, it immediately stops training the model further. Two parameters need to be configured. One is the quantity that needs to be monitored (e.g., validation error); the other is the number of epochs with no further improvement after which the training will be stopped (Jain 2018).

15.4 Data

The credit card data in the data analysis part is from a large bank in Brazil. The final dataset consists of three subsets, including (1) a dataset describing the personal characteristics of the credit card holder (e.g., gender, age, annual income, residential location, occupation, account age, and credit score); (2) a dataset providing the accumulated transactional information at the account level recorded by the bank in September 2013 (e.g., the frequency with which the account has been billed, the count of payments, and the number of domestic cash withdrawals); and (3) a dataset containing account-level transactions in June 2013 (e.g., the credit card revolving payment made, the amount of authorized transactions that exceeded the revolving limit of the credit card, and the number of days past due).
The original transaction set contains 6,516,045 records at the account level based on transactions made in June 2013, among which 45,017 are made with delinquent credit cards, and 6,471,028 are legitimate. For each credit card holder, the original transaction set is matched with the personal characteristics set and the accumulated transactional set. The objective of this work is to investigate the credit card holder's characteristics and spending behaviors and use them to develop an intelligent prediction model for credit card delinquency. Some transactional data is aggregated at the level of the credit card holder. For example, all the transactions made by the client are aggregated over all credit cards owned to generate a new variable, TRANS_ALL. Another derived variable, TRANS_OVERLMT, is the average amount of authorized transactions that exceed the credit limit made by the client on all credit cards owned.

6 However, when the number of epochs reaches a certain point, the validation accuracy starts decreasing while the training accuracy is still increasing. This means the model is overfitting. Thus, the optimal number of epochs is the point where the validation accuracy reaches its highest value.
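The client-level aggregation that produces derived variables such as TRANS_ALL can be sketched as follows; the client identifiers and transaction amounts below are made-up stand-ins for the bank's records.

```python
from collections import defaultdict

# Toy account-level records: (client_id, transaction_amount).
# These values are fabricated for illustration only.
records = [
    ("c1", 120.0), ("c1", 80.0),   # client c1 holds two cards
    ("c2", 300.0),
    ("c1", 50.0), ("c2", 40.0),
]

# Sum transactions over all cards owned by each client, producing a
# TRANS_ALL-style derived variable keyed by client
trans_all = defaultdict(float)
for client, amount in records:
    trans_all[client] += amount

# trans_all -> {"c1": 250.0, "c2": 340.0}
```

TRANS_OVERLMT would follow the same pattern, averaging the over-limit amounts per client instead of summing all amounts.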
Table 15.1 The data structure

Panel A: delinquent versus legitimate observations
| Dataset | Delinquent Obs. (percentage) | Legitimate Obs. (percentage) | Total (percentage) |
| Credit card data | 6,537 (0.92%) | 704,860 (99.08%) | 711,397 (100%) |

Panel B: data content
| Data categories7 | No. of data fields | Time period |
| Client characteristics | 15 | As of September 2013 |
| Accumulative transactional information | 6 | As of September 2013 |
| Transactional information | 23 | June 2013 |
| Total | 44 | |
After summarization, standardization, eliminating observations with missing variables, and discarding variables with zero variation, there are 44 input data fields (among which 15 fields are related to the credit card holders' characteristics, 6 variables provide accumulative information for all past transactions made by the credit card holder based on the bank's record as of September 2013, and 23 attributes summarize the account-level records in June 2013), which are linked to 711,397 credit card holders. In other words, for each credit card holder, there are 15 variables describing his or her personal characteristics, 6 variables summarizing his or her past spending behavior, and 23 variables reporting the transactions the client made with all credit cards owned in June 2013. The final data is imbalanced because only 6,537 clients are delinquent. In this study, a credit card client is defined as delinquent when any of his or her credit card accounts was permanently blocked by the bank in September 2013 due to credit card delinquency. Table 15.1 summarizes the input data. The input data fields are listed and explained in Appendix 15.1.

15.5 Experimental Analysis

The data analysis process is performed with an Intel(R) Xeon(R) CPU (64 GB RAM, 64-bit OS). The software used in this analysis is H2O, an open-source machine learning and predictive analytics platform. H2O provides deep learning algorithms to help users train DNNs for different problems (Candel et al. 2020). This research uses H2O Flow, which is a notebook-style user interface for H2O. It is a browser-based interactive environment allowing users to import files, split data, develop models, iteratively improve them, and make predictions. H2O Flow blends command-line computing with a graphical user interface, providing a point-and-click interface for every operation (e.g., selecting hyper-parameters).8 This feature enables users with limited programming skills, such as auditors, to build their own machine learning models much more easily than they could with other tools.

15.5.1 Splitting the Data

The objective of data splitting in machine learning is to evaluate how well a model will generalize to new data before putting the model into production. The entire data is divided into two sets: the training set and the test set. A data analyst typically trains the model using the training set and tests it using the test set. By evaluating the error rate on the test set, the data analyst can estimate the error rate on new data in the future. But how does one choose the best model? More specifically, how does one determine the best set of hyper-parameters that makes a model outperform others? A solution is to tune those hyper-parameters by holding out part of the training set as a validation set and monitoring the performance of all candidate models on the validation set. With this approach, multiple models with various hyper-parameters are trained on the reduced training set, which is the full training set minus the validation set, and the model that performs best on the validation set is chosen. The current analysis uses the cross-validation technique. Cross-validation9 is a popular method, especially when the data size is limited. It makes full use of all data instances in the training set and generally results in a less biased estimate than other methods (Brownlee 2018).

7 A description of the attributes in each data category is provided in Appendix 15.1.
8 https://www.h2o.ai/h2o-old/h2o-flow/.
9 For more information about cross-validation, read https://towardsdatascience.com/5-reasons-why-you-should-use-cross-validation-in-your-data-science-project-8163311a1e79.
First, 20% of the data is held out as a test set,10 which will be used to give a confident estimate of the performance of the final tuned model. The stratified sampling method is applied to ensure that the test set has the same distribution of both classes (delinquent vs. legitimate) as the overall dataset. For the remaining 80% of the data (hereafter called the "remaining set"), fivefold cross-validation is applied. In H2O, fivefold cross-validation works as follows. In total, six models are built. The first five models are called cross-validation models. The last model is called the main model. In order to develop the five cross-validation models, the remaining set is divided into five groups using stratified sampling to ensure each group has the same class distribution. To construct the first cross-validation model, groups 2, 3, 4, and 5 are used as training data, and the constructed model is used to make predictions on group 1; to construct the second cross-validation model, groups 1, 3, 4, and 5 are used as training data, and the constructed model is used to make predictions on group 2, and so on. There are now five holdout predictions. Next, the entire remaining set is trained to build the main model, with training metrics and cross-validation metrics that will be reported later. The cross-validation metrics are computed as follows. The five holdout predictions are combined into one prediction for the full training dataset. This "holdout prediction" is then scored against the true labels, and the overall cross-validation metrics are computed. This approach scores the holdout predictions freshly rather than taking the average of the five metrics of the cross-validation models (H2O.ai 2018).

15.5.2 Tuning the Hyper-Parameters

Hyper-parameters need to be configured before fitting the model (Tartakovsky et al. 2017). The choice of hyper-parameters is critical, as it determines the structure and the variables controlling how the network is trained (e.g., the learning rate and weights) (Radhakrishnan 2017), which in turn makes the difference between poor and superior predictive performance (Tartakovsky et al. 2017). To select the best values for the hyper-parameters, two prevalent hyper-parameter optimization techniques are frequently used: Grid Search and Randomized Search.
The basic idea of Grid Search is that the user selects several grid points for every hyper-parameter (e.g., 2, 3, and 4 for the number of hidden layers) and trains the model using every combination of those values of the hyper-parameters. The combination that performs best is selected. Unlike Grid Search, Randomized Search evaluates a given number of random combinations. At each iteration, it uses one single random value for each hyper-parameter. Assuming there are 500 iterations, as controlled by the user, Randomized Search uses 500 random values for each hyper-parameter.
In contrast, Grid Search tries all combinations of only the several values selected by the user for each hyper-parameter. This approach works well when exploring relatively few combinations, but when the hyper-parameter search space is large, Randomized Search is preferable, as you have more control over the computing cost of the hyper-parameter search by controlling the number of iterations.
In this analysis, Grid Search is employed to select some key hyper-parameters and other settings of the DNN, such as the number of hidden layers and neurons as well as the activation function. The simplest form of DNN, the MLP, is employed as the basic structure of the neural network. No regularization is applied because the model itself is very simple. With Grid Search, one selects the combination of hyper-parameters that produces the lowest validation error. This leads to the choice of three hidden layers. In other words, the DNN consists of five fully connected layers (one input layer, three hidden layers, and one output layer). The input layer contains 322 neurons.11 The first hidden layer contains 175 neurons, the second hidden layer contains 350 neurons, and the third hidden layer contains 150 neurons. Finally, the output layer has 2 output neurons,12 which give the classification result of this research (whether or not the credit card holder is delinquent). The number of hidden layers and the number of neurons determine the complexity of the structure of the neural network. It is critical to build a neural network with an appropriate structure that fits the complexity of the data. While a small number of layers or neurons may cause underfitting, an extremely complex DNN would lead to overfitting (Radhakrishnan 2017).
This study uses the uniform distribution initialization method to initialize the network weights to small random numbers between 0 and 0.05 generated from a uniform distribution, then forward propagates the weights throughout the network. At each neuron, the weights and the input data are multiplied, aggregated, and transmitted through the activation function.
The model uses the ReLU activation function on the three hidden layers to solve the problem of the exploding/vanishing

10 An 80:20 ratio of data splitting is used as it is a common rule of thumb (Guller 2015; Giacomelli 2013; Nisbet et al. 2009; Kloo 2015).
11 The original inputs have 41 attributes. After creating dummies for all classes of the categorical attributes, there are finally 322 attributes.
12 For a binary classification problem, it just needs a single output neuron using the logistic activation function: the output will be a number between 0 and 1, which can be interpreted as the estimated probability of the positive class. The estimated probability of the negative class is equal to one minus that number (Géron 2019). Here, the number 2 is used to indicate there are two classes.
Table 15.2 The structure of the DNN

| Layer | Number of neurons | Type | Initial weight distribution/activation function |
| 1 | 322 | Input | Uniform |
| 2 | 175 | Hidden layer 1 | ReLU |
| 3 | 350 | Hidden layer 2 | ReLU |
| 4 | 150 | Hidden layer 3 | ReLU |
| 5 | 2 | Output | Sigmoid |
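As a concrete illustration of how inputs flow through a fully connected stack like the one in Table 15.2, the sketch below wires up a scaled-down version of the same layer pattern in plain Python. The layer sizes and random weights are illustrative stand-ins, not the trained model.

```python
import math
import random

rng = random.Random(0)

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, n_out, activation):
    # One fully connected layer: weighted sum plus bias, then activation.
    # Weights are drawn uniformly from [0, 0.05], echoing the uniform
    # initialization described in the text; biases start at 0.
    layer_out = []
    for _ in range(n_out):
        weights = [rng.uniform(0.0, 0.05) for _ in inputs]
        z = sum(x * w for x, w in zip(inputs, weights)) + 0.0
        layer_out.append(activation(z))
    return layer_out

# Scaled-down version of the Table 15.2 stack: input -> three ReLU
# hidden layers -> sigmoid output (322/175/350/150/2 shrunk to
# 8/4/6/3/2 for readability)
x = [rng.random() for _ in range(8)]
for size in (4, 6, 3):
    x = dense(x, size, relu)
out = dense(x, 2, sigmoid)  # two class scores, each strictly in (0, 1)
```

Each call to `dense` performs exactly the "multiply, aggregate, activate" step that the text attributes to every neuron.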
gradient, which was introduced by Bengio, Simard, and Frasconi (1994) (Jin et al. 2016; Baydin et al. 2016). The Sigmoid activation function is applied to the output layer as it is a binary prediction. Table 15.2 depicts the neural network's structure.
The number of epochs in the DNN model is 10. The learning rate defines how quickly a network updates its parameters. Instead of using a constant learning rate to update the parameters (e.g., network weights) for each training epoch, this study employs an adaptive learning rate, which allows the specification of different learning rates per layer (Brownlee 2016a; Lau 2017). Two parameters, Rho and Epsilon, need to be specified to implement the adaptive learning rate algorithm. Rho is similar to momentum and relates to the memory of prior weight updates. Typical values are between 0.9 and 0.999. This study uses the value 0.99. Epsilon is similar to learning rate annealing during initial training and momentum at later stages where it allows forward progress. It prevents the learning process from being trapped in local optima. Typical values are between 1e–10 and 1e–4. The value of Epsilon is 1e–8 in this study. Batch size is the total number of training observations present in a single batch. The batch size used here is 32.

15.5.3 Techniques of Handling Data Imbalance

The entire dataset has imbalanced classes. The vast majority of the credit card holders do not have delinquency. A total of 6,537 instances are labeled with the class "delinquent," while the remaining 704,860 are labeled with the class "legitimate." To address the data imbalance, over-sampling and under-sampling are two popular resampling techniques. While over-sampling adds copies of instances from the under-represented class (which is the delinquent class in our case), under-sampling deletes instances from the over-represented class (which is the legitimate class in our case). This study applies Grid Search again to try both approaches and finds that over-sampling works better for our data. Table 15.3 summarizes the distributions of classes in the training, five cross-validation, and test sets.13
To compare the predictive performance of the DNN to that of a traditional neural network, logistic regression, Naïve Bayes, and a decision tree, the same dataset and the same data splitting and preprocessing methods are used to develop the prediction models. The results of cross-validation are reported in the next section.

15.6 Results

15.6.1 The Predictor Importance

This analysis evaluates the independent contribution of each predictor in explaining the variance of the target variable. Figure 15.3 lists the top 10 important indicators and their importance scores measured by the relative importance as compared to that of the most important variable.
The most powerful predictor is TRANS_ALL, the total amount of all authorized transactions on all credit cards held by the client in June, which indicates that the more the client spent, the riskier that the client will have a severe delinquency issue later in September. The second important predictor is LOCATION, suggesting that clients living in some regions

13 When splitting frames, H2O does not give an exact split. It's designed to be efficient on big data, using a probabilistic splitting method rather than an exact split. For example, when specifying a 0.75/0.25 split, H2O will produce a test/train split with an expected value of 0.75/0.25 rather than exactly 0.75/0.25. On small datasets, the sizes of the resulting splits will deviate from the expected value more than on big data, where they will be very close to exact. http://h2o-release.s3.amazonaws.com/h2o/master/3552/docs-website/h2o-docs/datamunge/splitdatasets.html.
[Fig. 15.3: relative importance of the top 10 predictors]
are more likely to default on credit card debt. Compared to TRANS_ALL, whose relative importance is 1 as it is the most important indicator, LOCATION's relative importance is 0.9622. It is followed by the limit of cash withdrawal (CASH_LIM) and the number of days given to the client to pay off the new balance without paying finance charges (GRACE_PERIOD). This result suggests that the flexibility the bank provides to the client facilitates the occurrence of delinquencies. Other important data fields include BALANCE_CSH (the current balance of cash withdrawal), PROFESSION (the occupation of the client), BALANCE_ROT (the current balance of credit card revolving payment), FREQUENCY (the number of times the client has been billed until September 2013), and TRANS_OVERLMT (the average amount of the authorized transactions that exceeded the limit on all credit card accounts owned by the client). The last predictor is the average number of days the client's payments (on all credit cards) in June 2013 have passed the due dates.

15.6.2 The Predictive Result for Cross-Validation Sets

A list of metrics is applied to evaluate the predictive performance of the constructed DNN under cross-validation. The current analysis also uses a traditional neural network algorithm with a single hidden layer and a comparable number of neurons to build a similar prediction model. Logistic regression, Naïve Bayes, and decision tree techniques are also employed to conduct the same task. Next, those metrics are used to compare the prediction results of the DNN and the other models.
As shown in Table 15.4, the DNN has an overall accuracy of 99.54%, slightly lower than the traditional neural network and decision tree, but higher than the other two approaches. Since there is a large class imbalance in the validation data, classification accuracy alone cannot provide useful information for model selection, as it is possible for a model to predict the value of the majority class for all predictions and achieve a high classification accuracy. Therefore, this study considers a set of additional metrics.
Specificity (also called the True Negative Rate (TNR)) measures the proportion of negatives that are correctly identified as such. In this case it is the percentage of legitimate holders who are correctly identified as non-delinquent. The TNR of the DNN is 0.9990, which is the second highest score of all algorithms. This result shows that the DNN classifier performs excellently in correctly identifying legitimate clients. The decision tree has a slightly higher specificity, which is 0.9999. The traditional neural network and logistic regression also have high specificity scores. However, Naïve Bayes has a low TNR, 0.5913. This means that many legitimate observations are mistakenly identified by the Naïve Bayes model as delinquent ones. The false negative rate (FNR) is the Type II error rate. It is the proportion of positives that are incorrectly identified as negatives. An FNR of 0.3958 for the DNN indicates that 39.58% of delinquent clients are undetected by the classifier. This is the second lowest score. The lowest one is 0.1226, generated by Naïve Bayes. So far, it seems that the Naïve Bayes model tends to consider all observations as default ones because of its low level of
Table 15.4 Predictive performance14

| Metrics | DNN | Traditional NN | Decision tree (J48) | Naïve Bayes | Logistic regression |
| Overall accuracy | 0.9954 | 0.9955 | 0.9956 | 0.5940 | 0.9938 |
| Recall | 0.6042 | 0.5975 | 0.5268 | 0.8774 | 0.4773 |
| Precision | 0.8502 | 0.8739 | 0.9922 | 0.0196 | 0.7633 |
| Specificity | 0.9990 | 0.9980 | 0.9999 | 0.5913 | 0.9986 |
| F1 | 0.7064 | 0.6585 | 0.6882 | 0.0383 | 0.5874 |
| F2 | 0.6413 | 0.6204 | 0.5813 | 0.0898 | 0.5166 |
| F0.5 | 0.7862 | 0.7016 | 0.8432 | 0.0243 | 0.6816 |
| FNR | 0.3958 | 0.4027 | 0.4732 | 0.1226 | 0.5227 |
| FPR | 0.0010 | 0.0020 | 0.0001 | 0.4087 | 0.0014 |
| AUC | 0.9547 | 0.9485 | 0.881 | 0.7394 | 0.8889 |
| Model building time | 8 h 3 min 13 s | 13 min 56 s | 0.88 s | 9 s | 34 s |
TNR and FNR. The false positive rate (FPR) is called the Type I error rate. It is the proportion of negatives that are incorrectly classified as positives. The table shows that the Type I error rate of the decision tree is 0.01%, lower than that of the DNN, which is 0.1%. This result suggests that it is unlikely that a normal client will be identified by the decision tree or the DNN as a problematic one.
Precision and recall are two important measures of the classifier's ability for delinquency detection, where precision15 measures the percentage of actual delinquencies among all perceived ones. The precision score of the DNN, 0.8502, is lower than that of the decision tree and the traditional neural network, which are 0.9922 and 0.8739, respectively, but higher than that of the other two algorithms. Specifically, the Naïve Bayes model receives an extremely low score, 0.0196. This number shows that approximately all perceived delinquencies are actually legitimate observations. Recall,16 on the other hand, indicates, for all actual delinquencies, how many of them are successfully identified by the classifier. It is also called Sensitivity or the True Positive Rate (TPR), and can be thought of as a measure of a classifier's completeness. The Recall score of the DNN is 0.6042, the highest score of all models except Naïve Bayes. This number also means 39.58% of delinquent observations are not identified by our model, which is consistent with the result for the FNR.
While the decision tree and traditional neural network models perform better than the DNN in terms of precision, the DNN outperforms them in terms of recall. Thus, it is necessary to evaluate the performance of models by considering both precision and recall. Three F scores, F1, F2, and F0.5, are frequently used by existing data mining research to conduct this job (Powers 2011). The F1 score17 is the harmonic mean of precision and recall, treating precision and recall equally. While F2 18 treats recall with more importance than precision by weighting recall higher than precision, F0.5 19 weighs recall lower than precision. The F1, F2, and F0.5 scores of the DNN are 0.7064, 0.6413, and 0.7862, respectively. The results show that, with the exception of F0.5, the DNN exhibits the highest overall performance among the models.
The overall capability of the classifier can also be measured by the Area Under the Receiver Operating Characteristic (ROC) curve, the AUC. The ROC curve (see Fig. 15.4) plots the recall versus the false positive rate as the discriminative threshold is varied between 0 and 1. Again, the DNN provides the highest AUC, 0.9547, compared to the other models, showing its strong ability to discern between the two classes. Finally, the model building time shows that it is a time-consuming procedure (more than 8 h) to develop a DNN due to the complexity of the computation.

15.6.3 Prediction on Test Set

The results of cross-validation show the performance of the model with optimal hyper-parameters. The actual predictive capability of the model is measured by the out-of-sample test on the test set. Table 15.5 is the confusion matrix for the test set. 85 legitimate credit card holders are classified as

14 We choose the threshold that gives us the highest F1 score, and the reported value of each metric is based on the selected threshold.
15 Precision = true positives/(true positives + false positives).
16 Recall = true positives/(true positives + false negatives).
17 F1 = 2 × (precision × recall)/(precision + recall).
18 F2 = 5 × (precision × recall)/(4 × precision + recall).
19 F0.5 = (5/4) × (precision × recall)/((1/4) × precision + recall).
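The three F scores defined in footnotes 17–19 are instances of the general F-beta formula. The short check below reproduces the DNN's cross-validation F scores in Table 15.4 from its precision and recall alone.

```python
def f_beta(precision, recall, beta):
    # General F-beta score: beta > 1 favors recall, beta < 1 favors
    # precision, beta = 1 is the harmonic mean of the two
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# DNN cross-validation scores from Table 15.4
precision, recall = 0.8502, 0.6042

f1 = f_beta(precision, recall, 1.0)    # -> rounds to 0.7064
f2 = f_beta(precision, recall, 2.0)    # -> rounds to 0.6413
f05 = f_beta(precision, recall, 0.5)   # -> rounds to 0.7862
```

Setting beta to 1, 2, and 0.5 recovers exactly the footnote formulas: for example, beta = 2 gives (1 + 4)·P·R/(4·P + R).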
Table 15.6 The result of out-of-sample test

Metrics               DNN      Traditional NN  Naïve Bayes  Logistic  Decision tree (J48)
Overall accuracy      0.9959   0.9941          0.6428       0.9949    0.9944
Recall                0.6053   0.5521          0.8677       0.5770    0.4527
Precision             0.9009   0.7291          0.0217       0.8047    0.9080
Specificity           0.9994   0.9981          0.6407       0.9987    0.9996
F1                    0.7241   0.6283          0.0424       0.6721    0.6042
F2                    0.6478   0.5802          0.0987       0.6116    0.5032
F0.5                  0.8208   0.6851          0.0270       0.7459    0.7559
False negative rate   0.3947   0.4479          0.1323       0.4230    0.5473
False positive rate   0.0006   0.0019          0.3593       0.0013    0.0004
AUC                   0.9246   0.9202          0.7581       0.8850    0.8630
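As a quick consistency check on Table 15.6, the error-rate rows are complements of the recall and specificity rows; a minimal sketch using the DNN column:

```python
# FNR = 1 - recall, FPR = 1 - specificity (DNN column of Table 15.6)
recall, specificity = 0.6053, 0.9994
fnr = 1 - recall         # 0.3947, as reported
fpr = 1 - specificity    # 0.0006, as reported
print(round(fnr, 4), round(fpr, 4))
```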
Target variable       Description(20)
INDICATOR             It indicates if any of the client's credit cards is permanently blocked in September 2013 due to credit card delinquency

Input variables       Description
1. Personal characteristics
SEX                   The gender of the credit card holder
Individual            The code indicating if the holder is an individual or a corporation
AGE                   The age of the credit card holder
INCOME_CL             The annual income claimed by the holder
INCOME_CF             The annual income of the holder confirmed by the bank
ADD_ASSET             The number of additional assets owned by the holder
ACCOUNT_AGE           The oldest age of the credit card accounts owned by the client (in months)
CREDIT_LMT_PRVS       The maximum credit limit in the last period

3. Transactions in June 2013
CREDIT_LMT_CRT        The maximum credit limit
LATEDAYS              The average number of days that the client's credit card payments have passed the due date
UNPAID_DAYS           The average number of days that previous transactions have remained unpaid
BALANCE_ROT           The current balance of credit card revolving payment
BALANCE_CSH           The current balance of cash withdrawal
GRACE_PERIOD          The remaining number of days that the bank gives the credit card holder to pay off the new balance without paying finance charges. The time window starts from the end of June 2013 to the next payment due date

20 The unit of the amount is Brazilian Real.
16 Binomial/Trinomial Tree Option Pricing Using Python

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_16
Stock price    Call value    Put value
110            10            0
100            ??            ??
90             0             10
Let's first consider the issue of pricing a call option. Using a one-period Binomial Tree, we can illustrate the price of a stock if it goes up and the price of a stock if it goes down. Since we know the possible ending values of the stock, we can derive the possible ending values of a call option. If the stock price increases to $110, the price of the call option will then be $10 ($110 − $100). If the stock price decreases to $90, the call option will be worth $0 because it would be below the exercise price of $100. We have just discussed the possible ending values of a call option in period 1. But what we are really interested in is the value now of the call option, knowing the two resulting values of the call option.

To help determine the value of a one-period call option, it is useful to know that it is possible to replicate the resulting two states of the value of the call option by buying a combination of stocks and bonds. Below is the formula to replicate the situation where the price increases to $110. We will assume that the interest rate for the bond is 7%.

110S + 1.07B = 10        (16.1)
 90S + 1.07B = 0

We can use simple algebra to solve for both S and B. The first thing that we need to do is to rearrange the second equation as follows:

B = −90S/1.07

Substituting this into the first equation gives 20S = 10, so S = 0.5 and B = −42.05607.

Below calculates the value of the above one-period call option where the strike price, X, is $100 and the risk-free interest rate is 7%. We will assume that the price of the stock for any given period will either increase or decrease by 10%.

Therefore, from the above simple algebraic exercise, we should at period 0 buy .5 shares of IBM stock and borrow 42.05607 at 7% to replicate the payoff of the call option. This means the value of the call option should be .5 × 100 − 42.05607 = 7.94393. If this were not the case, there would be arbitrage profits. For example, if the call option were sold for $8, there would be a profit of .05607. This would result in increased selling of the call option, and the increase in the supply of call options would push their price down. If the call option were sold for $7, there would be a saving of .94393. This saving would result in an increased demand for the call option. The equilibrium point would be 7.94393.

Therefore, from the above calculations, the value of the call option is $7.94, and the call option pricing binomial tree should look like the following:

Call Option
Period 0    Period 1
            10.00
7.94
            0.00

Using the above-mentioned concept and procedure, Benninga (2000) has derived a one-period call option model as

C = qu Max[S(1 + u) − X, 0] + qd Max[S(1 + d) − X, 0]

where

qu = (i − d)/[(1 + i)(u − d)]
qd = (u − i)/[(1 + i)(u − d)]
u = increase factor
d = decrease factor
i = interest rate

For a put option, the period-1 payoffs are

Pu = Max[X − S(1 + u), 0]
Pd = Max[X − S(1 + d), 0]

and we then have

P = [pPu + (1 − p)Pd]/R        (16.4)

where R = 1 + i. As an example, suppose the strike price, X, is $100 and the risk-free interest rate is 7%. Then
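The replication argument can be verified numerically; a minimal sketch of the algebra above, in which NumPy's linear solver stands in for the hand calculation:

```python
import numpy as np

# Solve the replication system (16.1):
#   110*S + 1.07*B = 10
#    90*S + 1.07*B = 0
A = np.array([[110.0, 1.07],
              [90.0, 1.07]])
payoff = np.array([10.0, 0.0])
S, B = np.linalg.solve(A, payoff)   # S = 0.5 shares, B ~ -42.05607 (borrowing)
call_value = S * 100 + B            # cost of the replicating portfolio today

# The same value from the one-period risk-neutral formula with i = 0.07, u = 0.10, d = -0.10
i, u, d = 0.07, 0.10, -0.10
qu = (i - d) / ((1 + i) * (u - d))
qd = (u - i) / ((1 + i) * (u - d))
C = qu * max(100 * (1 + u) - 100, 0) + qd * max(100 * (1 + d) - 100, 0)
print(round(call_value, 5), round(C, 5))   # both ~ 7.94393
```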
the stock price minus the exercise price, $81 − $100, or −$19. A negative value has no value to an investor, so the value of the call option would be $0. In period two, the value of a put option when the stock price is $81 is the exercise price minus the stock price, $100 − $81, or $19. We can derive the call and put option values for the other possible values of the stock in period 2 in the same fashion. The following shows the possible call and put option values for period 2.

Call Option
Period 0    Period 1    Period 2
                        21.00
                        0.00
                        0.00

Put Option
Period 0    Period 1    Period 2
                        0.00
                        1.00
                        19.00

We cannot calculate the value of the call and put options in period 1 the same way as we did in period 2, because period 1 is not the ending value of the stock. In period 1, there are two possible call values: one when the stock price has increased and one when it has decreased. The call option Decision Tree shown above shows two possible values for a call option in period 1. If we just focus on the value of a call option when the stock price increases from period 1, we will notice that it is like the Decision Tree for a call option for one period. As in the pricing of a call option for one period, the price of a call option when the stock price increases from period 0 will be $16.68. The resulting Binomial Tree is shown below.

Call Option
Period 0    Period 1    Period 2
                        21.00
            16.68
                        0.00
                        0.00

In the same fashion, we can price the value of a call option when the stock price decreases. The price of a call option when the stock price decreases from period 0 is $0. The resulting Decision Tree is shown below.

Call Option
Period 0    Period 1    Period 2
                        21.00
            16.68
                        0.00
            0.00
                        0.00

In the same fashion, we can price the value of the call option in period 0. The resulting Binomial Tree is shown below.

Call Option
Period 0    Period 1    Period 2
                        21.00
            16.68
                        0.00
13.25
            0.00
                        0.00

We can calculate the value of a put option in the same manner as we did in calculating the value of a call option. The Binomial Tree for a put option is shown below.
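The backward induction behind these node values can be sketched in a few lines; `binomial_price` is our own helper name, and p = 0.85 is the risk-neutral up probability implied by u = 10%, d = −10%, and r = 7%:

```python
def binomial_price(S0, X, u, d, r, n, kind='call'):
    """Price a European option by backward induction on an n-period binomial tree."""
    p = (r - d) / (u - d)                  # risk-neutral up probability (rates, not factors)
    # terminal stock prices and payoffs, indexed by the number of up-moves k
    prices = [S0 * (1 + u)**k * (1 + d)**(n - k) for k in range(n + 1)]
    if kind == 'call':
        values = [max(s - X, 0.0) for s in prices]
    else:
        values = [max(X - s, 0.0) for s in prices]
    for _ in range(n):                     # discount one period at a time
        values = [(p * values[k + 1] + (1 - p) * values[k]) / (1 + r)
                  for k in range(len(values) - 1)]
    return values[0]

print(round(binomial_price(100, 100, 0.10, -0.10, 0.07, 2, 'call'), 2))  # 13.25
print(round(binomial_price(100, 100, 0.10, -0.10, 0.07, 2, 'put'), 2))   # 0.6
```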
Put Option
Period 0    Period 1    Period 2
                        0.00
            0.14
0.60                    1.00
            3.46
                        19.00

16.2.2 European Option Pricing—N Periods

Benninga (2000, p 260) has derived the price of a call and a put option, respectively, by a Binomial Option Pricing model with n periods as

C = Σ_{i=0}^{n} C(n, i) qu^i qd^(n−i) max[S(1 + u)^i (1 + d)^(n−i) − X, 0]        (16.5)

P = Σ_{i=0}^{n} C(n, i) qu^i qd^(n−i) max[X − S(1 + u)^i (1 + d)^(n−i), 0]        (16.6)

where C(n, i) = n!/[i!(n − i)!] is the binomial coefficient.

Chapter 5 has shown how Excel VBA can be used to estimate the binomial option pricing model, and Appendix 16.1 shows how a Python program can be used to estimate the same model. By using the Python program in Appendix 16.1, Figs. 16.1, 16.2 and 16.3 illustrate the simulation results of binomial tree option pricing using initial stock price S0 = 100, strike price X = 100, n = 4 periods, interest rate r = 0.07, up factor u = 1.175, and down factor d = 0.85. Figure 16.1 illustrates the simulated stock prices, and Figs. 16.2 and 16.3 illustrate the corresponding European call and put prices, respectively. As can be seen, for example, when the stock price at the 4th period is S = 190.61, the European call and put prices are 90.61 and 0, respectively; when the stock price at the 4th period is S = 52.2, the European call and put prices are 0 and 47.8, respectively.

16.3 American Option Pricing Using Binomial Tree Model

An American option is an option the holder may exercise at any time between the start date and the maturity date. Therefore, the holder of an American option faces the dilemma of deciding when to exercise. Binomial tree valuation can be adapted to include the possibility of exercise at intermediate dates, not just the maturity date. This feature needs to be incorporated into the pricing of American options. The binomial option pricing model presents two advantages for option sellers over the Black–Scholes model. The first is its simplicity, which allows for fewer errors in commercial application. The second is its iterative operation, which adjusts prices in a timely manner so as to reduce the opportunity for buyers to execute arbitrage strategies. For example, since it provides a stream of valuations for a derivative for each node in a span of time, it is useful for valuing derivatives such as American options, which can be exercised anytime between the purchase date and the expiration date. It is also much simpler than other pricing models such as the Black–Scholes model.

The first step of pricing an American option is the same as for a European option. For an American option, the second step relates to the difference between the strike price of the option and the price of the stock. A simplified example is given as follows. Assume there is a stock that is priced at S = $100 per share. In one month, the price of this stock will go up by $10 or go down by $10, creating this situation.

Suppose there is a call option available on this stock that expires in one month and has a strike price of $100. In the up state, this call option is worth $10, and in the down state, it is worth $0. Assume an investor purchases one-half share of stock and writes or sells one call option. The total investment today is the price of half a share less the price of the option, and the possible payoffs at the end of the month are

Up state:      0.5 × $110 − $10 = $45
Down state:    0.5 × $90 − $0 = $45

The portfolio payoff is equal no matter how the stock price moves. Given this outcome, assuming no arbitrage opportunities, an investor should earn the risk-free rate over the course of the month. The cost today must be equal to the payoff discounted at the risk-free rate for one month. The equation to solve is thus

Option price = $50 − $45 e^(−rT)

where e is the mathematical constant 2.7183. Assuming the risk-free rate is 3% per year, and T equals 0.0833 (one divided by 12), the price of the call option today is $5.11.

16.4 Alternative Tree Models

In this section, we will introduce three binomial tree methods and one trinomial tree method to price option values. The three binomial tree methods are those of Cox et al. (1979), Jarrow and Rudd (1983), and Leisen and Reimer (1996). These methods generate different kinds of underlying asset trees to represent different trends of asset movement. Kamrad and Ritchken (1991) extend the binomial tree method to multinomial approximation models. The trinomial tree method is one of the multinomial models.

16.4.1 Cox, Ross, and Rubinstein Model

Cox et al. (1979) (hereafter CRR) propose an alternative choice of parameters that also creates a risk-neutral valuation environment. The price multipliers, u and d, depend only on the volatility σ and on dt, not on the drift:

u = e^(σ√dt)
d = 1/u

To offset the absence of a drift component in u and d, the probability of an up move in the CRR tree is usually greater than 0.5, which ensures that the expected value of the price increases by a factor of exp[(r − q)dt] on each step. The formula for p is

p = [e^((r−q)dt) − d]/(u − d)

Let f_{i,j} denote the option value in node (i, j), where i denotes the ith node in period j (j = 0, 1, 2, …, n). Note that in a binomial tree model, i = 0, …, j. Thus, the underlying asset price in node (i, j) is S u^i d^(j−i). At expiration we have

f_{i,n} = max[S u^i d^(n−i) − X, 0],    i = 0, 1, …, n

Going backward in time (decreasing j), we get

f_{i,j} = e^(−r dt) [p f_{i+1,j+1} + (1 − p) f_{i,j+1}]

Lee et al. (2000, p 237) have derived the pricing of a call and a put option, respectively, by a Binomial Option Pricing model with n periods as

C = (1/R^n) Σ_{k=0}^{n} [n!/(k!(n − k)!)] p^k (1 − p)^(n−k) max[0, (1 + u)^k (1 + d)^(n−k) S − X]        (16.7)

P = (1/R^n) Σ_{k=0}^{n} [n!/(k!(n − k)!)] p^k (1 − p)^(n−k) max[0, X − (1 + u)^k (1 + d)^(n−k) S]        (16.8)

where R = 1 + r.

16.4.2 Trinomial Tree

Because binomial tree methods are computationally expensive, Kamrad and Ritchken (1991) propose multinomial models. The new multinomial models include existing models as special cases and are shown to be computationally more efficient. Expressed algebraically, the trinomial tree parameters are

u = e^(λσ√dt)
d = 1/u

The formulas for the probabilities are

p_u = 1/(2λ²) + (r − σ²/2)√dt/(2λσ)
p_m = 1 − 1/λ²
p_d = 1 − p_u − p_m

If the parameter λ is equal to 1, then the trinomial tree model reduces to a binomial tree model. Below is the underlying asset price pattern based on the trinomial tree model.

Appendix 16.2 has shown how the Python program can be used to estimate the trinomial option pricing model. Figures 16.4, 16.5 and 16.6 illustrate the simulation results of trinomial tree option pricing using initial stock price S0 = 50, strike price X = 50, n = 6 periods, interest rate r = 0.04, and λ = 1.5. Figure 16.4 illustrates the simulated stock prices, and Figs. 16.5 and 16.6 illustrate the corresponding European call and put prices, respectively. As can be seen, for example, when the stock price at the 6th period is S = 84.07, the European call and put prices are 34.07 and 0, respectively; when the stock price at the 6th period is S = 29.74, the European call and put prices are 0 and 20.25, respectively.
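The formulas in Sects. 16.2.2 through 16.4.2 can be exercised together in one short script; a sketch under stated assumptions (the helper name `closed_form` is ours, and σ = 0.2 with dt = 1/12 in the CRR part are illustrative values, not from the text):

```python
from math import comb, exp, sqrt

# --- Equations (16.5)/(16.6): n-period closed form with the figure parameters
#     u = 1.175, d = 0.85 expressed as rates u = 0.175, d = -0.15
def closed_form(S0, X, u, d, i, n, kind='call'):
    qu = (i - d) / ((1 + i) * (u - d))      # discounted risk-neutral weights
    qd = (u - i) / ((1 + i) * (u - d))
    total = 0.0
    for k in range(n + 1):                  # k up-moves out of n
        ST = S0 * (1 + u)**k * (1 + d)**(n - k)
        payoff = max(ST - X, 0.0) if kind == 'call' else max(X - ST, 0.0)
        total += comb(n, k) * qu**k * qd**(n - k) * payoff
    return total

call = closed_form(100, 100, 0.175, -0.15, 0.07, 4, 'call')
put = closed_form(100, 100, 0.175, -0.15, 0.07, 4, 'put')
# put-call parity C - P = S0 - X/(1 + i)^n holds exactly for these weights
print(round(call - put, 6), round(100 - 100 / 1.07**4, 6))

# --- One-month hedge example of Sect. 16.3: cost today = $50 - $45 e^(-rT)
print(round(50 - 45 * exp(-0.03 / 12), 2))   # 5.11

# --- CRR parameters (Sect. 16.4.1), illustrative sigma and dt
sigma, dt, r, q = 0.20, 1 / 12, 0.03, 0.0
u_crr = exp(sigma * sqrt(dt))
d_crr = 1 / u_crr
p = (exp((r - q) * dt) - d_crr) / (u_crr - d_crr)   # > 0.5, offsetting the missing drift

# --- Trinomial probabilities (Sect. 16.4.2), appendix defaults
lam, sigma_t, T, n_t, r_t = 1.5, 0.20, 0.5, 6, 0.04
dt_t = T / n_t
pu = 1 / (2 * lam**2) + (r_t - sigma_t**2 / 2) * sqrt(dt_t) / (2 * lam * sigma_t)
pm = 1 - 1 / lam**2
pd = 1 - pu - pm
print(round(pu, 4), round(pm, 4), round(pd, 4))     # the three probabilities sum to one
```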
16.5 Summary
Option values computed from binomial tree models closely match those computed from other commonly used models such as Black–Scholes, which indicates the utility and accuracy of binomial models for option pricing. Binomial pricing models can be developed according to a trader's preferences and can work as an alternative to Black–Scholes.
Appendix 16.1: Python Programming Code for Binomial Tree Option Pricing
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import random  # needed by hierarchy_pos when no root is supplied
#define a balanced binary tree
class Binode(object):
def __init__(self,element=None,down=None,up=None):
self.element = element
self.up = up
self.down = down
def dict_form(self):
dict_data = {'up':self.up,'down':self.down,'element':self.element}
return dict_data
class Tree(object):
def __init__(self,root=None):
self.root = root

#hierarchy_pos lays out a tree for plotting; the def line below is reconstructed
#from the parameters used in its body (based on the well-known networkx recipe)
def hierarchy_pos(G, root=None, width=1., vert_gap=0.2, vert_loc=0, leaf_vs_root_factor=0.5):
if not nx.is_tree(G):
raise TypeError('Need to define a tree')
if root is None:
if isinstance(G, nx.DiGraph):
root = next(iter(nx.topological_sort(G)))
else:
root = random.choice(list(G.nodes))
def _hierarchy_pos(G, root, leftmost, width, leafdx = 0.2, vert_gap = 0.2, vert_loc = 0,
xcenter = 0.5, rootpos = None,
leafpos = None, parent = None):
if rootpos is None:
rootpos = {root:(xcenter,vert_loc)}
else:
rootpos[root] = (xcenter, vert_loc)
if leafpos is None:
leafpos = {}
children = list(G.neighbors(root))
leaf_count = 0
if not isinstance(G, nx.DiGraph) and parent is not None:
children.remove(parent)
if len(children)!=0:
rootdx = width/len(children)
nextx = xcenter - width/2 - rootdx/2
for child in children:
nextx += rootdx
rootpos, leafpos, newleaves = _hierarchy_pos(G,child, leftmost+leaf_count*leafdx,
width=rootdx, leafdx=leafdx,
vert_gap = vert_gap, vert_loc = vert_loc-vert_gap,
xcenter=nextx, rootpos=rootpos, leafpos=leafpos, parent = root)
leaf_count += newleaves
leftmostchild = min((x for x,y in [leafpos[child] for child in children]))
rightmostchild = max((x for x,y in [leafpos[child] for child in children]))
leafpos[root] = ((leftmostchild+rightmostchild)/2, vert_loc)
else:
leaf_count = 1
leafpos[root] = (leftmost, vert_loc)
# pos[root] = (leftmost + (leaf_count-1)*dx/2., vert_loc)
# print(leaf_count)
return rootpos, leafpos, leaf_count
xcenter = width/2.
if isinstance(G, nx.DiGraph):
leafcount = len([node for node in nx.descendants(G, root) if G.out_degree(node)==0])
elif isinstance(G, nx.Graph):
leafcount = len([node for node in nx.node_connected_component(G, root) if
G.degree(node)==1 and node != root])
rootpos, leafpos, leaf_count = _hierarchy_pos(G, root, 0, width,
leafdx=width*1./leafcount,
vert_gap=vert_gap,
vert_loc = vert_loc,
xcenter = xcenter)
pos = {}
for node in rootpos:
pos[node] = (leaf_vs_root_factor*leafpos[node][0] + (1-leaf_vs_root_factor)*rootpos[node][0], leafpos[node][1])
# pos = {node:(leaf_vs_root_factor*x1+(1-leaf_vs_root_factor)*x2, y1) for ((x1,y1), (x2,y2)) in (leafpos[node], rootpos[node]) for node in rootpos}
xmax = max(x for x,y in pos.values())
for node in pos:
pos[node]= (pos[node][0]*width/xmax, pos[node][1])
return pos
#Final stage
return list_node
#store cur-1 layer
#for each ele in cur-1 layer, update value in cur layer
def construct_Ecallput_node(list_node,K,N,u,d,r,call_put):
p_tel = (1+r-d)/(u-d)
q_tel = 1-p_tel
#store the last layer of the list node to a new dict
last_layer = list_node['layer'+str(N)]
#use max(x-k,0) to recalculate the value of that layer
if call_put=='call':
last_layer = np.subtract(last_layer,K)
else:
last_layer = np.subtract(K,last_layer)
last_layer = [max(ele,0) for ele in last_layer]
#construct a new dict to store next layer's value
call_node = {'layer'+str(N):last_layer}
#construct for loop from layer end-1 to 0
for layer in reversed(range(N)):
cur_layer = list()
propagate_layer = call_node['layer'+str(layer+1)]
#inside the for loop, construct another for loop from the first element to end-1
for ele in range(len(propagate_layer)-1):
#calculate the value for the next layer and add to it
val = (propagate_layer[ele]*q_tel+propagate_layer[ele+1]*p_tel)/(1+r)
cur_layer.append(round(val,10))
dict_data = {'layer'+str(layer):cur_layer}
call_node.update(dict_data)
return call_node
def construct_Acallput_node(list_node,K,N,u,d,r,call_put):
p_tel = (1+r-d)/(u-d)
q_tel = 1-p_tel
#store the last layer of the list node to a new dict
last_layer = list_node['layer'+str(N)]
#use max(x-k,0) to recalculate the value of that layer
if call_put=='call':
last_layer = np.subtract(last_layer,K)
else:
last_layer = np.subtract(K,last_layer)
last_layer = [max(ele,0) for ele in last_layer]
#construct a new dict to store next layer's value
call_node = {'layer'+str(N):last_layer}
#construct for loop from layer end-1 to 0
for layer in reversed(range(N)):
cur_layer = list()
propagate_layer = call_node['layer'+str(layer+1)]
#inside the for loop, construct another for loop from the first element to end-1
for ele in range(len(propagate_layer)-1):
#calculate the value for the next layer and add to it
val = (propagate_layer[ele]*q_tel+propagate_layer[ele+1]*p_tel)/(1+r)
## the main difference between the European and American option is the following ##
##need to compare the continuation value with the early-exercise value at each node
if call_put=='call':
pre_exercise = max(list_node['layer'+str(layer)][ele]-K,0)# the difference between call and put
else:
pre_exercise = max(K-list_node['layer'+str(layer)][ele],0)
#the American option value at a node is the larger of continuation and early exercise
cur_layer.append(round(max(val,pre_exercise),10))
dict_data = {'layer'+str(layer):cur_layer}
call_node.update(dict_data)
return call_node

def color_map(list_node,callput_node,N,K):
#color nodes where early exercise is optimal red, all others skyblue
#(wrapper reconstructed; only the if/else body survived in the printed listing)
color_map = []
for layer in range(N+1):
for ele in range(len(callput_node['layer'+str(layer)])):
val = callput_node['layer'+str(layer)][ele]
pre_exercise = max(K-list_node['layer'+str(layer)][ele],0)
if val<pre_exercise:
color_map.append('red')
else:
color_map.append('skyblue')
return color_map
def construct_nodelabel(list_node,N):
#construct a dictionary to store labels
nodelabel = {}
#define a for loop from 0 to N
for layer in range(N+1):
#define a for loop from 0 to len(list_node['layer])
for ele in range(len(list_node['layer'+str(layer)])):
dict_data = {str(layer)+str(ele):round(list_node['layer'+str(layer)][ele],2)}
nodelabel.update(dict_data)
#dict.append(counter:list_node['layer][])
#counter++
return nodelabel
def construct_node(node_list,N):
#set a for loop from 0 to n-1
G = nx.Graph()
for layer in range(N):
#store layer current and layer next
cur_layer = node_list['layer'+str(layer)]
#for each ele in current layer, add_edge to ele on next layer and next ele on next layer
for ele in range(len(cur_layer)):
G.add_edge(str(layer)+str(ele),str(layer+1)+str(ele))
G.add_edge(str(layer)+str(ele),str(layer+1)+str(ele+1))
return G
def construct_nodepos(node_list):
position = {}
for layer in range(len(node_list)):
cur_layer = node_list['layer'+str(layer)]
return position
# Input the parameters required for a Binomial Tree:
#   S ........ stock price
#   K ........ strike price
#   N ........ time steps of the binomial tree
#   r ........ interest rate
#   sigma .... volatility
#   deltaT ... time duration of a step
def usr_input():
#prompts for the parameters listed above (body truncated in the printed listing)
...
plt.figure(figsize=(20,10))
vals = construct_labels(initial_price,N,u,d)
labels = construct_nodelabel(vals,N)
nodepos = construct_nodepos(vals)
G = construct_node(vals,N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G,pos=nodepos,node_color='skyblue',node_size=size_of_nodes,node_shape='o',alpha=1
,font_weight="bold",font_color='darkblue',font_size=size_of_font)
plt.title('Stock price simulation')
plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price,K,u,d,N,r,number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
if A_E =='European':
plt.figure(figsize=(20,10))
call_vals = construct_Ecallput_node(vals,K,N,u,d,r,'call')
labels = construct_nodelabel(call_vals,N)
nodepos = construct_nodepos(call_vals)
G = construct_node(call_vals,N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G,pos=nodepos,node_color='skyblue',node_size=size_of_nodes,node_shape='o',alpha=1
,font_weight="bold",font_color='darkblue',font_size=size_of_font)
plt.title('European call option')
plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price,K,u,d,N,r,number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
plt.figure(figsize=(20,10))
put_vals = construct_Ecallput_node(vals,K,N,u,d,r,'put')
labels = construct_nodelabel(put_vals,N)
nodepos = construct_nodepos(put_vals)
G = construct_node(put_vals,N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G,pos=nodepos,node_color='skyblue',node_size=size_of_nodes,node_shape='o',alpha=1
,font_weight="bold",font_color='darkblue',font_size=size_of_font)
plt.title('European put option')
plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price,K,u,d,N,r,number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
else:
plt.figure(figsize=(20,10))
call_vals_A= construct_Acallput_node(vals,K,N,u,d,r,'call')
labels = construct_nodelabel(call_vals_A,N)
nodepos = construct_nodepos(call_vals_A)
G = construct_node(call_vals_A,N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G,pos=nodepos,node_color='skyblue',node_size=size_of_nodes,node_shape='o',alpha=1
,font_weight="bold",font_color='darkblue',font_size=size_of_font)
plt.title('American call option')
plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price,K,u,d,N,r,number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
plt.figure(figsize=(20,10))
put_vals = construct_Ecallput_node(vals,K,N,u,d,r,'put')
put_vals_A = construct_Acallput_node(vals,K,N,u,d,r,'put')
Color_map = color_map(vals,put_vals,N,K)#should use put_vals instead of put_vals_A
labels = construct_nodelabel(put_vals_A,N)
nodepos = construct_nodepos(put_vals_A)
G = construct_node(put_vals_A,N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G,pos=nodepos,node_color=Color_map,node_size=size_of_nodes,node_shape='o',alpha
=1,font_weight="bold",font_color='darkblue',font_size=size_of_font)
plt.title('American put option')
plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price,K,u,d,N,r,number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
Appendix 16.2: Python Programming Code for Trinomial Tree Option Pricing

import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
return list_node
#store cur-1 layer
#for each ele in cur-1 layer, update value in cur layer
def construct_Ecallput_node(list_node,K,N,r,T,lambdA,sigma,call_put):
dt = T/N
erdt = np.exp(r*dt)
pu = 1/(2*lambdA**2)+(r-sigma**2/2)*np.sqrt(dt)/(2*lambdA*sigma)
pm = 1-1/lambdA**2
pd = 1-pu-pm
#store the last layer of the list node to a new dict
last_layer = list_node['layer'+str(N)]
#use max(x-k,0) to recalculate the value of that layer
if call_put=='call':
last_layer = np.subtract(last_layer,K)
else:
last_layer = np.subtract(K,last_layer)
last_layer = [max(ele,0) for ele in last_layer]
#construct a new dict to store next layer's value
call_node = {'layer'+str(N):last_layer}
#construct for loop from layer end-1 to 0
for layer in reversed(range(N)):
cur_layer = list()
propagate_layer = call_node['layer'+str(layer+1)]
#inside the for loop, construct another for loop from the first element to end-2
for ele in range(len(propagate_layer)-2):
#discount the probability-weighted values of the three children: index ele is the
#down node, ele+1 the middle node, ele+2 the up node (loop body reconstructed to
#mirror the binomial version of this routine)
val = (propagate_layer[ele]*pd+propagate_layer[ele+1]*pm+propagate_layer[ele+2]*pu)/erdt
cur_layer.append(round(val,10))
dict_data = {'layer'+str(layer):cur_layer}
call_node.update(dict_data)
return call_node
#need to reconstruct plot, can't use networkx
def construct_nodelabel(list_node,N):
#construct a dictionary to store labels
nodelabel = {}
#define a for loop from 0 to N
for layer in range(N+1):
#define a for loop from 0 to len(list_node['layer])
for ele in range(len(list_node['layer'+str(layer)])):
dict_data = {str(layer)+str(ele):round(list_node['layer'+str(layer)][ele],2)}
nodelabel.update(dict_data)
#dict.append(counter:list_node['layer][])
#counter++
return nodelabel
def construct_node(node_list,N):
#set a for loop from 0 to n-1
G = nx.Graph()
for layer in range(N):
#store layer current and layer next
cur_layer = node_list['layer'+str(layer)]
#for each ele in current layer, add_edge to ele on next layer and next ele on next layer
for ele in range(len(cur_layer)):
G.add_edge(str(layer)+str(ele),str(layer+1)+str(ele))
G.add_edge(str(layer)+str(ele),str(layer+1)+str(ele+1))
G.add_edge(str(layer)+str(ele),str(layer+1)+str(ele+2))
return G
def construct_nodepos(node_list):
position = {}
for layer in range(len(node_list)):
cur_layer = node_list['layer'+str(layer)]
return position
def usr_input():
#prompt for each parameter, falling back to the default when the input is blank
#(body reconstructed from the prompts shown in the printed listing)
S = float(input('Stock Price - S (Default : 50) --> ') or 50)
K = float(input('Strike price - K (Default 50) --> ') or 50)
sigma = float(input('Volatility - sigma (Default 0.2) --> ') or 0.2)
T = float(input('Time to mature - T (Default 0.5) --> ') or 0.5)
N = int(input('Periods (Default 6) --> ') or 6)
r = float(input('Interest Rate - r (Default 0.04) --> ') or 0.04)
lambdA = float(input('Lambda (Default 1.5) --> ') or 1.5)
return S,K,sigma,T,N,r,lambdA

initial_price,K,sigma,T,N,r,lambdA = usr_input()
number_of_calculation = 0
for i in range(N+2):
number_of_calculation = number_of_calculation+i
size_of_nodes = 1500
size_of_font = 12

plt.figure(figsize=(20, 10))
vals = construct_labels(initial_price, N, T, sigma, lambdA)
labels = construct_nodelabel(vals, N)
nodepos = construct_nodepos(vals)
G = construct_node(vals, N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G, pos=nodepos, node_color='skyblue', node_size=size_of_nodes, node_shape='o',
        alpha=1, font_weight="bold", font_color='darkblue', font_size=size_of_font)
plt.title('Stock price simulation')
#plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price, K, u, d, N, r, number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()

plt.figure(figsize=(20, 10))
call_vals = construct_Ecallput_node(vals, K, N, r, T, lambdA, sigma, 'call')
labels = construct_nodelabel(call_vals, N)
nodepos = construct_nodepos(call_vals)
G = construct_node(call_vals, N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G, pos=nodepos, node_color='skyblue', node_size=size_of_nodes, node_shape='o',
        alpha=1, font_weight="bold", font_color='darkblue', font_size=size_of_font)
plt.title('European call option')
#plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price, K, u, d, N, r, number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()

plt.figure(figsize=(20, 10))
call_vals = construct_Ecallput_node(vals, K, N, r, T, lambdA, sigma, 'put')
labels = construct_nodelabel(call_vals, N)
nodepos = construct_nodepos(call_vals)
G = construct_node(call_vals, N)
nx.set_node_attributes(G, labels, 'label')
nx.draw(G, pos=nodepos, node_color='skyblue', node_size=size_of_nodes, node_shape='o',
        alpha=1, font_weight="bold", font_color='darkblue', font_size=size_of_font)
plt.title('European put option')
#plt.suptitle('Price = {}, Exercise = {}, U = {}, D = {}, N = {}, Rate = {}, Number of calculation = {}'.format(initial_price, K, u, d, N, r, number_of_calculation))
nx.draw_networkx_labels(G, nodepos, labels)
plt.show()
334 16 Binomial/Trinomial Tree Option Pricing Using Python
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 337
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_17
338 17 Financial Ratio Analysis and Its Applications
2014, December 31, 2015, December 31, 2016, December 31, 2017, December 31, 2018, and December 31, 2019. The balance sheet, however, is static and therefore should be analyzed with caution in financial analysis and planning.

17.2.2 Statement of Earnings

JNJ's statement of earnings is presented in Table 17.2 and describes the results of operations for a 12-month period ending December 31. The usual income-statement periods are annual, quarterly, and monthly; Johnson & Johnson has chosen the annual approach. Both the annual and quarterly reports are used for external as well as internal reporting. The monthly statement is used primarily for internal purposes, such as the estimation of sales and profit targets, judgment of controls on expenses, and monitoring progress toward longer-term targets. The statement of earnings is more dynamic than the balance sheet because it reflects changes for the period. It provides an analyst with an overview of a firm's operations and profitability on a gross, operating, and net income basis. JNJ's income includes sales, interest income, and other income/expenses. Costs and expenses for JNJ include the cost of goods sold; selling, marketing, and administrative expenses; and depreciation, depletion, and amortization. The difference between income and costs and expenses is the company's net earnings. A comparative statement of earnings is very useful in financial analysis and planning
Table 17.2 Consolidated statements of earnings of JNJ corporation and subsidiaries (dollars in millions except per share figures)

                                                    2012     2013     2014     2015     2016     2017     2018     2019
Sales to customers ($)                            67,224   71,312   74,331   70,074   71,890   76,450   81,581   82,059
Cost of products sold                             21,658   22,342   22,746   21,536   21,685   25,354   27,091   27,556
Gross profit                                      45,566   48,970   51,585   48,538   50,101   51,011   54,490   54,503
Selling, marketing, and administrative expenses   20,869   21,830   21,954   21,203   20,067   21,520   22,540   22,178
Research expense                                   7,665    8,183    8,494    9,046    9,143   10,594   10,775   11,355
Purchased in-process research and development      1,163      580      178      224       29      408    1,126      890
Interest income                                      (64)     (74)     (67)    (128)    (368)    (385)    (611)    (357)
Interest expense, net of portion capitalized         532      482      533      552      726      934    1,005      318
Other (income) expense, net                        1,626    2,498      (70)  (2,064)     210      (42)   1,405    2,525
Restructuring                                          –        –        –      509      491      509      251      266
Earnings before provision for taxes on income     13,775   15,471   20,563   19,196   19,803   17,673   17,999   17,328
Provision for taxes on income                      3,261    1,640    4,240    3,787    3,263   16,373    2,702    2,209
Net earnings                                      10,514   13,831   16,323   15,409   16,540    1,300   15,297   15,119
Basic net earnings per share ($)                    3.50     3.76     3.67     4.62     6.04     0.48     5.70     5.72
Diluted net earnings per share ($)                  3.46     3.73     3.63     4.57     5.93     0.47     5.61     5.63
because it allows insight into the firm's operations, profitability, and financing decisions over time. For this reason, JNJ presents the statement of earnings for eight consecutive years: 2012, 2013, 2014, 2015, 2016, 2017, 2018, and 2019. Armed with this information, evaluating the firm's future is easier.

17.2.3 Statement of Equity

JNJ's statements of equity are shown in Table 17.3. These are the earnings that a firm retains for reinvestment rather than paying them out to shareholders in the form of dividends. The statement of equity is easily understood if it is viewed as a bridge between the balance sheet and the statement of earnings. The statement of equity presents a summary of those categories that have an impact on the level of retained earnings: the net earnings and the dividends declared for preferred and common stock. It also represents a summary of the firm's dividend policy and shows how net income is allocated to dividends and reinvestment. JNJ's equity is one source of funds for investment, and this internal source of funds is very important to the firm. The balance sheet, the statement of earnings, and the statement of equity allow us to analyze important firm decisions on the capital structure, cost of capital, capital budgeting, and dividend policy of that firm.

17.2.4 Statement of Cash Flows

Another extremely important part of the annual and quarterly report is the statement of cash flows. This statement is very helpful in evaluating a firm's use of its funds and in determining how these funds were raised. Statements of cash flow for JNJ are shown in Table 17.4. These statements of cash flow are composed of three sections: cash flows from operating activities, cash flows from investing activities, and
Table 17.3 Consolidated statements of equity of JNJ corporation and subsidiaries (2012–2019) (dollars in millions)

                                                   Total   Retained   Accumulated other      Common stock   Treasury
                                                           earnings   comprehensive income   issued amount  stock amount
Balance at Dec. 30, 2012                        $ 64,826     85,992        (5,810)               3,120        (18,476)
Net earnings                                      13,831     13,831           –                    –              –
Cash dividends paid                               (7,286)    (7,286)          –                    –              –
Employee compensation and stock option plans       3,285        (82)          –                    –            3,367
Repurchase of common stock                        (3,538)    (2,947)          –                    –             (591)
Payments for repurchase of common stock            3,538        –             –                    –              –
Other                                                (15)       (15)          –                    –              –
Other comprehensive income (loss), net of tax      2,950        –           2,950                  –              –
Balance at Dec. 29, 2013                        $ 74,053     89,493        (2,860)               3,120        (15,700)
Net earnings                                      16,323     16,323           –                    –              –
Cash dividends paid                               (7,768)    (7,768)          –                    –              –
Employee compensation and stock option plans       2,164       (769)          –                    –            2,933
Repurchase of common stock                        (7,124)       –             –                    –           (7,124)
Other                                                (34)       (34)          –                    –              –
Other comprehensive income (loss), net of tax     (7,862)       –          (7,862)                 –              –
Balance at Dec. 28, 2014                        $ 69,752     97,245       (10,722)               3,120        (19,891)
Net earnings                                      15,409     15,409           –                    –              –
Cash dividends paid                               (8,173)    (8,173)          –                    –              –
Employee compensation and stock option plans       1,920       (577)          –                    –            2,497
Repurchase of common stock                        (5,290)       –             –                    –           (5,290)
(continued)
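The retained-earnings column of Table 17.3 rolls forward exactly as the text describes: the ending balance equals the beginning balance plus net earnings, less dividends and the other charges shown. Checking the fiscal-2013 column:

```python
# retained-earnings column of Table 17.3, fiscal 2013 (dollars in millions)
re_dec_2012  = 85_992   # balance at Dec. 30, 2012
net_earnings = 13_831
dividends    = -7_286
stock_plans  =    -82   # employee compensation and stock option plans
repurchases  = -2_947   # retained-earnings portion of the buyback
other        =    -15

re_dec_2013 = re_dec_2012 + net_earnings + dividends + stock_plans + repurchases + other
print(re_dec_2013)   # 89493, the balance reported at Dec. 29, 2013
```

This is the articulation point the chapter returns to in Sect. 17.2.5: the statement of equity bridges the statement of earnings and the balance sheet.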
cash flows from financing activities. The statement of cash flows can be compiled by either the direct or the indirect method. Most companies, such as Johnson & Johnson, compile their cash flow statements using the indirect method. For JNJ, the sources of cash are essentially provided by operations. Applications of these funds include dividends paid to stockholders and expenditures for property, plant, equipment, etc. Therefore, this statement reveals some important aspects of the firm's investment, financing, and dividend policies, making it an important tool for financial planning and analysis.

The cash flow statement shows how the net increase or decrease in cash has been reflected in the changing composition of current assets and current liabilities. It highlights changes in short-term financial policies. It should be noted that the balance of the cash flow statement should equal the first item of the balance sheet (i.e., cash and cash equivalents). Furthermore, it is well known that investment, financing, dividend, and production policies are the four important policies in the financial management and decision-making process. Most of the information on these four policies can be obtained from the cash flow statement. For example, cash flow associated with operating activities gives information about operating and production policy. Cash flow associated with investing activities gives information about investment policy. Finally, cash flow associated with financing activities gives information about dividend and financing policy.

The statement of cash flows can be used to help resolve differences between finance and accounting theories. There is value for the analyst in viewing the statement of cash flow over time, especially in detecting trends that could lead to technical or legal bankruptcy in the future. Collectively, the balance sheet, the statement of retained earnings, the statement of equity, and the statement of cash flow present a fairly clear picture of the firm's historical and current position.

17.2.5 Interrelationship Among Four Financial Statements

It should be noted that the balance sheet, statement of earnings, statement of equity, and statement of cash flow are interrelated. These relationships are briefly described as follows:

(1) Retained earnings calculated from the statement of equity for the current period should be used to replace the retained earnings item in the balance sheet of the previous period. Therefore, the statement of equity is regarded as a bridge between the balance sheet and the statement of earnings.
(2) We need the information from the balance sheet, the statement of earnings, and the statement of equity to compile the statement of cash flow.
(3) The cash and cash equivalents item can be found in the statement of cash flow. In other words, the statement of cash flow describes how cash and cash equivalents changed during the period. It is known that the first item of the balance sheet is cash and cash equivalents.

17.2.6 Annual Versus Quarterly Financial Data

Both annual and quarterly financial data are important to financial analysts; which one is the most important depends on the time horizon of the analysis. Depending upon pattern changes in the historical data, either annual or quarterly data could prove to be more useful. It is well known that understanding the implications of using quarterly data versus annual data is important for proper financial analysis and planning.

Quarterly data has three components: trend-cycle, seasonal, and irregular or random components. It contains important information about seasonal fluctuations that "reflects an intra-year pattern of variation which is repeated constantly or in an evolving fashion from year to year." Quarterly data has the disadvantage of having a large irregular, or random, component that introduces noise into the analysis. Annual data has both the trend-cycle component and the irregular component, but it does not have the seasonal component. The irregular component is much smaller in annual data than in quarterly data. While it may seem that annual data would be more useful for long-term financial planning and analysis, seasonal data reveals important permanent patterns that underlie the short-term series in financial analysis and planning. In other words, quarterly data can be used for intermediate-term financial planning to improve financial management.

Use of either quarterly or annual data has a consistent impact on the mean-square error of regression forecasting, which is composed of variance and bias. Changing from quarterly to annual data will generally reduce variance while increasing bias. Any difference in regression results due to the use of different data must be analyzed in light of the historical patterns of fluctuation in the original time-series data.

17.3 Static Ratio Analysis

In order to make use of financial statements, an analyst needs some form of measurement for analysis. Frequently, ratios are used to relate one piece of financial data to another. The ratio puts the two pieces of data on an equivalent base, which increases the usefulness of the data. For example, net income as an absolute number is meaningless to compare across firms of different sizes. However, if one creates a net profitability ratio (NI/Sales), comparisons are easier to make. Analysis of a series of ratios will give us a clear picture of a firm's financial condition and performance.

Analysis of ratios can take one of two forms. First, the analyst can compare the ratios of one firm with those of similar firms or with industry averages at a specific point in time. This type of cross-sectional analysis may indicate the relative financial condition and performance of a firm. One must be careful, however, to analyze the ratios while keeping in mind the inherent differences between firms' production functions and operations. Also, the analyst should avoid using "rules of thumb" across industries, because the composition of industries and individual firms varies considerably. Furthermore, inconsistency in a firm's accounting procedures can cause accounting data to show substantial differences between firms, which can hinder ratio comparability. This variation in accounting procedures can also lead to problems in determining the "target ratio" (to be discussed later).

The second method of ratio comparison involves the comparison of a firm's present ratio with its past and expected ratios. This form of time-series analysis will indicate whether the firm's financial condition has improved or deteriorated. Both types of ratio analysis can take one of the two following forms: static determination and its analysis, or dynamic adjustment and its analysis. In this section, we discuss only the static determination of financial ratios. The dynamic adjustment and its analysis can be found in Lee and Lee (2017).

17.3.1 Static Determination of Financial Ratios

The static determination of financial ratios involves the calculation and analysis of ratios over a number of periods for one company, or the analysis of differences in ratios among individual firms in one industry. An analyst must be careful of extreme values in either direction because of the interrelationships between ratios. For instance, a very high liquidity ratio is costly to maintain, causing profitability ratios to be lower than they need to be. Furthermore, ratios must be interpreted in relation to the raw data from which they are calculated, particularly for ratios that sum accounts in order to arrive at the necessary data for the calculation. Even though this analysis must be performed with extreme caution, it can yield important conclusions in the analysis of a particular company. Table 17.5 presents six alternative types of ratios for Johnson & Johnson. These six categories are short-term solvency, long-term solvency, asset management, profitability ratios, market value ratios, and policy ratios. We now discuss these six types of ratios in detail.
Table 17.5 Alternative financial ratios for Johnson & Johnson (2016–2019)

Ratio classification                        Formula                                                   2019     2018      2017     2016
I. Short-term solvency, or liquidity ratios (times)
(1) Current ratio                           (Current assets)/(current liabilities)                    1.26     1.47      1.41     2.47
(2) Quick ratio                             (Cash + MS + receivables)/(current liabilities)           0.94     1.08      1.04     2.04
(3) Cash ratio                              (Cash + MS)/(current liabilities)                         0.54     0.63      0.60     1.59
(4) Net working capital to total assets     (Net working capital)/(total assets)                      0.06     0.10      0.08     0.27
II. Long-term solvency, or financial leverage ratios (times)
(5) Debt to asset                           (Total debt)/(total assets)                               0.62     0.61      0.62     0.50
(6) Debt to equity                          (Total debt)/(total equity)                               1.65     1.56      1.61     1.01
(7) Equity multiplier                       (Total assets)/(total equity)                             2.65     2.56      2.61     2.01
(8) Times interest paid                     (EBIT)/(interest expense)                                54.49    17.91     18.92    28.28
(9) Long-term debt ratio                    (Long-term debt)/(long-term debt + total equity)          0.31     0.32      0.34     0.24
(10) Cash coverage ratio                    (EBIT + depreciation)/(interest expense)                 76.53    24.80     24.96    33.45
III. Asset management, or turnover (activity) ratios (times)
(11) Day's sales in receivables
     (average collection period)            (Accounts receivable)/(sales/365)                        64.41    63.08     64.41    59.40
(12) Receivables turnover                   (Sales)/(accounts receivable)                             5.67     5.79      5.67     6.14
(13) Day's sales in inventory               (Inventory)/(cost of goods sold/365)                    119.48   115.86    126.18   137.08
(14) Inventory turnover                     (Cost of goods sold)/(inventory)                          3.05     3.15      2.89     2.66
(15) Fixed asset turnover                   (Sales)/(fixed assets)                                    4.65     4.78      4.50     4.52
(16) Total asset turnover                   (Sales)/(total assets)                                    0.52     0.53      0.49     0.51
(17) Net working capital turnover           (Sales)/(net working capital)                             8.81     5.51      6.09     1.86
IV. Profitability ratios (percentage)
(18) Profit margin                          (Net income)/(sales)                                     18.42    18.75      1.70    23.01
(19) Return on assets (ROA)                 (Net income)/(total assets)                               9.59    10.00      0.83    11.71
(20) Return on equity (ROE)                 (Net income)/(total equity)                              25.42    25.60      2.16    23.49
V. Market value ratios (times)
(21) Price-earnings ratio                   (Mkt price per share)/(earnings per share)               30.08    25.96    289.33    18.70
(22) Market-to-book ratio                   (Mkt price per share)/(book value per share)              2.88     2.60      2.39     2.19
(23) Earnings yield                         (Earnings per share)/(mkt price per share)                0.03     0.04      0.00     0.05
(24) Dividend yield                         (Dividend per share)/(mkt price per share)                0.02     0.02      0.02     0.03
(25) PEG ratio                              (Price-earnings ratio)/(earnings growth rate)           343.85   267.28  -2277.37   166.27
(26) Enterprise value-EBITDA ratio          (Enterprise value)/(EBITDA)                              18.97    17.68     18.81    14.46
(27) Dividend payout ratio                  (Dividend payout)/(net income)                            0.66     0.62      6.88     0.52
VI. Policy ratios (percentage)
(5) Debt to asset                           (Total debt)/(total assets)                              62.30    60.93     61.76    50.13
(27) Dividend payout ratio                  (Dividend payout)/(net income)                           65.59    62.06    687.92    52.12
(28) Sustainable growth rate                [(1 - payout ratio)*ROE]/[1 - (1 - payout ratio)*ROE]     9.59    10.76    -11.27    12.67
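Several entries in Table 17.5 are linked by accounting identities, so the table can be spot-checked. Using the 2019 column, with small tolerances to absorb the two-decimal rounding of the published figures:

```python
# 2019 figures from Table 17.5
debt_to_asset     = 0.6230   # ratio (5), 62.30 percent in the policy section
debt_to_equity    = 1.65     # ratio (6)
equity_multiplier = 2.65     # ratio (7)
roa               = 0.0959   # ratio (19), 9.59 percent
roe               = 0.2542   # ratio (20), 25.42 percent

# balance sheet identity (assets = debt + equity) implies:
assert abs(equity_multiplier - (1 + debt_to_equity)) < 0.01
assert abs(debt_to_equity - debt_to_asset / (1 - debt_to_asset)) < 0.01
# ROE = ROA x equity multiplier
assert abs(roe - roa * equity_multiplier) < 0.005
```

The same identities hold, within rounding, for the 2016-2018 columns.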
Short-Term Solvency, or Liquidity Ratios

Liquidity ratios are calculated from information on the balance sheet; they measure the relative strength of a firm's financial position. Crudely interpreted, these are coverage ratios that indicate the firm's ability to meet short-term obligations. The current ratio (ratio 1 in Table 17.5) is the most popular of the liquidity ratios because it is easy to calculate and has intuitive appeal. It is also the most broadly defined liquidity ratio, as it does not take into account the differences in relative liquidity among the individual components of current assets. A more specifically defined liquidity ratio is the quick, or acid-test, ratio (ratio 2), which excludes the least liquid portion of current assets, inventories. In other words, the numerator of this ratio includes cash, marketable securities (MS), and receivables. The cash ratio (ratio 3) is the ratio of the company's total cash and cash equivalents (marketable securities, MS) to its current liabilities. It is most often used as a measure of company liquidity. A strong cash ratio is useful to creditors when deciding how much debt they are willing to extend to the asking party (Investopedia.com).

The net working capital to total asset ratio (ratio 4) is the NWC divided by the total assets of the company. A relatively low value might indicate relatively low levels of liquidity.

Long-Term Solvency, or Financial Leverage Ratios

If an analyst wishes to measure the extent of a firm's debt financing, a leverage ratio is the appropriate tool to use. This group of ratios reflects the financial risk posture of the firm. The two sources of data from which these ratios can be calculated are the balance sheet and the statement of earnings.

The balance sheet leverage ratios measure the proportion of debt incorporated into the capital structure. The debt-equity ratio measures the proportion of debt that is matched by equity; thus this ratio reflects the composition of the capital structure. The debt-asset ratio (ratio 5), on the other hand, measures the proportion of debt-financed assets currently being used by the firm. Other commonly used leverage ratios include the equity multiplier ratio (7) and the times interest paid ratio (8).

Debt to equity (6) is a variation of the total debt ratio: total debt divided by total equity.

The long-term debt ratio (9) is long-term debt divided by the sum of long-term debt and total equity.

The cash coverage ratio (10) is defined as the sum of EBIT and depreciation divided by interest expense. The numerator is often abbreviated as EBITDA.

The income-statement leverage ratios measure the firm's ability to meet fixed obligations of one form or another. The times interest paid ratio, which is earnings before interest and taxes over interest expense, measures the firm's ability to service the interest expense on its outstanding debt. A more broadly defined ratio of this type is the fixed-charge coverage ratio, which includes not only the interest expense but also all other expenses that the firm is obligated by contract to pay. (This ratio is not included in Table 17.5 because there is not enough information on fixed charges for these firms to calculate it.)

Asset Management, or Turnover (Activity) Ratios

This group of ratios measures how efficiently the firm is utilizing its assets. With activity ratios, one must be particularly careful about the interpretation of extreme results in either direction; very high values may indicate possible problems in the long term, and very low values may indicate a current problem of low sales or of not taking a loss for obsolete assets. The reason that high activity may not be good in the long term is that the firm may not be able to adjust to an even higher level of activity and therefore may miss out on a market opportunity. Better analysis and planning can help a firm get around this problem.

The days-in-accounts-receivable, or average collection period, ratio (11) indicates the firm's effectiveness in collecting its credit sales. The other activity ratios measure the firm's efficiency in generating sales with its current level of assets; they are appropriately termed turnover ratios. While many turnover ratios can be calculated, there are three basic ones: inventory turnover (14), fixed assets turnover (15), and total assets turnover (16). Each of these ratios measures a different aspect of the firm's efficiency in managing its assets.

Receivables turnover (12) is computed as credit sales divided by accounts receivable. In general, a higher accounts receivable turnover suggests more frequent payment of receivables by customers.

In general, analysts look for higher receivables turnover and shorter collection periods, but this combination may imply that the firm's credit policy is too strict, allowing only the lowest-risk customers to buy on credit. Although this strategy could minimize credit losses, it may hurt overall sales, profits, and shareholder wealth.

The day's sales in inventory ratio (13) estimates how many days, on average, a product sits in inventory before it is sold.

Net working capital turnover (17) measures how many dollars of sales each dollar of net working capital generates. For example, if this ratio is 3, each dollar of net working capital generates $3 of sales.
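As the discussion above suggests, the collection-period and turnover ratios carry the same information: day's sales in receivables is 365 divided by receivables turnover, and day's sales in inventory is 365 divided by inventory turnover. Checking the 2019 column of Table 17.5 (the small gaps reflect rounding of the published turnovers to two decimals):

```python
# 2019 figures from Table 17.5
receivables_turnover = 5.67    # ratio (12)
days_in_receivables  = 64.41   # ratio (11)
inventory_turnover   = 3.05    # ratio (14)
days_in_inventory    = 119.48  # ratio (13)

# each "days" ratio is 365 divided by the corresponding turnover
assert abs(365 / receivables_turnover - days_in_receivables) < 0.5
assert abs(365 / inventory_turnover - days_in_inventory) < 0.5
```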
assets from the balance sheet identity. Total market value of equity = market price per share times the basic number of shares outstanding.

Enterprise value is often used to calculate the enterprise value-EBITDA ratio (26):

Enterprise value-EBITDA ratio = (Enterprise value)/(EBITDA)

where EBITDA is defined as earnings before interest, taxes, depreciation, and amortization. This ratio is similar to the PE ratio, but it relates the value of all the operating assets to a measure of the operating cash flow generated by those assets.

Policy Ratios

Policy ratios include the debt-to-asset ratio, the dividend payout ratio, and the sustainable growth rate. The debt-to-asset ratio has been discussed in Group II of Table 17.5. The dividend payout ratio is defined as (dividend payout)/(net income): the ratio of the total amount of dividends paid out to shareholders relative to the net income of the company. It is the percentage of earnings paid to shareholders in dividends. The amount that is not paid to shareholders is retained by the company to pay off debt or to reinvest in core operations. It is sometimes simply referred to as the "payout ratio."

The sustainable growth rate is defined as [(1 - payout ratio) * ROE]/[1 - (1 - payout ratio) * ROE]. Appendix 2B will discuss the sustainable growth rate in further detail.

Table 17.5 summarizes all 28 ratios for Johnson & Johnson during 2016, 2017, 2018, and 2019. Appendix 2A shows how to use Excel to calculate the first 26 ratios with the 2018 and 2019 data from JNJ's financial statements.

Estimation of the Target of a Ratio

An issue that must be addressed at this point is the determination of an appropriate proxy for the target of a ratio. For an analyst, this can be an insurmountable problem if the firm is extremely diversified and does not have one or two major product lines in industries where industry averages are available. One possible solution is to determine the relative industry share of each division or major product line, then apply these percentages to the related industry averages, and lastly derive one target ratio for the firm as a whole with which its ratio can be compared. One must be very careful in any such analysis, because the proxy may be extremely over- or underestimated. The analyst can also use Standard Industrial Classification (SIC) codes to properly define the industry of diversified firms. The analyst can then use 3- or 4-digit codes and compute their own weighted industry average.

Often an industry average is used as a proxy for the target ratio. This can lead to another problem, the inappropriate calculation of an industry average, even when the industry and companies are fairly well defined. The issue here is the appropriate weighting scheme for combining the individual company ratios in order to arrive at one industry average. Individual ratios can be weighted according to equal weights, asset weights, or sales weights. The analyst must determine the extent to which firm size, as measured by asset base or market share, affects the relative level of a firm's ratios and the tendency of other firms in the industry to adjust toward the target level of this ratio. One way this can be done is to calculate the coefficients of variation for a number of ratios under each of the weighting schemes and to compare them to see which scheme consistently has the lowest coefficient of variation; this would appear to be the most appropriate weighting scheme. Of course, one could also use a different weighting scheme for each ratio, but this would be very tedious if many ratios were to be analyzed. Note that the median, rather than the average or mean, can be used to avoid needless complications with respect to extreme values that might distort the computation of averages.

Dynamic financial ratio analysis compares individual company ratios with industry averages over time. In general, this kind of analysis needs to rely upon regression analysis. Lee and Lee (2017, Chap. 2) have discussed this kind of analysis in detail.

17.4 Two Possible Methods to Estimate the Sustainable Growth Rate

The sustainable growth rate (SGR) can be estimated either by (i) using both external and internal sources of funds or by (ii) using only internal sources of funds. We present these two methods in detail as follows:

Method 1: The sustainable growth rate with both external and internal sources of funds can be defined as (Lee 2017)

SGR = (Retention Rate * ROE)/[1 - (Retention Rate * ROE)]
    = [(1 - Dividend Payout Ratio) * ROE]/{1 - [(1 - Dividend Payout Ratio) * ROE]}    (17.1)

Dividend Payout Ratio = Dividends/Net Income
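Equation (17.1) translates directly into code. The sketch below implements both estimation methods named at the start of this section, using JNJ's fiscal-2019 inputs (net income 15,119; total equity 59,471; dividends 9,917, all in millions):

```python
def sgr_method1(roe, payout_ratio):
    # Eq. (17.1): growth financed by both internal and external funds
    g = (1 - payout_ratio) * roe        # retention rate times ROE
    return g / (1 - g)

def sgr_method2(roe, payout_ratio):
    # growth financed by internal funds only
    return roe * (1 - payout_ratio)

# JNJ fiscal 2019 (dollars in millions)
roe = 15_119 / 59_471                   # net income / total equity, about 0.2542
payout_ratio = 9_917 / 15_119           # dividends / net income, about 0.6559

print(round(sgr_method1(roe, payout_ratio), 4))   # 0.0959
print(round(sgr_method2(roe, payout_ratio), 4))   # 0.0875
```

Since the denominator of Eq. (17.1) is less than one, method 1 always returns at least as large a growth rate as method 2, which the numbers confirm.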
Method 2: The sustainable growth rate considering only the internal source of funds

ROE = Net Income/Total Equity
ROE = (Net Income/Assets) * (Assets/Equity)
ROE = (Net Income/Sales) * (Sales/Assets) * (Assets/Equity)    (17.2)

SGR = ROE * (1 - Dividend Payout Ratio)

Example

With the data from JNJ's financial statements for the 2019 fiscal year, we obtain

ROE = Net Income/Total Equity = 15,119/59,471 = 0.2542

Dividend Payout Ratio = Dividends/Net Income = 9,917/15,119 = 0.6559

According to method 1: SGR = [(1 - 0.6559) * 0.2542]/{1 - [(1 - 0.6559) * 0.2542]} = 0.0959

According to method 2: SGR = 0.2542 * (1 - 0.6559) = 0.0875

The difference between method 1 and method 2

Technically, since ROE * (1 - D) is the numerator of ROE(1 - D)/[1 - ROE(1 - D)] and 1 > [1 - ROE(1 - D)] > 0, it is easy to prove that ROE(1 - D)/[1 - ROE(1 - D)] ≥ ROE(1 - D).

In addition, we can transform ROE(1 - D)/[1 - ROE(1 - D)] into Retained Earnings/(Equity - Retained Earnings) and ROE(1 - D) into Retained Earnings/Equity. It is obvious that Retained Earnings/(Equity - Retained Earnings) ≥ Retained Earnings/Equity, since Equity - Retained Earnings ≤ Equity. If we use the equity value at the end of this year, then (Equity - Retained Earnings) can be interpreted as the equity value at the beginning of this year under the condition of no external financing.

Consequently, the SGR from method 1 is usually greater than that from method 2. The numerical result 0.0959 > 0.0875 confirms this. In Appendix 17.2, we use Excel to show how to calculate SGR with the two methods.

17.5 DFL, DOL, and DCL

It is well known that financial leverage can lead to higher expected earnings for a corporation's stockholders. The use of borrowed funds to generate higher earnings is known as financial leverage. But this is not the only form of leverage available to increase corporate earnings. Another form is operating leverage, which pertains to the proportion of the firm's fixed operating costs. In this section, we discuss the degree of financial leverage (DFL), the degree of operating leverage (DOL), and the degree of combined leverage (DCL).

17.5.1 Degree of Financial Leverage

Suppose that a levered corporation improves on its performance of the previous year by increasing its operating income by 1 percent. What is the effect on earnings per share? If you answered "a 1 percent increase," you have ignored the influence of leverage. To illustrate, consider the corporation of Table 17.6. In the current year, as we saw earlier, this firm produces earnings per share of $2.49.

The firm's operating performance improves next year, to the extent that earnings before interest and taxes increase by 1 percent, from $270 million to $272.7 million. Other relevant factors are unchanged. Interest payments are $104 million, and with a corporate tax rate of 40 percent, 60 percent of earnings after interest are available for distribution to stockholders. Thus, earnings available to stockholders = 0.60 * (272.7 - 104) = $101.22 million. Therefore, with 40 million shares outstanding, earnings per share next year will be

EPS = $101.22/40 = $2.5305

Hence, the percentage increase in earnings per share is

%change in EPS = [(2.5305 - 2.49)/2.49] * 100 = 1.6265%

We see that a 1 percent increase in EBIT leads to a greater percentage increase in EPS. The reason is that none of the increased earnings need be paid to debtholders. All of this increase goes to equity holders, who therefore benefit disproportionately. The argument is symmetrical: if EBIT were to fall by 1 percent, then EPS would fall by 1.6265%.

The extent to which a given percentage increase in operating income produces a greater percentage increase in earnings per share provides a measure of the effect of leverage on stockholders' earnings. This is known as the degree of financial leverage (DFL) and is defined as
350 17 Financial Ratio Analysis and Its Applications
DFL = %change in EPS / %change in EBIT

We now develop an expression for the degree of financial leverage. Suppose that a firm has earnings before interest and tax of EBIT, and debt of B, on which interest is paid at rate i. If the corporate tax rate is τ_c, then
Comparing Eqs. (17.3) and (17.4), the increase in earnings available to stockholders is

(1 − τ_c)(1.01 EBIT − iB) − (1 − τ_c)(EBIT − iB) = .01(1 − τ_c)EBIT

It follows that the percentage change in stockholders' earnings, and hence in earnings per share, is

%change in EPS = [.01(1 − τ_c)EBIT / ((1 − τ_c)(EBIT − iB))] × 100 = [.01 EBIT / (EBIT − iB)] × 100

Since the increase in EBIT is 1 percent, it follows from our definition that the degree of financial leverage is

DFL = .01 EBIT / [.01(EBIT − iB)] = EBIT / (EBIT − iB) = 1.6265    (17.5)

Thus, the degree of financial leverage can be found as the ratio of net operating income to income remaining after interest payments on debt. This is illustrated in Fig. 17.1, which plots the degree of financial leverage against interest payments for a given level of net operating income. If there are no interest payments, so that the firm is unlevered, DFL is 1. That is, each 1 percent increase in earnings before interest and tax leads to a 1 percent increase in earnings per share. As interest payments increase, so does the degree of financial leverage, to the point where, if interest payments equal net operating income, DFL is infinite. This is not surprising, for in this case there would be no earnings available to stockholders. Hence, any increase in net operating income would, proportionately, yield an infinitely large improvement. The relationship between DFL and interest payments is presented in Fig. 17.1.

17.5.2 Operating Leverage and the Combined Effect

Net earnings are the difference between total sales value and total operating costs. We now look in detail at operating costs, which we break down into two components: fixed costs and variable costs. Fixed costs are costs that the firm must incur, whatever its level of production. Such costs include rent and equipment depreciation. Variable costs are costs that increase with production, such as wages. The mix of fixed and variable costs in a firm's total operating cost structure provides operating leverage. Let us consider a firm with a single product, under the following conditions:

• The firm incurs fixed costs F, which must be paid whatever the level of output.
• Each unit of output costs an additional amount V.
• Each unit of output can be sold at price P.
• A total of Q units of output are produced and sold.

XYZ Corporation produces parts for the automobile industry. Information for this corporation can be found in Table 17.6. Its current net operating income is derived from the sale of 10 million units, priced at $150 each. Operating costs consist of $310 million of fixed costs and variable costs of $92 per unit.

Suppose this corporation increases its sales volume by 1 percent to 10.1 million units next year, with other factors unchanged. Would you guess that earnings before interest and tax also increase by 1 percent? In fact, net operating income will rise by more than 1 percent. The reason is that while the value of sales and variable operating costs increase proportionately, fixed operating costs remain unchanged. These costs, then, constitute a source of operating leverage. The greater the share of total cost attributable to fixed costs, the greater this leverage.

The extent to which a given percentage increase in sales volume produces a greater percentage increase in earnings before interest and taxes is used to measure the degree of operating leverage. The degree of operating leverage (DOL) is given by the ratio of these two percentage changes. By comparison with (11.14), the increase in EBIT is .01Q(P − V). It follows that

%change in EBIT = [.01Q(P − V) / (Q(P − V) − F)] × 100 = Q(P − V) / (Q(P − V) − F)
1. The mean of the EPS distribution for the firm with the higher degree of financial leverage exceeds the mean of the other firm. This reflects the potential for higher expected EPS resulting from financial leverage.
2. The variance of the EPS distribution is higher for the firm with the greater degree of financial leverage. This reflects the increase in financial risk resulting from financial leverage.

Thus, the overall risk faced by corporate stockholders is a combination of business risk and financial risk. We might think of the possibility of a trade-off between these two types of risk. Suppose that a firm operates in a risky business environment. Perhaps it trades in volatile markets and is highly capital-intensive, so that a large proportion of its costs are fixed. This riskiness will be exacerbated if the firm also has substantial debt, so that the firm has considerable financial risk. On the other hand, a low degree of financial leverage, and hence of financial risk, can mitigate the impact of high business risk on the overall riskiness of stockholders' equity. Management of a corporation subject to low business risk might feel more sanguine about taking on additional debt and thereby increasing financial risk.

17.6 Summary

This chapter reviews economic, financial, market, and accounting information to provide some environmental background to understand and apply sound financial management. Also covered are financial ratios, cost-volume-profit (CVP) analysis, break-even analysis, and degree of leverage (DOL) analysis. Financial ratios are an important tool by which managers and investors evaluate a firm's market value as well as understand the reasons for the fluctuations of the firm's market value. Factors that affect the industry in general and the firm in particular should be investigated. The best way to understand the common factors is to study economic information associated with the fluctuations or to look at the leading indicators. Accounting information, market information, and economic information are the three basic sources of data used in the financial decision-making process. In addition to analyzing the various types of information at one point in time and over time, the financial analyst is also interested in how the information changes over time. This area of study is known as dynamic analysis, and a detailed discussion can be found in Lee and Lee (2017).

Appendix 17.1: Calculate 26 Financial Ratios with Excel

In this appendix, we use the data of the 2018 and 2019 fiscal years from the Johnson & Johnson annual report as the example and show how to calculate the 26 basic financial ratios across five groups. The following figure lists 21 basic input variables from the financial statements of fiscal years 2019 and 2018. Column A gives the name of each input variable, column B shows the value of each variable in 2019, and column C shows that in 2018.
Liquidity Ratio
First, we focus on the liquidity ratios, which measure the relative strength of a firm's financial position. This group usually includes the current ratio, quick ratio, cash ratio, and net working capital to total assets ratio. The formula for each ratio is defined as follows:

Current ratio (CR) = Current assets / Current liabilities

Quick ratio = (Cash + MS + Receivables) / Current liabilities

Cash ratio = (Cash + MS) / Current liabilities

Similarly, we can compute the Quick ratio and Cash ratio as the following two figures instruct. Compared with calculating the current ratio, the only difference for computing the Quick ratio or the Cash ratio is that a different numerator is used. We have to use the sum of Cash and cash equivalents and Marketable securities [= (B5 + B6)] as the numerator in order to calculate the Cash ratio, or use the sum of Cash and cash equivalents, Marketable securities, and Accounts receivable [= (B5 + B6 + B7)] as the numerator in order to calculate the Quick ratio.
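The same three liquidity ratios can be sketched in Python. The balance-sheet figures below are hypothetical placeholders chosen for illustration, not JNJ's actual statement values:

```python
# Liquidity ratios from the formulas above. The dollar amounts are
# hypothetical placeholders (in $ millions), not actual JNJ figures.

cash = 19_000.0          # cash and cash equivalents
ms = 1_000.0             # marketable securities
receivables = 14_500.0   # accounts receivable
current_assets = 45_000.0
current_liabilities = 36_000.0

current_ratio = current_assets / current_liabilities
quick_ratio = (cash + ms + receivables) / current_liabilities
cash_ratio = (cash + ms) / current_liabilities

print(round(current_ratio, 4))  # 1.25
print(round(quick_ratio, 4))    # 0.9583
print(round(cash_ratio, 4))     # 0.5556
```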
For the net working capital to total assets ratio, we first need to calculate net working capital and then divide it by total assets. As net working capital is defined as current assets minus current liabilities, we compute this ratio by inputting "= (B3 − B4)/B8," which gives us 0.06 in the figure below.
Debt to Asset = Total liabilities / Total assets

Debt to Equity = Total liabilities / Total equity

Equity Multiplier = Total assets / Total equity

Times interest paid = EBIT / Interest expense

For the first four ratios, the calculations are quite simple. We input "= B9/B8" to get 0.6230 for the Debt to Asset ratio, "= B9/B10" to get 1.6522 for the Debt to Equity ratio, "= B8/B10" to get 2.6522 for the Equity Multiplier, and "= B11/B13" to get 54.4906 for the Times interest paid.

The following figure shows how to calculate the long-term debt ratio. We input "= B14/(B14 + B10)" in an empty cell, where (B14 + B10) equals the sum of long-term debt and total equity. Excel gives us 0.3082.
Similarly, the Cash coverage ratio can be computed based on the formula by inputting “= (B11 + B15)/B13.” Then we
obtain 76.5314 as the value of this ratio.
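The first three leverage ratios can also be reproduced in Python. The dollar figures below are not read from the spreadsheet; they are backed out from the ratios reported above (equity of 59,471 times the equity multiplier of 2.6522 gives total assets of roughly 157,728, and liabilities are the difference), so small rounding differences are expected:

```python
# Leverage ratios, using figures (in $ millions) backed out from the
# ratios quoted in the text; treat them as approximations.

total_equity = 59_471.0
total_assets = 157_728.0  # approximately equity x equity multiplier (2.6522)
total_liabilities = total_assets - total_equity

debt_to_asset = total_liabilities / total_assets
debt_to_equity = total_liabilities / total_equity
equity_multiplier = total_assets / total_equity

print(round(debt_to_asset, 3))      # close to the text's 0.6230
print(round(debt_to_equity, 4))     # 1.6522
print(round(equity_multiplier, 4))  # 2.6522
```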
In order to calculate the Net Working Capital Turnover, we input "= B16/(B3 − B4)," since "B3 − B4" equals the net working capital of JNJ in 2019. Excel shows the final value of 8.81.
Profitability Ratios
Profit Margin = Net Income / Sales

Return on Equity = Net Income / Total equity

Return on Asset = Net Income / Total assets

As with the ratios computed before, we only need to divide one variable (X1) by another (X2) by inputting "= X1/X2" to obtain these ratios. The figure below gives an example of how to calculate the Profit Margin (0.18). ROA and ROE can be obtained in a similar way.
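As a cross-check, the three profitability ratios can be computed in Python. Net income and total equity come from the text; the sales and total assets figures are our assumptions, chosen to be consistent with the profit margin of about 0.18 reported above:

```python
# Profitability ratios. Net income and equity are from the text;
# sales and total assets are assumed figures used for illustration.

net_income = 15_119.0     # $ millions (from the text)
total_equity = 59_471.0   # $ millions (from the text)
sales = 82_059.0          # assumed; consistent with the text's ~0.18 margin
total_assets = 157_728.0  # assumed (equity x equity multiplier)

profit_margin = net_income / sales
roe = net_income / total_equity
roa = net_income / total_assets

print(round(profit_margin, 2))  # 0.18
print(round(roe, 4))            # 0.2542
print(round(roa, 4))
```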
PEG ratio = PE ratio / Earnings growth rate

Enterprise-EBITDA ratio = Enterprise value / EBITDA
The following two figures show how to compute the PE ratio and MB ratio. Since the price per share is input into cell B23, we only need to find EPS or book value per share. According to the definition of EPS, it is computed by dividing net income by total shares (= B20/B22). Similarly, book value per share can be obtained by inputting "= B8/B22." In order to calculate the PE ratio or MB ratio in one step, we directly input "= B23/(B20/B22)" or "= B23/(B8/B22)," respectively. The values are 30.0774 and 2.8831, respectively.
Additionally, the Earnings yield is simply the reciprocal of the PE ratio, so we obtain 1/30.0774 = 0.03325, and the Dividend yield can be computed by inputting "= (B21/B22)/B23," which equals 0.0218. The following figure shows the result.
For the enterprise-EBITDA ratio, we first calculate the enterprise value in the numerator according to the definition "Total market value of equity + Book value of total liabilities − Cash" and then input "= B22*B23 + B9 − B5" into an empty cell. Next, we divide the enterprise value by EBITDA, so the one-step formula is "= (B22*B23 + B9 − B5)/B12." Excel gives us the value of 18.9793.
The last ratio is the PEG ratio, which equals the PE ratio divided by the sustainable growth rate. Since we already have the PE ratio, we only need to find the value of the sustainable growth rate. Based on the formula sustainable growth rate = ROE × (1 − dividend payout ratio), we input "= H28*(1 − B21/B20)" in cell B35 to get the value of the sustainable growth rate (0.0875). The figure below shows the result.
Therefore, we get the PEG ratio by inputting "= H31/B35," which equals 343.8547. The result is as follows.
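The same computation can be mirrored in Python; the PE ratio and sustainable growth rate are the values computed above, and the small gap versus the quoted 343.8547 comes from the PE ratio being rounded to four decimals:

```python
# PEG ratio = PE ratio / sustainable growth rate, using the values above.

pe_ratio = 30.0774  # from the PE calculation above (rounded)
sgr = 0.087471204   # method-2 sustainable growth rate from Appendix 17.2
earnings_yield = 1 / pe_ratio
peg = pe_ratio / sgr

print(round(earnings_yield, 5))  # 0.03325
print(round(peg, 2))             # close to the text's 343.8547
```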
Appendix 17.2: Using Excel to Calculate Sustainable Growth Rate

The sustainable growth rate (SGR) can be estimated by either (i) using both external and internal sources of funds or (ii) using only the internal source of funds. We present these two methods in detail as follows:

Method 1: The sustainable growth rate with both external and internal sources of funds can be defined as (Lee 2017):

SGR = (Retention Rate × ROE) / [1 − (Retention Rate × ROE)]
    = [(1 − Dividend Payout Ratio) × ROE] / {1 − [(1 − Dividend Payout Ratio) × ROE]}    (17A.1)

Dividend Payout Ratio = Dividends/Net Income

Method 2: The sustainable growth rate considering only the internal source of funds:

ROE = Net Income/Total Equity
ROE = (Net Income/Assets) × (Assets/Equity)
ROE = (Net Income/Sales) × (Sales/Assets) × (Assets/Equity)
SGR = (Net Income/Sales) × (Retention Rate) × (Sales/Assets) × (Assets/Equity)
    = ROE × (1 − Dividend Payout Ratio)    (17A.2)

Example:

With the data from the JNJ financial statement for the 2019 fiscal year, we obtain

ROE = Net Income/Total Equity = 15,119/59,471 = 0.2542
Dividend Payout Ratio = Dividends/Net Income = 9,917/15,119 = 0.6559
According to method 1, SGR = (1 − 0.6559) × 0.2542 / [1 − (1 − 0.6559) × 0.2542] = 0.0959
According to method 2, SGR = 0.2542 × (1 − 0.6559) = 0.0875

The difference between method 1 and method 2

Technically, as ROE(1 − D) is the numerator of ROE(1 − D)/[1 − ROE(1 − D)] and 1 > 1 − ROE(1 − D) > 0, it is easy to prove that ROE(1 − D)/[1 − ROE(1 − D)] ≥ ROE(1 − D). In addition, we can transform ROE(1 − D)/[1 − ROE(1 − D)] into Retained Earnings/(Equity − Retained Earnings) and transform ROE(1 − D) into Retained Earnings/Equity. It is obvious that Retained Earnings/(Equity − Retained Earnings) ≥ Retained Earnings/Equity since Equity − Retained Earnings ≤ Equity. If we use the equity value at the end of this year, then (Equity − Retained Earnings) can be interpreted as the equity value at the beginning of this year under the condition of no external finance. Consequently, the SGR from method 1 is usually greater than that from method 2. The numerical result 0.0959 > 0.0875 confirms this.
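The two methods of Eqs. (17A.1) and (17A.2) translate directly into Python; the function names are ours, and the inputs are the JNJ 2019 figures used above:

```python
# The two SGR methods from Eqs. (17A.1) and (17A.2), checked with the
# JNJ 2019 figures used in the text (in $ millions).

def sgr_method1(roe, payout_ratio):
    """SGR with both external and internal sources of funds (Eq. 17A.1)."""
    retained = (1 - payout_ratio) * roe
    return retained / (1 - retained)

def sgr_method2(roe, payout_ratio):
    """SGR with only the internal source of funds (Eq. 17A.2)."""
    return roe * (1 - payout_ratio)

roe = 15_119 / 59_471    # 0.2542
payout = 9_917 / 15_119  # 0.6559

m1 = sgr_method1(roe, payout)
m2 = sgr_method2(roe, payout)
print(round(m1, 4), round(m2, 4))  # 0.0959 0.0875
```

As the appendix argues, method 1 always exceeds method 2 whenever 0 < ROE(1 − D) < 1, since dividing by 1 − ROE(1 − D) < 1 scales the same numerator up.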
First, we calculate the dividend payout ratio by inputting "= B21/B20." We compute the SGR with method 1 by inputting "= ((1 − B26)*H28)/(1 − ((1 − B26)*H28))" and obtain 0.0958558, and with method 2 by inputting "= H28*(1 − B26)" and obtain 0.087471204. The following figures show the calculation.
2. Briefly discuss the definition of liquidity, asset management, capital structure, profitability, and market value ratio. What can we learn from examining the financial ratio information of GM in 1984 and 1985 as listed in Table 17.6?
3. Discuss the major difference between the linear and nonlinear break-even analysis.
4. ABC Company's financial records are as follows:
   Quantity of goods sold = 10,000
   Price per unit sold = $20
   Variable cost per unit sold = $10
   Total amount of fixed cost = $50,000
   Corporate tax rate = 50%
   a. Calculate EAIT.
   b. What is the break-even quantity?
   c. What is the DOL?
   d. Should the ABC Company produce more for greater profits?
5. ABC Company's predictions for next year are as follows:

             Probability   Quantity   Price   Variable cost/unit   Corporate tax rate
   State 1       0.3         1,000     $10           $5                   .5
   State 2       0.4         2,000     $20          $10                   .5
   State 3       0.3         3,000     $30          $15                   .5

   In addition, we also know that the fixed cost is $15,000. What is next year's expected EAIT?
6. Use an example to discuss four alternative depreciation methods.
7. XYX, Inc. currently produces one product that sells for $330 per unit. The company's fixed costs are $80,000 per year; variable costs are $210 per unit. A salesman has offered to sell the company a new piece of equipment which will increase fixed costs to $100,000. The salesman claims that the company's break-even number of units sold will not be altered if the company purchases the equipment and raises its price (assuming variable costs remain the same).
   a. Find the company's current break-even level of units sold.
   b. Find the company's new price if the equipment is purchased and prove that the break-even level has not changed.
8. Consider the following financial data of a corporation:
   Sales = $500,000
   Quantity = 25,000
   Variable cost = $300,000
   Fixed cost = $50,000
   a. Calculate the DOL at the above quantity of output.
   b. Find the break-even quantity and sales levels.
9. On the basis of the following firm and industry norm ratios, identify the problem that exists for the firm:

   Ratio                       Firm      Industry
   Total asset utilization     2.0       3.5
   Average collection period   45 days   46 days
   Inventory turnover          6 times   6 times
   Fixed asset utilization     4.5       7.0

10. The financial ratios for Wallace, Inc., a manufacturer of consumer household products, are given below along with the industry norm:

   Ratio                       1986      1987      1988      Industry
   Current ratio               1.44      1.31      1.47      1.2
   Quick ratio                 .66       .62       .65       .63
   Average collection period   33 days   37 days   32 days   34 days
   Inventory turnover          7.86      7.62      7.72      7.6
   Fixed asset turnover        2.60      2.44      2.56      2.8
   Total asset utilization     1.24      1.18      1.40      1.20
   Debt to total equity        1.24      1.14      .84       1.00
   Debt to total assets        .56       .54       .46       .50
   Times interest earned       2.75      5.57      7.08      5.00
   Return on total assets      .02       .06       .07       .06
   Return on equity            .06       .12       .12       .13
   Net profit margin           .02       .05       .05       .05

   Analyze Wallace's ratios over the three-year period for each of the following categories:
   a. Liquidity
   b. Asset utilization
   c. Financial leverage
   d. Profitability
11. Below are the Balance Sheet and the Income Statement for Nelson Manufacturing:
Appendix 17.3: How to Compute DOL, DFL, and DCL with Excel 367
Stockholder’s Equity
Preferred stock 5,000
Common stock (at par) 175,000
Retained earnings 341,000
Total Stockholder’s Equity $ 521,000
References
Johnson & Johnson (2016, 2017, 2018, 2019). Annual Reports.
Lee, C. F., & Lee, J. (2017). Financial Analysis and Planning: Theory and Application. Singapore: World Scientific.
18 Time Value of Money Determinations and Their Applications
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 369
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_18
Borrowers view interest as a payment or rental fee for the privilege of being able to have the immediate use of cash that they otherwise would have to save over time.

We answered the question posed at the beginning of this section by noting that if the risk-free annual interest rate is 8%, then $1,000 today will be worth $1,080 a year from now. This is calculated as follows:

(1 + .08) × 1,000 = $1,080

We can turn this statement around to determine the value today of $1,000 received a year from now; that is, the present value of this future receipt. To do this, it is necessary to determine how much we would have to invest today, at 8% annual interest, to obtain $1,000 in a year's time. This is done as follows:

1,000/1.08 = $925.93

Therefore, to the nearest cent, given an interest rate of 8%, the present value of $1,000 a year from now is $925.93.

The concept of present value is crucial in corporate finance. Investors commit resources now in the expectation of receiving future earnings flows. To properly evaluate the returns from an investment, it is necessary to consider that returns are derived in the future. These future monetary amounts must be expressed in present value terms to assess the worth of the investment when compared to its cost or current market value. Additionally, the cash receipts received at different points in time are not directly comparable without employing the present value (PV) method.

18.3 Foundation of Net Present Value Rules

We begin our study of procedures for determining present values with a simple example. Suppose that an individual, or a firm, has the opportunity to invest C0 dollars today in a project that will yield a return of C1 dollars in one year. Assume further that the risk-free annual interest rate, expressed as a percentage, is r. To evaluate this investment, we need to know the present value of the future return of C1 dollars. In general, for each dollar invested today at interest rate r, we would receive in one year's time an amount

future value per dollar = (1 + r)

The term (1 + r) is an important enough variable in finance to warrant its own name. It is called a wealth relative and is part of all present value formulations. Returning to our discussion, it follows that the present value of a dollar to be received in one year is

present value per dollar = 1/(1 + r)

Therefore, the present value of C1 dollars to be received in the future is

PV = C1/(1 + r)

In assessing our proposed investment, the present value of the return must be compared with the amount invested. The difference between these two quantities is called the net present value of the investment. For the convenience of notation, we will write

C0 = −cost

so that C0, which is negative, represents the "cost" today. The net present value, then, is the sum of today's "cost" and the present value of the future return; that is

NPV = C0 + C1/(1 + r)

Provided this quantity is positive, the investment is worth making.

As another example, suppose you are offered the opportunity today to invest $1,000, with an assured return of $1,100 one year from now and a risk-free interest rate of 8%. In our notation, then, C0 = −1,000; C1 = 1,100; and r = .08. The present value of the $1,100 return is

C1/(1 + r) = 1,100/1.08 = $1,018.52

where again we have rounded to the nearest cent. Thus, it requires $1,018.52 invested at 8% to yield $1,100 in one year. Therefore, the net present value of our investment opportunity is

NPV = C0 + C1/(1 + r) = −1,000 + 1,018.52 = $18.52

Offering you this investment opportunity then is equivalent to an increase of $18.52 in your current wealth.

This example is quite restrictive in that it assumes that all of the investment's returns will be realized precisely in one year. In the next section, we see how this situation can be generalized to allow for the possibility of returns spread over time.
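The one-period NPV rule above can be sketched in a few lines of Python, using the $1,000-for-$1,100 example; the function name is ours:

```python
# One-period NPV, following the example: invest $1,000 today for an
# assured $1,100 in one year at a risk-free rate of 8%.

def npv_one_period(c0, c1, r):
    """NPV = C0 + C1/(1 + r), with C0 negative for an outflow."""
    return c0 + c1 / (1 + r)

pv = 1_100 / 1.08
npv = npv_one_period(-1_000, 1_100, 0.08)
print(round(pv, 2))   # 1018.52
print(round(npv, 2))  # 18.52
```

Since the NPV is positive, the rule says the investment is worth making.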
18.4 Compounding and Discounting Processes 371
Suppose that $1 is invested today for a period of t years at a risk-free annual interest rate r_t, with interest to be compounded annually. How much money will be returned at the end of t years? We find the answer by proceeding in annual steps. At the end of the first year, an amount of interest r1 is added, giving a total of $(1 + r1). Since interest is compounded, second-year interest is paid on this whole amount, so that the interest paid at the end of the second year is $r2(1 + r1). Hence, the total amount at the end of the second year is

future value per dollar after 2 years = (1 + r1) + r2(1 + r1) = 1 + r1 + r2 + r1r2

In words, the future value in two years comprises four quantities: the value you started with, $1; the interest accrued on the principal during the first year, r1; the interest earned on the principal during the second year, r2; and the interest earned during the second year on the first year's interest, r1r2. If the interest rate is constant—that is, r1 = r2 = r_t—then the compound term r1r2 can be written r_t². This assumes that the term structure of interest rates is flat. Continuing in this way, interest paid at the end of the third year is $r3(1 + r_t)², so that

future value per dollar after 3 years = (1 + r_t)² + r3(1 + r_t)²
    = 1 + r1 + r2 + r3 + r1r2 + r1r3 + r2r3 + r1r2r3
    = (1 + r_t)³

In words, the future value in three years comprises eight terms: the principal you started with; three terms for the interest on the principal each year, r1, r2, r3; three terms for the interest on the interest, r1r2, r1r3, r2r3; and a term for the interest during year 3 on the compound interest from years 1 and 2, r1r2r3. Again, if r1 = r2 = r3 = r_t, this can all be reduced to (1 + r_t)³. It is interesting to note that as t increases, the r_t terms increase linearly, whereas the compound terms increase geometrically. That is, for each year, there is only one yearly interest payment, but for the compounding terms, the number of terms increases geometrically.

For example, if $1,000 is invested for five years at an annual interest rate of 8%, compounded annually, the total amount (in dollars) to be received is

1,000(1 + .08)⁵ = 1,000(1.08)⁵ = 1,469.33

The total interest of $469.33 consists of $400 yearly interest ($80 per year × 5 years) and $69.33 of compounding. If t = 64 years, the future value is $137,759.11, which consists of $5,120 yearly interest ($80 per year × 64 years) and $131,639.11 of compounding.

18.4.2 Continuous Compounding

There is no difficulty in adapting Eq. (18.1) to a situation where interest is compounded at an interval of less than one year. Simply replace the word year with the word compounding period (the interval) in the above discussion. For example, suppose the interest is compounded semiannually, with an annual rate of 8%. This implies that 4% is to be added to the balance at the end of each half year. Suppose, again, that $1,000 is to be invested for a term of five years. Since this is the same as a term of ten half years, the total amount (in dollars) to be received is

1,000(1 + .04)¹⁰ = 1,000(1.04)¹⁰ = 1,480.24

The additional $10.91 ($480.24 − $469.33) arises because the compounding effect is greater when it occurs ten times than when it occurs five times.

The extreme case is when interest is compounded continuously. (This is discussed in Appendix 3C in greater detail.) The total amount per dollar to be received after t years, if interest is compounded continuously at an annual rate r_t, is

future value per dollar = e^(t × r_t)    (18.2)

where e = 2.71828… is a constant. If $1,000 is invested for five years at an annual interest rate of 8%, with interest compounded continuously, the end-of-period return would be

1,000e^(5 × .08) = 1,000e^.4 = $1,491.80
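The three compounding schemes compared in this section can be checked side by side in Python:

```python
import math

# Future value of $1,000 for five years at 8%, under the three
# compounding schemes discussed in the text.

principal, rate, years = 1_000.0, 0.08, 5

annual = principal * (1 + rate) ** years
semiannual = principal * (1 + rate / 2) ** (2 * years)
continuous = principal * math.exp(rate * years)

print(round(annual, 2))      # 1469.33
print(round(semiannual, 2))  # 1480.24
print(round(continuous, 2))  # 1491.82 (the text rounds this to $1,491.80)
```

More frequent compounding always raises the future value, with continuous compounding as the upper limit.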
Many investment opportunities offer daily compounding. The formula we present for continuous compounding provides a close approximation to daily compounding.

18.4.3 Single Payment Case—Present Values

Since many investments generate returns during several different years in the future, it is important to assess the present value of future payments. Suppose that a payment is to be received in t years' time and that the risk-free annual interest rate for a period of t years is r_t. In Eq. (18.1), we saw that the future value at the end of t years is (1 + r_t)^t per dollar. Conversely, it follows that the present value of a dollar received at the end of t years is

present value per dollar = 1/(1 + r_t)^t    (18.3)

For example, suppose that $1,000 is to be received in four years. At an annual interest rate of 8%, the present value of this future receipt is

1,000 × 1/(1 + .08)⁴ = 1,000/(1.08)⁴ = $735.03

More generally, we can consider a stream of annual receipts, which may be positive or negative. Suppose that, in dollars, we are to receive C0 now, C1 in one year, C2 in two years, and so on, and finally in year N we receive CN. Again, let r_t denote the annual rate of interest for a period of t years. To find the net present value of this stream of receipts, we simply add the individual present values, obtaining

NPV = C0 + C1/(1 + r_1)¹ + C2/(1 + r_2)² + … + CN/(1 + r_N)^N    (18.4)

NPV = Σ (t = 0 to N) C_t/(1 + r_t)^t

Typically, the rate of interest, r_t, depends on the period t. When a constant rate, r, is assumed for each period, the net present value formula (Eq. 18.4) simplifies to

NPV = Σ (t = 0 to N) C_t/(1 + r)^t    (18.5)

Example 18.1 A corporation must choose between two projects. Each project requires an immediate investment, and further costs will be incurred in the next year. The returns from these projects will be spread over a four-year period. The following table shows the dollar amounts involved.

                        Year 0    Year 1    Year 2    Year 3    Year 4
   Project A  Costs     80,000    20,000    0         0         0
              Returns   0         20,000    30,000    50,000    50,000
   Project B  Costs     50,000    50,000    0         0         0
              Returns   0         40,000    60,000    30,000    10,000

At first glance, this data might suggest that, for project A, total returns exceed total costs by $50,000, while the same figure for project B is only $40,000, indicating a preference for project A. However, this neglects the timing of the returns. Assuming an annual interest rate of 8% over the four-year period, we can calculate the present values of the net receipts for each project as follows:

                               Year 0     Year 1     Year 2    Year 3    Year 4
   Project A  Net returns      −80,000    0          30,000    50,000    50,000
              Present values   −80,000    0          25,720    39,692    36,751
   Project B  Net returns      −50,000    −10,000    60,000    30,000    10,000
              Present values   −50,000    −9,259     51,440    23,815    7,350

It is the sums of the present values that must be compared in evaluating the projects. For project A, substituting r = .08 into Eq. 18.5,

NPV = −80,000 + 0/(1.08)¹ + 30,000/(1.08)² + 50,000/(1.08)³ + 50,000/(1.08)⁴
    = −80,000 + 0 + 25,720 + 39,692 + 36,751
    = $22,163

Similarly, for project B,

NPV = −50,000 − 10,000/(1.08)¹ + 60,000/(1.08)² + 30,000/(1.08)³ + 10,000/(1.08)⁴
    = −50,000 − 9,259 + 51,440 + 23,815 + 7,350
    = $23,346

It emerges that, if future returns are discounted at an annual rate of 8%, the net present value is higher for project B than for project A. Hence, project B is preferred, because it provides larger cash flows in the early years, which gives the firm more opportunity to reinvest the funds, thereby adding greater value to the firm.
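Eq. (18.5) is a one-line function in Python, and applying it to the two projects of Example 18.1 reproduces the comparison above (the function name is ours):

```python
# Eq. (18.5) as a function, applied to the two projects of Example 18.1
# (a constant 8% rate; cash flows indexed from year 0).

def npv(rate, cash_flows):
    """Net present value of a list of net cash flows C0, C1, ..., CN."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(cash_flows))

project_a = [-80_000, 0, 30_000, 50_000, 50_000]
project_b = [-50_000, -10_000, 60_000, 30_000, 10_000]

npv_a = npv(0.08, project_a)
npv_b = npv(0.08, project_b)
print(round(npv_a))  # 22163
print(round(npv_b))  # 23346
```

As in the text, project B's higher NPV makes it the preferred project even though its undiscounted total is smaller.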
18.4.4 Annuity Case—Present Values bondholder is made every year. To find the present value of
a perpetuity, we need only let the term—N, in the annuity
An annuity is a special form of income stream in which case—grow infinitely large. Consequently, the second
regularly spaced equal payments are received over a period of time. Common examples of annuities are payments on home mortgages and installment credit loans.

Suppose that an amount C dollars is to be received at the end of each of the next N time periods (which could, for example, be months, quarters, or years). Assume further that, irrespective of the term, the interest rate per period is fixed at r. Then the present value of the payment to be received at the end of the first period is C/(1 + r), the present value of the next payment is C/(1 + r)^2, and so on. Hence, the present value of the N-period annuity is

PV = C/(1 + r)^1 + C/(1 + r)^2 + ... + C/(1 + r)^N = sum over t = 1 to N of C/(1 + r)^t

In fact, it can be shown(1) that this expression simplifies to

PV = C [1/r - 1/(r(1 + r)^N)]   (18.6)

(1) Let x = 1/(1 + r). Then

PV = C x (1 + x + ... + x^(N-1)) = C x (1 - x^N)/(1 - x) = C [1/(1 + r)] [(1 + r)/r] [1 - 1/(1 + r)^N]

from which Eq. (18.6) follows.

Suppose that an annuity of $1,000 per year is to be received for each of the next ten years. The total dollar amount is $10,000, but because receipts stretch far into the future, we would expect the present value to be much less. Assuming an annual interest rate of 8%, we can find the present value of this annuity by using Eq. 18.6:

PV = $1,000 [1/.08 - 1/(.08(1.08)^10)] = $6,710

This annuity, then, has the same value as an immediate cash payment of $6,710.

Perpetuity

An extreme type of annuity is a perpetuity, in which payments are to be received forever. Certain British government bonds, known as "consols," are perpetuities. The principal need not be repaid, but a fixed interest payment is received forever. As N becomes indefinitely large, the second term of the expression in brackets on the right-hand side of Eq. 18.6 becomes zero, so that the present value of perpetuity payments of C dollars per period, when the per-period interest rate is r, is

PV = C/r

For example, given an 8% annual interest rate, the present value of $1,000 per annum in perpetuity is

$1,000/.08 = $12,500

Notice that this sets an upper limit on the possible value of an annuity. Thus, if the interest rate is 8% per annum, annuity payments of $1,000 per year must have a present value of less than $12,500, whatever the term.

18.4.5 Annuity Case—Future Values

With an annuity of C dollars per year, we can also calculate a future value (FV) by using Eq. 18.7:

FV = C(1 + r)^N + C(1 + r)^(N-1) + ... + C(1 + r)^1   (18.7)

This is very similar to the single value case discussed earlier; each of the terms on the right-hand side of Eq. 18.7 is identical to the values shown by Eq. 18.1.

18.4.6 Annual Percentage Rate

The annual percentage rate (APR) is the actual or effective interest rate that the borrower is paying. Quite often, the stated or nominal rate of a loan is different from the actual amount of interest or cost the borrower is paying. This results from the differences created by using different compounding periods. The main benefit of calculating the APR is that it allows us to compare interest rates on loans or investments that have different compounding periods.

The Consumer Credit Protection Act (Truth-in-Lending Act), enacted in 1968, provides for disclosure of credit terms so that the borrower can make a meaningful comparison of alternative sources of credit. This act was the cornerstone for Regulation Z of the Federal Reserve. The finance charge and the annual percentage rate must be given explicitly to the borrower. The finance charge is the actual dollar amount that the borrower must pay if given the loan. The APR also must be explained to individual borrowers and the actual figure must be given.
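The annuity formula of Eq. 18.6 and its perpetuity limit are easy to check numerically. A minimal Python sketch (the helper names are ours, not the book's) that reproduces the $6,710 and $12,500 figures:

```python
def annuity_pv(c, r, n):
    """Present value of c per period for n periods at rate r (Eq. 18.6)."""
    return c * (1 / r - 1 / (r * (1 + r) ** n))

def annuity_pv_sum(c, r, n):
    """Brute-force discounted sum, as a cross-check of the closed form."""
    return sum(c / (1 + r) ** t for t in range(1, n + 1))

def perpetuity_pv(c, r):
    """Limit of Eq. 18.6 as n grows: the second bracketed term vanishes."""
    return c / r

pv10 = annuity_pv(1000, 0.08, 10)   # about 6710: ten $1,000 payments at 8%
cap = perpetuity_pv(1000, 0.08)     # 12500: the upper bound on any 8% annuity
```

The perpetuity value is the ceiling toward which the annuity value rises as the term lengthens, which is exactly the "upper limit" observation in the text.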
374 18 Time Value of Money Determinations and Their Applications
Exhibit 18.1 shows the amount of interest paid and the APR for a $1,000 loan at 10% interest for 1 year, to be repaid in 12 equal monthly installments.

Exhibit 18.1: Interest Paid and APR

Amount borrowed = $1,000.
Nominal interest rate = 10% per year or 0.83% per month.

Annuity or monthly payment = amount borrowed / (sum over t = 1 to 12 of 1/(1 + r)^t) = 1,000/11.3745 = $87.92

Month   Payment     Interest   Principal paid off   Remaining principal unpaid
0       -           -          -                    $1,000.00
1       $87.92      $8.33      $79.58               920.42
2       87.92       7.67       80.25                840.17
3       87.92       7.00       80.91                759.26
4       87.92       6.33       81.59                677.67
5       87.92       5.65       82.27                595.40
6       87.92       4.96       82.95                512.45
7       87.92       4.27       83.65                428.80
8       87.92       3.57       84.34                344.46
9       87.92       2.87       85.05                259.41
10      87.92       2.16       85.75                173.66
11      87.92       1.45       86.47                87.19
12      87.92       0.73       87.19                0.00
Total   $1,054.99   $54.99     $1,000.00

Average loan balance = (beginning balance + ending balance)/2 = (1,000 + 0)/2 = $500

18.5 Present and Future Value Tables

In the previous section, we presented formulae for various present and future value calculations. However, the arithmetic involved can be rather tedious and time-consuming. Because present and future values are frequently needed, tables have been prepared to make the computational task easier. When using present value tables, keep in mind the following: (1) they cannot be used for r < 0, (2) the interest or discount rate must be constant over time for use of annuity tables, and (3) the tables are constructed by assuming that all cash flows are reinvested at the discount rate or interest rate.

18.5.1 Future Value of a Dollar at the End of t Periods

Suppose that a dollar is invested now at an interest rate of r per period, with interest compounded at the end of each period. Equation 18.1 gives the future value of a dollar at the end of t periods. Values of this expression for various interest rates, r, and numbers of periods, t, are tabulated in Table 1, which presents the future value of a dollar. Table 18.3 of Appendix 18D presents the Excel approach to calculating this future value.

To illustrate, suppose that a dollar is invested now for 20 years at an annual rate of interest of 10% compounded annually. Table 1 shows that the future value—the amount to be received at the end of this period—is $6.728. (It follows, of course, that the future value of an investment of $1,000 is $6,728.)

Example 18.2 Suppose you deposit $1,000 at an annual interest rate of 12% for two years. How much extra interest would you receive at the end of the term if interest was compounded monthly instead of annually?
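Exhibit 18.1's level payment and amortization schedule, and the compounding comparison asked for in Example 18.2, can both be reproduced programmatically. A Python sketch (function names are ours; the monthly rate is 10%/12 as in the exhibit):

```python
def level_payment(principal, r, n):
    """Payment such that principal equals the PV of n payments at per-period rate r."""
    annuity_factor = sum(1 / (1 + r) ** t for t in range(1, n + 1))
    return principal / annuity_factor

def amortize(principal, r, n):
    """Yield (month, payment, interest, principal_paid, balance) rows."""
    pmt = level_payment(principal, r, n)
    bal = principal
    for m in range(1, n + 1):
        interest = bal * r
        paid = pmt - interest
        bal -= paid
        yield m, pmt, interest, paid, bal

pmt = level_payment(1000, 0.10 / 12, 12)          # about 87.92 per month

# A check of Example 18.2: $1,000 at 12% for two years
annual = 1000 * 1.12 ** 2                         # annual compounding
monthly = 1000 * (1 + 0.12 / 12) ** 24            # monthly compounding
extra = monthly - annual                          # extra interest from monthly compounding
```

Running `amortize` regenerates the exhibit row by row: the interest column is the prior balance times the monthly rate, and the final balance comes out to zero.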
Fig. 18.1 Future value over time of $1 invested at different interest rates
Using the information in Table 18.20 of the appendix, we can construct graphs showing the effect over time of compound interest. Figure 18.1 shows the future values over time of a dollar invested at interest rates of 0, 4, 8, and 12%. At 0%, the future value is always $1. The other three curves were constructed from the future values taken from the 4, 8, and 12% interest columns in Table 18.20. Notice that these curves exhibit exponential growth; that is, as a result of compounding, annual changes in future values increase nonlinearly. Of course, the higher the rate of interest, the greater the growth rate; and the longer the time, the greater the compounding effect.

In Fig. 18.2, we compare future values of a dollar over time under simple and annually compounded interest, both at a 10% annual interest rate. By simple interest, we usually mean the interest calculated for a given period by multiplying the interest rate times the principal. The future values for compound interest are listed in Table 1 of the appendix. Under simple interest, ten cents is accumulated each year, so that the future value after t years is $(1 + .10t). Notice that, while the future values grow exponentially under compounding, they do so only linearly with simple interest, so that the two curves diverge over time.

18.5.2 Future Value of a Dollar Continuously Compounded

Table 18.21 in the appendix of this book shows the future value of a dollar invested for t periods at an interest rate of r per period, continuously compounded. The entries in this table are computed from Eq. 18.2, which states that the future value is e^(rt). Table 18.21 shows the corresponding future values for specific values of rt.
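The divergence among simple, annually compounded, and continuously compounded growth can be tabulated directly, without the appendix tables. A small Python sketch (our own helper names, using the chapter's $1 and 10% example):

```python
import math

def fv_simple(r, t):
    """$1 at simple interest: 1 + r*t, linear in t."""
    return 1 + r * t

def fv_compound(r, t):
    """$1 compounded annually: (1 + r)^t (Eq. 18.1), exponential in t."""
    return (1 + r) ** t

def fv_continuous(r, t):
    """$1 continuously compounded: e^(rt) (Eq. 18.2)."""
    return math.exp(r * t)

for t in (5, 10, 20):
    print(t, fv_simple(0.10, t), fv_compound(0.10, t), fv_continuous(0.10, t))
```

At t = 20 and r = 10%, the three values are roughly 3.00, 6.73, and 7.39, matching the spread among Figs. 18.2 and 18.3.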
Fig. 18.2 Future value over time of $1 invested at 10% per annum simple and compound interest
376 18 Time Value of Money Determinations and Their Applications
Fig. 18.3 Future value over time of $1 invested at 10% per annum, compounded annually and continuously
To illustrate, suppose a dollar is invested now for 20 years at an annual interest rate of 10%, with continuous compounding. The future value at the end of the term can be read from Table 2, using r = 0.10, t = 20, rt = 2. From the table, we find that, corresponding to an rt value of 2, the future value is $7.389.

Figure 18.3 compares, over time, the future value of a dollar invested at 10% per annum under both annual and continuous compounding. The two curves were constructed from the information in Tables 1 and 2 of the appendices. Notice that, over time, the curves diverge, reflecting the faster growth rate of future values as the interval for compounding decreases.

18.5.3 Present Value of a Dollar Received t Periods in the Future

Suppose that a dollar is to be received t periods in the future and that the rate of interest is r, with compounding at the end of each period. The present value of this future receipt can be computed from Eq. 18.3. The results of various combinations of values of r and t are tabulated in Table 3 of the appendix at the back of this volume. For example, the table shows that the present value of a dollar to be received in 20 years' time, at an annual interest rate of 10% compounded annually, is $0.149. (It follows that the present value of $1,000 under these conditions is $149.)

Using the information in Table 3, we can construct graphs showing the effect over time of the discounting process involved in present value calculations. Figure 18.4 shows the present values of a dollar received at various points in the future, discounted at interest rates of 0, 4, 8, and 12%. Notice that the present values decrease the further into the future the payment is to be received; the higher the interest rate, the sharper the decrease. A comparison of Figs. 18.1, 18.2, 18.3, and 18.4 reveals the connection between compound interest and present values. This is also clear from Eqs. 18.1 and 18.4. If the future value after t years of a dollar invested today, at annual interest rate r, is $K, then, using the same interest rate, the present value of $K to be received in t years' time is $1.

Example 18.3 A corporation is considering a project for which both costs and returns extend into the future, as set out in the following table (in dollars).

Year      0        1       2       3       4       5       6       7       8       9       10
Costs     130,000  70,000  50,000  0       0       0       0       0       0       0       0
Returns   0        20,000  25,000  50,000  60,000  75,000  75,000  60,000  50,000  25,000  20,000

Assuming that future returns are discounted at an annual rate of 8%, find the net present value of this project.
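Example 18.3 can be worked directly from the discounting formula rather than the tables. A Python sketch of the computation at 8% (the layout of the cash-flow lists is ours):

```python
costs = [130_000, 70_000, 50_000, 0, 0, 0, 0, 0, 0, 0, 0]
returns = [0, 20_000, 25_000, 50_000, 60_000, 75_000,
           75_000, 60_000, 50_000, 25_000, 20_000]

def npv(flows, r):
    """Discount a list of year-0..year-N cash flows at rate r."""
    return sum(cf / (1 + r) ** t for t, cf in enumerate(flows))

# Net the costs against the returns year by year, then discount
net = [ret - c for ret, c in zip(returns, costs)]
project_npv = npv(net, 0.08)   # positive, so the project adds value at 8%
```

The early years are net outflows and the later years net inflows; discounting at 8% still leaves a comfortably positive NPV, so the project passes the net-present-value test.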
18.6 Why Present Values Are Basic Tools … 377
Fig. 18.4 Present value, at different discount rates, of $1 to be received in the future
return. For example, the original project may be seen as equivalent to one in which a return of $9,000 is certain. We can then value the project by discounting the certainty equivalent return at the risk-free rate.

18.6.1 Managing in the Stockholders' Interest

Consider the dilemma of a corporate manager who makes investment decisions on behalf of the corporation's stockholders. Because stockholders do not constitute a homogeneous entity, the manager is faced with the problem of accommodating an array of tastes and preferences. In particular:

• Stockholders are not uniform in their time preferences for consumption. Some prefer relatively high levels of current consumption, while others prefer less current consumption in order to obtain higher consumption levels in the future.
• Stockholders have different attitudes toward the risk-return trade-off. Some are happier than others to accept an element of risk in anticipation of higher potential returns.

Even if the manager is able to elicit accurate information about the various tastes and preferences of individual stockholders, the problem of making decisions for the benefit of all seems formidable. Fortunately, Irving Fisher, in 1930, developed a simple resolution. Essentially, Fisher demonstrated that, given certain assumptions, whatever the array of stockholder tastes and preferences, the optimal management strategy is to maximize the firm's net present value.

To illustrate, suppose that a particular stockholder has a current cash flow of $50,000 and a future cash flow, next year, of $64,800.(2) This stockholder could plan to consume $50,000 this year and $64,800 next year. However, this is not the only consumption pattern that can be achieved with these resources.

(2) The restriction of our analysis to two periods is convenient for graphical exposition. However, the same conclusions follow when this restriction is dropped.

At the heart of our analysis is the assumption that there is access to the capital markets, in which cash on hand can be lent, or that an investor can borrow against future cash receipts. This allows our stockholders to consume either more or less than $50,000 this year, which affects next year's consumption level. Moreover, the investor is not restricted to risk-free market instruments, but is free to opt for riskier securities with higher expected returns. For our conclusions to follow, we need to assume perfect competition in the capital markets; that is:

1. Access to the market is open and free, with securities readily traded.
2. No individual, or group of individuals acting in collusion, has sufficient market power for the actions of the individual or group to significantly influence market prices.
3. All relevant information about the price and risk of securities is readily available, at no cost, to all.

Certainly, these assumptions are an idealization of reality. Nevertheless, they are sufficiently close to reality for our analysis to be appropriate.

Now, in considering the consumption patterns available to our individual investor, we will assume borrowing or lending at the risk-free rate, which, for purposes of illustration, is 8%. The investor may, instead, prefer to assume some level of risk, which trading in the capital market allows for, and for such an investor this example can be carried through in terms of certainty equivalent amounts.

Let us begin by computing the present value and future value of this investor's cash flow stream. At an interest rate of 8%, the present value is

PV = 50,000 + 64,800/1.08 = 50,000 + 60,000 = $110,000

This investor could consume $110,000 this year and nothing next year by borrowing $60,000 at 8% interest. All of next year's income will then be needed to repay this loan. The future value, next year, of the cash flow stream is

FV = (50,000)(1.08) + 64,800 = 54,000 + 64,800 = $118,800

It follows that another option available to our investor is to consume nothing this year and $118,800 next year. This can be achieved by investing all of this year's cash flow at 8% interest.

Our results are depicted in Fig. 18.5, which represents possible two-period consumption levels. These levels are found by plotting current consumption on the horizontal axis and future consumption on the vertical axis; a point on the curve represents a specific combination of current and future consumption levels. Thus, our two extreme cases are (0; 118,800) and (110,000; 0).

Between these extremes, many combinations are possible. If the investor wants to consume only $30,000 of the
current year's cash flow, the remaining $20,000 can be invested at 8% to yield $21,600 next year. Adding this to next year's cash flow produces a future consumption total of $86,400.

Conversely, $70,000 can be consumed this year by borrowing $20,000 at 8% interest. This requires repayment of $21,600 next year, leaving $43,200 available for consumption at that time (Table 18.1).

The consumption possibilities discussed so far are listed in Table 18.1 and plotted in Fig. 18.5. But these are not the only possibilities. Notice that the five points all lie on the same straight line. The reason is that, at 8% annual interest, each $1 of current consumption can be traded for $1.08 of consumption next year, and vice versa; therefore, any pair of consumption levels on the line in Fig. 18.5 is possible. The slope of the consumption trade-off line in Fig. 18.5 is -(1 + r), i.e., -1.08.

In addition to the time preference discussed in this section, positive interest rates also indicate a liquidity preference on the part of some investors. Keynes (1936) gives three reasons why individuals require cash: (1) to pay bills (transaction balances), (2) to protect against uncertain adverse future events (precautionary balances), and (3) for speculative reasons (for example, if interest rates are expected to rise in the future, it may be best to stay liquid today to take advantage of the future higher rates). Each rationale for holding cash makes individuals more partial to maintaining liquidity. An incentive must be offered in the form of a positive interest rate to induce these individuals to give up some of their liquidity.

For a corporation, the management of cash and working capital is an important treasury function that takes these factors into consideration.

18.6.2 Productive Investments

So far, we have assumed that the only opportunities for our investor are in the capital market. Suppose that there are productive investment opportunities, which may yield, in certainty equivalent terms, rates of return in excess of 8% per annum. Each dollar invested now that produces a return in excess of $1.08 in a year's time will increase the net present value for the investor.
To illustrate, suppose the investor finds $80,000 worth of such opportunities that will yield $97,200 next year. (Notice that the amount invested can exceed the current year's cash flow, because any excess can be borrowed in the capital market.) The net present value of these investment opportunities is

NPV = -80,000 + 97,200/1.08 = $10,000

These productive investments would raise the present value of our investor's cash flow stream from $110,000 to $120,000. Similarly, the future value is raised by (1.08)(10,000) = $10,800, from $118,800 to $129,600.

Taking advantage of such productive opportunities does not affect the investor's access to the capital market. Therefore, our investor could consume $120,000 now and nothing next year, or nothing now and $129,600 next year. It is also possible to have intermediate consumption level combinations by trading $1 of current consumption for $1.08 of future consumption.

This position is illustrated in Fig. 18.6, which shows the shift in the consumption possibilities line resulting from the productive investments. As compared with the earlier position, it is possible to consume more both now and in the future. Hence, we find that, whatever the time preference for consumption, the investor is better off as a result of a productive investment that raises net present value. Neither is it necessary to worry about the investor's attitude toward risk, as this too can be accommodated through capital market investments.

We have now established Irving Fisher's concept. Viewing this individual stockholder's cash flows as shares of those of the corporation, it follows that, to act in the stockholders' interest, management's objective should be to seek those productive investments that increase the net present value of the corporation as much as possible.

It follows from this discussion that the concept of net present value does considerably more than provide a convenient and sensible way of interpreting future receipts. As we have just seen, the net present value provides a basis on which financial managers can judge whether a proposed productive investment is in the best interest of corporate stockholders. The manager's task is to ascertain whether or not the project raises the firm's net present value by more than would competing projects, without having to pay attention to the personal tastes and preferences of stockholders.
18.7 Net Present Value and Internal Rate of Return

Both the net present value (NPV) method and the internal rate of return (IRR) method can be used to make capital budgeting decisions. For example, for Project A and Project B, the initial outlays and net cash inflows for year 0 to year 4 are presented in Table 18.2. In Table 18.2, we see that the initial outlays at year 0 for Projects A and B are $80,000 and $50,000, respectively. In year 1, additional investments for Projects A and B are $20,000 and $50,000, respectively. The net cash inflows of Project A for the next four years are $20,000, $30,000, $50,000, and $50,000, respectively. The net cash inflows of Project B for the next four years are $40,000, $60,000, $30,000, and $10,000, respectively.

The net present value of a project is computed by discounting the project's cash flows to the present by the appropriate cost of capital. The formula used to calculate NPV can be defined as follows:

NPV = (sum over t = 1 to N of CF_t/(1 + k)^t) - I   (18.8)

where

k = the appropriate discount rate,
CF_t = net cash flow (positive or negative) in period t,
I = initial outlay,
N = life of the project.

Using the Excel NPV function, we can calculate NPV for both Projects A and B. NPV is a function that calculates the net present value of an investment by using a discount rate and a series of future payments (negative values) and income (positive values). The NPV function in cell H10 is

=NPV(C2,D10:G10)+C10

Based upon the NPV function in Fig. 18.7, the NPV results are shown in Fig. 18.8.
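Excel's NPV function discounts its first argument cash flow by one full period, which is why the worksheet formula adds the undiscounted year-0 flow in C10 outside the function. A Python mimic of that convention (the function names and the 10% rate used in the check are ours):

```python
def excel_npv(rate, flows):
    """Mimic Excel's NPV: flows[0] is treated as occurring at the END of period 1."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(flows))

def npv_with_initial(rate, initial, future_flows):
    """Equivalent of =NPV(rate, future_range) + initial_cell, as in cell H10."""
    return excel_npv(rate, future_flows) + initial

# Sanity check: $110 one year out at 10% is worth $100 today
check = excel_npv(0.10, [110])
```

Forgetting this one-period shift and passing the year-0 outlay into the function is a common spreadsheet error; keeping the initial outlay outside, as in the cell H10 formula, matches Eq. 18.8.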
18.8 Summary

In this chapter, we have introduced the concept of the present value of a future receipt. For each dollar to be received in t years at an annual interest rate over t years of r_t, the present value is 1/(1 + r_t)^t.

Appendix 18A

Three Hypotheses about Inflation and the Firm's Value

We began this chapter by asking whether you would prefer to receive $1,000 today or $1,000 a year from now. One reason for selecting the first option is that, as a result of
inflation, $1,000 will buy less in a year than it does today. In this appendix, we explore the possible effects of inflation on a firm's value. According to Van Horne and Glassmire (1972), unanticipated inflation affects the firm in three ways, characterized by the following hypotheses:

1. Debtor-creditor hypothesis.
2. Tax-effects hypothesis.
3. Operating income hypothesis.

The debtor-creditor hypothesis postulates that the impact of unanticipated inflation depends on a firm's net borrowing position. In periods of high inflation, fixed money amounts borrowed today will be repaid in the future in a currency with lower purchasing power. Thus, while the rate of interest on the loan reflects expected inflation rates over the term of the loan, a higher than anticipated rate of inflation should result in a transfer of wealth from creditors to debtors. Conversely, if the inflation rate turns out to be lower than expected, wealth is transferred from debtors to creditors. Hence, according to the debtor-creditor hypothesis, a higher than anticipated rate of inflation should, all other things being equal, raise the value of firms with heavy borrowings.

The tax-effects hypothesis concerns the influence of inflation on those firms with depreciation and inventory tax shields. Since these shields are based on historical costs, their real values decline with inflation. Hence, unanticipated inflation should lower the value of the firms with such shields. The magnitude of these tax effects could be very high indeed. For example, Feldstein and Summers (1979) estimated that the use of depreciation and inventory accounting on a historical cost basis raised corporate tax liabilities by $26 billion in 1977.

In principle, the effects of general inflation should only be felt when parties are forced to comply with nominal contracts, the terms of which fail to anticipate inflation. Hence, in theory, wealth transfers caused by general inflation should be due primarily to the debtor-creditor or tax-effects hypotheses discussed above. Apart from these considerations, if all prices move in unison, real profits should not be affected. Nevertheless, there is strong empirical evidence of a negative association between corporate profitability and the general inflation rate. One possible explanation, called the operating income hypothesis, is that high inflation rates lead to restrictive government fiscal and monetary policies, which, in turn, depress the level of business activity, and hence profits. Further, operating income may be adversely affected if prices of inputs, such as labor and materials, react more quickly to inflationary trends than prices of outputs. Viewed in this light, we might expect firms to react differently to inflation, depending on the reaction speed in the markets in which the firms operate.

Van Horne and Glassmire suggest that, of these three effects of unanticipated inflation on the value of the firm, the operating income effect is likely to dominate. Some support for this contention is provided by French et al. (1983), who find that debtor-creditor effects and tax effects are rather small.
Appendix 18B

Book Value, Replacement Cost, and Tobin's q

An objective of financial management should be to raise the firm's net present value. We have not, however, discussed what constitutes a firm's value.

An accounting measure of value is the total value of all a firm's assets, including plant and equipment, plus inventory. Generally, in a firm's accounts, the book values of the assets are reported. However, this is an inappropriate measure for two reasons. First, it takes no account of the growth rate of capital goods prices since the assets were acquired, and second, it does not account for the economic depreciation of those assets. Therefore, in considering a firm's value, it is preferable to consider current accounting measures that incorporate inflation and depreciation. The relevant measure of accounting value, then, is replacement cost, which is the cost today of purchasing assets of the same vintage as those currently held by the firm.

However, this accounting concept of value is not the one used in financial management, as it does not incorporate the potential for future earnings through the exploitation of productive investment opportunities. If this broader definition is considered, the value of a firm will depend not only on the accounting value of its assets, but also on the ability of management to make productive use of those assets. In finance theory, the relevant concept is the market value of the firm's common stock, preferred stock, and debt, all of which are determined by the financial markets.(3)

The ratio of a firm's market value to the replacement cost of its assets is known as Tobin's q, as shown in Tobin and Brainard (1977). One reason for looking at this relationship is that if the acquisition of new capital adds more to the firm's value than the cost of acquiring that capital—that is, it has a positive NPV—then shareholders immediately benefit from the acquisition. On the other hand, if the addition of new capital adds less than its cost to market value, shareholders would be better off if the money were distributed to them as dividends. Therefore, the relationship between market value and replacement cost is crucial in financial management decision-making.

Appendix 18C

... In addition, we also give some examples to show how these two processes apply to the real world.

Continuous Compounding

In the general calculation of interest, the amount of interest earned plus the principal is

principal + interest = principal (1 + r/m)^T   (18.10)

where r = annual interest rate, m = number of compounding periods per year, and T = number of compounding periods (m) times the number of years N.

There are three variables: the initial amount of principal invested, the periodic interest rate, and the time period of the investment. If we assume that you invest $100 for 1 year at 10% interest, you will receive the following:

principal + interest = $100 (1 + .10/1)^1 = $110

For a given interest rate, the frequency with which interest is compounded affects the interest and the time variables of the above equation; the interest per period decreases, but the number of compounding periods increases. The greater the frequency with which interest is compounded, the larger the amount of interest earned. For interest compounded annually, semiannually, quarterly, monthly, weekly, daily, hourly, or continuously, we can see the increase in the amount of interest earned as follows:

principal + interest = P_0 (1 + r/m)^T

Annual        $110.00 = 100 (1 + .10/1)^1
Semiannual     110.25 = 100 (1 + .10/2)^2
Quarterly      110.38 = 100 (1 + .10/4)^4
Monthly        110.47 = 100 (1 + .10/12)^12
Weekly         110.51 = 100 (1 + .10/52)^52
Daily          110.52 = 100 (1 + .10/365)^365
Hourly         110.52 = 100 (1 + .10/8760)^8760
Continuously   110.52 = 100 e^(.10(1)) = 100 (2.7183)^.10
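The frequency table above follows directly from Eq. 18.10, with continuous compounding as the limiting case. A short Python sketch (our own function name) that regenerates it:

```python
import math

def fv(principal, r, m, years=1):
    """Eq. 18.10: compound m times per year for the given number of years."""
    return principal * (1 + r / m) ** (m * years)

frequencies = [("annual", 1), ("semiannual", 2), ("quarterly", 4),
               ("monthly", 12), ("weekly", 52), ("daily", 365), ("hourly", 8760)]
for label, m in frequencies:
    print(f"{label:10s} {fv(100, 0.10, m):.2f}")
print(f"continuous {100 * math.exp(0.10):.2f}")   # limit as m grows without bound
```

By daily compounding the result already agrees with the continuous limit 100e^0.10 to the penny, which is the point of the table.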
Appendix 18D: Applications of Excel for Calculating Time Value of Money

In this appendix, we will show how to use Excel to calculate: (i) the future value of a single amount, (ii) the present value of a single amount, (iii) the future value of an ordinary annuity, and (iv) the present value of an ordinary annuity.

Future Value of a Single Amount

Suppose the principal is $1,000 today and the interest rate is 5% per year. The future value of the principal can be calculated as FV = PV(1 + r)^n, where n is the number of years.

Case 1. Suppose there is only one period, i.e., n = 1. The future value in one year will be 1000(1 + 5%)^1 = 1050. We can use Excel to compute it directly by inputting "=B1*(1+B2)", as presented in Table 18.3. Or we can use the FV function in Excel by inputting "=FV(B2,1,,B1,0)". There are five arguments in this function:

Rate: The interest rate per period.
Nper: The number of payment periods.
Pmt: The payment in each period. If "pmt" is omitted, we should include the "pv" argument below.
Pv: The present value. If "pv" is omitted, it is assumed to be 0, and we should then include the "pmt" argument above.
Type: The number 0 or 1, indicating when payments are due. If payments are due at the end of the period, Excel sets it as 0; if payments are due at the beginning of the period, Excel sets it as 1.

The FV function gives us the same amount as what we calculate according to the formula, except that the sign is negative. Actually, the FV function in Excel computes the future value of the principal that one party should pay back to another party. Therefore, Excel adds a negative sign to indicate the amount needed to pay back, as presented in Table 18.4.

Case 2. Now suppose there are 4 periods. The future value of $1,000 at the end of the 4th year will be 1000(1 + 5%)^4 = 1215.51. We use two methods to compute the future value and obtain the same result. First, we calculate it directly according to the formula, as presented in Table 18.5. Second, we use the FV function in Excel to calculate it, as presented in Table 18.6.

Present Value of a Single Amount

The present value of a future sum of money can be calculated as PV = FV/(1 + r)^n, where n is the number of years.

Case 1. Suppose a project will end in one year and it pays $1,000 at the end of that year. The interest rate is 5% for one year. The present value will be 1000/(1 + 5%)^1 = 952.38. We can use Excel to compute it directly by inputting "=B1/(1+B2)", as presented in Table 18.7. Or, we can use the PV function, which is quite similar to the FV function we used before. The result is presented in Table 18.8.

Case 2. Suppose a project will end in four years and it pays $1,000 only at the end of the last year. The interest rate is 5% per year. The present value will be 1000/(1 + 5%)^4 = 822.70. We can use Excel to compute it directly by inputting "=B1/(1+B2)^4", as presented in Table 18.9. Or we can use the PV function in Excel by inputting "=PV(B2,4,,B1,0)", as presented in Table 18.10.

Future Value of an Ordinary Annuity

An annuity is a series of cash flows of a fixed amount for n periods of equal length. It can be divided into an ordinary annuity (the first payment occurs at the end of a period) and an annuity due (the first payment occurs at the beginning of a period).
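The two single-amount Excel cases translate one-for-one into Python. A sketch (our own names; the sign convention follows the formulas rather than Excel's negative display):

```python
def fv_single(pv, r, n):
    """Future value of a single amount: FV = PV(1 + r)^n."""
    return pv * (1 + r) ** n

def pv_single(fv, r, n):
    """Present value of a single amount: PV = FV/(1 + r)^n."""
    return fv / (1 + r) ** n

four_year_fv = fv_single(1000, 0.05, 4)   # 1215.51, as in Tables 18.5/18.6
four_year_pv = pv_single(1000, 0.05, 4)   # 822.70, as in Tables 18.9/18.10
```

Note that the two functions are inverses: discounting the computed future value back over the same four years recovers the original $1,000.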
388 18 Time Value of Money Determinations and Their Applications
Case 1. Future Value of an ordinary annuity. Case 2. Future Value of an Annuity Due.
P
n Pn
The formula is FV ¼ PMT ð1 þ r Þk1 ; where PMT is The formula is FV ¼ PMT ð1 þ r Þk ; where PMT is
k¼1 k¼1
the payment in each period: the payment in each period:
Suppose a project will pay you $1,000 at the end of each Suppose a project will pay you $1,000 at the beginning of
year for 4 years at 5% annual interest, and the following each year for 4 years at 5% annual interest, and the following
graph shows the process: graph shows the process:
Present Value of an Ordinary Annuity

Case 1. Present Value of an ordinary annuity.
The formula is PV = PMT × Σ_{k=1}^{n} 1/(1 + r)^k, where PMT is the payment in each period.
Suppose a project will pay you $1500 at the end of each year for 4 years at 5% annual interest.
According to this formula, we directly input "=B1/(1+B5)^4+B2/(1+B5)^3+B3/(1+B5)^2+B4/(1+B5)^1" to get the present value of 5318.93, as presented in Table 18.15.
In addition, we can use the PV function in Excel directly and obtain the same amount as above, as presented in Table 18.16.

Case 2. Present Value of an annuity due.
The formula is PV = PMT × Σ_{k=0}^{n−1} 1/(1 + r)^k, where PMT is the payment in each period.
Suppose a project will pay you $1500 at the beginning of each year for 4 years at 5% annual interest.
According to this formula, we directly input "=B1/(1+B5)^3+B2/(1+B5)^2+B3/(1+B5)^1+B4/(1+B5)^0" to get the present value of 5584.87, as presented in Table 18.17. Similarly, the PV function gives us the same result, as presented in Table 18.18.

Case 3. An annuity that pays forever (Perpetuity).

PV = PMT / r

In Excel, we directly input "=B1/B2" to get PV = 30,000, as presented in Table 18.19.
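The present-value and future-value formulas above can also be checked with a short Python sketch (the function names are ours, not from any library; the numbers reproduce the worked cases in the text):

```python
# Time-value-of-money helpers matching the formulas in the text.

def pv_single(fv, r, n):
    """PV of a single future sum: PV = FV / (1 + r)**n."""
    return fv / (1 + r) ** n

def fv_ordinary_annuity(pmt, r, n):
    """FV = PMT * sum of (1+r)^(k-1), k = 1..n (payments at period end)."""
    return pmt * sum((1 + r) ** (k - 1) for k in range(1, n + 1))

def fv_annuity_due(pmt, r, n):
    """FV = PMT * sum of (1+r)^k, k = 1..n (payments at period start)."""
    return pmt * sum((1 + r) ** k for k in range(1, n + 1))

def pv_ordinary_annuity(pmt, r, n):
    """PV = PMT * sum of 1/(1+r)^k, k = 1..n."""
    return pmt * sum(1 / (1 + r) ** k for k in range(1, n + 1))

def pv_annuity_due(pmt, r, n):
    """PV = PMT * sum of 1/(1+r)^k, k = 0..n-1."""
    return pmt * sum(1 / (1 + r) ** k for k in range(n))

def pv_perpetuity(pmt, r):
    """PV = PMT / r."""
    return pmt / r

print(round(pv_single(1000, 0.05, 1), 2))            # 952.38 (Case 1, Table 18.7)
print(round(pv_ordinary_annuity(1500, 0.05, 4), 2))  # 5318.93 (Table 18.15)
print(round(pv_annuity_due(1500, 0.05, 4), 2))       # 5584.87 (Table 18.17)
print(round(pv_perpetuity(1500, 0.05), 2))           # 30000.0 (Table 18.19)
```

Note that the annuity-due values are simply the ordinary-annuity values multiplied by (1 + r), since every payment arrives one period earlier.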
b. Find NPV when CF0 = −$1,000, CF1 = $600, CF2 = $700, and r = 10%.
c. If the investment is risk-free, what rate is used as a proxy for r?
6. ABC Company is considering two projects for a new investment, as shown in the table below (in dollars). Which is better if ABC uses the NPV rule to select between the projects? Suppose that the interest rate is 12%.

                     Year 0   Year 1   Year 2   Year 3   Year 4
Project A  Costs     10,000   0        0        0        0
           Returns   0        0        0        1,000    20,000
Project B  Costs     5,000    5,000    0        0        0
           Returns   0        10,000   5,000    3,000    2,000

7. Suppose that C dollars is to be received at the end of each of the next N years, and that the annual interest rate is r over the N years.
a. What is the formula for the present value of the payments?
b. Calculate the present value of the payments when C = $1,000, r = 10%, and N = 50.
c. Would you pay $10,000 now (t = 0) for the annuity of $1,000 to be received every year for the next 50 years?
d. If $1,000 per year is to be received forever, what is the present value of those cash flow streams?
8. Mr. Smith is 50 years old and his salary will be $40,000 next year. He thinks his salary will increase at an annual rate of 10% until his retirement at age 60.
a. If the appropriate interest rate is 8%, what is the present value of these future payments?
b. If Mr. Smith saves 50% of his salary each year and invests these savings at the annual interest rate of 12%, how much will he save by age 60?
9. Suppose someone pays you $10 at the beginning of each year for 10 years, expecting that you will pay back a fixed amount of money each year forever, commencing at the beginning of Year 11. For a fair deal when the annual interest rate is 10%, how much should the annual fixed amount of money be?
10. ZZZ Bank agrees to lend ABC Company $10,000 today in return for the company's promise to pay back $25,000 five years from today. What annual rate of interest is the bank charging the company?
11. Which of the following would you choose if the current interest rate is 10%?
a. $100 now.
b. $12 at the end of each year for the next ten years.
c. $10 at the end of each year forever.
d. $200 at the end of the seventh year.
e. $50 now and yearly payments decreasing by 50% a year forever.
f. $5 now and yearly payments increasing by 5% a year forever.
12. You are given an opportunity to purchase an investment which pays no cash in years 0 through 5, but will pay $150 per year beginning in year 6 and continuing forever. Your required rate of return for this investment is 10%. Assume all cash flows occur at the end of each year.
a. Show how much you should be willing to pay for the investment at the end of year 5.
b. How much should you be willing to pay for the investment now?
13. If you deposit $100 at the end of each year for the next five years, how much will you have in your account at the end of five years if the bank pays 5% interest compounded annually?
14. If you deposit $100 at the beginning of each year for the next five years, how much will you have in your account at the end of five years if the bank pays 5% interest compounded annually?
15. If you deposit $200 at the end of each year for the next 10 years and interest is compounded continuously at an annual quoted rate of 5%, how much will you have in your account at the end of 10 years?
16. Your mother is about to retire. Her firm has given her the option of retiring with a lump sum of $50,000 now or an annuity of $5,200 per year for 20 years. Which is worth more if your mother can earn an annual rate of 6% on similar investments elsewhere?
17. You borrow $6145 now and agree to pay the loan off over the next ten years in ten equal annual payments, which include principal and 10% annually compounded interest on the unpaid balance. What will your annual payment be?

Appendix 18E: Tables of Time Value of Money

Table 18.10 Present value for multiple periods in terms of Excel formula
Table 18.22 Present value table—present value of a dollar received t periods in the future
t/ r 2% 4% 6% 8% 10% 12% 14% 16% 18% 20%
1 0.9804 0.9615 0.9434 0.9259 0.9091 0.8929 0.8772 0.8621 0.8475 0.8333
2 0.9612 0.9246 0.8900 0.8573 0.8264 0.7972 0.7695 0.7432 0.7182 0.6944
3 0.9423 0.8890 0.8396 0.7938 0.7513 0.7118 0.6750 0.6407 0.6086 0.5787
4 0.9238 0.8548 0.7921 0.7350 0.6830 0.6355 0.5921 0.5523 0.5158 0.4823
5 0.9057 0.8219 0.7473 0.6806 0.6209 0.5674 0.5194 0.4761 0.4371 0.4019
6 0.8880 0.7903 0.7050 0.6302 0.5645 0.5066 0.4556 0.4104 0.3704 0.3349
7 0.8706 0.7599 0.6651 0.5835 0.5132 0.4523 0.3996 0.3538 0.3139 0.2791
8 0.8535 0.7307 0.6274 0.5403 0.4665 0.4039 0.3506 0.3050 0.2660 0.2326
9 0.8368 0.7026 0.5919 0.5002 0.4241 0.3606 0.3075 0.2630 0.2255 0.1938
10 0.8203 0.6756 0.5584 0.4632 0.3855 0.3220 0.2697 0.2267 0.1911 0.1615
11 0.8043 0.6496 0.5268 0.4289 0.3505 0.2875 0.2366 0.1954 0.1619 0.1346
12 0.7885 0.6246 0.4970 0.3971 0.3186 0.2567 0.2076 0.1685 0.1372 0.1122
13 0.7730 0.6006 0.4688 0.3677 0.2897 0.2292 0.1821 0.1452 0.1163 0.0935
14 0.7579 0.5775 0.4423 0.3405 0.2633 0.2046 0.1597 0.1252 0.0985 0.0779
15 0.7430 0.5553 0.4173 0.3152 0.2394 0.1827 0.1401 0.1079 0.0835 0.0649
16 0.7284 0.5339 0.3936 0.2919 0.2176 0.1631 0.1229 0.0930 0.0708 0.0541
17 0.7142 0.5134 0.3714 0.2703 0.1978 0.1456 0.1078 0.0802 0.0600 0.0451
18 0.7002 0.4936 0.3503 0.2502 0.1799 0.1300 0.0946 0.0691 0.0508 0.0376
19 0.6864 0.4746 0.3305 0.2317 0.1635 0.1161 0.0829 0.0596 0.0431 0.0313
20 0.6730 0.4564 0.3118 0.2145 0.1486 0.1037 0.0728 0.0514 0.0365 0.0261
Suppose that k dollar(s) is to be received t periods in the future and that the rate of interest is r, with compounding at the end of each period
This table gives the present value of k dollar(s) collected at the end of t periods for various interest rates, r, and the number of periods, t
Assume the amount of money in dollar(s) is $1
Table 18.23 Present value table—present value of an annuity of a dollar per period
t/ r 2% 4% 6% 8% 10% 12% 14% 16% 18% 20%
1 0.9804 0.9615 0.9434 0.9259 0.9091 0.8929 0.8772 0.8621 0.8475 0.8333
2 1.9416 1.8861 1.8334 1.7833 1.7355 1.6901 1.6467 1.6052 1.5656 1.5278
3 2.8839 2.7751 2.6730 2.5771 2.4869 2.4018 2.3216 2.2459 2.1743 2.1065
4 3.8077 3.6299 3.4651 3.3121 3.1699 3.0373 2.9137 2.7982 2.6901 2.5887
5 4.7135 4.4518 4.2124 3.9927 3.7908 3.6048 3.4331 3.2743 3.1272 2.9906
6 5.6014 5.2421 4.9173 4.6229 4.3553 4.1114 3.8887 3.6847 3.4976 3.3255
7 6.4720 6.0021 5.5824 5.2064 4.8684 4.5638 4.2883 4.0386 3.8115 3.6046
8 7.3255 6.7327 6.2098 5.7466 5.3349 4.9676 4.6389 4.3436 4.0776 3.8372
9 8.1622 7.4353 6.8017 6.2469 5.7590 5.3282 4.9464 4.6065 4.3030 4.0310
10 8.9826 8.1109 7.3601 6.7101 6.1446 5.6502 5.2161 4.8332 4.4941 4.1925
11 9.7868 8.7605 7.8869 7.1390 6.4951 5.9377 5.4527 5.0286 4.6560 4.3271
12 10.5753 9.3851 8.3838 7.5361 6.8137 6.1944 5.6603 5.1971 4.7932 4.4392
13 11.3484 9.9856 8.8527 7.9038 7.1034 6.4235 5.8424 5.3423 4.9095 4.5327
14 12.1062 10.5631 9.2950 8.2442 7.3667 6.6282 6.0021 5.4675 5.0081 4.6106
15 12.8493 11.1184 9.7122 8.5595 7.6061 6.8109 6.1422 5.5755 5.0916 4.6755
16 13.5777 11.6523 10.1059 8.8514 7.8237 6.9740 6.2651 5.6685 5.1624 4.7296
17 14.2919 12.1657 10.4773 9.1216 8.0216 7.1196 6.3729 5.7487 5.2223 4.7746
18 14.9920 12.6593 10.8276 9.3719 8.2014 7.2497 6.4674 5.8178 5.2732 4.8122
19 15.6785 13.1339 11.1581 9.6036 8.3649 7.3658 6.5504 5.8775 5.3162 4.8435
20 16.3514 13.5903 11.4699 9.8181 8.5136 7.4694 6.6231 5.9288 5.3527 4.8696
Suppose that k dollar(s) is collected at the end of each period, with interest compounded at the end of each period
This table gives the present value of k dollar(s) collected at the end of each period for t periods, for various interest rates, r, and numbers of periods, t
Assume the amount of money in dollar(s) is $1
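The entries of Tables 18.22 and 18.23 can be regenerated in a few lines of Python: the single-sum factor is 1/(1 + r)^t, and the annuity factor is the running sum of those single-sum factors (a sketch of ours, checked against the r = 10%, t = 10 entries):

```python
# Reproduce the factors in Tables 18.22 and 18.23.

def discount_factor(r, t):
    """Present value of $1 received t periods ahead: 1/(1+r)^t."""
    return 1 / (1 + r) ** t

def annuity_factor(r, t):
    """Present value of $1 per period for t periods: sum of discount factors."""
    return sum(discount_factor(r, k) for k in range(1, t + 1))

# r = 10%, t = 10: Table 18.22 gives 0.3855, Table 18.23 gives 6.1446
print(round(discount_factor(0.10, 10), 4))  # 0.3855
print(round(annuity_factor(0.10, 10), 4))   # 6.1446
```

Looping the two functions over r = 2%, …, 20% and t = 1, …, 20 reproduces both tables in full.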
19 Capital Budgeting Method Under Certainty and Uncertainty

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_19
19.2.1 Identification Phase

The identification of potential capital expenditures is directly linked to the firm's overall strategic objective; the firm's position within the various markets it serves; government fiscal, monetary, and tax policies; and the leadership of the firm's management. A widely used approach to strategic planning is based on the concept of viewing the firm as a collection, or portfolio, of assets grouped into strategic business units. This approach, called the Business Strategy Matrix, has been developed and used quite successfully by the Boston Consulting Group. It emphasizes market share and market growth rate in terms of stars, cash cows, question marks, and dogs, as shown in Exhibit 19.1.

Exhibit 19.1: Boston Consulting Group, Business Strategy Matrix
Given an organization that follows some sort of strategic planning relative to the Business Strategy Matrix, the most common questions are, How does capital budgeting fit into this framework? and, Are the underlying factors of capital budgeting decisions consistent with the firm's objectives of managing market share?
There are various ways to relate the Business Strategy Matrix to capital budgeting. One of the more appealing is presented in Exhibit 19.2.

Exhibit 19.2: Capital Budgeting and the Business Strategy Matrix

This approach highlights the risk-and-return nature of both capital budgeting and business strategy. As presented, the inclusion of risk in the analysis focuses on the identification of projects such as A, which will add sufficient value (return) to the organization to justify the risk that the firm must take. Because of its high risk and low return, project F will not normally be sought after, nor will the extensive effort be made to evaluate its usefulness. Marginal projects such as B, C, D, and E require careful scrutiny. In the case of projects such as B, with low risk but also low return, there may be justification for acceptance based on capital budgeting considerations, but such projects may not fit into the firm's
strategic plans. On the other hand, projects such as E, which make strategic sense to the organization, may not offer sufficient return to justify the higher risk and so may be rejected by the capital budgeting decision-maker.
To properly identify appropriate projects for management consideration, both the firm's long-run strategic objectives and its financial objectives must be considered. One of the major problems facing the financial decision-maker today is the integration of long-run strategic goals with financial decision-making techniques that produce short-run gains. Perhaps the best way to handle this problem is in the project identification step by considering whether the investment makes sense in light of long-run corporate objectives. If the answer is no, look for more compatible projects. If the answer is yes, proceed to the next step, the development phase.

19.2.2 Development Phase

The development, or information generation, step of the capital budgeting process is probably the most difficult and most costly. The entire development phase rests largely on the type and availability of information about the investment under consideration. With limited data and an information system that cannot provide accurate, timely, and pertinent data, the usefulness of the capital budgeting process will be limited. If the firm does not have a functioning management information system (MIS) that provides the type of information needed to perform capital budgeting analysis, then there is little need to perform such analysis. The reason is the GIGO (garbage-in, garbage-out) problem; garbage (bad data) used in the analysis will result in garbage (bad or useless information) coming out of the analysis. Hence, the establishment and use of an effective MIS are crucial to the capital budgeting process. This may be an expensive undertaking, both in dollars and in human resources, but the improvement in the efficiency of the decision-making process usually justifies the cost.
There are four types of information needed in capital budgeting analysis: (1) the firm's internal data, (2) external economic data, (3) financial data, and (4) nonfinancial data. The actual analysis of the project will eventually rely on firm-specific financial data because of the emphasis on cash flow. However, in the development phase, different types of information are needed, especially when various options are being formulated and considered. Thus, economic data external to the firm such as general economic conditions, product market conditions, government regulation or deregulation, inflation, labor supply, and technological change—play an important role in developing the alternatives. Most of this initial screening data is nonfinancial. But even such nonfinancial considerations as the quality and quantity of the workforce, political activity, competitive reaction, regulation, and environmental concerns must be integrated into the process of selecting alternatives.
Depending on the nature of the firm's business, there are two other considerations. First, different levels of the firm's management require different types of information. Second, as Ackoff (1970) notes, "most managers using a management information system suffer more from an overabundance of irrelevant information than they do from a lack of relevant information."
In a world in which all information and analysis were free, we could conceive of management analyzing every possible investment idea. However, given the cost, in both dollars and time, of gathering and analyzing information, management is forced to eliminate many alternatives based on strategic considerations. This paring down of the number of feasible alternatives is crucial to the success of the overall capital budgeting program. Throughout this process, the manager faces critical questions, such as, Are excellent proposals being eliminated from consideration because of lack of information? and, Are excessive amounts of time and money being spent to generate information on projects that are only marginally acceptable? These questions must be addressed on a firm-by-firm basis. When considered in the global context of the firm's success, these questions are the most important considerations in the capital budgeting process.
After the appropriate alternatives have been determined during the development phase, we are ready to perform the detailed economic analysis, which occurs during the selection phase.

19.2.3 Selection Phase

Because managers want to maximize the firm's value for the shareholders, they need some guidance as to the potential value of the investment projects. The selection phase involves measuring the value, or the return, of the project as well as estimating the risk and weighing the costs and benefits of each alternative to be able to select the project or projects that will increase the firm's value given a risk target.
In most cases, the costs and benefits of an investment occur over an extended period, usually with costs being incurred in the early years of the project's life and benefits being realized over the project's entire life. In our selection procedures, we take this into consideration by incorporating the time value of money. The basic valuation framework, or
normative model, that we will use in the capital budgeting selection process is based on present value, as presented in Eq. 19.1:

PV = Σ_{t=1}^{N} CF_t / (1 + k)^t,  (19.1)

where PV = the present value or current price of the investment; CF_t = the future value or cash flow that occurs in time t; N = the number of years that benefits accrue to the investor; and k = the time value of money or the firm's cost of capital.
By using this framework for the selection process, we are looking explicitly at the firm's value over time. We are not emphasizing short-run or long-run profits or benefits, but are recognizing that benefits are desirable whenever they occur. However, benefits in the near future are more highly valued than benefits far down the road.
The basic normative model (Eq. 19.1) will be expanded to fit various situations that managers encounter as they evaluate investment proposals and determine which proposals are best.

19.2.4 Control Phase

The control phase is the final step of the capital budgeting process. This phase involves placing an approved project on the appropriation budget and controlling the magnitude and timing of expenditures while the project is progressing. A major portion of this phase is the postaudit of the project, through which past decisions are evaluated for the benefit of future capital expenditures.
The firm's evaluation and control system is important not only to the postaudit procedure but also to the entire capital budgeting process. It is important to understand that the investment decision is based on cash flow and relevant costs, while the postaudit is based on accrued accounting and assigned overhead. Also, firms typically evaluate performance based on accounting net income for profit centers within the firm, which may be inaccurate because of the misspecification of depreciation and tax effects. The result is that, while managers make decisions based on cash flow, they are evaluated by an accounting-based system.
In addition to data and measurement problems, the control phase is even more complicated in practice because there is a growing concern that the evaluation, reward, and executive incentive system emphasizes a short-run, accounting-based return instead of the maximization of the long-run value of cash flow. Thus, quarterly earnings per share, or revenue growth, are rewarded at the expense of longer-run profitability. This emphasis on short-run results may encourage management to forego investments in capital stock or research and development that have long-run benefits in exchange for short-run projects that improve earnings per share.
A brief discussion of the differences between accounting-based information and cash flow is appropriate at this point. The first major difference between the financial decision-maker who uses cash flow and the accountant who uses accounting information is one of time perspective. Exhibit 19.3 shows the differences in time perspective between financial decision-makers and accountants.

Exhibit 19.3: Relevant Time Perspective
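The normative model of Eq. (19.1) translates directly into a few lines of Python (a sketch of ours; the sample cash flows are illustrative, not from the text):

```python
# Present value per Eq. (19.1): PV = sum of CF_t / (1 + k)^t for t = 1..N.

def present_value(cash_flows, k):
    """cash_flows[0] is CF_1, so discounting starts at t = 1."""
    return sum(cf / (1 + k) ** t for t, cf in enumerate(cash_flows, start=1))

# $110 in one year plus $121 in two years, at k = 10%, are each worth $100 today
print(round(present_value([110, 121], 0.10), 2))  # 200.0
```

Subtracting the initial outlay CF_0 from this present value gives the project's NPV, the selection criterion developed later in the chapter.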
As seen in Exhibit 19.3, the financial decision-maker is concerned with future cash flows and value, while the accountant is concerned with historical costs and revenue. The financial decision-maker faces the question, What will I do? while the accountant asks, How did I do?
The second problem is one of definition. The financial decision-maker is concerned with economic income, or a change in wealth. For example, if you purchase a share of stock for $10 and later sell the stock for $30, from a financial viewpoint you have gained $20 of value. It is easy to measure economic income in this case. However, when we look at a firm's actual operations, the measurement of economic income becomes quite complicated.
The accountant is concerned with accounting income, which is measured by the application of generally accepted accounting principles. Accounting income is the result of essential but arbitrary judgments concerning the matching of revenues and expenses during a particular period. For example, revenue may be recognized when goods are sold, shipped, or invoiced, or on receipt of the customer's check. A financial analyst and an accountant would likely differ on when revenue is recognized.
Clearly, over long periods economic value and accounting income converge and are equal because the problems of allocation to particular time periods disappear. However, over short periods, there can be significant differences between these two measures. The financial decision-maker should be concerned with the value added over the life of the project, even though the postaudit report of results is an accounting report based on only one quarter or one year of the project's life. To incorporate a long-run view of value creation, the firm must establish a relationship between its evaluation system, its reward or management incentive system, and the normative goals of the capital budgeting system.
Another area of importance in the control or postaudit phase is the decision to terminate or abandon a project once it has been accepted. Too often we consider capital budgeting as only the acquisition of investments for their entire economic life. The possibility of abandoning an investment prior to the end of its estimated useful or economic life has important implications for the capital budgeting decision. The possibility of abandonment expands the options available to management and reduces the risk associated with decisions based on holding an asset to the end of its economic life. This form of contingency planning gives the financial decision-maker and management a second chance to deal with the economic and political uncertainties of the future.
At any point, to justify the continuation of a project, the project's value from future operations must be greater than its current abandonment value. Given the recent increase in the number and frequency of divestitures, many firms now give greater consideration to abandonment questions in their capital budgeting decision-making. An ideal time to reassess the value of an ongoing investment is at regular intervals during the postaudit.

19.3 Cash-Flow Evaluation of Alternative Investment Projects

Investment should be undertaken by a firm only if it will increase the value of shareholders' wealth. Theoretically, Fama and Miller (1972) and Copeland et al. (2004) show that the investment decisions of the firm can be separated from the individual investor's consumption–investment decision in a perfect capital market. This is known as Fisher's (1930) separation theorem. With perfect capital markets, the manager will increase shareholder wealth if he or she chooses projects with a rate-of-return greater than the market-determined rate-of-return (cost of funds), regardless of the shape of individual shareholders' indifference curves. The ability to borrow or lend in perfect capital markets leads to a higher wealth level for investors than they would be able to achieve without capital markets. This ability also leads to optimal production decisions that do not depend on individual investors' resources and preferences. Thus, the investment decision of the firm is separated from the individual's decision concerning current consumption and investment. The investment decision will therefore depend only on equating the rate-of-return of production possibilities with the market rate-of-return.
This separation principle implies that the maximization of the shareholders' wealth is identical to maximizing the present value of their lifetime consumption. Under these circumstances, different shareholders of the same firm will be unanimous in their preference. This is known as the unanimity principle. It implies that the managers of a firm, in their capacity as agents for shareholders, need not worry about making decisions that reconcile differences of opinion among shareholders: All shareholders will have identical interests. In fact, the price system by which profit is measured conveys the shareholders' unanimously preferred production decisions to the firm.
Looked at in another way, the use of investment decision rules, or capital budgeting, is really an example of a firm attempting to realize the economic principle of operating at the point where marginal cost equals marginal revenue to maximize shareholder wealth. In terms of investment decisions, the "marginal revenue" is the rate-of-return on investment projects, which must be equated with the marginal cost, or the market-determined cost of capital.
Investment decision rules, or capital budgeting, involve the evaluation of the possible capital investments of a firm according to procedures that will ensure the proper
comparison of the cost of the project, that is, the initial and continuing outlays for the project, with the benefits, the expected cash flows accruing from the investment over time. To compare the two cash flows, future cash amounts must be discounted to the present by the firm's cost of capital. Only in this way will the cost of funds to the firm be equated with the benefits from the investment project.
The firm generally receives funds from creditors and shareholders. Both fund suppliers expect to receive a rate-of-return that will compensate them for the level of risk they take. Hence, the discount rate used to discount the cash flow should be the weighted-average cost of debt and equity. In Chap. 10, we will discuss the weighted cost of capital with tax effects in detail.
The weighted-average cost of capital is the same as the market-determined opportunity cost of funds provided to the firm. It is important to understand that projects undertaken by firms must earn enough cash to compensate creditors and shareholders for their expected risk-adjusted rate-of-return. If the present value of the project's cash flows, discounted at the weighted-average cost of capital, is larger than the initial investment, then the project adds to shareholders' wealth. Copeland et al. (2004) demonstrated that maximizing shareholders' wealth is equivalent to maximizing the discounted cash flows provided by the investment project.
Before any capital-budgeting techniques can be surveyed, a rigorous definition of cash flows to a firm from a project must be undertaken. First, the decision-maker must consider only those future cash flows that are incremental to the project; that is, only those cash flows accruing to the firm that are specifically caused by the project in question. In addition, any decrease in cash flows to the company caused by the project in question (e.g., the loss of the tax-depreciation benefit from a machine replaced by a new one) must be considered as well. The main advantage of using the cash-flow procedure in capital-budgeting decisions is that it avoids the difficult problem underlying the measurement of corporate income associated with the accrual method of accounting, for example, the selection of depreciation methods and inventory-valuation methods.
It is well known that the equality between sources and uses of funds for an all-equity firm in period t can be defined as

R_t + N_t P_t = N_t d_t + WSMS_t + I_t,  (19.2)

where
R_t = Revenue in period t,
N_t P_t = New equity in period t,
N_t d_t = Total dividend payment in period t,
WSMS_t = Wages, salaries, materials, and service payment in period t, and
I_t = Investment in period t.
Equation (19.2) is the basic equation to be used to determine the cash flow for capital-budgeting determination. Second, the definition of cash flow relevant to financial decision-making involves finance rather than accounting income. Accounting regulations attempt to adjust cash flows over several periods (e.g., the expense of an asset is depreciated over several time periods); finance cash flows are calculated as they occur to the firm. Thus, the cash outlay (I_t) to purchase a machine is considered a cash outflow in the finance sense when it occurs at acquisition.
To illustrate the actual calculations involved in defining the cash flows accruing to a firm from an investment project, we consider the following situation. A firm is faced with a decision to replace an old machine with a new and more efficient model. If the replacement is made, the firm will increase production sufficiently each year to generate $10,000 in additional cash flows to the company over the life of the machine. Thus, the before-tax cash flow accruing to the firm is $10,000.
The cash flow must be adjusted for the net increase in income taxes that the firm must now pay due to the increased net depreciation of the new machine. The annual straight-line depreciation for the new machine over its 5-year life will be $2,000, and we assume no terminal salvage value. The old machine has a current book value of $5,000 and a remaining depreciable life of 5 years with no terminal salvage value. Thus, the incremental annual depreciation will be the annual depreciation charges of the new machine, $2,000, less the annual depreciation of the old, or $1,000. The additional taxable income to the firm from the new machine is then the $10,000 cash flow less the incremental depreciation of $1,000, or $9,000. The increased tax outlay from the acquisition will then be (assuming a 50% corporate income tax rate) 0.50 × $9,000, or $4,500. Adjusting the gross annual cash flow of $10,000 by the incremental tax expense of $4,500 gives $5,500 as the net cash flow accruing to the firm from the new machine. It should be noted that corporate taxes are a real outflow and must be taken into account when evaluating a project's desirability. However, the depreciation allowance (dep) is not a cash outflow and therefore should not be subtracted from the annual cash flow.
The calculations of post-tax cash flow mentioned above can be summarized in Eq. (19.3):

Annual After-Tax Cash Flow = ICFBT − (ICFBT − Δdep)s = ICFBT(1 − s) + (Δdep)s  (19.3)

where
ICFBT = Annual incremental operating cash flows,
s = Corporate tax rate, and
19.4 Alternative Capital-Budgeting Methods 409
Δdep = Incremental annual depreciation charge, i.e., the annual depreciation charges on the new machine less the annual depreciation on the old.

Following Eq. (19.3), ICFBT can be defined in Eq. (19.4) as

ICFBT = ΔR_t − ΔWSMS_t.   (19.4)

Note that ICFBT is an amount before interest and depreciation are deducted, and Δ indicates the change in the related variables. The reason is that, when discounting at the weighted cost of capital, we implicitly assume that the project will return the expected interest payments to creditors and the expected dividends to shareholders.

Alternative depreciation methods will change the time pattern but not the total amount of the depreciation allowance. Hence, it is important to choose the optimal depreciation method. To do this, the net present value (NPV) of the tax benefits due to the tax deductibility of the depreciation allowance can be defined as

NPV(tax benefit) = τ_c Σ_{t=1}^{N} dep_t / (1 + k)^t,

where dep_t = depreciation allowance in period t and N = life of the project; dep_t will depend upon whether the straight-line, double-declining-balance, or sum-of-the-years'-digits method is used.

The net cash inflow in period t (C_t) used for the capital-budgeting decision can be defined as

C_t = CF_t − τ_c (CF_t − dep_t − I_t),   (19.5)

where CF_t = Q_t (P_t − V_t); Q_t = quantity produced and sold; P_t = price per unit; V_t = variable costs per unit; dep_t = depreciation; τ_c = tax rate; and I_t = interest expense.

Table 19.1 Initial cost and net cash inflow for four projects

Year    A      B      C      D
0      −100   −100   −100   −100
1       20      0     30     25
2       80     20     50     40
3       10     60     60     50
4      −20    160     80    115

Since these are mutually exclusive investment projects, only one project can be accepted, according to the following capital-budgeting methods.

19.4.1 Accounting Rate-of-Return

In this method, a rate-of-return for the project is computed by using average net income and average investment outlay. The method does not incorporate the time value of money or cash flow. The ARR takes the ratio of the investment's average annual net income after taxes to either the total outlay or the average outlay. That is, the accounting rate-of-return method averages the after-tax profit from the investment over its life and divides by the initial outlay:

ARR = ( Σ_{t=0}^{N} AP_t / N ) / I,   (19.6)

where
AP_t = After-tax profit in period t,
I = Initial investment, and
N = Life of the project.

By assuming that the data in Table 19.1 are accounting profits and that the depreciation is $25, the accounting rates-of-return for the four projects are
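Equation (19.6) is straightforward to implement. Because this excerpt breaks off before listing the four ARR figures, the sketch below simply implements the formula; the treatment of the $25 depreciation (subtracted from each year's net cash inflow to obtain accounting profit) is our assumption, not stated explicitly above:

```python
def arr(flows, dep):
    """Accounting rate-of-return, Eq. (19.6): average after-tax profit
    over the project's life divided by the initial investment.
    Assumes after-tax profit = net cash inflow - depreciation
    (an interpretation, not spelled out in the chapter)."""
    outlay = -flows[0]                       # year-0 entry is the outlay
    profits = [cf - dep for cf in flows[1:]]
    return (sum(profits) / len(profits)) / outlay

# Project A from Table 19.1: outlay 100, inflows 20, 80, 10, -20.
print(arr([-100, 20, 80, 10, -20], dep=25))   # -0.025, i.e., -2.5%
```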
19.4.2 Internal Rate-of-Return Method

The internal rate-of-return (IRR, r) is the discount rate that equates the discounted cash flows from a project to its investment. Thus, one must solve iteratively for r in Eq. (19.7):

Σ_{t=1}^{N} CF_t / (1 + r)^t = I,   (19.7)

where
CF_t = Cash flow (positive or negative) in period t,
I = Initial investment, and
N = Life of the project.

The IRRs for the four projects in Table 19.1 are

Project A: IRR does not exist (since the cash flows are less than the initial investment),
Project B: 28.158%,
Project C: 33.991%, and
Project D: 32.722%.

Since the four projects are mutually exclusive and Project C has the highest IRR, we will choose Project C.

The IRR is then compared to the cost of capital of the firm to determine whether the project will return benefits greater than its cost. A consideration of the advantages and disadvantages of the IRR method will be undertaken when it is compared to the net present value method.

Although there are several problems in using the payback method as a capital-budgeting method, the reciprocal of the payback period is related to the internal rate-of-return of the project when the life of the project is very long. For example, assume an investment project has an initial outlay of I and an annual cash flow of R. The payback period is I/R and its reciprocal is R/I. On the other hand, the internal rate-of-return (r) of the project can be written as

r = R/I − (R/I) [1 / (1 + r)^N],   (19.8)

where r is the internal rate-of-return and N is the life of the project in years. Clearly, when N approaches infinity, the reciprocal of the payback period, R/I, will approximate the internal rate-of-return. The payback method provides a liquidity measure, i.e., sooner is better than later.

Equation (19.8) is a special case of the internal rate-of-return formula defined in Eq. (19.7). By assuming equal annual net receipts and zero salvage value, Eq. (19.7) can be rewritten as

I = [R / (1 + r)] [1 + 1/(1 + r) + 1/(1 + r)² + … + 1/(1 + r)^{N−1}],   (19.7′)

where R = CF_1 = CF_2 = … = CF_N. Summing the geometric series within the square brackets and rearranging terms, we obtain Eq. (19.8).
Project A: 2.0 years,
Project B: 3.125 years,
Project C: 2.33 years, and
Project D: 2.70 years.

If we use the payback method, we will choose Project A. Several problems can arise if a decision-maker uses the payback method. First, any cash flows accruing to the firm after the payback period are ignored. Second, and most importantly, the method disregards the time value of money: a cash flow returned in the later years of the project's life is weighted equally with the cash flows received earlier.

where k = the appropriate discount rate, and all other terms are defined as above.

The NPV method can be applied to the cash flows of the four projects in Table 19.1. By assuming a 12% discount rate, the NPVs for the four projects are as follows:

Project A: −23.95991,
Project B: 60.33358,
Project C: 60.19367, and
Project D: 62.88278.

Since Project D has the highest NPV, we will select Project D as the best one.
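The payback periods, NPVs, and IRRs quoted above for the projects in Table 19.1 can be reproduced with a short Python sketch (the bisection IRR solver below is our own illustration, not code from the chapter):

```python
projects = {            # cash flows from Table 19.1, year 0 through year 4
    "A": [-100, 20, 80, 10, -20],
    "B": [-100, 0, 20, 60, 160],
    "C": [-100, 30, 50, 60, 80],
    "D": [-100, 25, 40, 50, 115],
}

def npv(rate, flows):
    """NPV of flows, with flows[0] the (negative) year-0 outlay."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))

def payback(flows):
    """Fractional payback period: years until cumulative inflows recover the outlay."""
    remaining = -flows[0]
    for t, cf in enumerate(flows[1:], start=1):
        if cf >= remaining:
            return t - 1 + remaining / cf
        remaining -= cf
    return None  # outlay never recovered

def irr(flows, lo=0.0001, hi=10.0, tol=1e-9):
    """Bisection IRR; assumes NPV changes sign exactly once on [lo, hi]."""
    if npv(lo, flows) * npv(hi, flows) > 0:
        return None  # no sign change: IRR does not exist in this range
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if npv(mid, flows) * npv(lo, flows) > 0 else (lo, mid)
    return mid

for name, flows in projects.items():
    print(name, payback(flows), round(npv(0.12, flows), 5), irr(flows))
```

Project A's IRR comes back as None, matching the text's observation that no IRR exists for its cash-flow pattern.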
Clearly, the NPV method explicitly considers both the time value of money and the economic cash flows. It should be noted that this conclusion is based upon a discount rate of 12%; if the discount rate is either higher or lower than 12%, the conclusion may not hold. This issue can be resolved by crossover-rate analysis, which can be found in Appendix 19.2, where we analyze projects A and B for different cash flows and different discount rates. The main conclusion of Appendix 19.2 can be summarized as follows: NPV(B) is higher at low discount rates and NPV(A) is higher at high discount rates, because the cash flows of project A occur early while those of project B occur later. If we expect a high discount rate, we favor project A; if a low discount rate is expected, project B will be chosen. To make the right choice, we can calculate the crossover rate: if the discount rate is higher than the crossover rate, we should choose project A; otherwise, we should choose project B.

Based upon the concept of break-even analysis discussed in Eq. (2.6) of Chap. 2, we can determine the units of product that must be produced in order for NPV to be zero. If CF_1 = CF_2 = … = CF_N = CF and NPV = 0, then Eq. (19.9) can be rewritten as

CF [ Σ_{t=1}^{N} 1/(1 + k)^t ] = I.   (19.9′)

By substituting the definition of CF given in Eq. (19.5) into Eq. (19.9′), we obtain the break-even point (Q*) for capital budgeting:

Q* = { [ I / Σ_{t=1}^{N} 1/(1 + k)^t − (dep)τ_c ] / (1 − τ_c) } × 1/(p − v).   (19.10)

A real-world application of the NPV method to break-even analysis can be found in Reinhardt (1973) and in Chap. 13 of Lee and Lee (2017). Managers' views on alternative capital-budgeting methods and related practical issues are presented in Appendix 19.1.

19.4.5 Profitability Index

should be accepted first. Obviously, the PI considers the time value of money and the correct finance cash flows, as does the NPV method. Further, the PI and NPV methods lead to identical decisions except when ranking mutually exclusive projects and/or under capital rationing. When considering mutually exclusive projects, the PI can lead to a decision different from that derived by the NPV method. For example:

Project   Initial outlay   Present value of cash inflows   NPV   PI
A         100              200                             100   2
B         1000             1300                            300   1.3

Projects A and B are mutually exclusive. Project A has a lower NPV and a higher PI than Project B; the PI method would therefore select Project A, while the NPV method would select Project B. In the case shown here, the NPV and PI rankings differ because of the differing scale of investment: the NPV subtracts the initial outlay, while the PI divides by it. Thus, differing initial investments can cause a difference in ranking between the two methods.

The firm that desires to maximize its absolute present value rather than its percentage return will prefer Project B, because the NPV of Project B ($300) is greater than the NPV of Project A ($100). Thus, the PI method should not be used as a measure of investment worth for projects of differing sizes where mutually exclusive choices have to be made. In other words, if there exist no other investment opportunities, the NPV is the superior method in this case because, under the NPV, the highest-ranking investment project (the one with the largest NPV) adds the most value to shareholders' wealth. Since this is the objective of the firm's owners, the NPV leads to a more accurate decision.
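The conflict between the PI and NPV rankings is easy to reproduce; in the sketch below the PI is taken as the present value of cash inflows divided by the initial outlay, using the two projects from the table:

```python
def profitability_index(pv_inflows, outlay):
    """PI = present value of cash inflows / initial outlay."""
    return pv_inflows / outlay

def net_present_value(pv_inflows, outlay):
    """NPV = present value of cash inflows - initial outlay."""
    return pv_inflows - outlay

# Projects from the example: (initial outlay, PV of cash inflows).
projects = {"A": (100, 200), "B": (1000, 1300)}

for name, (outlay, pv) in projects.items():
    print(name, net_present_value(pv, outlay), profitability_index(pv, outlay))

# A ranks first by PI (2.0 > 1.3) but B ranks first by NPV (300 > 100):
# the criteria disagree because of the differing scale of investment.
```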
19.5.1 Basic Concepts of Linear Programming

Linear programming is a mathematical technique used to find optimal solutions to problems of a firm involving the allocation of scarce resources among competing activities. Mathematically, the type of problem that linear programming can solve is one in which both the objective of the firm to be maximized (or minimized) and the constraints limiting the firm's actions are linear functions of the decision variables involved. Thus, the first step in using linear programming as a tool for financial decisions is to model the problem facing the firm in linear programming form. To construct the linear programming model, one must take the following steps.

First, identify the controllable decision variables involved in the firm's problem. Second, define the objective or criterion to be maximized or minimized and represent it as a linear function of the controllable decision variables. In finance, the objective generally is to maximize the profit contribution or the market value of the firm, or to minimize the cost of production. Third, define the constraints and express them as linear equations or inequalities of the decision variables. This will usually involve (a) a determination of the capacities of the scarce resources involved in the constraints and (b) a derivation of a linear relationship between these capacities and the decision variables.

Symbolically, then, if X1, X2, …, Xn represent the quantities of output, the linear programming model takes the general form:

Maximize (or minimize) Z = c1 X1 + c2 X2 + … + cn Xn,   (19.12)

subject to:

a11 X1 + a12 X2 + … + a1n Xn ≤ b1
a21 X1 + a22 X2 + … + a2n Xn ≤ b2
⋮
am1 X1 + am2 X2 + … + amn Xn ≤ bm
Xj ≥ 0 (j = 1, 2, …, n).

greater-than-or-equal-to. Second, the solution values of the decision variables are divisible; that is, a solution would permit x_j = 1/2, 1/4, etc. If such fractional values are not possible, the related technique of integer programming, which yields only whole numbers as solutions, can be applied. Third, the constant coefficients are assumed known and deterministic (fixed). If the coefficients have probabilistic distributions, one of the various methods of stochastic programming must be used. Examples of the application of linear programming to the areas of capital rationing and capital budgeting are given below.

19.5.2 Capital Rationing

The XYZ Company produces products A, B, and C within the same product line, with sales totaling $37 million last year. Top management has adopted the goal of maximizing shareholder wealth, which to them is represented by the gain in share price. Wickwire plans to finance all future projects with internal or external equity; funds available from the equity market depend on the share price in the stock market for the period.

Three new projects were proposed to the Finance Committee, for which the following net after-tax annual funds flows are forecast:

Project   Year 0   Year 1   Year 2   Year 3   Year 4   Year 5
X         −100      30       30       60       60       60
Y         −200      70       70       70       70       70
Z         −100     −240     −200     400      300      300

All three projects involve financing cost-saving equipment for well-established product lines; adoption of any one project does not preclude adoption of any other. The following NPV formulations have been prepared by using a discount rate of 12%.
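The NPV coefficients used in the capital-rationing model (65.585 for X, 52.334 for Y, and 171.871 for Z) follow from discounting the forecast flows at 12%; a quick Python check:

```python
def npv(flows, k=0.12):
    """Discount a year-0-through-year-5 flow series at rate k."""
    return sum(cf / (1 + k) ** t for t, cf in enumerate(flows))

# Forecast net after-tax annual funds flows from the table above.
X = [-100, 30, 30, 60, 60, 60]
Y = [-200, 70, 70, 70, 70, 70]
Z = [-100, -240, -200, 400, 300, 300]

print(round(npv(X), 3), round(npv(Y), 3), round(npv(Z), 3))
# 65.585 52.334 171.871
```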
−30X − 70Y + 240Z − C + D + 0E = 70,
−30X − 70Y + 200Z + 0C − D + E = 50.

Here, D and E are included in the second and third constraints, ensuring that idle funds unused in one period are carried over to the succeeding period. In addition, to prevent the program from repeatedly selecting only one project (the "best") until funds are exhausted, three additional constraints are needed:

X ≤ 1, Y ≤ 1, Z ≤ 1.

The solution to the model is V = $208.424. The process of solving this linear program with Excel is illustrated in Appendix 19.1.

To give an indication of the value of relaxing the funds constraint in any period (the most the firm would be willing to pay for additional financing), the shadow prices of the funds constraints are given below:

Funds constraint   Shadow price

Capital budgeting frequently incorporates the concept of probability theory. To illustrate, consider two projects, project x and project y, and three states of the economy (prosperity, normal, and recession) for any given time. For each of these states, we may calculate a probability of occurrence and estimate the respective returns, as indicated in Table 19.2.

The expected returns for projects x and y can be calculated by Eq. (19.13):

k̄ = Σ_i k_i p_i   (19.13)

k̄_x = 6.25% + 7.50% + 1.25% = 15.00%
k̄_y = 10% + 7.50% − 2.50% = 15.00%

and the standard deviation of these returns can be found through Eq. (19.14):

σ = [ Σ_{i=1}^{n} (k_i − k̄)² p_i ]^{1/2}   (19.14)
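Equations (19.13) and (19.14) are easy to check in Python. Table 19.2 is not reproduced in this excerpt, so the state probabilities and returns below are assumed values, chosen only because they are consistent with the products quoted in the text (e.g., 0.25 × 25% = 6.25%):

```python
from math import sqrt

def expected_return(returns, probs):
    """Eq. (19.13): k-bar = sum of k_i * p_i."""
    return sum(k * p for k, p in zip(returns, probs))

def std_dev(returns, probs):
    """Eq. (19.14): sigma = sqrt(sum of (k_i - k-bar)^2 * p_i)."""
    kbar = expected_return(returns, probs)
    return sqrt(sum((k - kbar) ** 2 * p for k, p in zip(returns, probs)))

probs = [0.25, 0.50, 0.25]      # prosperity, normal, recession (assumed)
x = [25.0, 15.0, 5.0]           # project x returns in percent (assumed)
y = [40.0, 15.0, -10.0]         # project y returns in percent (assumed)

print(expected_return(x, probs), expected_return(y, probs))  # 15.0 15.0
print(std_dev(x, probs), std_dev(y, probs))                  # y's is larger
```

Both projects show the same 15% expected return, but project y's larger standard deviation marks it as the riskier project, exactly as the discussion below concludes.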
Data in Table 19.2 can be used to draw histograms of projects x and y, as depicted in Fig. 19.1. If we assume that rates of return (k) are distributed continuously and normally, then Fig. 19.1a can be redrawn as Fig. 19.1b.

The concept of a statistical probability distribution can be combined with capital budgeting to derive the statistical distribution method for selecting risky investment projects. The expected return for both projects is 15%, but because project y has a flatter distribution with a wider range of values, it is the riskier project. Project x has a normal distribution with a larger collection of values nearer the 15% expected rate-of-return and is therefore more stable.

19.6.1 Statistical Distribution of Cash Flow

From Eq. (19.5) of this chapter, the equation for net cash inflow can be explicitly defined as

C_t = CF_t − τ_c (CF_t − dep_t − I_t),

where CF_t = Q_t (P_t − V_t); C_t = net cash flow in period t; Q_t = quantity produced and sold; P_t = price; V_t = variable costs; dep_t = depreciation; τ_c = tax rate; and I_t = interest expense. In this equation, net cash flow is a random variable because Q, P, and V are not known with certainty. We can assume that C_t has a normal distribution.

If two projects have the same expected cash flow, or return, as determined by the expected value (Eq. 19.13), we may be indifferent between the two projects if we base our choice solely on return. However, if we also take risk into account, we get a more accurate picture of the type of distribution to expect, as shown in Fig. 19.1.

With the introduction of risk, a firm is not necessarily indifferent between two investment proposals having equal NPV. Both the NPV and its standard deviation (σ_NPV) should be
estimated in performing capital-budgeting analysis under uncertainty. NPV under uncertainty is defined as

NPV = Σ_{t=1}^{N} C̃_t/(1 + k)^t + S/(1 + k)^N − I_0,   (19.15)

where C̃_t = uncertain net cash flow in period t; k = risk-adjusted discount rate; S = salvage value; and I_0 = initial outlay.

The mean of the NPV distribution and its standard deviation are defined as

Mean NPV = Σ_{t=1}^{N} C̄_t/(1 + k)^t + S/(1 + k)^N − I_0   (19.16)

σ_NPV = [ Σ_{t=1}^{N} σ_t²/(1 + k)^{2t} ]^{1/2}   (19.17)

for cash flows that are mutually independent (ρ = 0). The generalized case of Eqs. (19.16) and (19.17) is explored in Appendix 19.2.

Example 19.1 A firm is considering two new product lines, projects A and B, with the same life, mean returns, and salvage value, as indicated in Table 19.3. Under the certainty methods (this chapter), both projects would have the same NPV:

NPV_A = NPV_B = Σ_{t=1}^{5} C_t/(1 + k)^t + S/(1 + k)^5 − I_0

NPV = 20(PVIF_{10%,1}) + 20(PVIF_{10%,2}) + 20(PVIF_{10%,3}) + 20(PVIF_{10%,4}) + 20(PVIF_{10%,5}) − 60 + 5(PVIF_{10%,5})
    = 20(.9091) + 20(.8264) + 20(.7513) + 20(.6830) + 20(.6209) − 60 + 5(.6209)
    = 19.90

However, because the standard deviations of project A's cash flows are greater than project B's, project A is riskier than project B. This difference can only be explicitly evaluated by using the statistical distribution method. To examine the riskiness of the two projects, we can calculate the standard deviation of their NPVs. If cash flows are perfectly positively correlated over time, then the standard deviation of NPV (σ_NPV) can be simplified as¹

σ_NPV = Σ_{t=1}^{N} σ_t/(1 + k)^t   (19.17a)

σ_NPV(A) = ($4)PVIF_{10%,1} + ($4)PVIF_{10%,2} + … + ($4)PVIF_{10%,5}
         = (4)(.9091) + (4)(.8264) + (4)(.7513) + (4)(.6830) + (4)(.6209)
         = 15.16, or $15,160

σ_NPV(B) = ($2)PVIF_{10%,1} + ($2)PVIF_{10%,2} + … + ($2)PVIF_{10%,5}
         = (2)(.9091) + (2)(.8264) + (2)(.7513) + (2)(.6830) + (2)(.6209)
         = 7.58, or $7,580

With the same NPV, project B's cash flows would fluctuate by $7,580 per year, while project A's would fluctuate by $15,160. Therefore, project B would be preferred, given the same returns, because it is less risky.

Lee and Wang (2010) provide a fuzzy real-option valuation approach to the capital-budgeting decision in an uncertain environment. In Wang and Lee's model framework, the concept of probability is employed to describe fuzzy events under estimated cash flows based on fuzzy numbers, which can better reflect the uncertainty in the project. Using fuzzy real-option valuation, managers can select fuzzy projects and determine the optimal time to abandon a project under the assumption of a limited capital budget. Lee and Lee (2017) discuss this in detail in Chap. 14.

¹ Equation (19.17a) is a special case of Eq. (19.19).
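The σ_NPV figures for projects A and B follow directly from Eq. (19.17a); the sketch below also contrasts them with the independent-cash-flows case of Eq. (19.17):

```python
def sigma_npv_correlated(sigmas, k):
    """Eq. (19.17a): perfectly positively correlated cash flows."""
    return sum(s / (1 + k) ** t for t, s in enumerate(sigmas, start=1))

def sigma_npv_independent(sigmas, k):
    """Eq. (19.17): mutually independent cash flows."""
    return sum(s ** 2 / (1 + k) ** (2 * t)
               for t, s in enumerate(sigmas, start=1)) ** 0.5

k = 0.10
print(sigma_npv_correlated([4] * 5, k))   # ~15.16, i.e., $15,160 (project A)
print(sigma_npv_correlated([2] * 5, k))   # ~7.58, i.e., $7,580 (project B)
# Independence lowers risk relative to perfect positive correlation:
print(sigma_npv_independent([4] * 5, k))
```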
416 19 Capital Budgeting Method Under Certainty and Uncertainty
Table 19.6 Uniformly distributed random numbers

06,433 80,674 24,520 18,222 10,610 05,794 37,515 48,619 02,866
39,208 47,829 72,648 37,414 75,755 01,717 29,899 78,817 03,500
89,884 59,051 67,533 08,123 17,730 95,862 08,034 19,473 03,071
61,512 32,155 51,906 61,662 64,130 16,688 37,275 51,262 11,569
99,653 47,635 12,506 88,535 36,553 23,757 34,209 55,803 96,275
95,913 11,045 13,772 76,638 48,423 25,018 99,041 77,529 81,360
55,804 44,004 13,112 44,115 01,691 50,541 00,147 77,685 58,788
35,334 82,410 91,601 40,617 72,876 33,967 73,830 15,405 96,554
59,729 88,646 76,487 11,622 96,297 24,160 09,903 14,041 22,917
57,383 89,317 63,677 70,119 94,739 25,875 38,829 68,377 43,918
30,574 06,039 07,967 32,422 76,791 39,725 53,711 93,385 13,421
81,307 13,314 83,580 79,974 45,929 85,113 72,208 09,858 52,104
02,410 96,385 79,007 54,039 21,410 86,980 91,772 93,307 34,116
18,969 87,444 52,233 62,319 08,598 09,066 95,288 04,794 01,534
87,803 80,514 66,800 62,297 80,198 19,347 73,234 86,265 49,096
68,397 10,538 15,438 62,311 72,844 60,203 46,412 05,943 79,232
28,520 54,247 58,729 10,854 99,058 18,260 38,765 90,038 94,200
44,285 09,452 15,867 70,418 57,012 72,122 36,634 97,283 95,943
80,299 22,510 33,517 23,309 57,040 29,285 07,870 21,913 72,958
84,842 05,748 90,894 61,658 15,001 94,055 36,308 41,161 37,341
1. Draw a random number from Table 19.6. It doesn't matter exactly where in the table the numbers are picked, as long as the pattern for drawing numbers is consistent and unvaried; for example, the first two numbers of row 1, then row 2, then row 3, and so forth.
2. In Table 19.5, find the random-number interval associated with the random number chosen from Table 19.6.
3. Find the weekly demand (Dn) in Table 19.5 that corresponds to the random number (RN).
4. Calculate the amount sold (Sn). If Dn > Qn, then Sn = Qn; if Dn < Qn, then Sn = Dn.
5. Calculate the weekly profit [Pn = (Sn × P) − (Qn × C)].
6. Repeat steps 1 to 5 until 20 weeks have been simulated.

The results of the above procedure are summarized in Table 19.7. There are nine columns in Table 19.7: column a represents the week, column b the random number, column c the weekly demand, column d the amount ordered for the nth week under alternative A, column e the sales under alternative A, column f the profit of the nth week under alternative A, column g the amount ordered for the nth week under alternative B, column h the sales under alternative B, and column i the profit of the nth week under alternative B.

We will now explain how the random numbers in column b were obtained. The first nine random numbers were taken from the first two digits of the random numbers in row 1 of Table 19.6. The second nine were obtained from the first two digits of the random numbers in row 6 of Table 19.6. The last three are from the first two digits of the first three random numbers in row 11.

Column c is the demand for the nth week. The first entry, the demand for Week 0, is 550, which is the average number from Table 19.5. The first random number, 06, falls in the first random-number interval of Table 19.5; therefore, the demand is 350. The second random number, 80, falls in the fourth interval; therefore, the demand is 650. Similarly, we can obtain the other demands in column c. Column d, the quantity ordered under alternative A, equals the demand of the previous week. Columns e and h represent the amount sold in week n under alternatives A and B, respectively; this number is determined in accordance with step 4 above. Column g, the weekly amount ordered under alternative B, is the average number (550) from Table 19.5. Columns f and i represent the weekly profit for alternatives A and B, respectively, calculated using the profit formula in step 5.

Through simulation, we can see that, because there would be fewer machine parts in inventory, the firm would earn, on average, an additional $667 per week using alternative B rather than alternative A. This is because an average of about 29 more machines are sold per week. Through the simulation of these two ordering policies, we have found that alternative B is the better of the two, but not necessarily the optimal choice. We may run simulations for other decision alternatives and choose among these.

A simulation model is a representation of a real system, wherein the system's elements are depicted by arithmetic or logical processes. These processes are then executed, either manually, as illustrated in Example 19.2, or by computer for more complicated models, to examine the dynamic properties of the system. Simulation of the actual operation of a system tests the performance of that specific system. For this reason, simulation models must be custom-made for each situation.
Example 19.2 is a specific production-management problem and serves as a learning tool for manual simulation. Simulation models have also been developed for capital-budgeting decisions, and by way of Example 19.2 we can see how such models can be utilized at the financial analysis and planning level.

19.7.1 Simulation Analysis and Capital Budgeting

The following example shows how the simulation model developed by Hertz (1964, 1979) can be used in capital budgeting. Here we consider a firm that intends to introduce a new product; the 11 input variables thought to determine project value are shown in Table 19.8. Of these inputs, variables 1–9 are specified as random variables (that is, there is no predetermined sequence or order for their occurrence) with ranges as listed in the table. We could add a random element to variables 10 and 11, but the computational complexity and the insights gained would not justify the effort. Also, for ease of modeling, we use a uniform distribution to describe the probability of any particular outcome in a specified range. By using a set range for each of the nine random variables, we are not actually allowing the probabilities of each possible outcome to vary, but the spirit of varying probabilities is embedded in the simulation approach. One further qualification of our model is that the life of the facilities is restricted to an integer value within the range specified in Table 19.8.

The uniform distribution density function² can be written as

² For a more detailed discussion of the properties of the uniform density function, see Hamburg (1983, pp. 100–101). Other, more realistic distributions, such as the log-normal and normal distributions, can be used to improve the empirical results of this kind of simulation.
Table 19.8 Variables for simulation

Variable                                  Range
1. Market size (units)                    2,500,000–3,000,000
2. Selling price ($/unit)                 40–60
3. Market growth                          0–5%
4. Market share                           10–15%
5. Total investment required ($)          8,000,000–10,000,000
6. Useful life of facilities (years)      5–9
7. Residual value of investment ($)       1,000,000–2,000,000
8. Operating cost ($/unit)                30–45
9. Fixed costs ($)                        400,000–500,000
10. Tax rate                              40%
11. Discount rate                         12%

Source: Reprinted from Lee (1985, p. 359)
Note: Random numbers from Wonnacott and Wonnacott (1977) are used to determine the value of a variable for simulation.

f(x) = 1/(b − a),   (19.18)

where b is the upper bound on the variable value and a is the lower bound. Over the range a < x < b, the function f(x) = 1/(b − a); outside this range, f(x) = 0. With this in mind, note the way the values are assigned. For each successive input variable, a random-number generator selects a value from 01 to 00 (where 00 is the proxy for 100 using a two-digit random-number generator) and then translates that value into a variable value by taking account of the specified range and distribution of the variable in question.

For each simulation, nine random numbers are selected. From these random numbers, a set of values for the nine key factors is created. For example, the first set of random numbers, as shown in Table 19.9, is 39, 73, 72, 75, 37, 02, 87, 98, and 10. The procedure for selecting these numbers is similar to that of Example 19.2; however, these random numbers are not based upon the uniformly distributed random numbers presented in Table 19.6. If we were to use the random numbers from Table 19.6, taking the first two digits of the first row of that table, the random numbers would be 06, 80, 24, 18, 10, 05, 37, 48, and 02.

The value of the market-size factor for the first simulation can be obtained as follows:

2,500,000 + (39/100)(3,000,000 − 2,500,000) = 2,695,000

The value of the selling-price factor for the first simulation can be obtained as follows:

40 + (73/100)(60 − 40) = 54.6

The operating cost for the first simulation can be obtained as follows:

30 + (98/100)(45 − 30) = 44.7

Similar computations can be used to calculate the values of all the variables except the useful life of the facilities. Because the useful life is restricted to integer values, we use the following correspondence between random numbers and useful life:

Random number         01–19   20–39   40–59   60–79   80–99   00
Useful life (years)   5       6       7       8       9       10

Since the random number for useful life is 02, it falls within the range 01–19; therefore, the useful life is 5 years.

For each simulation, a series of cash flows and its net present value can be calculated by using the following formulas:

(sales volume)_t = (market size) × (1 + market growth rate)^t × (market share)
EBIT_t = (sales volume)_t × (selling price − operating cost) − (fixed costs)
(cash flow)_t = EBIT_t × (1 − tax rate)
NPV = Σ_{t=1}^{N} (cash flow)_t / (1 + discount rate)^t − I_0

where t represents the tth year and N represents the useful life.

The results in terms of cash flow for each simulation are listed in Table 19.10, with each period's cash flows shown separately. We now discuss how the cash flows for the first simulation are calculated. The cash flows for the first three periods are 2,034,382.33, 2,116,529.56, and 2,201,525.23. The first of these, 2,034,382.33, can be calculated as follows:

(sales volume)_1 = (market size) × (1 + market growth rate) × (market share)
                 = (2,695,000)(1 + 0.036)(13.75%) = 383,902.75
EBIT_1 = (383,902.75)(54.6 − 44.7) − 410,000 = 3,390,637.22
(cash flow)_1 = (3,390,637.22)(1 − 40%) = 2,034,382.33
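The value-assignment rule and the cash-flow formulas above can be checked in Python; the nine random numbers (39, 73, 72, 75, 37, 02, 87, 98, 10) are those of the first simulation:

```python
def draw(rn, lo, hi):
    """Translate a two-digit random number (00 read as 100) into a value in [lo, hi]."""
    rn = 100 if rn == 0 else rn
    return lo + (rn / 100) * (hi - lo)

def life_from_rn(rn):
    """Integer useful life: 01-19 -> 5, 20-39 -> 6, ..., 80-99 -> 9, 00 -> 10."""
    return 10 if rn == 0 else 5 + rn // 20

def cash_flow(t, size, growth, share, price, op_cost, fixed, tax=0.40):
    """(cash flow)_t from the formulas above."""
    volume = size * (1 + growth) ** t * share
    ebit = volume * (price - op_cost) - fixed
    return ebit * (1 - tax)

# First simulation, using the ranges from Table 19.8:
size   = draw(39, 2_500_000, 3_000_000)   # 2,695,000
price  = draw(73, 40, 60)                 # 54.6
growth = draw(72, 0.00, 0.05)             # 0.036
share  = draw(75, 0.10, 0.15)             # 0.1375
life   = life_from_rn(2)                  # 5 years
op     = draw(98, 30, 45)                 # 44.7
fixed  = draw(10, 400_000, 500_000)       # 410,000
invest = draw(37, 8_000_000, 10_000_000)  # 8,740,000

npv = sum(cash_flow(t, size, growth, share, price, op, fixed) / 1.12 ** t
          for t in range(1, life + 1)) - invest
print(cash_flow(1, size, growth, share, price, op, fixed))   # ~2,034,382.33
print(npv)
```

Note that, following the NPV formula as printed above, the residual value of the investment is not included in this sketch.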
(sales volume)_2 = (market size) × (1 + market growth rate)² × (market share)
                 = (2,695,000)(1 + 0.036)²(13.75%) = 397,732.249
EBIT_2 = (397,732.249)(54.6 − 44.7) − 410,000 = 3,527,549.27
(cash flow)_2 = (3,527,549.27)(1 − 40%) = 2,116,529.56

(sales volume)_3 = (2,695,000)(1 + 0.036)³(13.75%) = 412,041.285
EBIT_3 = (412,041.285)(54.6 − 44.7) − 410,000 = 3,669,208.72
(cash flow)_3 = (3,669,208.72)(1 − 40%) = 2,201,525.23

In Table 19.10, we calculate cash flows for five periods for the first, sixth, and tenth simulations; eight periods for the second and fifth simulations; seven periods for the third, seventh, and eighth simulations; six periods for the fourth simulation; and nine periods for the ninth simulation.

The NPVs for each simulation are given under the input values listed in Table 19.9. From these NPV figures, we can calculate a mean NPV and its standard deviation, from which we can analyze the project's risk-and-return profile. As we can see, this project's NPV can range from −$6
million to +$15 million, depending on the combinations of random events that could take place. The mean NPV is $4,194,647.409 with a standard deviation of $6,618,476.469. This indicates that there is about a 70% chance that the NPV will be greater than 0. In addition, we can use this average NPV and its standard deviation to calculate an interval estimate for NPV. In other words, by using simulation we can obtain an interval estimate of NPV, of the kind used in both the statistical distribution method and the decision-tree method.

Furthermore, if we change the range or distribution of the random variables, we can perform sensitivity analysis to investigate the impact of a change in an input factor on the risk and return of the investment project. Also, by using sensitivity analysis, we essentially break down the uncertainty involved in undertaking any project, thereby highlighting exactly those variables critical to the analysis with which the decision-maker should be primarily concerned in forecasting. The information obtained from simulation analysis is valuable in allowing the decision-maker to more accurately evaluate risky capital investments.

19.8 Summary

Important concepts and methods related to capital-budgeting decisions under certainty were explored in Sects. 19.3, 19.4, and 19.5. Cash-flow estimation methods were discussed before alternative capital-budgeting methods were explored. Capital-rationing decisions in terms of linear programming were also discussed in this chapter.

In this chapter, we have also discussed uncertainty and how capital-budgeting decisions are made under conditions of uncertainty. Two methods of handling uncertainty were presented: the statistical distribution method and the simulation method. Each method is based on the NPV approach, so that, in theory, using any of the methods should yield similar
results. However, in practice, the method used will depend on the availability of information and the reliability of that information.

Appendix 19.1: Solving the Linear Program Model for Capital Rationing

The first step is to choose the cells that represent the unknowns: X, Y, Z, C, D, and E. I use B15 to represent X, D15 for Y, F15 for Z, H15 for C, J15 for D, and L15 for E. Indeed, you can choose any cells to proxy for the unknowns based on your preference.

The second step is to express the objective function. As our objective is to maximize V = 65.585X + 52.334Y + 171.871Z + 0C + 0D + 0E, V is our objective function. I then input the expression of the objective function in B5: "=65.585*B15 + 52.334*D15 + 171.871*F15 + 0*H15 + 0*J15 + 0*L15".

The third step is to input the expressions of the constraints. Our first constraint is 100X + 200Y + 100Z + C + 0D + 0E = 300, so I input the left side of this equation, "=100*B15 + 200*D15 + 100*F15 + 1*H15 + 0*J15 + 0*L15", in E6.
Our second constraint is 30X 70Y þ 240Z “¼ 30 B15 þ ð70Þ D15 þ 240 F15 þ ð1Þ H15 þ 1
C þ D þ 0E ¼ 70, so I input the left side of this equation J15 þ 0 L15” in E7.
Our third constraint is 30X − 70Y + 200Z + 0C − D + E = 50, so I input the left side of this equation, "=30*B15+(-70)*D15+200*F15+0*H15+(-1)*J15+1*L15", in E8.
Additionally, we have constraints on X, Y, and Z: X ≤ 1, Y ≤ 1, Z ≤ 1, and all variables are non-negative. We will deal with them later.
The fourth step is to click “data” and then open “Solver”.
As our objective function is expressed in B5, we select "B5" in the place "Set Objective". Then we choose "Max" since we want to maximize the function. Next, we select B15, D15, F15, H15, J15, and L15 in the place "By Changing Variable Cells" since we use these cells to represent our unknowns X, Y, Z, C, D, and E.
For the additional constraints, X ≤ 1, Y ≤ 1, Z ≤ 1 and non-negativity, we continue clicking "Add" and set them as follows. After adding all the constraints, we should select "Make Unconstrained Variables Non-Negative" because our X, Y, Z, C, D, and E are non-negative. The final display of setting up the model is as follows:
Now, we can click "Solve" to get our final result. Excel will give us the optimal weights X, Y, Z, C, D, and E in B15, D15, F15, H15, J15, and L15, respectively, and the maximum value of V in B5. The results are consistent with the solution shown in the example.
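The same model can be solved outside Excel. Below is a minimal sketch using SciPy's linprog (assuming SciPy is available); the objective coefficients and the three equality constraints mirror exactly what was entered in cells B5 and E6:E8 above, and the bounds encode X ≤ 1, Y ≤ 1, Z ≤ 1 plus the non-negativity of all six unknowns.

```python
# Solve the Appendix 19.1 capital-rationing LP with SciPy instead of Excel Solver.
# Variables are ordered [X, Y, Z, C, D, E], matching cells B15, D15, F15, H15, J15, L15.
from scipy.optimize import linprog

# linprog minimizes, so negate the objective V = 65.585X + 52.334Y + 171.871Z.
c = [-65.585, -52.334, -171.871, 0.0, 0.0, 0.0]

# The three budget constraints entered in cells E6:E8.
A_eq = [
    [100, 200, 100,  1,  0, 0],   # 100X + 200Y + 100Z + C         = 300
    [ 30, -70, 240, -1,  1, 0],   #  30X -  70Y + 240Z - C + D     =  70
    [ 30, -70, 200,  0, -1, 1],   #  30X -  70Y + 200Z     - D + E =  50
]
b_eq = [300, 70, 50]

# X, Y, Z are project fractions in [0, 1]; C, D, E are non-negative.
bounds = [(0, 1)] * 3 + [(0, None)] * 3

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
X, Y, Z = res.x[:3]
V = -res.fun
print(f"X={X:.4f}, Y={Y:.4f}, Z={Z:.4f}, V={V:.3f}")
```

With these inputs the solver accepts project X in full and fractions of projects Y and Z; the result should agree with the Solver solution in the example.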
In Fig. 19.2, the expected monetary values are shown in the event nodes. The financial planner decides which actions to take by selecting the highest EMV, which in this case is $76.5, as indicated in the decision node at the beginning of the tree. The parallel lines drawn across the nonoptimal decision branches indicate the elimination of these alternatives from consideration.
In Example 19.3, we have simplified the number of possible alternatives and events to provide a simpler view of the decision tree process. However, as we introduce more possibilities to this problem, and as it becomes more complex, the decision tree becomes more valuable in organizing the information necessary to make the decision. This is especially true when making a sequence of decisions rather than a single decision. A more detailed discussion of the decision tree method for capital budgeting decisions can be found in Chap. 14 of Lee and Lee (2017).

Appendix 19.3: Hillier's Statistical Distribution Method for Capital Budgeting Under Uncertainty

In this chapter, we discussed the calculation of the standard deviation of NPV (1) where cash flows are independent of each other, as presented in Eq. 19.17, and (2) where cash flows are perfectly positively correlated, as presented in Eq. 19.17a. In either case, the covariance term drops out of the equation for the variance of the NPV. Now we develop a general formula for the standard deviation of NPV that can be used for all cash flow relationships.
The general equation for the standard deviation of NPV ($\sigma_{NPV}$), with a mean of

$$NPV = \sum_{t=1}^{N} \frac{C_t}{(1+k)^t} + \frac{S_N}{(1+k)^N} - I_0,$$

is

$$\sigma_{NPV} = \left[ \sum_{t=1}^{N} \frac{\sigma_t^2}{(1+k)^{2t}} + \sum_{t=1}^{N}\sum_{s=1}^{N} W_t W_s \,\mathrm{COV}(C_s, C_t) \right]^{1/2} \quad (s \neq t) \qquad (19.19)$$

where $\sigma_t^2$ = variance of cash flows in the tth period; $W_t$ and $W_s$ = discount factors for the tth and sth periods (that is, $W_t = 1/(1+K)^t$ and $W_s = 1/(1+K)^s$); and $\mathrm{COV}(C_t, C_s)$ = covariability between cash flows in periods t and s (that is, $\mathrm{COV}(C_t, C_s) = \rho_{ts}\sigma_s\sigma_t$, where $\rho_{ts}$ = correlation coefficient between cash flows in the tth and sth periods).

Equation 19.19 is the general equation for $\sigma_{NPV}$. Thus, Eq. 19.17 for $\sigma_{NPV}$ under perfectly correlated cash flows or independent cash flows is a special case derived from the general Eq. 19.19.

Hillier (1963) combined the assumptions of mutual independence and perfect correlation to develop a model of $\sigma_{NPV}$ to deal with mixed situations. This model is presented in Eq. 19.20, which analyzes investment proposals in which expected cash flows are a combination of correlated and independent flows.

$$\sigma = \left[ \sum_{t=1}^{N} \frac{\sigma_{y_t}^2}{(1+k)^{2t}} + \sum_{h=1}^{m} \left( \sum_{t=0}^{N} \frac{\sigma_{z_t}^{(h)}}{(1+k)^{t}} \right)^2 \right]^{1/2} \qquad (19.20)$$

where $\sigma_{y_t}^2$ = variance for an independent net cash flow in period t, and $\sigma_{z_t}^{(h)}$ = standard deviation for stream h of a perfectly correlated cash flow stream in period t. If h = 1, then Eq. 19.20 is a combination of Eqs. 19.17 and 19.17a.
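Equation 19.19 is easy to evaluate numerically. The sketch below computes σNPV for a hypothetical three-period project (the discount rate, per-period standard deviations, and correlation matrix are illustrative assumptions, not data from the text) and verifies that the double-sum form of Eq. 19.19 matches the equivalent quadratic form W'ΣW implied by Table 19.11.

```python
# Evaluate Eq. 19.19: sigma_NPV^2 = sum_t sigma_t^2/(1+k)^(2t)
#                                  + sum_{t != s} W_t * W_s * COV(C_s, C_t),
# where W_t = 1/(1+k)^t and COV(C_t, C_s) = rho_ts * sigma_t * sigma_s.
import numpy as np

k = 0.10                                  # hypothetical discount rate
sigma = np.array([300.0, 400.0, 500.0])   # hypothetical sigma_t, t = 1..3
rho = np.array([[1.0, 0.6, 0.3],          # hypothetical correlation matrix
                [0.6, 1.0, 0.6],
                [0.3, 0.6, 1.0]])

t = np.arange(1, 4)
W = 1.0 / (1.0 + k) ** t                  # discount factors W_t
cov = rho * np.outer(sigma, sigma)        # COV(C_t, C_s) = rho_ts*sigma_t*sigma_s

# Double-sum form of Eq. 19.19: diagonal (variance) part plus off-diagonal part.
var_diag = np.sum(sigma ** 2 / (1.0 + k) ** (2 * t))
var_off = sum(W[i] * W[j] * cov[i, j]
              for i in range(3) for j in range(3) if i != j)
sigma_npv = np.sqrt(var_diag + var_off)

# Equivalent matrix form, as in Table 19.11: W' * Cov * W.
sigma_npv_matrix = np.sqrt(W @ cov @ W)
print(round(float(sigma_npv), 2), round(float(sigma_npv_matrix), 2))
```

If rho were the identity matrix (independent cash flows), only the diagonal term would survive, reproducing the independent-flows special case.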
Cash flows between periods t and s are generally related. Therefore, COV(C_t, C_s) is an important factor in the estimation of σNPV. The magnitude, sign, and degree of the relationships of these cash flows depend on the economic operating conditions and the nature of the product or service produced.
Using portfolio theory to calculate the standard deviation of a set of securities, we have derived Eq. 19.19, which can be explained by an example. Suppose we have cash flows for a three-year period, C1, C2, C3, with discount factors of W1, W2, W3. Table 19.11 shows the calculation of σNPV.
The summation of the diagonal terms (W1²σ1², W2²σ2², W3²σ3²) results in the first part of Eq. 19.19, and the summation of the off-diagonal terms results in the second part, or

$$\sum_{t=1}^{N}\sum_{s=1}^{N} W_t W_s \,\mathrm{COV}(C_s, C_t) \quad (t \neq s)$$

This calculation is similar to the calculation of portfolio variance, as discussed in Chap. 19. However, in portfolio analysis, Wt represents the percent of money invested in the tth security, and the summation of Wt equals 1. In the calculation of σNPV, Wt represents a discount factor. Therefore, the summation of Wt will not necessarily equal 1.

Table 19.11 Variance–covariance matrix

        W1C1                  W2C2                  W3C3
W1C1    W1²σ1²                W1W2 COV(C1, C2)      W1W3 COV(C1, C3)
W2C2    W1W2 COV(C2, C1)      W2²σ2²                W2W3 COV(C2, C3)
W3C3    W1W3 COV(C3, C1)      W2W3 COV(C3, C2)      W3²σ3²

References

Ackoff, Russell. "A concept of corporate planning." Long Range Planning 3.1 (1970): 2–8.
Copeland, Thomas E., J. Fred Weston, and Kuldeep Shastri. Financial Theory and Corporate Policy, 4th Edition. Pearson, 2004.
Fama, E. F., and M. H. Miller. The Theory of Finance. Holt, Rinehart and Winston, New York, 1972.
Fisher, I. The Theory of Interest. MacMillan, New York, 1930.
Hamburg, Morris. Statistical Analysis for Decision Making. Harcourt Brace Jovanovich, New York, 1983.
Hertz, D. B. "Risk Analysis in Capital Investments." Harvard Business Review, 42 (1964): 95–106.
Hertz, D. B. "Risk Analysis in Capital Investments." Harvard Business Review, 57 (1979): 169–181.
Hillier, F. S. "The Derivation of Probabilistic Information for the Evaluation of Risky Investments." Management Science, 9 (1963): 443–457.
Lee, C. F., and J. Lee. Financial Analysis, Planning & Forecasting: Theory and Application. World Scientific, Singapore, 2017.
Lee, C. F., and S. Y. Wang. "A Fuzzy Real Option Valuation Approach to Capital Budgeting Under Uncertainty Environment." International Journal of Information Technology & Decision Making 9.5 (2010): 695–713.
Pinches, G. E. "Myopic Capital Budgeting and Decision Making." Financial Management, 11 (Autumn 1982): 6–19.
Reinhardt, Uwe E. "Break-Even Analysis for Lockheed's Tri Star: An Application of Financial Theory." The Journal of Finance 28.4 (1973): 821–838.
Weingartner, H. Martin. "The excess present value index-A theoretical basis and critique." Journal of Accounting Research (1963): 213–224.
Weingartner, H. Martin. "Capital rationing: n authors in search of a plot." The Journal of Finance 32.5 (1977): 1403–1431.
20 Financial Analysis, Planning, and Forecasting

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R, https://doi.org/10.1007/978-3-031-14283-3_20
(Figure: Accounting information inputs: balance sheet data, income statement data, retained earnings data, and fund flow data)
model is to efficiently and effectively handle the analysis of information and its interactions with the forecasting of future consequences within the planning process.
Hence, the financial planning model efficiently improves the depth and breadth of the information the financial manager uses in the decision-making process. Moreover, before the finalized plan is implemented, an evaluation of how well subsequent performance stands up to the financial plan provides additional input for future planning actions.
A key to the value of any financial planning model is how it is formulated and constructed. That is, the credibility of the model's output depends on the underlying assumptions and particular financial theory the model is based on, as well as its ease of use for the financial planner. Because of its potentially great impact on the financial planning process and, consequently, on the firm's future, the particular financial planning model to be used must be chosen carefully. Specifically, we can state that a useful financial planning model should have the following characteristics:

1. The model results and assumptions should be credible.
2. The model should be flexible so that it can be adapted and expanded to meet a variety of circumstances.
3. The model should improve on current practice in a technical or performance sense.
4. The model inputs and outputs should be comprehensible to the user without extensive additional knowledge or training.
5. The model should take into account the interrelated investment, financing, dividend, and production decisions and their effect on the firm's market value.
6. The model should be fairly simple for the user to operate without the extensive intervention of nonfinancial personnel and tedious formulation of the input.

On the basis of these guidelines, we now present and discuss the simultaneous equations, linear programming, and econometric financial planning models, which can be used for financial planning and analysis.
20.3 The Algebraic Simultaneous Equations Approach to Financial Planning and Analysis

In this section, we present the financial planning approach of Warren and Shelton (1971), which is based on a simultaneous-equations concept. The model, called FINPLAN, deals with overall corporate financial planning as opposed to just some area of planning, such as capital budgeting. The objective of the FINPLAN model is not to optimize anything, but rather to serve as a tool to provide relevant information to the decision-maker. One of the strengths of this planning model, in addition to its construction, is that it allows the user to simulate the financial impacts of changing assumptions regarding such variables as sales, operating ratios, price-to-earnings ratios, retention rates, and debt-to-equity ratios.
The advantage of utilizing a simultaneous-equation structure to represent a firm's investment, financing, production, and dividend policies is the enhanced ability for the interaction of these decision-making areas. The Warren and Shelton (WS) model is a system of 20 equations which are listed in Table 20.1. These equations are segmented into distinct subgroups corresponding to sales, investment, financing, and per share (return to investors) data. The flowchart describing the interrelationships of the equations is shown in Fig. 20.2.
The key concepts of the interaction of investment, financing, and dividends, as explained in Chap. 13, are the basis of the FINPLAN model, which we now consider in some detail. First, we discuss the inputs to the model; second, we delve into the interaction of the equations in the model; and third, we look at the output of the FINPLAN model.
The inputs to the model are shown in Table 20.2B. The driving force of the WS model is the sales growth estimates (GSALSt). Equation (20.1) in Table 20.1 shows that sales for period t is the product of sales in the prior period multiplied by the growth rate in sales for period t. EBIT is then derived by expressing EBIT as a percentage of the sales ratio, as in Eq. (2) of Table 20.1. Current and fixed assets are then derived in Eqs. 3 and 4 of the table through the use of the CA/SALES and FA/SALES ratios. The sum of CA and FA is the total assets for the period.
Financing of the desired level of assets is undertaken in Sect. 3 of the table. In Eq. 6, current liabilities in period t are derived from the ratio of CL/SALES multiplied by SALES. Equation 20.7 represents the funds required (NFt). FINPLAN assumes that the amount of preferred stock is constant over the planning horizon. In determining what funds are
Table 20.1 WS model

Section 1—Generation of sales and earnings before interest and taxes for period t
(1) SALESt = SALESt−1(1 + GSALSt)
(2) EBITt = REBITt × SALESt

Section 2—Generation of total assets required for period t
(3) CAt = RCAt × SALESt
(4) FAt = RFAt × SALESt
(5) At = CAt + FAt

Section 3—Financing the desired level of assets
(6) CLt = RCLt × SALESt
(7) NFt = (At − CLt − PFDSKt) − (Lt−1 − LRt) − St−1 − Rt−1 − bt{(1 − Tt)[EBITt − it−1(Lt−1 − LRt)] − PFDIVt}
(8) NFt + bt(1 − Tt)(it−1·NLt + UtL·NLt) = NLt + NSt
(9) Lt = Lt−1 − LRt + NLt
(10) St = St−1 + NSt
(11) Rt = Rt−1 + bt{(1 − Tt)[EBITt − itLt − UtL·NLt] − PFDIVt}
(12) it = it−1(Lt−1 − LRt)/Lt + ie·NLt/Lt
(13) Lt/(St + Rt) = Kt

Section 4—Generation of per share data for period t
(14) EAFCDt = (1 − Tt)[EBITt − itLt − UtL·NLt] − PFDIVt
(15) CMDIVt = (1 − bt)EAFCDt
(16) NUMCSt = NUMCSt−1 + NEWCSt
(17) NEWCSt = NSt/[(1 − Uts)Pt]
(18) Pt = mt × EPSt
(19) EPSt = EAFCDt/NUMCSt
(20) DPSt = CMDIVt/NUMCSt
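Equations 1–7 of Table 20.1 are recursive rather than simultaneous: given the prior period's balances and the current period's ratios, they can be evaluated directly. The sketch below reproduces the 2017 figures of the Exxon worked example that follows in this section (all input values are taken from that example); Eqs. 8–13, which are simultaneous, require a system solver instead.

```python
# Direct (recursive) part of the FINPLAN/WS model: Eqs. 1-7 of Table 20.1
# for 2017, using the input values from the worked example in this section.

# 2016 balances and 2017 parameters (from Table 20.3 / the worked example)
sales_prev = 71_890.0            # SALES_{t-1}
gsals = 0.1267                   # sales growth rate
rebit = 0.2872                   # EBIT/SALES ratio
rca, rfa, rcl = 0.9046, 1.0596, 0.3656
L_prev, LR = 22_442.0, 2_223.0   # old debt and scheduled retirement
S_prev, R_prev = 3_120.0, 110_551.0
b, T, i_prev = 0.4788, 0.18, 0.0332
pfdsk = pfdiv = 0.0              # no preferred stock in this example

sales = sales_prev * (1 + gsals)            # Eq. 1
ebit = rebit * sales                        # Eq. 2
ca, fa = rca * sales, rfa * sales           # Eqs. 3-4
assets = ca + fa                            # Eq. 5
cl = rcl * sales                            # Eq. 6
nf = ((assets - cl - pfdsk) - (L_prev - LR) - S_prev - R_prev
      - b * ((1 - T) * (ebit - i_prev * (L_prev - LR)) - pfdiv))  # Eq. 7
print(round(sales, 2), round(ebit, 2), round(assets, 2), round(nf, 2))
```
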
needed and where they are to come from, FINPLAN uses a source-and-use-of-funds accounting identity. For instance, Eq. 20.7 shows that the assets for period t are the basis for the firm's financing needs. Current liabilities, as determined in the prior equation, are one source of funds and therefore are subtracted from asset levels. As mentioned above, preferred stock is a constant and therefore must be subtracted also. After the first term in Eq. 20.7, (At − CLt − PFDSKt), we have the financing that must come from internal sources (retained earnings and operations) and long-term external sources (debt and stock issues). The term in the second parenthesis, (Lt−1 − LRt), takes into account the remaining old debt outstanding, after retirements, in period t. Then the funds provided by existing stock and retained earnings are subtracted. The last quantity is the funds provided by operations during period t.
Once the funds needed for operations are defined, Eq. 8 specifies that new funds, after taking into account underwriting costs and additional interest costs from new debt, are to come from long-term debt and new stock issues. Equations 20.9 and 20.10 simply update the debt and equity accounts for the new issues. Equation 20.11 updates the
retained-earnings account for the portion of earnings available to common stockholders from operations during period t. Specifically, bt is the retention rate in period t, and (1 − Tt) is the after-tax percentage, which is multiplied by the earnings from the period after netting out interest costs on both new and old debt. Since preferred stockholders must be paid before common stockholders, preferred dividends must be subtracted from funds available for common stockholders. Equation 20.12 calculates the new weighted-average interest rate for the firm's debt. Equation 20.13 is the new debt-to-equity ratio for period t.
Section 4 of Table 20.1 applies to the common stockholder; in particular, dividends and market value. Equation 14 represents the earnings available for common dividends and is simply the firm's after-tax earnings. Correspondingly, Eq. 15 computes the earnings to be paid to common stockholders. Equation 16 updates the number of common shares for new issues.
As Eq. 17 shows, the number of new common shares is determined by the total new stock issue divided by the stock price after discounting for issuance costs. Equation 18 determines the price of the stock through the use of a price-earnings ratio (mt) of the stock purchase. Equation 19 determines EPS, as usual, by dividing earnings available to common stockholders by the number of common shares outstanding. Equation 20 determines dividends in a similar manner.
Tables 20.3, 20.4, and 20.5 illustrate the setup of the necessary input variables and the resulting output of the pro forma balance sheet and income statement for the Exxon Company. As mentioned, the WS equation system requires values for parameter inputs, which for this example are listed in Table 20.3. The first column represents the value of the input, while the second column corresponds to the variable number. The third and fourth columns pertain to the beginning and ending periods for the desired planning horizon.
From Tables 20.4 and 20.5 you can see the type of information the FINPLAN model generates. With 2016 as a base year, planning information is forecasted for the firm over the period 2017–2020. Based on the model's construction, its underlying assumptions, and the input data, the WS model reveals the following:

1. The amount of investment to be carried out
2. How this investment is to be financed
3. The amount of dividends to be paid
4. How alternative policies can affect the firm's market value

Even more important, as we will explore later in this chapter, this model's greatest value (particularly for FINPLAN) arises from the sensitivity analysis that can be performed. That is, by varying one or several of the input parameters, the financial manager can better understand how his or her decisions interact and, consequently, how they will affect the company's future. (Sensitivity analysis is discussed in greater detail later in this chapter.)
We have shown how we can use Excel to solve the 20 simultaneous equations presented in Table 20.1, and the results are presented in Tables 20.4 and 20.5. Now, we will discuss how we can use the data from Table 20.3 to calculate the unknown variables for Sects. 1, 2, 3, and 4 in 2017.

Section 1: Generation of Sales and Earnings Before Interest and Taxes for Period t

1. SALESt = SALESt−1(1 + GSALSt) = 71,890 × 1.1267 = 80,998.46

2. EBITt = REBITt−1 × SALESt = 0.2872 × 80,998.463 = 23,262.76

Section 2: Generation of Total Assets Required for Period t

3. CAt = RCAt−1 × SALESt = 0.9046 × 80,998.463 = 73,271.21

4. FAt = RFAt−1 × SALESt = 1.0596 × 80,998.463 = 85,825.97

5. At = CAt + FAt = 73,271.21 + 85,825.97 = 159,097.18

Section 3: Financing the Desired Level of Assets

6. CLt = RCLt−1 × SALESt = 0.3656 × 80,998.463 = 29,613.00

7. NFt = (At − CLt − PFDSKt) − (Lt−1 − LRt) − St−1 − Rt−1 − bt{(1 − Tt)[EBITt − it−1(Lt−1 − LRt)] − PFDIVt}
   = (159,097.181 − 29,613.00 − 0) − (22,442 − 2,223) − 3,120.0 − 110,551 − 0.4788 × {(1 − 0.18) × [23,262.76 − 0.0332 × (22,442 − 2,223)] − 0}
   = −13,275.64

Because the interest rate it in the remaining equations depends on the new debt NLt, Eq. 12 is first expressed in terms of NLt:

12. itLt = it−1(Lt−1 − LRt) + ie·NLt = 0.0332 × (22,442 − 2,223) + 0.0368 × NLt = 671.2708 + 0.0368 NLt
Table 20.4 Pro forma balance sheet (2016–2020)

                                    2016        2017        2018        2019        2020
Assets
Current assets                      0.00        73,271.6    82,555.56   93,015.84   104,801.5
Fixed assets                        0.00        85,826.43   96,701.16   108,953.8   122,758.9
Total assets                        0.00        159,098     179,256.7   201,969.6   227,560.4
Liabilities and net worth
Current liabilities                 0.00        29,613.2    33,365.37   37,592.96   42,356.21
Long-term debt                      22,442.00   31,293.56   35,258.64   39,726.12   44,759.66
Preferred stock                     0.00        0           0           0           0
Common stock                        3,120.00    −21,298.1   −18,972.8   −16,350.1   −13,392.6
Retained earnings                   110,551.00  119,489.3   129,605.5   141,000.6   153,837.1
Total liabilities and net worth     0.00        159,098     179,256.7   201,969.6   227,560.4
Computed DBT/EQ                     0.0000      0.3187      0.3187      0.3187      0.3187
Int. rate on total debt             0.0332      0.034474    0.034882    0.035205    0.035464
Per share data
Earnings                            0.0000      7.292306    8.205176    9.188508    10.29033
Dividends                           0.0000      3.80075     4.276538    4.78905     5.363322
Price                               0.0000      139.1007    156.5137    175.2708    196.2881
Table 20.5 Pro forma income statement (2016–2020)

                                    2016        2017        2018        2019        2020
Sales                               71,890.00   80,998.90   91,261.94   102,825.38  115,853.98
Operating income                    0.00        23,262.88   26,210.43   29,531.45   33,273.26
Interest expense                    0.00        1,078.81    1,229.90    1,398.57    1,587.35
Underwriting commission (debt)      0.00        221.49      123.76      133.81      145.13
Income before taxes                 0.00        21,962.58   24,856.77   27,999.07   31,540.79
Taxes                               0.00        3,953.26    4,474.22    5,039.83    5,677.34
Net income                          0.00        18,009.31   20,382.55   22,959.24   25,863.44
Preferred dividends                 0.00        0.00        0.00        0.00        0.00
Available for common dividends      0.00        18,009.31   20,382.55   22,959.24   25,863.44
Common dividends                    0.00        9,386.45    10,623.39   11,966.36   13,480.03
Debt repayments                     0.00        2,223.00    2,223.00    2,223.00    2,223.00
Actual funds needed for investment  0.00        −13,028.02  8,870.34    9,715.43    10,667.10
8. NFt + bt(1 − Tt)(it−1·NLt + UtL·NLt) = NLt + NSt
   −13,275.64 + 0.4788 × (1 − 0.18) × (0.0332 NLt + 0.02 NLt) = NLt + NSt
   −13,275.64 + 0.02089 NLt = NLt + NSt

   (a) NSt + 0.97911 NLt = −13,275.64

9. Lt = Lt−1 − LRt + NLt = 22,442 − 2,223 + NLt

   (b) Lt − NLt = 20,219

10. St = St−1 + NSt

   (c) −NSt + St = 3,120.0

11. Rt = Rt−1 + bt{(1 − Tt)[EBITt − itLt − UtL·NLt] − PFDIVt} = 110,551 + 0.4788 × {(1 − 0.18) × [23,262.76 − itLt − 0.02 NLt] − 0}

Substituting (12) into (11):

   Rt = 110,551 + 0.4788 × {0.82 × [23,262.76 − (671.2708 + 0.0368 NLt) − 0.02 NLt]} = 119,420.7796 − 0.0223 NLt

   (d) Rt + 0.0223 NLt = 119,420.7796

13. Lt = (St + Rt)Kt
    Lt = 0.3187 St + 0.3187 Rt

   (e) Lt − 0.3187 St − 0.3187 Rt = 0

(b) − (e) = (f):
   20,219 = 0.3187 St + 0.3187 Rt − NLt

(f) − 0.3187 × (c) = (g):
   19,224.656 = 0.3187 NSt − NLt + 0.3187 Rt

(g) − 0.3187 × (d) = (h):
   −18,834.74646 = 0.3187 NSt − 1.0071 NLt

(h) − 0.3187 × (a) = (i):
   −1.31915 NLt = −14,603.81
   NLt = 14,603.81/1.31915 = 11,070.62

Substituting NLt into (a): NSt = −24,114.98745
Substituting NLt into (b): Lt = 31,289.62094
Substituting NSt into (c): St = −20,994.98745
Substituting NLt into (d): Rt = 119,173.9047
Substituting NLt and Lt into (12): it = 0.03447

Section 4: Generation of Per Share Data for Period t

14. EAFCDt = (1 − Tt)[EBITt − itLt − UtL·NLt] − PFDIVt = (1 − 0.18) × [23,262.75857 − 0.03447 × 31,289.62 − 0.02 × 11,070.62] − 0 = 18,009.49019
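The substitution steps (a)–(e) above form a linear system in the five unknowns NLt, NSt, Lt, St, and Rt, so the same 2017 solution can be recovered with a linear-algebra routine. The sketch below restates (a)–(e) in matrix form, solves them with NumPy, and then backs out it from Eq. 12 and EAFCDt from Eq. 14.

```python
# Solve the reduced FINPLAN system (a)-(e) for 2017 as a linear system.
# Unknowns, in order: x = [NL, NS, L, S, R]
import numpy as np

A = np.array([
    [0.97911,  1.0, 0.0,  0.0,     0.0    ],  # (a) 0.97911*NL + NS = -13,275.64
    [-1.0,     0.0, 1.0,  0.0,     0.0    ],  # (b) L - NL = 20,219
    [0.0,     -1.0, 0.0,  1.0,     0.0    ],  # (c) S - NS = 3,120
    [0.0223,   0.0, 0.0,  0.0,     1.0    ],  # (d) R + 0.0223*NL = 119,420.7796
    [0.0,      0.0, 1.0, -0.3187, -0.3187 ],  # (e) L - 0.3187*S - 0.3187*R = 0
])
bvec = np.array([-13_275.64, 20_219.0, 3_120.0, 119_420.7796, 0.0])

NL, NS, L, S, R = np.linalg.solve(A, bvec)

# Eq. 12: weighted-average interest rate on total debt.
i = (0.0332 * (22_442 - 2_223) + 0.0368 * NL) / L
# Eq. 14: earnings available for common dividends (no preferred dividends here).
EAFCD = (1 - 0.18) * (23_262.76 - i * L - 0.02 * NL)
print(round(NL, 2), round(NS, 2), round(L, 2), round(S, 2), round(R, 2))
print(round(i, 5), round(EAFCD, 2))
```

The solved values match the hand substitutions above and the 2017 column of Tables 20.4 and 20.5.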
of a product), then a related technique called integer programming can be used.² In this section, we apply linear programming to profit maximization, capital rationing, and financial planning and forecasting.

20.4.1 Profit Maximization

XYZ, a toy manufacturer, produces three types of toys: King Kobra (KK), Pistol Pete (PP), and Rock Coolies (RC). To produce each toy, the plastic parts must be molded by machine and then assembled. The machine and assembly times for each type of toy are shown in Table 20.6. Variable costs, selling prices, and profit contributions for each type of toy are presented in Table 20.7.

Table 20.6 Production information for XYZ toys

Toy                     Machine time (h)   Assembly time (h)
KK                      5                  5
PP                      4                  3
RC                      5                  4
Total hours available   150                100

Table 20.7 Financial information for XYZ toys

Toy   Selling price ($/unit)   Variable cost ($/unit)   Profit contribution ($/unit)
KK    11                       10                       1
PP    8                        4                        4
RC    8                        5                        3

XYZ finances its operations through bank loans. The covenants of the loans require that XYZ maintain a current ratio of 1 or more; otherwise, the full amount of the loan must be immediately repaid. The balance sheet of XYZ is presented in Table 20.8.

Table 20.8 Balance sheet of XYZ toys

Assets                           Liabilities and equity
Cash                   $100      Bank loan        $130
Marketable securities   100      Long-term debt    300
Accounts receivable      50      Equity             70
Plant and equipment     250
                       $500                       $500

For this case, the objective function is to maximize the profit contribution for each product. From Table 20.7, we see that the profit contribution for each product is KK = $1, PP = $4, and RC = $3. We can multiply this contribution per unit times the number of units sold to identify the firm's total operating income. Thus, the objective function is

MAX P = X1 + 4X2 + 3X3    (20.1)

where X1, X2, X3 are the number of units of KK, PP, and RC. We can now identify the constraints of the linear programming problem. The firm's capacities for producing KK, PP, and RC depend on the number of hours of available machine time and assembly time. Using the information from Table 20.6, we can identify the following capacity constraints:

5X1 + 4X2 + 5X3 ≤ 150 hours (machine time constraint)    (20.2)

5X1 + 3X2 + 4X3 ≤ 100 hours (assembly time constraint)    (20.3)

There is also a constraint on the number of Pistol Petes (PP) and Rock Coolies (RC) that can be produced. The firm's marketing department has determined that 10 units of PPs and RCs are the maximum amount that can be sold; hence

X2 + X3 ≤ 10 (marketing constraint)    (20.4)

Finally, the bank covenant requiring a current ratio greater than 1 must be met. Thus,

(cash + marketable securities + AR − cost of production)/bank loan ≥ 1

(100 + 100 + 50 − 10X1 − 4X2 − 5X3)/130 ≥ 1

10X1 + 4X2 + 5X3 ≤ 120 (current ratio constraint)    (20.5)

Since the production of each toy must, at minimum, be 0, three nonnegativity constraints complete the formulation of the problem:

X1, X2, X3 ≥ 0 (nonnegativity constraints)    (20.6)

Combining the objective function and constraints yields

MAX X1 + 4X2 + 3X3    (20.7)

subject to 5X1 + 4X2 + 5X3 ≤ 150; 5X1 + 3X2 + 4X3 ≤ 100; X2 + X3 ≤ 10; 10X1 + 4X2 + 5X3 ≤ 120; and X1 ≥ 0, X2 ≥ 0, X3 ≥ 0.
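As a cross-check on the formulation in Eqs. 20.1–20.6, the same problem can be handed to an off-the-shelf LP solver. The sketch below assumes SciPy is available; linprog minimizes, so the objective coefficients are negated.

```python
# Solve the XYZ toy-production LP of Eq. 20.7 with SciPy.
from scipy.optimize import linprog

c = [-1, -4, -3]            # maximize P = X1 + 4*X2 + 3*X3
A_ub = [
    [5, 4, 5],              # machine time          <= 150
    [5, 3, 4],              # assembly time         <= 100
    [0, 1, 1],              # marketing (PP + RC)   <= 10
    [10, 4, 5],             # current ratio covenant <= 120
]
b_ub = [150, 100, 10, 120]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
X1, X2, X3 = res.x
profit = -res.fun
print(X1, X2, X3, profit)
```

The solver lands on the same vertex the simplex tableaus in Table 20.9 reach: X1 = 8 units of KK, X2 = 10 units of PP, X3 = 0 units of RC, with a total profit contribution of $48.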
² Both linear programming and integer programming are generally taught in MBA or undergraduate operations-analysis courses. See Hillier and Lieberman, Introduction to Operations Research, for a discussion of these methods.

Using the simplex method to solve this linear programming problem, we derive the three simplex method tableaus in Table 20.9. Tableau 1 presents the information of
objective function and constraints as derived in Eq. 20.7. Since there are constraints for four resources, there are four slack variables: S1, S2, S3, and S4. The initial tableau implies that we produce neither KK, PP, nor RC. Therefore, the total profit is 0, a result that is not optimal because all objective coefficients are positive. In the second tableau, the firm produces ten units of PP and generates a $40 profit. But this result also is not optimal because one of the objective function coefficients is positive. Tableau 3 presents the optimal situation because none of the objective function coefficients is positive. (Appendix 20.1 presents the method and procedure for specifying tableau 1 and solving tableaus 2 and 3 in terms of a capital rationing example.)

Table 20.9 Simplex method tableaus for solving Eq. 20.7

Tableau 1
         Real variables        Slack variables
         X1    X2    X3       S1    S2    S3    S4
S1        5     4     5        1     0     0     0     150
S2        5     3     4        0     1     0     0     100
S3        0     1     1        0     0     1     0      10
S4       10     4     5        0     0     0     1     120
Objective function coefficients
Profit    1     4     3        0     0     0     0       0
Total profit: 0

Tableau 2
         X1    X2    X3       S1    S2    S3    S4
S1        5     0     1        1     0    −4     0     110
S2        5     0     1        0     1    −3     0      70
X2        0     1     1        0     0     1     0      10
S4       10     0     1        0     0    −4     1      80
Objective function coefficients
Profit    1     0    −1        0     0    −4     0     −40
Total profit: 40

Tableau 3
         X1    X2    X3       S1    S2    S3    S4
S1        0     0    0.5       1     0    −2   −0.5     70
S2        0     0    0.5       0     1    −1   −0.5     30
X2        0     1     1        0     0     1     0      10
X1        1     0    0.1       0     0   −0.4    0.1     8
Objective function coefficients
Profit    0     0   −1.1       0     0   −3.6   −0.1   −48
Total profit: 48

In tableau 3, the solution values for variables X1 and X2 are found in the right-hand column. Thus, X1 = 8 units and X2 = 10 units. Since X3 doesn't appear in the final solution, it has a value of 0. The slack variables indicate the amount of XYZ's unused resources. For example, S1 = 70 indicates that the firm has 70 h of unused machine time. To produce 8 units of X1 requires 40 h, and to produce 10 units of X2 requires 40 h, so our total usage of machine time is 80 h. This is 70 h less than the total hours of machine time the firm has available. S2 = 30 indicates that there are additional assembly hours available. S3 = 0 (it is not in the solution) implies that the constraint to make 10 units of X2 + X3 is satisfied. S4 = 0 implies that the current ratio constraint is also satisfied and that financing, or, more precisely, the lack of financing, is limiting the amount of production. If the firm can change the bank loan covenant or increase the amount of available funds, it will be able to produce more. The maximum total profit contribution is $48 given the current production level.

20.4.2 Linear Programming and Capital Rationing

Linear programming is a mathematical technique that can be used to find the optimal solution to problems involving the allocation of scarce resources among competing activities. Mathematically, linear programming can best solve problems in which both the firm's objective to be maximized and the constraints limiting the firm's actions are linear functions of the decision variables involved. Thus, the first step in using linear programming as a tool for financial decision-making is to model the problem facing the firm in a linear-programming form. Constructing the programming model involves the following steps.
First, identify the controllable decision variables. Second, define the objective to be maximized or minimized and formulate that objective into a linear function with controllable decision variables. In finance, the objective generally is to maximize profit and market value or to minimize production costs. Third, the constraints must be defined and expressed as linear equations (equalities or inequalities) of the decision variables. This usually involves determining the capacities of the scarce resources involved in the constraints and then deriving a linear relationship between these capacities and the decision variables.
For example, suppose that X1, X2, …, XN represent output quantities. Then the linear programming model takes the general form:
implemented. If the outputs are not satisfactory, both the inputs and the model should be reconsidered and modified. Output from the LP model consists of the firm's major financial planning decisions (dividends, working capital, financing). The use of linear programming techniques allows these decisions to be determined simultaneously.
Carleton and CDD emphasize the importance of the degree of detail included in their model's forecasted balance sheets and income and funds-flow statements. That is, these statements are broken down into the minimum number of accounts consistent with making meaningful financial decisions: capital investment, working capital, capital structure, and dividends. Complicating the interpretations of the results with myriad details can diminish the effectiveness of any financial planning model.
In comparing the LP and simultaneous equations approaches to financial planning, the main difference between the two is that the linear programming method optimizes the plan based on classical finance theory while the simultaneous equations approach does not. However, in terms of ease of use, particularly for doing sensitivity analysis, the simultaneous equations model has the upper hand.

components are uses of funds, while the latter two components are sources of funds. The dividends component includes all cash payments to stockholders and must be non-negative. Net short-term investment is the net change in the corporation's holdings of short-term financial assets, such as cash, government securities, and accounts receivable. This component of the capital budget can be either positive or negative. Gross long-term investment is the change in gross long-term assets during the period. For example, the replacement of old equipment is considered a positive long-term investment. Long-term investment can be negative, but only if the sale of long-term assets exceeds replacement plus new investment.
As for sources of funds, the debt-financing component is simply the net change in the corporation's liabilities, such as corporate bonds, bank loans, taxes owed, and other accounts payable. Since a corporation can either increase its liabilities or retire existing liabilities, this variable can be either positive or negative.

endogenous variables) depends not only on the component's distance from its target but also on the simultaneous adjustment of the other four decision variables.⁵
The last two exogenous variables, Rt and CUt, describe the rate of return the corporation could expect to earn on its future long-term investment. The ratio of the change in earnings to investment in the previous quarter should provide a rough measure of the rate of return on that investment. Spies used a four-quarter average of that ratio, Rt, to smooth out the normal fluctuations in earnings. The rate of capacity utilization, CUt, was also included to improve this measure of the expected rate of return. Finally, a constant and three seasonal dummy variables were included. The exogenous variables are summarized in Table 20.12.

Table 20.12 Endogenous and exogenous variables

Endogenous variables
(a) X1,t = DIVt = cash dividends paid in period t
(b) X2,t = ISTt = net investment in short-term assets during period t
(c) X3,t = ILTt = gross investment in long-term assets during period t
(d) X4,t = −DFt = minus the net proceeds from new debt issued during period t
(e) X5,t = −EQFt = minus the net proceeds from new equity issued during period t

Exogenous variables
(a) Yt = Σ (i = 1 to 5) Xi,t, where Yt = net profits + depreciation allowance (a reformulation of the sources = uses identity)
(b) RCBt = corporate bond rate
(c) RDPt = average dividend-price ratio (or dividend yield)
(d) DELt = debt-equity ratio
(e) Rt = the rate of return the corporation could expect to earn on its future long-term investment (or internal rate of return)
(f) CUt = rate of capacity utilization (used by Francis and Rowell (1978) to lag capital requirements behind changes in percent sales; used here to define the expected Rt)
Source: Adapted from Spies (1974)

20.5.2 Simplified Spies Model

The simplified Spies model⁸ for dividend payments (X1,t), net short-term investments (X2,t), gross long-term investments (X3,t), new debt issues (X4,t), and new equity issues
tive or negative. Finally, new equity financing is the change (X5, t) is defined as
in stockholder equity minus the amount due to retained
earnings. This should represent the capital raised by the sale Xi;t ¼ a0i þ a1t Yt þ a2i RCBt þ a3i RDPt þ a4i DELt þ a5i Rt
of new shares of common stock. Although corporations þ a6i CUt þ a7i Xi;t1
frequently repurchase stock already sold, this variable is ð20:12Þ
almost always positive when aggregated.
The first step is to develop a theoretical model that where i = 1, 2, 3, 4, 5, etc. Equation 20.12 implies that
describes the optimal capital budget as a set of predeter- dividend payments, net short-term investments, gross long-
mined economic and financial variables. The first of these term investments, new debt issues, and new equity issues all
variables is a measure of cash flow: net profits plus depre- can be affected by new cash inflow (Yt), the corporate bond
ciation allowances. This variable, denoted by Y, is exoge- rate (RCBt), average dividend yield (RDPt), debt-equity ratio
nous as long as the policies determining production, pricing, (DELt), rates of return on long-term investment (Rt), rates of
advertising, taxes, and the like cannot be changed quickly capacity utilization (CUt), and Xi, t-1 (the last period’s divi-
enough to affect the current period’s earnings. Since quar- dend payment, net short-term investment, etc.). These
terly data are used in this work, this seems a reasonable empirical models simultaneously take into account theory,
assumption. It should also be noted that the “uses equals information, and methodologies, and they can be used to
sources” identity ensures the following: forecast cash payments, net short-term investment, gross
long-term investment, new debt issues, and new equity
X
5 X
5 issues.
Xi;t ¼ Xi;t ¼ Yt ð20:11Þ
i¼1 i¼1
where X1,t, X2,t, X3,t, X4,t, X5,t, X*1,t, and Yt are defined in 20.6 Sensitivity Analysis
Table 20.12.7
The second exogenous variable in the model is the cor- So far, we have covered three types of financial planning
porate bond rate, RCDt, which was used as a measure of the models and discussed their strengths, weaknesses, and
corporations’ borrowing rate. In addition, the debt-equity functional procedures. The efficiency of these models will
ratio at the start of the period, DELt, was included to allow depend solely on how they are employed. This section looks
for the increase in the cost of financing due to leverage. The at alternative uses of financial planning models to improve
average dividend-price ratio for all stocks, RDPt, was used their information dissemination. One of the most
as a measure of the rate of return demanded by investors in a
no-growth, unlevered corporation for the average-risk class.
8
The original Spies model and its application can be found in Lee and
Lee (2017). In addition, Tagart (1977) has proposed an alternative
econometric model for financial planning and analysis. Readers who
7
Expanding Eq. 21.11, we obtain. X1,t + X2,t + X3,t + X4,t + X5, are interested in this model, please see Lee and Lee (2017) Chapter 26
t = X*1,t + X*2,t + X*3,t + X*4,t + X*5,t = Yt. for further detail.
advantageous ways to use these financial planning models is to perform sensitivity analysis. The purpose of sensitivity analysis is to hold all but one or perhaps a couple of variables constant and then analyze the impact of their change on the predicted outcome.

As mentioned earlier, financial planning models are merely forecasting tools to help the financial manager analyze the interactions of important company decisions with uncertain economic elements. Since we can never be precisely sure what the future holds, sensitivity analysis stands out as a desirable manner of examining the impact of the unexpected as well as of the expected.

Of the three types of financial planning models presented in this chapter, the simultaneous equations approach, as embodied in Warren and Shelton's FINPLAN, offers the best method for performing sensitivity analysis. By changing the parameter values, we can compare new outputs of the financial statements with those such as in Tables 20.4 and 20.5. The difference between the new statement and the statements in Tables 20.4 and 20.5 reflects the impact of potential changes in such areas as economic conditions (reflected in the interest rate, tax rate, and sales growth estimates) and company policy decisions (reflected in the maximum and minimum limits specified for the maturity and amount of debt and in the dividend policy as reflected in the specified payout ratio).

To perform sensitivity analysis, we change growth in sales (variable 3), operating income as a percentage of sales (variable 17), the P/E ratio (variable 22), the expected interest rate on new debt (variable 16), and the long-term debt-to-equity ratio (variable 20). The new parameters are listed in Table 20.13. Summary results of the alternative sensitivity analyses for EPS, DPS, and price per share (PPS) are listed in Table 20.14. The results indicate that changes in key financial decision variables will generally affect EPS, DPS, and PPS.

Table 20.13  Sensitivity analysis parameters

Model variable number   Parameter                                    Alternative values   Sensitivity analysis number
3                       Growth in sales                              .20                  1
                                                                     −.15                 2
20                      Long-term debt-to-equity ratio               .10                  3
                                                                     .5                   4
17                      Operating income as a percentage of sales    .20                  5
                                                                     .50                  6
22                      Price-to-earnings ratio                      5                    7
                                                                     30                   8

Table 20.14  Summary results of sensitivity analysis for EPS, DPS, and PPS (2017–2020)

                           2017     2018     2019     2020
Original analysis
  EPS                      6.73     7.18     7.63     8.10
  DPS                      3.51     3.74     3.97     4.22
  PPS                    128.29   136.96   145.45   154.48
Sensitivity analysis #1
  EPS                      7.35     8.66    10.15    11.91
  DPS                      3.83     4.51     5.29     6.21
  PPS                    140.23   165.17   193.68   227.12
Sensitivity analysis #2
  EPS                      5.89     5.40     4.94     4.52
  DPS                      3.07     2.81     2.58     2.36
  PPS                    112.38   103.00    94.25    86.23
Sensitivity analysis #3
  EPS                      6.90     7.71     8.58     9.55
  DPS                      3.60     4.02     4.47     4.98
  PPS                    131.58   147.03   163.62   182.10
Sensitivity analysis #4
  EPS                      7.07     8.05     9.03    10.14
  DPS                      3.68     4.20     4.71     5.29
  PPS                    134.82   153.56   172.34   193.42
Sensitivity analysis #5
  EPS                      4.88     5.41     5.96     6.56
  DPS                      2.54     2.82     3.10     3.42
  PPS                     93.01   103.17   113.62   125.14
Sensitivity analysis #6
  EPS                     12.34    14.04    15.93    18.08
  DPS                      6.43     7.32     8.30     9.42
  PPS                    235.39   267.81   303.88   344.82
Sensitivity analysis #7
  EPS                      8.36     9.22    10.11     8.36
  DPS                      4.36     4.80     5.27     4.36
  PPS                     41.82    46.08    50.53    41.82
Sensitivity analysis #8
  EPS                      6.88     7.75     8.69     9.74
  DPS                      3.58     4.04     4.53     5.08
  PPS                    206.26   232.42   260.65   292.32
Sensitivity analysis #9
  EPS                      7.15     8.09     9.09    10.21
  DPS                      3.73     4.22     4.74     5.32
  PPS                    136.48   154.27   173.38   194.74
Sensitivity analysis #10
  EPS                      7.00     7.85     8.76     9.79
  DPS                      3.65     4.09     4.57     5.10
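The one-at-a-time logic of sensitivity analysis can be sketched in a few lines of code. The forecasting function below is a deliberately simplified, hypothetical stand-in for a FinPlan run; its functional form and every parameter value are invented for illustration and are not the 20-equation model itself:

```python
# A toy one-period EPS forecast standing in for a full FinPlan run.
# All names and numbers are illustrative assumptions, not FinPlan equations.
def forecast_eps(sales, growth, op_margin, tax_rate, int_rate, debt, shares):
    sales_next = sales * (1.0 + growth)        # forecasted sales
    ebit = sales_next * op_margin              # operating income
    pretax = ebit - int_rate * debt            # less interest expense
    return pretax * (1.0 - tax_rate) / shares  # after-tax earnings per share

base = dict(sales=71890.0, growth=0.0259, op_margin=0.28,
            tax_rate=0.17, int_rate=0.04, debt=22442.0, shares=2737.0)
base_eps = forecast_eps(**base)

# Sensitivity analysis: change one parameter at a time, holding others fixed.
scenarios = [("growth", 0.20), ("growth", -0.15),
             ("op_margin", 0.20), ("op_margin", 0.50)]
for name, value in scenarios:
    eps = forecast_eps(**{**base, name: value})
    print(f"{name} = {value:+.2f}: EPS = {eps:6.2f}  (base {base_eps:.2f})")
```

Each scenario changes a single input while all other inputs stay at their base values, mirroring how Tables 20.13 and 20.14 pair one changed parameter with one set of outputs.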
20.7 Summary

This chapter has examined three types of financial planning models available to the financial manager for use in analyzing the interactions of company decisions: the algebraic simultaneous equations model, the linear programming model, and the econometric model. We also have discussed the benefits of sensitivity analysis for determining the impact on the company of changes (expected and unexpected) in economic conditions.

The student should understand the basic functioning of all three models, along with the underlying financial theory. Moreover, it is essential to understand that a financial planning model is an aid or tool to be used in the decision-making process and is not an end in and of itself.

The computer-based financial modeling discussed in this chapter can be performed on either a mainframe computer or a PC. An additional dimension is the development of electronic spreadsheets. These programs simulate the matrix or spreadsheet format used in accounting and financial statements. Their growing acceptance and popularity are due to the ease with which users can make changes in the spreadsheet. This flexibility greatly facilitates the use of these programs for sensitivity analysis.

Appendix 20.1: The Simplex Algorithm for Capital Rationing

The procedure of using the simplex method in capital rationing to solve Eq. 20.8 is as follows:

Step 1: Convert the inequality constraints into a system of equalities through the introduction of slack variables S1 and S2, as follows:

15X1 + 7.5X2 + 7.5X3 + S1 = 15
45X1 − 7.5X2 − 7.5X3 + 60X4 + S2 = 20    (20.13)

where X1 = XA, X2 = XB, X3 = XC, and X4 = XD (each of these is a separate investment project).

Step 2: Construct a tableau or tableaus for representing the objective function and equality constraints. This has been done for four tableaus in Table 20.A1. In tableau 1, the figures in columns 2 through 6 are the coefficients of X1, X2, X3, X4, S1, and S2, as specified in the two equalities in Eq. 20.13. Below these figures are the objective function coefficients. Note that only S1 and S2 are listed in the first column of tableau 1. This indicates that S1 and S2 are basic variables in tableau 1 and that the remaining variables X1, X2, X3, and X4 have been arbitrarily set equal to 0.

With X1, X2, X3, and X4 all equal to 0, the remaining variables assume the values in the last column of the tableau; that is, S1 = 15 and S2 = 20. The numbers in the last column represent the values of basic variables in a particular basic-feasible solution.

Step 3: Obtain a new feasible solution. The basic-feasible solution of tableau 1 indicates zero profits for the firm. Clearly, this basic-feasible solution can be bettered because it shows no profit, and profit should be expected from the adoption of any project.

The fact that X4 has the largest incremental NPV indicates that the value of X4 should be increased from its present level of 0. If we divide the column of figures under X4 into the corresponding figures in the last column, we obtain quotients 1 and 1/3. Since the smallest positive quotient is associated with S2, then S2 should be replaced by X4 in tableau 2.

The figures in tableau 2 are computed by setting the value of S1 to 0, S2 to 1, and NPV to 0. The steps in the derivation are as follows: To eliminate the nonzero terms, we first divide the second row in tableau 1 by 60 and thus obtain the coefficients indicated in the second row of tableau 2. We then multiply this row by −60.88 and combine this result with the third row, as follows:

[34.72 + (−60.88)(0.75)]X1
+ [41.34 + (−60.88)(−0.125)]X2
+ [27.81 + (−60.88)(−0.125)]X3
+ [60.88 − (60.88)(1)]X4 + [0 + (−60.88)(0)]S1
+ [0 + (−60.88)(0.017)]S2
= (−60.88)(1/3)    (20A-2)

The objective function coefficients of Eq. 20A-2 are listed in the third row of tableau 2. Tableau 2 implies that the company will undertake 1/3 units of project 4 (X4) and that the total NPV of X4 is $20.2933. All coefficients associated with the objective function are positive, which implies that the NPV can be improved by replacing S1 with X1, X2, X3, or X4.

Using the same procedure mentioned above, we can now obtain tableau 3. In tableau 3, the only positive objective function coefficient is that of X2. Therefore, X2 can replace either X1 or X4 to increase the NPV.
450 20 Financial Analysis, Planning, and Forecasting
Once again, using the procedure discussed above, we now obtain tableau 4. In tableau 4, none of the coefficients associated with the objective function are positive. Therefore, the solution in this tableau is optimal. Tableau 4 implies that the company will undertake 2 units of project 2 (X2) and 0.583 units of project 4 (X4) to maximize its total NPV. From tableau 4, we obtain the best feasible solution:

X1 = 0, X2 = 2, X3 = 0, and X4 = 0.583

Total NPV is now equal to (2)(41.34) + (60.88)(0.583) = $118.193.

Although there are computer packages that can be used for linear programming, we can use the simplex method to hand-calculate the optimal number of projects and the maximum NPV in order to understand and appreciate the basic technique of derivation.

Appendix 20.2: Description of Parameter Inputs Used to Forecast Johnson & Johnson's Financial Statements and Share Price

In our financial planning program, there are 20 equations and 20 unknowns. To use this program, we need to input 21 parameters. These 20 unknowns and 21 parameters can be found in Table 20.2.

We use 2016 as the initial reference year and input the 21 parameters, the bulk of which can be obtained or derived
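The capital-rationing problem solved by hand in this appendix can be cross-checked with a numerical LP solver. A sketch using scipy.optimize.linprog (which minimizes, so the NPV coefficients are negated):

```python
import numpy as np
from scipy.optimize import linprog

# Maximize NPV = 34.72*X1 + 41.34*X2 + 27.81*X3 + 60.88*X4.
# linprog minimizes, so the objective coefficients are negated.
c = [-34.72, -41.34, -27.81, -60.88]

# Budget constraints of Eq. (20.13), written as inequalities (before the
# slack variables S1 and S2 are added):
A_ub = [[15.0,  7.5,  7.5,  0.0],   # 15*X1 + 7.5*X2 + 7.5*X3         <= 15
        [45.0, -7.5, -7.5, 60.0]]   # 45*X1 - 7.5*X2 - 7.5*X3 + 60*X4 <= 20
b_ub = [15.0, 20.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub)   # default bounds: X_i >= 0
print("X =", np.round(res.x, 3))         # -> [0, 2, 0, 0.583]
print("max NPV =", round(-res.fun, 3))   # -> 118.193
```

The solver reproduces the tableau-4 solution: X2 = 2, X4 = 0.583, and a maximum NPV of about $118.193.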
from the historical financial statements of JNJ. The first input is SALEt−1 ($71,890), defined as fiscal 2016 net sales, and can be obtained from the income statement of JNJ. The second input is GSALSt−1. This parameter can be calculated by either the percentage change method, (Salest−1 − Salest−2)/Salest−2 = 2.59%, or the sustainable growth rate, ROEt−1·bt−1/(1 − ROEt−1·bt−1) = 12.7%.

The third input is RCAt−1 (90.46%), defined as current assets divided by total sales, and the fourth input is RLAt−1 (1.0596), defined as total assets minus current assets, divided by net sales. The next parameter is RCLt−1 (36.57%), defined as current liabilities as a percentage of net sales. The sixth parameter is preferred stock issued (PKV), with a value of 0, as JNJ does not currently have any preferred stock outstanding. The inputs for the aforementioned three parameters are all obtained from JNJ's fiscal 2016 balance sheet. The seventh input is JNJ's preferred stock dividends, and since there is no preferred stock outstanding, it is correspondingly 0. The eighth input is Lt−1 ($22,442), defined as long-term debt, coming from the balance sheet of JNJ for the fiscal year 2016, and the ninth input is LRt−1 ($−2,223), defined as long-term debt retirement, from the 2016 statement of cash flows.

The tenth input is St−1 ($3,120), which represents common stock issued, and the eleventh input is retained earnings (Rt−1 = $110,551). Both of these variables can be found in the balance sheet for JNJ's fiscal year 2016. The twelfth input is the retention rate (bt−1 = 47.88%), defined as 1 − (Dividend payoutt−1/Net incomet−1). The thirteenth input, the average tax rate

P/E ratio (mt−1 = 19.075), which is calculated as JNJ's closing share price on the last trading day of 2016 divided by JNJ's fiscal 2016 earnings per share.

Appendix 20.3: Procedure of Using Excel to Implement the FinPlan Program

This appendix describes the detailed procedure of using Excel to implement the FinPlan program. There are four steps to use the FinPlan program.

Step 1. Open the Excel file of FinPlan Example.
Step 3. Choose "Macros" and then click "Run".

Step 4. Excel will show the solutions of the simultaneous equations.

                                    Forecast      Actual       Error
Income before taxes                15,857.29    17,999.00     11.90%
Taxes                               2,854.31     2,702.00      5.64%
Net income                         13,002.98    15,297.00     15.00%
Preferred dividends                     0            0         0.00%
Common dividends                  −89,450.47    −9,494.00   −842.18%
Debt repayments                     6,754.00    −3,949.00   −271.03%
Assets
  Current assets                   45,821.08    46,033.00      0.46%
  Fixed assets                    121,459.68   106,921.00     13.60%
  Total assets                    167,280.77   152,954.00      9.37%
Liabilities and net worth
  Current liabilities               7,773.68    31,230.00     75.11%
  Long-term debt                   58,220.99    27,684.00    110.31%
  Preferred stock                       0            0         0.00%
  Common stock                   −111,718.34     3,120      3680.72%
  Retained earnings               213,004.44   106,216.00    100.54%
  Total liabilities and net worth 167,280.77   152,954.00      9.37%
Computed DBT/EQ                         0.57         0.51     11.76%
Int. rate on total debt                 0.04         0.03     33.33%
Per share data
  Earnings                              4.96         6.92     28.32%
  Dividends                           −34.13         3.54   1064.12%
  Price                             1,354.44       125.51    979.15%
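The Error column is the absolute difference between forecast and actual, expressed as a percentage of the actual value; a quick check against the first rows of the table:

```python
def pct_error(forecast, actual):
    """Absolute forecast error as a percentage of the actual value."""
    return abs(forecast - actual) / abs(actual) * 100.0

print(round(pct_error(15857.29, 17999.00), 2))   # income before taxes -> 11.9
print(round(pct_error(2854.31, 2702.00), 2))     # taxes -> 5.64
print(round(pct_error(-89450.47, -9494.00), 2))  # common dividends -> 842.18
```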
8. How can investment, financing, and dividend policies be integrated in terms of either linear programming or econometric financial planning and forecasting?

9. Using the information in Tables 20.3 and 20.12, use the FINPLAN program enclosed in the instructor's manual to solve for the empirical results as listed in Tables 20.4, 20.5, and 20.14.

10. a. Identify the input variables in the Warren and Shelton model which require forecasted values and those which are obtained directly from current financial statements.
    b. Discuss how the analyst can obtain values for the forecasted values.
    c. Why is sensitivity analysis so important and beneficial in this model?

b. Identify which of the components are sources of funds and which are uses.
c. Identify the exogenous variables in this model.

12. a. Please use the 21 inputs indicated in Table 20.16 to solve the Warren and Shelton model presented in this chapter.
    b. Please interpret the results which you have obtained from 12a.

Solutions for 12a:

1. SALESt = 47348(1 + 0.0687) = 50600.81
2. EBITt = 50600.81(0.2754) = 13935.46
3. CAt = 0.577(50600.81) = 29196.67
4. FAt = 0.2204(50600.81) = 11152.42
5. At = 29196.67 + 11152.42 = 40349.08
6. CLt = 0.2941(50600.81) = 14881.7
7. NFt = (40349.08 − 14881.7) − (2565 − 395) − 3120 − 35223 − 0.6179{0.6628[13935.46 − 0.0729(2565 − 395)]} = −20688.02
8. −20688.02 + 0.6179{0.6628[0.0729(NLt) + 0.05(NLt)]} = NLt + NSt   (a)
9. Lt = 2565 − 395 + NLt = 2170 + NLt   (b)
10. St = 3120 + NSt   (c)
11. Rt = 35223 + 0.6179{0.6628[13935.46 − itLt − 0.05NLt]}
12. itLt = 0.0729(2565 − 395) + 0.0729NLt = 158.193 + 0.0729NLt

Substituting (12) into (11) yields

From (18) and (19) we know that
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 459
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_21
460 21 Hedge Ratio Estimation Methods and Their Applications
Most of the studies mentioned above (except Lence 1995, 1996) ignore transaction costs as well as investments in other securities. Lence (1995, 1996) derives the optimal hedge ratio where transaction costs and investments in other securities are incorporated in the model. Using a CARA utility function, Lence finds that under certain circumstances the optimal hedge ratio is zero; i.e., the optimal hedging strategy is not to hedge at all.

In addition to the use of different objective functions in the derivation of the optimal hedge ratio, previous studies also differ in terms of the dynamic nature of the hedge ratio. For example, some studies assume that the hedge ratio is constant over time. Consequently, these static hedge ratios are estimated using unconditional probability distributions (e.g., see Ederington 1979; Howard and D'Antonio 1984; Benet 1992; Kolb and Okunev 1992, 1993; Ghosh 1993). On the other hand, several studies allow the hedge ratio to change over time. In some cases, these dynamic hedge ratios are estimated using conditional distributions associated with models such as ARCH (autoregressive conditional heteroscedasticity) and GARCH (generalized autoregressive conditional heteroscedasticity) (e.g., see Cecchetti et al. 1988; Baillie and Myers 1991; Kroner and Sultan 1993; Sephton 1993a). The GARCH-based method has recently been extended by Lee and Yoder (2007), where a regime-switching model is used. Alternatively, the hedge ratios can be made dynamic by considering a multi-period model where the hedge ratios are allowed to vary for different periods. This is the method used by Lien and Luo (1993b).

When it comes to estimating the hedge ratios, many different techniques are currently being employed, ranging from simple to complex ones. For example, some of them use such a simple method as the ordinary least squares (OLS) technique (e.g., see Ederington 1979; Malliaris and Urrutia 1991; and Benet 1992). However, others use more complex methods such as the conditional heteroscedastic (ARCH or GARCH) method (e.g., see Cecchetti et al. 1988; Baillie and Myers 1991; Sephton 1993a), the random coefficient method (e.g., see Grammatikos and Saunders 1983), the cointegration method (e.g., see Ghosh 1993; Lien and Luo 1993b; and Chou et al. 1996), or the cointegration-heteroscedastic method (e.g., see Kroner and Sultan 1993). Recently, Lien and Shrestha (2007) have suggested the use of wavelet analysis to match the data frequency with the hedging horizon. Finally, Lien and Shrestha (2010) also suggest the use of the multivariate skew-normal distribution in estimating the minimum variance hedge ratio.

It is quite clear that there are several different ways of deriving and estimating hedge ratios. In this chapter, we review these different techniques and approaches and examine their relations.

The chapter is divided into six sections. In Sect. 21.2 alternative theories for deriving the optimal hedge ratios are discussed. Various estimation methods are presented in Sect. 21.3. Section 21.4 presents applications of the OLS, GARCH, and CECM models to estimate the optimal hedge ratio. Section 21.5 presents a discussion on the relationship among lengths of hedging horizon, maturity of futures contract, data frequency, and hedging effectiveness. Finally, in Sect. 21.6 we provide the summary and conclusion.

21.2 Alternative Theories for Deriving the Optimal Hedge Ratio

The basic concept of hedging is to combine investments in the spot market and futures market to form a portfolio that will eliminate (or reduce) fluctuations in its value. Specifically, consider a portfolio consisting of Cs units of a long spot position and Cf units of a short futures position.¹ Let St and Ft denote the spot and futures prices at time t, respectively. Since the futures contracts are used to reduce the fluctuations in spot positions, the resulting portfolio is known as the hedged portfolio. The return on the hedged portfolio, Rh, is given by:

Rh = (CsStRs − CfFtRf)/(CsSt) = Rs − hRf,   (21.1a)

where h = CfFt/(CsSt) is the so-called hedge ratio, and Rs = (St+1 − St)/St and Rf = (Ft+1 − Ft)/Ft are the so-called one-period returns on the spot and futures positions, respectively. Sometimes, the hedge ratio is discussed in terms of price changes (profits) instead of returns. In this case the profit on the hedged portfolio, ΔVH, and the hedge ratio, H, are respectively given by:

ΔVH = CsΔSt − CfΔFt  and  H = Cf/Cs,   (21.1b)

where ΔSt = St+1 − St and ΔFt = Ft+1 − Ft.

¹ Without loss of generality, we assume that the size of the futures contract is 1.

The main objective of hedging is to choose the optimal hedge ratio (either h or H). As mentioned above, the optimal hedge ratio will depend on a particular objective function to be optimized. Furthermore, the hedge ratio can be static or dynamic. In subsections A and B, we will discuss the static hedge ratio and then the dynamic hedge ratio.

It is important to note that in the above setup, the cash position is assumed to be fixed and we only look for the optimum futures position. Most of the hedging literature assumes that the cash position is fixed, a setup that is suitable for financial futures. However, when we are dealing
with commodity futures, the initial cash position becomes an important decision variable that is tied to the production decision. One such setup considered by Lence (1995, 1996) will be discussed in subsection C.

Alternatively, if we use definition (21.1a) and use Var(Rh) to represent the portfolio risk, then the MV hedge ratio is obtained by minimizing Var(Rh), which is given by:

Var(Rh) = Var(Rs) + h²Var(Rf) − 2hCov(Rs, Rf).

where A represents the risk aversion parameter. It is clear that this utility function incorporates both risk and return. Therefore, the hedge ratio based on this utility function would be consistent with the mean–variance framework. The optimal number of futures contracts and the optimal hedge ratio are respectively given by:

h2 = CfF/(CsS) = −E(Rf)/(Aσf²) + ρ(σs/σf).   (21.4)

One problem associated with this type of hedge ratio is that in order to derive the optimum hedge ratio, we need to know the individual's risk aversion parameter. Furthermore, different individuals will choose different optimal hedge ratios, depending on the values of their risk aversion parameter.

Since the MV hedge ratio is easy to understand and simple to compute, it will be interesting and useful to know under what condition the above hedge ratio would be the same as the MV hedge ratio. It can be seen from Eqs. (21.2b) and (21.4) that if A → ∞ or E(Rf) = 0, then h2 would be equal to the MV hedge ratio h1. The first condition is simply a restatement of the infinitely risk-averse individuals. However, the second condition does not impose any condition on the risk-averseness, and this is important. It implies that even if the individuals are not infinitely risk averse, the MV hedge ratio would be the same as the optimal mean–variance hedge ratio if the expected return on the futures contract is zero (i.e., futures prices follow a simple martingale process). Therefore, if futures prices follow a simple martingale process, then we do not need to know the risk aversion parameter of the investor to find the optimal hedge ratio.

21.2.1.3 Sharpe Hedge Ratio
Another way of incorporating the portfolio return in the hedging strategy is to use the risk-return tradeoff (Sharpe measure) criteria. Howard and D'Antonio (1984) consider the optimal level of futures contracts by maximizing the ratio of the portfolio's excess return to its volatility:

Max_{Cf} θ = (E(Rh) − RF)/σh,   (21.5)

where σh² = Var(Rh) and RF represents the risk-free interest rate. In this case, the optimal number of futures positions, Cf*, is given by:

Cf* = Cs(S/F)(σs/σf)·[(σs/σf)(E(Rf)/(E(Rs) − RF)) − ρ] / [(σs/σf)(E(Rf)ρ/(E(Rs) − RF)) − 1].   (21.6)

From the optimal futures position, we can obtain the following optimal hedge ratio:

h3 = (σs/σf)·[(σs/σf)(E(Rf)/(E(Rs) − RF)) − ρ] / [(σs/σf)(E(Rf)ρ/(E(Rs) − RF)) − 1].   (21.7)

Again, if E(Rf) = 0, then h3 reduces to:

h3 = ρ(σs/σf),   (21.8)

which is the same as the MV hedge ratio h1.

As pointed out by Chen et al. (2001), the Sharpe ratio is a highly non-linear function of the hedge ratio. Therefore, it is possible that Eq. (21.7), which is derived by equating the first derivative to zero, may lead to the hedge ratio that would minimize, instead of maximize, the Sharpe ratio. This would be true if the second derivative of the Sharpe ratio with respect to the hedge ratio is positive instead of negative. Furthermore, it is possible that the optimal hedge ratio may be undefined, as in the case encountered by Chen et al. (2001), where the Sharpe ratio monotonically increases with the hedge ratio.

21.2.1.4 Maximum Expected Utility Hedge Ratio
So far we have discussed the hedge ratios that incorporate only risk as well as the ones that incorporate both risk and return. The methods, which incorporate both the expected return and risk in the derivation of the optimal hedge ratio, are consistent with the mean–variance framework. However, these methods may not be consistent with the expected utility maximization principle unless either the utility function is quadratic or the returns are jointly normally distributed. Therefore, in order to make the hedge ratio consistent with the expected utility maximization principle, we need to derive the hedge ratio that maximizes the expected utility. However, in order to maximize the expected utility we need to assume a specific utility function. For example, Cecchetti et al. (1988) derive the hedge ratio that maximizes the expected utility where the utility function is assumed to be the logarithm of terminal wealth. Specifically, they derive the optimal hedge ratio that maximizes the following expected utility function:

∫_Rs ∫_Rf log[1 + Rs − hRf] f(Rs, Rf) dRs dRf,

where the density function f(Rs, Rf) is assumed to be bivariate normal. A third-order linear bivariate ARCH model is used to get the conditional variance and covariance matrix, and a numerical procedure is used to maximize the objective function with respect to the hedge ratio.²
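The expected log-utility maximization just described can be imitated numerically: simulate jointly normal returns, evaluate E[log(1 + Rs − hRf)] over a grid of hedge ratios, and compare the maximizer with the MV hedge ratio. A minimal sketch, in which simulated data and a grid search stand in for the ARCH-based procedure of Cecchetti et al. and all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate jointly normal spot and futures returns (all numbers illustrative).
mu_s, mu_f = 0.01, 0.0             # E(Rf) = 0: futures prices follow a martingale
sd_s, sd_f, rho = 0.05, 0.06, 0.9
cov = [[sd_s**2, rho * sd_s * sd_f],
       [rho * sd_s * sd_f, sd_f**2]]
Rs, Rf = rng.multivariate_normal([mu_s, mu_f], cov, size=100_000).T

# MV hedge ratio: h1 = Cov(Rs, Rf) / Var(Rf)  (= rho*sd_s/sd_f here)
h_mv = np.cov(Rs, Rf)[0, 1] / np.var(Rf, ddof=1)

# Log-utility hedge ratio: maximize E[log(1 + Rs - h*Rf)] over a grid of h.
grid = np.linspace(0.0, 1.5, 151)
util = [np.mean(np.log1p(Rs - h * Rf)) for h in grid]
h_log = grid[int(np.argmax(util))]

print(f"MV hedge ratio:          {h_mv:.3f}")   # close to rho*sd_s/sd_f = 0.75
print(f"log-utility hedge ratio: {h_log:.3f}")  # close to the MV ratio
```

With E(Rf) = 0, the log-utility maximizer lands close to the MV hedge ratio ρσs/σf, illustrating the martingale result discussed earlier.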
21.2.1.5 Minimum Mean Extended-Gini that the investors consider only the returns below the target
Coefficient Hedge Ratio return (d) to be risky. It can be shown (see Fishburn 1977)
This approach of deriving the optimal hedge ratio is con- that a\1 represents a risk-seeking investor and a [ 1 rep-
sistent with the concept of stochastic dominance and resents a risk-averse investor.
involves the use of the mean extended-Gini (MEG) coeffi- The GSV, due to its emphasis on the returns below the
cient. Cheung et al. (1990), Kolb and Okunev (1992), Lien and Luo (1993a), Shalit (1995), and Lien and Shaffer (1999) all consider this approach. It minimizes the MEG coefficient Γ_m(R_h), defined as follows:

Γ_m(R_h) = −m Cov(R_h, [1 − G(R_h)]^{m−1}),   (21.9)

where G is the cumulative probability distribution and m is the risk aversion parameter. Note that 0 ≤ m < 1 implies risk seekers, m = 1 implies risk-neutral investors, and m > 1 implies risk-averse investors. Shalit (1995) has shown that if the futures and spot returns are jointly normally distributed, then the minimum-MEG hedge ratio would be the same as the MV hedge ratio.

21.2.1.6 Optimum Mean-MEG Hedge Ratio
Instead of minimizing the MEG coefficient, Kolb and Okunev (1993) alternatively consider maximizing the utility function defined as follows:

U(R_h) = E(R_h) − Γ_v(R_h).   (21.10)

The hedge ratio based on the utility function defined by Eq. (21.10) is denoted as the M-MEG hedge ratio. The difference between the MEG and M-MEG hedge ratios is that the MEG hedge ratio ignores the expected return on the hedged portfolio. Again, if the futures price follows a martingale process (i.e., E(R_f) = 0), then the MEG hedge ratio would be the same as the M-MEG hedge ratio.

… target return, is consistent with the risk perceived by managers (see Crum et al. 1981; Lien and Tse 2000). Furthermore, as shown by Fishburn (1977) and Bawa (1978), the GSV is consistent with the concept of stochastic dominance. Lien and Tse (1998) show that the GSV hedge ratio, which is obtained by minimizing the GSV, would be the same as the MV hedge ratio if the futures and spot returns are jointly normally distributed and if the futures price follows a pure martingale process.

21.2.1.8 Optimum Mean-Generalized Semivariance Hedge Ratio
Chen et al. (2001) extend the GSV hedge ratio to a Mean-GSV (M-GSV) hedge ratio by incorporating the mean return in the derivation of the optimal hedge ratio. The M-GSV hedge ratio is obtained by maximizing the following mean-risk utility function, which is similar to the conventional mean–variance based utility function (see Eq. (21.3)):

U(R_h) = E[R_h] − V_{δ,α}(R_h).   (21.12)

This approach to the hedge ratio does not use the risk aversion parameter to multiply the GSV, as is done in conventional mean-risk models (see Hsin et al. 1994 and Eq. (21.3)). This is because the risk aversion parameter is already included in the definition of the GSV, V_{δ,α}(R_h). As before, the M-GSV hedge ratio would be the same as the GSV hedge ratio if the futures price follows a pure martingale process.
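As a concrete illustration of Eq. (21.12), the following Python sketch (ours, not the book's code; the return convention R_h = R_s − h·R_f and all toy numbers are assumptions) computes the sample GSV and grid-searches for the hedge ratio maximizing E[R_h] − V_{δ,α}(R_h). Because the toy futures returns have zero mean (a martingale), the M-GSV and GSV hedge ratios coincide, as noted in the text.

```python
# Illustrative sketch of the M-GSV hedge ratio of Eq. (21.12); toy data.
# Convention (an assumption of this sketch): hedged return r_h = r_s - h * r_f.

def gsv(returns, delta=0.0, alpha=2.0):
    """Sample generalized semivariance: mean of (delta - r)^alpha over r <= delta."""
    n = len(returns)
    return sum((delta - r) ** alpha for r in returns if r <= delta) / n

def m_gsv_hedge_ratio(r_s, r_f, grid, delta=0.0, alpha=2.0):
    """Grid search for the h maximizing mean(r_h) - GSV(r_h)."""
    best_h, best_u = None, float("-inf")
    for h in grid:
        r_h = [s - h * f for s, f in zip(r_s, r_f)]
        u = sum(r_h) / len(r_h) - gsv(r_h, delta, alpha)
        if u > best_u:
            best_h, best_u = h, u
    return best_h

# Toy data: spot moves are exactly 0.8 times futures moves and futures returns
# have zero mean, so full downside protection is achieved at h = 0.8.
r_f = [0.02, -0.02, 0.03, -0.03, 0.01, -0.01, 0.015, -0.015]
r_s = [0.8 * x for x in r_f]
grid = [i / 100 for i in range(0, 151)]
h_star = m_gsv_hedge_ratio(r_s, r_f, grid)
```

With these constructed returns the utility is zero at h = 0.8 (the hedged return is identically zero) and strictly negative elsewhere, so the grid search recovers 0.8.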
21.2.2 Dynamic Case

We have up to now examined situations in which the hedge ratio is fixed at the optimum level and is not revised during the hedging period. However, it could be beneficial to change the hedge ratio over time. One way to allow the hedge ratio to change is to recalculate it based on the current (or conditional) information on the covariance (σ_{sf}) and variance (σ_f²). This involves calculating the hedge ratio based on conditional information (i.e., σ_{sf}|Ω_{t−1} and σ_f²|Ω_{t−1}) instead of unconditional information. In this case, the MV hedge ratio is given by:

h_1|Ω_{t−1} = −σ_{sf}|Ω_{t−1} / σ_f²|Ω_{t−1}.

The adjustment to the hedge ratio based on new information can be implemented using conditional models such as ARCH and GARCH (to be discussed later) or using the moving-window estimation method.

Another way of making the hedge ratio dynamic is to use the regime-switching GARCH model (to be discussed later), as suggested by Lee and Yoder (2007). This model assumes two different regimes, where each regime is associated with a different set of parameters, and the probabilities of regime switching must also be estimated when implementing such methods. Alternatively, we can allow the hedge ratio to change during the hedging period by considering multi-period models, which is the approach used by Lien and Luo (1993b).

Lien and Luo (1993b) consider hedging with a planning horizon of T periods and minimize the variance of the wealth at the end of the planning horizon, W_T. Consider the situation where C_{s,t} is the spot position at the beginning of period t and the corresponding futures position is given by C_{f,t} = −b_t C_{s,t}. The wealth at the end of the planning horizon, W_T, is then given by:

W_T = W_0 + Σ_{t=0}^{T−1} C_{s,t}[S_{t+1} − S_t − b_t(F_{t+1} − F_t)]
    = W_0 + Σ_{t=0}^{T−1} C_{s,t}[ΔS_{t+1} − b_t ΔF_{t+1}].   (21.15)

The optimal b_t's are given by the following recursive formula:

b_t = Cov(ΔS_{t+1}, ΔF_{t+1})/Var(ΔF_{t+1}) + Σ_{i=t+1}^{T−1} (C_{s,i}/C_{s,t}) · Cov(ΔF_{t+1}, ΔS_{i+1} − b_i ΔF_{i+1})/Var(ΔF_{t+1}).   (21.16)

It is clear from Eq. (21.16) that the optimal hedge ratio b_t will change over time. The multi-period hedge ratio will differ from the single-period hedge ratio due to the second term on the right-hand side of Eq. (21.16). However, it is interesting to note that the multi-period hedge ratio would be different from the single-period one only if the changes in current futures prices are correlated with the changes in future futures prices or with the changes in future spot prices.

21.2.3 Case with Production and Alternative Investment Opportunities

All the models considered in subsections A and B assume that the spot position is fixed or predetermined, and thus production is ignored. As mentioned earlier, such an assumption may be appropriate for financial futures. However, when we consider commodity futures, production should be considered, in which case the spot position becomes one of the decision variables. In an important paper, Lence (1995) extends the model with a fixed or predetermined spot position to a model where production is included. In his model, Lence (1995) also incorporates the possibility of investing in a risk-free asset and other risky assets, borrowing, as well as transaction costs. We briefly discuss the model considered by Lence (1995) below.

Lence (1995) considers a decision maker whose utility is a function of terminal wealth, U(W_1), such that U′ > 0 and U″ < 0. At the decision date (t = 0), the decision maker will engage in the production of Q commodity units for sale at the terminal date (t = 1) at the random cash price P_1. At the decision date, the decision maker can lend L dollars at the risk-free lending rate (R_L − 1), borrow B dollars at the borrowing rate (R_B − 1), invest I dollars in a different activity that yields a random rate of return (R_I − 1), and sell X futures at the futures price F_0. The transaction cost for the futures trade is f dollars per unit of the commodity traded, to be paid at the terminal date. The terminal wealth (W_1) is, therefore, given by:

W_1 = W_0 R = P_1 Q + (F_0 − F_1)X − f|X| − R_B B + R_L L + R_I I,   (21.17)

where R is the return on the diversified portfolio. The decision maker will maximize expected utility subject to the following restrictions:

W_0 + B ≥ v(Q)Q + L + I,
0 ≤ B ≤ k_B v(Q)Q,  k_B ≥ 0,
L ≥ k_L F_0 |X|,  k_L ≥ 0,  I ≥ 0,

where v(Q) is the average cost function, k_B is the maximum amount (expressed as a proportion of his initial wealth) that the agent can borrow, and k_L is the safety margin for the futures contract.

Using this framework, Lence (1995) introduces two opportunity costs: the opportunity cost of alternative
(sub-optimal) investment (c_alt) and the opportunity cost of estimation risk (e_Bayes).³ Let R_opt be the return of the expected-utility maximizing strategy, and let R_alt be the return on a particular alternative (sub-optimal) investment strategy. The opportunity cost of the alternative investment strategy, c_alt, is then given by:

E[U(W_0 R_opt)] = E[U(W_0 R_alt + c_alt)].   (21.18)

In other words, c_alt is the minimum certain net return required by the agent to invest in the alternative (sub-optimal hedging) strategy rather than in the optimum strategy. Using the CARA utility function and some simulation results, Lence (1995) finds that the expected-utility maximizing hedge ratios are substantially different from the minimum-variance hedge ratios. He also shows that under certain conditions the optimal hedge ratio is zero; i.e., the optimal strategy is not to hedge at all.

Similarly, the opportunity cost of the estimation risk (e_Bayes) is defined as follows:

E_ρ[E[U(W_0 R_opt(ρ) − e_Bayes)]] = E_ρ[E[U(W_0 R_opt^Bayes)]],   (21.19)

where R_opt(ρ) is the expected-utility maximizing return where the agent knows with certainty the value of the correlation between the futures and spot prices (ρ), R_opt^Bayes is the expected-utility maximizing return where the agent only knows the distribution of the correlation ρ, and E_ρ[·] is the expectation with respect to ρ. Using simulation results, Lence (1995) finds that the opportunity cost of the estimation risk is negligible, and thus the value of using sophisticated estimation methods is negligible.

21.3 Alternative Methods for Estimating the Optimal Hedge Ratio

In Sect. 21.2, we discussed different approaches to deriving the optimum hedge ratios. However, in order to apply these optimum hedge ratios in practice, we need to estimate them. There are various ways of doing so, and in this section we briefly discuss these estimation methods.

21.3.1 Estimation of the Minimum-Variance (MV) Hedge Ratio

21.3.1.1 OLS Method
The conventional approach to estimating the MV hedge ratio involves the regression of the changes in spot prices on the changes in futures prices using the OLS technique (e.g., see Junkus and Lee 1985). Specifically, the regression equation can be written as:

ΔS_t = a_0 + a_1 ΔF_t + e_t,   (21.20)

where the estimate of the MV hedge ratio, H_J, is given by a_1. The OLS technique is quite robust and simple to use. However, for the OLS technique to be valid and efficient, the assumptions associated with the OLS regression must be satisfied. One case where the assumptions are not completely satisfied is when the error term in the regression is heteroscedastic. This situation will be discussed later.

Another problem with the OLS method, as pointed out by Myers and Thompson (1989), is the fact that it uses unconditional sample moments instead of conditional sample moments, which use currently available information. They suggest the use of the conditional covariance and conditional variance in Eq. (21.2a). In this case, the conditional version of the optimal hedge ratio (Eq. (21.2a)) takes the following form:

H_J = −C_f/C_s = Cov(ΔS, ΔF)|Ω_{t−1} / Var(ΔF)|Ω_{t−1}.   (21.2a)

Suppose that the current information set (Ω_{t−1}) includes a vector of variables (X_{t−1}), and that the spot and futures price changes are generated by the following equilibrium model:

ΔS_t = X_{t−1}α + u_t,
ΔF_t = X_{t−1}β + v_t.

In this case the maximum likelihood estimator of the MV hedge ratio is given by (see Myers and Thompson 1989):

ĥ|Ω_{t−1} = σ̂_{uv}/σ̂_v²,   (21.21)

where σ̂_{uv} is the sample covariance between the residuals u_t and v_t, and σ̂_v² is the sample variance of the residual v_t. In general, the OLS estimator obtained from Eq. (21.20) will differ from the one given by Eq. (21.21). For the two estimators to be the same, the spot and futures prices must be generated by the following model:

ΔS_t = a_0 + u_t,  ΔF_t = b_0 + v_t.

In other words, if the spot and futures prices follow a random walk, with or without drift, the two estimators will be the same. Otherwise, the hedge ratio estimated from the OLS regression (21.20) will not be optimal. Now we show how SAS can be used to estimate the hedge ratio in terms of the OLS method.
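Because the OLS slope has the closed form a_1 = Cov(ΔS, ΔF)/Var(ΔF), the estimate of Eq. (21.20) can also be sketched directly, independent of any statistical package. The Python example below is our illustration with invented toy numbers, not the chapter's SAS or R code.

```python
# OLS estimate of the MV hedge ratio from Eq. (21.20): dS = a0 + a1*dF + e.

def ols_hedge_ratio(d_spot, d_fut):
    """Return (intercept a0, slope a1) of the OLS regression of d_spot on d_fut."""
    n = len(d_fut)
    ms, mf = sum(d_spot) / n, sum(d_fut) / n
    cov_sf = sum((s - ms) * (f - mf) for s, f in zip(d_spot, d_fut)) / n
    var_f = sum((f - mf) ** 2 for f in d_fut) / n
    a1 = cov_sf / var_f          # MV hedge ratio estimate
    a0 = ms - a1 * mf            # intercept
    return a0, a1

# Toy price changes satisfying dS = 0.2 + 0.95*dF exactly,
# so OLS should recover the coefficients (up to rounding error).
d_fut = [1.0, 2.0, -1.0, 0.5, -2.0, 1.5]
d_spot = [0.2 + 0.95 * f for f in d_fut]
a0, a1 = ols_hedge_ratio(d_spot, d_fut)
```

With real data the residuals are nonzero and a_1 is a sample estimate, so its standard error should be reported alongside it, as in Table 21.2 below.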
21 Hedge Ratio Estimation Methods and Their Applications
21.3.1.2 Multivariate Skew-Normal Distribution Method
An alternative way of estimating the MV hedge ratio involves the assumption that the spot price and futures price follow a multivariate skew-normal distribution, as suggested by Lien and Shrestha (2010). The estimate of the covariance matrix under the skew-normal distribution can be different from the estimate under the usual normal distribution, resulting in different estimates of the MV hedge ratio. Let Y be a k-dimensional random vector. Then Y is said to have a skew-normal distribution if its probability density function is given as follows:

f_Y(y) = 2 φ_k(y; Ω_Y) Φ(α′y),

where α is a k-dimensional column vector, φ_k(y; Ω_Y) is the probability density function of a k-dimensional standard normal random variable with zero mean and correlation matrix Ω_Y, and Φ(α′y) is the probability distribution function of a one-dimensional standard normal random variable evaluated at α′y.

21.3.1.3 ARCH and GARCH Methods
Ever since the development of ARCH and GARCH models, the OLS method of estimating the hedge ratio has been generalized to take into account the heteroscedastic nature of the error term in Eq. (21.20). In this case, rather than using the unconditional sample variance and covariance, the conditional variance and covariance from the GARCH model are used in the estimation of the hedge ratio. As mentioned above, such a technique allows an update of the hedge ratio over the hedging period.

Consider the following bivariate GARCH model (see Cecchetti et al. 1988; Baillie and Myers 1991):

(ΔS_t, ΔF_t)′ = (μ_1, μ_2)′ + (ε_{1t}, ε_{2t})′,  i.e.,  ΔY_t = μ + ε_t,
ε_t|Ω_{t−1} ~ N(0, H_t),  H_t = [H_{11,t}  H_{12,t}; H_{12,t}  H_{22,t}],
vec(H_t) = C + A vec(ε_{t−1}ε′_{t−1}) + B vec(H_{t−1}).   (21.22)

The conditional MV hedge ratio at time t is given by h_{t−1} = H_{12,t}/H_{22,t}. This model allows the hedge ratio to change over time, resulting in a series of hedge ratios instead of a single hedge ratio for the entire hedging horizon. Equation (21.22) represents a general GARCH model; it reduces to an ARCH model if B is equal to zero.

The model can be extended to include more than one type of cash and futures contract (see Sephton 1993a). For example, consider a portfolio that consists of spot wheat (S_{1t}), spot canola (S_{2t}), wheat futures (F_{1t}), and canola futures (F_{2t}). We then have the following multivariate GARCH model:

(ΔS_{1t}, ΔS_{2t}, ΔF_{1t}, ΔF_{2t})′ = (μ_1, μ_2, μ_3, μ_4)′ + (ε_{1t}, ε_{2t}, ε_{3t}, ε_{4t})′,  i.e.,  ΔY_t = μ + ε_t,
ε_t|Ω_{t−1} ~ N(0, H_t).

The MV hedge ratio can be estimated using a technique similar to the one described above. For example, the conditional MV hedge ratio is given by the conditional covariance between the spot and futures price changes divided by the conditional variance of the futures price change. Now we show how SAS can be used to estimate the hedge ratio in terms of ARCH and GARCH models.

21.3.1.4 Regime-Switching GARCH Model
The GARCH model discussed above can be further extended by allowing regime switching, as suggested by Lee and Yoder (2007). Under this model, the data generating process can be in one of two states or regimes, denoted by the state variable s_t = {1, 2}, which is assumed to follow a first-order Markov process. The state transition probabilities are assumed to follow a logistic distribution, where the transition probabilities are given by

Pr(s_t = 1|s_{t−1} = 1) = e^{p_0}/(1 + e^{p_0})  and  Pr(s_t = 2|s_{t−1} = 2) = e^{q_0}/(1 + e^{q_0}).

The conditional covariance matrix is given by

H_{t,s_t} = [h_{1,t,s_t}  0; 0  h_{2,t,s_t}] [1  ρ_{t,s_t}; ρ_{t,s_t}  1] [h_{1,t,s_t}  0; 0  h_{2,t,s_t}],

where

h²_{1,t,s_t} = γ_{1,s_t} + α_{1,s_t} ε²_{1,t−1} + β_{1,s_t} h²_{1,t−1},
h²_{2,t,s_t} = γ_{2,s_t} + α_{2,s_t} ε²_{2,t−1} + β_{2,s_t} h²_{2,t−1},
ρ_{t,s_t} = (1 − θ_{1,s_t} − θ_{2,s_t})ρ̄ + θ_{1,s_t} ρ_{t−1} + θ_{2,s_t} φ_{t−1},
φ_{t−1} = Σ_{j=1}^{2} ε̃_{1,t−j} ε̃_{2,t−j} / √[(Σ_{j=1}^{2} ε̃²_{1,t−j})(Σ_{j=1}^{2} ε̃²_{2,t−j})],
ε̃_{i,t} = ε_{i,t}/h_{i,t},  with θ_1, θ_2 ≥ 0 and θ_1 + θ_2 ≤ 1.

Once the conditional covariance matrix is estimated, the time-varying conditional MV hedge ratio is given by the ratio of the covariance between the spot and futures returns to the variance of the futures return.
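Equation (21.22) can be made concrete with a small filtering sketch. The Python code below is our illustration, not the book's program: it runs a diagonal-vech bivariate GARCH(1,1) recursion with fixed, assumed parameter values (in practice C, A, and B would be estimated by maximum likelihood, which is omitted here) and reads off the conditional hedge ratio H_{12,t}/H_{22,t} at each date.

```python
# Diagonal-vech bivariate GARCH(1,1) filter (cf. Eq. 21.22) with assumed,
# fixed parameters; produces a time series of conditional hedge ratios.

def garch_hedge_ratios(eps, c, a, b):
    """eps: list of (e_spot, e_fut) residual pairs.
    c, a, b: 3-vectors of coefficients for the (H11, H12, H22) recursions.
    Returns the conditional hedge ratio H12_t / H22_t for each date."""
    # Start from the unconditional level implied by each recursion.
    h11 = c[0] / (1 - a[0] - b[0])
    h12 = c[1] / (1 - a[1] - b[1])
    h22 = c[2] / (1 - a[2] - b[2])
    ratios = []
    for e_s, e_f in eps:
        ratios.append(h12 / h22)
        h11 = c[0] + a[0] * e_s * e_s + b[0] * h11
        h12 = c[1] + a[1] * e_s * e_f + b[1] * h12
        h22 = c[2] + a[2] * e_f * e_f + b[2] * h22
    return ratios

# Illustrative parameters (assumed, not estimated) and a short residual series.
c = (0.02, 0.015, 0.025)
a = (0.10, 0.08, 0.12)
b = (0.85, 0.88, 0.80)
eps = [(0.5, 0.6), (-1.2, -1.0), (0.3, 0.2), (2.0, 1.8), (-0.4, -0.5)]
h_series = garch_hedge_ratios(eps, c, a, b)
```

The hedge ratio rises after dates on which the spot and futures residuals move strongly together, which is exactly the conditional updating the text describes.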
21.3.4 Estimation of Mean Extended-Gini (MEG) Coefficient-Based Hedge Ratios

The MEG hedge ratio involves the minimization of the following MEG coefficient:

Γ_v(R_h) = −v Cov(R_h, [1 − G(R_h)]^{v−1}).

In order to estimate the MEG coefficient, we need to estimate the cumulative probability density function G(R_h). This is usually done by ranking the observed returns on the hedged portfolio. A detailed description of the process can be found in Kolb and Okunev (1992); we briefly describe it here.

The cumulative probability distribution is estimated by using the rank as follows:

G(R_{h,i}) = Rank(R_{h,i})/N,

where N is the sample size. Once we have the series for the probability distribution function, the MEG is estimated by replacing the theoretical covariance with the sample covariance, as follows:

Γ_v^{sample}(R_h) = −(v/N) Σ_{i=1}^{N} (R_{h,i} − R̄_h)[(1 − G(R_{h,i}))^{v−1} − H̄],   (21.29)

where

R̄_h = (1/N) Σ_{i=1}^{N} R_{h,i}  and  H̄ = (1/N) Σ_{i=1}^{N} (1 − G(R_{h,i}))^{v−1}.

The optimal hedge ratio is now given by the hedge ratio that minimizes the estimated MEG. Since there is no analytical solution, a numerical method needs to be applied in order to obtain the optimal hedge ratio. This method is sometimes referred to as the empirical distribution method.

Alternatively, the instrumental variable (IV) method suggested by Shalit (1995) can be used to find the MEG hedge ratio. Shalit's method provides the following analytical solution for the MEG hedge ratio:

h_IV = Cov(S_{t+1}, [1 − G(F_{t+1})]^{v−1}) / Cov(F_{t+1}, [1 − G(F_{t+1})]^{v−1}).

It is important to note that for the IV method to be valid, the cumulative distribution function of the terminal wealth (W_{t+1}) should be similar to the cumulative distribution of the futures price (F_{t+1}); i.e., G(W_{t+1}) = G(F_{t+1}). Lien and Shaffer (1999) find that the IV-based hedge ratio (h_IV) is significantly different from the minimum-MEG hedge ratio.

Lien and Luo (1993a) suggest an alternative method of estimating the MEG hedge ratio. This method involves the estimation of the cumulative distribution function using a non-parametric kernel function instead of the rank function described above.

Regarding the estimation of the M-MEG hedge ratio, one can follow either the empirical distribution method or the non-parametric kernel method to estimate the MEG coefficient. A numerical method can then be used to estimate the hedge ratio that maximizes the objective function given by Eq. (21.10).

21.3.5 Estimation of Generalized Semivariance (GSV) Based Hedge Ratios

The GSV can be estimated from the sample by using the following sample counterpart:

V_{δ,α}^{sample}(R_h) = (1/N) Σ_{i=1}^{N} (δ − R_{h,i})^α U(δ − R_{h,i}),   (21.30)

where

U(δ − R_{h,i}) = 1 for δ ≥ R_{h,i}, and 0 for δ < R_{h,i}.

Similar to the MEG technique, the optimal GSV hedge ratio can be estimated by choosing the hedge ratio that minimizes the sample GSV, V_{δ,α}^{sample}(R_h). Numerical methods can be used to search for the optimum hedge ratio. Similarly, the M-GSV hedge ratio can be obtained by maximizing the mean-risk function given by Eq. (21.12), where the expected return on the hedged portfolio is replaced by the sample average return and the GSV is replaced by the sample GSV.

One can instead use the kernel density estimation method suggested by Lien and Tse (2000) to estimate the GSV, and numerical techniques can then be used to find the optimum GSV hedge ratio. Instead of the kernel method, one can also employ a conditional heteroscedastic model to estimate the density function; this is the method used by Lien and Tse (1998).

21.4 Applications of OLS, GARCH, and CECM Models to Estimate Optimal Hedge Ratio²

In this section, we apply OLS, GARCH, and CECM models to estimate optimal hedge ratios using the R language. Monthly data for the S&P 500 index and its futures were

² R programs that are used to estimate the empirical results in this section can be found in Appendix 21.4.
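The rank-based (empirical distribution) procedure of Sect. 21.3.4 is easy to sketch in a few lines of Python (our illustration; the chapter's own programs are in R, and the hedged-return convention R_h = R_s − h·R_f and the toy data are assumptions of this sketch). It estimates G from ranks, evaluates the sample MEG coefficient of Eq. (21.29) for v = 2, and grid-searches for the minimizing hedge ratio.

```python
# Rank-based (empirical distribution) estimate of the MEG coefficient, Eq. (21.29),
# and a grid search for the minimum-MEG hedge ratio. Toy data.

def sample_meg(returns, v=2.0):
    """Sample MEG coefficient with G estimated by ranks: G(R_i) = rank / N."""
    n = len(returns)
    order = sorted(range(n), key=lambda i: returns[i])
    g = [0.0] * n
    for rank, i in enumerate(order, start=1):
        g[i] = rank / n
    mean_r = sum(returns) / n
    weights = [(1.0 - gi) ** (v - 1.0) for gi in g]
    mean_w = sum(weights) / n
    cov = sum((r - mean_r) * (w - mean_w) for r, w in zip(returns, weights)) / n
    return -v * cov  # non-negative; zero only for a constant hedged return

def meg_hedge_ratio(r_s, r_f, grid, v=2.0):
    """Numerically minimize the sample MEG over a grid of candidate ratios."""
    return min(grid, key=lambda h: sample_meg([s - h * f for s, f in zip(r_s, r_f)], v))

# Toy data: r_s = 0.7 * r_f exactly, so the hedged return is constant at h = 0.7
# and the MEG coefficient attains its minimum value of zero there.
r_f = [0.04, -0.03, 0.02, -0.01, 0.05, -0.02, 0.01, -0.04]
r_s = [0.7 * x for x in r_f]
grid = [i / 100 for i in range(0, 151)]
h_star = meg_hedge_ratio(r_s, r_f, grid)
```

Replacing the rank-based G with a kernel estimate, as Lien and Luo (1993a) suggest, only changes the `sample_meg` helper.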
Table 21.2 Hedge ratio coefficient using the conventional regression model

Variable    Estimate   Std. error   t-ratio   p-value
Intercept   0.1984     0.2729       0.73      0.4680
ΔF_t        0.9851     0.0034       292.53    <0.0001

collected from the Datastream database; the sample consists of 188 observations from January 31, 2005, to August 31, 2020. First, we use the OLS method, regressing the changes in spot prices on the changes in futures prices, to estimate the optimal hedge ratio. The estimate of the hedge ratio obtained from the OLS technique is reported in Table 21.2. As shown in Table 21.2, the hedge ratio of the S&P 500 index is significantly different from zero at the 1% significance level. Moreover, the estimated hedge ratio, given by the coefficient of ΔF_t, is less than unity.

Secondly, we apply a conventional regression model with heteroscedastic error terms to estimate the hedge ratio. Here, an AR(2)-GARCH(1, 1) model for the changes in spot prices regressed on the changes in futures prices is specified as follows:

ΔS_t = a_0 + a_1 ΔF_t + ε_t,  ε_t = φ_1 ε_{t−1} + φ_2 ε_{t−2} + e_t,
e_t = √h_t ν_t,  h_t = ω + α_1 e²_{t−1} + β_1 h_{t−1},

where ν_t ~ N(0, 1). The estimation results for the AR(2)-GARCH(1, 1) model are shown in Table 21.3. The coefficient estimates of the AR(2)-GARCH(1, 1) model, as shown in Table 21.3, are all significantly different from zero at the 1% significance level. This finding suggests the importance of capturing heteroscedastic error structures in the conventional regression model. In addition, the hedge ratio from the conventional regression with AR(2)-GARCH(1, 1) errors is higher than the OLS hedge ratio for the S&P 500 futures contract.

Next, we apply the CECM model to estimate the optimal hedge ratio. Here, standard augmented Dickey-Fuller (ADF) unit root tests and the Phillips and Ouliaris (1990) residual cointegration test are performed, and the optimal hedge ratios estimated by the error correction model (ECM) are presented. We apply the ADF regression to test for the presence of unit roots. The ADF test statistics, as shown in Panel A of Table 21.4, indicate that the null hypothesis of a unit root cannot be rejected for the levels of the variables. Using differenced data, the computed ADF test statistics shown in Panel B of Table 21.4 suggest that the null hypothesis is rejected at the 1% significance level. As differencing once produces stationarity, we may conclude that each series is an integrated of order one, I(1), process, which is necessary for testing the existence of cointegration. We then apply the Phillips and Ouliaris (1990) residual cointegration test to examine the presence of cointegration; its result is reported in Panel C of Table 21.4. The null hypothesis of the Phillips–Ouliaris cointegration test is that there is no cointegration present. The result indicates that the null hypothesis of no cointegration is rejected at the 1% significance level. This suggests that the spot S&P 500 index is cointegrated with the S&P 500 index futures.

Finally, we apply the ECM model in terms of Eq. (21.17) to estimate the optimal hedge ratio. Table 21.5 shows that the coefficient on the error-correction term, û_{t−1}, is significantly different from zero at the 1% significance level. This suggests that the error correction model, and in particular the long-run equilibrium error term, cannot be ignored in the conventional regression model. In addition, the ECM hedge ratio is higher than the conventional OLS hedge ratio for the S&P 500 futures contract. This finding is consistent with the results of Lien (1996, 2004), who argued that the MV hedge ratio will be smaller if the cointegration relationship is not taken into account.

Table 21.3 Hedge ratio coefficient using the conventional regression model with heteroscedastic errors

Variable          Estimate   Std. error   t-ratio   p-value
Intercept         0.0490     0.0144       3.41      0.0007
ΔF_t              0.9994     0.0008       1179.59   <0.0001
ε_{t−1} (φ_1)     −0.9873    0.0109       −90.29    <0.0001
ε_{t−2} (φ_2)     −0.9959    0.0145       −68.83    <0.0001
ω                 0.0167     0.0098       1.71      0.0866
e²_{t−1} (α_1)    0.3135     0.0543       5.78      <0.0001
h_{t−1} (β_1)     0.6855     0.0530       12.94     <0.0001
Table 21.4 Unit root and residual cointegration test results

Variable                     ADF statistic   Lag parameter   p-value
Panel A. Level data
Spot                         −1.3353         1               0.8542
Futures                      −1.3458         1               0.8498
Panel B. First-order differenced data
Spot                         −10.104         1               <0.01
Futures                      −10.150         1               <0.01
Panel C. Phillips–Ouliaris cointegration test
Phillips–Ouliaris demeaned   −60.783         1               <0.01

Table 21.5 Error correction estimates of the hedge ratio coefficient

Variable   Estimate   Std. error   t-ratio   p-value
ΔF_t       0.9892     0.0031       316.60    <0.001
û_{t−1}    −0.3423    0.0571       −5.99     <0.001
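The two-step logic behind these estimates — first a levels (cointegrating) regression, then a regression of ΔS_t on ΔF_t and the lagged residual û_{t−1} — can be sketched as follows. This Python example is our illustration on simulated cointegrated prices; all data and parameter values are invented, and it is not the chapter's R program.

```python
# Two-step error-correction hedge ratio sketch on simulated cointegrated prices:
# step 1: regress S on F and keep the residual u;
# step 2: regress dS on dF and u[t-1]; the dF coefficient is the hedge ratio.
import random

random.seed(7)
n = 400
F, S, u_true = [100.0], [100.0], 0.0
for _ in range(n - 1):
    F.append(F[-1] + random.gauss(0.0, 1.0))        # futures: random walk
    u_true = 0.2 * u_true + random.gauss(0.0, 0.1)  # stationary pricing error
    S.append(F[-1] + u_true)                        # spot cointegrated with futures

def ols2(y, x1, x2):
    """OLS of y on x1, x2, and a constant, via demeaned normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yd = [v - my for v in y]
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    s11 = sum(a * a for a in d1); s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, yd)); s2y = sum(a * b for a, b in zip(d2, yd))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Step 1: cointegrating regression S = c0 + g*F + u.
mF, mS = sum(F) / n, sum(S) / n
g = sum((f - mF) * (s - mS) for f, s in zip(F, S)) / sum((f - mF) ** 2 for f in F)
c0 = mS - g * mF
u = [s - c0 - g * f for f, s in zip(F, S)]

# Step 2: dS = a + beta*dF + gamma*u[t-1] + e.
dS = [S[t] - S[t - 1] for t in range(1, n)]
dF = [F[t] - F[t - 1] for t in range(1, n)]
beta, gamma = ols2(dS, dF, u[:-1])
```

A negative `gamma` plays the role of the û_{t−1} coefficient in Table 21.5: deviations from the long-run spot–futures relation are pulled back toward equilibrium.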
Suppose that the spot and futures prices, which are both unit-root processes, are cointegrated. In this case the futures and spot prices can be described by the following processes (see Stock and Watson 1988; Hylleberg and Mizon 1989):

S_t = A_1 P_t + A_2 s_t,   (21.31a)
F_t = B_1 P_t + B_2 s_t,   (21.31b)
P_t = P_{t−1} + w_t,   (21.31c)
s_t = a_1 s_{t−1} + v_t,  0 ≤ |a_1| < 1,   (21.31d)

where P_t and s_t are the permanent and transitory factors that drive the spot and futures prices, and w_t and v_t are white noise processes. Note that P_t follows a pure random walk process and s_t follows a stationary process. The MV hedge ratio for a k-period hedging horizon is then given by (see Geppert 1995):

H_J = [A_1 B_1 k σ_w² + 2 A_2 B_2 ((1 − a_1^k)/(1 − a_1²)) σ_v²] / [B_1² k σ_w² + 2 B_2² ((1 − a_1^k)/(1 − a_1²)) σ_v²].   (21.32)

One advantage of using Eq. (21.32) instead of a regression with non-overlapping price changes is that it avoids the reduction in sample size associated with non-overlapping differencing.

An alternative way of matching the data frequency with the hedging horizon is to use wavelets to decompose the time series into different frequencies, as suggested by Lien and Shrestha (2007). The decomposition can be done without loss of sample size (see Lien and Shrestha (2007) for details). For example, the daily spot and futures return series can be decomposed using the maximal overlap discrete wavelet transform (MODWT) as follows:

R_{s,t} = B^s_{J,t} + D^s_{J,t} + D^s_{J−1,t} + ⋯ + D^s_{1,t},
R_{f,t} = B^f_{J,t} + D^f_{J,t} + D^f_{J−1,t} + ⋯ + D^f_{1,t},

where D^s_{j,t} and D^f_{j,t} are the spot and futures return series with changes on a time scale of length 2^{j−1} days, respectively.⁴ Similarly, B^s_{J,t} and B^f_{J,t} represent the spot and futures return series corresponding to time scales of 2^J days and longer. Now, we can run the following regression to find the hedge ratio corresponding to a hedging horizon equal to 2^{j−1} days:

D^s_{j,t} = h_{j,0} + h_{j,1} D^f_{j,t} + e_j,   (21.33)

where the estimate of the hedge ratio is given by the estimate of h_{j,1}.

21.6 Summary and Conclusions

In this chapter, we have reviewed various approaches to deriving the optimal hedge ratio, as summarized in Appendix 21.1. These approaches can be divided into the mean–variance-based approach, the expected utility maximizing approach, the mean extended-Gini coefficient-based approach, and the generalized semivariance-based approach. All these approaches lead to the same hedge ratio as the conventional minimum-variance (MV) hedge ratio if the futures price follows a pure martingale process and if the futures and spot prices are jointly normal. However, if these conditions do not hold, the hedge ratios based on the various approaches will be different.

The MV hedge ratio is the most understood and most widely used hedge ratio. Since the statistical properties of the MV hedge ratio are well known, statistical hypothesis testing can be performed with it. For example, we can test whether the optimal MV hedge ratio is the same as the naïve hedge ratio. Since the MV hedge ratio ignores the expected return, it will not be consistent with mean–variance analysis unless the futures price follows a pure martingale process. Furthermore, if the martingale and normality conditions do not hold, the MV hedge ratio will not be consistent with the expected utility maximization principle. Following the MV hedge ratio is the mean–variance hedge ratio. Even though this hedge ratio incorporates the expected return in the derivation of the optimal hedge ratio, it will not be consistent with the expected utility maximization principle unless either the normality condition holds or the utility function is quadratic.

In order to make the hedge ratio consistent with the expected utility maximization principle, we can derive the optimal hedge ratio by maximizing the expected utility. However, to implement such an approach, we need to assume a
specific utility function and we need to make an assumption regarding the return distribution. Therefore, different utility functions will lead to different optimal hedge ratios. Furthermore, analytic solutions for such hedge ratios are not known, and numerical methods need to be applied.

New approaches have recently been suggested for deriving optimal hedge ratios. These include the mean-Gini coefficient-based hedge ratio, semivariance-based hedge ratios, and Value-at-Risk-based hedge ratios. These hedge ratios are consistent with the second-order stochastic dominance principle. Therefore, such hedge ratios are very general in the sense that they are consistent with the expected utility maximization principle while making very few assumptions about the utility function. The only requirement is that the marginal utility be positive and the second derivative of the utility function be negative. However, neither of these hedge ratios is unique. For example, the mean-Gini coefficient-based hedge ratio depends on the risk aversion parameter (m), and the semivariance-based hedge ratio depends on the risk aversion parameter (α) and the target return (δ). It is important to note, however, that the semivariance-based hedge ratio has some appeal in the sense that the semivariance, as a measure of risk, is consistent with the risk perceived by individuals. The same argument applies to the Value-at-Risk-based hedge ratio.

As far as the derivation of the optimal hedge ratio is concerned, almost all of the derivations do not incorporate transaction costs, and they do not allow investments in securities other than the spot and corresponding futures contracts. As shown by Lence (1995), once we relax these conventional assumptions, the resulting optimal hedge ratio can be quite different from the ones obtained under the conventional assumptions. Lence's (1995) results are based on a specific utility function and some other assumptions regarding the return distributions. It remains to be seen whether such results hold for the mean extended-Gini coefficient-based as well as semivariance-based hedge ratios.

In this chapter, we have also reviewed various ways of estimating the optimum hedge ratio, as summarized in Appendix 21.2. As far as the estimation of the conventional MV hedge ratio is concerned, a large number of methods have been proposed in the literature. These methods range from a simple regression method to complex cointegrated heteroscedastic methods with regime switching, and some of the estimation methods include a kernel density function method as well as an empirical distribution method. Except for many of the mean–variance-based hedge ratios, the estimation involves the use of a numerical technique, because most of the optimal hedge ratio formulae do not have a closed-form analytic expression. Again, it is important to mention that, based on his specific model, Lence (1995) finds that the value of complicated and sophisticated estimation methods is negligible. It remains to be seen whether such a result holds for the mean extended-Gini coefficient-based as well as semivariance-based hedge ratios.

In this chapter, we have also discussed the relationship between the optimal MV hedge ratio and the hedging horizon. We feel that this relationship has not been fully explored and can be further developed in the future. For example, we would like to know whether the optimal hedge ratio approaches the naïve hedge ratio as the hedging horizon becomes longer.

The main thing we learn from this review is that if the futures price follows a pure martingale process and if the returns are jointly normally distributed, then all the different hedge ratios are the same as the conventional MV hedge ratio, which is simple to compute and easy to understand. However, if these two conditions do not hold, then there are many optimal hedge ratios (depending on which objective function one is trying to optimize), and there is no single optimal hedge ratio that is distinctly superior to the remaining ones. Therefore, further research needs to be done to unify these different approaches to the hedge ratio.

For those who are interested in research in this area, we would finally like to point out that one requires a good understanding of financial economic theories and econometric methodologies. In addition, a good background in data analysis and computer programming would also be helpful.
Appendix 21.1: Theoretical Models
Notes

A. Return Model

(Ret1) ΔV_H = C_s ΔP_s + C_f ΔP_f ⇒ hedge ratio: H = −C_f/C_s, where C_s = units of spot commodity and C_f = units of futures contract
(Ret2) R_h = R_s + h R_f, where R_s = (S_t − S_{t−1})/S_{t−1} and
  (a) R_f = (F_t − F_{t−1})/F_{t−1} ⇒ hedge ratio: h = C_f F_{t−1}/(C_s S_{t−1})
  (b) R_f = (F_t − F_{t−1})/S_{t−1} ⇒ hedge ratio: h = C_f/C_s

B. Objective Function

(O1) Minimize Var(R_h) = C_s² σ_s² + C_f² σ_f² + 2 C_s C_f σ_{sf}, or Var(R_h) = σ_s² + h² σ_f² + 2h σ_{sf}
(O2) Maximize E(R_h) − (A/2) Var(R_h)
(O3) Maximize [E(R_h) − R_F]/√Var(R_h) (Sharpe ratio), where R_F = risk-free interest rate
(O4) Maximize E[U(W)], where U(·) = utility function and W = terminal wealth
(O5) Minimize Γ_v(R_h) = −v Cov(R_h, [1 − F(R_h)]^{v−1})

C. Hedging Effectiveness

(E1) e = 1 − Var(R_h)/Var(R_s)
(E2) e = R_h^{ce} − R_s^{ce}, where R_h^{ce} (R_s^{ce}) = certainty equivalent return of the hedged (unhedged) portfolio
(E4) e = 1 − V_{δ,α}(R_h)/V_{δ,α}(R_s)
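The variance-reduction measure (E1) is the most commonly reported effectiveness statistic and is straightforward to compute once a hedge ratio is in hand. The Python sketch below is our illustration with toy returns (the convention R_h = R_s − h·R_f is an assumption of the sketch): it estimates the MV hedge ratio from the sample and reports the implied variance reduction.

```python
# Hedging effectiveness (E1): e = 1 - Var(R_h) / Var(R_s), with R_h = R_s - h*R_f.
# Toy data; the MV hedge ratio h = Cov/Var is estimated from the same sample.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def effectiveness(r_s, r_f):
    """Return (MV hedge ratio, variance-reduction effectiveness e)."""
    m_s, m_f = sum(r_s) / len(r_s), sum(r_f) / len(r_f)
    cov = sum((a - m_s) * (b - m_f) for a, b in zip(r_s, r_f)) / len(r_f)
    h = cov / variance(r_f)                      # MV hedge ratio
    r_h = [a - h * b for a, b in zip(r_s, r_f)]  # hedged portfolio return
    return h, 1.0 - variance(r_h) / variance(r_s)

# Perfectly correlated toy returns: hedging should remove (almost) all variance.
r_f = [0.03, -0.02, 0.01, -0.04, 0.02, 0.05, -0.01]
r_s = [0.9 * x for x in r_f]
h, e = effectiveness(r_s, r_f)
```

With real data the correlation is imperfect, so e falls below one; comparing e across estimators (OLS, GARCH, ECM) is a standard way to rank hedge ratio methods.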
Appendix 21.2: Empirical Models
Notes
A:1. OLS
(M2): Rs
The return vector Y ¼ is assumed to have skew-normal distribution with covariance matrix V:
Rf
V ð1;2Þ
Hedge ration ¼ Hskn ¼ V ð2;2Þ
A.3. ARCH/GARCH

(M6): $S_t = a + bF_t + u_t$;
$\Delta S_t = \rho u_{t-1} + \beta \Delta F_t + \sum_{i=1}^{m}\delta_i \Delta F_{t-i} + \sum_{j=1}^{n}\theta_j \Delta S_{t-j} + e_t$; EC hedge ratio $= \beta$

(M9): Hedge ratio $= h_2 = \dfrac{C_f F}{C_s S} = \dfrac{E(R_f)}{A\sigma_f^2} - \rho\dfrac{\sigma_s}{\sigma_f}$, where the moments $E(R_f)$, $\sigma_s$, $\sigma_f$, and $\rho$ are estimated by sample moments
(M11): The hedge ratio is estimated by numerically minimizing the following mean extended-Gini coefficient, where the cumulative probability distribution function is estimated using the rank function:

$\hat{\Gamma}_v(R_h) = -\frac{v}{N}\sum_{i=1}^{N}\left(R_{h,i}-\bar{R}_h\right)\left[\left(1-\hat{G}(R_{h,i})\right)^{v-1}-\bar{H}\right]$,

where $\bar{H}$ denotes the sample mean of $(1-\hat{G}(R_{h,i}))^{v-1}$

(M12): The hedge ratio is estimated by numerically solving the first-order condition, where the cumulative probability distribution function is estimated using the rank function

(M13): The hedge ratio is estimated by numerically solving the first-order condition, where the cumulative probability distribution function is estimated using kernel-based estimates

(M14): The hedge ratio is estimated by numerically maximizing the following function:

$U(R_h) = E(R_h) - \Gamma_v(R_h)$,

where the expected value and the mean extended-Gini coefficient are replaced by their sample counterparts and the cumulative probability distribution function is estimated using the rank function

(M15): The hedge ratio is estimated by numerically minimizing the following sample generalized semivariance:

$V_{\delta,\alpha}^{sample}(R_h) = \frac{1}{N}\sum_{i=1}^{N}\left(\delta - R_{h,i}\right)^{\alpha} U\!\left(\delta - R_{h,i}\right), \quad \text{where } U\!\left(\delta - R_{h,i}\right) = \begin{cases} 1 & \text{for } \delta \ge R_{h,i} \\ 0 & \text{for } \delta < R_{h,i} \end{cases}$

(M16): The hedge ratio is estimated by numerically maximizing the following function:

$U(R_h) = \bar{R}_h - V_{\delta,\alpha}^{sample}(R_h)$
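The rank-based estimation in (M11) can be sketched numerically. The simulated returns, the choice $v = 2$, and the search bounds below are all illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

rng = np.random.default_rng(4)
n = 300
cov = [[0.02**2, 0.9 * 0.02 * 0.025],
       [0.9 * 0.02 * 0.025, 0.025**2]]
Rs, Rf = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

def meg(h, Rs, Rf, v=2.0):
    """Sample mean extended-Gini coefficient of R_h = R_s + h R_f, with the
    cumulative distribution function G estimated by the rank function."""
    Rh = Rs + h * Rf
    G = rankdata(Rh) / len(Rh)                      # empirical CDF via ranks
    return -v * np.cov(Rh, (1.0 - G) ** (v - 1.0))[0, 1]

res = minimize_scalar(meg, bounds=(-2.0, 2.0), args=(Rs, Rf), method="bounded")
h_meg = res.x
```

For elliptically distributed returns, the MEG hedge ratio is close to the minimum-variance ratio, so `h_meg` should land near $-\rho\sigma_s/\sigma_f$ here.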
Appendix 21.3: Monthly Data of S&P500 Index and Its Futures (January 2005–August 2020)
library(tseries)

# SP500 is a data frame containing the monthly SPOT and FUTURES series
# listed in this appendix.

# Augmented Dickey-Fuller test
# Level data
adf.test(SP500$SPOT, k = 1)
adf.test(SP500$FUTURES, k = 1)

# First-order differenced data
adf.test(diff(SP500$SPOT), k = 1)
adf.test(diff(SP500$FUTURES), k = 1)
References

Baillie, R.T., & Myers, R.J. (1991). Bivariate GARCH estimation of the optimal commodity futures hedge. Journal of Applied Econometrics, 6, 109–124.
Bawa, V.S. (1978). Safety-first, stochastic dominance, and optimal portfolio choice. Journal of Financial and Quantitative Analysis, 13, 255–271.
Benet, B.A. (1992). Hedge period length and ex-ante futures hedging effectiveness: the case of foreign-exchange risk cross hedges. Journal of Futures Markets, 12, 163–175.
Cecchetti, S.G., Cumby, R.E., & Figlewski, S. (1988). Estimation of the optimal futures hedge. Review of Economics and Statistics, 70, 623–630.
Chen, S.S., Lee, C.F., & Shrestha, K. (2001). On a mean-generalized semivariance approach to determining the hedge ratio. Journal of Futures Markets, 21, 581–598.
Cheung, C.S., Kwan, C.C.Y., & Yip, P.C.Y. (1990). The hedging effectiveness of options and futures: a mean-Gini approach. Journal of Futures Markets, 10, 61–74.
Chou, W.L., Fan, K.K., & Lee, C.F. (1996). Hedging with the Nikkei index futures: the conventional model versus the error correction model. Quarterly Review of Economics and Finance, 36, 495–505.
Crum, R.L., Laughhunn, D.L., & Payne, J.W. (1981). Risk-seeking behavior and its implications for financial models. Financial Management, 10, 20–27.
D'Agostino, R.B. (1971). An omnibus test of normality for moderate and large size samples. Biometrika, 58, 341–348.
De Jong, A., De Roon, F., & Veld, C. (1997). Out-of-sample hedging effectiveness of currency futures for alternative models and hedging strategies. Journal of Futures Markets, 17, 817–837.
Dickey, D.A., & Fuller, W.A. (1981). Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica, 49, 1057–1072.
Ederington, L.H. (1979). The hedging performance of the new futures markets. Journal of Finance, 34, 157–170.
Engle, R.F., & Granger, C.W. (1987). Co-integration and error correction: representation, estimation and testing. Econometrica, 55, 251–276.
Fishburn, P.C. (1977). Mean-risk analysis with risk associated with below-target returns. American Economic Review, 67, 116–126.
Geppert, J.M. (1995). A statistical model for the relationship between futures contract hedging effectiveness and investment horizon length. Journal of Futures Markets, 15, 507–536.
Ghosh, A. (1993). Hedging with stock index futures: estimation and forecasting with error correction model. Journal of Futures Markets, 13, 743–752.
Grammatikos, T., & Saunders, A. (1983). Stability and the hedging performance of foreign currency futures. Journal of Futures Markets, 3, 295–305.
Howard, C.T., & D'Antonio, L.J. (1984). A risk-return measure of hedging effectiveness. Journal of Financial and Quantitative Analysis, 19, 101–112.
Hsin, C.W., Kuo, J., & Lee, C.F. (1994). A new measure to compare the hedging effectiveness of foreign currency futures versus options. Journal of Futures Markets, 14, 685–707.
Hung, J.C., Chiu, C.L., & Lee, M.C. (2006). Hedging with zero-value at risk hedge ratio. Applied Financial Economics, 16, 259–269.
Hylleberg, S., & Mizon, G.E. (1989). Cointegration and error correction mechanisms. Economic Journal, 99, 113–125.
Jarque, C.M., & Bera, A.K. (1987). A test for normality of observations and regression residuals. International Statistical Review, 55, 163–172.
Johansen, S., & Juselius, K. (1990). Maximum likelihood estimation and inference on cointegration—with applications to the demand for money. Oxford Bulletin of Economics and Statistics, 52, 169–210.
Johnson, L.L. (1960). The theory of hedging and speculation in commodity futures. Review of Economic Studies, 27, 139–151.
Junkus, J.C., & Lee, C.F. (1985). Use of three index futures in hedging decisions. Journal of Futures Markets, 5, 201–222.
Kolb, R.W., & Okunev, J. (1992). An empirical evaluation of the extended mean-Gini coefficient for futures hedging. Journal of Futures Markets, 12, 177–186.
Kolb, R.W., & Okunev, J. (1993). Utility maximizing hedge ratios in the extended mean Gini framework. Journal of Futures Markets, 13, 597–609.
Kroner, K.F., & Sultan, J. (1993). Time-varying distributions and dynamic hedging with foreign currency futures. Journal of Financial and Quantitative Analysis, 28, 535–551.
Lee, C.F., Bubnys, E.L., & Lin, Y. (1987). Stock index futures hedge ratios: test on horizon effects and functional form. Advances in Futures and Options Research, 2, 291–311.
Lee, H.T., & Yoder, J. (2007). Optimal hedging with a regime-switching time-varying correlation GARCH model. Journal of Futures Markets, 27, 495–516.
Lence, S.H. (1995). The economic value of minimum-variance hedges. American Journal of Agricultural Economics, 77, 353–364.
Lence, S.H. (1996). Relaxing the assumptions of minimum variance hedging. Journal of Agricultural and Resource Economics, 21, 39–55.
Lien, D. (1996). The effect of the cointegration relationship on futures hedging: a note. Journal of Futures Markets, 16, 773–780.
Lien, D. (2004). Cointegration and the optimal hedge ratio: the general case. Quarterly Review of Economics and Finance, 44, 654–658.
Lien, D., & Luo, X. (1993a). Estimating the extended mean-Gini coefficient for futures hedging. Journal of Futures Markets, 13, 665–676.
Lien, D., & Luo, X. (1993b). Estimating multiperiod hedge ratios in cointegrated markets. Journal of Futures Markets, 13, 909–920.
Lien, D., & Shaffer, D.R. (1999). Note on estimating the minimum extended Gini hedge ratio. Journal of Futures Markets, 19, 101–113.
Lien, D., & Shrestha, K. (2007). An empirical analysis of the relationship between hedge ratio and hedging horizon using wavelet analysis. Journal of Futures Markets, 27, 127–150.
Lien, D., & Shrestha, K. (2010). Estimating optimal hedge ratio: a multivariate skew-normal distribution. Applied Financial Economics, 20, 627–636.
Lien, D., & Tse, Y.K. (1998). Hedging time-varying downside risk. Journal of Futures Markets, 18, 705–722.
Lien, D., & Tse, Y.K. (2000). Hedging downside risk with futures contracts. Applied Financial Economics, 10, 163–170.
Malliaris, A.G., & Urrutia, J.L. (1991). The impact of the lengths of estimation periods and hedging horizons on the effectiveness of a hedge: evidence from foreign currency futures. Journal of Futures Markets, 3, 271–289.
Myers, R.J., & Thompson, S.R. (1989). Generalized optimal hedge ratio estimation. American Journal of Agricultural Economics, 71, 858–868.
Osterwald-Lenum, M. (1992). A note with quantiles of the asymptotic distribution of the maximum likelihood cointegration rank test statistics. Oxford Bulletin of Economics and Statistics, 54, 461–471.
Phillips, P.C.B., & Ouliaris, S. (1990). Asymptotic properties of residual based tests for cointegration. Econometrica, 58, 165–193.
Phillips, P.C.B., & Perron, P. (1988). Testing unit roots in time series regression. Biometrika, 75, 335–346.
Rutledge, D.J.S. (1972). Hedgers' demand for futures contracts: a theoretical framework with applications to the United States soybean complex. Food Research Institute Studies, 11, 237–256.
Sephton, P.S. (1993a). Hedging wheat and canola at the Winnipeg commodity exchange. Applied Financial Economics, 3, 67–72.
Sephton, P.S. (1993b). Optimal hedge ratios at the Winnipeg commodity exchange. Canadian Journal of Economics, 26, 175–193.
Shalit, H. (1995). Mean-Gini hedging in futures markets. Journal of Futures Markets, 15, 617–635.
Stock, J.H., & Watson, M.W. (1988). Testing for common trends. Journal of the American Statistical Association, 83, 1097–1107.
Working, H. (1953). Hedging reconsidered. Journal of Farm Economics, 35, 544–561.
22 Application of Simultaneous Equation in Finance Research: Methods and Empirical Results

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_22
jointly determined, and thereby apply GMM in the estimation of simultaneous equations. They find that covenants can mitigate the agency costs of debt for high-growth firms. Berger and Bonaccorsi di Patti (2006) argue that the agency costs hypothesis predicts that leverage affects firm performance, yet firm performance also affects the choice of capital structure. To address this problem of reverse causality between firm performance and capital structure, they use 2SLS to estimate a simultaneous equations model. They also estimate by 3SLS, which does not change the main finding that higher leverage is associated with higher profit efficiency. For a similar reason, Ruland and Zhou (2005) consider the potential endogeneity between firms' excess value and leverage and find, using 2SLS, that the values of diversified firms increase with leverage compared to specialized firms. Aggarwal and Kyaw (2010) recognize the interdependence between capital structure and dividend payout policy by using 2SLS and find that multinational companies have significantly lower debt ratios and pay higher dividends than domestic companies. MacKay and Phillips (2005) use GMM and find that financial structure, technology, and risk are jointly determined within industries.

In addition, simultaneous equations models are applied in studies considering the interrelationship among a firm's major policies. Higgins (1972), Fama (1974), and Morgan and Saint-Pierre (1978) investigate the relationship between the investment decision and the dividend decision. Grabowski and Mueller (1972) examine the interrelationship among investment, dividend, and research and development (R&D). Fama and French (2002) consider the interaction between dividend and financing decisions. Dhrymes and Kurz (1967), McDonald et al. (1975), McCabe (1979), Peterson and Benesh (1983), and Switzer (1984) argue that the investment decision is related to the financing and dividend decisions. Lee et al. (2016) empirically investigate the interrelationship among investment, financing, and dividend decisions using the GMM method. Harford et al. (2014) consider the interdependence of a firm's cash holdings and the maturity of its debt by using a simultaneous equation framework and performing a 2SLS estimation. Moreover, Lee and Lin (2020) theoretically investigate how the unknown variance of measurement error in dividend and investment decisions can be identified by the over-identified information in a simultaneous equation system.

The above literature review shows that many finance studies acknowledge the existence of endogeneity problems caused by omitted variables, measurement errors, and/or simultaneity; however, few studies provide the reason for the selected estimation method (e.g., 2SLS, 3SLS, and/or GMM). In fact, different methods of estimating simultaneous equations rely on different assumptions and are therefore not perfect substitutes. For example, the parameters estimated by 3SLS, which is a full-information estimation method, are asymptotically more efficient than those from a limited-information method (e.g., 2SLS), although 3SLS is vulnerable to model specification errors. Thus, a comprehensive analysis of which method is best for the model selection would require some contemplation and relevant statistical tests. Moreover, the instrumental variables used in finance studies are usually chosen arbitrarily. Thus, in Sect. 22.3, we will discuss the differences among the 2SLS, 3SLS, and GMM methods, present the applicable method under different conditions, and also present the related tests for the validity of instruments.

22.3 Methodology

In this section, we review the 2SLS, 3SLS, and GMM methods applied in estimating simultaneous equations models. Suppose that a set of observations on a variable y is drawn independently from a probability distribution that depends on an unknown vector of parameters β of interest. One general approach for estimating β is maximum likelihood (ML) estimation. The intuition behind ML estimation is to specify a probability distribution for the data and then find an estimate β̂ under which the data would be most likely to have been observed. The drawback of maximum likelihood methods is that we have to specify a full probability distribution for the data. Here, we introduce an alternative approach to parameter estimation known as the generalized method of moments (GMM). GMM estimation was formalized by Hansen (1982) and is one of the most widely used methods of estimation in economics and finance. In contrast to ML estimation, GMM estimation only requires the specification of certain moment conditions rather than the form of the likelihood function.

The idea behind GMM estimation is to choose a parameter estimate so as to make the sample moment conditions as close as possible to the population moment of zero according to a Euclidean distance measure. GMM estimation uses a weighting matrix reflecting the importance given to matching each of the moments; alternative weighting matrices are associated with alternative estimators. Many standard estimators, including the ordinary least squares (OLS), method of moments (MM), ML, instrumental variable (IV), two-stage least squares (2SLS), and three-stage least squares (3SLS) estimators, can be seen as special cases of GMM estimators. For example, when the number of moment conditions and unknown parameters is the same, solving the quadratic criterion yields the GMM estimator, which is the same as the MM estimator that sets the sample moment conditions exactly equal to zero. The weighting matrix does not matter in this case. In particular, in models for which there are more
moment conditions than model parameters, GMM estimation provides a straightforward way to test the specification of the proposed model. This is an important feature that is unique to GMM estimation.

Recently, the endogeneity concern has received much attention in empirical corporate finance research. There are at least three generally recognized sources of endogeneity: omitted explanatory variables, simultaneity bias, and errors in variables. Whenever there is endogeneity, the application of OLS estimation yields biased and inconsistent estimates. In the literature, IV methods are commonly used to deal with this endogeneity problem. The basic motivation for the IV method is to deal with equations that exhibit both simultaneity and measurement errors in exogenous variables. The idea behind IV estimation is to select suitable instruments that are orthogonal to the disturbance while sufficiently correlated with the regressors. The IV estimator makes linear combinations of the sample orthogonality conditions close to zero. The GMM estimator proposed by Hansen (1982) is also based on orthogonality conditions and provides an alternative solution. Hansen's (1982) GMM estimator generalizes Sargan's (1958, 1959) linear and nonlinear IV estimators based on an optimal weighting matrix for the moment conditions. In contrast to traditional IV-class estimators such as the 2SLS and 3SLS estimators, the GMM estimator uses a weighting matrix that accounts for temporal dependence, heteroskedasticity, or autocorrelation.

Here, we review the application of GMM estimation in the linear regression model and then survey GMM estimation applied to simultaneous equations models.

22.3.1 Application of GMM Estimation in the Linear Regression Model

Consider the following linear regression model:

$y_t = x_t\beta + \varepsilon_t, \quad t = 1, \ldots, T \quad (22.1)$

where $y_t$ is the endogenous variable, $x_t$ is a $1 \times K$ regressor vector that includes a constant term, and $\varepsilon_t$ is the error term. Here, β denotes a $K \times 1$ parameter vector of interest. The critical assumption made for OLS estimation is that the disturbance $\varepsilon_t$ is uncorrelated with the regressors $x_t$: $E(x_t'\varepsilon_t) = 0$. The $T$ observations in model (22.1) can be written in matrix form as

$Y = X\beta + \varepsilon \quad (22.2)$

where $Y$ denotes the $T \times 1$ data vector for the endogenous variable and $X$ is the $T \times K$ data matrix for all regressors. In this matrix notation, the OLS estimator for β is as follows:

$\hat{\beta}_{OLS} = (X'X)^{-1}X'Y \quad (22.3)$

If the disturbance term is correlated with at least some components of the regressors, we say that the regressors are endogenous. Whenever there is endogeneity, the application of ordinary least squares (OLS) estimation to equation (22.2) yields biased and inconsistent estimates. Instrumental variable (IV) methods are commonly used to deal with this endogeneity problem. In a typical IV application, the researcher first chooses a set of exogenous variables as instruments and applies two-stage least squares (2SLS) to estimate the parameter β. A good instrument should be highly correlated with the endogenous regressors while uncorrelated with the disturbance in the structural equation. The IV estimator for β can be regarded as the solution to moment conditions of the form

$E[z_t'\varepsilon_t] = E[z_t'(y_t - x_t\beta)] = 0 \quad (22.4)$

where $z_t$ is a $1 \times L$ vector of instrumental variables that are uncorrelated with the disturbance but correlated with $x_t$, and the sample moment conditions are

$\frac{1}{T}\sum_{t=1}^{T} z_t'(y_t - x_t\hat{\beta}) = 0 \quad (22.5)$

Let $Z$ denote the $T \times L$ instrument matrix. If the system is just identified ($L = K$) and $Z'X$ is invertible, the system of sample moment conditions in (22.5) has a unique solution, the IV estimator

$\hat{\beta}_{IV} = (Z'X)^{-1}Z'Y \quad (22.6)$

If the number of instruments exceeds the number of explanatory variables ($L > K$), the system in (22.5) is over-identified. The question then arises of how to select or combine the more-than-enough moment conditions to obtain $K$ equations. Here the two-stage least squares (2SLS) estimator, which is the most efficient IV estimator among all possible linear combinations of the valid instruments under homoscedasticity, is employed. The first stage of the 2SLS estimator regresses each endogenous regressor on all instruments to get its OLS prediction, expressed in matrix notation as $\hat{X} = Z(Z'Z)^{-1}Z'X$. The second stage regresses the dependent variable on $\hat{X}$ to obtain the 2SLS estimator for β, $\hat{\beta}_{2SLS} = (\hat{X}'\hat{X})^{-1}\hat{X}'Y$.
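The two stages can be carried out directly in matrix form. The following sketch uses an invented data-generating process with one endogenous regressor and two instruments to contrast OLS with 2SLS:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500

# Instruments z1, z2 are exogenous; x is endogenous through the common shock u.
z1 = rng.normal(size=T)
z2 = rng.normal(size=T)
u = rng.normal(size=T)
x = 1.0 * z1 + 0.7 * z2 + u
eps = 0.8 * u + rng.normal(size=T)         # Cov(x, eps) != 0 -> OLS is inconsistent
beta_true = 2.0
y = beta_true * x + eps

X = np.column_stack([np.ones(T), x])       # regressors (constant + endogenous x)
Z = np.column_stack([np.ones(T), z1, z2])  # instruments: L = 3 > K = 2, over-identified

# OLS, biased under endogeneity: (X'X)^{-1} X'Y, as in (22.3).
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# 2SLS first stage: X_hat = Z (Z'Z)^{-1} Z'X.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
# 2SLS second stage: (X_hat' X_hat)^{-1} X_hat' Y.
b_2sls = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)
```

Here `b_ols[1]` is biased upward by roughly $\mathrm{Cov}(x,\varepsilon)/\mathrm{Var}(x)$, while `b_2sls[1]` recovers the true slope of 2.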
Thus, the GMM estimator is simply the 2SLS estimator under conditional homoscedasticity.

There are two approaches to estimating the structural parameters δ and γ of the system: one is single-equation estimation and the other is system estimation. First, we introduce the single-equation estimation shown below. We can rewrite the j-th equation of our simultaneous equations model in terms of the full set of $T$ observations:

$y_j = Y_j\delta_j + X_j\gamma_j + \varepsilon_j = Z_j\beta_j + \varepsilon_j, \quad j = 1, 2, \ldots, J, \quad (22.17)$

where $y_j$ denotes the $T \times 1$ vector of observations for the endogenous variable on the left-hand side of the j-th equation, $Y_j$ denotes the $T \times (J-1)$ data matrix for the endogenous variables on the right-hand side of this equation, and $X_j$ is a data matrix for all exogenous variables in this equation. Since the jointly determined variables $y_j$ and $Y_j$ are determined within the system, they are correlated with the disturbance terms. This correlation usually creates estimation difficulties because the OLS estimator would be biased and inconsistent (e.g., Johnston and DiNardo 1997; Greene 2011).

As discussed above, the application of OLS estimation to equation (22.17) yields biased and inconsistent estimates because of the correlation of $Z_j$ and $\varepsilon_j$. The 2SLS approach is the most common method used to deal with this endogeneity problem. The 2SLS estimation uses all the exogenous variables in the system as instruments to obtain the predictions of $Y_j$. In the first stage, we regress $Y_j$ on all exogenous variables in the system.

$\hat{W}_j = \frac{1}{T^2}\sum_{t=1}^{T} x_t'\hat{\varepsilon}_{jt}\hat{\varepsilon}_{jt}x_t \quad (22.20)$

The GMM estimator based on the moment conditions (22.19) minimizes the following quadratic function:

$\left[\sum_{t=1}^{T} x_t'(y_{jt} - Z_{jt}\beta_j)\right]' \hat{W}_j^{-1} \left[\sum_{t=1}^{T} x_t'(y_{jt} - Z_{jt}\beta_j)\right] \quad (22.21)$

The GMM estimator that minimizes this quadratic function (22.21) is obtained as

$\hat{\beta}_{GMM} = \left[(Z_j'X)\hat{W}_j^{-1}(X'Z_j)\right]^{-1}\left[(Z_j'X)\hat{W}_j^{-1}(X'y_j)\right]. \quad (22.22)$

In the homoscedastic and serially independent case, a good estimate of the weighting matrix $\hat{W}_j$ would be

$\hat{W} = \frac{\hat{\sigma}^2}{T}(X'X). \quad (22.23)$

Given that the estimate $\hat{\sigma}^2$ is obtained, rearranging terms in equation (22.22) yields

$\hat{\beta}_{GMM} = \left[(Z_j'X)(X'X)^{-1}(X'Z_j)\right]^{-1}(Z_j'X)(X'X)^{-1}(X'y_j).$
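The computational steps of this subsection (2SLS residuals, the heteroskedasticity-robust weighting matrix as in (22.20), and the GMM estimator (22.22)) can be sketched as follows. The simulated design and coefficient values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# All exogenous variables of the hypothetical system serve as instruments.
x1, x2, x3 = rng.normal(size=(3, T))
u = rng.normal(size=T)
yend = x1 + 0.5 * x2 + 0.5 * x3 + u      # right-hand-side endogenous variable
e = 0.6 * u * (1.0 + 0.5 * x1**2)        # heteroskedastic, correlated with yend
y = 1.5 * yend + 0.8 * x1 + e            # structural equation, beta = (1.5, 0.8)

Zj = np.column_stack([yend, x1])         # regressors Z_j of the j-th equation
X = np.column_stack([x1, x2, x3])        # instrument matrix X

# Step 1: 2SLS to obtain residuals e_hat.
Zh = X @ np.linalg.solve(X.T @ X, X.T @ Zj)
b_2sls = np.linalg.solve(Zh.T @ Zh, Zh.T @ y)
eh = y - Zj @ b_2sls

# Step 2: weighting matrix W_hat = (1/T^2) sum_t e_hat_t^2 x_t' x_t.
W = (X * (eh**2)[:, None]).T @ X / T**2

# Step 3: GMM estimator [(Zj'X) W^{-1} (X'Zj)]^{-1} (Zj'X) W^{-1} (X'y).
A = Zj.T @ X
Winv = np.linalg.inv(W)
b_gmm = np.linalg.solve(A @ Winv @ A.T, A @ Winv @ (X.T @ y))
```

Both estimates should be close to the true coefficients here; the GMM step mainly changes how the over-identifying moments are weighted.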
$E(\varepsilon\varepsilon') = \Sigma \otimes I_T$, where $\otimes$ signifies the Kronecker product. Here, Σ is defined as

$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1J} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{J1} & \sigma_{J2} & \cdots & \sigma_{JJ} \end{bmatrix}. \quad (22.26)$

The 3SLS approach is the most common method used to estimate the structural parameters of this system simultaneously. Basically, the 3SLS estimator is a generalized least squares (GLS) estimator of the entire system taking account of the covariance matrix in equation (22.26). The 3SLS estimator is equivalent to using all exogenous variables as instruments and estimating the entire system by GLS (Intriligator et al. 1996). The 3SLS estimation uses all exogenous variables $X = [X_1\; X_2\; \cdots\; X_J]$ as instruments in each equation of the system; pre-multiplying the model (22.25) by $X_I' = \mathrm{diag}[X' \;\cdots\; X'] = I_J \otimes X'$ yields the model

$X_I'Y = X_I'Z\beta + X_I'\varepsilon. \quad (22.27)$

The covariance matrix from (22.26) is

$\mathrm{Cov}(X_I'\varepsilon) = X_I'\,\mathrm{Cov}(\varepsilon)\,X_I = X_I'(\Sigma \otimes I_T)X_I. \quad (22.28)$

The GLS estimator of equation (22.27) is the 3SLS estimator. Thus, the 3SLS estimator is given as follows:

$\hat{\beta}_{3SLS} = \left\{Z'X_I\left[X_I'(\Sigma \otimes I_T)X_I\right]^{-1}X_I'Z\right\}^{-1} Z'X_I\left[X_I'(\Sigma \otimes I_T)X_I\right]^{-1}X_I'Y. \quad (22.29)$

If Σ is a diagonal matrix, the 3SLS estimator is equivalent to the 2SLS estimator. As discussed above, for the GMM estimator with all exogenous variables $X = [X_1\; X_2\; \cdots\; X_J]$ as instruments, the moment conditions of the system (22.25) are

$E[X_I'\varepsilon] = E[X_I'(Y - Z\beta)] = \left(E[X'(y_1 - Z_1\beta_1)]'\;\; E[X'(y_2 - Z_2\beta_2)]'\;\; \cdots\;\; E[X'(y_J - Z_J\beta_J)]'\right)' = 0. \quad (22.30)$

We can apply the 2SLS estimator with instruments $X$ to estimate each $\beta_j$ and obtain the sample residuals $\hat{\varepsilon}_j = y_j - Z_j\hat{\beta}_{j,2SLS}$. We then compute the weighting matrix $\hat{W}_{jl}$ for the GMM estimator based on those residuals as follows:

$\hat{W}_{jl} = \frac{1}{T^2}\sum_{t=1}^{T} x_t'\hat{\varepsilon}_{jt}\hat{\varepsilon}_{lt}x_t. \quad (22.31)$

The system GMM estimator based on the moment conditions (22.30) minimizes the quadratic function:

$\begin{bmatrix} X'(y_1 - Z_1\beta_1) \\ X'(y_2 - Z_2\beta_2) \\ \vdots \\ X'(y_J - Z_J\beta_J) \end{bmatrix}' \begin{bmatrix} \hat{W}_{11} & \hat{W}_{12} & \cdots & \hat{W}_{1J} \\ \hat{W}_{21} & \hat{W}_{22} & \cdots & \hat{W}_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{W}_{J1} & \hat{W}_{J2} & \cdots & \hat{W}_{JJ} \end{bmatrix}^{-1} \begin{bmatrix} X'(y_1 - Z_1\beta_1) \\ X'(y_2 - Z_2\beta_2) \\ \vdots \\ X'(y_J - Z_J\beta_J) \end{bmatrix}. \quad (22.32)$

The GMM estimator that minimizes this quadratic function (22.32) is obtained as

$\begin{bmatrix} \hat{\beta}_{1,GMM} \\ \hat{\beta}_{2,GMM} \\ \vdots \\ \hat{\beta}_{J,GMM} \end{bmatrix} = \begin{bmatrix} Z_1'X\hat{W}_{11}^{-1}X'Z_1 & \cdots & Z_1'X\hat{W}_{1J}^{-1}X'Z_J \\ Z_2'X\hat{W}_{21}^{-1}X'Z_1 & \cdots & Z_2'X\hat{W}_{2J}^{-1}X'Z_J \\ \vdots & \ddots & \vdots \\ Z_J'X\hat{W}_{J1}^{-1}X'Z_1 & \cdots & Z_J'X\hat{W}_{JJ}^{-1}X'Z_J \end{bmatrix}^{-1} \begin{bmatrix} \sum_{l=1}^{J} Z_1'X\hat{W}_{1l}^{-1}X'y_l \\ \sum_{l=1}^{J} Z_2'X\hat{W}_{2l}^{-1}X'y_l \\ \vdots \\ \sum_{l=1}^{J} Z_J'X\hat{W}_{Jl}^{-1}X'y_l \end{bmatrix}. \quad (22.33)$

The 2SLS and 3SLS estimators are special cases of system GMM estimators. If $\hat{W}_{jj} = \frac{\hat{\sigma}_{jj}}{T}\sum_{t=1}^{T} x_t'x_t$ and $\hat{W}_{jl} = 0$ for $j \neq l$, then the system GMM estimator is equivalent to the 2SLS estimator. In the case that $\hat{W}_{jl} = \frac{\hat{\sigma}_{jl}}{T}\sum_{t=1}^{T} x_t'x_t$, the system GMM estimator is equivalent to the 3SLS estimator.

22.3.3 Weak Instruments

As mentioned above, we have introduced three alternative approaches (2SLS, 3SLS, and GMM) to estimate a simultaneous equations system. Regardless of whether 2SLS, 3SLS, or GMM estimation is used in the second stage, the first-stage regression instrumenting for the endogenous regressors is estimated via OLS. The choice of instruments is critical to the consistent estimation of IV methods. Previous work has demonstrated that if the instruments are weak, the IV estimator will not possess its ideal properties and will be misleading (e.g., Bound et al. 1995; Staiger and Stock 1997; Stock and Yogo 2005).

A simple way to detect the presence of weak instruments is to look at the R² or F-statistic of the first-stage regression testing the hypothesis that the coefficients on the instruments are jointly equal to zero (Wang 2015). Intuitively, the first-stage F-statistic must be large, typically exceeding 10, for inference from 2SLS estimation to be reliable (Staiger and Stock 1997; Stock et al. 2002). In addition, Hahn and
Hausman (2005) show that the relative bias of 2SLS estimation declines as the strength of the correlation between the instruments and the endogenous regressor increases, but grows with the number of instruments. Stock and Yogo (2005) tabulate critical values for the first-stage F-statistic to test whether instruments are weak. They report, for instance, that when there is one endogenous regressor, the first-stage F-statistic of the 2SLS regression should have a value higher than 9.08 with three instruments and 10.83 with five instruments.

To sum up, the choice of instruments is critical to the consistent estimation of the instrumental variable methods. The weakness of instruments in explaining the endogenous regressor can be measured by the F-statistic from the first-stage regression and compared to the critical values in Stock and Yogo (2005). In addition, traditional IV models such as 2SLS and 3SLS overcome the endogeneity problem by instrumenting for the variables that are endogenous.

22.4 Applications in Investment, Financing, and Dividend Policy

22.4.1 Model and Data

Investment, dividend, and debt financing are the major decisions of a firm, and past studies argue for various relations among them. To control for the possible endogeneity among these three decisions, we apply the 2SLS, 3SLS, and GMM methods to estimate a simultaneous-equations model that considers the interaction of the three policies.

There are three equations in our simultaneous-equations system; each equation contains the remaining two endogenous variables as explanatory variables along with other exogenous variables. The three endogenous variables are the investment ($Inv_{it}$), dividend ($Div_{it}$), and debt financing ($Leverage_{it}$) of firm i in year t. Investment is measured by net property, plant, and equipment. Following Fama (1974), both investment and dividend are measured on a per-share basis. We follow Fama and French (2002) in using book leverage as the proxy for debt financing. Book leverage is measured as the ratio of total liabilities to total assets.

We also use the following exogenous variables in the model. In addition to the lag terms of the three policies, we follow Fama (1974) and respectively incorporate sales plus the change in inventories ($Q_{it}$) and net income minus preferred dividends ($P_{it}$) into the investment and dividend decisions. Moreover, we follow Fama and French (2002) and add the natural logarithm of lagged total assets ($\ln A_{i,t-1}$) and the lag of earnings before interest and taxes divided by total assets ($E_{i,t-1}/A_{i,t-1}$) as determinants of leverage.

The structural equations are estimated as follows:

$Inv_{it} = \alpha_{1i} + \alpha_{2i}Div_{it} + \alpha_{3i}Leverage_{it} + \alpha_{4i}Inv_{i,t-1} + \alpha_{5i}Q_{it} + \epsilon_{it}, \quad (22.34)$

$Div_{it} = \beta_{1i} + \beta_{2i}Inv_{it} + \beta_{3i}Leverage_{it} + \beta_{4i}Div_{i,t-1} + \beta_{5i}P_{it} + \eta_{it}, \quad (22.35)$

$Leverage_{it} = \gamma_{1i} + \gamma_{2i}Inv_{it} + \gamma_{3i}Div_{it} + \gamma_{4i}Leverage_{i,t-1} + \gamma_{5i}\ln A_{i,t-1} + \gamma_{6i}E_{i,t-1}/A_{i,t-1} + \xi_{it}. \quad (22.36)$

Our sample consists of Johnson & Johnson and IBM annual data from 1966 to 2019. Table 22.1 presents summary statistics on investment, dividend, and debt financing for the two companies.

22.4.2 Results of Weak Instruments

We perform the first-stage F-test to examine whether instruments are weak. Table 22.2 shows the results of testing the
relevance of instruments. We regress each endogenous variable on all exogenous variables in the system to obtain the prediction of the endogenous variable, together with the R² and F-statistics, for each firm. In Johnson & Johnson's case, the values of R² for the investment, dividend, and book leverage equations are 0.9798, 0.9847, and 0.8966, respectively, which shows the strength of the instruments. Likewise, in the IBM case, the values of R² for the investment, dividend, and financing decision equations are 0.9448, 0.8688, and 0.9807, respectively. Moreover, the F-statistics exceed 10 for all three endogenous variables in both the Johnson & Johnson and IBM cases. All results support that the instruments are sufficiently strong.

22.4.3 Empirical Results

A. Johnson & Johnson case

Tables 22.3, 22.4, and 22.5 respectively show the 2SLS, 3SLS, and GMM estimation results for the simultaneous-equation model in the Johnson & Johnson case. Overall, the relations among the three financial decisions implied by the 2SLS, 3SLS, and GMM methods are similar. The results for Johnson & Johnson are summarized as follows.

First, looking at the investment equation (e.g., Table 22.3), the dividend ($Div_{it}$) has a negative impact on the level of investment expenditure ($Inv_{it}$). This negative relation between investment and dividend is consistent with McCabe (1979) and Peterson and Benesh (1983), who argue that the dividend is a competing use of funds: the firm must choose whether to expend funds on investment or dividends. Moreover, the financing decision ($Leverage_{it}$) has a positive impact on investment ($Inv_{it}$). Our finding that increases in debt financing enhance the funds available for investment outlays is consistent with McDonald et al. (1975), McCabe (1979), Peterson and Benesh (1983), John and Nachman (1985), and Froot et al. (1993).

Second, as for the dividend decision (e.g., Table 22.3), the impact of debt financing on the dividend is significantly positive, showing that an increase in external financing exhibits a positive influence on the dividend. The positive relationship between leverage and dividend is consistent with McCabe (1979), Peterson and Benesh (1983), and Switzer (1984). Moreover, an increase in the level of investment expenditure has a negative influence on dividends, since investment and dividends are competing uses of funds.

Third, turning to the financing decision (e.g., Table 22.3), only lagged leverage has a significantly positive effect on the level of leverage. Investment and dividend decisions do not have a significant impact on the level of leverage. This finding suggests that Johnson & Johnson may have a desired optimal level of leverage.

In addition, the results for the control variables in the Johnson & Johnson case are as follows. First, the impact of output, $Q_{it}$, on investment is significantly positive, which is consistent with Fama (1974). Second, the coefficient of $P_{it}$ in the dividend model is significantly positive, implying that firms with high net income tend to pay higher dividends. Third, in the debt financing equation, only the coefficient of $\ln A_{i,t-1}$ is significantly positive, indicating that large firms use more leverage than smaller firms. This is because large firms tend to have a greater reputation and less information asymmetry than small firms and can thus finance at a lower cost. The positive relation between size and leverage is consistent with Fama and French (2002), Flannery and Rangan (2006), and Frank and Goyal (2009).

B. IBM case

Tables 22.6, 22.7, and 22.8 respectively show the 2SLS, 3SLS, and GMM estimation results for the simultaneous-equation model in the IBM case. Overall, the relations among the three financial decisions implied by the 2SLS, 3SLS, and GMM methods are similar. The results of the three financial decisions for the IBM case are summarized as follows.
22.4 Applications in Investment, Financing, and Dividend Policy 499
                       Inv_it        Div_it        Leverage_it
…                      (1.2650)      (0.4489)
Inv_it                               −0.0276*      0.0006
                                     (0.0148)      (0.0030)
Inv_{i,t−1}            0.0581*
                       (0.0323)
Q_it                   0.2496***
                       (0.0097)
Leverage_{i,t−1}                                   0.7835***
                                                   (0.0989)
ln A_{i,t−1}                                       0.0097
                                                   (0.0105)
E_{i,t−1}/A_{i,t−1}                                −0.0653
                                                   (0.3502)
Div_{i,t−1}                          0.6196***
                                     (0.0766)
P_it                                 0.2055***
                                     (0.0356)
Constant               −2.3771***    −0.5971***    0.0029
                       (0.5196)      (0.1893)      (0.0778)
Observations           54            54            54
Adjusted R²            0.9701        0.9002        0.8549
This table presents the 2SLS regression results of a simultaneous equation system model for investment, dividend, and debt financing:

Inv_it = α_{1i} + α_{2i} Div_it + α_{3i} Leverage_it + α_{4i} Inv_{i,t−1} + α_{5i} Q_it + ε_it,
Div_it = β_{1i} + β_{2i} Inv_it + β_{3i} Leverage_it + β_{4i} Div_{i,t−1} + β_{5i} P_it + η_it,
Leverage_it = γ_{1i} + γ_{2i} Inv_it + γ_{3i} Div_it + γ_{4i} Leverage_{i,t−1} + γ_{5i} ln A_{i,t−1} + γ_{6i} (E_{i,t−1}/A_{i,t−1}) + ξ_it,

where Inv_it, Div_it, and Leverage_it are net plant and equipment, dividends, and the book leverage ratio, respectively. The independent variables in the investment regression are lagged investment (Inv_{i,t−1}) and sales plus the change in inventories (Q_it). The independent variables in the dividend regression are lagged dividends (Div_{i,t−1}) and net income minus preferred dividends (P_it). All variables in both the investment and dividend equations are measured on a per-share basis. The independent variables in the debt financing regression are lagged book leverage (Leverage_{i,t−1}), the natural logarithm of lagged total assets (ln A_{i,t−1}), and the lag of earnings before interest and taxes divided by total assets (E_{i,t−1}/A_{i,t−1}). Numbers in parentheses are standard errors of coefficients. * p < 0.10, ** p < 0.05, *** p < 0.01
financial decisions for the IBM company are summarized as follows. First, as for the investment decision, only the financing decision has a significantly negative impact on the level of investment expenditure. Second, as for the dividend decision, the investment and financing decisions do not have a significant impact on the dividend payout. Third, as for the financing decision, only the investment decision has a significantly positive impact on the level of leverage. Finally, the results for the control variables for the IBM company are similar to the findings for the Johnson & Johnson company. Overall, our finding supports that the investment and financing decisions are made simultaneously for the IBM company. That is, the interaction between investment and financing decisions should be considered in a simultaneous-equations framework.
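The consequence of ignoring such simultaneity can be illustrated with a small, self-contained simulation (a hypothetical single-equation Python sketch, not part of the chapter's empirical work): when a regressor is endogenous, OLS is biased, while a two-stage procedure based on a strong instrument recovers the structural coefficient, and the first-stage F-statistic can be compared with the rule of thumb of 10 mentioned above.

```python
import random

def slope_intercept(x, y):
    # OLS of y on x with intercept, via slope = cov(x, y) / var(x)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b = sxy / sxx
    return b, my - b * mx

random.seed(42)
n = 4000
z = [random.gauss(0, 1) for _ in range(n)]           # instrument (exogenous)
u = [random.gauss(0, 1) for _ in range(n)]           # common shock -> endogeneity
x = [zi + ui + random.gauss(0, 0.5) for zi, ui in zip(z, u)]
y = [2.0 * xi + ui for xi, ui in zip(x, u)]          # true structural slope = 2

b_ols, _ = slope_intercept(x, y)                     # biased upward by cov(x, u) > 0

# First stage: regress x on z, form fitted values, and check instrument strength
g, c = slope_intercept(z, x)
xhat = [g * zi + c for zi in z]
mx = sum(x) / n
ss_res = sum((xi - xh) ** 2 for xi, xh in zip(x, xhat))
ss_tot = sum((xi - mx) ** 2 for xi in x)
r2_first = 1 - ss_res / ss_tot
f_first = r2_first / (1 - r2_first) * (n - 2)        # one instrument, one regressor

# Second stage: regress y on the fitted values (2SLS for a single regressor)
b_2sls, _ = slope_intercept(xhat, y)
```

Here `b_ols` lands well above the true slope of 2, while `b_2sls` is close to it and `f_first` is far above 10, mirroring the instrument-strength check applied to the Johnson & Johnson and IBM systems.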
500 22 Application of Simultaneous Equation in Finance Research …
                       Inv_it        Div_it        Leverage_it
…                      (0.7077)      (0.2466)
Inv_it                               −0.0293***    0.0010
                                     (0.0079)      (0.0016)
Inv_{i,t−1}            0.0953***
                       (0.0168)
Q_it                   0.2436***
                       (0.0053)
Leverage_{i,t−1}                                   0.8220***
                                                   (0.0518)
ln A_{i,t−1}                                       0.0097*
                                                   (0.0054)
E_{i,t−1}/A_{i,t−1}                                −0.2657
                                                   (0.1790)
Div_{i,t−1}                          0.6193***
                                     (0.0408)
P_it                                 0.2080***
                                     (0.0183)
Constant               −2.5608***    −0.5792***    0.0360
                       (0.2906)      (0.1040)      (0.0406)
Observations           54            54            54
Adjusted R²            0.9681        0.8980        0.8486
This table presents the 3SLS regression results of a simultaneous equation system model for investment, dividend, and debt financing:

Inv_it = α_{1i} + α_{2i} Div_it + α_{3i} Leverage_it + α_{4i} Inv_{i,t−1} + α_{5i} Q_it + ε_it,
Div_it = β_{1i} + β_{2i} Inv_it + β_{3i} Leverage_it + β_{4i} Div_{i,t−1} + β_{5i} P_it + η_it,
Leverage_it = γ_{1i} + γ_{2i} Inv_it + γ_{3i} Div_it + γ_{4i} Leverage_{i,t−1} + γ_{5i} ln A_{i,t−1} + γ_{6i} (E_{i,t−1}/A_{i,t−1}) + ξ_it.

The three endogenous variables are Inv_it, Div_it, and Leverage_it, which are net plant and equipment, dividends, and the book leverage ratio, respectively. The other variables are the same as in Table 22.4. Numbers in parentheses are standard errors of coefficients. The sign in brackets is the expected sign of each variable in the regressions. * p < 0.10, ** p < 0.05, *** p < 0.01
                       Inv_it        Div_it        Leverage_it
…                      (0.0084)
Q_it                   0.2456***
                       (0.0041)
Leverage_{i,t−1}                                   0.7838***
                                                   (0.0445)
ln A_{i,t−1}                                       0.0098*
                                                   (0.0043)
E_{i,t−1}/A_{i,t−1}                                −0.1738
                                                   (0.1219)
Div_{i,t−1}                          0.5280***
                                     (0.0478)
P_it                                 0.2585***
                                     (0.0210)
Constant               −2.5196***    −0.4035***    0.0206
                       (0.1751)      (0.0708)      (0.0295)
Observations           54            54            54
Adjusted R²            0.9693        0.8871        0.8491
This table presents the GMM regression results of a simultaneous equation system model for investment, dividend, and debt financing:

Inv_it = α_{1i} + α_{2i} Div_it + α_{3i} Leverage_it + α_{4i} Inv_{i,t−1} + α_{5i} Q_it + ε_it,
Div_it = β_{1i} + β_{2i} Inv_it + β_{3i} Leverage_it + β_{4i} Div_{i,t−1} + β_{5i} P_it + η_it,
Leverage_it = γ_{1i} + γ_{2i} Inv_it + γ_{3i} Div_it + γ_{4i} Leverage_{i,t−1} + γ_{5i} ln A_{i,t−1} + γ_{6i} (E_{i,t−1}/A_{i,t−1}) + ξ_it.

The three endogenous variables are Inv_it, Div_it, and Leverage_it, which are net plant and equipment, dividends, and the book leverage ratio, respectively. The other variables are the same as in Table 22.4. Numbers in parentheses are standard errors of coefficients. * p < 0.10, ** p < 0.05, *** p < 0.01
                       Inv_it        Div_it        Leverage_it
…                      (0.0364)
Q_it                   0.2809***
                       (0.0200)
Leverage_{i,t−1}                                   0.9285***
                                                   (0.0349)
ln A_{i,t−1}                                       0.0304***
                                                   (0.0061)
E_{i,t−1}/A_{i,t−1}                                −0.0012
                                                   (0.0872)
Div_{i,t−1}                          0.5713***
                                     (0.0454)
P_it                                 0.1590***
                                     (0.0163)
Constant               21.5511***    −1.6140**     −0.3031***
                       (2.3508)      (0.6898)      (0.0819)
Observations           54            54            54
Adjusted R²            0.9190        0.7669        0.9753
This table presents the 3SLS regression results of a simultaneous equation system model for investment, dividend, and debt financing:

Inv_it = α_{1i} + α_{2i} Div_it + α_{3i} Leverage_it + α_{4i} Inv_{i,t−1} + α_{5i} Q_it + ε_it,
Div_it = β_{1i} + β_{2i} Inv_it + β_{3i} Leverage_it + β_{4i} Div_{i,t−1} + β_{5i} P_it + η_it,
Leverage_it = γ_{1i} + γ_{2i} Inv_it + γ_{3i} Div_it + γ_{4i} Leverage_{i,t−1} + γ_{5i} ln A_{i,t−1} + γ_{6i} (E_{i,t−1}/A_{i,t−1}) + ξ_it.

The three endogenous variables are Inv_it, Div_it, and Leverage_it, which are net plant and equipment, dividends, and the book leverage ratio, respectively. The other variables are the same as in Table 22.4. Numbers in parentheses are standard errors of coefficients. The sign in brackets is the expected sign of each variable in the regressions. * p < 0.10, ** p < 0.05, *** p < 0.01
                       Inv_it        Div_it        Leverage_it
…                      (2.1937)      (0.3846)
Inv_it                               −0.0120       0.0016***
                                     (0.0088)      (0.0003)
Inv_{i,t−1}            0.4382***
                       (0.0300)
Q_it                   0.2016***
                       (0.0160)
Leverage_{i,t−1}                                   0.9434***
                                                   (0.0299)
ln A_{i,t−1}                                       0.0437***
                                                   (0.0043)
E_{i,t−1}/A_{i,t−1}                                0.1056
                                                   (0.0710)
Div_{i,t−1}                          0.8039***
                                     (0.0520)
P_it                                 0.0921***
                                     (0.0156)
Constant               19.9544***    0.3580        −0.4845***
                       (1.4990)      (0.4224)      (0.0534)
Observations           54            54            54
Adjusted R²            0.9063        0.7013        0.9190
This table presents the GMM regression results of a simultaneous equation system model for investment, dividend, and debt financing:

Inv_it = α_{1i} + α_{2i} Div_it + α_{3i} Leverage_it + α_{4i} Inv_{i,t−1} + α_{5i} Q_it + ε_it,
Div_it = β_{1i} + β_{2i} Inv_it + β_{3i} Leverage_it + β_{4i} Div_{i,t−1} + β_{5i} P_it + η_it,
Leverage_it = γ_{1i} + γ_{2i} Inv_it + γ_{3i} Div_it + γ_{4i} Leverage_{i,t−1} + γ_{5i} ln A_{i,t−1} + γ_{6i} (E_{i,t−1}/A_{i,t−1}) + ξ_it.

The three endogenous variables are Inv_it, Div_it, and Leverage_it, which are net plant and equipment, dividends, and the book leverage ratio, respectively. The other variables are the same as in Table 22.4. Numbers in parentheses are standard errors of coefficients. * p < 0.10, ** p < 0.05, *** p < 0.01
22.5 Conclusion

In this chapter, we investigate the endogeneity problems related to the simultaneous equations system and introduce how the 2SLS, 3SLS, and GMM estimation methods deal with endogeneity problems. In addition to reviewing applications of simultaneous equations in capital structure decisions, we also use Johnson & Johnson and IBM companies' annual data from 1966 to 2019 to examine the interrelationship among corporate investment, leverage, and dividend payout policies in a simultaneous-equation system by employing the 2SLS, 3SLS, and GMM methods. Our findings of the relations among these three financial decisions from the 2SLS, 3SLS, and GMM methods are similar. Overall, our study suggests that these three corporate decisions are jointly determined and the interaction among them should be taken into account in a simultaneous equations framework.

Appendix 22.1: Data for Johnson & Johnson and IBM

1.1 Johnson & Johnson data
fyear pstar div inv q debtratio et lna debtratio_peer invlag_1 divlag_1 debtratiolag_1 debtratiolag_peer etlag lnalag
1966 8.6129 1.6441 18.7782 83.7590 0.2160 0.1696 5.7948 0.3105 16.3320 1.4456 0.2152 0.3146 0.1538 5.7206
1967 3.2342 0.6139 7.0284 29.4397 0.2084 0.1613 5.9186 0.3180 18.7782 1.6441 0.2160 0.3105 0.1696 5.7948
1968 3.7749 0.6481 7.5312 33.0988 0.2073 0.1841 6.0540 0.3494 7.0284 0.6139 0.2084 0.3180 0.1613 5.9186
1969 4.3568 0.8413 8.4597 36.8895 0.1868 0.1904 6.1772 0.3560 7.5312 0.6481 0.2073 0.3494 0.1841 6.0540
1970 2.0867 0.3387 4.2893 18.9954 0.2422 0.2109 6.5604 0.3635 8.4597 0.8413 0.1868 0.3560 0.1904 6.1772
1971 2.4711 0.4285 4.8064 20.7572 0.2477 0.2102 6.7215 0.3929 4.2893 0.3387 0.2422 0.3635 0.2109 6.5604
1972 2.8925 0.4455 5.3408 23.6784 0.2506 0.2131 6.8890 0.3952 4.8064 0.4285 0.2477 0.3929 0.2102 6.7215
1973 3.4686 0.5194 6.3477 29.0629 0.2663 0.2168 7.0809 0.4191 5.3408 0.4455 0.2506 0.3952 0.2131 6.8890
1974 3.8532 0.7233 8.0820 36.3433 0.2844 0.1850 7.2483 0.4631 6.3477 0.5194 0.2663 0.4191 0.2168 7.0809
1975 4.3495 0.8475 9.1023 37.7274 0.2551 0.1934 7.3470 0.4280 8.0820 0.7233 0.2844 0.4631 0.1850 7.2483
1976 4.8568 1.0473 9.7586 44.3202 0.2442 0.2033 7.4563 0.4482 9.1023 0.8475 0.2551 0.4280 0.1934 7.3470
1977 5.7056 1.3976 11.1505 50.6008 0.2644 0.2040 7.6108 0.4793 9.7586 1.0473 0.2442 0.4482 0.2033 7.4563
1978 6.7198 1.6827 13.1696 60.0825 0.2825 0.2076 7.7759 0.4906 11.1505 1.3976 0.2644 0.4793 0.2040 7.6108
1979 7.7318 1.9964 15.4836 71.1573 0.3059 0.1960 7.9634 0.4719 13.1696 1.6827 0.2825 0.4906 0.2076 7.7759
1980 8.7276 2.2151 18.7998 79.9422 0.3183 0.1843 8.1145 0.5060 15.4836 1.9964 0.3059 0.4719 0.1960 7.9634
1981 3.3144 0.8478 7.1399 29.1363 0.3353 0.1877 8.2481 0.4093 18.7998 2.2151 0.3183 0.5060 0.1843 8.1145
1982 3.6991 0.9644 8.3431 30.7638 0.3327 0.1658 8.3451 1.6483 7.1399 0.8478 0.3353 0.4093 0.1877 8.2481
1983 3.6524 1.0694 8.7191 31.3993 0.3205 0.1666 8.4032 0.3971 8.3431 0.9644 0.3327 1.6483 0.1658 8.3451
1984 4.0515 1.2027 9.4102 33.2396 0.3544 0.1642 8.4210 0.4385 8.7191 1.0694 0.3205 0.3971 0.1666 8.4032
1985 4.7262 1.2753 10.0622 35.1705 0.3423 0.1649 8.5360 0.5177 9.4102 1.2027 0.3544 0.4385 0.1642 8.4210
1986 3.4984 1.4157 11.0865 40.8348 0.5194 0.1674 8.6787 0.4621 10.0622 1.2753 0.3423 0.5177 0.1649 8.5360
1987 6.6591 1.6154 13.0741 47.4532 0.4676 0.1847 8.7866 0.4540 11.0865 1.4157 0.5194 0.4621 0.1674 8.6787
1988 7.8722 1.9636 14.9698 54.6912 0.5079 0.1972 8.8705 0.5207 13.0741 1.6154 0.4676 0.4540 0.1847 8.7866
1989 4.3236 1.1199 8.5452 29.5358 0.4762 0.2097 8.9770 0.5143 14.9698 1.9636 0.5079 0.5207 0.1972 8.8705
1990 4.6536 1.3090 9.7485 34.2925 0.4845 0.2096 9.1597 0.5147 8.5452 1.1199 0.4762 0.5143 0.2097 8.9770
1991 5.6999 1.5398 11.0065 37.8370 0.4649 0.2058 9.2604 0.5172 9.7485 1.3090 0.4845 0.5147 0.2096 9.1597
1992 3.2408 0.8956 6.2786 21.0453 0.5649 0.1916 9.3829 0.3546 11.0065 1.5398 0.4649 0.5172 0.2058 9.2604
1993 3.6393 1.0249 6.8525 21.9493 0.5452 0.1956 9.4126 0.3472 6.2786 0.8956 0.5649 0.3546 0.1916 9.3829
1994 4.2457 1.1306 7.6360 25.1598 0.5454 0.1792 9.6594 0.8785 6.8525 1.0249 0.5452 0.3472 0.1956 9.4126
1995 5.0334 1.2769 8.0225 29.2691 0.4939 0.1964 9.7910 1.5659 7.6360 1.1306 0.5454 0.8785 0.1792 9.6594
1996 2.9239 0.7310 4.2410 16.3919 0.4585 0.2150 9.9040 0.5254 8.0225 1.2769 0.4939 1.5659 0.1964 9.7910
(continued)
fyear pstar div inv q debtratio et lna debtratio_peer invlag_1 divlag_1 debtratiolag_1 debtratiolag_peer etlag lnalag
1997 3.2487 0.8453 4.3193 16.8362 0.4239 0.2154 9.9736 0.5929 4.2410 0.7310 0.4585 0.5254 0.2150 9.9040
1998 3.2030 0.9709 4.6427 17.8520 0.4815 0.1925 10.1739 0.5440 4.3193 0.8453 0.4239 0.5929 0.2154 9.9736
1999 4.0376 1.0643 4.8349 19.9420 0.4441 0.2032 10.2807 4.0815 4.6427 0.9709 0.4815 0.5440 0.1925 10.1739
2000 4.5401 1.2395 5.0117 20.7673 0.3995 0.2068 10.3520 4.3154 4.8349 1.0643 0.4441 4.0815 0.2032 10.2807
2001 2.3868 0.6718 2.5331 10.8801 0.3704 0.2049 10.5581 3.5906 5.0117 1.2395 0.3995 4.3154 0.2068 10.3520
2002 2.7824 0.8021 2.9343 12.3333 0.4404 0.2386 10.6104 3.7052 2.5331 0.6718 0.3704 3.5906 0.2049 10.5581
2003 3.0546 0.9252 3.3174 14.2006 0.4433 0.2252 10.7844 2.0155 2.9343 0.8021 0.4404 3.7052 0.2386 10.6104
2004 3.5789 1.0942 3.5126 15.9891 0.4033 0.2413 10.8840 2.2194 3.3174 0.9252 0.4433 2.0155 0.2252 10.7844
2005 4.2038 1.2752 3.6410 17.0279 0.3473 0.2291 10.9686 2.4548 3.5126 1.0942 0.4033 2.2194 0.2413 10.8840
2006 4.5727 1.4748 4.5085 18.7071 0.4427 0.1925 11.1642 0.5554 3.6410 1.2752 0.3473 2.4548 0.2291 10.9686
2007 4.7014 1.6442 4.9943 21.5673 0.4649 0.1872 11.3016 0.5565 4.5085 1.4748 0.4427 0.5554 0.1925 11.1642
2008 5.6988 1.8143 5.1875 22.9992 0.4994 0.1904 11.3494 1.2258 4.9943 1.6442 0.4649 0.5565 0.1872 11.3016
2009 5.4605 1.9341 5.3585 22.5192 0.4657 0.1772 11.4583 1.5623 5.1875 1.8143 0.4994 1.2258 0.1904 11.3494
2010 5.9432 2.1197 5.3150 22.5649 0.4502 0.1606 11.5416 1.2891 5.3585 1.9341 0.4657 1.5623 0.1772 11.4583
2011 4.7094 2.2596 5.4101 24.2027 0.4977 0.1430 11.6408 1.3479 5.3150 2.1197 0.4502 1.2891 0.1606 11.5416
2012 5.2255 2.3804 5.7934 24.6299 0.4658 0.1413 11.7064 2.5761 5.4101 2.2596 0.4977 1.3479 0.1430 11.6408
2013 6.3585 2.5831 5.9242 25.4181 0.4419 0.1429 11.7957 1.9465 5.7934 2.3804 0.4658 2.5761 0.1413 11.7064
2014 7.2642 2.7910 5.7940 26.8168 0.4680 0.1629 11.7839 0.8565 5.9242 2.5831 0.4419 1.9465 0.1429 11.7957
2015 6.9524 2.9664 5.7728 25.3862 0.4667 0.1377 11.8012 1.1927 5.7940 2.7910 0.4680 0.8565 0.1629 11.7839
2016 7.4982 3.1853 5.8792 26.5955 0.5013 0.1502 11.8580 0.6180 5.7728 2.9664 0.4667 1.1927 0.1377 11.8012
2017 2.5879 3.3338 6.3392 28.7308 0.6176 0.1262 11.9659 0.9751 5.8792 3.1853 0.5013 0.6180 0.1502 11.8580
2018 8.3483 3.5661 6.3985 30.5804 0.6093 0.1398 11.9379 0.5902 6.3392 3.3338 0.6176 0.9751 0.1262 11.9659
2019 8.4057 3.7671 7.0712 31.3314 0.6230 0.1339 11.9686 0.5648 6.3985 3.5661 0.6093 0.5902 0.1398 11.9379
1.2 IBM data

fyear pstar div inv q debtratio et lna debtratio_peer invlag_1 divlag_1 debtratiolag_1 debtratiolag_peer etlag lnalag
1966 8.5221 4.5439 16.2576 71.1472 0.3244 0.2413 9.4662 0.4644 14.5714 5.2414 0.3455 0.4282 0.3119 9.4404
1967 8.1448 3.7954 16.8217 70.4698 0.3022 0.2185 9.4935 0.4608 16.2576 4.5439 0.3244 0.4644 0.2413 9.4662
1968 8.5660 4.2949 17.1392 80.3665 0.3036 0.2423 9.5475 0.4619 16.8217 3.7954 0.3022 0.4608 0.2185 9.4935
1969 11.7416 4.2953 19.7534 86.1986 0.3099 0.2227 9.6037 0.4726 17.1392 4.2949 0.3036 0.4619 0.2423 9.5475
1970 7.3457 3.3945 22.3586 66.7942 0.3048 0.0459 9.5592 0.4816 19.7534 4.2953 0.3099 0.4726 0.2227 9.6037
1971 12.9961 3.3975 21.6745 98.3157 0.4077 0.2004 9.8115 0.4876 22.3586 3.3945 0.3048 0.4816 0.0459 9.5592
1972 13.7900 4.4525 21.6790 107.1750 0.3607 0.2215 9.8132 0.4931 21.6745 3.3975 0.4077 0.4876 0.2004 9.8115
1973 15.3130 5.2543 21.8781 128.7056 0.3809 0.2099 9.9182 0.5305 21.6790 4.4525 0.3607 0.4931 0.2215 9.8132
1974 9.2478 3.3985 24.5583 114.4483 0.3878 0.0760 9.9266 0.5541 21.8781 5.2543 0.3809 0.5305 0.2099 9.9182
1975 11.6256 2.4013 24.7175 122.1282 0.3961 0.1100 9.9834 0.5242 24.5583 3.3985 0.3878 0.5541 0.0760 9.9266
1976 17.9336 5.5570 24.3210 167.0687 0.4115 0.2051 10.1041 0.5306 24.7175 2.4013 0.3961 0.5242 0.1100 9.9834
1977 20.0019 6.8103 28.7248 195.4323 0.4086 0.2244 10.1909 0.5301 24.3210 5.5570 0.4115 0.5306 0.2051 10.1041
1978 22.9187 6.0032 33.6705 223.0148 0.4258 0.2119 10.3287 0.5379 28.7248 6.8103 0.4086 0.5301 0.2244 10.1909
1979 21.0012 5.2539 40.2199 230.8880 0.4047 0.1448 10.3802 0.5441 33.6705 6.0032 0.4258 0.5379 0.2119 10.3287
1980 11.4936 2.9093 50.6284 192.1633 0.4848 -0.0343 10.4511 0.5594 40.2199 5.2539 0.4047 0.5441 0.1448 10.3802
1981 15.5674 2.3634 66.0044 206.4705 0.5455 0.0101 10.5711 0.5103 50.6284 2.9093 0.4848 0.5594 -0.0343 10.4511
1982 17.6421 2.3649 69.0837 189.1997 0.5583 0.0232 10.6310 0.4923 66.0044 2.3634 0.5455 0.5103 0.0101 10.5711
1983 28.0638 2.7925 60.8639 238.2435 0.5455 0.1205 10.7297 0.5152 69.0837 2.3649 0.5583 0.4923 0.0232 10.6310
(continued)
Appendix 22.2: Applications of R Language in Estimating … 507
fyear pstar div inv q debtratio et lna debtratio_peer invlag_1 divlag_1 debtratiolag_1 debtratiolag_peer etlag lnalag
1984 30.0178 4.7903 61.5003 268.2601 0.5356 0.0901 10.8618 0.5033 60.8639 2.7925 0.5455 0.5152 0.1205 10.7297
1985 32.2438 5.0766 77.9633 307.6458 0.5375 0.0660 11.0640 0.5969 61.5003 4.7903 0.5356 0.5033 0.0901 10.8618
1986 30.0407 5.2097 95.7772 320.9095 0.5774 0.0374 11.1926 0.6127 77.9633 5.0766 0.5375 0.5969 0.0660 11.0640
1987 31.0784 5.3279 103.1967 330.0916 0.6199 0.0294 11.3785 0.5799 95.7772 5.2097 0.5774 0.6127 0.0374 11.1926
1988 38.1362 5.3237 120.5252 404.2870 0.7826 0.0744 12.0080 0.6298 103.1967 5.3279 0.6199 0.5799 0.0294 11.3785
1989 18.7529 3.1863 64.5974 206.4430 0.7886 0.0766 12.0628 0.6199 120.5252 5.3237 0.7826 0.6298 0.0744 12.0080
1990 8.8140 3.1713 69.3982 206.2301 0.8216 0.0472 12.1020 0.6483 64.5974 3.1863 0.7886 0.6199 0.0766 12.0628
1991 4.5951 1.7718 72.4874 197.6762 0.8447 0.0236 12.1245 1.2313 69.3982 3.1713 0.8216 0.6483 0.0472 12.1020
1992 8.5340 1.5206 66.1788 183.9771 0.9634 0.0221 12.1601 2.2858 72.4874 1.7718 0.8447 1.2313 0.0236 12.1245
1993 16.0407 1.0097 65.7128 187.3181 0.9679 0.0405 12.1453 0.5685 66.1788 1.5206 0.9634 2.2858 0.0221 12.1601
1994 20.3661 1.0489 72.7017 203.5847 0.9332 0.0581 12.1990 0.5353 65.7128 1.0097 0.9679 0.5685 0.0405 12.1453
1995 24.5531 1.4840 86.9076 221.7449 0.8925 0.0612 12.2882 0.5368 72.7017 1.0489 0.9332 0.5353 0.0581 12.1990
1996 21.8776 2.0222 89.3659 208.8503 0.8946 0.0382 12.3111 0.5786 86.9076 1.4840 0.8925 0.5368 0.0612 12.2882
1997 27.2202 2.3361 97.8707 243.8843 0.9203 0.0519 12.3410 0.7819 89.3659 2.0222 0.8946 0.5786 0.0382 12.3111
1998 22.8623 2.1191 109.1803 242.1635 0.9392 0.0369 12.4583 1.1206 97.8707 2.3361 0.9203 0.7819 0.0519 12.3410
1999 28.7838 2.2069 122.8843 288.6657 0.9227 0.0562 12.5235 0.6431 109.1803 2.1191 0.9392 1.1206 0.0369 12.4583
2000 32.4783 2.3605 142.0021 330.0820 0.8981 0.0464 12.6218 0.6408 122.8843 2.2069 0.9227 0.6431 0.0562 12.5235
2001 24.1037 2.1483 131.9002 319.9569 0.9369 0.0253 12.6884 0.7494 142.0021 2.3605 0.8981 0.6408 0.0464 12.6218
2002 26.1559 2.0840 129.8675 336.3791 0.9794 0.0195 12.8234 0.7190 131.9002 2.1483 0.9369 0.7494 0.0253 12.6884
2003 29.9645 1.9947 129.1713 334.5991 0.9430 0.0233 13.0137 0.6427 129.8675 2.0840 0.9794 0.7190 0.0195 12.8234
2004 30.0036 1.9978 132.8610 340.4939 0.9422 0.0252 13.0814 0.6392 129.1713 1.9947 0.9430 0.6427 0.0233 13.0137
2005 9.3914 2.0052 138.6357 343.4957 0.9672 -0.0075 13.0733 0.6263 132.8610 1.9978 0.9422 0.6392 0.0252 13.0814
2006 15.8608 0.9953 105.8250 327.1360 1.0228 0.0726 12.1345 0.6057 138.6357 2.0052 0.9672 0.6263 -0.0075 13.0733
2007 -59.6828 1.0017 87.8513 321.7686 1.2383 -0.0023 11.9109 0.6066 105.8250 0.9953 1.0228 0.6057 0.0726 12.1345
2008 -34.2827 0.4636 68.5965 240.9273 1.9373 -0.1316 11.4191 0.7057 87.8513 1.0017 1.2383 0.6066 -0.0023 11.9109
2009 224.9040 0.0000 37.3800 203.3080 0.7876 -0.0895 11.8226 0.6845 68.5965 0.4636 1.9373 0.7057 -0.1316 11.4191
2010 7.5426 0.0000 12.8222 91.7316 0.7325 0.0429 11.8415 0.6697 37.3800 0.0000 0.7876 0.6845 -0.0895 11.8226
2011 9.0923 0.0000 15.1733 97.4451 0.7304 0.0492 11.8818 0.6128 12.8222 0.0000 0.7325 0.6697 0.0429 11.8415
2012 8.2093 0.0000 18.9150 111.7161 0.7524 0.0037 11.9145 0.6016 15.1733 0.0000 0.7304 0.6128 0.0492 11.8818
2013 6.4520 0.0000 19.5000 103.1680 0.7405 0.0403 12.0218 0.6003 18.9150 0.0000 0.7524 0.6016 0.0037 11.9145
2014 5.6513 1.2050 21.7519 97.2075 0.7973 0.0353 12.0877 0.6259 19.5000 0.0000 0.7405 0.6003 0.0403 12.0218
2015 11.2707 1.4493 34.2673 101.6520 0.7927 0.0504 12.1783 0.6661 21.7519 1.2050 0.7973 0.6259 0.0353 12.0877
2016 13.0247 1.5580 46.8973 110.9360 0.8012 0.0539 12.3090 0.6344 34.2673 1.4493 0.7927 0.6661 0.0504 12.1783
2017 8.7964 1.5821 56.5250 101.7593 0.8296 0.0619 12.2666 0.6303 46.8973 1.5580 0.8012 0.6344 0.0539 12.3090
2018 15.1614 1.5314 58.7979 104.4300 0.8118 0.0323 12.3342 0.6618 56.5250 1.5821 0.8296 0.6303 0.0619 12.2666
2019 13.9421 1.5464 58.5036 98.4422 0.7985 0.0260 12.3373 0.6853 58.7979 1.5314 0.8118 0.6618 0.0323 12.3342
First, we load the data into the R environment and apply the 2SLS method to estimate the parameters of this simultaneous-equations system. Here, we use all the exogenous variables (Inv_{i,t−1}, Q_it, Div_{i,t−1}, P_it, Debt_{i,t−1}, ln A_{i,t−1}, E_{i,t−1}/A_{i,t−1}) in this system as instruments to obtain the prediction of each endogenous variable. By using the ivreg package in the R language, we obtain the following program code.

Data <- read.csv(file="IBM.csv")
library(ivreg)

# Investment policy
# Dividend policy
summary(DIVeq)

# Specify the instruments
Insts <- list(~ invlag_1 + q + divlag_1 + pstar + debtratiolag_1 + lnalag + etlag)

After specifying the simultaneous-equation system and the instruments, we then introduce how to apply the 3SLS method to estimate the parameters of this simultaneous-equations system in the R language. By using the threeSLS function in the gmm package, we obtain the following program.

library(gmm)
# GMM method
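Although the printed listing is abridged, the fit statistics reported in Tables 22.3 through 22.8 reduce to a simple residual computation for each equation. A Python sketch of what an R_square-style helper computes (the function name and signature here are illustrative, not the book's code):

```python
def r_square(y, y_hat, k):
    """R-squared and adjusted R-squared for one fitted equation.

    y     : observed values of the dependent variable
    y_hat : fitted values from the estimated equation
    k     : number of estimated coefficients, including the intercept
    """
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    return r2, adj_r2

# Tiny check: an exact linear fit gives R-squared = 1
r2, adj = r_square([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], k=2)
```

The adjusted statistic penalizes the ordinary R-squared for the number of estimated coefficients, which is why the adjusted values in the tables are slightly below the raw first-stage R-squared values quoted in the text.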
For example, after fitting a 3SLS model and saving the results in threeSLS.fit, we can then call the R_square function to calculate the R-squared and adjusted R-squared statistics as follows.

R_square(threeSLS.fit,1) # 1st equation: Investment policy
R_square(threeSLS.fit,2) # 2nd equation: Dividend policy
R_square(threeSLS.fit,3) # 3rd equation: Financing policy

Similarly, applying the R_square function to the GMM estimation results, we obtain the following program.

R_square(GMM.fit.fit,1) # Investment equation
R_square(GMM.fit.fit,2) # Dividend equation
R_square(GMM.fit.fit,3) # Financing equation

References

Aggarwal R, Kyaw NA (2010) Capital structure, dividend policy, and multinationality:
Berger AN, Bonaccorsi di Patti E (2006) Capital structure and firm performance: A new approach to testing agency theory and an application to the banking industry. J Bank Financ 30:1065–1102
Bhagat S, Black BS (2002) The non-correlation between board independence and long-term firm performance. J Corp Law 27:231–273
Billett MT, Xue H (2007) The takeover deterrent effect of open market share repurchases. J Financ 62:1827–1850
Billett MT, King THD, Mauer DC (2007) Growth opportunities and the choice of leverage, debt maturity, and covenants. J Financ 62:697–730
Boone AL, Field LC, Karpoff JM, Raheja CG (2007) The determinants of corporate board size and composition: An empirical analysis. J Financ Econ 85:66–101
Bound J, Jaeger D, Baker R (1995) Problems with instrumental variable estimation when the correlation between the instruments and the endogenous explanatory variables is weak. J Am Stat Assoc 90:443–450
Chen CR, Lee CF (2010) Application of simultaneous equation in finance research. In: Lee CF et al (eds) Handbook of quantitative finance and risk management. Springer, Berlin, pp 1301–1306
Demsetz H, Villalonga B (2001) Ownership structure and corporate performance. J Corp Financ 7:209–233
Dhrymes PJ, Kurz M (1967) Investment, dividend, and external finance behavior of firms. In: Ferber R (ed) Determinants of investment behavior. NBER, pp 427–486
Fama EF (1974) The empirical relationships between the dividend and investment decisions of firms. Am Econ Rev 64:304–318
Fama EF, French KR (2002) Testing trade-off and pecking order predictions about dividends and debt. Rev Financ Stud 15:1–33
Ferreira MA, Matos P (2008) The colors of investors' money: The role of institutional investors around the world. J Financ Econ 88:499–533
Fich EM, Shivdasani A (2007) Financial fraud, director reputation, and shareholder wealth. J Financ Econ 86:306–336
Flannery MJ, Rangan KP (2006) Partial adjustment toward target capital structures. J Financ Econ 79:469–506
Frank MZ, Goyal VK (2009) Capital structure decisions: Which factors are reliably important? Financ Manag 38:1–37
Froot KA, Scharfstein DS, Stein JC (1993) Risk management: Coordinating corporate investment and financing policies. J Financ 48:1629–1658
Grabowski HG, Mueller DC (1972) Managerial and stockholder welfare models of firm expenditures. Rev Econ Stat 54:9–24
Greene WH (2011) Econometric analysis, 7th edn. Prentice Hall, New Jersey
Gugler K (2003) Corporate governance, dividend payout policy, and the interrelation between dividends, R&D, and capital investment. J Bank Financ 27:1297–1321
Hahn J, Hausman J (2005) Instrumental variable estimation with valid and invalid instruments. Working paper
Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054
Harford J, Klasa S, Maxwell WF (2014) Refinancing risk and cash holdings. J Financ 69:975–1012
Harvey CR, Lins KV, Roper AH (2004) The effect of capital structure when expected agency costs are extreme. J Financ Econ 74:3–30
Higgins RC (1972) The corporate dividend-saving decision. J Financ Quant Anal 7:1527–1541
Intriligator MD, Bodkin RG, Hsiao C (1996) Econometric models, techniques, and applications, 2nd edn. Prentice Hall, New Jersey
John K, Nachman DC (1985) Risky debt, investment incentives, and reputation in a sequential equilibrium. J Financ 40:863–878
Johnston J, DiNardo J (1997) Econometric methods. McGraw-Hill, New York
Lee CF, Chen HY, Lee J (2019) Financial econometrics, mathematics and statistics. Springer
Lee CF, Liang WL, Lin FL, Yang Y (2016) Applications of simultaneous equations in finance research: Methods and empirical results. Rev Quant Finan Acc 47:943–971
Lee CF, Lee J (2020) Handbook of financial econometrics, mathematics, statistics, and machine learning. World Scientific, Singapore
Lee CF, Lin FL (2020) Impacts of measurement errors on simultaneous equation estimation of dividend and investment decisions. In: Handbook of financial econometrics, mathematics, statistics, and machine learning (Vol. IV), Chapter 116. World Scientific, Singapore, pp 4001–4023
Loderer C, Martin K (1997) Executive stock ownership and performance tracking faint traces. J Financ Econ 45:223–255
MacKay P, Phillips GM (2005) How does industry affect firm financial structure? Rev Financ Stud 18:1433–1466
McCabe GM (1979) The empirical relationship between investment and financing: A new look. J Financ Quant Anal 14:119–135
McDonald JG, Jacquillat B, Nussenbaum M (1975) Dividend, investment and financing decisions: Empirical evidence on French firms. J Financ Quant Anal 10:741–755
Morgan IG, Saint-Pierre J (1978) Dividend and investment decisions of Canadian firms. Can J Econ 11:20–37
Peterson PP, Benesh GA (1983) A reexamination of the empirical relationship between investment and financing decisions. J Financ Quant Anal 18:439–453
Prevost AK, Rao RP, Hossain M (2002) Determinants of board composition in New Zealand: A simultaneous equations approach. J Empir Financ 9:373–397
Ruland W, Zhou P (2005) Debt, diversification, and valuation. Rev Quant Financ Acc 25:277–291
Sargan JD (1958) The estimation of economic relationships using instrumental variables. Econometrica 26:393–415
Sargan JD (1959) The estimation of relationships with autocorrelated residuals by the use of instrumental variables. J R Stat Soc B 21:91–105
Staiger D, Stock JH (1997) Instrumental variables regression with weak instruments. Econometrica 65:557–586
Stock JH, Wright JH, Yogo M (2002) A survey of weak instruments and weak identification in generalized method of moments. J Bus Econ Stat 20:518–529
Stock JH, Yogo M (2005) Testing for weak instruments in linear IV regression. In: Andrews DWK (ed) Identification and inference for econometric models. Cambridge Univ Press, New York, pp 80–108
Switzer L (1984) The determinants of industrial R&D: A funds flow simultaneous equation approach. Rev Econ Stat 66:163–168
Wang CJ (2015) Instrumental variables approach to correct for endogeneity in finance. In: Lee CF, Lee J (eds) Handbook of financial econometrics and statistics. Springer, New York, pp 2577–2600
Woidtke T (2002) Agents watching agents?: Evidence from pension fund ownership and firm value. J Financ Econ 63:99–131
Ye P (2012) The value of active investing: Can active institutional investors remove excess comovement of stock returns? J Financ Quant Anal 47:667–688
23 Three Alternative Programs to Estimate Binomial Option Pricing Model and Black and Scholes Option Pricing Model
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 511
J. Lee et al., Essentials of Excel VBA, Python, and R,
https://doi.org/10.1007/978-3-031-14283-3_23
C = \frac{1}{R^n} \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k} \max[0, (1+u)^k (1+d)^{n-k} S - X].  (23.3)

The price of a put option in a binomial option pricing model with n periods would then be defined as

P = \frac{1}{R^n} \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k} \max[0, X - (1+u)^k (1+d)^{n-k} S].  (23.4)
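Equations (23.3) and (23.4) translate directly into code. The following Python sketch (an illustration with hypothetical inputs; the chapter's own implementations use Excel, R, and SAS) evaluates both sums:

```python
from math import comb

def binomial_call_put(S, X, R, u, d, n, p):
    """Closed-form n-period binomial prices from Eqs. (23.3) and (23.4).

    R is the gross one-period risk-free return, u and d are the percentage
    up and down moves, and p is the risk-neutral up probability.
    """
    call = put = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p ** k * (1 - p) ** (n - k)   # n!/(k!(n-k)!) p^k (1-p)^(n-k)
        terminal = (1 + u) ** k * (1 + d) ** (n - k) * S  # terminal stock price
        call += prob * max(0.0, terminal - X)
        put += prob * max(0.0, X - terminal)
    return call / R ** n, put / R ** n

# One-period check: S = X = 100, u = 10%, d = -10%, R = 1 (zero rate), and the
# risk-neutral probability p = (R - (1+d)) / ((1+u) - (1+d)) = 0.5
c, pt = binomial_call_put(100, 100, 1.0, 0.10, -0.10, 1, 0.5)
```

With the risk-neutral p, the two prices also satisfy put-call parity, C − P = S − X/Rⁿ, which is a convenient sanity check on any implementation.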
Table 23.6 The inputs and Excel functions of European call and put options
q = q_prob(r, u, d)
option_tree = matrix(0, nrow=nrow(tree), ncol=ncol(tree))
if (type == 'put') {
  option_tree[nrow(option_tree), ] = pmax(X - tree[nrow(tree), ], 0)
} else {
  option_tree[nrow(option_tree), ] = pmax(tree[nrow(tree), ] - X, 0)
}
for (i in (nrow(tree)-1):1) {
  for (j in 1:i) {
    option_tree[i, j] = ((1-q)*option_tree[i+1, j] + q*option_tree[i+1, j+1])/exp(r)
  }
}
return(option_tree)
}

q <- q_prob(r, u, d)
tree <- build_stock_tree(S, u, d, n)
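The backward-induction loop in the R fragment above can also be written as a short, self-contained Python sketch (a reimplementation for illustration, not the book's code); for a European option it agrees with the closed-form sums in Eqs. (23.3) and (23.4):

```python
def binomial_backward(S, X, R, u, d, n, p, kind="call"):
    """European option price by backward induction on a recombining tree."""
    # Terminal payoffs after k up-moves and n-k down-moves
    values = []
    for k in range(n + 1):
        terminal = (1 + u) ** k * (1 + d) ** (n - k) * S
        payoff = max(0.0, terminal - X) if kind == "call" else max(0.0, X - terminal)
        values.append(payoff)
    # Step back one period at a time, discounting the risk-neutral expectation;
    # values[k] is the node reached by k up-moves, so its children are
    # values[k] (down) and values[k+1] (up)
    for _ in range(n):
        values = [((1 - p) * lo + p * hi) / R for lo, hi in zip(values, values[1:])]
    return values[0]

# Example with assumed inputs: 3 periods, S = X = 100, u = 10%, d = -10%,
# R = 1.05, risk-neutral p = (1.05 - 0.9) / (1.1 - 0.9) = 0.75
call3 = binomial_backward(100, 100, 1.05, 0.10, -0.10, 3, 0.75, "call")
```

The design mirrors the R code: only a one-dimensional vector of node values is needed at each step, rather than the full matrix, because the tree recombines.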
23.7 R Codes to Compute Option Prices by Black and Scholes Model

In this section, we write R codes to price options for individual stocks, stock indices, and currencies. We use the same examples as in the previous sections to show the results in R.

BlackScholes <- function(S, K, r, T, sig, type){
  if(type=="C"){
    d1 <- (log(S/K) + (r + sig^2/2)*T) / (sig*sqrt(T))
    d2 <- d1 - sig*sqrt(T)
    value <- S*pnorm(d1) - K*exp(-r*T)*pnorm(d2)
    return(value)
  }
  if(type=="P"){
    d1 <- (log(S/K) + (r + sig^2/2)*T) / (sig*sqrt(T))
    d2 <- d1 - sig*sqrt(T)
    value <- (K*exp(-r*T)*pnorm(-d2) - S*pnorm(-d1))
    return(value)
  }
}

(i) Option Model for Individual Stock

The current stock price is S = $42 and the strike price is X = $40. The interest rate is r = 10%, volatility sig = 0.2, and time-to-maturity = 0.5 yr; the European call and put prices are:

Obs  d1      d2      C (the European call option price)  P (the European put option price)
1    0.7693  0.6278  4.7594                              0.8086

(ii) Option Model for Stock Indices

The current stock index is S = $950 and the strike price is X = $900. The interest rate is r = 6%, volatility sig = 0.15, the dividend yield = 3%, and time-to-maturity = 1/6 yr; the European call and put prices are:

Obs  d1      d2      C (the European call option price)  P (the European put option price)
1    0.9952  0.9339  59.2225                             5.0055

(iii) Option for Currencies

The current currency rate is S = $130 and the strike is X = $125. The interest rate is r = 6%, volatility sig = 0.15, the foreign rate = 2%, and time-to-maturity = 1/3 yr; the European call and put prices are:

Obs  d1      d2      C (the European call option price)  P (the European put option price)
1    0.6501  0.5635  8.4275                              1.8162

23.8 Summary

In this chapter, we presented the binomial option pricing model and the Black and Scholes option pricing model; then we showed how Excel can be used to estimate the binomial option pricing model and the Black and Scholes model for individual stock options, index options, and currency options. We also showed how the R language can be used to estimate the binomial and Black and Scholes option pricing models. Finally, in the appendices, we showed how SAS programming can be used to estimate the binomial option pricing model and the Black and Scholes option pricing model.

Appendix 23.1: SAS Programming to Implement the Binomial Option Trees

The following SAS macro is used to implement binomial trees and calculate the price of a stock, a call option, a put option, and a risk-free bond. The parameters of this macro are:

S: Stock Price,
X: Strike Price,
U: Increase Factor,
D: Decrease Factor,
N: Periods, and
r: Interest.
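As a cross-check on the option values tabulated in Sect. 23.7, the Black-Scholes formula can be reproduced in a few lines of Python using the error function for the standard normal CDF (a sketch paralleling the R function BlackScholes above, not the book's SAS macro):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF written via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes(S, K, r, T, sig, kind="C"):
    d1 = (log(S / K) + (r + sig ** 2 / 2.0) * T) / (sig * sqrt(T))
    d2 = d1 - sig * sqrt(T)
    if kind == "C":
        return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)
    return K * exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)

# Example (i) above: S = 42, X = 40, r = 10%, sig = 0.2, T = 0.5
call = black_scholes(42, 40, 0.10, 0.5, 0.2, "C")   # about 4.76
put = black_scholes(42, 40, 0.10, 0.5, 0.2, "P")    # about 0.81
```

The two values match the d1 = 0.7693, d2 = 0.6278 row of the table for example (i); the index and currency examples would additionally require the dividend and foreign-rate adjustments.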
data d;
  set d;
  drop u;
run;

proc print data=d;
run;

Appendix 23.2: SAS Programming to Compute Option Prices Using Black and Scholes Model

In this section, we write SAS macro function code to price options for individual stocks, stock indices, and currencies. We use the same examples as in the previous sections to show the results in SAS.