Evaluating Mutual Fund Performance - 2001
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms
American Finance Association, Wiley are collaborating with JSTOR to digitize, preserve and
extend access to The Journal of Finance
This content downloaded from 205.208.116.24 on Tue, 12 Mar 2019 14:51:02 UTC
All use subject to https://about.jstor.org/terms
THE JOURNAL OF FINANCE * VOL. LVI, NO. 5 * OCT. 2001
ABSTRACT
Since power will depend on fund style (e.g., large firms have lower return
variances than small firms), our simulations form style-based portfolios.
Although the paper's focus is power, we also present evidence on how test
specification can depend on fund style. Fama and French (1993) argue that
their three-factor model "does a good job" on the cross section of average
stock returns, but they find misspecification for low book-to-market (i.e.,
growth) stocks in size quintiles one and five (see Fama and French (1993),
Table IXa, and Fama and French (1996), Table I, Panel B). This evidence
suggests that style-based funds could be misspecified, at least using
regression-based three-factor benchmarks.
1 The Morningstar definition of median is that half of the fund's money is invested in stocks
of firms with market capitalization larger than the median.
Table I
Columns: Fund Size ($ million); Number of Stocks Held; Annual Turnover (%); Median Market Capitalization of the Stocks Held by a Mutual Fund ($ million)^a; NYSE Decile of the Median Market Capitalization Stock (Decile Ranking as of September 1996)
2 We also constructed portfolios by dividend yield, but the paper's conclusions are unchanged
and to save space the results are not reported.
3 We repeated the analysis by constructing 10 portfolios per month, which means tracking
3,480 portfolios over three-year periods. The results on specification and power are virtually
identical. This increases our confidence that 348 portfolios is large enough to permit precise
inferences. As discussed later, the 348 performance measures are reasonably independent. This
is not the case with 10 per month, however, because the average cross-correlation in portfolio
raw returns exceeds 0.9 and it is greater than 0.6 in the portfolio performance measures.
The procedures just described in Section II.C yield a time series of 348
overlapping performance measures (one set for each simulated portfolio). We
first examine the distributional properties of each performance measure.
For the null hypothesis that the time-series mean of a performance measure
is zero, the test statistic is

    t = \bar{\alpha} / S.E.(\alpha),    (2)

where \bar{\alpha} is the time-series mean of the estimated performance
measures and S.E.(\alpha) is the standard error of that mean. If the
estimated performance measures are independently distributed, then the
standard error is given by
Since the alphas are estimated using 36-month overlapping windows, we use
a correction for serial dependence in estimating the standard error of the
mean (see Newey and West (1987, 1994) and Andrews (1991)) in the
calculation of the t-statistic in equation (2).
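As an illustration of the serial-dependence correction, the following is a minimal sketch of a Newey-West (Bartlett-kernel) standard error for the mean of a series of overlapping performance measures. The function names are hypothetical, the default of five lags is taken from the lag choice reported in the paper's footnotes, and the code is not the authors' implementation.

```python
import numpy as np

def newey_west_se_of_mean(x, lags=5):
    """Standard error of the sample mean with a Bartlett-kernel (Newey-West) correction.

    `lags` is the truncation lag; five is an assumption based on the paper's footnote.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    # Lag-0 autocovariance (divided by n, the usual HAC convention).
    long_run_var = d @ d / n
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)      # Bartlett weights decline linearly in k
        gamma_k = d[k:] @ d[:-k] / n         # lag-k sample autocovariance
        long_run_var += 2.0 * weight * gamma_k
    return np.sqrt(long_run_var / n)

def t_stat_of_mean(x, lags=5):
    """t-statistic for the null hypothesis that the time-series mean of x is zero."""
    x = np.asarray(x, dtype=float)
    return x.mean() / newey_west_se_of_mean(x, lags)
```

With `lags=0` the function reduces to the standard error under independence, so comparing the two values shows how much positive autocorrelation from overlapping windows inflates the standard error.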
For each sample, we also examine whether the null hypothesis is rejected,
and we report the rejection frequencies across the 348 funds. This is done
both before and after abnormal performance is introduced (see Section III.A).
Rejection rates after introducing abnormal performance provide direct
evidence on power. A regression-based performance measure rejects the null
hypothesis if the t-statistic for the estimated alpha from the 36-month
regression exceeds the critical value at the one or five percent
significance level. For characteristic-based measures, we calculate the
time-series mean and standard deviation of the 36 monthly abnormal returns
for each fund; the t-statistic is the ratio of the mean to its sample
standard error.
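The rejection-frequency calculation for characteristic-based measures can be sketched as follows. This is only an illustration: the function name and array layout are hypothetical, and the critical value is passed in by the caller because the text does not state which reference distribution is used (for a one-tailed 5 percent test with 35 degrees of freedom, a Student-t critical value of roughly 1.69 would be one assumption).

```python
import numpy as np

def rejection_frequency(abnormal_returns, critical_value):
    """Fraction of funds whose t-statistic exceeds `critical_value` (one-tailed test).

    abnormal_returns: array of shape (n_funds, n_months), e.g. (348, 36),
    holding each fund's monthly characteristic-adjusted abnormal returns.
    """
    ar = np.asarray(abnormal_returns, dtype=float)
    n_months = ar.shape[1]
    mean = ar.mean(axis=1)                              # time-series mean per fund
    se = ar.std(axis=1, ddof=1) / np.sqrt(n_months)     # sample standard error of the mean
    t = mean / se
    return float(np.mean(t > critical_value))
```

Applied once to the simulated funds under the null and once after a known abnormal return has been added, the two outputs correspond to the specification and power figures, respectively.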
A. Summary
B.1. Specification
Table II
Column groups: Characteristic-based Performance Measures; Regression-based Measures
4 The Newey-West corrected standard errors reported in this study are based on five lags
selected on the basis of sample size. There are alternative lag selection procedures discussed in
Andrews (1991) and Newey and West (1987, 1994). These alternative procedures yield 50-100
percent larger standard errors only in the case of the single-factor model abnormal performance
estimates. In all other cases, all procedures to implement the Newey-West correction yield
virtually identical standard error estimates.
5 Untabulated results show that only the single-factor model abnormal performance estimates
(i.e., the market-adjusted return and Jensen alpha) exhibit large autocorrelations that
decline gradually from about 0.8 at the first lag to 0.1 at lag 33. In contrast, the multifactor
model abnormal performance estimates exhibit almost no positive autocorrelation. Most of these
autocorrelations are not reliably different from zero. The point estimates are generally below
0.1 and several estimates are negative.
B.2. Power
Table III reports results when portfolios are formed with every NYSE-AMEX
stock having an equal likelihood of being included. By construction, the
typical firm selected is of median size (i.e., a mid-cap stock). From Panel
A, the average of the 348 market-adjusted abnormal performance estimates
is 0.31 percent per month (standard error = 0.07 percent) and the
Jensen-alpha estimate is 0.29 percent per month (standard error = 0.07
percent) or about 3.6 percent per year, which is economically large. The
observed misspecification is expected because statistically significant
firm-size related deviations from the CAPM are well documented.
Panel B of Table III shows that the power of the tests is dramatically lower
than that reported in Table II. For example, 3 percent annual abnormal
performance is detected only 31 percent of the time using the size, BM, and
momentum characteristic-adjusted performance measure. Other measures detect
3 percent abnormal performance less frequently. The fall in power highlights
the tests' frailty when applied to funds with asset characteristics that
depart from the value-weighted portfolio.
D. Style Portfolios
D.1. Book-to-Market
Table III
Columns: Summary Statistic; Characteristic-based Performance Measures (Mutual Fund's Raw Return; Market Adjusted; Size, BM, & Momentum Adjusted); Regression-based Measures (CAPM α; Fama-French 3-factor α; Carhart 4-factor α)
Panels C and D of Table IV report results for high book-to-market stock
funds. The results show a marked misspecification of the market-adjusted and
CAPM alpha performance measures. In contrast, the multifactor performance
measures are well specified, except for the modestly excessive rejection
rate (13 percent) using the size, BM, and momentum characteristic-based
performance measure. The rejection rates are also the greatest using the
multifactor characteristic-based performance measure, whereas they are low
using the multifactor regression-based measures. For example, the
four-factor regression-based measure detects 3 percent annual abnormal
performance in only 27 percent of the funds, whereas the corresponding
rejection frequency using the characteristic-based measure is 68 percent.
The latter figure overstates power, however, due to the excessive rejection
rate under the null.
Table V reports results for large (Panels A and B) and small (Panels C and
D) market capitalization stock portfolios. All NYSE-AMEX stocks whose market
capitalization falls below (above) the median of the stocks ranked at the
beginning of each year according to their market capitalization are defined
as small (large) stocks. The results for the large and small market
capitalization stocks reinforce the findings discussed above for the high
and low book-to-market stock portfolios. Specifically, the performance
measures appear slightly misspecified and the power of the tests is higher
for large stocks than for small stocks. Both characteristic-based and
regression-based multifactor performance measures exhibit low power when
applied to small market capitalization stock portfolios. From Panel D, there
is only a one-in-five chance of detecting three percent abnormal performance.
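The small/large split described above amounts to comparing each stock's beginning-of-year market capitalization with the cross-sectional median. A minimal sketch, assuming a hypothetical function name and input layout; the text does not say how a stock exactly at the median is treated, so the "median" label below is an assumption:

```python
import numpy as np

def classify_by_size(market_caps):
    """Label each stock 'small' or 'large' relative to the median market capitalization.

    market_caps: beginning-of-year market capitalizations of the NYSE-AMEX stocks.
    A stock exactly at the median is labeled 'median' (an assumption, not from the paper).
    """
    caps = np.asarray(market_caps, dtype=float)
    median = np.median(caps)
    return np.where(caps < median, "small",
                    np.where(caps > median, "large", "median"))
```

The classification would be repeated at the start of each year, since the median and each stock's rank change over time.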
Table IV
Columns: Summary Statistic; Characteristic-based Performance Measures (Mutual Fund's Raw Return; Market Adjusted; Size, BM, & Momentum Adjusted); Regression-based Measures (CAPM α; Fama-French 3-factor α; Carhart 4-factor α)
100 percent per year, but funds are tracked for up to 10 years. We report
results using only the multifactor characteristic-based measure and the
four-factor regression-based measure.
The results in Table VI suggest that misspecification is quite substantial
at 5- and 10-year horizons using 75 securities. This is seen especially from
rejection rates under the null hypothesis. For example, the
characteristic-based measure's rejection rate under the null hypothesis is zero, compared
Table V
Panel A: Descriptive Statistics for the Performance Measures for Large Stock Mutual Funds
Columns: Summary Statistic; Characteristic-based Performance Measures (Mutual Fund's Raw Return; Market Adjusted; Size, BM, & Momentum Adjusted); Regression-based Measures (CAPM α; Fama-French 3-factor α; Carhart 4-factor α)
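Panel B: Rejection Frequencies Using One-tailed Tests for Large Stock Mutual Funds
Measures: (1) Mutual Fund's Raw Return; (2) Market Adjusted; (3) Size, BM, & Momentum Adjusted; (4) CAPM α; (5) Fama-French 3-factor α; (6) Carhart 4-factor α. For each measure, the two columns are the 1% and 5% significance levels.
Abnormal         (1)        (2)        (3)        (4)        (5)        (6)
performance     1%   5%    1%   5%    1%   5%    1%   5%    1%   5%    1%   5%
0%              19   38     2    8     1    2     3   11     2    8     4   10
1%              22   43     5   20     3    9     8   24     8   25    10   27
3%              28   53    26   43    35   57    33   49    50   74    51   76
5%              35   60    48   63    77   86    53   67    85   92    86   96
7.5%            45   74    72   85    91   97    77   88    96   99    98  100
10%             56   81    93   99    98  100    95   99    99  100   100  100
15%             78   89   100  100   100  100   100  100   100  100   100  100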
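Panel C: Descriptive Statistics for the Performance Measures for Small Stock Mutual Funds
Column groups: Characteristic-based Performance Measures; Regression-based Measures
Panel D: Rejection Frequencies Using One-tailed Tests for Small Stock Mutual Funds
Measures: (1) Mutual Fund's Raw Return; (2) Market Adjusted; (3) Size, BM, & Momentum Adjusted; (4) CAPM α; (5) Fama-French 3-factor α; (6) Carhart 4-factor α. For each measure, the two columns are the 1% and 5% significance levels.
Abnormal         (1)        (2)        (3)        (4)        (5)        (6)
performance     1%   5%    1%   5%    1%   5%    1%   5%    1%   5%    1%   5%
0%              21   47    10   25     1    2     9   22     1    3     0    5
1%              25   50    13   28     1    3    12   26     2    7     2    8
3%              32   53    19   35     3   21    18   36     6   20     9   22
5%              41   55    27   41    24   46    25   43    18   41    20   39
7.5%            46   60    36   49    51   69    36   52    43   63    40   62
10%             51   64    44   58    72   85    47   61    64   83    62   80
15%             59   80    64   83    92   97    65   83    91   97    89   96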
Table VI
A. Simulation Design
We again form samples similar to those in Sections II and III, one starting
each month from January 1966 until December 1994. We describe the sample
construction assuming we are tracking only the performance of a mutual
fund's stock purchases. We have also studied power when both purchases
and sales are tracked; power is slightly higher for sales than for purchases
only, but to save space, the results are not reported.
For each sample, we select an average of six stocks each month for 36
months. The number of stocks selected each month is normally distributed
with a mean of six and a standard deviation of two. The random number is
rounded to a nonnegative integer. Stocks are selected from the NYSE-AMEX
universe, and a stock's selection probability is its market value weight.
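One month of this sampling scheme can be sketched as below. The function name, the clipping of negative draws to zero, and sampling without replacement within a month are assumptions made for illustration; the text specifies only the normal distribution of the count, the rounding, and value-weighted selection probabilities.

```python
import numpy as np

def draw_monthly_buys(market_values, rng, mean=6.0, sd=2.0):
    """Return the indices of the stocks 'bought' in one simulated month.

    market_values: market capitalizations of the stocks in the universe.
    rng: a numpy.random.Generator, e.g. np.random.default_rng(seed).
    """
    mv = np.asarray(market_values, dtype=float)
    weights = mv / mv.sum()                           # selection probability = value weight
    n_buys = max(0, int(round(rng.normal(mean, sd)))) # rounded, clipped at zero (assumption)
    n_buys = min(n_buys, mv.size)
    if n_buys == 0:
        return np.empty(0, dtype=int)
    # Assumption: a stock is bought at most once in a given month.
    return rng.choice(mv.size, size=n_buys, replace=False, p=weights)

# One 36-month sample for a toy universe of 500 stocks (hypothetical market values).
rng = np.random.default_rng(42)
toy_market_values = np.arange(1, 501, dtype=float)
sample = [draw_monthly_buys(toy_market_values, rng) for _ in range(36)]
```

Because selection is value weighted, large-capitalization stocks appear in the simulated purchases far more often than small ones, which is what makes the event-study portfolios comparable to the value-weighted simulations in the earlier sections.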
The average of six buys per month results in an average of 72 stock
purchases per year. Since a typical mutual fund has 75 stocks and the
simulations described in Section II assume 100 percent turnover each year,
the 72 buys a year for the event-study simulations are roughly equivalent to
100 percent turnover per year. Since the performance assessment in previous
sections uses a 36-month evaluation period, the event study also tracks
stock purchases in 36 consecutive months.
For each sample, we aggregate all the buys from the 36 months (on
average, 216 buys) and evaluate the resulting equal-weighted portfolio's
characteristic-adjusted performance over a 1- to 12-month period following
each stock's purchase. We report results using size- and book-to-market
characteristic-adjusted returns. Experimentation with different
characteristic-adjusted performance measures suggests that power is not
sensitive to the choice of the characteristic matching. We test the null
hypothesis that the T-month abnormal performance of the equal-weighted
portfolio of the event-study stocks is zero using a t-statistic (see the
notes to Table VII),6 where T = 1, 3, 6, and 12 months.
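The event-study test statistic can be sketched as follows: each stock's monthly abnormal returns are compounded over T months, and the equal-weighted mean is divided by its standard error. The function name is hypothetical, and the standard error used here (cross-sectional sample standard deviation divided by the square root of N) assumes independent observations; it ignores the corrections for cross-sectional and temporal overlap that the paper discusses in a footnote.

```python
import numpy as np

def event_study_t(monthly_abnormal_returns):
    """t-statistic for the mean T-month abnormal return of N purchased stocks.

    monthly_abnormal_returns: array of shape (N, T), one row per stock purchase,
    one column per month after the purchase (returns in decimal form).
    """
    ar = np.asarray(monthly_abnormal_returns, dtype=float)
    ar_T = np.prod(1.0 + ar, axis=1) - 1.0      # compound each stock's monthly abnormal returns
    n = ar_T.size
    se = ar_T.std(ddof=1) / np.sqrt(n)          # assumes independence across stocks
    return ar_T.mean() / se
```

For a sample with about 216 purchases and T = 12, the input would be an array of shape (216, 12).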
C. Simulation Results
6 Since we sample multiple stocks each month in each event-study simulation and because T
can exceed a month, there can be both cross-sectional and temporal overlap in excess returns.
This very likely violates the independence assumption underlying the test statistic. We attempt
to correct for both cross-sectional dependence and dependence due to the use of overlapping
return data using the methods described in Chopra, Lakonishok, and Ritter (1992, p. 251) and
Newey and West (1987). These corrections yield similar results and are not reported.
C.1. Power Comparisons
C.2. Limitations
7 The event study does not include the momentum factor in constructing companion portfolios
(see Lyon, Barber, and Tsai (1999)). Higher rejection rates and more powerful tests would
be expected by including a momentum factor, but this would only reinforce the paper's
conclusions about power gains using a trade-based approach.
Table VII
    t = \frac{(1/N) \sum_{i=1}^{N} AR_{iT}}{S.E.(AR)}

where AR_{iT} is security i's T-month abnormal return calculated by compounding the stock's
monthly abnormal returns over T months; N is the number of stocks in the event-study
portfolio, i varies from 1 to N, and S.E.(AR) is the standard error of the mean of the T-month
abnormal returns. The standard error is
REFERENCES
Carhart, Mark M., 1997, On persistence in mutual fund performance, Journal of Finance 52, 57-82.
Chen, Hsiu-Lang, Narasimhan Jegadeesh, and Russ Wermers, 2000, The value of active mutual fund management: An examination of the stockholdings and trades of fund managers, Journal of Financial and Quantitative Analysis 35, 343-368.
Chopra, Navin, Josef Lakonishok, and Jay R. Ritter, 1992, Measuring abnormal performance: Do stocks overreact? Journal of Financial Economics 31, 235-268.
Daniel, Kent, Mark Grinblatt, Sheridan Titman, and Russ Wermers, 1997, Measuring mutual fund performance with characteristic-based benchmarks, Journal of Finance 52, 1035-1058.
Daniel, Kent, and Sheridan Titman, 1997, Evidence on the characteristics of cross-sectional variation in stock returns, Journal of Finance 52, 1-33.
Fama, Eugene F., and Kenneth R. French, 1993, Common risk factors in the returns on stocks and bonds, Journal of Financial Economics 33, 3-56.
Fama, Eugene F., and Kenneth R. French, 1996, Multifactor explanations of asset pricing anomalies, Journal of Finance 51, 55-84.
Jensen, Michael C., 1968, The performance of mutual funds in the period 1945-1964, Journal of Finance 23, 389-416.
Lyon, John D., Brad M. Barber, and Chih-Ling Tsai, 1999, Improved methods for tests of long-run abnormal stock returns, Journal of Finance 54, 165-201.
Newey, Whitney D., and Kenneth D. West, 1987, A simple, positive semi-definite heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica 55, 703-708.
Newey, Whitney D., and Kenneth D. West, 1994, Automatic lag selection in covariance matrix estimation, Review of Economic Studies 61, 631-653.