Jointly published by Akadémiai Kiadó, Budapest
and Springer, Dordrecht
Scientometrics,
Vol. 63, No. 1 (2005) 87–120
Exploring size and agglomeration effects
on public research productivity
ANDREA BONACCORSI,a CINZIA DARAIOb
a University of Pisa and Scuola Superiore S. Anna, Pisa (Italy)
b IIT-CNR and Scuola Superiore S. Anna, Pisa (Italy)
The paper assesses the empirical foundation of two widely held assumptions in science policy
making, namely scale and agglomeration effects. According to the former effect, scientific
production may be subject to increasing returns to scale, defined at the level of administrative
units, such as institutes or departments. A rationale for concentrating resources on larger units
clearly follows from this argument. According to the latter, scientific production may be positively
affected by external economies at the geographical level, so that concentrating institutes in the
same area may improve scientific spillover, linkages and collaborations.
Taken together, these arguments have implicitly or explicitly legitimated policies aimed at
consolidating institutes in public sector research and at creating large physical facilities in a small
number of cities.
The paper is based on the analysis of two large databases, built by the authors from data on the
activity of the Italian National Research Council in all scientific fields and of the French INSERM
in biomedical research. Evidence from the two institutions is that the two effects do not receive
empirical support. The implications for policy making and for the theory of scientific production
are discussed.
Introduction
In recent years policy making in the field of science and public research has been
influenced by the attempt to apply economic concepts. The pressure on public budgets
in almost all industrialised countries has led governments to pursue (or at least to
declare they pursue) efficiency in the allocation and management of resources in the
public research sector. The increasing societal demand for accountability and
transparency of science also makes it important to demonstrate that public funding
follows clear rules.
A clear manifestation of this trend is the effort to apply to public scientific research
two fundamental concepts drawn from economic analysis, namely increasing
returns to scale (or economies of scale) and external economies (or economies of
agglomeration).
Received November 2, 2004
Address for correspondence:
CINZIA DARAIO
Institute for Informatics and Telematics (IIT), Consiglio Nazionale della Ricerca (CNR)
Area della ricerca di Pisa , Via G. Moruzzi, 1; I-56127 Pisa, Italy
E-mail: cinzia@sssup.it, cinzia.daraio@iit.cnr.it
0138–9130/2005/US $ 20.00
Copyright © 2005 Akadémiai Kiadó, Budapest
All rights reserved
A. BONACCORSI, C. DARAIO: Size and agglomeration effects
If these two forces were at play in scientific research, then a sound policy
implication would be that, in order to improve the efficiency of public research,
resources should be concentrated into larger institutions and/or into geographically
agglomerated areas.
This paper explores scale and agglomeration effects in scientific research with
reference to two large European public research institutions, the Italian National
Research Council (CNR) in several research areas, and the French INSERM in the
biomedical field.
Scale and agglomeration economies in scientific research
In the attempt to apply economic concepts to science by means of analogy, it is
assumed that institutes and departments are analogous to firms, using production factors
or inputs in order to obtain scientific output.
This analogy raises several problems. First of all, there is an important identification
problem: what is the unit of production in scientific research? On the one hand, it has been
argued that the appropriate unit of analysis for production is the laboratory or team
(LAREDO & MUSTAR, 2001). Researchers are members of several projects that cut
across administrative boundaries of institutes. At the same time, it is still true that all
researchers are generally members of an institute or department defined by discipline or
thematic field. While direct production takes place in laboratories and within teams, still
the institutional level of institutes and departments makes sense. In general, it must be
recognized that organizational arrangements may differ across scientific disciplines
(SHINN, 1979; WHITLEY, 1984) and that empirical research should try to take these
differences into account. As an example, in this paper we provide data for several
disciplines in the Italian case of CNR; furthermore, within a single large and diversified
field for which data are available (i.e. biomedicine) we also provide comparative results
between a set of institutes in two national institutions (INSERM in France and CNR in
Italy).
Second, there are several measurement problems for both inputs and outputs.
Among inputs to scientific production the following are considered:
(i) number of researchers, possibly classified by category (i.e. directors, senior
researchers, junior researchers, post-doc. and Ph.D. students), age, seniority (i.e.
number of years in the field), disciplinary background, and quality (i.e. cumulated
number of publications, or citations, or impact factor);
(ii) stock of capital equipment;
(iii) research funds;
(iv) stock of past knowledge (as measured for example by cumulated number of
publications at the level of institute).
A number of severe measurement and practical problems make the complete
analysis almost impossible. In practice, it is enormously difficult to collect data on all
these items for a sufficiently long period of time. Within relatively homogeneous
research areas it is considered acceptable to utilise a subset of inputs such as number
and category of researchers, or number of researchers and research funds. Data on the
stock of capital equipment are not easily available.
On the side of research outputs, other problems are at play. For most purposes,
especially within relatively homogeneous research areas, a simple count of publications
is considered acceptable. A more complete treatment, however, should distinguish
between quantity of output, its quality (as measured by citations received) and its
relevance (as measured by subjective evaluations of experts in the field). In addition,
the relevant outputs of scientific production also include teaching, applied research and
consultancy for industry and third parties, patenting, and the like. Consequently, not
only is scientific production inherently multi-input multi-output, but all inputs and
outputs are heterogeneous and cannot be easily measured using commensurable
variables.
Finally, the specification of the relation between inputs and outputs is another
difficult conceptual problem. This relation is likely to be non-deterministic, have a
lagged structure, and have a time sequence which is variable over time and across
sectors. In the light of these characteristics, any meaningful measure of productivity
should be generated by a model of multi-input multi-output production without a fixed
functional specification.
Despite these severe identification, measurement and specification problems and
the resulting difficulties in testing specific predictions, the idea that scientific
production must exhibit some relation between the resources employed and the output
produced is generally accepted. For practical and policy objectives simple measures of
the ratio of output to input are considered an indicator of scientific productivity. As an
example, the crude number of papers per researcher, within relatively homogeneous
fields, is considered an acceptable indicator of productivity across large numbers.∗
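The crude ratio described above is simple to compute. The following is a minimal sketch only: the institute names and figures are invented for illustration and are not taken from the CNR or INSERM data.

```python
# Hypothetical data: (institute, publications, researchers). Illustrative only.
institutes = [
    ("Institute A", 120, 40),
    ("Institute B", 45, 10),
    ("Institute C", 300, 150),
]


def papers_per_researcher(publications, researchers):
    """Crude productivity ratio: count of outputs divided by count of inputs."""
    if researchers <= 0:
        raise ValueError("researchers must be positive")
    return publications / researchers


# Map each hypothetical institute to its papers-per-researcher ratio.
productivity = {
    name: papers_per_researcher(pubs, staff) for name, pubs, staff in institutes
}
```

In this invented example the smallest institute has the highest ratio, which is only meant to show that the indicator is independent of size by construction; it says nothing about the empirical question.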
Having established the analogy between scientific research and production, and
apart from the methodological problems discussed above, two questions can
legitimately arise. Let us state them as follows:
(a) does the concentration of resources over large institutions or institutes improve
scientific productivity? In other words, is there in the economics of science the same
phenomenon called economies of scale in production?
(b) does the territorial concentration of scientists improve scientific productivity? In
some countries a policy of locating laboratories and research institutes in the same
territorial area has been actively pursued, with a view to creating so-called economies of
agglomeration. Does this policy improve the production of scientific publications?
∗ The use of simple ratios within a context of multi-input multi-output production, of course, can be criticized.
See LINK (1996) for a discussion of limitations of any production function approach in science.
Economies of scale in scientific production
In the context of manufacturing production, economies of scale refer to the fact that
an increase of k times in all factors of production yields an increase in output of
more than k times. Therefore the larger the scale of production (i.e. productive capacity
of plants), the lower the unit or average cost in the long run. To claim that increasing
returns to scale are at play one must increase simultaneously all factors of production,
not only the variable ones (i.e. labour). It is useful to distinguish between economies of
scale at the level of plant and at the level of firm. The latter may be limited to
manufacturing costs for several plants or include also managerial costs for
non-manufacturing activities.
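The definition above can be stated compactly. As a sketch, assume a production function f that maps a vector of inputs x to output, and a long-run cost function C(q) at fixed input prices; these symbols are introduced here for illustration and do not appear in the original text.

```latex
% Increasing returns to scale: scaling ALL inputs by k > 1
% raises output by more than k times.
f(kx) > k\, f(x) \qquad \text{for all } k > 1 .

% At fixed input prices this implies that long-run average cost
% falls as output q grows.
\frac{d}{dq}\left( \frac{C(q)}{q} \right) < 0 .
```

Constant returns correspond to equality in the first expression, and decreasing returns to the reverse inequality.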
The counterparts of the plant or the firm in scientific production are not uniquely
determined. In principle, one should consider the smallest unit of production at which
fixed factors of production such as physical equipment are utilised, i.e. the research
laboratory. However, because some resources (e.g. facilities, instrumentation, technical
personnel) are shared across laboratories, the institute or department is also a
meaningful level of observation. Finally, one could consider also the overall university
or the public research institution as an appropriate level of observation, given that
several decisions about the allocation of resources (e.g. funds, personnel) are taken at
this level. According to the data utilised, the former level (laboratory) may be
considered the counterpart of the plant, while the institute and the university or public
institution levels are similar to the firm level or the multidivisional company,
respectively. In empirical work, owing to the difficulty of analysing laboratories, most
studies focus on the institute or the department or the university (RAMSDEN, 1994;
JOHNSTON, 1994; ADAMS & GRILICHES, 2000).
Applied to science, the notion of economies of scale implies that research units should
be large, in order to optimise the use of productive resources and increase productivity:
the larger the unit, the higher its scientific productivity. This notion is often invoked to
support policies of concentration of resources in larger institutes, forcing small
institutes to merge or disappear, or policies of merger and consolidation of scientific
institutions. The keyword for these policies is critical mass.
As has been noted, “a prominent feature of research support policy in many,
though not all countries, over the last twenty years has been the espousal and
implementation of resource allocation processes that provide ‘selectivity and
concentration’. Implicit in these policies has been the assumption that ‘bigger is better’;
in other words, that scientific research benefits from economies of scale. This approach
has been most pronounced in the UK and to some extent other Anglo-Saxon derivative
countries, but it has been the subject of consideration and experiment in many other
countries as well” (JOHNSTON, 1994, p. 25–26).
Public policies based on critical mass and large institutes induce levels of
concentration of resources that go beyond the usual level. Concentration is a robust
structural property of institutional systems that allocate research funds in proportion to
publishing output. Since publication activity follows a strongly asymmetric distribution,
it is no surprise that research funds are not allocated on a uniform basis.
In a sample of Australian researchers in 18 universities, RAMSDEN (1994) found that
14% of researchers were responsible for 50% of all publications in the 1985-89 period,
while 40% published 80% of the total. Approximately the same ratio was found by
COLE, COLE & SIMON (1981) and RESKIN (1977) for US universities (15% of
researchers published 50% of the total), while HALSEY (1980) found comparable
concentration ratios for British universities and polytechnics (23% of researchers
published 68% of the total). As a consequence, a small number of universities that
follow a consistent policy of hiring scientists with a strong publication record absorb a
large share of funds: in the US more than 50% of the national budget for universities is
concentrated in the top 33 universities. In the UK the top 6 universities absorb around 50%
of the total.
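Concentration ratios of the kind quoted above (for example, "14% of researchers were responsible for 50% of all publications") can be computed from individual publication counts. The sketch below is illustrative only: the function name and the sample counts are assumptions introduced here, not data or methods from the studies cited.

```python
def min_share_of_researchers(pub_counts, target_share):
    """Smallest fraction of researchers who jointly account for at least
    `target_share` of all publications, taking the most prolific first."""
    if not pub_counts:
        raise ValueError("pub_counts must be non-empty")
    total = sum(pub_counts)
    if total <= 0:
        raise ValueError("total publications must be positive")
    running = 0
    # Walk down the ranking from the most to the least prolific researcher.
    for rank, count in enumerate(sorted(pub_counts, reverse=True), start=1):
        running += count
        if running >= target_share * total:
            return rank / len(pub_counts)
    return 1.0


# Hypothetical publication counts for ten researchers (illustrative only).
counts = [30, 10, 4, 2, 1, 1, 1, 1, 0, 0]
share_for_half = min_share_of_researchers(counts, 0.5)
```

With a strongly skewed distribution such as this invented one, a single researcher out of ten already covers half of the output, whereas a perfectly uniform distribution would require half of the researchers.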
Policies aimed at concentration do not simply follow the structural asymmetry of
distribution of publication activity, but aim to actively improve productivity.
In Italy, for example, the recent legislative reform of the National Research Council
(Reorganization Decree no. 19 of 1999) has induced a profound change in the
administrative structure. The number of institutes has been reduced from 314 in 1999 to
108 in 2001. Many of the smallest institutes were, in effect, the result of fragmentation
processes, created around a few researchers and crystallised over time. Given that the
administrative burden is, at least to a certain extent, a fixed cost associated with service
indivisibility, the existence of a minimum efficient scale for administrative costs is
plausible.
It should be noted, however, that policy decision makers are often driven by a more
general notion that research activity itself, and not merely its administrative side, is
subject to increasing returns to scale. In other words, policy decision makers implicitly
apply notions from economics to the research activity, drawing analogies between
manufacturing and the production of knowledge.
The analogy is based on the idea that research, like manufacturing, is subject to (a)
division of labour; (b) indivisibility in the use of a minimum number of diverse
competencies; (c) utilisation of large physical infrastructure. These reasons are
sufficient conditions for the emergence of increasing returns to scale in several
industries in the manufacturing sector (SCHERER, 1980; MILGROM & ROBERTS, 1992;
MARTIN, 2002).
The analogy may be severely misleading, however, for several reasons.
Division of labour. As is well known, the larger the size of production units, the
better the subdivision of production into specialised tasks that maximise efficiency.
However, there are fundamental differences between productive division of labour
and cognitive division of labour. In science, the output of any individual is made public
via publication, so that any other scientist may benefit from his contribution and add to
it. Knowledge stored in publications allows division of cognitive labour to take place in
different places and periods of time. Publication is therefore the most important
mechanism for promoting division of cognitive labour. This means that placing
scientists within the same organisational boundaries is neither a necessary nor a
sufficient condition for benefiting from improved division of labour.
There may be a form of division of labour that requires the establishment of formal
collaboration and coordination of tasks between scientists. It is useful to draw a
distinction between division of labour among peers, and division of labour along the
research career, or among scientists with different seniority.
The former type takes the form of personal links, based on mutual recognition and
professional esteem. Since the most important personal assets that established scientists
bring into collaborations are competence and reputation, the boundaries of personal
links tend to follow spontaneously the actual distribution of these elements, often on a
world basis. Only occasionally one can find the entire web of personal peer
relationships included within the boundaries of a single organisation.
A different type of division of labour takes place between scientists at various stages
of careers, and between scientists and technicians or assistants. In the former case the
pattern of personal relations is based on apprenticeship and scientific leadership and
requires long periods of joint work and supervision, normally (but not necessarily)
within the same institution. In the latter case a chief scientist organises the work of a
number of people having different roles, taking the scientific responsibility of projects.
Because both types of division of labour require personal in-depth supervision, the
size of resulting units is limited by the ability of research directors to monitor closely
the work of their research students and collaborators and to contribute to their training.
In most scientific fields this amounts to saying that the maximum size is quite small, on the
order of a few people up to one or two dozen.
Again, this argument must be made domain-dependent, since the size that may
favour the cognitive division of labour may be very different across disciplines (e.g.
LATOUR & WOOLGAR, 1979; SHINN, 1982).
Summing up, it is unlikely that division of labour per se is a source of increasing
returns to scale at the level of institutes across all disciplines.
Indivisibility. Indivisibility is a serious argument. In many areas the production of
scientifically meaningful output requires the combination and coordination of many
scientists from different fields, bringing competencies in the substantive field and in
complementary areas such as measurement techniques, statistical analysis, scientific
computing, software development, data and image processing and analysis and the like.
In some areas the very substantive field requires the integration of different disciplinary
backgrounds. As a result the notion of minimum size of a research unit is economically
sensible (see for example COHEN, 1991; KRETSCHMER, 1985; QURASHI, 1991; 1993;
SEGLEN & AKSNES, 2000). However, we should be careful in defining the level of
observation. First of all, indivisibility is more important at the level of team or
laboratory than at the level of institute or department. Second, while the notion of
indivisibility is clear in abstract terms, its empirical relevance may be highly variable.
In other words, the minimum size of a team or laboratory may be extremely variable
across specific areas within the same fields. In general, this means that economies of
scale may be important up to a threshold level, then become irrelevant. If the threshold
level is quite small (on the order of a few people or a few dozen), the practical
implication is that even small institutes may be highly efficient, provided that their
teams or labs meet the minimum requirement (KYVIK, 1995).
At the same time it must be recognized that size may have strong benefits in terms
of a broader notion of organizational support. This does not only include direct
resources employed in scientific production, such as research assistants, technicians, or
equipment, but also shared resources such as libraries and facilities, and more
importantly, indirect resources such as competent colleagues. Here the argument is that
a larger availability of these resources may facilitate discovery or scientific
productivity. In particular, it is easier to find top level scientists in large than in small
laboratories or universities. In a study of scientific discovery by 16 Nobel laureates,
HURLEY (1997, p. 76) has found that “in terms of budget, library resources, technical
support and the availability of exceptional colleagues, these laboratories (where Nobel
laureates worked) are organisationally very rich indeed”. While there is merit in this
argument, the causal assumption must be made clear. It is true that talented scientists
are attracted by places where resources are abundant, but this is unlikely to be the most
important factor. In a long run perspective, it is talent that creates resources (and then
organizational size), rather than size that produces talent.
Physical infrastructure. Access to physical infrastructure is another argument
commonly associated with the call for critical mass and concentration of resources in large
institutions. Here the empirical counterpart is so-called big science, in which the cost
of research instrumentation is very high. No one denies the importance of this
phenomenon. However, it cannot be invoked as a general argument in favour of large
institutes.
More subtly, in big science the use of large experimental facilities is almost
exclusive, so that institutions must guarantee their ownership. This is not so, for
example, in fields such as genomics and proteomics. Here large research facilities such
as databases are also necessary, but their use is not exclusive and can be made available
also to small institutions on a contractual basis. The link between size of infrastructure
and size of institution may be broken.
Empirical evidence. Summing up, there are many arguments for assuming that
increasing returns to scale are at play at the level of institutes. However, these
arguments cannot be deduced analogically from similar factors in manufacturing
production, because the economic structure of the two fields is profoundly different.
Similarities are somewhat superficial, while differences in complementarities and
coordination patterns are deep.
On the other hand, even in industrial economics the existence and relevance of
economies of scale is ultimately an empirical matter (PRATTEN, 1971).
Several studies have examined the relation between size and research productivity
or higher education productivity (BRINKMAN, 1981; BRINKMAN & LESLIE, 1986; COHN
et al., 1989; DE GROOT et al., 1991; LLOYD et al., 1993; NELSON & HEVERT, 1992;
GETZ et al., 1991).
The evidence on returns to scale in scientific production is ambiguous. ADAMS &
GRILICHES (2000) find constant returns to scale at the level of university in the case of
the United States, while NARIN & HAMILTON (1996, p. 297) review several studies and
conclude “we have never found that the size of an institution is of any significance”. In
the conclusion of the review commissioned by the UK Office of Science and
Technology, VON TUNZELMANN et al. (2003) state: “there seems to be little if any
convincing evidence to justify a government policy explicitly aimed at further
concentration of research resources on large departments or large universities in the UK
on the grounds of superior economic efficiency”. On the other hand, other studies find
increasing returns to scale up to a threshold level, after which constant or even
decreasing returns better describe the situation. As JOHNSTON (1994) summarises the
literature, “the results of this body of work can best be characterized as ambiguous and
contradictory. The majority verdict is that research output is linearly related to size with
no significant economies of scale apparent. Others have argued that the relationship
between output and size is more complicated – for example, that there are economies of
scale up to a certain group size after which diseconomies set in” (JOHNSTON, 1994,
p. 32).
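One simple way to read the phrase "research output is linearly related to size" is through a scale elasticity: the slope of log output on log size, which equals one under constant returns. The sketch below estimates that slope by ordinary least squares. It is a deliberately parametric simplification used for illustration, not the method of this paper (which argues that productivity measures should not rely on a fixed functional specification), and the data are invented.

```python
import math


def scale_elasticity(sizes, outputs):
    """OLS slope of log(output) on log(size).

    A slope above 1 suggests increasing returns to scale, a slope of 1
    constant returns, and a slope below 1 decreasing returns.
    """
    if len(sizes) != len(outputs) or len(sizes) < 2:
        raise ValueError("need two or more paired observations")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(o) for o in outputs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical institutes whose output is exactly proportional to size.
sizes = [5, 10, 20, 40, 80]
outputs = [2 * s for s in sizes]
beta = scale_elasticity(sizes, outputs)
```

A single slope cannot capture the threshold pattern mentioned in the quotation (economies of scale up to a certain group size, then diseconomies); detecting that would require a more flexible specification.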
Despite the fact that empirical evidence is not conclusive, the notion that economies
of scale matter in scientific production is firmly held in many political circles and
inspires important political and administrative decisions. For this reason it is useful to
add further evidence. Also, while there is some evidence on universities, much less is
known with respect to institutes of large public research organisations, such as CNRS,
CNR or Max Planck. This paper contributes by examining this issue with
respect to non-university public research institutions.
Agglomeration economies
The notion of scientific districts, clusters, poles of excellence or science areas has
been prominent in national and regional science policy in the last twenty years. The
fascinating examples of Silicon Valley and Route 128 (SAXENIAN, 1996) and the
emergence of technopoles and regional clusters (CASTELLS & HALL, 1994; COOKE &
MORGAN, 1998) have catalysed the attention of analysts and policy makers in all
advanced countries.
At a regional level the notion of cluster identifies the co-presence and interaction of
diverse subjects such as research and educational institutions, firms, innovative public
administrations, financial services, technology transfer and other intermediary
organisations (ACS, 2000; SCOTT, 2001). At this level the emphasis is not on clustering
of research activities per se, but of clustering of complementary innovative activities in
the same area.
This general notion, however, has also inspired policies of location of research
activities by some large public research institutions. In several countries large public
research institutions have pursued a policy of creating geographical concentrations of
institutes in the same area. For example in Italy CNR promoted the creation of Research
Areas, large agglomerations of institutes in different fields within the same physical
infrastructure. In France most research institutes at CNRS and INSERM are located
close to one another.
Behind these policies there is the idea that proximity favours scientific productivity,
insofar as it maximises personal interaction, face-to-face communication, on-site
demonstrations and transmission of tacit knowledge, and facilitates
identification of complementary competencies, unintentional exchange of ideas, café
phenomena, and other serendipitous effects.
The focus of our discussion is therefore the notion that concentrating research
activities in the same area may bring benefits to scientific productivity. We do not enter
into a discussion on more general policies for clustering and agglomeration of
innovative activities.
Underlying these policies there are some well grounded economic ideas. As sometimes
happens, the original idea is an old one, but it was rediscovered and enlarged more
recently. The implicit economic analogy is with the concept of external economies, or
Marshallian agglomeration economies (MARSHALL, 1920; KRUGMAN, 1991; PYKE et al.,
1986). Alfred Marshall observed that the concentration of a large number of
manufacturing firms in the same area (industrial district) is not due to chance, but reflects the
presence of local externalities in the form of availability of specialised suppliers, highly
trained workforce, sources of innovative ideas. Costs of production are therefore lower
in an agglomerated area than outside it. More importantly, firms in a district enjoy a
particular industrial atmosphere and benefit from processes of collective invention.
The literature on agglomeration economies has been carefully reviewed by
ROSENTHAL & STRANGE (2004). The available evidence suggests that all three sources
of agglomeration economies suggested by MARSHALL (1920) are present, namely labor
market pooling, input sharing, and knowledge spillovers.
The literature that has examined the impact of knowledge spillovers has tried to
explain agglomeration processes as the result of intrinsic limits to the geographic
mobility of technological and scientific knowledge. Here the main emphasis is on the
fact that the diffusion of knowledge may take place via codification and distance
transmission, but in most cases requires also personal acquaintance and face to face
interaction. This is made easier and cheaper by physical proximity. Since there is
complementarity between codified and tacit knowledge, even in science physical
proximity may be important. The idea is therefore that knowledge flows have an
embedded nature and require physical proximity which facilitates exchange of
experience and interpersonal communication. In a path-breaking work, JAFFE et al.
(1993) studied the structure of citations to patents and found that the number of
citations sharply declines with distance from the site of inventors. Citations are 5 to 10
times more frequent in the same area. Similar results have been found by JAFFE (1989),
ACS et al. (1992), ALMEIDA & KOGUT (1999), ZUCKER et al. (1998) and
AUTANT-BERNARD (2001).
AUDRETSCH & FELDMAN (1996) found a positive relation between geographic
concentration of industries and the R&D/sales ratio and proportion of skilled labor,
consistent with the notion that knowledge spillovers influence agglomeration. BOTTAZZI
& PERI (2003) measured the decay of knowledge spillover at regional level and found
that the effect goes down to zero at approximately 300 km from the source.
On the other hand, this literature has been somewhat vague on the specific
mechanisms that link knowledge spillovers and agglomeration, failing to provide
compelling evidence of a general effect. A more specific effect has been suggested by
HUSSLER & RONDE (2004): in epistemic communities there is the need to negotiate
meanings because they do not share the same cognitive frame ex ante (COWAN et al.,
2000) but have to build it through interaction. Therefore physical proximity may be a
factor, while in communities of practice coordination is ensured by shared technologies
or equipment, making proximity less important. They find evidence of this pattern on
data about French inventors. Besides, BRESCHI & LISSONI (2004) found that the pattern
of citations from patents to patents follows interpersonal relations, as evidenced by
networks of co-invention, and not necessarily geographic proximity.
In sum, while the existence of knowledge spillovers is generally accepted, the
channels through which they are diffused are much less clear. Therefore the impact of
agglomeration has still to be demonstrated.
No one denies that concentrating many research institutions in the same area may
have benefits in terms of administrative activity, logistics, emergence of specialised
services and the like. Some facilities (libraries, technical services) require a minimum
size to function efficiently and cannot be replicated across many small units scattered
around a country. In addition, facilitating personal interaction may indeed spur creative
activity. In many respects, a policy of intentional agglomeration of research units in
the same geographic area makes economic sense.
The problem is one of causality assumptions. Underlying most agglomeration
policies is the idea that geographic closeness may increase productivity. The assumed
causality direction goes from agglomeration to productivity. However, this causal
mechanism should not be taken for granted.
On the one hand, the mere existence of agglomeration does not imply the existence
of agglomeration economies. Another mechanism may be at play, going in the reverse
direction, from scientific productivity to agglomeration. As an example, one can claim
that scientific excellence creates its own agglomeration effects. When a laboratory or a
scientist in a given place opens promising lines of research, PhD students and post-docs
move from other universities and choose to invest their early career in that place,
visiting scholars spend periods of training, visiting professors are eager to deliver
seminars and suppliers of scientific instrumentation periodically visit the location. If the
institutional scientific system is sufficiently flexible, the scientist will receive support
for infrastructure and his laboratory will grow and attract further people. The choice of
the initial location may happen by chance or historical contingency, rather than being
planned rationally. Because physical facilities must follow the constraints placed by
administrators, laboratories are often located close to each other. When we observe the
phenomena over time we are tempted to conclude that scientists working in
agglomerated areas are more productive, but the reverse is true: productive scientists
dynamically create their own agglomeration effects. If this is true, a policy of
agglomeration should not confuse causes with effects. Agglomeration per se
does not have any meaning for scientific productivity.
On the other hand, the question of empirical relevance of agglomeration economies
should not be overlooked. How severe is the disadvantage for a laboratory to work in
relatively isolated areas? Is it the concentration of research activities in the same area or
rather the quality of life that attracts talented scientists to a given location? Empirical
evidence on localized knowledge spillovers should not be interpreted in the sense that
proximity is a necessary condition for transmission of knowledge. For this to be true,
one must show cases in which physical distance has actually precluded the transmission
of knowledge. We are not aware of these cases.
Summing up, there are many good reasons for a policy of agglomeration of research
activities in the same geographic area. At the same time the importance of
agglomeration is an inherently empirical matter and should be evaluated case by case.
More importantly, policies should not implicitly assume that a causal mechanism is in
place.
Our paper contributes to this debate by testing the existence of economies
of scale and agglomeration with reference to non-university research institutions. This
empirical setting is particularly interesting, for several reasons. First, unlike universities
that enjoy a large degree of autonomy, public research institutions receive most of their
budget from the government and allocate it to institutes, following a centralised
procedure. So they are in a better position to influence the size of institutes, if they
believe this is better for productivity reasons. Second, unlike universities that have a
long historical tradition, institutes from public research institutions may be located in
many different places. So they have the choice to promote policies of agglomeration of
institutes in closely related areas or to scatter institutes throughout the country. For
these reasons the evidence presented in this paper should be of interest not only to
scholars of science but also to policy decision makers.
Data description
National Research Council – CNR (Italy)
We constructed an original dataset by integrating three official documents produced
by CNR in recent years:
• Report on the CNR scientific activity in 1997 (published in 1998);
• Report on the CNR Personnel in 1997 (internal documentation);
• Report on the CNR European research funding.
The integration of these data was not a trivial task. The documentation on personnel
gives biographical data on individual researchers, technicians and administrators,
together with the CNR affiliation in 1997. We assigned all reported individuals to
institutes and integrated these data (input data) with those reported in the official
Report, which include both input data and output data. Input data include, for example,
research funds, funds from external sources or total costs while output data include total
number of publications and number of international publications. Interestingly, the
Report does not include data on personnel by institute. In practice, until now there was
no official document that gave the opportunity to merge the information on scientific
production with information on the structure of research units.
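The integration step described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the record layout, the role labels and the institute identifiers (`I1`, `I2`) are hypothetical, and only the variable names T_RES, TECH, ADM, T_PERS, T_PUB and INTPUB come from the paper.

```python
from collections import Counter

# Hypothetical personnel records (one per person) and report rows (one per institute).
SAMPLE_PERSONNEL = [
    {"role": "researcher", "institute": "I1"},
    {"role": "researcher", "institute": "I1"},
    {"role": "technician", "institute": "I1"},
    {"role": "researcher", "institute": "I2"},
    {"role": "administrator", "institute": "I2"},
]
SAMPLE_REPORT = {
    "I1": {"T_PUB": 40, "INTPUB": 25},
    "I2": {"T_PUB": 12, "INTPUB": 6},
}


def merge_personnel_with_report(personnel, report):
    """Count personnel by role within each institute and attach the counts
    to the output data reported for that institute."""
    counts = {}
    for person in personnel:
        counts.setdefault(person["institute"], Counter())[person["role"]] += 1

    merged = {}
    for institute, outputs in report.items():
        roles = counts.get(institute, Counter())
        row = dict(outputs)                   # output data from the Report
        row["T_RES"] = roles["researcher"]    # input data from personnel records
        row["TECH"] = roles["technician"]
        row["ADM"] = roles["administrator"]
        row["T_PERS"] = sum(roles.values())
        merged[institute] = row
    return merged


merged = merge_personnel_with_report(SAMPLE_PERSONNEL, SAMPLE_REPORT)
```

Keying both sources on the institute makes the unit of analysis explicit: every later indicator is computed per institute, never per person.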
The research areas considered in the analysis are listed in Table 1.
Table 1. Research areas
Code   Research area
A1     Agriculture
A2     Environment and habitat
A3     Biotechnologies and molecular biology
A4     Chemistry
A5     Economics, sociology and statistics
A6     Physics
A7     Geology and mining
A8     Engineering and architecture
A9     Innovation and technology
A10    Mathematics
A11    Medicine and biology
A12    Law and Politics
A13    History, philosophy and philology
In order to conduct the analysis by areas with a sufficient number of observations
we carried out the following consolidation, taking into account broad disciplinary
fields from the academic tradition (see Table 2):
• Environment and habitat together with Geology and mining;
• Biotechnologies and molecular biology together with Medicine and biology;
• Engineering and architecture together with Innovation and technology.
These aggregations follow the Italian academic tradition, in which these disciplines
are taught together in the same schools or polytechnics. In recent years (2003-2004),
CNR has started a major internal restructuring, leading to the creation of large research
areas, comprising several institutes. Interestingly, the aggregation adopted by CNR
corresponds to the one adopted here, with two minor differences: we separate
Chemistry and Physics (which in the restructuring of CNR are considered part of the
Basic Science area), and we keep Agriculture separated from other Life Sciences.
Institutes in Mathematics (A10), Law and Politics (A12) and History, philosophy
and philology (A13) have been excluded from the analysis.
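The consolidation can be written down as a simple lookup, sketched below. The area and aggregation codes are those of Tables 1 and 2; the function name and the `None` return value are illustrative choices. Area A5 (Economics, sociology and statistics) appears neither in Table 2 nor among the explicitly excluded areas, so the sketch leaves it unmapped rather than guessing.

```python
# Mapping from the research areas of Table 1 to the aggregations of Table 2.
AREA_TO_AGGREGATION = {
    "A1": "MA1",                 # Agriculture
    "A2": "MA2", "A7": "MA2",    # Environment and habitat + Geology and mining
    "A3": "MA3", "A11": "MA3",   # Biotechnologies + Medicine and biology
    "A4": "MA4",                 # Chemistry
    "A6": "MA5",                 # Physics
    "A8": "MA6", "A9": "MA6",    # Engineering and architecture + Innovation
}

# Areas explicitly excluded from the analysis in the text.
EXCLUDED_AREAS = {"A10", "A12", "A13"}


def aggregation_for(area_code):
    """Return the aggregation code for a research area, or None when the
    area is excluded or not covered by Table 2 (as is the case for A5)."""
    if area_code in EXCLUDED_AREAS:
        return None
    return AREA_TO_AGGREGATION.get(area_code)
```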
Table 2. Aggregation of research areas
Aggregation   Corresponding research area                                       No. of obs.
MA1           Agriculture                                                       24
MA2           Environment and habitat and Geology and mining                    26
MA3           Biotechnologies and molecular biology and Medicine and biology    27
MA4           Chemistry                                                         26
MA5           Physics                                                           28
MA6           Engineering and architecture and Innovation and technology        31
Table 3. Variables in the dataset (all variables refer to CNR institutes)

a) Size indicators
Variable    Definition
T_PERS      Total number of personnel
RESFUN      Total research funds
T_COS       Total costs
LABCOS      Labour costs

b) Personnel indicators
Variable    Definition
T_RES       Total number of Researchers
TECH        Number of Technicians
ADM         Number of Administrative Staff
ORD_RES     Number of Researchers
SEN_RES     Number of Senior Researchers
DIR_RES     Number of Research Directors

c) Scientific Productivity indicators
Variable    Definition
T_PUB       Total number of publications
P_INTPUB    Percent international publications
INTPUB      Number of International Publications
PUBPERS     Publications per capita
IPUPERS     International Publications per capita
PUBRES      Publications per researcher
IPURES      International Publications per researcher

d) Other indicators
Variable    Definition
P_MARFUN    Percent of funds raised from the market
P_INV       Percent of Total costs allocated to investment
COPUB       Cost per publication
COPUBINT    Cost per international publication
AVIM        Average Impact factor
GAI         Geographical Agglomeration Index

Source: CNR Report (1998) and our elaboration
The list of variables considered in the analysis is reported in Table 3. These
variables have been selected with the goal of allowing a careful test of hypotheses
regarding the sign and magnitude of the impact of size and agglomeration effects on
various measures of scientific productivity (see the analysis discussed in the next
sections).
All variables refer to individual institutes. We strictly follow the definition of
variables described in the CNR Report but omit some variables not used in this paper
(e.g. age structure of researchers). Monetary variables are left in Italian lire (1 euro =
1936.27 lire). Manipulations of variables are described explicitly.
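The ratio indicators of Table 3 follow directly from the counts. The sketch below shows one plausible way to derive them. It is an assumption that the formulas are the plain ratios suggested by the variable definitions, and in particular that cost per publication uses total costs (T_COS) as the numerator. The function names are illustrative.

```python
LIRE_PER_EURO = 1936.27  # fixed conversion rate quoted in the text


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def productivity_indicators(inst):
    """Derive the ratio indicators of Table 3 from an institute's raw counts.

    `inst` is a dict with keys T_PUB, INTPUB, T_RES, T_PERS and T_COS.
    """
    p_intpub = safe_ratio(inst["INTPUB"], inst["T_PUB"])
    return {
        "PUBRES": safe_ratio(inst["T_PUB"], inst["T_RES"]),     # publications per researcher
        "IPURES": safe_ratio(inst["INTPUB"], inst["T_RES"]),    # international pubs per researcher
        "PUBPERS": safe_ratio(inst["T_PUB"], inst["T_PERS"]),   # publications per capita
        "IPUPERS": safe_ratio(inst["INTPUB"], inst["T_PERS"]),  # international pubs per capita
        "P_INTPUB": None if p_intpub is None else 100.0 * p_intpub,
        "COPUB": safe_ratio(inst["T_COS"], inst["T_PUB"]),      # assumed: total costs / publications
        "COPUBINT": safe_ratio(inst["T_COS"], inst["INTPUB"]),
    }


def lire_to_euro(amount_in_lire):
    """Convert an amount in Italian lire to euro at the fixed rate."""
    return amount_in_lire / LIRE_PER_EURO
```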
INSERM (France)
The INSERM database collects data on the number of researchers and publications
of the INSERM institutes in 1997. The sample is based on 213 observations, which is
almost the entire universe of institutes. We were able to access data on institutes by
visiting systematically websites and by addressing a mail survey to directors in 1999.
Although data refer to one year only, they offer a comprehensive view of the activity of
a large part of the French biomedical research system.
The number of researchers is divided into three categories (INSERM researchers,
researchers from hospital and university, other researchers). In addition, post-doc
students (boursier) and technical-administrative personnel are included. For all institutes
we define a geographical classification (see later). Although the INSERM dataset is less
rich than the CNR dataset, a subset of variables is common to both.
The definition of variables is reported in Table 4.
Table 4. Variables in the dataset (all variables refer to INSERM institutes)

a) Size indicators
Variable    Definition
T_RES       Total number of researchers
T_PERS      Total number of personnel
TA_PERS     Technical and administrative personnel
INS_RES     INSERM researchers
OTH_RES     Other researchers
HU_RES      Hospital/university researchers
BORS        Doc and post-doc students or scholarship holders (boursier)

b) Scientific productivity and agglomeration indicators
Variable    Definition
INTPUB      Number of International Publications
IPUPERS     International Publications per capita
IPURES      International Publications per researcher
GAI         Geographical agglomeration index

Source: our elaboration on websites and electronic survey
Limitations of data
The limitations of the two datasets should not be underestimated.
First of all, data refer to just one year for both CNR and INSERM. In the literature
on bibliometrics and the economics of science it is well known that data on scientific
publications should be averaged over some years, in order to take into account the
inherent variability of the phenomenon over time.
All in all, the size of the two samples is so large and the aggregation by institute so
fine that a picture over one year can still be considered reliable, at least with regards to
broad patterns.
Second, to carry out a meaningful analysis one must have data or proxies for all
influencing variables. In the analysis of scientific productivity most studies,
including this paper, do not use any proxy for capital equipment, and this constitutes a serious
limitation of the analysis.
Third, we take as a definition of scientific production the number of total and
international publications. For this research we had no access to data on individual
publications, nor could we control for citations of CNR and INSERM publications.∗ In
addition, we recognise that the output of activity is not limited to scientific publications
but also includes patents, consulting, technology transfer to industry, hospitals and
public administration in general, and, to a limited extent, teaching and the creation of
spin-off companies. We do not have data on these joint outputs and are forced to stick
to a view of output as represented by publications. However, we believe that the view
that the main institutional output of CNR and INSERM should be scientific publications
is fundamentally correct.
Finally, all variables refer to individual institutes. No evidence is available on
research teams and laboratories within institutes. This limitation should be clearly taken
into account in examining the results.
Size effects: Does scientific productivity depend on size of institutes?
Evidence from CNR
We want to test the hypothesis that average scientific productivity of researchers is
positively influenced by the size of the institute to which they are affiliated. We
computed Pearson correlation coefficients between pairs of variables.∗∗ Because we
have to test a clear H0, a very simple correlation analysis is sufficient. Our aim is
not to build a model of scientific productivity, for which data on all inputs should be
included. More modestly (but more correctly from a methodological point of view), we
work on the pars destruens, trying to demonstrate that assumed effects in scientific
research quite simply fail to meet even the weakest statistical test.
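The test used here can be sketched in a few lines. The code below is an illustrative implementation of the Pearson coefficient and of its two-tailed t-test, not the statistical package used by the authors. The critical value of Student's t is supplied by the caller; the figure of roughly 1.97 at the 0.05 level with 185 degrees of freedom is an approximate value assumed for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no linear association, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def critical_r(n, t_crit):
    """Smallest |r| that is significant for sample size n, given the
    critical value t_crit of Student's t with n - 2 degrees of freedom."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))
```

With 187 institutes and a critical t of about 1.97, `critical_r(187, 1.97)` is close to 0.14, so under these assumptions coefficients smaller than that in absolute value would not be flagged at the 0.05 level.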
∗ Further research is currently under way with the objective of using measures of individual productivity of
scientists and relating them to productivity at the level of institutes.
∗∗ In a related paper with an explicit comparative approach (BONACCORSI & DARAIO, 2003a) we use Data
Envelopment Analysis (DEA), Free Disposal Hull (FDH) and robust nonparametric techniques (order-m
frontiers), which do not require a functional specification; see also BONACCORSI & DARAIO (2003b).
Table 5. Correlation between size of institutes and indicators of scientific output and productivity
at CNR institutes
Variable   T_PUB      P_INTPUB   INTPUB     IPURES     IPUPERS    PUBRES
T_RES      0.759**    0.081      0.743**    –0.191**   –0.236**   –0.255**
T_PERS     0.722**    0.020      0.684**    –0.193**   –0.286**   –0.230**
ADM        0.581**    –0.136     0.457**    –0.180*    –0.269**   –0.122
TECH       0.620**    –0.003     0.586**    –0.171*    –0.289**   –0.197**
ORD_RES    0.640**    0.117      0.635**    –0.193**   –0.218**   –0.272**
SEN_RES    0.634**    –0.003     0.602**    –0.144*    –0.182*    –0.168*
DIR_RES    0.587**    0.100      –0.597**   –0.092     –0.163*    –0.142
LABCOS     0.734**    0.035      0.705**    –0.177*    –0.263**   –0.216**
T_COS      0.707**    0.041      0.702**    –0.111     –0.197**   –0.147*
RESFUN     0.539**    0.039      0.560**    –0.021     –0.089     –0.048
** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).
Table 6. Correlation between size of institutes and indicators of cost, impact factor,
and market funds at CNR institutes
Variable   COPUB      COPUBINT   AVIM       P_MARFUN
T_RES      0.116      0.022      0.073      0.043
T_PERS     0.180*     0.091      0.024      0.079
ADM        0.122      0.102      –0.035     –0.079
TECH       0.217**    0.131      –0.006     0.126
ORD_RES    0.121      –0.002     0.061      0.189**
SEN_RES    0.075      0.055      0.043      –0.104
DIR_RES    0.075      –0.017     0.106      –0.064
LABCOS     0.168*     0.077      0.046      0.063
T_COS      0.180*     0.081      0.031      0.281**
RESFUN     0.156*     0.069      0.010      0.448**
** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).
The results are reported in Tables 5 and 6 for correlations on the whole CNR, and in
Appendix B for correlations by Research Area. They are quite clear:
• in no scientific area is size positively correlated with productivity;
• in 3 out of 6 large scientific areas (chemistry, environment, physics) size as
measured by total number of researchers is negatively and significantly
correlated with productivity (number of international publications per researcher);
• in 4 out of 6 areas (agriculture, environment, chemistry, physics) size as
measured by total number of personnel is negatively and significantly correlated
with productivity (number of international publications per unit of personnel);
• in two areas in which indivisibility and large infrastructures may be at stake
(i.e. medicine and engineering) the relation is not statistically significant,
but it nevertheless has a negative sign;
• contrary to the common wisdom, in almost all areas the most productive
institutes are not found in the largest size classes, but in the small ones.
In agriculture, environment, chemistry and physics, the most productive institutes
have 5-6 researchers. In medicine and biology one of the stars has around 10
researchers, but highly productive institutes can also be found in the range 5-10
researchers.
In general, there is no positive relation between size and productivity. Although the
most productive institutes are likely to be found in small size classes, the least
productive are spread across all sizes.
Interestingly, the distributions of cost per publication and cost per international
publication are again highly skewed. We are interested in checking whether the highly
productive institutes are also those that spend more per publication. Clearly, if such a
relation held, then a possible explanation for higher productivity would not lie in
organizational factors or in the quality of the scientific environment, but rather in
greater access to funds, complementary personnel, or external resources.
The opposite holds true. Highly productive institutes spend fewer resources than less
productive ones (see Appendix B). Higher scientific productivity does not stem from a greater
consumption/utilization of resources.
Pearson coefficients clearly give only a rough global measure of association. They
are all that is needed to reject the notion of global economies of scale in science. We are
also interested, however, in exploring local effects that may be valid for a region within
the interval of the relevant independent variable. Rather than applying standard regression
tools we use a nonparametric technique. The methodological choice is consistent with
the notion that production functions, and hence the standard parametric regression-based econometric toolbox, suffer from severe conceptual problems and cannot be
accepted for the economic analysis of science (BONACCORSI & DARAIO, 2004).
Therefore we apply a Locally weighted least-squares (Loess) technique (see
CLEVELAND, 1993; 1994). This is a local nonparametric regression technique based on
a generalization of running means. The technique gets a predicted value at each point
by fitting a weighted linear regression, in which the weights decrease with the Euclidean
distance from the point of interest. Connecting these predicted values produces a
smooth curve. This method is interesting because it shows the existence of local effects
in the causal relation between variables, that would be overlooked by an average pattern
in a standard parametric regression framework and clearly cannot be detected by simply
using Pearson coefficients. In addition, Loess techniques provide a useful graphical
representation, which facilitates a visual identification of local patterns. A locally
weighted least-squares regression is hence used to obtain smoothed values on a scatter
plot of the associated points of value of y, given the values for x (see Figure 1 and
Appendix A).
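The procedure just described can be illustrated with a minimal sketch. It assumes a tricube weight function and a nearest-neighbour bandwidth controlled by a `span` parameter, which are common choices in Cleveland's formulation; the settings actually used for the figures are not reported in the text, and the robustness iterations of full Loess are omitted.

```python
import math


def tricube(u):
    """Tricube kernel: weight decreases from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess(xs, ys, span=0.5):
    """Return smoothed values of ys at each point of xs using locally
    weighted linear regression (no robustness iterations)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    k = min(n, max(2, math.ceil(span * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        if h == 0.0:
            # All neighbours coincide with x0: fall back to their mean.
            same = [y for x, y in zip(xs, ys) if x == x0]
            fitted.append(sum(same) / len(same))
            continue
        w = [tricube((x - x0) / h) for x in xs]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # degenerate case: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted
```

Plotting the fitted values against institute size would give a curve of the kind shown in the figures; a larger `span` produces a smoother curve that hides more of the local structure.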
In Figures 1a and b the x axis shows the size of the institute in terms of researchers
and total personnel, respectively, and the y axis the productivity of researchers and of
total personnel, respectively, in terms of international publications. A visual inspection
of the plots shows that the initial interval is characterized by slightly decreasing returns
to scale, while the rest of the size distribution is characterized by constant returns
almost everywhere. In no region can we see segments of the plot showing increasing
returns to scale.
Figure 1 Loess plots of size vs. productivity indicators – whole CNR (187 Institutes)
a) Size (T_RES) vs. productivity indicators (IPURES)
b) Size (T_PERS) vs. productivity indicators (IPUPERS)
Evidence from INSERM
The same methodological approach was followed for the French INSERM.
Table 7 shows simple correlation coefficients between several indicators of size and
productivity indicators based on categories of personnel.
The results show that size effects are weakly negative for all researchers (T_RES)
and all units of personnel (T_PERS). Total output grows linearly with all categories of
personnel.
Figure 2 shows the Locally weighted least-squares (Loess) curve fitting of
productivity indicators versus size variables. Again, the visual inspection of Figures 2a
and b shows that there is no region in the interval of size variables (number of
researchers, or number of personnel) in which increasing returns emerge.
Figure 2. Loess plots of size vs. productivity indicators – INSERM
a) Size (T_RES) vs. productivity indicators (IPURES)
b) Size (T_PERS) vs. productivity indicators (IPUPERS)
Table 7. Correlation between size of institutes and indicators
of scientific output and productivity at INSERM institutes
Variable   INTPUB     IPUPERS    IPURES
T_RES      0.547**    0.056      –0.172*
T_PERS     0.585**    –0.021     0.012
TA_PERS    0.499**    0.006      0.119
INS_RES    0.214**    –0.015     0.007
OTH_RES    0.387**    0.046      –0.162*
HU_RES     0.385**    0.054      –0.123
BORS       0.339**    –0.126     0.073
** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).
Discussion of results
These results go directly against much of received wisdom in science policy
making. To put it simply, there is no evidence of the existence and importance of
increasing returns to scale in scientific research at the level of the institute. On the contrary,
there is evidence of weak decreasing returns.
Policies aimed at consolidating institutes or policies of concentration of funds on
large institutes might be justified on grounds of cost savings in administrative staff, but
could have no justification with respect to the impact on scientific production.
More precisely, we propose that the level at which increasing returns apply is not
the institute, but the research team. At this level factors such as the access to physical
capital, the number of complementary scientific competencies and the extent of division
of labour significantly influence scientific productivity. Although there is only
preliminary evidence on this effect,∗ we draw attention to the possibility that most
policy discussions on critical mass and concentration of resources may be directed to
the wrong target. It is not the administrative unit that matters, but the team and the
laboratory.
While in a few scientific fields research teams are defined around large physical
infrastructures, so that administrative units and teams largely overlap, in most fields this
is not the case. Pursuing a policy of concentration into larger institutes may miss the
point, unless institutes adopt an internal policy of rewarding scientific excellence of
teams by selectively allocating the internal resources.
∗ A formal test of the effect of team size on productivity would require micro-data that are extremely difficult
to collect. We have very preliminary evidence, based on a subset of INSERM institutes for which we have
data on the number and size of teams (n=72). Having controlled that this subset is not significantly different
from the rest of the sample, we ran correlation analyses between productivity (PUB_RES) and size of the institute
(T_RES) and size of the team, respectively. Interestingly, the Pearson coefficient is positive and significant for
the size of the team.
Even worse, there is the possibility that the concentration of institutes reduces
productivity. If there are regions in the size interval where decreasing returns to scale
apply, it is possible that consolidation moves ‘efficient’ institutes into regions of lower
efficiency. This possibility would be overlooked by considering only average relations
between size and productivity.
In any case, a policy of consolidation of institutes makes sense if and only if it is
associated with a policy for promoting adequate size of research teams and laboratories.
However, promoting the adequate size of teams and laboratories requires a policy
of recognition of scientific talent whenever and wherever it is demonstrated, which
almost invariably means without formal central planning. Policy makers and
administrators of large public research institutions feel more confident with
discretionary planning than with the recognition of quality. Building large institutes is politically
easier than allowing promising teams to grow whatever their institute.
Agglomeration effects: Does scientific productivity depend
on geographical concentration of institutes?
Evidence from CNR
To account for the influence of proximity between research institutes we constructed
the Geographical Agglomeration Index (GAI) as follows. To each institute we assigned
one point for each other CNR institute located in the same city that is not of the same
research aggregation; and two points for each other CNR institute located in the same
city that is also of the same research aggregation of the institute considered. Then we
obtained a GAI that ranges from 39 to 1, varying between 39 and 33 for the institutes
located in Rome, from 23 to 20 for the institutes located in Naples, from 16 to 14 for the
institutes located in Pisa and so on. An institute has a GAI of 1 if it is the only CNR
institute in its own town.
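The construction of the index can be sketched as follows. The scoring rule (one point for each other institute in the same city, two points if it also belongs to the same research aggregation) is taken from the text. The baseline of 1 is an assumption, introduced so that an institute that is alone in its town scores 1 as stated above; the record layout and the city labels in the example are hypothetical.

```python
def geographical_agglomeration_index(institutes):
    """Compute the GAI for each institute.

    `institutes` is a list of dicts with keys 'id', 'city' and 'aggregation'.
    Returns a dict mapping institute id to its GAI.
    """
    gai = {}
    for inst in institutes:
        score = 1  # assumed baseline: an isolated institute has a GAI of 1
        for other in institutes:
            if other is inst or other["city"] != inst["city"]:
                continue
            # Same city: 2 points if same research aggregation, 1 point otherwise.
            score += 2 if other["aggregation"] == inst["aggregation"] else 1
        gai[inst["id"]] = score
    return gai


# Hypothetical example: three institutes share a city, one is isolated.
example_institutes = [
    {"id": "a", "city": "CityX", "aggregation": "MA5"},
    {"id": "b", "city": "CityX", "aggregation": "MA5"},
    {"id": "c", "city": "CityX", "aggregation": "MA4"},
    {"id": "d", "city": "CityY", "aggregation": "MA5"},
]
example_gai = geographical_agglomeration_index(example_institutes)
```

Because same-aggregation neighbours count double, institutes in one city can have different scores, which is consistent with the ranges reported above for each city.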
Then we tested the existence of a relation between GAI and several measures of
scientific productivity. Results are shown in Table 8. See Appendix B for results per
research area.
As is clear from Table 8, there is no evidence that institutes that benefit from a
strong agglomeration effect have higher productivity.
Table 8. Correlation between GAI and indicators of scientific productivity, whole CNR
Variable    GAI
IPURES      0.051
IPUPERS     –0.012
PUB_PERS    –0.005
PUB_RES     0.068
INTPUB      0.151*
** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).
Evidence from INSERM
In the case of INSERM we construct the Geographical Agglomeration Index (GAI)
in the same way as for CNR, assigning a score of 2 for institutes in the same category
within biomedical research. The absolute level of GAI is clearly not comparable
between INSERM and CNR, but this does not affect the results. Correlations between
GAI and several productivity indicators are shown in Table 9.
Table 9. Correlation between GAI and indicators of scientific productivity, INSERM
Variable    GAI
IPURES      0.150*
IPUPERS     0.217**
INTPUB      0.161*
** Pearson Correlation is significant at the 0.01 level (2-tailed).
* Pearson Correlation is significant at the 0.05 level (2-tailed).
In this case we find evidence of a positive effect, although it is not strong. It seems
that institutes located in the same area are more productive, while isolated institutes
suffer.
Discussion of results
The combined evidence on the impact of agglomeration on scientific productivity is
mixed. Most productive institutes at CNR are not necessarily located close to other
institutes. At the same time, isolated institutes at INSERM are penalised in their
productivity.
A possible explanation of the observed effect is in the difference in the institutional
linkage with universities. In the French system, large public research organisations such
as CNRS, INSERM or INRA were put only during the 1990s in systematic relation with
universities, through the creation of joint institutes, exchange of researchers and the
like. In the Italian CNR, on the contrary, the linkage with universities has historically
been very strong. This means that an institute located outside a CNR Research Area but
close to a good university may benefit from positive effects, while this is more difficult
for INSERM institutes.
Given that CNR data cover many scientific sectors, we tend to give them more
weight in balancing the evidence.
Summing up, the evidence does not support the received wisdom that agglomeration
per se is positive. It reveals a conceptual flaw in the argument of agglomeration: it is
not agglomeration that induces scientific productivity, but rather the quality of research
that attracts other scientists and induces agglomeration effects.
Conclusions
We found no support at all for size effects and no strong support for agglomeration
effects. Although in some scientific fields local effects due to scale and agglomeration
have been identified (see Appendix A for the graphical inspection of these local effects in
the CNR case, at disaggregated level), they are clearly restricted to small regions in the
size interval. As is clearly shown by the plots, no general pattern emerges from data
supporting the scale and agglomeration effects. The argument that scientific
productivity is favoured by concentration of resources into larger institutes, and
geographical agglomeration of institutes in the same area does not receive empirical
support.
If this is the case, why are policies aimed at critical mass, concentration and
agglomeration so widespread? A possible interpretation is that decisions about the size
and the location of institutes are among the few in which a full discretionary power of
politicians, government officials and public research central bureaucracies can be
exercised. Deciding where to locate new institutes and how large they must be is a
source of significant power, that can be shared among interested parties (politicians,
administrators, scientists).
A more benevolent interpretation is that the top management of large public
research organisations face strong pressures for bringing research activities into new
regions, particularly less developed regions. Having a strong argument in favour of
concentration and agglomeration may help to resist fragmentation tendencies.
It must be stressed again that policies aimed at concentration and agglomeration
may have (and indeed often have) strong merits from the point of view of administrative
and organisational efficiency.
Unfortunately, the causal mechanism implicitly assumed in these policies does not
hold from an empirical point of view. These policies should always be pursued with a
clear view to the need to promote scientific excellence, whatever the size and location
involved. Evidence-based science policy should consider carefully these points.
* Part of the evidence of this paper has been presented at the conference Rethinking Science Policy, held at
the SPRU (Brighton, 21-23 March, 2002), at the 7th International Science and Technology Indicators
Conference (Karlsruhe, 25-28 September 2002), at seminars at ISPRI-CNR (Rome) and INSERM (Marseille)
and further developed within the AQuaMethPSR (Advanced Quantitative Methods for the evaluation of
Public Sector Research) project under the PRIME Network of Excellence, 6th Framework Programme. We
thank participants for stimulating comments. We would like to thank Marco Brancher for assistance in
building the database. Work partially supported by the Italian Registry of ccTLD.it. We gratefully
acknowledge the helpful suggestions of two anonymous referees. The usual disclaimers apply.
References
ACS, Z. J., AUDRETSCH, D. B., FELDMAN, M. P. (1992), Real effects of academic research. Comment,
American Economic Review, 82 (1) : 363–367.
ACS, Z. (Ed.) (2000), Regional Innovation, Knowledge and Global Change. London, Pinter.
ADAMS, J. D., GRILICHES, Z. (2000), Research productivity in a system of universities, In: D. ENCAOUA et al.
(Eds), The Economics and Econometrics of Innovation, Kluwer, Dordrecht, pp. 105–140.
ALMEIDA, P., KOGUT, B. (1999), Localization of knowledge and the mobility of engineers in regional
networks, Management Science, 45 (7) : 905–917.
AUDRETSCH, D. B., FELDMAN, M. P. (1996), R&D spillovers and the geography of innovation and production,
American Economic Review, 86 (3) : 630–640.
AUTANT-BERNARD, C. (2001), Science and knowledge flows; Evidence from the French case, Research
Policy, 30 : 1069–1078.
BONACCORSI, A., DARAIO, C. (2003a), A robust nonparametric approach to the analysis of scientific
productivity, Research Evaluation, 12 (1) : 47–69.
BONACCORSI, A., DARAIO, C. (2003b), Age effects in scientific productivity. The case of the Italian National
Research Council (CNR), Scientometrics, 58 : 47–88.
BONACCORSI, A., DARAIO, C. (2004), Econometric approaches to the analysis of productivity of R&D
systems. Production functions and production frontiers, In: H. F. MOED, W. GLÄNZEL, U. SCHMOCH
(Eds), Handbook of Quantitative Science and Technology Research, Kluwer, Dordrecht, pp. 51–74.
BOTTAZZI, L., PERI, G. (2003), Innovation and spillovers in regions: Evidence from European patent data,
European Economic Review, 47 (4) : 687–710.
BRESCHI, S., LISSONI, F. (2004), Knowledge networks from patent data, In: H. F. MOED, W. GLÄNZEL,
U. SCHMOCH (Eds), Handbook of Quantitative Science and Technology Research, Kluwer, Dordrecht,
pp. 613–644.
BRINKMAN, P. T. (1981), Factors affecting instructional costs at major research universities, Journal of
Higher Education, 52 : 265–279.
BRINKMAN, P. T., LESLIE, L. L. (1986), Economies of scale in higher education: Sixty years of research, The
Review of Higher Education, 10 (1) : 1–28.
CASTELLS, M., HALL, P. (1994), Technopoles of the World. The Making of the 21st Century Industrial
Complexes. London, Routledge.
CLEVELAND, W. S. (1993), Visualizing Data, Hobart Press, New Jersey.
CLEVELAND, W. S. (1994), The Elements of Graphing Data, Hobart Press, New Jersey.
COHEN, J. E. (1991), Size, age and productivity of scientific and technical research groups, Scientometrics,
20 : 395–416.
COHN, E., RHINE, S. L. W., SANTOS, M. C. (1989), Institutions of higher education as multi-product firms:
Economies of scale and scope, Review of Economics and Statistics, 71 (May) : 284–290.
COLE, S., COLE, J., SIMON, G. (1981), Change and consensus in peer review, Science, 214 : 881–886.
COOKE, P., MORGAN, K. (1998), The Associational Economy. Firms, Regions and Innovation. Oxford,
Oxford University Press.
COWAN, R., DAVID, P. A., FORAY, D. (2000), The explicit economics of knowledge codification and tacitness,
Industrial and Corporate Change, 9 : 211–254.
DE GROOT, H., MCMAHON, W. W., VOLKWEIN, J. F. (1991), The cost structure of American research
universities, Review of Economics and Statistics, 424–451.
GETZ, M., SIEGFRIED, J. J., ZHANG, H. (1991), Estimating economies of scale in higher education, Economics
Letters, 37 : 203–208.
HALSEY, A. H. (1980), Higher Education in Britain – A Study of University and Polytechnic Teachers, Final
report on SSRC Grant.
HURLEY, J. (1997), Organisation and Scientific Discovery, Wiley, Chichester NY.
HUSSLER, C., RONDE, P. (2004), When cognitive communities check the diffusion of academic knowledge:
Evidence from the networks of inventors of a French university, paper presented at the Workshop
The Empirical Economic Analysis of the Academic Sphere, March 17, 2004, BETA, Univ. Louis Pasteur,
Strasbourg (France).
JAFFE, A. B. (1989), Real effects of academic research, American Economic Review, 79 (5) : 957–970.
JAFFE, A. B., TRAJTENBERG, M., HENDERSON, R. (1993), Geographic localization of knowledge spillovers as
evidenced by patent citations, Quarterly Journal of Economics, 108 : 577–598.
JOHNSTON, R. (1994), Effects of resource concentration on research performance, Higher Education,
28 (1) : 25–37.
KRETSCHMER, H. (1985), Cooperation structure, group size and productivity in research groups,
Scientometrics, 7 (1-2) : 39–53.
KRUGMAN, P. (1991), Increasing returns and economic geography, Journal of Political Economy, 99
(3) : 483–499.
KYVIK, S. (1995), Are big university departments better than small ones? Higher Education, 30
(3) : 295–304.
LAREDO, P., MUSTAR, P. (Eds) (2001), Research and Innovation Policies in the New Global Economy. An
International Comparative Analysis, Edward Elgar.
LATOUR, B., WOOLGAR, S. (1979), Laboratory Life, Sage, London.
LINK, A. N. (1996), Economic performance measures for evaluating government sponsored research,
Scientometrics, 36 : 325–342.
LLOYD, P., MORGAN, M., WILLIAMS, R. (1993), Amalgamations of universities: Are there economies of size
and scope? Applied Economics, 25 : 1081–1092.
MARSHALL, A. (1920), Principles of Economics, London, MacMillan.
MARTIN, S. (2002), Advanced Industrial Economics, Blackwell Publishers, Malden.
MILGROM, P., ROBERTS, J. (1992), Economics, Organization and Management, Prentice Hall, Englewood
Cliffs.
NARIN, F., HAMILTON, K. S. (1996), Bibliometric performance measures, Scientometrics, 36 : 293–310.
NELSON, R., HEVERT, K. T. (1992), Effect of class size on economies of scale and marginal costs in higher
education, Applied Economics, 24 : 473–482.
PRATTEN, C. F. (1971), Economies of Scale in Manufacturing Industry, Cambridge University Press,
Cambridge.
PYKE, F., BECATTINI, G., SENGENBERGER, W. (1986), Industrial Districts and Inter-Firm Co-operation in
Italy, International Labour Office, Geneva.
QURASHI, M. M. (1991), Publication-rate and size of two prolific research groups in departments of
inorganic-chemistry at Dacca University (1944-1965) and zoology at Karachi University (1966-84),
Scientometrics, 20 (1) : 79–92.
QURASHI, M. M. (1993), Dependence of publication-rate on size of some university groups and departments
in UK and Greece in comparison with NCI, USA, Scientometrics, 27 : 19–38.
RAMSDEN, P. (1994), Describing and explaining research productivity, Higher Education, 28 : 207–226.
RESKIN, B. F. (1977), Scientific productivity and the reward structure of science, American Sociological
Review, 42 : 491–504.
ROSENTHAL, S. S., STRANGE, W. C. (2004), Evidence on the nature and sources of agglomeration economies,
In: J. V. HENDERSON, J. F. THISSE (Eds), Handbook of Urban and Regional Economics, Volume 4, New
York, North Holland.
SAXENIAN, A. (1996), Regional Advantage. Culture and Competition in Silicon Valley and Route 128.
Boston, Harvard University Press.
SCHERER, F. M. (1980), Industrial Market Structure and Economic Performance, Houghton Mifflin, Boston.
SCOTT, A. J. (Ed.) (2001), Global City-Regions. Oxford, Oxford University Press.
SEGLEN, P. O., AKSNES, D. W. (2000), Scientific productivity and group size: a bibliometric analysis of
Norwegian microbiological research, Scientometrics, 49 : 125–143.
SHINN, T. (1979), The French science faculty system, 1808-1914, Historical Studies in the Physical Sciences,
10.
SHINN, T. (1982), Scientific disciplines and organizational specificity, In: N. ELIAS et al. (Eds), Scientific
Establishment and Hierarchies, Sociology of Sciences Yearbook 6, Reidel, Dordrecht.
VON TUNZELMANN, N., RANGA, M., MARTIN, B., GEUNA, A. (2003), The Effects of Size on Research
Performance: A SPRU Review, Report prepared for the Office of Science and Technology, Department
of Trade and Industry.
WHITLEY, R. (1984), The Intellectual and Social Organization of the Sciences, Oxford University Press,
Oxford, second edition, 2000.
ZUCKER, L., DARBY, M., ARMSTRONG, J. (1998), Intellectual capital and the firm: The technology of
geographically localized knowledge spillovers, Economic Inquiry, 36 : 65–86.
Appendix A
Plots of size vs. productivity indicators by research area at CNR
Disaggregates for Figure 5
[Figure panels not reproduced here: a) Size (T_RES) vs. productivity indicator (IPURES); b) Size (T_PERS) vs. productivity indicator (IPUPERS).]
Appendix B
Correlations by research area at CNR
Disaggregates for Tables 5 and 6
Correlation between size of institutes and indicators of scientific output and productivity
a) MA1 Agriculture
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.704**   0.649**   0.395     0.524**   0.489*    0.444*    0.508*    0.617**   0.663**   0.676**
P_INTPUB  0.228     0.237     –0.092    0.260     0.081     0.114     0.406*    0.257     0.271     0.259
INTPUB    0.646**   0.596**   0.230     0.510*    0.403     0.392     0.605**   0.586**   0.630**   0.646**
IPURES    –0.252    –0.094    –0.187    0.043     –0.275    –0.155    0.069     –0.124    –0.084    0.066
IPUPERS   –0.253    –0.419*   –0.398    –0.429*   –0.084    –0.339    –0.147    –0.424*   –0.388    –0.185
PUBRES    –0.395    –0.224    –0.137    –0.089    –0.349    –0.226    –0.130    –0.274    –0.241    –0.081
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
b) MA2 Environment and habitat, Geology and Mining
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.849**   0.796**   0.684**   0.738**   0.751**   0.735**   0.635**   0.810**   0.832**   0.800**
P_INTPUB  –0.243    –0.301    –0.389*   –0.314    –0.229    –0.176    –0.232    –0.291    –0.267    –0.195
INTPUB    0.696**   0.617**   0.449*    0.560**   0.590**   0.650**   0.469*    0.634**   0.682**   0.712**
IPURES    –0.423*   –0.489*   –0.525**  –0.513**  –0.362    –0.376    –0.330    –0.475*   –0.443*   –0.340
IPUPERS   –0.442*   –0.514**  –0.523**  –0.549**  –0.390*   –0.391*   –0.312    –0.498**  –0.485*   –0.415*
PUBRES    –0.485*   –0.509    –0.464*   –0.516**  –0.436*   –0.417*   –0.345    –0.492*   –0.462*   –0.364
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
c) MA3 Biotechnologies and molecular biology, Medicine and Biology
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.757**   0.735**   0.818**   0.624**   0.652**   0.741**   0.509**   0.734**   0.672**   0.564**
P_INTPUB  –0.252    –0.185    –0.286    –0.099    –0.245    –0.268    –0.055    –0.170    –0.086    –0.030
INTPUB    0.843**   0.841**   0.720**   0.771**   0.745**   0.767**   0.606**   0.847**   0.831**   0.732**
IPURES    –0.296    –0.233    –0.124    –0.182    –0.339    –0.153    –0.186    –0.220    –0.107    –0.034
IPUPERS   –0.349    –0.341    –0.189    –0.327    –0.375    –0.215    –0.231    –0.318    –0.196    –0.107
PUBRES    –0.188    –0.148    0.055     –0.141    –0.242    –0.033    –0.148    –0.140    –0.068    –0.022
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
d) MA4 Chemistry
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.551**   0.544**   0.298     0.436*    0.232     0.541**   0.520**   0.580**   0.548**   0.410*
P_INTPUB  0.017     –0.060    –0.018    –0.145    –0.237    0.223     0.103     0.000     –0.005    –0.016
INTPUB    0.580**   0.547**   0.311     0.406*    0.155     0.647**   0.574**   0.608**   0.571**   0.418*
IPURES    –0.634**  –0.644**  –0.374    –0.535**  –0.562**  –0.391*   –0.454*   –0.628**  –0.605**  –0.482*
IPUPERS   –0.560**  –0.656**  –0.497**  –0.616**  –0.466*   –0.359    –0.439*   –0.632**  –0.601**  –0.458*
PUBRES    –0.625**  –0.619**  –0.353    –0.495*   –0.514**  –0.415*   –0.471*   –0.615**  –0.595**  –0.481*
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
e) MA5 Physics
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.801*    0.772**   0.767**   0.669**   0.598**   0.725**   0.736**   0.799**   0.803**   0.632**
P_INTPUB  –0.465*   –0.454*   –0.445*   –0.401*   –0.300    –0.459*   –0.438*   –0.474*   –0.411*   –0.159
INTPUB    0.772**   0.743**   0.752**   0.642**   0.596**   0.685**   0.698**   0.765**   0.788**   0.668**
IPURES    –0.402*   –0.374*   –0.280    –0.326    –0.386*   –0.322    –0.270    –0.358    –0.311    –0.122
IPUPERS   –0.457*   –0.523**  –0.435*   –0.546**  –0.428*   –0.342    –0.410*   –0.494**  –0.461*   –0.275
PUBRES    –0.229    –0.206    –0.091    –0.181    –0.266    –0.158    –0.107    –0.180    –0.151    –0.045
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
f) MA6 Engineering and architecture, Innovation and Technology
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
T_PUB     0.774**   0.704**   0.380*    0.637**   0.635**   0.779**   0.562**   0.730**   0.729**   0.665**
P_INTPUB  0.043     –0.082    –0.184    –0.132    0.128     –0.100    0.046     –0.081    –0.060    –0.022
INTPUB    0.705**   0.590**   0.249     0.512**   0.586**   0.681**   0.556**   0.616**   0.618**   0.568**
IPURES    –0.210    –0.229    –0.198    –0.221    –0.255    –0.135    –0.026    –0.215    –0.168    –0.075
IPUPERS   –0.230    –0.260    –0.239    –0.253    –0.236    –0.185    –0.103    –0.245    –0.207    –0.128
PUBRES    –0.272    –0.246    –0.156    –0.218    –0.343    –0.135    –0.095    –0.233    –0.193    –0.112
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
Correlation between size of institutes and indicators of cost, impact factor, and market funds
a) MA1 Agriculture
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.249     0.391     0.350     0.397     –0.011    0.386     0.303     0.467*    0.456*    0.322
COPUBINT  0.113     0.231     0.314     0.231     –0.014    0.254     0.040     0.286     0.279     0.197
AVIM      0.091     –0.017    –0.026    –0.077    0.148     –0.034    –0.010    –0.067    –0.059    –0.020
P_MARFUN  –0.305    –0.500*   –0.454*   –0.515*   0.050     –0.499*   –0.428*   –0.561**  –0.472*   –0.074
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
b) MA2 Environment and habitat, Geology and Mining
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.481*    0.554**   0.500**   0.604**   0.399*    0.444*    0.369     0.562**   0.575**   0.549**
COPUBINT  0.369     0.449*    0.488*    0.487*    0.338     0.269     0.380     0.443*    0.415*    0.324
AVIM      0.063     –0.022    –0.086    –0.078    0.087     0.053     –0.051    –0.004    0.014     0.046
P_MARFUN  –0.089    –0.101    –0.111    –0.104    0.086     –0.237    –0.163    –0.125    –0.003    0.233
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
c) MA3 Biotechnologies and molecular biology, Medicine and Biology
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.497**   0.529**   0.154     0.562**   0.522**   0.269     0.429*    0.526**   0.559**   0.516**
COPUBINT  0.380     0.382*    0.104     0.394*    0.428*    0.200     0.253     0.370     0.389*    0.357
AVIM      –0.007    –0.015    –0.079    –0.007    –0.070    –0.039    0.237     0.006     –0.004    –0.009
P_MARFUN  0.405*    0.423*    0.110     0.447*    0.469*    0.181     0.285     0.439*    0.725**   0.803**
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
d) MA4 Chemistry
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.463*    0.532**   0.267     0.522**   0.435*    0.239     0.382     0.510**   0.544**   0.565**
COPUBINT  0.430*    0.530**   0.248     0.563**   0.500**   0.133     0.340     0.489*    0.535**   0.589**
AVIM      0.035     0.009     0.094     –0.046    –0.374    0.366     0.163     0.095     –0.021    –0.299
P_MARFUN  –0.276    –0.246    –0.238    –0.140    0.046     –0.420*   –0.289    –0.307    –0.183    0.138
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
e) MA5 Physics
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.314     0.389*    0.205     0.447*    0.372     0.171     0.275     0.352     0.387*    0.387*
COPUBINT  0.505**   0.575**   0.360     0.614**   0.484**   0.368     0.457*    0.551**   0.547**   0.412*
AVIM      0.029     0.018     0.104     –0.007    –0.040    0.077     0.033     0.058     –0.018    –0.205
P_MARFUN  –0.260    –0.244    –0.398*   –0.182    –0.148    –0.281    –0.217    –0.260    –0.103    0.316
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
f) MA6 Engineering and architecture, Innovation and Technology
Variable  T_RES     T_PERS    ADM       TECH      ORD_RES   SEN_RES   DIR_RES   LABCOS    T_COS     RESFUN
COPUB     0.005     0.053     –0.008    0.084     0.120     –0.130    –0.094    0.015     0.019     0.025
COPUBINT  –0.097    0.003     –0.002    0.063     –0.040    –0.126    –0.154    –0.019    –0.012    0.002
AVIM      –0.074    –0.223    –0.250    –0.278    –0.108    0.038     –0.171    –0.219    –0.253    –0.284
P_MARFUN  0.028     0.060     –0.230    0.116     0.104     –0.017    –0.184    0.051     0.184     0.385*
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
Correlation between GAI and scientific productivity indicators (IPURES, IPUPERS, PUB_PERS,
PUB_RES, INTPUB)
GAI       MA1       MA2       MA3       MA4       MA5       MA6
IPURES    0.161     –0.188    0.354     –0.270    0.104     –0.100
IPUPERS   0.080     –0.160    0.342     –0.283    0.185     –0.142
PUB_PERS  0.101     –0.066    0.424*    –0.285    0.263     –0.177
PUB_RES   0.189     0.019     0.460*    –0.273    0.172     –0.184
INTPUB    –0.048    –0.177    0.256     0.174     –0.037    0.424*
* Pearson Correlation is significant at the 0.05 level (2-tailed).
** Pearson Correlation is significant at the 0.01 level (2-tailed).
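The asterisks in the tables of this appendix flag Pearson correlations that are significant at the 0.05 and 0.01 levels (2-tailed). As an illustration only, the sketch below shows one way such a flag may be computed. It assumes the conventional t-test of a zero correlation, t = r sqrt((n – 2)/(1 – r^2)) with n – 2 degrees of freedom; the paper does not state which software or procedure produced the tables. The function names (pearson_r, two_tailed_p, stars) and the sample data are hypothetical and are not taken from the CNR or INSERM databases.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p-value for H0: rho = 0 (assumed t-test, n - 2 df)."""
    df = n - 2
    if df < 1:
        raise ValueError("need at least 3 observations")
    if 1.0 - r * r <= 0.0:
        return 0.0  # perfect correlation
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Substituting t = sqrt(df) * tan(theta) turns the central area of the
    # Student t density into a bounded integral of cos(theta) ** (df - 1).
    theta_max = math.atan(t / math.sqrt(df))
    h = theta_max / steps  # steps must be even for Simpson's rule
    total = 1.0 + math.cos(theta_max) ** (df - 1)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(i * h) ** (df - 1)
    integral = total * h / 3.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(math.pi))
    central_area = 2.0 * math.exp(log_c) * integral
    return max(0.0, 1.0 - central_area)


def stars(p):
    """Significance flag in the notation used by the tables."""
    return "**" if p < 0.01 else "*" if p < 0.05 else ""


if __name__ == "__main__":
    # Hypothetical data: size and productivity of five fictitious institutes.
    size = [1, 2, 3, 4, 5]
    productivity = [2, 1, 4, 3, 5]
    r = pearson_r(size, productivity)
    p = two_tailed_p(r, len(size))
    print(f"r = {r:.3f}{stars(p)} (p = {p:.3f})")
```

With these five fictitious observations the coefficient is 0.8 and yet remains unflagged (p is roughly 0.10), which illustrates why, with a few dozen institutes per research area, only coefficients of about 0.4 or more in absolute value carry an asterisk in the tables.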