Abstract.
In this paper we present an experimental study of the properties of web graphs. We study a large crawl from 2001 of 200M pages and about 1.4 billion edges made available by the WebBase project at Stanford [17]. We report our experimental findings on the topological properties of such graphs, such as the number of bipartite cores and the distribution of degree, PageRank values and strongly connected components.
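As a rough illustration of two of the measurements named above, the following minimal sketch computes an in-degree distribution and the sizes of the strongly connected components of a tiny hand-made directed graph. The toy edge list and the function names `degree_distribution` and `scc_sizes` are hypothetical, chosen only for this example; this is not the authors' code, and the in-memory approach shown here would not scale to a crawl of the size studied in the paper.

```python
from collections import Counter, defaultdict

def degree_distribution(edges, nodes):
    """Return {in-degree: number of nodes with that in-degree}."""
    indeg = Counter(v for _, v in edges)
    return Counter(indeg.get(n, 0) for n in nodes)

def scc_sizes(edges, nodes):
    """Sizes of strongly connected components, largest first.

    Uses an iterative form of Kosaraju's two-pass algorithm.
    """
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    seen, order = set(), []
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(fwd[s]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order;
    # each sweep collects exactly one component.
    seen, sizes = set(), []
    for s in reversed(order):
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Hypothetical toy graph: a 3-cycle, a 2-cycle reachable from it, and one source page.
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3), (5, 0)]

print(degree_distribution(edges, nodes))
print(scc_sizes(edges, nodes))
```

On a real crawl the same quantities would be computed with external-memory or streaming algorithms; the sketch only fixes what "degree distribution" and "strongly connected components" mean operationally.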
References
R. Albert, H. Jeong, A.L. Barabasi, Nature 401, 130 (1999)
A.L. Barabasi, R. Albert, Science 286, 509 (1999)
S. Brin, L. Page, Computer Networks and ISDN Systems 30, 107 (1998)
A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, S. Stata, A. Tomkins, J. Wiener, Computer Networks 33, 309 (2000)
C. Cooper, A. Frieze, A general model of undirected web graphs, in Proc. of the 9th Annual European Symposium on Algorithms (ESA), LNCS 2161 (Springer-Verlag, 2001), pp. 500-511
Cyvellance, http://www.cyvellance.com
P. Erdős, A. Rényi, Publ. Math. Inst. Hung. Acad. Sci. 5 (1960)
J. Kleinberg, J. ACM 46, 604 (1999)
R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, E. Upfal, Random graph models for the web graph, in Proc. of 41st FOCS, pp. 57-65, 2000
R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, Trawling the web for emerging cyber communities, in Proc. of the 8th WWW Conference, pp. 403 (1999)
L. Laura, S. Leonardi, G. Caldarelli, P. De Los Rios, A multi-layer model for the webgraph, in On-line proceedings of the 2nd International Workshop on Web Dynamics, 2002
L. Laura, S. Leonardi, S. Millozzi, A software library for generating and measuring massive webgraphs, Technical Report 05-03, DIS - University of Rome La Sapienza, 2003
L. Laura, S. Leonardi, S. Millozzi, U. Meyer, J.F. Sibeyn, Algorithms and experiments for the webgraph, in Proc. of the 11th Annual European Symposium on Algorithms (ESA), Vol. 2461 of Lecture Notes in Computer Science (Springer-Verlag, 2002)
M. Mitzenmacher, A Brief History of Generative Models for Power Law and Lognormal Distributions, Internet Mathematics 1 (2) (to appear)
G. Pandurangan, P. Raghavan, E. Upfal, Using PageRank to characterize web structure, in Proc. of the 8th Annual International Conference on Computing and Combinatorics (COCOON)
D.M. Pennock, G.W. Flake, S. Lawrence, E.J. Glover, C.L. Giles, Proc. National Ac. Sci. 99, 5207 (2002)
The Stanford WebBase Project, http://www-diglib.stanford.edu/~testbed/doc2/WebBase/
Additional information
Received: 5 December 2003, Published online: 30 March 2004
PACS:
89.20.Hh World Wide Web, Internet - 89.75.Fb Structures and organization in complex systems
Partially supported by the Future and Emerging Technologies programme of the EU under contracts number IST-2001-33555 COSIN “Co-evolution and Self-organization in Dynamical Networks” and IST-1999-14186 ALCOM-FT “Algorithms and Complexity in Future Technologies”, and by the Italian research project ALINWEB: “Algorithmica per Internet e per il Web”, MIUR - Programmi di Ricerca di Rilevante Interesse Nazionale.
Cite this article
Donato, D., Laura, L., Leonardi, S. et al. Large scale properties of the Webgraph. Eur. Phys. J. B 38, 239–243 (2004). https://doi.org/10.1140/epjb/e2004-00056-6
DOI: https://doi.org/10.1140/epjb/e2004-00056-6