Abstract
The approximately seven billion hyperlinks on the WWW, and the anchortext surrounding them, represent a valuable collection of editorial information about web pages. We begin by discussing methods for incorporating this link information into web search. Next, we consider a follow-on question: is it possible to apply data mining techniques to the link structure of the web in order to discover all communities, including those that have only just formed and whose members may not yet be aware of one another?
We also consider modeling and measurement of this hyperlink structure. A recent analysis of the web graph indicates that the macroscopic structure is considerably more intricate than suggested by earlier experiments. We describe these results, and go on to discuss some progress towards defining analytical models for graphs such as the web.
The work described here is joint with Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Raymie Stata, Sridhar Rajagopalan, Eli Upfal and Janet Wiener.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tomkins, A. (2000). Hyperlink-Aware Mining and Analysis of the Web. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_3
DOI: https://doi.org/10.1007/3-540-45571-X_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive