DOI: 10.1145/2467696.2467721
research-article

Reading the correct history?: modeling temporal intention in resource sharing

Published: 22 July 2013

Abstract

The web is trapped in the "perpetual now", and when users traverse from page to page, they are seeing the state of the web resource (i.e., the page) as it exists at the time of the click and not necessarily at the time when the link was made. Thus, a temporal discrepancy can arise between the resource at the time the page author created a link to it and the time when a reader follows the link. This is especially important in the context of social media: the ease of sharing links in a tweet or Facebook post allows many people to author web content, but the space constraints combined with poor awareness by authors often prevent sufficient context from being generated to determine the intent of the post. If the links are clicked as soon as they are shared, the temporal distance between sharing and clicking is so small that there is little to no difference in content. However, not all clicks occur immediately, and a delay of days or even hours can result in reading something other than what the author intended. We introduce the concept of a user's temporal intention upon publishing a link in social media. We investigate the features that could be extracted from the post, the linked resource, and the patterns of social dissemination to model this user intention. Finally, we analyze the historical integrity of the shared resources in social media across time. In other words, how beneficial is knowledge of the author's intent in maintaining the consistency of the story being told through social posts and in enriching the archived content coverage and depth of vulnerable resources?
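
The temporal discrepancy described above can be made concrete with a small sketch. This is not the authors' model; it only illustrates, under assumptions, the gap between the time a link is shared and the time it is clicked, and how one might pick the archived snapshot closest to either moment. The function names `temporal_gap` and `closest_snapshot` and all timestamps are hypothetical, and the snapshot list stands in for whatever archive listing is actually available.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def temporal_gap(shared_at, clicked_at):
    """Time elapsed between sharing a link and a reader following it."""
    return clicked_at - shared_at


def closest_snapshot(snapshots, target):
    """Return the snapshot time nearest to target.

    Assumes `snapshots` is a non-empty list of datetimes sorted ascending.
    """
    if not snapshots:
        raise ValueError("no snapshots available")
    i = bisect_left(snapshots, target)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = snapshots[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda s: abs(s - target))


# Hypothetical data: three archived snapshots of one shared resource.
UTC = timezone.utc
snapshots = [
    datetime(2013, 1, 1, tzinfo=UTC),
    datetime(2013, 1, 10, tzinfo=UTC),
    datetime(2013, 2, 1, tzinfo=UTC),
]
shared_at = datetime(2013, 1, 9, 12, tzinfo=UTC)
clicked_at = datetime(2013, 1, 30, tzinfo=UTC)

print("gap between share and click:", temporal_gap(shared_at, clicked_at))
print("snapshot nearest to share time:", closest_snapshot(snapshots, shared_at))
print("snapshot nearest to click time:", closest_snapshot(snapshots, clicked_at))
```

In this hypothetical case the two lookups return different snapshots, which is the situation the abstract warns about: a reader who clicks late may see a later state of the resource than the one the author shared.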




Published In

JCDL '13: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
July 2013
480 pages
ISBN: 9781450320771
DOI: 10.1145/2467696
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. archiving
  2. memento
  3. modeling
  4. social media
  5. temporal user intention

Qualifiers

  • Research-article

Conference

JCDL '13: 13th ACM/IEEE-CS Joint Conference on Digital Libraries
July 22 - 26, 2013
Indianapolis, Indiana, USA

Acceptance Rates

JCDL '13 Paper Acceptance Rate 28 of 95 submissions, 29%;
Overall Acceptance Rate 415 of 1,482 submissions, 28%



Cited By

  • (2021) Interoperability for Accessing Versions of Web Resources with the Memento Protocol. The Past Web, pages 101-126. DOI: 10.1007/978-3-030-63291-5_9. Online publication date: 1-Jul-2021
  • (2018) How it Happened. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, pages 193-202. DOI: 10.1145/3197026.3197034. Online publication date: 23-May-2018
  • (2015) Predicting Temporal Intention in Resource Sharing. Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 205-214. DOI: 10.1145/2756406.2756921. Online publication date: 21-Jun-2015
  • (2015) Not all mementos are created equal: measuring the impact of missing resources. International Journal on Digital Libraries, 16:3-4, pages 283-301. DOI: 10.1007/s00799-015-0150-6. Online publication date: 6-May-2015
  • (2014) Survey of Temporal Information Retrieval and Related Applications. ACM Computing Surveys, 47:2, pages 1-41. DOI: 10.1145/2619088. Online publication date: 25-Aug-2014
