DOI: 10.1145/2488388.2488523
research-article

Questions about questions: an empirical analysis of information needs on Twitter

Published: 13 May 2013

Abstract

Conventional studies of online information-seeking behavior usually focus on the use of search engines or question answering (Q&A) websites. Recently, the fast growth of online social platforms such as Twitter and Facebook has made it possible for people to seek information by asking questions of their friends or followers. We anticipate a better understanding of Web users' information needs by investigating research questions about these questions. How do they differ from everyday tweeted conversations? How are they related to search queries? Can users' information needs on one platform predict those on the other?
In this study, we take the initiative to extract and analyze information needs from billions of online conversations collected from Twitter. With an automatic text classifier, we can accurately detect real questions in tweets (i.e., tweets conveying real information needs). We then present a comprehensive analysis of the large-scale collection of information needs we extracted. We found that the questions asked on Twitter are substantially different from the topics tweeted in general. Information needs detected on Twitter have considerable power to predict the trends of Google queries. Many interesting signals emerge through longitudinal analysis of the volume, spikes, and entropy of questions on Twitter, which provide insights into the impact of real-world events and user behavioral patterns on social platforms.
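
The abstract mentions entropy of questions as one longitudinal signal but does not define it here. As a minimal sketch, assuming entropy means the Shannon entropy of the word distribution over all questions asked in a day, it could be computed as below. The function names, the whitespace tokenization, and the toy data are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical distribution of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no tokens: treat entropy as zero
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def daily_entropy(questions_by_day):
    """Map each day to the entropy of all words in that day's questions.

    Assumption: naive lowercase whitespace tokenization stands in for
    whatever preprocessing a real pipeline would use.
    """
    return {
        day: shannon_entropy(
            word for question in questions for word in question.lower().split()
        )
        for day, questions in questions_by_day.items()
    }


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    sample = {
        "day-1": ["where is the game", "where is the game"],
        "day-2": ["any good sushi nearby", "how do i fix wifi"],
    }
    for day, value in daily_entropy(sample).items():
        print(day, round(value, 3))
```

Under this reading, a day on which many users ask about the same thing yields lower entropy than a day with diverse questions, so a sharp drop could flag a shared real-world event.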

    Published In

    WWW '13: Proceedings of the 22nd international conference on World Wide Web
    May 2013
    1628 pages
    ISBN: 9781450320351
    DOI: 10.1145/2488388

    Sponsors

    • NICBR: Núcleo de Informação e Coordenação do Ponto BR
    • CGIBR: Comitê Gestor da Internet no Brasil

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. information need
    2. time series analysis
    3. twitter

    Qualifiers

    • Research-article

    Conference

    WWW '13
    Sponsor:
    • NICBR
    • CGIBR
    WWW '13: 22nd International World Wide Web Conference
    May 13 - 17, 2013
    Rio de Janeiro, Brazil

    Acceptance Rates

    WWW '13 Paper Acceptance Rate 125 of 831 submissions, 15%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
