Generalized durative event detection on social media

Journal of Intelligent Information Systems

Abstract

Given the recent availability of large volumes of social media discussions, detecting temporally unusual phenomena, which we call events, in such data is of great interest. Previous work on social media event detection either assumes a specific type of event or assumes certain behavior of the observed variables. In this paper, we propose a general method for event detection on social media that makes few assumptions. The main assumption is that when an event occurs, the affected semantic aspects deviate from their usual behavior for a sustained period. We generalize the representation of time units using word embeddings of social media text, and propose an algorithm that detects durative events in time series in a general sense. We also provide an incremental version of the algorithm for real-time detection. We test our approaches on synthetic data and two real-world tasks. On the synthetic dataset, we compare the performance of the retrospective and incremental versions of the algorithm. In the first real-world task, we use a novel setting to test whether our method and baseline methods can exhaustively capture all real-world news in the test period. The evaluation results show that when an event is quite unusual relative to the base social media discussion, our method captures it more effectively. In the second real-world task, we use the captured events to improve the accuracy of stock market movement prediction. We show that our event-based approach has a clear advantage over other ways of adding social media information.
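The main assumption stated in the abstract (an affected semantic aspect departs from its usual behavior for a sustained period) can be illustrated with a small sketch. This is not the algorithm proposed in the paper, whose details are not given here. It is a simplified stand-in that treats each embedding dimension as one "semantic aspect", represents a time unit by the average of its word vectors, and flags runs of consecutive units whose z-score against a baseline window stays above a threshold. The names `unit_vector`, `detect_durative_events`, `baseline_len`, `z_threshold`, and `min_duration` are all hypothetical.

```python
from statistics import mean, stdev


def unit_vector(token_vectors):
    """Represent one time unit as the average of its word vectors (an assumption)."""
    dims = len(token_vectors[0])
    return [mean(v[d] for v in token_vectors) for d in range(dims)]


def detect_durative_events(series, baseline_len, z_threshold=3.0, min_duration=3):
    """Find sustained deviations in a series of time-unit vectors.

    The first `baseline_len` units define "usual" behavior per dimension.
    Returns (dimension, start, end) tuples with `end` exclusive, one for each
    run of at least `min_duration` consecutive units whose absolute z-score
    exceeds `z_threshold`.
    """
    events = []
    for d in range(len(series[0])):
        values = [unit[d] for unit in series]
        base = values[:baseline_len]
        mu = mean(base)
        sigma = stdev(base) or 1e-12  # guard against a constant baseline
        run_start = None
        # Iterate one step past the end so an open run is closed.
        for t in range(baseline_len, len(values) + 1):
            unusual = t < len(values) and abs(values[t] - mu) / sigma > z_threshold
            if unusual and run_start is None:
                run_start = t
            elif not unusual and run_start is not None:
                if t - run_start >= min_duration:
                    events.append((d, run_start, t))
                run_start = None
    return events


# Toy data: 20 time units, 2 dimensions, small alternating noise.
series = [[0.1 * (-1) ** t, 0.2 * (-1) ** t] for t in range(20)]
for t in range(12, 16):
    series[t][0] = 2.0   # sustained shift in dimension 0 (units 12 to 15)
series[17][1] = 5.0      # one-unit spike in dimension 1, too short to count

events = detect_durative_events(series, baseline_len=10)
print(events)  # [(0, 12, 16)]
```

The one-unit spike in dimension 1 is ignored because it does not last `min_duration` units, which is the distinction between a burst and a durative event that the abstract draws. An incremental variant would update the baseline statistics and the open run as each new time unit arrives instead of rescanning the whole series.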


Availability of data and material

The data used in this paper are available from the corresponding author upon request.

Notes

  1. The preprint can be accessed online at http://arxiv.org/abs/2106.02250

  2. An example online resource that provides an implementation under this setting: https://github.com/philipperemy/japanese-words-to-vectors

  3. An implementation of this test is available as an R package: https://cran.r-project.org/web/packages/randtests/randtests.pdf

  4. https://www.r-project.org/

  5. Since politicians are public figures, such a list can be found in many online sources, for example: https://meyou.jp/group/category/politician/

  6. https://developer.twitter.com/en/docs/tutorials/consuming-streaming-data

  7. https://github.com/atilika/kuromoji

  8. A list of popular Japanese news Twitter accounts can be found on the same source: https://meyou.jp/ranking/follower_media

  9. https://cran.r-project.org/web/packages/wavethresh/wavethresh.pdf

  10. https://indexes.nikkei.co.jp/en/nkave/

  11. https://cran.r-project.org/web/packages/e1071/index.htm


Acknowledgements

This research is partially supported by JST CREST Grant Number JPMJCR21F2.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Y., Shirakawa, M. & Hara, T. Generalized durative event detection on social media. J Intell Inf Syst 60, 73–95 (2023). https://doi.org/10.1007/s10844-022-00730-8

  • DOI: https://doi.org/10.1007/s10844-022-00730-8
