Keyword: web mining : Search

short-paper

RevEx: An Online Consumer Reviews Extraction Tool

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 5169–5173, https://doi.org/10.1145/3627673.3679214

This paper presents RevEx, an online consumer reviews extraction tool. RevEx extracts the comments section for products in webshops. In contrast to other web scraping tools, it can work with heterogeneous web pages automatically, that is, it does not ...

research-article

WallStreetBets: Assessing the Collective Intelligence of Reddit for Investment Advice

ACM Transactions on Social Computing (TSC), Volume 7, Issue 1-4, Article No.: 3, Pages 1–23, https://doi.org/10.1145/3660760

The WallStreetBets (WSB) community on Reddit gained prominence for its role in the GameStop saga and the resulting meme stock phenomenon. Concurrently, this has boosted the popularity of finance-related communities on Reddit, with the top five totaling ...

research-article

Comprehensive selective improvements in agri-informatics semantics

Journal of Information Science (JIPP), Volume 50, Issue 4, Pages 910–923, https://doi.org/10.1177/01655515221110987

The advent of information technology re-innovates all sectors of bio-sciences. Researchers use Semantic Web to improve web searching, mining and integration, which alleviates the time-consuming task of finding relevant and high-quality content. Semantics ...

research-article

Discovering Image Usage Online: A Case Study with "Flatten the Curve"

JCDL '23: Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries, Pages 293–294, https://doi.org/10.1109/JCDL57899.2023.00064

Understanding the spread of images across the web helps us understand the reuse of scientific visualizations and their relationship with the public. The "Flatten the Curve" graphic was heavily used during the COVID-19 pandemic to convey a complex concept ...

research-article

Open Access

FASETS: Discovering Faceted Sets of Entities

WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 1521–1529, https://doi.org/10.1145/3589335.3651924

Computing related entities for a given seed entity is an important task in exploratory search and comparative data analysis. Prior works, using the seed-based set expansion paradigm, have focused on the single aspect of identifying homogeneous sets with ...

keynote

Archiving and Temporal Analysis of Behavioral Web Data - Tales from the Inside

Stefan Dietze

WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 1373–1374, https://doi.org/10.1145/3589335.3641260

Behavioral web data such as social web activity streams, query logs or behavioral traces from web search and navigation are crucial to understand the temporal evolution of the web and the human interactions that produce web data and models trained on ...

research-article

Open Access

Fast Inference of Removal-Based Node Influence

WWW '24: Proceedings of the ACM Web Conference 2024, Pages 422–433, https://doi.org/10.1145/3589334.3645389

Graph neural networks (GNNs) are widely utilized to capture the information spreading patterns in graphs. While remarkable performance has been achieved, there is a new trending topic of evaluating node influence. We propose a new method of evaluating ...

short-paper

Towards the Identification of Experts in Informal Learning Portals at Scale

L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale, Pages 316–320, https://doi.org/10.1145/3573051.3596179

During the past decade, there has been growing interest among researchers in informal learning at scale, particularly in the area of expert finding. These platforms have played a fundamental role in facilitating informal learning at scale, by providing ...

demonstration

DataExpo: A One-Stop Dataset Service for Open Science Research

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023, Pages 32–36, https://doi.org/10.1145/3543873.3587305

The large volumes of data on the Internet provide new opportunities for scientific discovery, especially promoting data-driven open science research. However, due to lack of accurate semantic markups, finding relevant data is still difficult. To ...

research-article

Active Learning from the Web

Ryoma Sato

WWW '23: Proceedings of the ACM Web Conference 2023, Pages 1616–1625, https://doi.org/10.1145/3543507.3583346

Labeling data is one of the most costly processes in machine learning pipelines. Active learning is a standard approach to alleviating this problem. Pool-based active learning first builds a pool of unlabelled data and iteratively selects data to be ...

research-article

Social Network Analysis on Interpretable Compressed Sparse Networks

ASONAM '22: Proceedings of the 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Pages 324–331, https://doi.org/10.1109/ASONAM55673.2022.10068716

Big data are everywhere. The World Wide Web is an example of these big data. It has become a vast data production and consumption platform, at which threads of data evolve from multiple devices, by different human interactions, over worldwide locations, ...

research-article

Is Twitter Enough? Investigating Situational Awareness in Social and Print Media during the Second COVID-19 Wave in India

ASONAM '22: Proceedings of the 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Pages 339–346, https://doi.org/10.1109/ASONAM55673.2022.10068667

The COVID-19 pandemic required efficient allocation of public resources and transforming existing ways of societal functions. To manage any crisis, governments and public health researchers exploit the information available to them in order to make ...

research-article

Web mining based on word-centric search with clustering approach using MLP-PSO hybrid

International Journal of Business Intelligence and Data Mining (IJBIDM), Volume 20, Issue 1, Pages 35–55, https://doi.org/10.1504/ijbidm.2022.119980

With web development, sometimes in keeping track of information on the web, the semantic meaning of words is not important, and the mere presence of words in the text is enough to extract information. In this research, the word-centric search method is ...

research-article

Detecting Product Adoption Intentions via Multiview Deep Learning

INFORMS Journal on Computing (INFORMS-IJOC), Volume 34, Issue 1, Pages 541–556, https://doi.org/10.1287/ijoc.2021.1083

Detecting product adoption intentions on social media could yield significant value in a wide range of applications, such as personalized recommendations and targeted marketing. In the literature, no study has explored the detection of product adoption ...

research-article

Open Access

QuAX: Mining the Web for High-utility FAQ

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Pages 1518–1527, https://doi.org/10.1145/3459637.3482289

Frequently Asked Questions (FAQ) are a form of semi-structured data that provides users with commonly requested information and enables several natural language processing tasks. Given the plethora of such question-answer pairs on the Web, there is an ...

research-article

"A Virus Has No Religion": Analyzing Islamophobia on Twitter During the COVID-19 Outbreak

HT '21: Proceedings of the 32nd ACM Conference on Hypertext and Social Media, Pages 67–77, https://doi.org/10.1145/3465336.3475111

The COVID-19 pandemic has disrupted people's lives driving them to act in fear, anxiety, and anger, leading to worldwide racist events in the physical world and online social networks. Though there are works focusing on Sinophobia during the COVID-19 ...

research-article

Page-Level Main Content Extraction From Heterogeneous Webpages

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 15, Issue 6, Article No.: 105, Pages 1–105, https://doi.org/10.1145/3451168

The main content of a webpage is often surrounded by other boilerplate elements related to the template, such as menus, advertisements, copyright notices, and comments. For crawlers and indexers, isolating the main content from the template and other ...

research-article

War of Words II: Enriched Models of Law-Making Processes

WWW '21: Proceedings of the Web Conference 2021, Pages 2014–2024, https://doi.org/10.1145/3442381.3450131

The European Union law-making process is an instance of a peer-production system. We mine a rich dataset of law edits and introduce models predicting their adoption by parliamentary committees. Edits are proposed by parliamentarians, and they can be in ...

research-article

Web Mining in e-Procurement: A Case Study in Indonesia

APIT '21: Proceedings of the 2021 3rd Asia Pacific Information Technology Conference, Pages 101–107, https://doi.org/10.1145/3449365.3449382

E-procurement is an electronic procurement system that has become a key factor in managing the financial aspects of a country with appropriate controls and protected by legal policies. The Presidential Regulation in Indonesia expects all government ...

research-article

Is your home becoming a spy?: a data-centered analysis and classification of smart connected home systems

IoT '20: Proceedings of the 10th International Conference on the Internet of Things, Article No.: 17, Pages 1–8, https://doi.org/10.1145/3410992.3411012

Smart connected home systems bring different privacy challenges to residents. The contribution of this paper is a novel privacy grounded classification of smart connected home systems that is focused on personal data exposure. This classification is ...
