Keyword: document similarity : Search

short-paper

Public Access

Predicting Guiding Entities for Entity Aspect Linking

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Pages 3848–3852, https://doi.org/10.1145/3511808.3557671

Entity linking can disambiguate mentions of an entity in text. However, there are many different aspects of an entity that could be discussed but are not differentiable by entity links, for example, the entity "oyster" in the context of "food" or "...

research-article

Specialized document embeddings for aspect-based similarity of research papers

JCDL '22: Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries, Article No.: 7, Pages 1–12, https://doi.org/10.1145/3529372.3530912

Document embeddings and similarity measures underpin content-based recommender systems, whereby a document is commonly represented as a single generic embedding. However, similarity computed on single vector representations provides only one perspective ...

research-article

Open Access

Sentiment Analysis of Portuguese Political Parties Communication

SIGDOC '21: Proceedings of the 39th ACM International Conference on Design of Communication, Pages 63–69, https://doi.org/10.1145/3472714.3473624

Political communication in social media has gained increasing importance in the last years. In this study, we analyze the political parties’ communication on Twitter and understand the sentiment of their communication. First by identifying their ...

research-article

Evaluating document representations for content-based legal literature recommendations

ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, Pages 109–118, https://doi.org/10.1145/3462757.3466073

Recommender systems assist legal professionals in finding relevant literature for supporting their case. Despite its importance for the profession, legal applications do not reflect the latest advances in recommender systems and representation learning ...

research-article

Cross-lingual text similarity exploiting neural machine translation models

Kazuhiro Seki

Journal of Information Science (JIPP), Volume 47, Issue 3, Pages 404–418, https://doi.org/10.1177/0165551520912676

This article studies cross-lingual text similarity using neural machine translation models. A straightforward approach based on machine translation is to use translated text so as to make the problem monolingual. Another possible approach is to use ...

abstract

CS50's GitHub-Based Tools for Teaching and Learning

SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, Page 1354, https://doi.org/10.1145/3408877.3432499

For CS50 at Harvard, we have developed a suite of free, open-source tools to help students with writing, testing, and submitting programming assignments and to help teachers grade those assignments and check them for similarities. help50 parses often-...

research-article

What do governments plan in the field of artificial intelligence?: Analysing national AI strategies using NLP

ICEGOV '20: Proceedings of the 13th International Conference on Theory and Practice of Electronic Governance, Pages 100–111, https://doi.org/10.1145/3428502.3428514

The primary goal of this paper is to explore how Natural Language Processing techniques (NLP) can assist in reviewing, understanding, and drawing conclusions from text datasets. We explore NLP techniques for the analysis and the extraction of useful ...

research-article

Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles

JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Pages 127–136, https://doi.org/10.1145/3383583.3398525

Many digital libraries recommend literature to their users considering the similarity between a query document and their repository. However, they often fail to distinguish what is the relationship that makes two documents alike. In this paper, we model ...

abstract

Legal Data Analytics: Developing Assistive Tools for Legal Practitioners

Paheli Bhattacharya

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Page 2476, https://doi.org/10.1145/3397271.3401448

With the advancement of the Web and large number of legal documents being made available digitally, legal practitioners are now facing new challenges. It is now intractable for them to manually find relevant information (prior cases, related statutes, ...

research-article

Measuring semantic similarity of documents with weighted cosine and fuzzy logic

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology (JIFS), Volume 39, Issue 2, Pages 2263–2278, https://doi.org/10.3233/JIFS-179889

Currently, semantic analysis is used in different fields, such as information retrieval, the biomedical domain, and natural language processing. The primary focus of this research work is on using semantic methods, the cosine similarity algorithm, and ...

research-article

Open Access

Combining Similarity and Transformer Methods for Case Law Entailment

ICAIL '19: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, Pages 290–296, https://doi.org/10.1145/3322640.3326741

We tackle the complex problem of determining entailment relationships between case law documents, one of the tasks in the Competition on Legal Information Extraction and Entailment (COLIEE). With input of an entailed fragment from a case coupled with a ...

research-article

A Combination of Text Mining Techniques for Relevant Literature Search and Extractive Summarization

NLPIR '18: Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval, Pages 7–11, https://doi.org/10.1145/3278293.3278300

Over the past few years, the amount of research papers published has dramatically increased. Consequently, researchers spend a lot of time reviewing relevant literature in order to better understand their domain of interest and keep up with new ...

research-article

Syncretic matching: story similarity between documents

CODS-COMAD '18: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, Pages 146–156, https://doi.org/10.1145/3152494.3152508

In several document matching applications like comparing across judgments, patent claims or movie plots, conventional bag-of-words models are insufficient. Bag of words are useful for computing lexical similarity; while in this case, there is a need to ...

research-article

Open Access

Word importance-based similarity of documents metric (WISDM): Fast and scalable document similarity metric for analysis of scientific documents

WOSP 2017: Proceedings of the 6th International Workshop on Mining Scientific Publications, Pages 17–23, https://doi.org/10.1145/3127526.3127530

We present the Word importance-based similarity of documents metric (WISDM), a fast and scalable novel method for document similarity/distance computation for analysis of scientific documents. It is based on recent advancements in the area of word ...

research-article

An Efficient Tag Recommendation Method using Topic Modeling Approaches

RACS '17: Proceedings of the International Conference on Research in Adaptive and Convergent Systems, Pages 56–61, https://doi.org/10.1145/3129676.3129709

Software information sites such as Stack Overflow, Super User, and Ask Ubuntu allow users to post software-related questions, answer the questions asked by other users, and add tags to their questions. Tagging is a popular system across web communities ...

research-article

Plagiarism detection using document similarity based on distributed representation

Procedia Computer Science (PROCS), Volume 111, Issue C, Pages 382–387, https://doi.org/10.1016/j.procs.2017.06.038

Accurate methods are required for plagiarism detection from documents. Generally, plagiarism detection is implemented on the basis of similarity between documents. This paper evaluates the validity of using distributed representation of words for ...

research-article

Generating Stories From Archived Collections

WebSci '17: Proceedings of the 2017 ACM on Web Science Conference, Pages 309–318, https://doi.org/10.1145/3091478.3091508

With the extensive growth of the Web, multiple Web archiving initiatives have been started to archive different aspects of the Web. Services such as Archive-It exist to allow institutions to develop, curate, and preserve collections of Web resources. ...

short-paper

Centroid Terms as Text Representatives

DocEng '16: Proceedings of the 2016 ACM Symposium on Document Engineering, Pages 99–102, https://doi.org/10.1145/2960811.2967150

Algorithms to topically cluster and classify texts rely on information about their semantic distances and similarities. Standard methods based on the bag-of-words model to determine this information return only rough estimations regarding the ...

research-article

SQLiDDS: SQL injection detection using document similarity measure

Journal of Computer Security (JOCS), Volume 24, Issue 4, Pages 507–539, https://doi.org/10.3233/JCS-160554

SQL injection attack has been a major security threat to web applications for over a decade. Nowadays, attackers use automated tools to discover vulnerable websites from search engines and launch attacks on multiple websites simultaneously. Being ...

short-paper

Learning User Preferences for Topically Similar Documents

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Pages 1795–1798, https://doi.org/10.1145/2806416.2806617

Similarity measures have been used widely in information retrieval research. Most research has been done on query-document or document-document similarity without much attention to the user's perception of similarity in the context of the information ...
