Learning Semantic Relatedness From Human Feedback Using Metric Learning.
arXiv preprint arXiv:1705.07425, 2017.
Thomas Niebler, Martin Becker, Christian Pölitz and Andreas Hotho.
Assessing the degree of semantic relatedness between words is an important
task with a variety of semantic applications, such as ontology learning for the
Semantic Web, semantic search or query expansion. To accomplish this in an
automated fashion, many relatedness measures have been proposed. However, most
of these metrics only encode information contained in the underlying corpus and
thus do not directly model human intuition. To solve this, we propose to
utilize a metric learning approach to improve existing semantic relatedness
measures by learning from additional information, such as explicit human
feedback. For this, we argue for using word embeddings instead of traditional
high-dimensional vector representations in order to leverage their semantic
density and to reduce computational cost. We rigorously test our approach on
several domains including tagging data as well as publicly available embeddings
based on Wikipedia texts and navigation. Human feedback about semantic
relatedness for learning and evaluation is extracted from publicly available
datasets such as MEN or WS-353. We find that our method can significantly
improve semantic relatedness measures by learning from additional information,
such as explicit human feedback. For tagging data, we are the first to generate
and study embeddings. Our results are of special interest for ontology and
recommendation engineers, but also for any other researchers and practitioners
of Semantic Web techniques.
Learning Semantic Relatedness from Human Feedback Using Relative Relatedness Learning.
In: N. Nikitina, D. Song, A. Fokoue and P. Haase, editors,
Proceedings of the ISWC 2017.
2017.
Thomas Niebler, Martin Becker, Christian Pölitz and Andreas Hotho.
Learning Word Embeddings from Tagging Data: A methodological comparison.
In: Proceedings of the LWDA.
2017.
Thomas Niebler, Luzian Hahn and Andreas Hotho.
What Users Actually do in a Social Tagging System: A Study of User Behavior in BibSonomy.
ACM Transactions on the Web, 10(2):14:1-14:32, 2016.
Stephan Doerfel, Daniel Zoller, Philipp Singer, Thomas Niebler, Andreas Hotho and Markus Strohmaier.
Social tagging systems have established themselves as an important part of today’s web and have attracted the interest of our research community in a variety of investigations. Consequently, several aspects of social tagging systems have been discussed, and assumptions have emerged on which our community builds its work. Yet, testing such assumptions has been difficult due to the absence of suitable usage data in the past. In this work, we thoroughly investigate and evaluate four aspects of tagging systems, covering social interaction, retrieval of posted resources, the importance of the three different types of entities (users, resources, and tags), as well as connections between these entities’ popularity in posted and in requested content. For that purpose, we examine live server log data gathered from the real-world, public social tagging system BibSonomy. Our empirical results paint a mixed picture of the four aspects. While for some, typical assumptions hold to a certain extent, other aspects need to be viewed in a very critical light. Our observations have implications for the understanding of social tagging systems and the way they are used on the web. We make the dataset used in this work available to other researchers.
Extracting Semantics from Unconstrained Navigation on Wikipedia.
KI - Künstliche Intelligenz, 30(2):163-168, 2016.
Thomas Niebler, Daniel Schlör, Martin Becker and Andreas Hotho.
Semantic relatedness between words has been successfully extracted from navigation on Wikipedia pages. However, the navigational data used in the corresponding works are sparse and expected to be biased, since they have been collected in the context of games. In this paper, we address this limitation and explore whether semantic relatedness can also be extracted from unconstrained navigation. To this end, we first highlight structural differences between unconstrained navigation and game data. Then, we adapt a state-of-the-art approach to extract semantic relatedness on Wikipedia paths. We apply this approach to transitions derived from two unconstrained navigation datasets as well as transitions from WikiGame and compare the results based on two common gold standards. We confirm expected structural differences when comparing unconstrained navigation with the paths collected by WikiGame. In line with this result, the mentioned state-of-the-art approach for semantic extraction on navigation data does not yield good results for unconstrained navigation. Yet, we are able to derive a relatedness measure that performs well on both unconstrained navigation data and game data. Overall, we show that unconstrained navigation data on Wikipedia is well suited for extracting semantics.
FolkTrails: Interpreting Navigation Behavior in a Social Tagging System.
In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management, series CIKM '16.
ACM, 2016.
Thomas Niebler, Martin Becker, Daniel Zoller, Stephan Doerfel and Andreas Hotho.
Social tagging systems have established themselves as a quick and easy way to organize information by annotating resources with tags. In recent work, user behavior in social tagging systems was studied, that is, how users assign tags and consume content. However, it is still unclear how users make use of the navigation options they are given. Understanding their behavior and the differences in behavior between user groups is an important step towards assessing the effectiveness of a navigational concept and improving it to better suit the users’ needs. In this work, we investigate navigation trails in the popular scholarly social tagging system BibSonomy from six years of log data. We discuss dynamic browsing behavior of the general user population and show that different navigational subgroups exhibit different navigational traits. Furthermore, we provide strong evidence that the semantic nature of the underlying folksonomy is an essential factor for explaining navigation.