Nothing Special

Showing posts with label discovery challenge. Show all posts

Monday, July 15, 2013

Given Name Recommendations in Nameling

Today, we'll report on the name search platform Nameling - another system run at the KDE research group, and one which regular readers of our blog are probably already familiar with (Discover Names, 20DC13 - The ECML PKDD Discovery Challenge 2013 on Recommending Given Names).

It's been a while (more than a year) since Nameling has been running stably without any code updates, but in the meantime we weren't idle: we wrote some papers about the task of finding similar names by analyzing name relatedness based on data from the social web ([1], [2]) and about how these statically derived name similarities actually fit the users' search activities in the running system ([3]). We also considered the task of personalized name recommendation ([4]), which is also the task of the 15th ECML PKDD Discovery Challenge, organized by members of the KDE research group at the University of Kassel, the Data Mining and Information Retrieval Group at the University of Würzburg, and the Institut für Verteilte Systeme - Fachgebiet Wissensbasierte Systeme at the Leibniz Universität Hannover (see also our previous blog post).

Now we have integrated given name recommendations into the running Nameling system (you can already get a glimpse of the new recommendation features by visiting Nameling's beta version). Firstly, on-the-fly name recommendations, based on a user's search profile, are shown on almost every query page in Nameling (e.g., look at the sidebar while searching for names similar to the given name "Emma").

Via the navigation buttons below the list of recommended names, you can browse through even more recommendations (arrows) or request other recommended names (recycle).

And here is the special part: for the next two months, these name recommendations are provided by participants of the 20DC13 Online Challenge. That is, every recommendation request is anonymously passed to each participant's recommender system. For each user in Nameling, one randomly chosen system is selected for actually displaying name recommendations, but whenever a user presses the recycle button, this association changes.
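The per-user assignment described above can be sketched in a few lines of Python. This is a minimal illustration under our own simplifying assumptions - the recommender names and function names are hypothetical, not part of the actual Nameling back-end:

```python
import random

# Hypothetical participant systems of the online challenge.
RECOMMENDERS = ["team_a", "team_b", "team_c"]

# Each user is served by one randomly chosen recommender system.
_assignments = {}

def recommender_for(user_id):
    """Return the recommender currently assigned to this user,
    choosing one at random on first contact."""
    if user_id not in _assignments:
        _assignments[user_id] = random.choice(RECOMMENDERS)
    return _assignments[user_id]

def recycle(user_id):
    """Pressing the recycle button re-assigns the user to a
    different, randomly chosen recommender."""
    current = recommender_for(user_id)
    others = [r for r in RECOMMENDERS if r != current]
    _assignments[user_id] = random.choice(others)
    return _assignments[user_id]
```

The assignment stays stable across requests, so each user sees a consistent recommender until they explicitly ask for a new one.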

But you can also explicitly ask Nameling for name recommendations relative to a list of names you like (e.g., the future parents' given names).

By clicking on the '+' and '-' signs, you can add a name to your name recommendation query or permanently ban a name from your result views, respectively. Of course, you can still add each name to your personal list of favourite names and explore a name's neighbourhood. Additionally, we added the feature of automatically determining your current location (by clicking on the 'satellite dish'), which allows us to improve your name recommendations based on your geographic background.

There is an ongoing debate in the recommender systems community concerning the online vs. offline evaluation of recommender systems. Nameling's name recommendation back-end is designed for easy integration of new recommender systems, which may even reside on servers of affiliated research groups (e.g., participants of the online challenge). The performance of a recommender system is then evaluated relative to actual user interactions with the displayed names (e.g., by counting the number of names which were added to the list of favourite names). This feedback is also (anonymously) passed to the included recommender systems, so that they can adapt and improve their recommendations accordingly.

If you are working on recommender systems and are interested in testing your system in a live setting, feel free to contact us. You only have to implement a simple Java interface or a simple Python interface. If you manage to set up a running recommender system within the next ten days (until July 22nd), you may even take part in the 20DC13 online recommender challenge.

In any case: Keep on tagging and happy number crunching!

Your 20DC13 Team
Stephan, Andreas, Robert, Folke & Juergen

Tuesday, September 8, 2009

Tagging for Championship

In a social bookmarking system, assigning tags to resources is one of the most important and frequent processes, and BibSonomy is no exception. For a while now, users have been assisted by a set of recommended tags, as shown in Figure 1.



The Challenge


Recommender systems are the subject of active research, and various approaches have emerged. In the context of this year's ECML PKDD Discovery Challenge, BibSonomy's tag recommendations were provided by 14 different recommender systems from 10 research teams in 7 countries during the last five weeks. The challenge consisted of three tasks: the first two dealt with fixed datasets obtained from BibSonomy, while the third task's subject was to provide tag recommendations to users in the running system.

Yesterday, during the ECML PKDD Discovery Challenge Workshop, the challenge's participants presented their recommender systems and discussed the different approaches, still unaware of the third task's winning team, which was finally announced in the evening during the conference's opening session.

Rating the Systems


Algorithms for tag recommendation are typically evaluated by computing some performance measure in an "off-line" setting, that is, by iterating over the posts of a dataset derived from a social bookmarking system and presenting only each post's user and resource to the recommender system. For each post, the set of suggested tags can then be compared with the tags the user had actually assigned. Participants in Task 1 and Task 2 were evaluated in such a setting.

But these "off-line" settings not only ignore some constraints of real-life applications (e.g., CPU usage and memory consumption), they also can't take into account the effect of presenting a set of recommended tags to the user. To evaluate these effects, we set up Task 3, where recommender systems were integrated into BibSonomy and had to deliver their tag recommendations within a timeout of 1000 ms.
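Enforcing such a timeout on an external recommender can be sketched with Python's standard library. This is only an illustration of the general pattern, not BibSonomy's actual integration code; `recommend_with_timeout` and its arguments are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

TIMEOUT_SECONDS = 1.0  # recommendations must arrive within 1000 ms

def recommend_with_timeout(recommender, user, resource, fallback=()):
    """Query a recommender in a worker thread; if it exceeds the
    timeout, fall back to a default tag list instead of blocking
    the posting page."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(recommender, user, resource)
        try:
            return future.result(timeout=TIMEOUT_SECONDS)
        except TimeoutError:
            return list(fallback)
```

A recommender that answers quickly is used as-is; one that overruns the budget is simply ignored for that request.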

For evaluating the different recommender systems (in the off-line settings as well as in Task 3), we calculated precision and recall for each system. While precision measures how many of the recommended tags were adequate, recall measures how many of the tags the user actually assigned to the resource were among the recommendations.
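For a single post, these two measures restricted to the first n recommended tags can be computed as follows (a minimal sketch; the function name is ours, not part of the challenge evaluation code):

```python
def precision_recall_at_n(recommended, assigned, n):
    """Precision and recall of the top-n recommended tags against
    the tags the user actually assigned to the resource."""
    top_n = recommended[:n]
    hits = len(set(top_n) & set(assigned))
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(assigned) if assigned else 0.0
    return precision, recall

# E.g., recommending ["web", "java", "tagging"] for a post the user
# tagged with ["web", "semantic"]: at n=2, one of two recommended
# tags is a hit, and one of two assigned tags is covered.
```

Averaging these values over all posts, for n = 1 to 5, yields exactly the kind of precision/recall curves shown in Figure 2.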

Figure 2 shows the final results of the on-line challenge (also available here). For each recommender system, we calculated precision and recall considering only the first n tags (for n = 1, 2, ..., 5), averaged over all posts. The top blue graph, for example, shows that of the corresponding recommender system's five recommended tags (the rightmost point), around 18% were chosen by the user (precision 0.18), and around 23% of the tags which the user finally assigned to the resource were "predicted" by the recommender.



The winning teams are:
  • Task 1: Marek Lipczak, Yeming Hu, Yael Kollet, and Evangelos Milios (Paper)
  • Task 2: Steffen Rendle and Lars Schmidt-Thieme (Paper)
  • Task 3: Marek Lipczak, Yeming Hu, Yael Kollet, and Evangelos Milios (Paper)


We are happy to say that it was an interesting challenge which gave substantial insight into the performance of different approaches to the task of tag recommendation. We'd like to thank everybody who contributed to this challenge - last but not least, each of BibSonomy's users.

Friday, May 9, 2008

ECML/PKDD Discovery Challenge

Since we're organising this year's Discovery Challenge, we would like to announce the




Call for Participation

ECML/PKDD Discovery Challenge

Antwerp, Belgium, 15 Sept. 2008

This year's discovery challenge deals with two tasks in the area of social bookmarking. One task covers spam detection and the other is about tag recommendations. The dataset the challenge is based on is a snapshot of BibSonomy. More details about the tasks can be found at the challenge website.

Important dates
  • May 5, 2008: Tasks and datasets available online.
  • July 30, 2008: Test dataset released (by midnight CEST).
  • August 1, 2008: Result submission deadline (by midnight CEST).
  • August 4, 2008: Workshop paper submission deadline.
  • August 8, 2008: Notification of winners, publication of results on the webpage, notification of paper acceptance.
  • August 14, 2008: Workshop proceedings (camera-ready) deadline.
  • September 15/19, 2008: ECML/PKDD 2008 Workshop
