Label service

Stardog 10 offers a new SPARQL service to retrieve human-readable labels for resources in your data. It lets you write SPARQL queries without knowing how labels are modeled in the data, for example, whether they use rdfs:label, skos:prefLabel, or any combination of other properties. Stardog handles that behind the scenes based on your database configuration.

Page Contents
  1. Introduction
  2. Basic Usage
  3. The Service Form
    1. Examples
  4. Interaction with other features
    1. Full-text Search
    2. Named graphs
    3. Virtual graphs
    4. Reasoning

Introduction

Querying labels in the data is, somewhat surprisingly, not all that easy. It’s supposed to be a trivial matter of adding rdfs:label triple patterns to your query but there are complications. What happens if the data uses a different predicate for labels? What if you integrate multiple datasets that use different predicates? Some of them can be Virtual Graphs (VGs). The data can use labels in different languages. The list goes on.

Stardog 10 introduces a new SPARQL service that aims at isolating queries from these (and other) complications. The idea is to let query authors write simple ?s stardog:label ?label triple patterns (where the built-in stardog: prefix resolves to tag:stardog:api:) and have the query engine evaluate it based on the new database option label.properties. That option lists all predicates that specify labels in your data (which by default is just rdfs:label). Finally, the new service can take advantage of full-text search (if enabled) to look up entities by keywords in the label.

Basic Usage

stardog:label is a short syntactic form:

prefix stardog: <tag:stardog:api:>

SELECT ?country ?label {
    ?country a :Country ;
             stardog:label ?label
}

In place of ?country, one can use a concrete IRI to look up label(s) of a specific node in the graph. Behind the scenes, stardog:label is rewritten into a union over all properties configured via label.properties.
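
To make the rewrite concrete, the query below is a hand-written sketch of what the union looks like when label.properties lists skos:prefLabel and rdfs:label. It illustrates the idea only and is not the engine's actual query plan; the default namespace is a hypothetical placeholder.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX :     <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?country ?label {
    ?country a :Country .
    # One UNION branch per property listed in label.properties
    { ?country skos:prefLabel ?label }
    UNION
    { ?country rdfs:label ?label }
}
```

With the default configuration (only rdfs:label), the union collapses to a single rdfs:label triple pattern.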

Alternatively, the service can be used to look up entities by labels:

prefix stardog: <tag:stardog:api:>

SELECT ?country {
    ?country a :Country ;
            stardog:label "Germany"
}

Again, stardog:label is rewritten into a union over all properties configured via label.properties and matches all entities with "Germany" as the label. The section on full-text search below explains how the execution of this query changes depending on whether the database has full-text search enabled.

Matching labels (without full-text search enabled) is sensitive to the labels’ language tags. That is, the pattern ?country stardog:label "Germany" . matches only entities whose label is the plain literal "Germany"; it does not match entities labeled "Germany"@en.
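
If a lookup should ignore language tags, one option is to bind the label to a variable and compare its lexical form with the standard SPARQL str() function. This is a sketch under the assumption that a filter on the bound label is acceptable for your data size; it gives up the direct constant lookup, and the default namespace is a hypothetical placeholder.

```sparql
PREFIX stardog: <tag:stardog:api:>
PREFIX :        <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?country ?label {
    ?country a :Country ;
             stardog:label ?label .
    # str() drops the language tag, so "Germany" and "Germany"@en both pass
    FILTER(str(?label) = "Germany")
}
```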

The Service Form

As of Stardog 10.2, the longer SERVICE form lets you configure the label service’s behavior more precisely. There are three main configuration properties:

  • stardog:predicate: Takes a variable to which the predicate of the matching label will be bound.
  • stardog:lang: Takes an RDF list of language tags by which language-tagged labels will be sorted.
  • stardog:limit: Takes an integer that defines the maximum number of labels to be retrieved per entity.

Let’s take a look at how we can refine the labels to retrieve using these properties with a few examples.

Examples

Consider the following simple dataset about countries and their names.

:Germany a :Country;
         rdfs:label "Deutschland";
         skos:prefLabel "Germany"@en, "Deutschland"@de .
:Spain a :Country;
        rdfs:label "España";
        skos:prefLabel "Spain"@en .

For the database, we set two label properties as follows.

label.properties="http://www.w3.org/2004/02/skos/core#prefLabel,http://www.w3.org/2000/01/rdf-schema#label"

Since the order of the label properties determines their rank, this means that we prefer skos:prefLabel over rdfs:label. First, let’s get all labels for countries using the service form with the entity label service IRI stardog:labels:

SELECT ?country ?label {
    ?country a :Country .
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label .
    }
}

The query returns the list of all countries and all their labels:

+----------+------------------+
| country  |      label       |
+----------+------------------+
| :Germany | "Deutschland"@de |
| :Germany | "Germany"@en     |
| :Germany | "Deutschland"    |
| :Spain   | "Spain"@en       |
| :Spain   | "España"         |
+----------+------------------+

Predicates

Since we defined multiple label properties, we might also be interested in the actual predicate for each label. This can be achieved with the stardog:predicate property:

SELECT ?country ?label ?predicate {
    ?country a :Country .
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:predicate ?predicate .
    }
}

which will bind the ?predicate variable to the corresponding label property:

+----------+------------------+----------------+
| country  |      label       |   predicate    |
+----------+------------------+----------------+
| :Germany | "Deutschland"@de | skos:prefLabel |
| :Germany | "Germany"@en     | skos:prefLabel |
| :Germany | "Deutschland"    | rdfs:label     |
| :Spain   | "Spain"@en       | skos:prefLabel |
| :Spain   | "España"         | rdfs:label     |
+----------+------------------+----------------+

Limit

Usually, we are not interested in all labels of an entity and we want to retrieve just a single best-matching label for each entity. This can be achieved using the stardog:limit property.

SELECT ?country ?label {
    ?country a :Country .
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:limit 1 .
    }
}

This returns a single label per entity:

+----------+--------------+
| country  |    label     |
+----------+--------------+
| :Germany | "Germany"@en |
| :Spain   | "Spain"@en   |
+----------+--------------+

Since we configured label.properties such that skos:prefLabel is preferred over rdfs:label, the entity label service returns labels for skos:prefLabel. Note, however, that since there are two skos:prefLabel labels for :Germany in our example, the two labels are equally ranked. As a result, the query above may also return "Deutschland"@de as the label for :Germany.

Using stardog:limit requires the labels to be retrieved by a correlated subquery, similar to correlated subqueries with the Stored Query Service. This means a query is executed per entity to retrieve the labels of that entity. The labels per entity are sorted by language tag and then by label property preference (see below) and finally truncated according to the limit. Consequently, stardog:entity must be either an IRI or a variable that is guaranteed to be bound in the same scope as the entity label service.

With a stardog:limit configured, the following query will return results since the stardog:entity is bound to an IRI.

SELECT ?label {
    SERVICE stardog:labels {
        [] stardog:entity :Germany ;
           stardog:label ?label ;
           stardog:limit 1 .
    }
}

However, the following query will fail since the ?country variable is not bound in the query:

SELECT ?country ?label {
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:limit 1 .
    }
}
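
One way to repair the failing query is to bind ?country in the same group before the service is evaluated. The sketch below uses a VALUES clause and assumes that such a binding counts as "guaranteed to be bound" for the label service; the documented examples only show the triple-pattern variant (?country a :Country), so treat this as an untested alternative.

```sparql
PREFIX stardog: <tag:stardog:api:>
PREFIX :        <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?country ?label {
    # Bind the entities explicitly so the correlated lookup has an input
    VALUES ?country { :Germany :Spain }
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:limit 1 .
    }
}
```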

Languages

In addition, we might also prefer labels with specific language tags. We can achieve this with the stardog:lang predicate, which takes an RDF list of preferred language tags. Just like with label.properties, the order of the language tags determines their rank. For example, let’s say we always want the German label for a country if available. If not, we want the English label instead.

SELECT ?country ?label {
    ?country a :Country .
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:lang ('de' 'en') ;
           stardog:limit 1 .
    }
}

This returns the following labels per entity:

+----------+------------------+
| country  |      label       |
+----------+------------------+
| :Germany | "Deutschland"@de |
| :Spain   | "Spain"@en       |
+----------+------------------+

While the RDF specification does not support empty language tags, the entity label service interprets an empty language tag '' as a request for non-language-tagged labels. For example, the following query prefers non-language-tagged labels over English ones:

SELECT ?country ?label {
    ?country a :Country .
    SERVICE stardog:labels {
        [] stardog:entity ?country ;
           stardog:label ?label ;
           stardog:lang ('' 'en') ;
           stardog:limit 1 .
    }
}

This returns the following labels per entity:

+----------+----------------+
| country  |     label      |
+----------+----------------+
| :Germany | "Deutschland"  |
| :Spain   | "España"       |
+----------+----------------+

In this example, we see that language preference (no language tag over an English one) is taken into account before the label property rank (skos:prefLabel over rdfs:label). For both entities, we get a single non-language-tagged label as the result even though it is from the property rdfs:label.

Labels are always first sorted by language and then according to the label property rank.
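
The ordering can be emulated for a single entity in plain SPARQL, which may help when reasoning about which label wins. The sketch below assigns explicit ranks with VALUES for the language list ('' 'en') and the two configured label properties, then sorts by language rank before property rank. It is an illustration, not the service's implementation: the rank variables are made up for this example, and labels in unlisted languages are simply dropped here because the text does not say how the service treats them.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX :     <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?label {
    # Property rank mirrors the order in label.properties
    VALUES (?p ?propRank) { (skos:prefLabel 0) (rdfs:label 1) }
    # Language rank mirrors the stardog:lang list; "" stands for "no language tag"
    VALUES (?lang ?langRank) { ("" 0) ("en" 1) }
    :Germany ?p ?label .
    # lang() returns "" for literals without a language tag
    FILTER(lang(?label) = ?lang)
}
ORDER BY ?langRank ?propRank   # language first, then property rank
LIMIT 1
```

For the sample data, the untagged rdfs:label value sorts ahead of the English skos:prefLabel value, which is the same choice the table above shows for :Germany.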

Interaction with other features

The query engine can evaluate the label service differently depending on what other features are enabled for the database.

Full-text Search

The label service can take advantage of our Lucene-based keyword search when looking up entities by their labels. If full-text search is enabled, the query engine transforms stardog:label predicates with bound labels:

prefix stardog: <tag:stardog:api:>

SELECT ?country {
    ?country a :Country ;
            stardog:label "United States"
}

into FTS patterns:

prefix stardog: <tag:stardog:api:>
prefix search: <tag:stardog:api:search:>

SELECT ?country {
    ?country a :Country ;
            rdfs:label ?label 
    SERVICE search:textMatch {
      [] search:result ?label ;
         search:query "United States"
    }
}

This means the query can find not only countries whose label is precisely "United States" but also those where it’s a part of the name, like "United States of America".

Named graphs

Normally SPARQL services do not interact with the GRAPH or FROM keywords. This is also true for our full-text search service. The label service is, however, different: it allows matching labels in specific named graphs:

prefix stardog: <tag:stardog:api:>

SELECT ?country FROM :european_countries {
    ?country a :Country ;
            stardog:label ?label
}

When looking up entities by a specific keyword, a full-text search (if enabled) will be conducted over the entire full-text index but the joined rdfs:label pattern will restrict the results to the right graph(s).
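
The text mentions the GRAPH keyword alongside FROM but only shows the FROM form. A GRAPH-scoped version would presumably look like the sketch below; this is an assumption based on the statement above, and both the default namespace and the graph name are placeholders.

```sparql
PREFIX stardog: <tag:stardog:api:>
PREFIX :        <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?country ?label {
    # Restrict both the type pattern and the label lookup to one named graph
    GRAPH :european_countries {
        ?country a :Country ;
                 stardog:label ?label
    }
}
```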

Virtual graphs

If the query dataset includes Virtual Graphs, the label service will be evaluated over them. The exact behavior depends on whether the label query term is a constant. If it is a variable, the label service will first be translated to the union of triple patterns based on the label.properties option (for example, rdfs:label). The resulting pattern will then be translated to the VG’s target language, e.g., SQL, and executed in the usual way.

If the label query term is a constant, however, the query engine will add an equality filter before sending the query to the VG. That should have the same effect as querying for labels locally without full-text search (at some performance cost depending on the VG backend and its available indexes).
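
Conceptually, a constant-label lookup against a Virtual Graph then has the shape shown below, assuming the default label.properties (only rdfs:label). This is a sketch of the logical pattern before translation to the VG's target language, not the exact query the engine produces; the default namespace is a hypothetical placeholder.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX :     <http://example.org/>   # hypothetical namespace for the sample data

SELECT ?country {
    ?country a :Country ;
             rdfs:label ?label .
    # Equality filter standing in for the constant label; evaluated by the VG backend
    FILTER(?label = "Germany")
}
```

Because this is an exact comparison, it does not provide the partial matching that full-text search offers for local data.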

The label service supports VG caches. If the query accesses a cached VG, the label service pattern will be sent to the cache node as-is. Thus, it can benefit from the full-text search index if it is enabled on the cache node.

Reasoning

The label service does not interfere with reasoning. The query patterns other than stardog:label are evaluated with or without reasoning based on query configuration.

However, there is one particular limitation: the service currently ignores schema axioms and rules which may derive new rdfs:label facts. Note that rdfs:label (as well as other common label properties such as skos:prefLabel) is an annotation property in OWL, and the use of such properties in axioms and rule heads is not recommended. Instead, one may add extra properties to the label.properties database option.