Using Taxonomies for SPARQL-DL optimizations

by Petr Kremen

Last post I spent showing directions for optimizing SPARQL-DL in general. Now I would like to touch a more particular case of SPARQL-DL queries, namely those that contain class/property variables in ABox part of the query.

I call down-monotonic all the class/property variables ?x that occur in either Type(•,?x), or PropertyValue(•,?x,•) atom. For these variables we can use class/property hierarchies to prune the search. Let’s take an atom Type(i, ?x) (i.e., i is either a binding for variable •, or constant •). If i is not of type C we can safely avoid considering subclasses of C as bindings for ?x ( and analogically for PropertyValue(i, ?x, j) and property hierarchy ).

However, reverse implication doesn’t hold, because of possible interaction with other TBox atoms. Finding a binding C for (down-monotonic) ?x we can’t take all superclasses/superporperties of C as valid bindings. Consider the query Type(i,?x), ComplementOf(?x, not(C)). If C is a subclass of D, it might happen that C is a valid binding for ?x, while D is not.

Of course, the best performance of the optimization is achieved when the search fails for a down-monotonic variables binding that is a root of a deep and complex sub/super hierarchy. Let’s take two sample queries run against LUBM :

  • “Give me all people (?X) together with their type (?A) that are advisors of themselves.”
    Abstract syntax:Type(?X, ?A), SubClassOf(?A, ub:Person), PropertyValue(?X, ub:advisor, ?X)

  • “Give me all people (?X) together with their type (?A) that are teaching assistants of some course (?Y)”
    Abstract syntax:SubClassOf(?A, ub:Person), Type(?X, ?A), PropertyValue(?X,ub:teachingAssistantOf,?Y), Type(?Y, ub:Course).

The first query does not return any result, thus the optimization prunes all subclasses of Person immediately, resulting in reduction of execution time by an order of magnitude (from 3 second to 0.3 seconds). On the other hand, the second query returns nonempty result set, resulting in the significant decrease of the pruning gain (from 3 seconds to 1.5 seconds).

Spread the word:
  • Reddit
  • Digg
  • del.icio.us
  • TwitThis
  • Technorati

Leave a Reply