Pellint: Predicting Reasoner Performance

by Evren Sirin

This week we released Pellint 0.1, an ontology lint tool for detecting and fixing performance problems in ontologies. In this post, I’ll explain what kind of problems Pellint detect and why.

Predicting Reasoning Performance is Hard

Predicting the reasoning performance for an ontology is not an easy task, and what we do in Pellint is to define patterns that will detect some patterns known to be problematic. Unlike the traditional lint tools, the problems found by Pellint do not necessarily indicate a modeling error in the ontology. It just means that Pellet is probably going to be slow when reasoning with that ontology. Therefore, there might not be an appropriate solution for some of the reported problems, i.e. the only options available might be removing some of the modeling constructs or waiting longer for the reasoner to return answers, etc.

It is still important to note that we have seen on many occasions that incorrect modeling in an ontology causes performance problems. For example, one pattern Pellint detects is whether a concept is defined to be the complement of another concept. There are very few cases where such a definition makes sense because if two concepts are complement of each other, every object in the universe should be an instance of one of these two concepts.

A Pellet-specific Tool

Pellint is a Pellet-specific tool; the point is to help users avoid performance problems with Pellet reasoning. But Pellint can be extended pretty easily to customize it for different reasoners.

The predefined set of patterns that come with Pellint detect modeling constructs that will be problematic for Pellet, which is based on tableaux algorithms for DL reasoning. All the reasoning services in tableaux algorithm can be reduced to consistency checking, which is done by building a completion graph. There are two main sources of performance problems here: (1) nondeterminism in “completing” the graph; and (2) the size of the graph built.

Building the completion graph is nondeterministic due to constructs like owl:unionOf. The instance of the union class must be an instance of at least one of the union elements. The reasoner will do a case-by-case analysis to figure out if that is possible. A nondeterministic choice made because of a union class will have an impact on the completion graph, e.g. new edges may be added because of that choice. In the event that we made a wrong choice (which will be indicated by an inconsistency in the later stages of completion process) we have to backtrack and undo all the changes made to the graph. That gets expensive in terms of CPU cycles, obviously.

There are many sophisticated optimization algorithms implemented in Pellet to reduce the effects of nondeterminism, but it is not hard to see how very large number of union classes will adversely affect reasoning time.

Tricky Interactions

What is harder to see is how the interaction of seemingly deterministic constructs may cause nondeterminism. For example, having a complex class expression on the left hand side of a subclass axiom will have that effect as well as making a concept both equivalent to a complex description and subclass of another concept. We document these patterns in the Pellint distribution.

The size of the completion graph will depend on the size of the initial graph (i.e., the asserted instances), but also on the use of existential restrictions. Constructs like owl:someValuesFrom and owl:minCardinality will cause the tableau algorithm to create new nodes in the completion graph. Applying the tableau completion rules to new nodes will require more processing time and possibly increase the nondeterminism involved because there will be new nondeterministic choices made for these new nodes.

Predicting the exact size of the completion graph (without actually building the graph) is not possible, but Pellint uses some heuristics and graph analysis techniques to approximate this size and warn (or repair) accordingly.

We’ve included documentation of the implemented patterns in Pellint. We will extend these patterns and include syntactic checks in upcoming releases. For example, using a URI without any type declaration is typically an error (might be due to misspelling of the URI or using a wrong namespace). These errors might not cause performance problems, but having one tool to report different kind of problems in ontologies is a boon to ontology engineers and others.

 

Trackbacks

(Trackback URL)

close Reblog this comment
blog comments powered by Disqus