Archive for the 'Probabilistic Reasoning' Category

Upcoming Talks and Conferences

Sunday, April 27th, 2008 · Kendall Clark

We’re starting to give more talks at more conferences, since our SemWeb infrastructure framework (from OWL reasoners and ontology browsers, to RDF linked data browsers, to policy management apps) is really starting to round itself out. Upcoming talks include:

  1. 28 April, Boston, BioIT World Conference’s “Harnessing the Semantic Web for Your Organization” workshop. Mike Smith is giving a talk with our NCI customers about how HCLS and other bio orgs can start to take advantage of semantic web technology.
  2. 19 to 21 May, San Jose, Semantic Technology Conference (Bad URL that will break for the 2009 conference; ironic, that!). Evren Sirin, Mike Smith, Pavel Klinov, and I will be giving three talks—Pellet, Pronto, and XACML-DL Policy Analysis. Actually, I’ll be there trying to act “managerial”; I leave the talk-giving to the smart guys.
  3. 2 to 4 June, Palisades, NY, POLICY 2008. Markus Stocker and I will be giving a demo talk of XACML-DL, our XACML policy analysis tool.

Further out, we’re targeting a conference about Ontologies and Model-Driven Architectures (MDA), which is what OMG is doing now that CORBA is dead-dead-dead. It’s sometime in the fall in Toulouse, which is nice.

One of our new customers (who’s sponsoring a pending Pellet maintenance release, version 1.5.2, that should be out very soon) is using Pellet to drive a pretty complex code-generation process, and that’s an area where we think Pellet has a huge upside. And we’ve found that giving talks and papers with customers as partners is a good pattern.

If you’re planning on attending any of these conferences, shoot me an email as we’d love to chat with users, friends, fans, and interested bystanders.


OwlSight 0.36 Released

Thursday, March 27th, 2008 · Michael Grove

Fresh on the heels of our recent announcement of the first release of Pronto, we’re happy to announce the release of a new version of OwlSight, which includes support for Pronto: you can now browse probabilistic ontologies and explanations for probabilistic inferences.

In addition to supporting probabilistic ontologies, the new version also has an entirely new look and feel. We used GWT-Ext to build the new UI components, with very pleasing results. We’ve also fixed a number of bugs and made some other modest improvements. So if you are interested in Pronto, or you are just interested in a lightweight, browser-based ontology browser, you should give the latest version of OwlSight a try.

To get started, there are two new bookmarks included with OwlSight, both of which illustrate probabilistic reasoning in the context of OWL DL reasoning. Choose the Penguin Bird or BRC bookmarks to load and browse a probabilistic ontology.


Pronto 0.1 Release

Monday, March 10th, 2008 · Kendall Clark

We’re announcing the release of Pronto 0.1, a probabilistic OWL DL reasoner, for doing uncertainty reasoning in Pellet.

I won’t add much to the release announcement, which you can read for yourself; however, let me thank the people who are responsible for the Pronto 0.1 release: Pavel Klinov, Evren Sirin, Mike Smith, and Mike Grove. Good work, peeps!

Pronto is available under the terms of the AGPL v3 license, a rather new free software license that we think makes sense in this case and that I personally think will become increasingly used in FOSS projects (though, honestly, since it’s so little used today, that’s a pretty safe bet).


Using Pronto: Breast Cancer Risk Models

Tuesday, October 2nd, 2007 · Pavel Klinov

In my previous post, I introduced Pronto, a probabilistic DL reasoning extension for Pellet. I gestured at some of the algorithmic and technical details of Pronto’s capabilities; for the technically curious, a careful read of the “Probabilistic Description Logics for the Semantic Web” paper is the best place to start. Now let’s move to a more realistic example than those poor birds and penguins and Richard Nixon.

Consider the domain of cancer, more precisely, women’s breast cancer, and yet more specifically, the issue of breast cancer risk assessment. Very roughly, the central problem is to combine all the risk factors that apply to a particular woman and come up with a credible number reflecting her chance of developing breast cancer, either in her lifetime or in the short term (normally the next 10 years). There are a few models that do that, basically just by computing an empirically inferred function of the input parameters (risk factors).

Pronto offers a different way of approaching the problem. It supports a wide use of all the background knowledge captured in a classical ontology—for example, of the sort maintained in the NCI Thesaurus—but also allows us to augment the classical KB with probabilistic statements, such that the risk of developing cancer can be computed as an ontological inference. That makes modeling a lot more explicit and illustrative—especially with support of Pellet’s explanations—than using a “black box” function.

So, consider a classical part of an ontology for modeling the breast cancer domain. By the way, we’re not claiming it’s correct or useful from a medical point of view; you may also consider Matt Williams’ version of a clinical ontology if you’re concerned with the correctness of terms and the like.

The ontology defines risk factors that are relevant to breast cancer, i.e., subclasses of RiskFactor. Then it also defines different categories of women, first, those that have certain risk factors (subclasses of WomanWithRiskFactors); and, second, those distinct in terms of the risk of developing cancer (subclasses of WomanUnderBRCRisk). The basic task is to compute the probability that a certain woman is an instance of some WomanUnderBRCRisk subclass given that she is an instance of some WomanWithRiskFactors subclass. In addition, it will be useful to infer the generic probabilistic subsumption between classes under WomanUnderBRCRisk and under WomanWithRiskFactors.

The first thing to do in order to enable such probabilistic reasoning is to express the uncertain background knowledge about the domain. This is done by listing the conditional constraints in the form of OWL 1.1 axiom annotations. The constraints can either be in a separate file that imports the classical OWL ontology or be embedded into the classical part.

For this example, the constraints express how individual risk factors influence the risk of developing cancer (numbers taken from “Risk Factors and Prevention”). The job of Pronto is to combine factors that apply to a particular woman and compute the probability that she is an instance of some WomanUnderBRCRisk subclass.

Let’s now quickly go through the individuals to illustrate the reasoning:


  • Julie is a woman in her thirties. The only risk factor that applies to her is AgeUnder50, so Pronto concludes that Julie:(WomanWithBRCInShortTime|owl:Thing)[0.0;0.027] (her chance of developing cancer in the next 10 years is no higher than 2.7%).

  • Mary is known to have a BRCA1 gene mutation, which is a hugely important risk factor. Using the generic constraint (WomanUnderGreatBRCRisk|WomanWithBRCAMutation)[1;1], Pronto puts her in the category of women with the highest relative risk of cancer. (This example also shows that conditional constraints, with some caveats, can model certain subsumption relationships.)

  • For Ann we know two risk factors: her mother had BRC and she is an Ashkenazi Jew, so she has an increased chance of having an inherited gene mutation. Using the combination of risk factors without overriding, Pronto concludes that she has a 31.25% chance of being in the category of 3x increased risk and a chance of over 2.5% of being in the highest risk category.

  • Helen is the most interesting case. For her we again know two risk factors: her age is over 50 and her mother had cancer. Using overriding, we can specify how these two factors strengthen or weaken each other to produce the actual risk. This can be done by defining a generic constraint (WomanUnderGreatBRCRisk|SeniorWomanWithMotherBRCAffected)[0.9;1] that overrides the constraints for both factors individually. Thus, Pronto entails that she is in the highest risk category with more than 90% probability.

Finally, I want to mention explanations, which are an important part of Pellet’s reasoning services. DL reasoning can be difficult to understand, and so can probabilistic reasoning. Not surprisingly, the hybrid reasoning that Pronto is capable of can be very difficult to understand. This is both a limitation and an opportunity to extend Pellet’s explanation services to Pronto.

In other words, we want to extract the minimal set of conditional constraints that are sufficient to produce the given probabilistic entailment. For the breast cancer example above that would imply filtering out all the irrelevant risk factors and leaving only those which were taken into account during reasoning. We’ve got some initial work done in extending explanations in Pronto, but there’s much more work to do, including extending debug and repair features of Pellet to Pronto.
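
To make the idea of a minimal sufficient set concrete, here is a small Python sketch. It is not Pronto’s algorithm (the post does not describe it); it is a brute-force search that assumes a hypothetical oracle, `is_sufficient`, which says whether a given subset of constraints still produces the entailment. The constraint names are made up for illustration.

```python
from itertools import combinations


def minimal_sufficient_subsets(constraints, is_sufficient):
    """Return every subset accepted by is_sufficient that has no accepted proper subset.

    Brute force: exponential in len(constraints), so only suitable for toy inputs.
    """
    found = []
    # Enumerate candidates by increasing size so minimal sets are found first.
    for size in range(len(constraints) + 1):
        for subset in combinations(constraints, size):
            candidate = frozenset(subset)
            # Skip candidates that contain an already-found sufficient set.
            if any(smaller <= candidate for smaller in found):
                continue
            if is_sufficient(candidate):
                found.append(candidate)
    return found


# Hypothetical constraint labels, loosely modeled on Helen's case.
constraints = ["age_over_50", "mother_affected", "alcohol", "late_menopause"]


def is_sufficient(subset):
    # Stand-in oracle (an assumption): the entailment holds exactly when
    # both of Helen's risk factors are among the chosen constraints.
    return {"age_over_50", "mother_affected"} <= subset


explanations = minimal_sufficient_subsets(constraints, is_sufficient)
print(explanations)
```

In a real setting the oracle would be a call into the reasoner, which is the expensive part; the search above only illustrates what “minimal w.r.t. inclusion” means.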


Introducing Pronto: Probabilistic DL Reasoning in Pellet

Thursday, September 27th, 2007 · Pavel Klinov

This is the first in a series of posts on extending Pellet with probabilistic reasoning capabilities. We call this tool “Pronto”. It offers core OWL reasoning services for knowledge bases containing uncertain knowledge; that is, it processes statements like “Bird is a subclass-of Flying Object with probability greater than 90%” or “Tweety is-a Flying Object with probability less than 5%”.

The use cases for Pronto include ontology and data alignment, as well as reasoning about uncertain domain knowledge generally, for example, risk factors associated with breast cancer.

First, I should say that if you are interested in a rigorous description of the approach, read the paper by Thomas Lukasiewicz “Probabilistic Description Logics for the Semantic Web”. Pronto is to a large extent an implementation of the Lukasiewicz approach—the rest is optimization and the support of explanations.

In a nutshell, the features of Pronto (in addition to the features of Pellet) are the following:


  1. Expressing generic probabilistic knowledge. “Generic” means that the knowledge doesn’t apply to any specific individual but rather to a fresh, randomly chosen one. Generic probabilistic knowledge is represented in the form of a generic conditional constraint (GCC). A GCC is an expression of the form (D|C)[l,u], where C and D are DL concepts and [l,u] is a closed subinterval of [0,1]. Without getting deeply into the semantics, the meaning of a GCC is roughly this: for a randomly chosen instance of C, the probability of being an instance of D is within [l,u]. The above statement about birds would be written as (FlyingObject|Bird)[0.9;1.0].

  2. Expressing concrete probabilistic knowledge. Here the knowledge applies to a specific individual. Concrete probabilistic knowledge is represented in the form of a:X, where “a” is an individual and “X” is a GCC restricted to the form (D|owl:Thing)[l,u]. We can express “Tweety is-a Flying Object with probability less than 5%” as Tweety:(FlyingObject|owl:Thing)[0.0;0.05].

  3. Probabilistic reasoning, that is, generic and concrete entailments. A generic entailment is the following task: given a probabilistic KB and a pair of concepts C and D, compute the tightest interval [l,u] such that (D|C)[l,u] follows from the KB. A concrete entailment is the analogous task for an individual: given a probabilistic KB, an individual “a”, and a concept “D”, compute the tightest interval (D|owl:Thing)[l,u] for “a”. So we can ask Pronto to infer the probability of a statement like Tweety being a flying object based on other statements rather than asserting the conditional constraint.

  4. Probabilistic explanations. Pronto is capable of computing all minimally sufficient (w.r.t. inclusion) subsets of conditional constraints for a particular entailment, both generic and concrete.
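
As a rough illustration of the two kinds of constraints in the list above, here is a minimal Python sketch. The class and field names are my own assumptions for this example, not Pronto’s API; the sketch only models the syntax (D|C)[l;u] and a:(D|owl:Thing)[l;u] and does no reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalConstraint:
    """Generic conditional constraint (D|C)[l;u]; names are illustrative only."""
    conclusion: str  # D
    evidence: str    # C
    lower: float     # l
    upper: float     # u

    def __post_init__(self):
        # [l;u] must be a closed subinterval of [0;1].
        if not 0.0 <= self.lower <= self.upper <= 1.0:
            raise ValueError("[l;u] must be a closed subinterval of [0;1]")

    def __str__(self):
        return f"({self.conclusion}|{self.evidence})[{self.lower};{self.upper}]"


@dataclass(frozen=True)
class ConcreteConstraint:
    """Concrete constraint a:(D|owl:Thing)[l;u] attached to one individual."""
    individual: str
    constraint: ConditionalConstraint

    def __post_init__(self):
        # Concrete constraints are restricted to the evidence owl:Thing.
        if self.constraint.evidence != "owl:Thing":
            raise ValueError("concrete constraints must have evidence owl:Thing")

    def __str__(self):
        return f"{self.individual}:{self.constraint}"


birds_fly = ConditionalConstraint("FlyingObject", "Bird", 0.9, 1.0)
tweety = ConcreteConstraint(
    "Tweety", ConditionalConstraint("FlyingObject", "owl:Thing", 0.0, 0.05)
)
print(birds_fly)  # (FlyingObject|Bird)[0.9;1.0]
print(tweety)     # Tweety:(FlyingObject|owl:Thing)[0.0;0.05]
```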

Perhaps the single most important point about Pronto reasoning is that all inferences are done in a totally “logical” way, i.e., using a well-defined entailment relation and without any explicit or implicit translation of the KB (or some parts of the KB) to Bayesian graphs. This is the major difference between Pronto and other approaches, e.g., P-CLASSIC or “Probabilistic Extension to OWL”.

Finally, I should mention overriding as a feature of Pronto that we particularly like. Pronto allows certain conflicts between different pieces of probabilistic knowledge, more precisely, between different conditional constraints. The famous example is that of flying birds and non-flying penguins. (It’s similar to the famous Nixon Diamond problem.) The problem here is related to non-monotonicity: A bird is a flying object with high probability and all penguins are birds but a penguin has a low probability of flying.

The way Pronto resolves these conflicts is by allowing more specific constraints to override more generic ones. So if Pronto knows that Tweety is a Penguin and Penguin is a subclass-of Bird, it will override the constraint (FlyingObject|Bird)[0.9;1.0] by (FlyingObject|Penguin)[0.0;0.05] and correctly entail Tweety:(FlyingObject|owl:Thing)[0.0;0.05]. This idea is borrowed from reference class reasoning and is supported by Lehmann’s lexicographic entailment, which Pronto employs (see the Lukasiewicz paper for technical details). The decision whether one constraint is more specific or more generic than another is made through classical DL reasoning.
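
The intuition behind overriding can be sketched in a few lines of Python. This is a deliberately naive toy and not lexicographic entailment: it assumes a hand-written, single-inheritance class hierarchy (where Pronto would ask the DL reasoner about subsumption) and simply picks the constraint attached to the most specific class. All names are assumptions made for this example.

```python
# Toy hierarchy, child -> parent (an assumption; Pronto derives this via DL reasoning).
SUBCLASS_OF = {"Penguin": "Bird", "Bird": "owl:Thing"}

# (conclusion, evidence) -> (lower, upper), using the numbers from the example above.
CONSTRAINTS = {
    ("FlyingObject", "Bird"): (0.9, 1.0),
    ("FlyingObject", "Penguin"): (0.0, 0.05),
}


def ancestors(cls):
    """Return cls followed by its superclasses, most specific first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def most_specific_interval(conclusion, cls):
    """Pick the interval of the most specific class that has a constraint."""
    for candidate in ancestors(cls):
        if (conclusion, candidate) in CONSTRAINTS:
            return CONSTRAINTS[(conclusion, candidate)]
    return (0.0, 1.0)  # nothing known: the trivial interval


print(most_specific_interval("FlyingObject", "Penguin"))  # (0.0, 0.05)
print(most_specific_interval("FlyingObject", "Bird"))     # (0.9, 1.0)
```

The Penguin constraint wins for Tweety because Penguin comes before Bird in the ancestor chain; the real entailment relation also has to handle constraints that interact rather than simply shadow one another.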

In the next post of this series, I’ll take you through an actual use of Pronto in the life sciences domain.
