Why Reasoning Matters: Explanations (3)

by Kendall Clark

Previously I talked about the most fundamental reasoning service, consistency checking. It’s the most fundamental because every other reasoning service, ultimately, is performed by doing one or more consistency checks. I undersold the utility of consistency checking last time intentionally, because saying it’s key to all the other things one can do with automated reasoning isn’t very interesting before you know about some other things automated reasoning can do.
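To make the "everything reduces to consistency checks" claim concrete, here is a minimal sketch in Python. It assumes a toy propositional knowledge base rather than OWL, and it is not how Pellet is implemented; the function names `consistent` and `entails` and the clause encoding are illustrative assumptions. The idea is that a knowledge base entails a statement exactly when adding the statement's negation makes the knowledge base inconsistent.

```python
from itertools import product

# A clause is a list of (atom, polarity) literals; a knowledge base is a list
# of clauses that must all hold. This encoding is a toy assumption.

def consistent(clauses, atoms):
    """Return True if some truth assignment satisfies every clause."""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[atom] == polarity for atom, polarity in clause)
               for clause in clauses):
            return True
    return False

def entails(clauses, literal, atoms):
    """Entailment via one consistency check: KB entails L iff KB + not-L is inconsistent."""
    atom, polarity = literal
    negated = [(atom, not polarity)]
    return not consistent(clauses + [negated], atoms)

atoms = ["Receiver", "Component", "Speaker"]
kb = [
    [("Receiver", False), ("Component", True)],  # Receiver implies Component
    [("Receiver", True)],                        # this thing is a Receiver
]

print(entails(kb, ("Component", True), atoms))  # True
print(entails(kb, ("Speaker", True), atoms))    # False
```

The brute-force loop over all truth assignments is only workable for a handful of atoms; it is here to show the reduction, not to suggest a practical decision procedure.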

To recall, consistency checking itself is useful in, for example, data integration projects because it eliminates run-time and query-time errors caused by conceptual or modeling issues, and it does so at design time, with certain guarantees (modulo bugs) about soundness and completeness.

(A neglectable aside: in automated reasoning, “sound and complete” comes up a lot. In principle, a reasoner is “sound and complete” if, but only if, it uses a decision procedure (i.e., a kind of algorithm) that is sound and complete, which means that it is guaranteed to give no wrong answers (“sound”) and to give all the answers there are to give (“complete”). I say “in principle” because automated reasoners have bugs just like any complex software. I say “guaranteed” because someone has proven the soundness or completeness, or both, of the decision procedure. Unsound automated reasoners are not, as far as I know, very interesting for real apps. But when designing an automated reasoner, people often trade completeness for efficiency by implementing an incomplete decision procedure: there are answers that such a reasoner can never provide, by design. But at least it doesn’t provide them quickly!)
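The two definitions in that aside boil down to two subset checks. Here is a small Python sketch, assuming we could somehow list every statement that truly follows from a knowledge base (in practice we can't, which is why these properties are proven about the decision procedure rather than tested). The names `is_sound`, `is_complete`, and the example statements are made up for illustration.

```python
def is_sound(answers, entailed):
    """Sound: every answer the reasoner gives is actually entailed."""
    return answers <= entailed

def is_complete(answers, entailed):
    """Complete: every entailed statement shows up among the answers."""
    return entailed <= answers

# Hypothetical ground truth and two hypothetical reasoners' outputs.
entailed = {"Receiver is a Component", "Receiver is AudioEquipment"}
fast_but_incomplete = {"Receiver is a Component"}
unsound = entailed | {"Receiver is a Speaker"}

print(is_sound(fast_but_incomplete, entailed), is_complete(fast_but_incomplete, entailed))  # True False
print(is_sound(unsound, entailed), is_complete(unsound, entailed))                          # False True
```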

By way of comparison, Linked Data and RDF triple store vendors try to make a virtue of their vice: they can’t do consistency checking, so they claim no one would ever want or need to do it. As to this tendency, I blame no one. I’d say the same thing, too!

Explanations

Now I want to talk about the utility of another reasoning service, which in Pellet we call “explanation, debugging, and repair”—in this post, I’ll focus only on explanation, saving the others for another day.

The utility of explanation, in a nutshell, is that the reasoner can not only create new knowledge from existing knowledge by means of inference, but it can also—and this is the cool part—tell you how it reached the conclusion, or inference, that it reached.

So Pellet can derive new knowledge and then explain how it derived that new knowledge. It explains its inferences by providing the minimal set of facts or other knowledge necessary to draw the inference.
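Here is a rough sketch of what "the minimal set of facts necessary to draw the inference" can mean, in Python. It assumes a toy knowledge base made only of subclass axioms, and it uses a generic shrink-by-deletion technique: drop one axiom at a time and keep it out if the inference still goes through. That is not a description of Pellet's actual algorithm; the class hierarchy and the names `entails_subclass` and `explain` are assumptions for illustration.

```python
def entails_subclass(axioms, sub, sup):
    """True if 'sub is a subclass of sup' follows from (sub, sup) pairs by transitivity."""
    seen, frontier = {sub}, [sub]
    while frontier:
        current = frontier.pop()
        if current == sup:
            return True
        for a, b in axioms:
            if a == current and b not in seen:
                seen.add(b)
                frontier.append(b)
    return False

def explain(axioms, sub, sup):
    """Shrink the axioms to a minimal subset that still entails the inference."""
    if not entails_subclass(axioms, sub, sup):
        return None  # nothing to explain: the inference does not hold
    justification = list(axioms)
    for axiom in list(axioms):
        trial = [a for a in justification if a != axiom]
        if entails_subclass(trial, sub, sup):
            justification = trial  # the inference survives without this axiom
    return justification

# Hypothetical stereo-equipment hierarchy.
axioms = [
    ("Receiver", "Component"),
    ("Component", "AudioEquipment"),
    ("Speaker", "Component"),
    ("Interconnect", "AudioEquipment"),
]

print(explain(axioms, "Receiver", "AudioEquipment"))
# [('Receiver', 'Component'), ('Component', 'AudioEquipment')]
```

The result is minimal in the sense that removing any one of the returned axioms breaks the inference. When an inference can be reached in several ways, this sketch returns just one of the possible explanations.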

Think about a perfectly ordinary conversation between two people. Bob tells Nancy a lot of stuff about the physics of baseball, how curve balls work, and so on. Nancy thinks about what Bob’s said and infers some other stuff based on it; for instance, maybe she draws some conclusions about how split-finger fastballs work based on how 3-seam fastballs work. So Nancy tells Bob these new bits of knowledge she’s inferred from what he told her about baseball physics. Then Bob asks Nancy to tell him why she reached those new bits. And, in response, Nancy picks out just the bits that she relied on and tells Bob about them.

This is damn useful because, while people are themselves reasoners by virtue of their nature, they tend to be very skeptical about machine or automated reasoning. And with good reason! Explanations provide a means for people to, basically, check the computer’s work.

Configuration Management

One of the ways we’ve used Pellet for customers is to build a configuration management engine on top of Pellet and ontologies. Basically you make an ontology describing some problem domain, say, stereo equipment. The ontology describes what things there are in this part of the world (AudioEquipment, Receiver, Interconnect, Speaker, Component, AcceptableUsePolicy, etc.) and what relationships between those things are possible (connect_to, provides_output, etc.). The application takes some input from the user, say, a preferred stereo system buildout, and determines whether that’s a legal configuration by means of some of its built-in reasoning services.

So far, so cool. But it’s one thing to tell a user that her preferred buildout isn’t legal; it’s a much better thing to tell her why it’s not legal (explanation) and to suggest changes she can make (repair) that will make it legal. Obviously this capability exists on every PC and auto manufacturer’s web site, among many other examples. I will save for another day a discussion of why you might want to solve this problem with an automated reasoner and OWL, rather than, say, Java or Python code.

But there’s another reason why explanation is so useful: not only does it let people check the computer’s work, but it also allows people to understand mistakes they’ve made. In other words, Pellet’s explanation service works for every inference it makes, including the inference that something is inconsistent. Pellet doesn’t just tell you that you’ve made a mistake, it shows you which bits of what you’ve said, and what it inferred, together cause the mistake.
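The same idea works for explaining an inconsistency. The Python sketch below assumes a toy language with only three kinds of statement (subclass, disjointness, and type assertions) and shrinks an inconsistent set down to the statements that jointly cause the clash. The statement encoding, the individuals `unit1` and `unit2`, and the function names are assumptions for illustration, not Pellet's API or algorithm.

```python
def inferred_types(statements, individual):
    """All classes an individual belongs to, following subclass statements."""
    types = {s[2] for s in statements if s[0] == "type" and s[1] == individual}
    changed = True
    while changed:
        changed = False
        for kind, sub, sup in statements:
            if kind == "subclass" and sub in types and sup not in types:
                types.add(sup)
                changed = True
    return types

def is_consistent(statements):
    """Inconsistent if any individual ends up in two classes declared disjoint."""
    individuals = {s[1] for s in statements if s[0] == "type"}
    for individual in individuals:
        types = inferred_types(statements, individual)
        for kind, a, b in statements:
            if kind == "disjoint" and a in types and b in types:
                return False
    return True

def explain_inconsistency(statements):
    """Shrink an inconsistent set to a minimal subset that is still inconsistent."""
    if is_consistent(statements):
        return []
    culprit = list(statements)
    for statement in list(statements):
        trial = [s for s in culprit if s != statement]
        if not is_consistent(trial):
            culprit = trial  # still broken without it, so it is not needed
    return culprit

statements = [
    ("subclass", "Receiver", "Component"),
    ("subclass", "Speaker", "Component"),
    ("disjoint", "Receiver", "Speaker"),
    ("type", "unit1", "Receiver"),
    ("type", "unit1", "Speaker"),
    ("type", "unit2", "Speaker"),
]

for statement in explain_inconsistency(statements):
    print(statement)
# ('disjoint', 'Receiver', 'Speaker')
# ('type', 'unit1', 'Receiver')
# ('type', 'unit1', 'Speaker')
```

Everything about `unit2` and the subclass statements drops out, which is the point: the user sees only the bits that together cause the mistake.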

A Comparison with RDF & Linked Data

By the way, Linked Data and RDF triple stores don’t do consistency checking and they don’t do explanations, either. Why not? Since the kinds of inferences one can draw in RDF are mostly trivial or meaningless, there’s nothing that needs to be explained. Anyone can understand the most complex RDF inference by thinking about it for no more than 2.7 seconds. That’s a scientific fact! :)

In cases where you don’t care about stuff like explanations, debugging and automatic repair, RDF can be a good choice, depending on a lot of other factors. But in cases where that stuff really matters, you need to think about automated reasoning languages and systems, like OWL and Pellet.

Viewing 5 Comments

    Kendall
    I've been giving a talk lately called the "Two Towers" where I discuss the issue of the different views of ontology. I pretty strongly reject your contention that everyone would want sound and complete answers (or consistency) if they could only do it. The whole field of heuristic programming shows that to be untrue - rather there are important problems where it is needed, and important ones where it isn't. Further, the requirements needed to get consistency to be meaningful may be restrictive in many cases.
    In the talk I point out that both the traditional logic view and the more "small ontologies near the data" view are producing ROI, and the key is to figure out the appropriate things to use for the applications you are building. Your contention that

    "By way of comparison, Linked Data and RDF triple store vendors try to make virtue of their vice—they can’t do consistency checking, so they claim no one would ever want or need to do it. As to this tendency, I blame no one. I’d say the same thing, too!"

    strikes me as just throwing more confusion onto the fire, and your last paragraph, which actually makes an important point, does it in a pretty condescending way.

    Maybe if you'd like people to stop bad-mouthing logic, you could contribute by making the differences clear (as you're doing) without misrepresenting the alternate view or denigrating the alternate approach. Frankly, I think we all win more if we make the different strengths clearer than if we add further confusion to already blurred messaging.
    -JH
    p.s. Feel free to check out the slides of my talk at http://www.cs.rpi.edu/~hendler/presentations/Se...

    In the talk I point out that both the traditional logic view and the more “small ontologies near the data” view are producing ROI, and the key is to figure out the appropriate things to use for the applications you are building.


    Indeed, this is a matter of engineering tradeoffs and requirements analysis -- boring, sure, but true nonetheless. I thought I'd said this clearly in the piece.

    Re: the utility of small-o approach--see, I read yr slides right after SemTech!--we've built two production systems for NASA that do exactly that. Customers are happy with those systems, which we built vastly cheaper than off-the-shelf stuff was going to cost. Yay for ROI!

    We also develop a Linked Data Browser, jSpace (which Linked Data people ignore since, I suppose, it's not *also* a web browser), so we really do understand that stuff.

    As to the rest of yr objections, I take them to come down to style, where tastes and opinions diverge. Our mileages, as they say, obviously vary! :)

    I very much dislike the contention that a deductive reasoner like Pellet can "create new knowledge"; it can recombine, specialize and apply, but it does not create knowledge in any meaningful sense of the word.

    Valentin, you make a fair point. I probably wouldn't put it just that way in some other contexts, though I think it's a defensible claim. We might need to say "new knowledge relative to an existing set of beliefs". It's certainly the case in bioinformatics that previously unknown knowledge was discovered using deductive reasoning. I think that counts as "new knowledge" or close enough.

    Valentin, first, one way I personally make new knowledge is by recombining, specializing, and applying.

    Second, this is an old puzzle. Poke around for the paradox of analysis.
 
