This week we are digging deeper into the published work on Language Computer Corporation’s QA system (they have an online demo on their home page) by looking at COGEX: A Logic Prover for Question Answering by Moldovan, Clark, Harabagiu and Maiorano. If you recall from a few weeks ago, the LCC system was the top performer at TREC 2003 (see last week’s POTW for a discussion of the overall system).
In this paper, LCC demonstrates that using a theorem-proving strategy to validate answers found through traditional QA means can improve performance by over 30% on TREC-style questions. To support this claim, LCC takes us through, at a high level, the steps involved in creating a theorem prover for use in QA.
Section 1 introduces the problem at hand, why they think COGEX is useful for QA, and what challenges had to be overcome to implement it. To summarize, COGEX creates logical representations of the question, the candidate answers, and other world and language knowledge, and uses them to re-rank candidate answers and remove incorrect ones. It should be noted that it doesn’t identify the candidates to begin with; it just “proves” they are correct. The technical challenges fall into two main areas: creating the logical representation of the necessary world knowledge and other inputs, and dealing with the high failure rates and long processing time required to apply the prover. (I wonder if LCC has overcome these for their online demo, or if they don’t use the theorem prover in the demo.)
Section 2 outlines how the theorem prover is integrated into the QA system. They have a module called the Axiom Builder which takes in various free text inputs and WordNet glosses and builds logical representations of these. On a side note, I wonder about the storage and retrieval algorithms used for these. I would imagine one needs a way of quickly looking up the appropriate pieces, especially when it comes to the WordNet glosses. All of this input is fed into the justification module, which tries to prove the theorem, relying on some relaxation techniques if the proof fails.
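To make the "prove, then relax if the proof fails" idea a bit more concrete, here is a toy sketch in Python. It is not LCC's justification module: the function name `prove_with_relaxation`, the flat penalty of 0.2 per dropped predicate, and the "proof" being a simple lookup in a set of ground facts are all assumptions made for illustration.

```python
def prove_with_relaxation(question_preds, facts, penalty=0.2):
    """Toy stand-in for a justification step (hypothetical, not COGEX itself).

    A question counts as "proved" when every one of its predicates appears
    in the fact set. If that fails, drop one unprovable predicate, lower the
    score by a fixed penalty (an assumed scheme), and try again.
    """
    remaining = list(question_preds)
    score = 1.0
    while remaining:
        if all(p in facts for p in remaining):
            return score, remaining
        # Relaxation step: discard the first predicate we cannot prove.
        for p in remaining:
            if p not in facts:
                remaining.remove(p)
                break
        score = max(0.0, round(score - penalty, 10))
    # Everything was dropped, so nothing was actually proved.
    return 0.0, []


facts = {"organization(ncsa)", "develop(ncsa,mosaic)"}
question = ["organization(ncsa)", "develop(ncsa,mosaic)", "american(ncsa)"]
score, proved = prove_with_relaxation(question, facts)
print(score, proved)
```

Here the unprovable `american(ncsa)` predicate is dropped once, so the proof goes through with a reduced score rather than failing outright.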
Getting into the guts of the program, sections 3 and 4 discuss in detail how to create logical representations of the text and the world knowledge incorporated into the system. For the text, they use what they call a “logic form” which is a middle ground between a syntactic parse and a deep semantic representation. In other words, they capture the various pieces of the structure of the text that they think are important, in this case: “(1) syntactic subjects, (2) syntactic objects, (3) prepositional attachments, (4) complex nominals, and (5) adjectival/adverbial adjuncts” (page 2). To build the logic form, they map the words to predicates, using the base form plus part of speech info. Nouns are supplemented with a tag that allows them to be referred to from other predicates. From here, they use grammar rules to construct the actual representation. Although there are a great many grammar rules, they observe that the ten most frequently used rules cover more than 90% of the cases that they need in WordNet. If I understand this correctly, they are saying common grammar rules such as Subject Verb Object, etc. are sufficient for most of their cases.
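A minimal sketch of the word-to-predicate mapping described above might look like the following. The `Token` class, the `logic_form` function, and the exact output syntax are my own assumptions, loosely modeled on the "base form plus part of speech" idea. It handles only a single Subject Verb Object pattern and assumes the subject, verb and object have already been identified.

```python
from dataclasses import dataclass


@dataclass
class Token:
    lemma: str  # base form of the word
    pos: str    # part-of-speech tag, e.g. "NN" or "VB"


def logic_form(subject: Token, verb: Token, obj: Token) -> str:
    """Build a toy logic form for one Subject Verb Object clause.

    Each noun gets its own variable (x1, x2) so that other predicates can
    refer to it; the verb gets an event variable (e1) plus the variables
    of its subject and object.
    """
    predicates = [
        f"{subject.lemma}_{subject.pos}(x1)",
        f"{verb.lemma}_{verb.pos}(e1,x1,x2)",
        f"{obj.lemma}_{obj.pos}(x2)",
    ]
    return " & ".join(predicates)


lf = logic_form(Token("company", "NN"), Token("develop", "VB"), Token("browser", "NN"))
print(lf)  # company_NN(x1) & develop_VB(e1,x1,x2) & browser_NN(x2)
```

The shared variables are what tie the pieces together: `x1` in the verb predicate points back at the company, and `x2` at the browser.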
The next part of the paper details how to use WordNet glosses to build a bank of world knowledge for use in the theorem prover by developing lexical chains linking synsets across hierarchies. These chains are useful for improving passage retrieval and extraction as well. The chains codify topically related concepts and are used in the proving algorithm. Several examples are given at the end of section 4 for those interested in seeing more details.
Before getting to how the theorem prover works in section 6, section 5 discusses how NLP Axioms are built for the various language nuggets they are interested in, such as complex nominals, possessives, etc. One thing to note is these axioms have strengths associated with them that play a role in scoring the answers in the prover. Weaker axioms result in weaker confidence in a given proof.
Sections 6 and 7 discuss how the axioms are used and show an in-depth example of the process in action. Reading the syntax is a bit mind numbing, but essentially it relies on a few key pieces: hyperresolution, paramodulation and proof by contradiction. See the subsection of section 6 on Inference Rules for definitions of hyperresolution and paramodulation. In my naive understanding, they are ways of reducing and substituting to make the proof more manageable. Proof by contradiction is simply a way of proving something by assuming it is incorrect and then logically deducing that if it were incorrect then some known fact would be wrong. In the example of section 7, they assume there is NOT a company/organization that developed the Mosaic Internet browser, which is contradicted by the fact that their passage states that NCSA developed Mosaic. One interesting thing to note here is that this type of proof would not work well for subjective things or on passages that are lying or incorrect.
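For anyone who wants to see proof by contradiction in action, here is a small Python sketch using plain ground (propositional) resolution. To be clear, this is a much simpler technique than the hyperresolution and paramodulation used in the paper: there are no variables or unification, and the predicate strings and function names are made up for the example. The negated question has been grounded by hand for the one candidate, NCSA.

```python
from itertools import combinations


def negate(lit):
    """Flip a literal: 'p' <-> '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents


def refutes(clauses):
    """True if the empty clause is derivable, i.e. the clauses contradict."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolve(c1, c2):
                if not r:  # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= known:  # nothing new can be derived
            return False
        known |= new


# Facts taken from the passage.
facts = [
    frozenset({"organization(ncsa)"}),
    frozenset({"develop(ncsa,mosaic)"}),
]
# Negated question, grounded for NCSA: "NCSA is not an organization
# that developed Mosaic".
negated_goal = frozenset({"~organization(ncsa)", "~develop(ncsa,mosaic)"})

proved = refutes(facts + [negated_goal])
print(proved)  # True
```

Since the facts plus the negated question lead to a contradiction, the candidate answer is "proved". Note that the facts on their own are consistent, so `refutes(facts)` returns `False`.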
Section 8 discusses results. Namely, they claim a 30% improvement in correct answers for TREC 2002. These are quite significant results, with the trade-off lying in the time and effort it takes to both codify the prover and actually run it on real systems. This is often the trade-off in NLP systems in the current state of things. It seems you can have either fast or good, but rarely can you have both when it comes to deep, complex problems such as QA.