Remaining Sections of Kleinberg
Sections 5 through 7 of Authoritative Sources in a Hyperlinked Environment cover some more applications of the hubs and authorities approach and then wrap up with the conclusion.
Section 5 examines the quality of the authority measure by comparing the results of the Kleinberg algorithm with those of some of the searchable hierarchies that exist on the web (Yahoo, others that are long gone) and shows that this approach does pretty well at the task.
Section 6 then covers one of the problems with the approach, which Kleinberg labels diffusion. Diffusion happens when the algorithm produces results that are not on the original topic. In most cases, a specific query drifts toward results on a broader, more general topic. Kleinberg’s example is the query “medical conferences”, which yields results that are mainly about medicine in general. A proposed solution is to reintroduce the query after the algorithm has run, using the terms of the query to rerank the pages in the result set. Section 6 gives a fair amount of detail on how this reranking can take place.
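To make the reranking idea a bit more concrete, here is a minimal Python sketch. It is not the procedure from the paper, just one simple way the idea could look: each result's authority score is multiplied by the fraction of query terms that appear in its text. The function names (term_overlap, rerank), the scoring formula, and the sample results and scores are all made up for illustration.

```python
def term_overlap(query, text):
    """Fraction of distinct query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(results, query):
    """Rerank (text, authority_score) pairs by authority * query-term overlap.

    The multiplicative combination is an assumption for this sketch,
    not the weighting described in the paper.
    """
    scored = [
        (text, authority * term_overlap(query, text))
        for text, authority in results
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical output of the hubs-and-authorities step: the most
# authoritative page has drifted to the broader topic of medicine.
results = [
    ("general medicine resources", 0.9),
    ("upcoming medical conferences calendar", 0.6),
    ("medical journals index", 0.7),
]

reranked = rerank(results, "medical conferences")
for text, score in reranked:
    print(text, score)
```

With this toy data the page that actually mentions both query terms moves to the top, and the general medicine page, which matches neither term exactly, drops to the bottom. A real system would want stemming and a less blunt weighting, since exact word matching treats “medicine” and “medical” as unrelated.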
Well, that wraps up another week. I hope people are finding this useful. I know I am (even if I cheat a little bit like I did tonight and gloss over the details a little more). If anyone is interested in discussing a paper on a particular topic, please feel free to leave a comment suggesting one. For now, I am going to do one or two more weeks on graph-based approaches to NLP and then I’m going to start looking at some Question Answering papers.
NLP, Graph Theory, Kleinberg, diffusion, information retrieval, IR

