Tuesday, May 22, 2007

SemTech conference session notes: related search using semantics

Semantic Technology Conference session: Related Search using Semantics: A Case Study from CNET, Tim Musgrove. CNET is the 10th largest global web site, with many web brands (news.com, shopping.com, etc). Collaborative filtering (CF) recommendations can throw up anomalous results (e.g. see also 'ipod' when searching for 'hp laptop'). The click-through rate for CF alternative searches is about 3%, but coverage is only about 4% of incoming searches. The problem is lack of data for statistical approaches. The whole query has to be used, since word-by-word query decomposition introduces too much ambiguity.

Word sense disambiguation (e.g. mapping "ultralight" to "ultraportable") is an alternative to CF. Case study with CNET: integration took one engineer one day. It has been up on the CNET site for 10 days, so these are only early, untuned results. Results: CF coverage 3.9%, semantic equivalence (SE) coverage 19.1%. Click-through for SE is only slightly lower than for CF, but the best click-through and coverage come from combining the two methods.
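To make the "combined methods" idea concrete, here is a minimal sketch of my own, not anything shown in the talk. It assumes two lookup tables keyed on the whole normalized query: one of CF suggestions and one of semantic-equivalence suggestions. The table contents and the names `CF_RELATED`, `SEMANTIC_EQUIV`, `related_searches` and `coverage` are all hypothetical; the real CNET system was not described at this level of detail.

```python
from typing import Dict, List

# Hypothetical toy data, keyed on the whole query (not individual words).
CF_RELATED: Dict[str, List[str]] = {
    "hp laptop": ["hp notebook", "dell laptop"],
}

# Hypothetical semantic-equivalence table, also keyed on the whole query.
SEMANTIC_EQUIV: Dict[str, List[str]] = {
    "ultralight laptop": ["ultraportable laptop"],
    "hp laptop": ["hp notebook", "hewlett-packard laptop"],
}


def related_searches(query: str, limit: int = 5) -> List[str]:
    """Merge CF and semantic suggestions for a query, CF first, no duplicates."""
    key = " ".join(query.lower().split())  # normalize case and whitespace
    merged: List[str] = []
    for source in (CF_RELATED, SEMANTIC_EQUIV):
        for suggestion in source.get(key, []):
            if suggestion not in merged:
                merged.append(suggestion)
    return merged[:limit]


def coverage(queries: List[str]) -> float:
    """Fraction of queries for which at least one suggestion exists."""
    if not queries:
        return 0.0
    covered = sum(1 for q in queries if related_searches(q))
    return covered / len(queries)
```

With this toy data, `related_searches("hp laptop")` merges both sources, while `"ultralight laptop"` is covered only by the semantic table, which is the sense in which combining the methods raises coverage.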

Adding named entities to the lexical background knowledge could increase coverage and allow the quality threshold to be lowered. Next step: expand search using hypernyms, hyponyms, etc.
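A rough illustration of what hypernym/hyponym expansion might look like, again my own sketch rather than anything from the session. It assumes a tiny hand-made is-a table (`HYPERNYM`, hypothetical) and swaps one term at a time for its broader or narrower neighbour. It does no sense disambiguation, which a real system would need given the ambiguity problem noted above.

```python
from typing import Dict, List

# Hypothetical is-a taxonomy: term -> its broader term (hypernym).
HYPERNYM: Dict[str, str] = {
    "ultraportable": "laptop",
    "laptop": "computer",
    "desktop": "computer",
}


def hypernyms(term: str) -> List[str]:
    """Broader terms: the direct parent in the toy taxonomy, if any."""
    parent = HYPERNYM.get(term)
    return [parent] if parent else []


def hyponyms(term: str) -> List[str]:
    """Narrower terms: all direct children in the toy taxonomy, sorted."""
    return sorted(child for child, parent in HYPERNYM.items() if parent == term)


def expand_query(query: str) -> List[str]:
    """Generate alternative queries by replacing one term with a hypernym or hyponym."""
    words = query.lower().split()
    expansions: List[str] = []
    for i, word in enumerate(words):
        for alt in hypernyms(word) + hyponyms(word):
            candidate = " ".join(words[:i] + [alt] + words[i + 1:])
            if candidate not in expansions:
                expansions.append(candidate)
    return expansions
```

For example, `expand_query("hp laptop")` yields one broader and one narrower alternative from this table; terms missing from the taxonomy (such as "hp") are simply left alone.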

