Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/www2008/paper/592> ?p ?o. }
Showing items 1 to 12 of 12 with 100 items per page.
- 592 creator bin-tan.
- 592 creator fuchun-peng.
- 592 type InProceedings.
- 592 label "Unsupervised Query Segmentation using Generative Language Models and Wikipedia".
- 592 sameAs 592.
- 592 abstract "In this paper, we propose a novel unsupervised approach to query segmentation, which is an important task in Web search. We use a generative query model to recover a query’s underlying concepts that compose its original segmented form. The model’s parameters are estimated using an expectation-maximization (EM) algorithm, optimizing the minimum description length objective function on a partial corpus that is specific to the query. To augment this unsupervised learning, we incorporate evidence from Wikipedia. Experiments show that our approach dramatically improves performance over the traditional approach based on mutual information, and produces comparable results with a supervised method. In particular, the basic generative language model contributes a 7.4% improvement over the mutual information based method (measured by segment F1 on the Intersection test set). EM optimization further improves the performance by 14.3%. Additional knowledge from Wikipedia provides another improvement of 24.3%, adding up to a total of 46% improvement (from 0.530 to 0.774).".
- 592 hasAuthorList authorList.
- 592 hasTopic World_Wide_Web.
- 592 isPartOf proceedings.
- 592 keyword "content discovery".
- 592 keyword "query segmentation".
- 592 title "Unsupervised Query Segmentation using Generative Language Models and Wikipedia".
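
The pattern in the header fixes the subject and leaves ?p and ?o as variables, so every triple about paper 592 matches. As a minimal sketch of that matching, assuming the short labels shown in the listing stand in for the full IRIs (which the page does not display), a few of the triples can be held in memory and filtered. The names TRIPLES, match and group_by_predicate are hypothetical helpers for illustration, not part of any ScholarlyData API.

```python
from collections import defaultdict

# A subset of the triples from the listing, using the short labels shown on the page.
# Assumption: the labels stand in for full IRIs, which are not reconstructed here.
TRIPLES = [
    ("592", "creator", "bin-tan"),
    ("592", "creator", "fuchun-peng"),
    ("592", "type", "InProceedings"),
    ("592", "hasTopic", "World_Wide_Web"),
    ("592", "keyword", "content discovery"),
    ("592", "keyword", "query segmentation"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a variable such as ?p or ?o."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def group_by_predicate(triples):
    """Collect object values per predicate, preserving listing order."""
    grouped = defaultdict(list)
    for _, predicate, obj in triples:
        grouped[predicate].append(obj)
    return dict(grouped)


if __name__ == "__main__":
    # Equivalent of the pattern { 592 ?p ?o } restricted to this subset.
    for triple in match(TRIPLES, s="592"):
        print(triple)
    print(group_by_predicate(TRIPLES)["keyword"])
```

Leaving a position as None mirrors an unbound variable in the triple pattern; binding p as well narrows the result to a single property, for example the two creator entries.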