Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/www2012/paper/329> ?p ?o. }
Showing items 1 to 16 of 16 with 100 items per page.
- 329 creator daniel-herzig.
- 329 creator thanh-tran.
- 329 type InProceedings.
- 329 label "Heterogeneous Web Data Search Using Relevance-based On The Fly Data Integration".
- 329 sameAs 329.
- 329 abstract "Searching over heterogeneous structured data on the Web is challenging due to vocabulary and structure mismatches among different data sources. In this paper, we study two main directions. The first one relies on data integration to mediate these mismatches through upfront computation of mappings, based on which queries are rewritten to fit the vocabulary and structure of individual sources. The other extreme is keyword search, which does not require any upfront investment, but ignores structure information that can be exploited for more effective search. Then, we present a hybrid approach, which assumes only one single structured query that adheres to the vocabulary of just one of the sources. However, this so-called seed query is not rewritten to obtain structured queries for individual sources, but processed as a keyword query. For more effective keyword search that also takes structure information into account, we construct an entity relevance model (ERM), which captures both the content and structure of the seed query results. On the fly, this ERM model is then aligned with keyword search results retrieved from other sources to bridge vocabulary mismatches, and finally used to rank these results. Through experiments using large-scale real world datasets, we study these three different strategies. The outcomes suggest that upfront investment in data integration leads to higher search effectiveness compared to keyword search, and that the hybrid strategy clearly provides the best results.".
- 329 hasAuthorList authorList.
- 329 isPartOf proceedings.
- 329 keyword "Ad-hoc data integration".
- 329 keyword "Information Retrieval".
- 329 keyword "Query processing".
- 329 keyword "Querying multiple datasets".
- 329 keyword "Search on RDF".
- 329 keyword "structured web data".
- 329 keyword "vertical search".
- 329 title "Heterogeneous Web Data Search Using Relevance-based On The Fly Data Integration".
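
The listing above is the result of matching the triple pattern { <subject> ?p ?o. }, where ?p and ?o are variables. As a minimal sketch of how such a pattern selects triples, the Python below matches patterns against a small in-memory sample. The sample copies a few of the shortened triples shown above; the names TRIPLES, matches, select, and group_by_predicate are hypothetical helpers for illustration and are not part of ScholarlyData or any library.

```python
from collections import defaultdict

# Hypothetical sample: a few shortened triples copied from the listing above.
TRIPLES = [
    ("329", "creator", "daniel-herzig"),
    ("329", "creator", "thanh-tran"),
    ("329", "type", "InProceedings"),
    ("329", "keyword", "Information Retrieval"),
    ("329", "keyword", "Query processing"),
]


def matches(triple, pattern):
    """Return True if the triple fits the pattern.

    A None entry in the pattern plays the role of a variable (?p, ?o)
    and matches any value in that position.
    """
    return all(p is None or p == t for t, p in zip(triple, pattern))


def select(triples, s=None, p=None, o=None):
    """Return all triples that match the (s, p, o) pattern, in input order."""
    return [t for t in triples if matches(t, (s, p, o))]


def group_by_predicate(triples):
    """Collect the objects of the given triples under their predicate."""
    grouped = defaultdict(list)
    for _, predicate, obj in triples:
        grouped[predicate].append(obj)
    return dict(grouped)
```

Calling select(TRIPLES, s="329") corresponds to the pattern in the heading (fixed subject, variable predicate and object), and group_by_predicate on that result gathers multi-valued predicates such as creator and keyword into lists.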