Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/eswc2010/paper/social_web/1> ?p ?o. }
Showing items 1 to 13 of 13 with 100 items per page.
- 1 creator dimitrios-skoutas.
- 1 creator ekaterini-ioannou.
- 1 creator odysseas-papapetrou.
- 1 creator wolfgang-nejdl.
- 1 type InProceedings.
- 1 label "Efficient Semantic-Aware Detection of Near Duplicate Resources".
- 1 sameAs 1.
- 1 abstract "Efficiently detecting near duplicate resources is an important task when integrating information from various sources and applications. Once detected, near duplicate resources can be grouped together, merged, or removed, in order to avoid repetition and redundancy, and to increase the diversity in the information provided to the user. In this paper, we introduce an approach for efficient semantic-aware near duplicate detection, by combining indexing schemes for similarity search with the RDF representations of the resources. We provide a probabilistic analysis for the correctness of the suggested approach, which allows applications to configure it for satisfying their specific quality requirements. Our experimental evaluation on the RDF descriptions of real-world news articles from various news agencies, demonstrates the efficiency and the effectiveness of our approach.".
- 1 hasAuthorList authorList.
- 1 isPartOf proceedings.
- 1 keyword "data integration".
- 1 keyword "near duplicate detection".
- 1 title "Efficient Semantic-Aware Detection of Near Duplicate Resources".
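
Each row above is one match for the pattern `?p ?o` on the paper's subject, shown as a short subject, predicate, and object. As a minimal sketch of how such a listing could be read back into triples, assuming the rows keep the `- subject predicate object.` layout shown here, the Python below parses a few rows and groups the objects by predicate. The names `ROW_RE`, `parse_rows`, and `group_by_predicate` are hypothetical helpers, not part of ScholarlyData, and the short names are display labels rather than full IRIs.

```python
import re
from collections import defaultdict

# A few rows copied from the listing above (not the full result set).
ROWS = """\
- 1 creator dimitrios-skoutas.
- 1 creator ekaterini-ioannou.
- 1 type InProceedings.
- 1 keyword "data integration".
- 1 keyword "near duplicate detection".
"""

# Assumed row layout: "- <subject> <predicate> <object>." where the object
# is either a quoted literal or a bare short name.
ROW_RE = re.compile(r'^- (\S+) (\S+) (.+)\.$')


def parse_rows(text):
    """Return a list of (subject, predicate, object) tuples."""
    triples = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            subject, predicate, obj = match.groups()
            # Drop the surrounding quotes of literal values.
            triples.append((subject, predicate, obj.strip('"')))
    return triples


def group_by_predicate(triples):
    """Map each predicate to the list of its objects, in listing order."""
    grouped = defaultdict(list)
    for _, predicate, obj in triples:
        grouped[predicate].append(obj)
    return dict(grouped)


triples = parse_rows(ROWS)
by_predicate = group_by_predicate(triples)
```

Grouping by predicate makes multi-valued properties such as `creator` and `keyword` easy to inspect. Note that this grouping does not carry the author order, which the listing exposes separately through `hasAuthorList`.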