Data Portal @ linkeddatafragments.org

ScholarlyData

Matches in ScholarlyData for { ?s ?p "Acquiring structured data from wikis is a problem of increasing interest in knowledge engineering and semantic web. In fact, collaboratively developed resources are growing in time, have high quality and are constantly updated. A very promising approach to this aim is extracting thesauri from wikis. A thesaurus is a work that lists words grouped together according to similarity of meaning, generally organized by synonyms. Thesauri are very useful for a large variety of applications, including information retrieval and knowledge engineering. Most information in wikis is expressed by means of natural language texts and internal links among web pages, the so called wikilinks. In this paper, an innovative method for inducing thesauri from Wikipedia is presented. It leverages on the Wikipedia structure to extract concepts and terms denoting them, obtaining a thesaurus that can be profitably used into applications. To boost precision and avoid noise, we apply word sense disambiguation techniques for lexical substitution and latent semantic analysis. In the paper, we show how to represent the extracted results following an RDF/OWL schema that can be published in the semantic web.". }

Showing items 1 to 1 of 1 with 100 items per page.

33 abstract "Acquiring structured data from wikis is a problem of increasing interest in knowledge engineering and semantic web. In fact, collaboratively developed resources are growing in time, have high quality and are constantly updated. A very promising approach to this aim is extracting thesauri from wikis. A thesaurus is a work that lists words grouped together according to similarity of meaning, generally organized by synonyms. Thesauri are very useful for a large variety of applications, including information retrieval and knowledge engineering. Most information in wikis is expressed by means of natural language texts and internal links among web pages, the so called wikilinks. In this paper, an innovative method for inducing thesauri from Wikipedia is presented. It leverages on the Wikipedia structure to extract concepts and terms denoting them, obtaining a thesaurus that can be profitably used into applications. To boost precision and avoid noise, we apply word sense disambiguation techniques for lexical substitution and latent semantic analysis. In the paper, we show how to represent the extracted results following an RDF/OWL schema that can be published in the semantic web.".