Matches in ScholarlyData for { ?s <https://w3id.org/scholarlydata/ontology/conference-ontology.owl#abstract> ?o. }
- 47 abstract "A common limit of most existing methods that manage XML structure information is that they do not handle the semantic meanings that might be associated to the markup tags. In this paper, we study how to map structure information available from XML elements into semantically related concepts in order to support the generation of XML semantic features of XML structural type. For this purpose, we define an unsupervised word sense disambiguation method to select the most appropriate meaning for each element contextually to its respective XML path. The proposed approach exploits conceptual relations provided by a lexical ontology such as WordNet and employs different notions of sense relatedness. Experiments with data from various application domains are discussed, showing that our approach can be effectively used to generate structural semantic features.".
- 5 abstract "We propose a middleware for automated implementation of security protocols for Web services. The proposed middleware consists of two main layers: the communication layer and the service layer. The communication layer is built on the SOAP layer and ensures the implementation of security and service protocols. The service layer provides the discovery of services and the authorization of client applications. In order to provide automated access to the platform services we propose a novel specification of security protocols, consisting of a sequential component, implemented as a WSDL-S specification, and an ontology component, implemented as an OWL specification. Specifications are generated using a set of rules, where information related to implementation properties, such as cryptographic algorithms or key sizes, is provided by the user. The applicability of the proposed middleware is validated by implementing a video surveillance system.".
- 70 abstract "With the increasing storage capacity of personal computing devices, the problems of information overload and information fragmentation become apparent on users' desktops. For the Web, semantic technologies aim at solving this problem by adding a machine-interpretable information layer on top of existing resources, and it has been shown that the application of these technologies to desktop environments is helpful for end users. Certain characteristics of the Semantic Web architecture that are commonly accepted in the Web context, however, are not desirable for desktops; e.g., incomplete information, broken links, or disruption of content and annotations. To overcome these limitations, we propose the sile model, an intermediate data model that combines attributes of the Semantic Web and file systems. This model is intended to be the conceptual foundation of the Semantic Desktop, and to serve as underlying infrastructure on which applications and further services, e.g., virtual file systems, can be built. In this paper, we present the sile model, discuss Semantic Web vocabularies that can be used in the context of this model to annotate desktop data, and analyze the performance of typical operations on a virtual file system implementation that is based on this model.".
- 71 abstract "Ontologies have been developed in many different areas, and many of these ontologies contain overlapping information. Often we would therefore want to be able to use multiple ontologies. To obtain good results, we need to find the relationships between terms in the different ontologies, i.e. we need to align them. Currently, there already exist a number of ontology alignment systems. In these systems an alignment is computed from scratch. However, recently, some situations have occurred where a partial reference alignment is available, i.e. some of the correct mappings between terms are given or have been obtained. In this paper we investigate whether and how a partial reference alignment can be used in ontology alignment. We use partial reference alignments to partition ontologies, to compute similarities between terms and to filter alignment suggestions. We test the approaches on previously developed gold standards and discuss the results.".
- 88 abstract "In this paper we investigate the transformation of OWL-S process models to ISPL - the system description language for MCMAS, a symbolic model checker for multi agent systems. We take the view that services can be considered as agents and service compositions as multi agent systems. We illustrate how atomic and composite processes in OWL-S can be encoded into ISPL using the proposed transformation rules for a restricted set of data types. As an illustrative example, we use an extended version of the BravoAir process model. We formalise certain interesting properties of the example in temporal-epistemic logic and present results from their verification using MCMAS.".
- 96 abstract "Reliable methods to assess the costs and benefits of ontologies are an important instrument to demonstrate the tangible business value of semantic technologies within enterprises, as an argument to encourage their wide-scale adoption. The economic aspects of ontologies have been investigated in previous work of ours. With ONTOCOM we proposed a cost estimation model for ontologies and ontology development projects. This paper revisits this model and presents its latest achievements. We report on a comprehensive calibration of ONTOCOM based on a considerably larger data set of 148 ontology development projects. The calibration used a combination of statistical methods, ranging from preliminary data analysis to regression and Bayes analysis, and resulted in a significant improvement of the prediction quality of up to 50%. In addition, the availability of a representative data set allowed us to identify meaningful directions for customizing the generic cost model along particular types of ontologies, and ontology-like structures such as those specific to the emerging Web 3.0. Last but not least, we developed a software tool that allows ontology development project managers to easily use, adapt, and systematically calibrate the model, thus facilitating its adoption in real-world projects.".
- 13 abstract "In this paper we describe the application of various Semantic Web technologies and their combination with emerging Web2.0 use patterns in the eParticipation domain and show how they have been used in a pilot system for the Regional Government of the Prefecture of Samos, Greece. We present the parts of the system that are based on Semantic Web technology and how they are merged with a Web2.0 philosophy and explain the benefits of this approach, as showcased by applications for annotating, searching, browsing and cross-referencing content in eParticipation communities.".
- 14 abstract "While the Semantic Web is rapidly filling up, appropriate tools for searching it are still at infancy. In this paper we describe an approach that allows humans to access information contained in the Semantic Web according to its semantics and thus to leverage the specific characteristic of this Web. To avoid the ambiguity of natural language queries, users only select already defined attributes organized in facets to build their search queries. The facets are represented as nodes in a graph visualization and can be interactively added and removed by the users in order to produce individual search interfaces. This provides the possibility to generate interfaces in arbitrary complexities and access arbitrary domains. Even multiple and distantly connected facets can be integrated in the graph facilitating the access of information from different user-defined perspectives. Challenges include massive amounts of data, massive semantic relations within the data, highly complex search queries and users’ unfamiliarity with the Semantic Web.".
- 18 abstract "The HermesWiki project is a semantic wiki application on Ancient Greek History. As an e-learning platform, it aims at providing students effective access to concise and reliable domain knowledge, which is especially important for exam preparation. In this paper, we show how semantic technologies introduce new methods of learning by supporting teachers in the creation of contents and students in the personalized identification of required knowledge. Therefore, we give an overview of the project and characterize the semi-formalized content. Additionally, we present several use cases and describe the semantic web techniques that are used to support the application. Furthermore, we report on the user experiences regarding the usefulness and applicability of semantic technologies in this context.".
- 29 abstract "In this paper we present a concrete case study in which semantic technology has been used to enable a territorial innovation. Firstly, we describe a scenario of the ICT regional demand in Trentino, Italy, where the main idea of territorial innovation is based on the so-called innovation tripole. Specifically, we believe that innovation arises as a result of the synergistic coordination and technology transfer among three main innovation stakeholders: (i) final users, bringing domain knowledge, (ii) enterprises and SMEs, bringing knowledge of the market, and (iii) research centers, bringing the latest research results. The tripole is instantiated/generated for innovation projects, and, technically, can be viewed as a competence search (based on metadata) among the key innovation stakeholders for those projects. Secondly, we discuss the implementation of the tripole generation within the TasLab portal, including the use of domain ontologies and thesauri (e.g., Eurovoc), and the indexing and semantic search techniques we have employed. Finally, we provide a discussion on the empirical and economic evaluation of our solution, the results of which are encouraging.".
- 31 abstract "This paper presents an approach for the interactive discovery of relationships between selected elements via the Semantic Web. It fills the gap between algorithms that find relationships in datasets of the Semantic Web and their efficient usage in real-world contexts. Selected elements are first semi-automatically mapped to unique objects of Semantic Web datasets. These datasets are then crawled for relationships, which are presented both in detail and in overview. Interactive features and visual clues allow for sophisticated exploration of the found relationships on different levels. The general process is described and the RelFinder tool is presented as a concrete implementation and proof of concept. The benefits and application potentials are illustrated by a scenario which uses the RelFinder and DBpedia to assist a business analyst in decision-making. Finally, the approach is evaluated in a user study, and discussed and compared with related work.".
- 40 abstract "In this work we introduce “context-driven semantic enrichment”, aiming at annotating textual expressions with the metadata that are semantically relevant for the specific context in which such expressions occur. The great benefit is that context-dependent information (e.g., people’s occupations changing over time) will be enriched with the appropriate semantic tags according to the different contexts. Context-driven semantic enrichment builds upon three core tasks: automated extraction of named entities (i.e., people, locations and organizations) from text; semantic integration techniques for organizing the background knowledge by means of ontologies populated with the extracted named entities; and identification of the mapping between entities found in text and their ontological counterparts using contextual constraints derived from text (e.g., temporal, geographic and thematic constraints). The approach has been fully implemented in a system, which has been practically deployed and applied to the textual archive of the local Italian newspaper “L’Adige”, covering the decade from 1999 to 2009.".
- 45 abstract "A single datum or a set of categorical data has little value on its own. Combining disparate sets of data increases the value of those data sets and helps to discover interesting patterns or relationships, facilitating the construction of new applications and services. In this paper, we describe an implementation that uses open geographical data as a core set of “join points” to mesh different public datasets. We describe the challenges faced during the implementation, which include sourcing the datasets, publishing them as linked data, and normalising these linked data in terms of finding the appropriate “join point(s)” from the individual datasets, as well as developing the client application used for data consumption. We describe the design decisions and our solutions to these challenges. We conclude by drawing some general principles from this work.".
- 56 abstract "This paper presents an ontology management system and ontology processing techniques used to support a distributed event-triggered knowledge network (ETKnet), which has been developed for deployment in a national network for rapid detection and reporting of crop disease and pest outbreaks. The ontology management system, called Lyra, has been improved to address the issues of terminology mapping, rule service discovery, and query processing over large ABoxes. A domain ontology that covers the concepts related to events, rules, roles and collaborating organizations in ETKnet was developed. Terms used by different organizations can be located in the ontology by terminology searching. Services that implement knowledge rules and rule structures can be discovered through semantic matching using the concepts defined in the ontology. A tableau algorithm was extended to lazy-load only the needed instances and their relationships into main memory. With this extension, Lyra is capable of processing a large ontology database stored in secondary storage even when the ABox cannot be entirely loaded into memory.".
- 58 abstract "Controlled vocabularies of various kinds (e.g., thesauri, classification schemes) play an integral part in making Cultural Heritage collections electronically accessible. The various institutions participating in the Dutch CATCH programme maintain and make use of a rich and diverse set of vocabularies. This makes it hard to provide a uniform point of access to all collections at once. Our SKOS-based vocabulary and alignment repository aims at providing technology for managing the various vocabularies, and for exploiting semantic alignments across any two of them. The repository system exposes web services that effectively support the construction of tools for searching and browsing across vocabularies and collections or for collection curation (indexing), as we demonstrate.".
- 59 abstract "Within the archaeology domain, datasets frequently refer to time periods using a variety of textual or numeric formats. Traditionally, controlled vocabularies of time periods have used classification notation and the collocation of terms in the printed form to represent and convey tacit information about the relative order of concepts. The emergence of the semantic web entails encoding this knowledge into machine readable forms, and so the meaning of this informal ordering arrangement can be lost. Conversion of controlled vocabularies to Simple Knowledge Organisation System (SKOS) format provides a formal basis for semantic web indexing but does not facilitate chronological inference, as thesaurus relationship types are an inappropriate mechanism to fully describe temporal relationships. This becomes an issue in archaeological data where periods are often described in terms of (e.g.) named monarchs or emperors, without additional information concerning relative chronological context. An exercise in supplementing existing controlled vocabularies of time period concepts with dates and temporal relationships was undertaken as part of the Semantic Technologies for Archaeological Resources (STAR) project. The general aim of the STAR project is to demonstrate the potential benefits of cross searching archaeological data conforming to a common overarching conceptual data structure schema: the CIDOC Conceptual Reference Model (CRM). This paper gives an overview of STAR applications and services and goes on to focus particularly on issues concerning the extraction and representation of time period information.".
- 7 abstract "The aim of this paper is to benchmark various semantic repositories to evaluate their deployment in a commercial image retrieval and browsing technology. We adopt a two-phase approach for evaluating the target semantic repositories: analytical parameters such as query/reasoning language and expected minimum level of reasoning support are used to select the pool of the target repositories, and practical parameters such as load and query response times are used to select the best match to application requirements. In addition to utilising a widely accepted benchmark for OWL repositories (UOBM), we also use a real-life dataset from the target application, which provides us with the opportunity of consolidating our findings. A distinctive advantage of this benchmarking exercise is that the essential requirements for the target system such as the semantic expressivity and data scalability are clearly defined, which allows us to claim contribution to the benchmarking methodology for this class of applications.".
- 10 abstract "Innovative context aware mobile applications require performing complex reasoning tasks that combine streaming information with rich background knowledge. For instance, from a stream of quantitative information about latitude and longitude such applications need to derive qualitative information about places like home, office, or gym. Then, raising the level of abstraction, they need to reason about concrete problems, such as deciding how to reach such places, which means of transportation to choose, how to attend ongoing events nearby, or how to skip traffic jams. This article presents a technique for Stream Reasoning consisting of the incremental maintenance of materializations of ontological entailments in the presence of streaming information. In this article we elaborate on previous papers that extend to logic programming results from the incremental maintenance of materialized views in deductive databases. Our contribution is a new technique that takes into explicit consideration the order in which streaming information arrives at the Stream Reasoner. By adding time validity information to each RDF statement, we show that it is possible to compute a new complete and correct materialization by (a) simply dropping explicit statements and entailments that are no longer valid, and (b) evaluating a maintenance program that propagates changes in explicit RDF statements to the stored implicit entailments. In this paper, we also provide experimental evidence that our approach significantly reduces the time required to compute the new materialization, and that our approach opens up possibilities for several further optimizations.".
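The maintenance idea in the abstract above can be sketched in a few lines of Python. This is an illustrative assumption, not the authors' actual maintenance program: every statement carries a validity bound, expired statements are dropped, and each entailment inherits the earliest expiration of its premises, so entailments expire together with the facts that justified them.

```python
from collections import namedtuple

# Each (explicit or derived) RDF statement carries an expiration time.
Triple = namedtuple("Triple", "s p o expires")

def materialize(explicit, now):
    """Drop expired statements, then derive entailments via one toy
    transitivity rule: (a locatedIn b), (b locatedIn c) -> (a locatedIn c).
    A derived triple expires when its earliest premise expires."""
    valid = {t for t in explicit if t.expires > now}
    derived = set(valid)
    changed = True
    while changed:                      # naive fixpoint, fine for a sketch
        changed = False
        for t1 in list(derived):
            for t2 in list(derived):
                if t1.p == t2.p == "locatedIn" and t1.o == t2.s:
                    new = Triple(t1.s, "locatedIn", t2.o,
                                 min(t1.expires, t2.expires))
                    if new not in derived:
                        derived.add(new)
                        changed = True
    return derived

stream = {Triple("alice", "locatedIn", "office", 10),
          Triple("office", "locatedIn", "milan", 100)}
m1 = materialize(stream, now=5)    # entails alice locatedIn milan (until 10)
m2 = materialize(stream, now=20)   # alice's position expired; entailment gone
```

The point of the expiration bookkeeping is exactly step (a) of the abstract: a new window never requires recomputing from scratch, only discarding what is no longer valid.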
- 12 abstract "Novel wireless handheld devices allow the adoption of revised and adapted discovery approaches originally devised for the Semantic Web in mobile ad-hoc networks. Nevertheless, resource constraints of such devices require an accurate re-design of frameworks and algorithms to efficiently support mobile users. The paper focuses on an implementation of concept abduction and contraction algorithms in (fuzzy) ALND DL settings to perform semantic matchmaking and provide logical explanation services. OWL-DL Knowledge Bases have been properly exploited to enable standard and non-standard inference services. The proposed framework has been implemented and tested in a fire hazards prevention case study: early experimental results are reported.".
- 13 abstract "Research in spatio-temporal data management has progressed significantly towards efficient storage and indexing of mobility data. Typically such mobility data analytics is assumed to follow the model of a stream of (x,y,t) points, usually coming from GPS-enabled mobile devices. With large-scale adoption of GPS-driven systems in several application sectors (from shipment tracking to geo-social networks), there is a growing demand from applications to understand the spatio-semantic behavior of mobile entities. Spatio-semantic behavior essentially means a semantic (and preferably contextual) abstraction of raw spatio-temporal location feeds. The core contribution of this paper lies in presenting a Hybrid Model and Computing Platform for developing a semantic overlay - analyzing and transforming raw mobility data (GPS) to meaningful semantic abstractions, starting from raw feeds to semantic trajectories. Secondly, we analyze large-scale GPS data using our computing platform and present results of extracted spatio-semantic trajectories. This impacts a large class of mobile applications requiring such semantic abstractions over streaming location feeds in real systems today.".
- 4 abstract "This paper describes a real-time routing system that implements a mobile museum tour guide providing personalized tours tailored to the user's interests and position inside the museum. The core of this tour guide originates from the CHIP (Cultural Heritage Information Personalization) Web-based tool set for personalized access to the Rijksmuseum Amsterdam collection. In a number of previous papers we presented these tools for interactive discovery of the user's interests, semantic recommendations of artworks and art-related topics, and the (semi-)automatic generation of personalized museum tours. Typically, a museum visitor could wander around the museum and get attracted by artworks outside of the current tour he is following. To support a dynamic adaptation of the tour to the current user position and changing interests, we have extended the existing CHIP mobile tour guide with a routing mechanism based on the SWI-Prolog Space package. The package uses (1) the CHIP user profile containing the user's preferences and current location; (2) the semantically enriched Rijksmuseum collection and (3) the coordinates of the artworks and rooms in the museum. This is joint work between the Dutch nationally funded CHIP and Poseidon projects and the prototype demonstrator can be found at http://www.chip-project.org/spacechip.".
- 6 abstract "The sizes of datasets available as RDF (e.g., as part of the Linked Data cloud) are increasing continuously. For instance, the recent DBpedia version consists of nearly 500 million triples. A common strategy to avoid problems that arise, e.g., from limited network connectivity or lack of bandwidth is to replicate data locally, therefore making them accessible for applications without depending on a network connection. For mobile devices with limited capabilities, however, the replication and synchronization of billions of triples is not feasible. To overcome this problem, we propose an approach to replicate parts of an RDF graph to a client. Applications may then apply changes to this partial replica while being offline; these changes are written back to the original data source upon reconnection. Our approach does not require any kind of additional logic (e.g., change logging) or data structures on the client side, and hence is suitable to be applied on devices with limited computing power and storage capacity.".
- 14 abstract "Recent research has shown that annotations are useful for representing access restrictions to the axioms of an ontology and their implicit consequences. Previous work focused on assigning a label, representing its access level, to each consequence from a given ontology. However, a security administrator might not be satisfied with the access level obtained through these methods. In this case, one is interested in finding which axioms would need to get their access restrictions modified in order to get the desired label for the consequence. In this paper we look at this problem and present algorithms for solving it with a variety of optimizations. We also present first experimental results on large scale ontologies, which show that our methods perform well in practice.".
- 16 abstract "With large datasets such as Linked Open Data available, there is a need for more user-friendly interfaces which will bring the advantages of these data closer to casual users. Several recent studies have shown user preference for Natural Language Interfaces (NLIs) in comparison to others. Although many NLIs to ontologies have been developed, those that have reasonable performance are domain-specific and tend to require customisation for each new domain which, from a developer's perspective, makes them expensive to maintain and unattractive for practical applications spanning different domains. In this paper we describe FREyA, a system which combines three methods used to assist the user while interacting with the system: feedback, query refinement, and query expansion. These are used in combination to explore whether the performance of existing NLI systems to ontologies can be improved without the extra cost of extensive customisation. We present an evaluation using the Mooney geography dataset in order to compare our approach with other similar systems.".
- 27 abstract "Geo-spatial ontologies provide knowledge about places in the world and spatial relations between them. They are fundamental in order to build semantic information retrieval systems and to achieve semantic interoperability in geo-spatial applications. In this paper we present GeoWordNet, a semantic resource we created from the full integration of GeoNames, other high quality resources and WordNet. The methodology we followed was largely automatic, with manual checks when needed. This allowed us to achieve a previously unattained level of accuracy together with very satisfactory quantitative results, both in terms of concepts and geographical entities.".
- 29 abstract "In this paper we build on our methodology for combining and selecting alignment techniques for vocabularies, with two alignment case studies of large vocabularies in two languages. Firstly, we analyze the vocabularies and, based on that analysis, choose our alignment techniques. Secondly, we test our hypothesis, based on earlier work, that first generating alignments using simple lexical alignment techniques and then disambiguating those alignments in a separate step performs best in terms of precision and recall. The experimental results show, for example, that this combination of techniques provides an estimated precision of 0.7 for a sample of 12,725 concepts for which alignments were generated out of the total 27,992 concepts. Thirdly, we explain our results in light of the characteristics of the vocabularies and discuss their impact on the alignment techniques.".
- 42 abstract "Medical imaging plays an important role in today's daily clinical tasks, such as patient screening, diagnosis, treatment planning and follow-up. Still, a generic and flexible image understanding is missing. Although several approaches for semantic image annotation exist, those approaches do not make use of practical clinical knowledge, such as best practice solutions or clinical guidelines. We introduce a knowledge engineering approach aiming for reasoning-based enhancement of medical image annotation by integrating practical clinical knowledge. We exemplify the reasoning steps of the methodology along a use case for automatic lymphoma patient staging.".
- 43 abstract "The use of natural language identifiers as reference for ontology elements—in addition to the URIs required by the Semantic Web standards—is of utmost importance based on their predominance in human everyday life, i.e. speech or print media. Depending on the context, different names can be chosen for one and the same element, and the same element can be referenced by different names. Here homonymy and synonymy are the main causes of ambiguity in perceiving which concrete unique ontology element ought to be referenced by a specific natural language identifier describing an entity. We propose a novel method to resolve entity references under the aspect of ambiguity which explores only formal background knowledge represented in RDF graph structures. The key idea of our domain independent approach is to build an entity network with the most likely referenced ontology elements by constructing spanning graphs based on spreading activation (SA). In addition to exploiting complex graph structures, we devise a new ranking technique that characterises the likelihood of entities in this network, i.e. interpretation contexts. Experiments in a highly polysemic domain show the ability of the algorithm to retrieve the correct ontology elements in almost all cases.".
- 44 abstract "In previous work we have shown that the MapReduce framework for distributed computation can be deployed for highly scalable inference over RDF graphs under the RDF Schema semantics. Unfortunately, several key optimizations that enabled the scalable RDFS inference do not generalize to the richer OWL semantics. In this paper we analyze these problems, and we propose solutions to overcome them. Our solutions allow distributed computation of the closure of an RDF graph under the OWL Horst semantics. We have implemented our approach on the Hadoop platform and deployed it on a compute cluster of 64 machines. We have evaluated our approach using a real-world dataset (UniProt, about 1.5 billion triples) and a synthetic benchmark (LUBM, up to 100 billion triples). Results show that our implementation is scalable and vastly outperforms current systems when comparing supported language expressivity, maximum data size and inference speed.".
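As a toy illustration of how a single RDFS rule maps onto the MapReduce paradigm discussed in the abstract above (a single-machine Python sketch, not the authors' Hadoop implementation), rule rdfs9 joins rdf:type triples with rdfs:subClassOf triples on the class they share; that shared class is exactly the key the map phase emits.

```python
from collections import defaultdict

def map_phase(triples):
    """Emit join keys: instances keyed on their class, subclass axioms
    keyed on the subclass, so both sides of the rdfs9 join meet."""
    for s, p, o in triples:
        if p == "rdf:type":
            yield o, ("instance", s)
        elif p == "rdfs:subClassOf":
            yield s, ("superclass", o)

def reduce_phase(grouped):
    """For each class, pair every instance with every superclass,
    deriving (x rdf:type D) from (x rdf:type C), (C rdfs:subClassOf D)."""
    for key, values in grouped.items():
        instances = [v for tag, v in values if tag == "instance"]
        supers = [v for tag, v in values if tag == "superclass"]
        for x in instances:
            for d in supers:
                yield (x, "rdf:type", d)

def rdfs9(triples):
    grouped = defaultdict(list)         # stands in for the shuffle phase
    for k, v in map_phase(triples):
        grouped[k].append(v)
    return set(reduce_phase(grouped))
```

For example, `rdfs9([("alice", "rdf:type", "Student"), ("Student", "rdfs:subClassOf", "Person")])` derives `("alice", "rdf:type", "Person")`. The optimizations the paper is about concern applying many such rules to billions of triples until a fixpoint, which one round of map/reduce does not capture.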
- 56 abstract "The availability of a concrete language for embedding knowledge patterns inside OWL ontologies makes it possible to analyze their impact on the semantics when applied to the ontologies themselves. Starting from recent results available in the literature, this work proposes a sufficient condition for identifying safe patterns encoded in OPPL. The resulting framework can be used to implement OWL ontology engineering tools that help knowledge engineers to understand the level of extensibility of their models, as well as pattern users to determine the safe ways of utilizing a pattern in their ontologies.".
- 57 abstract "In SPARQL queries, the combination of triple patterns is expressed by using shared variables across patterns. Based on this characterization, basic graph patterns in a SPARQL query can be partitioned into groups of acyclic pattern combinations that share exactly one variable, or star-shaped groups. We observe that the number of triples in a group is proportional to the number of individuals that play the role of the subject or the object; however, depending on the degree of participation of the subject individuals in the properties, a group could be not much larger than a class or type to which the subject or object belongs. Thus, it may be significantly more efficient to independently evaluate each of the groups, and then merge the resulting sets, than linearly joining all triples in a basic graph pattern. Based on these properties of star-shaped groups, we have developed query optimization and evaluation techniques. We have conducted an empirical analysis on the benefits of the optimization and evaluation techniques in several SPARQL query engines. We observe that our proposed techniques are able to speed up query evaluation time for join queries with star-shaped patterns by at least one order of magnitude.".
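The star-shaped decomposition described above can be sketched directly (a minimal illustration with hypothetical helper names; real engines also consider object-shared stars and cost-based join ordering): triple patterns sharing a subject variable form one group, which can be evaluated independently and merged afterwards.

```python
from collections import defaultdict

def star_groups(bgp):
    """Partition a basic graph pattern into star-shaped groups.

    bgp: list of (subject, predicate, object) triple patterns, where
    variables are strings starting with '?'. Returns a dict mapping each
    subject (the star's center) to its list of patterns.
    """
    stars = defaultdict(list)
    for s, p, o in bgp:
        stars[s].append((s, p, o))
    return dict(stars)

# Two stars: one centered on ?paper, one on ?a, joined on ?a afterwards.
bgp = [("?paper", "dc:title", "?t"),
       ("?paper", "dc:creator", "?a"),
       ("?a", "foaf:name", "?n")]
groups = star_groups(bgp)
```

Evaluating each star on its own and then merging the intermediate results is what the abstract argues can beat a linear left-to-right join over all triples in the pattern.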
- 66 abstract "Description Logic Programs (DL-programs) have been introduced as a realization of combined ontological and rule-based reasoning in the context of the Semantic Web. A DL-program loosely combines a Description Logic (DL) ontology with a non-monotonic logic program (LP) such that dedicated atoms in the LP, called DL-atoms, allow for a bidirectional flow of knowledge between the two components. Unfortunately, the information sent from the LP-part to the DL-part might cause an inconsistency in the latter, leading to the trivial satisfaction of every query. As a consequence, in such a case, the answer sets that define the semantics of the DL-program may contain spoiled information influencing the overall deduction. For avoiding unintuitive answer sets, we introduce a refined semantics for DL-programs that is sensitive to inconsistency caused by the combination of DL and LP, and dynamically deactivates rules whenever such an inconsistency would arise. We analyze the complexity of the new semantics, discuss implementational issues and introduce an adequate notion of stratification that guarantees uniqueness of answer sets.".
- 68 abstract "Semantic Web policies are general statements that define the behavior of a system acting on behalf of real users. These policies have various applications ranging from dynamic agent control to advanced access control policies. Although policies have attracted a lot of research effort in recent years, suitable representation and reasoning facilities allowing for reactive policies are not likewise developed. In this paper, we describe the concept of reactive Semantic Web policies. Reactive policies allow for the definition of events and actions, that is, they allow the reactive behavior of a system acting on the Semantic Web to be defined. A reactive policy makes use of the tremendous amount of knowledge available on the Semantic Web in order to guide system behaviour while at the same time ensuring trusted and policy-compliant communication. In this paper, we present a formal framework for expressing and enforcing such reactive policies in combination with advanced trust establishment techniques featuring an interplay between reactivity and agent negotiation. We show how we used our approach in a real system by presenting a prototype that allows users to define and enforce reactive Semantic Web policies on the Social Network and communication tool Skype.".
- 72 abstract "Query answering in a wide and heterogeneous environment such as the Web can return a large number of results that are hardly manageable by users/agents. The adoption of grouping criteria for the results could be of great help. To date, most of the proposed methods for aggregating results on the (Semantic) Web are mainly grounded on syntactic approaches. However, they may not be of significant help when the values instantiating a grouping criterion are all equal (thus creating a unique group) or when the values instantiating a grouping criterion are almost all different (thus creating one group for each answer). We propose a novel approach that is able to overcome such drawbacks, where, given a query in the form of a conjunctive query, grouping is grounded on the exploitation of the semantics of background ontologies during the aggregation of query results. Specifically, we propose a solution where, in a deductive modality, answers are grouped taking into account the subsumption hierarchy of the underlying knowledge base. As such, the results can be shown and navigated similarly to a faceted search. An experimental evaluation of the proposed method is also reported.".
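The idea of grouping answers by the subsumption hierarchy can be sketched in a few lines. This is a toy illustration under heavy assumptions (a tree-shaped hierarchy given as a direct-superclass map, and one asserted class per individual; the class and individual names are made up), whereas the paper works over a full deductive knowledge base:

```python
def group_by_superclass(answers, parent):
    """Group query answers under the top-most ancestor of their class.

    `answers` maps an individual to its most specific class; `parent`
    maps a class to its direct superclass (None at the root). Toy sketch
    of subsumption-based grouping, not the paper's method.
    """
    def top(c):
        # walk up the hierarchy to the root class
        while parent.get(c) is not None:
            c = parent[c]
        return c
    groups = {}
    for ind, cls in answers.items():
        groups.setdefault(top(cls), []).append(ind)
    return groups

# hypothetical hierarchy and answer set
parent = {"Dog": "Animal", "Cat": "Animal", "Oak": "Plant",
          "Animal": None, "Plant": None}
answers = {"rex": "Dog", "tom": "Cat", "tree1": "Oak"}
groups = group_by_superclass(answers, parent)
# rex and tom fall under Animal, tree1 under Plant
```

Grouping at a higher level of the hierarchy avoids both degenerate cases mentioned in the abstract: a single group, or one group per answer.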
- 11 abstract "This thesis investigates the unification of folksonomies and ontologies in such a way that the resulting structures can better support exploration and search on the World Wide Web. First, an integrated computational method is employed to extract the ontological structures from folksonomies. It exploits the power of low support association rule mining supplemented by an upper ontology such as WordNet. Promising results have been obtained from experiments on Flickr and Citeulike. Next, a crowdsourcing method is introduced to channel online users' search efforts to help to evolve the extracted ontology.".
- 12 abstract "We propose the development of a Global Semantic Graph (GSG) as the foundation for future information and collaboration-centric applications and services. It would provide a single abstraction for storing, processing and communicating information based on globally interlinked semantic resources. The GSG adopts approaches and methods from the Semantic Web and thus facilitates a better information sharing abstraction.".
- 17 abstract "Since the amount of user-generated content has been sharply increasing in recent years, mainly due to Web 2.0 technology and the effects of social networking, it is necessary to build mechanisms to assess the reliability of the content. On the web, this notion of trust is a key ingredient for an effective manipulation of knowledge on a (world-wide) web scale. The web of trust has thus become an important research area both for web science and the semantic web. In the PhD research we have laid out, we focus on the notion of trust and on methods for representing and computing the trust of users in web content. This paper outlines the vision, at the start of the PhD period, of the research problem and the semantic web-based approach to solving that problem.".
- 19 abstract "This thesis focuses on developing an efficient framework for contextualized knowledge representation on the Semantic Web. We point out the drawbacks of existing context formalisms, which hinder an efficient implementation, and propose a formalism for contexts that enables the development of a framework with the desired properties. Some of the future milestones to be achieved for this thesis work are (i) developing a proof theory for the logical framework based on Description Logics (DL), (ii) developing reasoning algorithms, (iii) verifying and comparing the performance of these algorithms against existing distributed reasoning formalisms, and (iv) system implementation.".
- 22 abstract "The Semantic Web uses formal ontologies as a key instrument that adds structure to the underlying data, but building domain-specific ontologies is still a difficult, time-consuming and error-prone process because most information is currently available as free text or semi-structured text. Therefore, the development of fast and cheap solutions for ontology learning from text is a key factor for the success and large-scale adoption of the Semantic Web. Ontology development is primarily concerned with the definition of concepts and relations between them, so one of the fundamental research problems related to ontology learning is the reliable extraction of concepts from text. To investigate this research problem we will focus on the expert finding application, i.e. the reliable extraction of expertise topics from relevant text that can be assigned to individuals in an organization.".
- 23 abstract "Although one might argue that little wisdom can be conveyed in messages of 140 characters or less, this PhD research sets out to explore if and what kind of knowledge can be acquired from different aggregations of social awareness streams. The expected contribution of this research is a network-theoretic model for defining, comparing and analyzing different kinds of social awareness streams and an experimental prototype to extract semantic models from them.".
- 25 abstract "The current state of the art regarding scalable reasoning consists of programs that run on a single machine. When the amount of data is too large, or the logic is too complex, the computational resources of a single machine are not enough. We propose a distributed approach that overcomes these limitations and we sketch a research methodology. A distributed approach is challenging because of the skew in data distribution and the difficulty in partitioning Semantic Web data. We present initial results which are promising and suggest that the approach may be successful.".
- 30 abstract "Recent work in Ontology Learning and Text Mining has mainly focused on engineering methods to solve practical problems. In this thesis, we investigate methods that can substantially improve a wide range of existing approaches by minimizing the underlying problem: the Semantic Gap between formalized meaning and human cognition. We deploy OWL as a Meaning Representation Language and create a unified model, which combines existing NLP methods with linguistic knowledge and aggregates disambiguated background knowledge from the Web of Data. The presented methodology allows us to study and evaluate the capabilities of such aggregated knowledge to improve the efficiency of methods in Ontology Learning.".
- 5 abstract "With the proliferation of knowledge-intensive applications, there is vivid research in the domain of knowledge representation. Description Logics are designed to be a convenient means for such representation tasks. One of their main advantages over other formalisms is a clearly defined semantics, which opens the possibility of providing reasoning services with mathematical rigor. My PhD work is concerned with Description Logic reasoning. I am particularly interested in ABox reasoning when the available data is really large. This domain is much less explored than TBox reasoning. Nevertheless, reasoning over large ABoxes is useful for problems like web-based reasoning. I am one of the developers of the DLog data reasoner, which implements two-phase reasoning: the first phase uses complex reasoning to turn the TBox into a set of simple rules, while the second phase is geared towards very fast query answering over large ABoxes. DLog currently supports the SHIQ DL language. We are trying to extend the reasoner to more expressive languages, hopefully up to SROIQ, the logic behind OWL 2.".
- 6 abstract "In parallel with the proliferation of ontologies and their use in semantically-enabled applications, the issue of finding and dealing with defects in ontologies has become increasingly important. Current work mostly targets detecting and repairing semantic defects in ontologies. In our work, we focus on another kind of severe defects, modeling defects, which require domain knowledge to detect and resolve. In particular, we are interested in detecting and repairing the missing structural relations (is-a hierarchy) in the ontologies. Our goal is to develop a system, which allows a domain expert to detect and repair the structure of ontologies in a semi-automatic way.".
- 1 abstract "RDF is the metadata model of choice in the Semantic Sensor Web. However, RDF can only represent thematic metadata and needs to be extended if we want to model spatial and temporal information. For this purpose, we develop the data model stRDF and the query language stSPARQL. stRDF is a constraint data model that extends RDF with the ability to represent spatial and temporal data. stSPARQL extends SPARQL so that spatial and temporal data can be queried using a declarative and user-friendly language. We follow the main ideas of spatial and temporal constraint databases and represent spatial objects as quantifier-free formulas in a first-order logic of linear constraints. The main contribution of stRDF is to bring to the RDF world the benefits of constraint databases and constraint-based reasoning so that spatial and temporal data can be represented in RDF using constraints. In this paper, we present the syntax and semantics of stRDF and stSPARQL and discuss some details of our in-progress implementation.".
- 11 abstract "Service orientation is a promising paradigm for offering and consuming functionalities within and across organizations. The ever increasing acceptance of service oriented architectures, in combination with the acceptance of the Web as a platform for carrying out electronic business, triggers a need for automated methods to find appropriate Web services. Various formalisms for the discovery of semantically described services, with varying expressivity and complexity, have been proposed in the past. However, they are difficult to use since they apply the same formalisms to service descriptions and requests. Furthermore, intersection-based matchmaking is insufficient to check for the applicability of Web services for a given request. In this paper we show that, although most prior approaches provide a formal semantics, the pragmatics they offer for describing requests are improper since they differ from the user intention. We introduce distinct formalisms to describe functionalities and service requests. We also provide the formal underpinning and implementation of a matching algorithm.".
- 15 abstract "The availability of contents and information as linked data or Web services, i.e. over standardized interfaces, fosters the integration and reuse of data. One common form of information integration is the creation of composed documents, e.g. in the form of dynamic Web pages. Service and data providers restrict the allowed usage of their resources and link it to obligations, e.g. only non-commercial usage is allowed and requires an attribution of the provider. These terms and conditions are currently typically available in natural language, which makes checking whether a document composition is compliant with the policies of the used services a tedious task. In order to make it easier for users to adhere to these usage policies, we propose to formalize them, which enables policy-aware tools that support the creation of compliant compositions. In this paper we propose an OWL model of document compositions and show how it can be used together with the policy language AIR to build a policy-aware document composition platform. We furthermore present a use case and illustrate how it can be realized with our approach.".
- 2 abstract "Semantic web services (SWS) promise to take service oriented computing to a new level by allowing time-consuming programming tasks to be semi-automated. At the core of SWS are solutions to the problem of SWS matchmaking, i.e., the problem of filtering and ranking a set of services with respect to a service query. Comparative evaluations of different approaches to this problem form the basis for future progress in this area. Reliable evaluations require informed choices of evaluation measures and parameters. This paper establishes a solid foundation for such choices by providing a systematic discussion of the characteristics and behavior of various retrieval correctness measures, in theory and through experimentation.".
- 24 abstract "Current proposals on Semantic Web Services discovery and ranking are based on user preference descriptions that often come with insufficient expressiveness, consequently making the description of complex user desires more difficult or even preventing it. There is a lack of a general and comprehensive preference model, so discovery and ranking proposals have to provide ad hoc preference descriptions whose expressiveness depends on the facilities provided by the corresponding technique, resulting in user preferences that are tightly coupled with the underlying formalism being used by each concrete solution. In order to overcome these problems, in this paper an abstract and sufficiently expressive model for defining preferences is presented, so that they may be described in an intuitive and user-friendly manner. The proposed model is based on a well-known query preference model from database systems, which provides highly expressive constructors to describe and compose user preferences semantically. Furthermore, the presented proposal is independent of the concrete discovery and ranking engines selected, and may be used to extend current Semantic Web Service frameworks, such as WSMO, SAWSDL, or OWL-S. In this paper, the presented model is also validated against a complex discovery and ranking scenario, and a concrete implementation of the model in WSMO is outlined.".
- 3 abstract "Most approaches to application integration require an unambiguous exchange of events. Ontologies can be used to annotate the events exchanged and thus ensure a common understanding of those events. The domain knowledge formalized in ontologies can also be employed to facilitate more intelligent, semantic event processing, but at the cost of higher processing efforts. When application integration and event processing are implemented on the user interface layer, performance is an important issue to ensure acceptable reactivity of the integrated system. In this paper, we analyze different architecture variants of implementing such an event exchange, and present an evaluation with regard to performance. An example of an integrated application from the emergency management domain is used to demonstrate those variants.".
- 1 abstract "Efficiently detecting near duplicate resources is an important task when integrating information from various sources and applications. Once detected, near duplicate resources can be grouped together, merged, or removed, in order to avoid repetition and redundancy, and to increase the diversity in the information provided to the user. In this paper, we introduce an approach for efficient semantic-aware near duplicate detection, by combining indexing schemes for similarity search with the RDF representations of the resources. We provide a probabilistic analysis for the correctness of the suggested approach, which allows applications to configure it for satisfying their specific quality requirements. Our experimental evaluation on the RDF descriptions of real-world news articles from various news agencies, demonstrates the efficiency and the effectiveness of our approach.".
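The combination of indexing schemes for similarity search with resource representations can be illustrated with a MinHash-style sketch, a standard technique for near-duplicate detection. This is a generic illustration, not the paper's specific scheme, and the token sets stand in for features extracted from RDF descriptions:

```python
import hashlib

def minhash(tokens, num_perm=32):
    """Compute a MinHash signature for a token set.

    Each of the num_perm slots keeps the minimum of a keyed hash over
    the tokens; similar sets tend to agree on many slots. Illustrative
    sketch of similarity-search indexing, not the paper's method.
    """
    return [
        min(int(hashlib.md5(f"{i}:{t}".encode()).hexdigest(), 16)
            for t in tokens)
        for i in range(num_perm)
    ]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# hypothetical token sets from two near-duplicate news items and one unrelated item
a = {"obama", "visits", "berlin", "today"}
b = {"obama", "visits", "berlin", "yesterday"}
c = {"stock", "market", "falls"}
sim_ab = estimated_jaccard(minhash(a), minhash(b))
sim_ac = estimated_jaccard(minhash(a), minhash(c))
# the near-duplicate pair scores much higher than the unrelated pair
```

Resources whose signatures agree on many slots would be grouped, merged, or removed; the number of permutations controls the accuracy/cost trade-off that the paper's probabilistic analysis makes precise.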
- 29 abstract "In order to support informal learning, we complement the formal knowledge represented by ontologies developed by domain experts with the informal knowledge emerging from social tagging. To this end, we have developed an ontology enrichment pipeline that can automatically enrich a domain ontology using: data extracted by a crawler from social media applications, similarity measures, the DBpedia knowledge base, a disambiguation algorithm and several heuristics. The main goal is to provide dynamic and personalized domain ontologies that include the knowledge of the community of users. They will support a more personalized learning experience able to fulfill the needs of different types of learners.".
- 30 abstract "With the increasing usage of Social Networks, the ability of users to establish access restrictions on their data and resources becomes more and more important. However, privacy preferences in today's Social Network applications are rather limited and do not allow users to define policies with fine-grained concept definitions. Moreover, due to the walled garden structure of the Social Web, current privacy settings for one platform cannot refer to information about people on other platforms. In addition, although most Social Networks' privacy settings share the same nature, users are forced to define and maintain their privacy settings separately for each platform. In this paper, we present a semantic model for privacy preferences on Social Web applications that overcomes those problems. Our model extends the current privacy model for Social Platforms with semantic concept definitions. By means of those concepts, users are enabled to define exactly what portion of their profile or which resources they want to protect and which user category is allowed to see those parts. Such category definitions are not limited to one single platform but can refer to information from other platforms as well. We show how this model can be implemented as an extension of the OpenSocial standard, to enable advanced privacy settings that can be exchanged among OpenSocial platforms.".
- 33 abstract "Acquiring structured data from wikis is a problem of increasing interest in knowledge engineering and the semantic web. In fact, collaboratively developed resources are growing over time, have high quality and are constantly updated. A very promising approach to this aim is extracting thesauri from wikis. A thesaurus is a work that lists words grouped together according to similarity of meaning, generally organized by synonyms. Thesauri are very useful for a large variety of applications, including information retrieval and knowledge engineering. Most information in wikis is expressed by means of natural language texts and internal links among web pages, the so-called wikilinks. In this paper, an innovative method for inducing thesauri from Wikipedia is presented. It leverages the Wikipedia structure to extract concepts and the terms denoting them, obtaining a thesaurus that can be profitably used in applications. To boost precision and avoid noise, we apply word sense disambiguation techniques for lexical substitution and latent semantic analysis. In the paper, we show how to represent the extracted results following an RDF/OWL schema that can be published on the semantic web.".
- 34 abstract "Many systems exist for community formation in extensions of traditional Web environments but little work has been done for forming and maintaining communities in the more dynamic environments emerging from {\it ad hoc} and peer-to-peer networks. This paper proposes an approach for forming and evolving peer communities based on the sharing of choreography specifications (interaction models). Two mechanisms for discovering interaction models and collaborative peers are presented based on a meta-search engine and dynamic peer grouping algorithm respectively. OKBook, a system allowing peers to publish, discover and subscribe/unsubscribe to interaction models, has been implemented in accordance with our approach. For the meta-search engine, a strategy for integrating and re-ranking search results obtained from Semantic Web search engines is also described. This allows peers to discover interaction models from their group members, thus reducing the burden on the meta-search engine. Our approach complies with principles of Linked Data and is capable of both contributing to and benefiting from the Web of data.".
- 4 abstract "In this paper we present a model for multifaceted tagging, i.e. tagging enriched with contextual information. We present TagMe!, a social tagging front-end for Flickr images, that provides multifaceted tagging functionality: It enables users to attach tag assignments to a specific area within an image and to categorize tag assignments. Moreover, TagMe! automatically maps tags and categories to DBpedia URIs to clearly define the meaning of freely-chosen words. Our experiments reveal the benefits of those additional tagging facets. For example, the exploitation of those facets significantly improves the performance of FolkRank-based search. Further, we demonstrate the benefits of TagMe! tagging facets for learning semantics within folksonomies.".
- 5 abstract "The Live Social Semantics is an innovative application that encourages and guides social networking between researchers at conferences and similar events. The application integrates data and technologies from the semantic web, online social networks, and a face-to-face contact sensing platform. It helps researchers to find like-minded and influential researchers, to identify and meet people in their community of practice, and to capture and later retrace their real-world networking activities at conferences. The application was successfully deployed at two international conferences, attracting more than 300 users in total. This paper describes this application, and discusses and evaluates the results of its two deployments.".
- 6 abstract "Managing one's memberships in different online communities is becoming more and more of a cumbersome task. This is due to the rapidly increasing number of communities in which we participate, such as professional networks, sports clubs, and groups with specific interests like hobbies and others. In addition, these communities are scattered and distributed over multiple systems, i.e., different community platforms that each require a distinct user account. They also have different user interfaces and different definitions of communities and groups. In this paper, we present dgFOAF, an approach for distributed group management based on the well-known Friend-of-a-Friend (FOAF) vocabulary. Our dgFOAF approach is independent of the concrete community platforms we find today and needs no central server. It allows for defining communities that span multiple systems and thus alleviates the community administration task.".
- 17 abstract "On the Web of Data, entities are often interconnected in a way similar to web documents. Previous works have shown how PageRank can be adapted to achieve entity ranking. In this paper, we propose to exploit locality on the Web of Data by taking a layered approach, similar to hierarchical PageRank approaches. We provide justifications for a two-layer model of the Web of Data, and introduce DING (Dataset Ranking), a novel ranking methodology based on this two-layer model. DING uses links between datasets to compute dataset ranks and combines the resulting values with semantic-dependent entity ranking strategies. We quantify the effectiveness of the approach against other link-based algorithms on large datasets coming from the Sindice search engine. The evaluation, which includes a user study, indicates that the resulting rank is better than that of the other approaches. Also, the resulting algorithm is shown to have desirable computational properties such as parallelisation.".
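The dataset layer of such a two-layer ranking can be sketched as plain PageRank over a dataset link graph. This is a minimal illustration of the first layer only (the combination with per-entity scores inside each dataset is not shown), and the three-dataset link graph is made up for the example:

```python
def pagerank(links, damping=0.85, iters=50):
    """Plain PageRank over a directed graph given as {node: [targets]}.

    A minimal sketch of the dataset layer of a DING-style two-layer
    ranking: ranks are computed on the dataset link graph first, then
    combined (not shown here) with entity ranks inside each dataset.
    """
    nodes = set(links) | {t for ts in links.values() for t in ts}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # teleportation mass plus mass propagated along dataset links
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        rank = new
    return rank

# hypothetical dataset link graph: two datasets both link to dbpedia
links = {"geonames": ["dbpedia"],
         "musicbrainz": ["dbpedia"],
         "dbpedia": ["geonames"]}
ranks = pagerank(links)
# dbpedia, receiving two in-links, ranks highest
```

Because the dataset graph is orders of magnitude smaller than the entity graph, this first layer is cheap to compute and easy to parallelise across datasets.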
- 18 abstract "Motivated now also by the partial support of major search engines, hundreds of millions of documents are being published on the web embedding semi-structured data in RDF, RDFa and Microformats. This scenario calls for novel information search systems which provide effective means of retrieving relevant semi-structured information. In this paper, we present an entity retrieval system designed to provide entity search capabilities over datasets as large as the entire Web of Data. Our system supports full-text search, semi-structural queries and top-k query results while exhibiting a concise index and efficient incremental updates. We advocate the use of a node indexing scheme and show that it offers a good compromise between query expressiveness, query processing time and update complexity in comparison to three other indexing techniques. We then demonstrate how such a system can effectively answer queries over 10 billion triples on a single commodity machine.".
- 28 abstract "In the Linked Open Data cloud, one of the largest data sets, comprising 2.5 billion triples, is derived from the Life Science domain. Yet this represents a small fraction of the total number of publicly available data sources on the Web. In this paper, we briefly describe past attempts to transform specific Life Science sources from a plethora of open as well as proprietary formats into RDF data. In particular, we identify and tackle two bottlenecks in current practice: acquiring ontologies to formally describe these data and creating RDFizer programs to convert data from legacy formats into RDF. We propose an unsupervised method, based on transformation rules, for performing these two key tasks, which makes use of our previous work on unsupervised wrapper induction for extracting labelled data from complete Life Science Web sites. We apply our approach to 13 real-world online Life Science databases. The learned ontologies are evaluated by domain experts as well as against gold standard ontologies. Furthermore, we compare the learned ontologies against ontologies that are “lifted” directly from the underlying relational schema using an existing unsupervised approach. Finally, we apply our approach to three online databases to extract RDF data. Our results indicate that this approach can be used to bootstrap and speed up the migration of life science data into the Linked Open Data cloud.".
- 29 abstract "Keyword search has been regarded as an intuitive paradigm for searching not only documents but also data, especially when the users are not familiar with the data and the query language. Two types of approaches can be distinguished. Answers to keywords can be computed by searching for matching subgraphs directly in the data. The alternative to this is keyword translation, which is based on searching the data schema for matching join graphs, which are then translated to queries. Answering these queries is performed in a later stage. While clear advantages have been shown for the approaches based on query translation, we observe that the processing done during query translation overlaps with the processing needed for query answering. Thus, we propose a tight integration of query translation with query answering. Instead of using the schema, we employ a bisimulation-based structure index graph. Searching this index for matching subgraphs yields not only queries, but also candidate answers. We propose a set of algorithms which allow for an incremental process where intermediate results computed during query translation can be reused for subsequent query answering. In experiments, we show that this integrated approach consistently outperforms the state of the art.".
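The bisimulation-based structure index mentioned here rests on partitioning graph nodes into blocks of structurally equivalent nodes. A toy partition-refinement sketch, under the assumption of a small labeled-edge graph (the node and label names are made up), might look like this; the paper's index construction is considerably more involved:

```python
def bisim_partition(nodes, edges):
    """Partition nodes by iteratively refining on outgoing-edge signatures.

    Two nodes stay in the same block only if their outgoing edges carry
    the same labels into the same blocks. Refinement stops when the
    number of blocks no longer grows. Toy sketch of the structure index
    idea, not the paper's algorithm.
    """
    block = {n: 0 for n in nodes}
    while True:
        # signature: own block plus the (label, target-block) pairs
        sig = {n: (block[n],
                   frozenset((lbl, block[t])
                             for (s, lbl, t) in edges if s == n))
               for n in nodes}
        ids, new = {}, {}
        for n in nodes:
            new[n] = ids.setdefault(sig[n], len(ids))
        if len(ids) == len(set(block.values())):
            return new
        block = new

# hypothetical labeled graph: a1 and a2 are structurally identical
nodes = ["a1", "a2", "b", "c"]
edges = [("a1", "type", "b"), ("a2", "type", "b"), ("b", "type", "c")]
part = bisim_partition(nodes, edges)
# a1 and a2 share a block; b and c each get their own
```

Queries are then matched against the (much smaller) block graph instead of the data graph, and each matching block immediately supplies candidate answers.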
- 30 abstract "Recently, the publishing of structured, semantic information as linked data has gained quite some momentum. For ordinary users on the Internet, however, this information is not yet very visible and (re-)usable. With LESS we present an end-to-end approach for the syndication and use of linked data, based on the definition of templates for linked data resources and SPARQL query results. Such syndication templates are edited, published and shared by using a collaborative Web platform. Templates for common types of entities can then be combined with specific linked data resources or SPARQL query results and integrated into a wide range of applications, such as personal homepages, blogs/wikis, mobile widgets etc. In order to improve the reliability and performance of linked data, LESS caches versions either for a certain time span or for the case of inaccessibility of the original source. LESS supports the integration of information from various sources as well as any text-based output format. This allows the generation not only of HTML, but also of diagrams, RSS feeds or even complete data mashups, without any programming involved.".
- 32 abstract "It has been argued that linked open data is the major benefit of the use of semantic technologies on the web, as it provides a huge amount of structured data that can be accessed in a more effective way than web pages. While linked open data avoids many problems connected with the use of expressive ontologies, e.g. the knowledge acquisition bottleneck, data heterogeneity remains a problem. In particular, the same objects may be referred to using different URIs in different data sets. Identifying such representations of the same object is called object reconciliation. In this paper, we propose a novel object reconciliation method that is based on an existing semantic similarity measure for linked data. We adapt the measure to the object reconciliation problem, present complete and approximate algorithms for efficiently computing it, and present a systematic evaluation of the method based on a benchmark dataset. As our main result, we show that the use of light-weight ontologies and schema information significantly improves object reconciliation in the context of linked open data.".
- 36 abstract "The performance of triple stores is one of the major obstacles for the deployment of semantic technologies in many usage scenarios. In particular, Semantic Web applications, which use triple stores as persistence backends, trade performance for the advantage of flexibility with regard to information structuring. In order to get closer to the performance of relational database-backed Web applications, we developed an approach for improving the performance of triple stores by caching query results and even complete application objects. The selective invalidation of cache objects, following updates of the underlying knowledge bases, is based on analysing the graph patterns of cached SPARQL queries in order to obtain information about what kind of updates will change the query result. We evaluated our approach by extending the BSBM triple store benchmark with an update dimension as well as in typical Semantic Web application scenarios.".
- 44 abstract "The increasing amount of data on the Web bears potential for addressing complex information needs more effectively. Instead of keyword search and browsing along links between results, users can specify their needs in terms of complex queries and obtain precise answers right away. However, users might not always know the query language and, more importantly, the underlying schema. Motivated by the burden facing data Web search users in specifying complex information needs, we identify a particular class of search approaches that we refer to as the schema-agnostic paradigm. Common to these search approaches is that no knowledge about the schema is required to specify complex information needs. We have conducted a systematic study of four popular approaches: (1) simple keyword search, (2) answer completion, which is based on computing complex answers as candidate results for user-provided keywords, (3) query completion, which is based on computing structured queries as candidate interpretations of user-provided keywords and (4) faceted search. We study these approaches from a process-oriented view to derive the main conceptual steps required for addressing complex information needs. Then, an experimental study is performed, following established practice for task-based evaluation. We derive main conclusions from the studies, as well as directions for future research.".
- 48 abstract "The Linking Open Data community project is promoting the creation of interlinked RDF datasets with links between data items identified using dereferenceable URIs. This promising direction for publishing data on the web brings forward a number of issues. A key challenge is to understand the data, the schema, and the interlinks that are actually used both within and across linked datasets. Understanding actual RDF usage is critical in the increasingly common situations where terms from many different RDFS and OWL vocabularies are mixed. In this paper we propose a novel mechanism to describe RDF usage by creating RDF summaries from bisimulation contractions of neighbourhoods (or BCNs). We describe a tool, ExpLOD, that supports constructing and visualizing BCNs, as well as generating SPARQL queries based on the BCNs. We use ExpLOD to describe RDF usage within several of the collections from the Linked Open Data cloud. We present a performance evaluation of graph and SPARQL-based BCN implementations.".
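The bisimulation-based summarization this abstract mentions can be sketched with a toy forward-bisimulation partition (a deliberate simplification for illustration; the paper's BCNs contract labelled neighbourhoods, and `bisimulation_blocks` and the triple format here are our assumptions):

```python
# Toy sketch: iteratively refine node blocks until two nodes share a
# block exactly when they have matching outgoing (predicate, block)
# signatures. A summary graph would then have one node per block.

def bisimulation_blocks(edges):
    """edges: iterable of (subject, predicate, object) triples."""
    nodes = {n for s, _, o in edges for n in (s, o)}
    block = {n: 0 for n in nodes}                       # start: one block
    while True:
        sig = {n: frozenset((p, block[o]) for s, p, o in edges if s == n)
               for n in nodes}
        ids, new = {}, {}
        for n in sorted(nodes):
            new[n] = ids.setdefault(sig[n], len(ids))   # same signature -> same block
        if new == block:
            return block                                # fixpoint reached
        block = new

edges = [("ex:a", "ex:p", "ex:x"),
         ("ex:b", "ex:p", "ex:x"),
         ("ex:c", "ex:q", "ex:x")]
blocks = bisimulation_blocks(edges)
# ex:a and ex:b end up bisimilar; ex:c differs in its predicate label
```

Collapsing each block to a single summary node is what makes such descriptions much smaller than the underlying RDF usage they characterize.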
- 8 abstract "Lots of RDF data have been published in the Semantic Web. The RDF data model, together with the decentralized linkage nature of the Semantic Web, brings object link structure to the worldwide scope. Object links are critical to the Semantic Web and the macroscopic properties of object links are helpful for better understanding the current Data Web. In this paper, we propose a notion of object link graph (OLG) in the Semantic Web, and analyze the complex network structure of an OLG constructed from the latest dataset (FC09) collected by Falcons. We find that the OLG has the scale-free nature and the effective diameter of the graph is small compared to its scale. By another experimental result on the last year's dataset (FC08), we confirm our findings and observe that the object link graph is becoming denser and its diameter is shrinking during the past year, which indicates a good evolution of the Data Web. Finally, we repeat the complex network analysis on the two largest domain-specific subsets of FC09, namely Bio2RDF(FC09) and DBpedia(FC09). The results show that both Bio2RDF(FC09) and DBpedia(FC09) have low density in object links, which has great influence on the density of object links in FC09.".
- 1 abstract "There are several applications that provide access to objects whose descriptions are accompanied by a degree expressing their strength. Such degrees can have various application specific semantics, such as relevance, precision, certainty, trust, etc. In this paper we consider Fuzzy RDF as the representation framework for such "weighted" descriptions, and we propose a novel model for interactively browsing and exploring such sources which allows formulating complex queries by simple clicks. Specifically, and in order to exploit the fuzzy degrees, the model proposes interval-based transition markers. The advantage of the model is that it significantly increases the discrimination power of the interaction, without making it complex for the end user.".
- 10 abstract "Distributed Human Computation (DHC) is used to solve computational problems by incorporating the collaborative effort of a large number of humans. It is also a solution to AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, has many research problems that are considered AI-complete. E.g., co-reference resolution, which involves determining whether different URIs refer to the same entity, is a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we are focusing on handling co-reference resolution when integrating distributed datasets. Traditionally machine-learning algorithms are used as a solution for this, but they are often computationally expensive, error-prone and do not scale. We designed a DHC system named iamResearcher, which solves the scientific publication author identity co-reference problem when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are dereferenceable in the Open Linked Data Cloud.".
- 2 abstract "Information and knowledge retrieval are today some of the main assets of the Semantic Web. However, a notable immaturity still exists, as to what tools, methods and standards may be used to effectively achieve these goals. No matter what approach is actually followed, querying Semantic Web information often requires deep knowledge of the ontological syntax, the querying protocol and the knowledge base structure as well as a careful elaboration of the query itself, in order to extract the desired results. In this paper, we propose a structured semantic query interface that helps to construct and submit entailment-based queries in an intuitive way. It is designed so as to capture the meaning of the intended user query, regardless of the formalism actually being used, and to transparently formulate one in reasoner-compatible format. This interface has been deployed on top of the semantic search prototype of the DSpace digital repository system.".
- 1 abstract "Knowledge Management systems are one of the key strategies that allow companies to fully tap into their collective knowledge. However, two main entry barriers currently limit the potential of this approach: i) the hurdles employees encounter that discourage them from a strong and active participation (knowledge providing) and ii) the lack of truly evolved intelligent technologies that allow those employees to easily benefit from the global knowledge provided by them and other users (knowledge consuming). Both needs can sometimes require opposite approaches, so current solutions tend to be either not user-friendly enough for user participation to be strong or not intelligent enough to be useful. In this paper, a lightweight framework for Knowledge Management is proposed based on the combination of two layers that cater to each need: a microblogging layer that simplifies how users interact with the whole system and a semantic-powered engine that performs all the intelligent heavy lifting by combining semantic indexing and search of messages and users. Different mechanisms are also presented as extensions that can be plugged in on demand and help expand the capabilities of the whole system.".
- 2 abstract "Geo-spatial applications need to provide powerful search capabilities to support users in their daily activities. However, discovery services are often limited by only syntactically matching user terminology to metadata describing geographical resources. We report our work on the implementation of a geographical catalogue, and corresponding semantic extension, for the spatial data infrastructure (SDI) of the Autonomous Province of Trento (PAT) in Italy. We focus in particular on the semantic extension, which is based on the adoption of the S-Match semantic matching tool and on the use of a faceted ontology codifying geographical domain-specific knowledge. We finally report our experience in the integration of the faceted ontology with the multi-lingual geo-spatial ontology GeoWordNet.".
- 21 abstract "Exploration and analysis of vast empirical data is a cornerstone of the development and assessment of driver assistance systems. A common challenge is to apply the domain-specific knowledge to the (mechanised) data handling, pre-processing and analysis process. Ontologies can describe domain-specific knowledge in a structured way that is manageable for both humans and algorithms. This paper outlines an architecture to support an ontology-based analysis process for data stored in databases. Built on these concepts and this architecture, a prototype that handles semantic data annotations is presented. Finally, the concept is demonstrated in a realistic example. The usage of exchangeable ontologies generally allows the adaptation of the presented methods for different domains.".
- 34 abstract "Requirements managers aim at keeping their sets of requirements well-defined, consistent and up to date throughout a project's life cycle. Semantic web technologies have found many valuable applications in the field of requirements engineering, with most of them focusing on requirements analysis. However, the usability of results originating from such requirements analyses strongly depends on the quality of the original requirements, which often are defined using natural language expressions without meaningful structures. In this work we present the prototypic implementation of a semantic guidance system used to assist requirements engineers with capturing requirements using a semi-formal representation. The semantic guidance system uses concepts, relations and axioms of a domain ontology to provide a list of suggestions the requirements engineer can build on to define requirements. The semantic guidance system is evaluated based on a domain ontology and a set of requirements from the aerospace domain. The evaluation results show that the semantic guidance system effectively supports requirements engineers in defining well-structured requirements.".
- 37 abstract "One of any government’s main responsibilities is the provision of public services to its citizens, for example, education, health, transportation, and social services. Additionally, with the explosion of the Internet in the past 20 years, many citizens have moved online as their main method of communication, learning, buying, selling, etc. Therefore, a logical step for governments is to move the provision of public services online. However, public services have a complex structure and may span across multiple, disparate public agencies. Moreover, the legislation that governs a public service is usually difficult for a layman to understand. Despite this, governments have attempted to create online portals to enable citizens to glean knowledge and utilise specific public services. While this is a positive progression, most portals fail to engage citizens because the portals do not manage to hide the complexity of public services from users. Many also fail to address the specific needs of users, providing instead only information about the most general use-case. In order to address these issues a more user-friendly, customisable approach is required, so that citizens may find the public services that address their individual needs. In this paper we present the Semantic Public Service Portal (S-PSP), which structures and stores detailed public-services semantically, so that they may be presented to citizens on-demand in a relevant, yet uncomplicated, manner. This ontology-based approach enables automated and logical decision-making to take place semantically in the application layer of the portal, while the user remains blissfully unaware of its complexities. An additional benefit of this approach is that the eligibility of a citizen for a particular public service may be identified early. The S-PSP provides a rich, structured and personalised public service description to the citizen, with which he/she can consume the public service as directed. 
In this paper, a use-case of the S-PSP in a rural community in Greece is described, demonstrating how its use can directly reduce the administrative burden of a citizen, who in this case is a rural SME.".
- 4 abstract "Disaster management software deals with supporting staff in large catastrophic incidents such as earthquakes or floods, e.g., by providing relevant information, facilitating task and resource planning, and managing communication with all involved parties. In this paper, we introduce the SoKNOS support system, which is a functional prototype for such software using semantic technologies for various purposes. Ontologies are used for creating a mutual understanding between developers and end users from different organizations. Information sources and services are annotated with ontologies for improving the provision of the right information at the right time, and for connecting existing systems and databases to the SoKNOS system using those annotations. Furthermore, the users' actions are constantly supervised, and errors are avoided by employing ontology-based consistency checking. We show how the pervasive and holistic use of semantic technologies leads to a significant improvement of both the development and the usability of disaster management software, and present some key lessons learned from employing semantic technologies in a large-scale software project.".
- 50 abstract "The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first, and the available metadata is often minimal, heterogeneous, and distributed, making the search for the right dataset problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using the online metadata editor SAHA connected to ONKI ontology services, which provide a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and for machines as a SPARQL endpoint and as a source file. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.".
- 52 abstract "Biodiversity management requires the usage of heterogeneous biological information from multiple sources. Indexing, aggregating, and finding such information is based on names and taxonomic knowledge of organisms. However, taxonomies change in time due to evolution, new scientific findings, opinions of authorities, and changes in our conception about life forms. Furthermore, organism names and their meaning change in time, different authorities use different scientific names for the same taxon in different times, and various vernacular names are in use in different languages. This makes data integration and information retrieval difficult without detailed biological information. This paper introduces a meta-ontology for managing the names and taxonomies of organisms, and presents three applications for it: 1) publishing biological species lists as ontology services (ca. 20 taxonomies including more than 80,000 names), 2) collaborative management of the vernacular names of vascular plants (ca. 26,000 taxa), and 3) management of individual scientific name changes based on research results, covering a group of beetles. The applications are based on the databases of the Finnish Museum of Natural History and are used in a living lab environment on the web.".
- 56 abstract "In this paper we present a novel approach for achieving energy efficiency in public buildings (especially sensor-enabled offices) based on the application of intelligent complex event processing and semantic technologies. At the core of the approach is an efficient method for realizing real-time situational awareness that helps in recognizing, in real time, the situations where more efficient energy consumption is possible, and in reacting to those opportunities promptly. Semantics allows a proper contextualization of the sensor data (its abstract interpretation), whereas complex event processing enables the efficient real-time processing of sensor data, and its logic-based nature supports a declarative definition of the situations of interest. The approach has been implemented in the iCEP framework for intelligent Complex Event Reasoning, developed by the authors. The results from a preliminary evaluation study are very promising: the approach enables very precise real-time detection of the office occupancy situations that limit the operation of the lighting system based on the actual use of the space.".
- 4 abstract "The amount of semantic data available as RDF is large and increasing. Despite the increased awareness that exploiting this large amount of data requires not only logic-based reasoning but also statistics-based inference capabilities, only little work can be found for the latter. On semantic data, supervised approaches, particularly kernel-based Support Vector Machines (SVM), are promising. However, obtaining the right features to be used in kernels is an open problem because the number of features that can be extracted from the complex structure of semantic data might be very large. Further, instead of a single one, combining several kernels that specialize on subsets of features can help to deal with efficiency and data sparsity. This creates the additional problem of identifying and combining different subsets of features and kernels, respectively. In this work, we solve these two problems by employing the strategy of dynamic propositionalization to compute a hypothesis, representing the relevant features for a set of examples. Then, an R-convolution kernel is obtained from a set of clause kernels derived from components of the hypothesis. The learning of the hypothesis and kernel(s) is performed in an interleaving fashion, using a coevolution-based genetic algorithm for the underlying problem of multi-objective optimization. Based on experiments on real-world datasets, we show that the resulting relational kernel machine improves on the SVM baseline.".
- 6 abstract "An advantage of Semantic Web standards like RDF and OWL is their flexibility in modifying the structure of a knowledge base. To turn this flexibility into a practical advantage, it is of high importance to have tools and methods that offer similar flexibility in extracting information from a knowledge base. This is closely related to the ability to easily formulate queries over those knowledge bases. We explain benefits and drawbacks of existing techniques in achieving this goal and then present the QTL algorithm, which fills a gap in research and practice. It uses supervised machine learning and allows users to ask queries without knowing the schema of the underlying knowledge base beforehand and without expertise in the SPARQL query language. We then present the AutoSPARQL user interface, which implements an active learning approach on top of QTL. Finally, we present an evaluation based on the SPARQL query log of the DBpedia knowledge base.".
- 8 abstract "The Linked Open Data (LOD) is a major milestone towards realizing the Semantic Web vision, and can enable applications such as robust Question Answering (QA) systems that can answer queries requiring multiple, disparate information sources. However, realizing these applications requires relationships at both the schema and instance level, but currently the LOD only provides relationships for the latter. To address this limitation, we present a solution for automatically finding schema-level links between two LOD ontologies -- in the sense of ontology alignment. Our solution, called BLOOMS+, extends our previous solution (i.e. BLOOMS) in two significant ways. BLOOMS+ 1) uses a more sophisticated metric to determine which classes between two ontologies to align, and 2) considers contextual information to further support (or reject) an alignment. We present a comprehensive evaluation of our solution using schema-level mappings from LOD ontologies to Proton (an upper level ontology) -- created manually by human experts for a real world application called FactForge. We show that our solution performed well on this task. We also show that our solution significantly outperformed existing ontology alignment solutions (including our previously published work on BLOOMS) on this same task.".
- 11 abstract "The amount of Linked Data is increasing steadily. Optimized top-down Linked Data query processing based on complete knowledge about all sources, bottom-up processing based on run-time discovery of sources, as well as a mixed strategy that combines them have been proposed. One particular problem with Linked Data processing is that the heterogeneity of the sources and access options leads to varying input latency, rendering the application of blocking join operators infeasible. Previous work partially addresses this by proposing a non-blocking iterator-based operator and another based on symmetric hash join. In this paper, we propose detailed cost models for these two operators to systematically compare them, and to allow for query optimization. Further, we propose a novel operator called the Symmetric Index Hash Join to address one open problem of Linked Data query processing: to query not only remote but also local Linked Data. We perform experiments on real-world datasets to compare our approach against the iterator-based baseline, and create a synthetic dataset to more systematically analyze the impacts of the individual components captured by the proposed cost models.".
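A minimal, generic symmetric hash join over two interleaved binding streams illustrates the non-blocking behaviour this abstract contrasts with blocking joins (our simplification under assumed names and data shapes; the paper's Symmetric Index Hash Join additionally probes a local index):

```python
# Sketch: each arriving binding is inserted into its own side's hash
# table and immediately probed against the other side's table, so
# results stream out without waiting for either input to finish.

from collections import defaultdict

def symmetric_hash_join(events, join_var):
    """`events`: interleaved stream of ("L"|"R", binding-dict) pairs."""
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    other = {"L": "R", "R": "L"}
    for side, binding in events:
        key = binding[join_var]
        tables[side][key].append(binding)        # insert into own table
        for match in tables[other[side]][key]:   # probe the other table
            yield {**match, **binding}           # merged solution mapping

events = [
    ("L", {"s": "ex:a", "name": "A"}),
    ("R", {"s": "ex:a", "age": 3}),
    ("L", {"s": "ex:b", "name": "B"}),
    ("R", {"s": "ex:a", "age": 4}),
]
print(list(symmetric_hash_join(events, "s")))
```

Because both inputs are insert-and-probe, a slow source delays only its own bindings rather than stalling the whole join, which is why this operator family suits the varying input latency of Linked Data sources.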
- 28 abstract "Link traversal based query execution is a new query execution paradigm for the Web of Data. This approach allows the execution engine to discover potentially relevant data during the query execution and, thus, enables users to tap the full potential of the Web. In earlier work we proposed to implement the idea of link traversal based query execution using a synchronous pipeline of iterators. While this idea allows for an easy and efficient implementation, it introduces restrictions that cause less comprehensive result sets. In this paper we address this limitation. We analyze the restrictions and discuss how the evaluation order of a query may affect result set size and query execution costs. To identify a suitable order, we propose a heuristic for our scenario, where no a priori information about relevant data sources is present. We evaluate the effectiveness of this heuristic by executing real-world queries over the Web of Data.".
- 29 abstract "A sizable amount of data on the Web is currently available via Web APIs that expose data in formats such as JSON or XML. Combining data from different APIs and data sources requires glue code which is typically not shared and hence not reused. We propose Linked Data Services (LIDS), a general, formalised approach for integrating data-providing services with Linked Data, a popular mechanism for data publishing which facilitates data integration and allows for decentralised publishing. We present conventions for service access interfaces that conform to Linked Data principles, and an abstract lightweight service description formalism. We develop algorithms that use LIDS descriptions to automatically create links between services and existing data sets. To evaluate our approach, we realise LIDS wrappers and LIDS descriptions for existing services and measure performance and effectiveness of an automatic interlinking algorithm over multiple billions of triples.".
- 3 abstract "The explosion in growth of the Web of Linked Data has provided, for the first time, a plethora of information in disparate locations, yet bound together by machine-readable, semantically typed relations. Utilisation of the Web of Data has been, until now, restricted to members of the community, eating their own dogfood, so to speak. To the regular web user browsing Facebook and watching Youtube, this utility is yet to be realised. The primary factor inhibiting such uptake is the usability of the Web of Data, which requires users to have prior knowledge of elements from the Semantic Web technology stack. We present a solution to this problem by hiding the stack, allowing end users to browse the Web of Data, explore and discover the information contained, and use Linked Data. Our solution employs a template-based visualisation approach where information attributed to a given resource is rendered according to its rdf:type.".
- 4 abstract "As more and more user traces become available as Linked Data on the Web, using those traces for expert finding becomes an interesting challenge, especially for open innovation platforms. The existing expert search approaches are mostly limited to one corpus and one particular type of trace – sometimes even to a particular domain. We argue that different expert communities use different communication channels as their primary means of communicating and disseminating knowledge, and thus different types of traces would be relevant for finding experts on different topics. We propose an approach for adapting the expert search process (choosing the right type of trace and the right expertise hypothesis) to the given topic of expertise, by relying on Linked Data metrics. In a gold standard-based experiment, we have shown that there is a significant positive correlation between the values of our metrics and the precision and recall of expert search. We also present hy.SemEx, a system that uses our Linked Data metrics to recommend the expert search approach to be used for finding experts in an open innovation scenario at hypios. The evaluation of the users’ satisfaction with the system’s recommendations is presented as well.".
- 8 abstract "While the realization of the Semantic Web as once envisioned by Tim Berners-Lee remains in a distant future, the Web of Data has already become a reality. Billions of RDF statements out there on the Internet, facts about a variety of different domains, are ready to be used by semantic applications. Some of these applications, however, crucially hinge on the availability of expressive schemas suitable for logical inference that yields non-trivial conclusions. In this paper, we present a statistical approach to the induction of expressive schemas from large RDF repositories. We describe in detail the implementation of this approach and report on an evaluation that we conducted using several data sets including DBpedia.".
- 3 abstract "As comparatively powerful mobile computing devices become more common, mobile web applications are dramatically gaining popularity. In this paper we present an approach for a mobile semantic collaboration platform based on the OntoWiki framework. It allows users to collect instance data, refine the structure of knowledge bases and browse data using hierarchical or faceted navigation on the go, even without a data connection. A crucial part of OntoWiki Mobile is the advanced replication and conflict resolution for RDF content. The approach for conflict resolution is based on a combination of distributed revision control strategies and the EvoPat method for data evolution and ontology refactoring. OntoWiki Mobile is available as an HTML5 Web application and can be used in scenarios where semantically rich information has to be collected under field conditions, such as during biodiversity expeditions to remote areas.".
- 4 abstract "Smartphones, which contain a large number of sensors and integrated devices, are becoming increasingly powerful and fully featured computing platforms in our pockets. For many people they already replace the computer as their window to the Internet, to the Web as well as to social networks. Hence, the management and presentation of information about contacts, social relationships and associated information is one of the main requirements and features of today’s smartphones. The problem is currently solved only for centralized proprietary platforms (such as Google mail, contacts & calendar) as well as data-silo-like social networks (e.g. Facebook). Within the Semantic Web initiative, standards and best practices for social, semantic web applications such as FOAF emerged. However, there is no comprehensive strategy for how these technologies can be used efficiently in a mobile environment. In this paper we present the architecture as well as the implementation of a mobile Social Semantic Web framework, which weaves a distributed social network based on semantic technologies.".
- 14 abstract "We present an approach to content selection and discourse structuring in text generation that works directly on a task-independent domain knowledge base (KB) modeled in OWL. In order to facilitate the representation of objects, events and semantic relations that are inferred from the basic facts of the domain to obtain richer and more fluent texts, we distinguish between an extended ontology and an upper layer ontology. The nodes in the KB are weighted according to learnt models of content selection, such that a subset of them can be extracted. The extraction is done using templates that also consider semantic relations between the nodes and the user profile. The discourse structuring submodule maps the semantic relations to discourse relations and forms discourse units, to then arrange them into a coherent discourse graph. The approach is illustrated and evaluated on an ontology that models the First Spanish Football League.".
- 17 abstract "There are a large number of ontologies currently available on the Semantic Web. However, in order to exploit them within natural language processing applications, more morphosyntactic information than can be represented in current Semantic Web standards is required. Further, there are a large number of lexical resources available representing a wealth of linguistic information, but this data exists in varied formats and is difficult to link to ontologies and other resources. We present a model we call lemon (Lexicon Model for Ontologies) that supports the sharing of terminological and lexicon resources on the Semantic Web as well as their linking to the existing semantic representations provided by ontologies. We demonstrate that lemon can succinctly represent existing lexical resources and in combination with standard NLP tools we can easily generate new lexica for domain ontologies according to the lemon model. We demonstrate that by combining generated and existing lexica we can collaboratively develop rich lexical descriptions of ontology entities. We also show that the adoption of Semantic Web standards can provide added value for lexicon models by supporting a rich axiomatization of linguistic categories that can be used to constrain the usage of the model and to perform consistency checks.".
- 4 abstract "Structured semantic metadata about unstructured web documents can be created using automatic subject indexing methods, avoiding laborious manual indexing. A successful automatic subject indexing tool for the web should work with texts in multiple languages and be independent of the domain of discourse of the documents and controlled vocabularies. However, analyzing text written in a highly inflected language requires word form normalization that goes beyond rule-based stemming algorithms. We have tested the state-of-the-art automatic indexing tool Maui on Finnish texts using three stemming and lemmatization algorithms, and evaluated it with documents and vocabularies of different domains. Both of the lemmatization algorithms we tested performed significantly better than a rule-based stemmer, and the subject indexing quality was found to be comparable to that of human indexers.".
- 11 abstract "A growing number of ontologies have been published on the Semantic Web by various parties, to be shared for describing things. Because of the decentralized nature of the Web, there often exist different but similar ontologies from overlapping domains, or even within the same domain. In this paper, we collect more than four thousand ontologies and perform a large-scale pairwise matching based on an ontology matching tool. We create about 3.1 million mappings between the concepts (classes and properties) from these ontologies, and build a complex concept mapping graph (CMG) with concepts as nodes and mappings as edges. We analyze the macroscopic properties of the CMG as well as the induced ontology mapping graph (OMG), which characterize the global ontology matchability in many aspects, including the degree distribution, connectivity and reachability. We further establish a pay-level-domain mapping graph (PMG) to understand the common interests between different ontology publishers. In addition, we publish the generated mappings online based on the R2R mapping language. These mappings and our observations are believed to be useful for the Linked Data community in ontology creation, integration and maintenance.".
- 2 abstract "Ontologies may contain redundancy in terms of axioms that logically follow from other axioms and that could be removed for the sake of consolidation and conciseness without changing the overall meaning. In this paper, we investigate methods for removing such redundancy from ontologies. We define notions around redundancy and discuss typical cases of redundancy and their relation to ontology engineering and evolution. We provide methods to compute irredundant ontologies both indirectly by calculating justifications, and directly by utilising a hitting set tree algorithm and module extraction techniques for optimization. Moreover, we report on experimental results on removing redundancy from existing ontologies available on the Web.".
- 21 abstract "It is important that an ontology captures the essential conceptual structure of the target world as generally as possible. However, such ontologies are sometimes regarded as weak and shallow by domain experts, because experts often want to understand the target world from the domain-specific viewpoints in which they are interested. Therefore, it is highly desirable to have knowledge structured not only from a general perspective but also from domain-specific and multiple perspectives, so that concepts are organized for appropriate understanding by multiple experts. On the basis of this observation, the authors propose a novel approach, called divergent exploration of an ontology, to bridge the gap between ontologies and domain experts. Based on this approach, we developed an ontology exploration tool which allows experts to explore an ontology and visualizes the result in a user-friendly form, i.e. a conceptual map, depending on the viewpoints that they specify. We evaluated the system through its application to an environmental domain and an experimental use by domain experts. As a result, we confirmed that the tool supports experts in obtaining knowledge that is meaningful to them through divergent exploration, and that it contributes to an integrated understanding of the ontology and its target domain.".
- 23 abstract "There is an assumption that ontology developers will use a top-down approach by using a foundational ontology, because it purportedly speeds up ontology development and improves the quality and interoperability of the domain ontology. Informal assessment of these assumptions reveals ambiguous results that are not only open to different interpretations, but also such that foundational ontology use is not included in several methodologies. Therefore, we investigated these assumptions in a controlled experiment. After a brief lecture about the contents of DOLCE, BFO, and part-whole relations, one-third of the participants chose to commence domain ontology development with an OWLized foundational ontology. Concerning new classes and class axioms, there is a trend in favour of those who commenced with a foundational ontology, but it is not statistically significant. The comprehensive results show that the `cost' incurred in spending time getting acquainted with a foundational ontology, compared to starting from scratch, was more than made up for in size, understandability, and interoperability already within the limited time frame of the experiment.".
- 24 abstract "While ontologies are widely accepted internationally as a knowledge management mechanism across disciplines, the ability to reason over knowledge bases regardless of the natural languages used in them has become a pressing issue in digital content management. To enable knowledge sharing and reuse, ontology mapping techniques must be able to work with otherwise isolated ontologies that are labelled in diverse natural languages. Machine translation techniques are often employed by cross-lingual ontology mapping approaches to turn a cross-lingual mapping problem into a monolingual mapping problem, which can then be solved by state-of-the-art monolingual ontology matching tools. However, in the process of doing so, complications introduced by machine translation tools can compromise the performance of the subsequent monolingual matching techniques. In this paper, a novel approach to improve the quality of cross-lingual ontology mapping is presented and evaluated. The proposed approach adopts a pseudo feedback technique that is similar to the well understood relevance feedback mechanism used in the field of information retrieval. It is shown through the evaluation that the pseudo feedback feature can enhance the effectiveness of machine translation and monolingual matching techniques in a cross-lingual ontology mapping scenario.".
- 56 abstract "When different versions of an ontology are published online, the links between them are often lost, as the standard mechanisms (such as owl:versionInfo and owl:priorVersion) to expose these links are rarely used. This generates issues in scenarios where people or applications are required to make use of large-scale, heterogeneous ontology collections, implicitly containing multiple versions of ontologies. In this paper, we propose a method to automatically detect versioning links between ontologies which are available online through a Semantic Web search engine. Our approach is based on two main steps. The first step selects candidate pairs of ontologies by using versioning information expressed in their identifiers. In the second step, these candidate pairs are characterized through a set of features, including similarity measures, and classified by using machine learning techniques, to distinguish the pairs that represent versions from the ones that do not. We discuss the features used, the methodology employed to train the classifiers, and the precision obtained when applying this approach on the collection of ontologies of the Watson Semantic Web search engine.".
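
The matches listed above come from a plain SELECT over the single triple pattern shown in the header. A minimal sketch of that query (the `conf` prefix abbreviation is an assumption introduced here for readability; the property IRI is the one from the header):

```sparql
PREFIX conf: <https://w3id.org/scholarlydata/ontology/conference-ontology.owl#>

# Retrieve every resource carrying an abstract, together with the abstract text.
SELECT ?s ?o
WHERE {
  ?s conf:abstract ?o .
}
```

Run against a ScholarlyData SPARQL endpoint, each binding of `?o` corresponds to one of the quoted abstract strings in the list above, with `?s` bound to the paper resource it describes.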