Matches in ESWC 2020 for { ?s ?p ?o. }
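The list below is the set of bindings for the basic graph pattern { ?s ?p ?o. }. As a minimal sketch, the same listing could be reproduced with Python and rdflib, assuming the ESWC 2020 metadata graph is available locally (the file name eswc2020.ttl is a placeholder):

```python
# Minimal sketch: run the bare triple pattern { ?s ?p ?o } over a local copy
# of the conference metadata. The file name "eswc2020.ttl" is a placeholder.
from rdflib import Graph

g = Graph()
g.parse("eswc2020.ttl", format="turtle")

# Equivalent to the SPARQL query SELECT ?s ?p ?o WHERE { ?s ?p ?o . }
for s, p, o in g.triples((None, None, None)):
    print(s, p, o)
```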
- Manfred_Hauswirth holdsRole Author.60.6.
- Paper.60_Review.0_Reviewer type RoleDuringEvent.
- Paper.60_Review.0_Reviewer label "Anonymous Reviewer for Paper 60".
- Paper.60_Review.0_Reviewer withRole ReviewerRole.
- Paper.60_Review.0_Reviewer withRole AnonymousReviewerRole.
- Paper.60_Review.0 type ReviewVersion.
- Paper.60_Review.0 issued "2001-01-15T14:58:00.000Z".
- Paper.60_Review.0 creator Paper.60_Review.0_Reviewer.
- Paper.60_Review.0 hasRating ReviewRating.2.
- Paper.60_Review.0 hasReviewerConfidence ReviewerConfidence.5.
- Paper.60_Review.0 reviews Paper.60.
- Paper.60_Review.0 issuedAt easychair.org.
- Paper.60_Review.0 issuedFor Conference.
- Paper.60_Review.0 releasedBy Conference.
- Paper.60_Review.0 hasContent "The paper presents Piveau, a platform for "Large-scale Open Data Management". The authors provide several arguments on why a new platform for open data management is needed and how their solutions compare with other open and popular solutions, and present some details of their implementation of the system. All the components of the system are open-source and publicly available on github. The solution relies on semantic web technologies and is deployed on https://www.europeandataportal.eu/ and so meets the requirements of the In-Use Track. I would like to see the paper published and presented at ESWC, although I have a few comments that I encourage the authors to address before final publication. My main comment is about the core hypothesis, which is: "a more sophisticated application of Semantic Web technologies can lower many barriers in Open Data publishing and reuse". I was happy to see that you define this core hypothesis early in the paper, but at the end I was somewhat disappointed that you haven't completely tested and validated this hypothesis. Throughout the paper you outline how semantic web technologies are used and how other solutions don't use such technologies, but then you don't make it clear how the use of these technologies can help (or have helped) the end users of the system. - In Section 6, Table 1, you are comparing features, and e.g. you are giving your solution 2 points for having a "Linked Data interface" through a "SPARQL endpoint". But the main question is: why is this needed and how can it help? If I can achieve my goals through a simple JSON based REST API, then why SPARQL? Similarly, one of the semantic technologies you are using is to achieve "Quality Assurance" comparing with the other solutions. Can you more clearly outline why this is important and how one could have achieved it without e.g. using SHACL and how crucial is the role of semantic web technologies here? - One suggestion is to use examples throughout the paper. You seem to have a really large deployment that you can use to derive examples. https://www.europeandataportal.eu/ shows 483,714 datasets. How would the solution for europeandataportal.eu look like if it wasn't based on Piveau? - Related to the above, it would be good to know whether RDF enables particular forms of reasoning that a standard JSON meta-data based solution cannot provide. - Is there a reason you do not mention the Socrata Platform which is a very widely used open data publication platform? I believe it is also open https://github.com/socrata ? - Another core question / food for thought: why not using Wikidata for all the meta-data, which has the additional benefit that you will contribute to a large and completely open knowledge base? And on the technical side, do you really need to rely on Virtuoso? Couldn't you contribute facts to Wikidata instead and query the public APIs?"".
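The review above asks how SHACL-based quality assurance helps compared to a plain JSON REST API. A minimal sketch of the kind of check under discussion, using pyshacl; the shape below (every dcat:Dataset description must carry a dct:title) and the input file name are illustrative assumptions, not Piveau's actual rules:

```python
# Illustrative SHACL quality check with pyshacl; the shape below (every
# dcat:Dataset must have a dct:title) is a made-up example, not Piveau's rules.
from rdflib import Graph
from pyshacl import validate

shapes_ttl = """
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

[] a sh:NodeShape ;
   sh:targetClass dcat:Dataset ;
   sh:property [ sh:path dct:title ; sh:minCount 1 ] .
"""

data = Graph().parse("harvested-metadata.ttl", format="turtle")   # placeholder file
shapes = Graph().parse(data=shapes_ttl, format="turtle")

conforms, _report_graph, report_text = validate(data, shacl_graph=shapes)
print("conforms:", conforms)
print(report_text)
```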
- Paper.60_Review.1_Reviewer type RoleDuringEvent.
- Paper.60_Review.1_Reviewer label "Anonymous Reviewer for Paper 60".
- Paper.60_Review.1_Reviewer withRole ReviewerRole.
- Paper.60_Review.1_Reviewer withRole AnonymousReviewerRole.
- Paper.60_Review.1 type ReviewVersion.
- Paper.60_Review.1 issued "2001-02-03T10:09:00.000Z".
- Paper.60_Review.1 creator Paper.60_Review.1_Reviewer.
- Paper.60_Review.1 hasRating ReviewRating.0.
- Paper.60_Review.1 hasReviewerConfidence ReviewerConfidence.4.
- Paper.60_Review.1 reviews Paper.60.
- Paper.60_Review.1 issuedAt easychair.org.
- Paper.60_Review.1 issuedFor Conference.
- Paper.60_Review.1 releasedBy Conference.
- Paper.60_Review.1 hasContent "The paper describes the architecture and technical choices behind Piveau, a platform for creating, sharing, curating and querying Open Data. The authors explain their thinking with respect to the choices made. As can be quickly seen, Piveau tries to attain many different goals, thus there are "many different solutions" and one could, apparently, have envisioned other "combinations of solutions" while still doing something meaningful and interesting. The strong point of the paper, according to me, is the comparison table (Table 1) on page 12. That really shows how the system relates to comparable ones. The strong point of the work itself is that Piveau appears to be the system behind https://www.europeandataportal.eu, with millions of RDF datasets and (the authors state) "tens of thousands of updates per day". I think this level of usefulness and real deployment, alone, would justify acceptance. However, I'm a bit unhappy with the paper *writing*. It starts with the introduction: it is hard to figure out exactly what the problem being solved is, and what the limitations or shortcoming of the state of the art was, before this paper was written. The abstract talks of "barriers" and "limitations...", then of "bodies that encourage and foster...", then states "However, no existing solution for managing Open Data takes full advantage of these possibilities and benefits." This claim is too vague: which possibilities and benefits? I understand that due to the platform having wide applications, it is hard to pinpoint ONE advantage. Yet I still the authors should try to at least rewrite the abstract and introduction to clarify their contribution as much as it can be done. Then, the paper suffers from several typos: "verication", "tenths", "driven. [9]" (the citation should not be outside of the previous phrase to which it belongs), a phrase without a predicate ("For instance the integration of synchronous third-party libraries into our asynchronous programming model.") The authors should really proof-read it thoroughly to catch typos and improve the style. More annoyingly, the paper makes some claims or choices whose reasons or justification are not fully clear: 1. "Furthermore, there is no satisfactory and human-friendly method to present RDF in a user interface." This ignores significant efforts invested in the data visualization community to do just that (present RDF in user interfaces). Other methods are based on RDF graph summarization etc. I understand what the authors had in mind, but the claim here is too broad and needs to be nuanced. 2. The orchestration through PPL (at the end of 4.1) appears to use an ad-hoc model for orchestration, whereas very well-known standards exist for orchestrating Web services (think WSDL or BPML). Why was it necessary to invent something new? 3. The authors state that existing ETL platforms are not developed specifically for RDF. How big of an obstacle is that? Would it have been hard to tweak an existing platform to obtain an RDF-specific ETL one? Overall, I think the paper has value, but it is also annoying in some declarations that appear unjustified. It also suffers from "describing a lot of disconnected aspects" which I think follows from *implementing* a lot of orthogonal aspects. Thus, this second problem may be hard or impossible to solve in the paper. [After reading the authors' rebuttal] It is still not clear to me why micro-services and an ad-hoc orchestration approach is better (more flexible?... why, how?) 
than the standards in that area. I also find the authors' renewed statement of their contributions not very convincing. However, the paper clearly describes a fair amount of work, I wouldn't fight against acceptance."".
- Paper.60_Review.2_Reviewer type RoleDuringEvent.
- Paper.60_Review.2_Reviewer label "Anonymous Reviewer for Paper 60".
- Paper.60_Review.2_Reviewer withRole ReviewerRole.
- Paper.60_Review.2_Reviewer withRole AnonymousReviewerRole.
- Paper.60_Review.2 type ReviewVersion.
- Paper.60_Review.2 issued "2001-01-15T21:34:00.000Z".
- Paper.60_Review.2 creator Paper.60_Review.2_Reviewer.
- Paper.60_Review.2 hasRating ReviewRating.3.
- Paper.60_Review.2 hasReviewerConfidence ReviewerConfidence.4.
- Paper.60_Review.2 reviews Paper.60.
- Paper.60_Review.2 issuedAt easychair.org.
- Paper.60_Review.2 issuedFor Conference.
- Paper.60_Review.2 releasedBy Conference.
- Paper.60_Review.2 hasContent "The paper describes the Open Data management solution "Piveau", which is a framework for publishing, harvesting, and managing dataset descriptions. The framework is deployed at the European data portal, which is impressive in size and functionality. Current Open Data publishing frameworks, such as CKAN or OpenDataSoft, only partially support RDF metadata (e.g., only export of RDF, flat data schema, etc.). The proposed solution, "Piveau", clearly demonstrates the advantages of using semantic web technologies for the Open Data use case: the solution is very well thought out and scalable (as shown by the deployment at the European data portal). In particular, the section on the impact of SW technologies (6.3) is an interesting read and gives us a list of open issues that have to be tackled to support such solutions. While I really like the work and, in my opinion, it clearly should get accepted, I have some points that could help to improve the paper: * The difference between datasets and metadata descriptions was not always clear. In my opinion you could make it clearer that, e.g., the importer collects metadata descriptions (and not datasets). A clarification/definition of terms at the beginning and a consistent use throughout the paper (dataset vs data vs metadata) could help to make this easier to understand. * Why do you compare to uData? While it is clear that CKAN is the most popular software for Open Data publishing, I was missing an argument for why you selected uData (and not OpenDataSoft, Socrata, DCAN, etc.) * To what extent are there actually links to other resources? Importing existing (e.g. JSON-based) dataset descriptions obviously does not make them Linked Data. It could be mentioned in your critical assessment that the technology alone is not enough to get from Open Data to Linked Open Data. * While you already discuss related work with respect to management solutions, you could also include SW-based approaches which aim at harmonized/integrated Open Data metadata, e.g. [1, 2]. [1] Brickley, Dan, Matthew Burgess, and Natasha Noy. "Google Dataset Search: Building a search engine for datasets in an open Web ecosystem." The World Wide Web Conference. ACM, 2019. [2] Neumaier, Sebastian, Jürgen Umbrich, and Axel Polleres. "Automated quality assessment of metadata across open data portals." Journal of Data and Information Quality (JDIQ) 8.1 (2016): 2. p.3: recent efforts focusses on -> focus on".
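The review above asks to what extent the harvested metadata actually links to other resources. A hedged sketch of one way to probe that question with rdflib; the input file and the portal namespace used in the FILTER are placeholders, not the portal's real setup:

```python
# Illustrative check of the reviewer's question "to what extent are there
# actually links to other resources?": count dataset descriptions whose
# metadata points to IRIs outside the portal's own namespace. The file name
# and namespace prefix are placeholders.
from rdflib import Graph

g = Graph().parse("harvested-metadata.ttl", format="turtle")

query = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
SELECT (COUNT(DISTINCT ?ds) AS ?linked)
WHERE {
  ?ds a dcat:Dataset ;
      ?p ?o .
  FILTER(isIRI(?o) && !STRSTARTS(STR(?o), "https://www.europeandataportal.eu/"))
}
"""
for row in g.query(query):
    print("datasets with at least one external IRI link:", row.linked)
```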
- Author.62.1 type RoleDuringEvent.
- Author.62.1 label "Sven Hertling, 1st Author for Paper 62".
- Author.62.1 withRole PublishingRole.
- Author.62.1 isHeldBy Sven_Hertling.
- b0_g161 first Author.62.2.
- b0_g161 rest nil.
- Sven_Hertling type Person.
- Sven_Hertling name "Sven Hertling".
- Sven_Hertling label "Sven Hertling".
- Sven_Hertling holdsRole Author.62.1.
- Sven_Hertling holdsRole Author.284.2.
- Author.284.2 type RoleDuringEvent.
- Author.284.2 label "Sven Hertling, 2nd Author for Paper 284".
- Author.284.2 withRole PublishingRole.
- Author.284.2 isHeldBy Sven_Hertling.
- Armando_Stellato type Person.
- Armando_Stellato name "Armando Stellato".
- Armando_Stellato label "Armando Stellato".
- Armando_Stellato holdsRole Paper.62_Review.0_Reviewer.
- Armando_Stellato holdsRole Author.189.2.
- Armando_Stellato mbox mailto:stellato@uniroma2.it.
- Paper.62_Review.0_Reviewer type RoleDuringEvent.
- Paper.62_Review.0_Reviewer label "Armando Stellato, Reviewer for Paper 62".
- Paper.62_Review.0_Reviewer withRole ReviewerRole.
- Paper.62_Review.0_Reviewer withRole NonAnonymousReviewerRole.
- Paper.62_Review.0_Reviewer isHeldBy Armando_Stellato.
- Author.189.2 type RoleDuringEvent.
- Author.189.2 label "Armando Stellato, 2nd Author for Paper 189".
- Author.189.2 withRole PublishingRole.
- Author.189.2 isHeldBy Armando_Stellato.
- Paper.62_Review.0 type ReviewVersion.
- Paper.62_Review.0 issued "2001-01-28T10:03:00.000Z".
- Paper.62_Review.0 creator Paper.62_Review.0_Reviewer.
- Paper.62_Review.0 hasRating ReviewRating.1.
- Paper.62_Review.0 hasReviewerConfidence ReviewerConfidence.5.
- Paper.62_Review.0 reviews Paper.62.
- Paper.62_Review.0 issuedAt easychair.org.
- Paper.62_Review.0 issuedFor Conference.
- Paper.62_Review.0 releasedBy Conference.
- Paper.62_Review.0 hasContent "++++ COMMENTS AFTER REVIEW ++++ While I still hold some reserves, the authors' response has been clear and convincing on some of the points raised by me. Additionally, the required changes do not require a full revision of the work but only clarifications. I thus revise my evaluation to "weak accept". ++++ INITIAL REVIEW ++++ The paper reports on the latest evolutions and improvements in the evaluation campaign OAEI, introducing in particular the latest gold standards for the recently added Knowledge Graph track and discussing the results of a hidden task – an evaluation of a run of matchers over datasets with no-overlapping domains - the that they performed over the last competing systems of the initiative. The paper is well and clearly written, it introduces the initiative for those who are new to it, briefly updates the reader on the latest additions to the evaluation and discusses the benchmarks, the way they have been built and the challenges characterizing the task as much as its evaluation I have a doubt about the trustworthiness of the Gold Standards. If the GS 2019 revealed so many matches, then there are a lot of them missing from GS 2018. In particular, why, if the models have been always matched by experts (i.e., even in 2019, as claimed by the authors), the two dataset pairs that are present in both GSs (i.e. memoryalpha-memorybeta and memoryalpha-stexpanded) have so different results? Have they simply been improved in 2019? However, if GS 2019 shows 4 and 7 trivial matches (respectively, for the two pairs, i.e. 14 total – 10 = 4 and 13 – 6 = 7) for them, why these slipped out of the attention of the crowdworkers on such a small number of elements? Additionally, if the number of matched instances with the link method is decently reliable (at least as an order of magnitude), how can the poor numbers of negative matches from 2018 be of any support in the evaluation? (as in section 3.1 it is said that only them have been adopted for the precision). I’m guessing they are mostly negative trivial matches, thus avoiding a phenomenon which might be smaller; however I’m not so sure of that and I don’t think it is understandable from this data. As these corpora of matches are used for the evaluation, noting such large differences across years raises some doubts about how much these can be called “gold standards” Later on, in the golden hammer bias section, in order to assess how much a 50-sample is reliable for the golden hammer, a statistical test of confidence should have been carried on, repeating experiments with different 50-matches samples on the same systems and checking their variation. The observations on the golden hammer bias are also interesting, but it should be, again, statistically assessed how much those numbers in the domain-overlapping case are not a result of those overfitting matchers, yet a purely proportional result of a matcher that “throws out some results” in an ocean populated with many positive matches versus one where there are not. I’m ambivalent, on the one side I value the importance of publicly disseminating on the results of such a renowned initiative such as OAEI, on the other side we must critically address flaws in the initiative itself or on its related dissemination, where the results might be questionable and the way to analyze them not as complete as it could be. I’d really suggest, for future investigation, to collaborate with some expert in statistics, to understand the statistical significance of the results. 
Numbers are important, but their meaning (as we all know working in the field of semantics) is even more. TYPOs. * the reference to table 3.2 in pag. 6 is actually table 4 MINOR REMARKS: The concept of trivial match (which can – rightly, as the reviewer is aware of it – be guessed from the end of the paragraph where “same names” are mentioned) should be clarified to the reader, as they might not be aware of it."".
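The review above suggests repeating the golden-hammer experiment with different 50-match samples and checking their variation. A minimal sketch of such a resampling check in plain Python; the pool of judgements is synthetic and would be replaced by the real manual assessments from the evaluation:

```python
# Minimal sketch of the resampling check the reviewer suggests: estimate how
# much a precision value based on a 50-match sample can vary, by repeatedly
# drawing 50-match samples from a larger pool of judged correspondences.
# The pool here is synthetic (80 correct, 20 incorrect), purely for illustration.
import random

random.seed(0)
pool = [True] * 80 + [False] * 20          # hypothetical correct/incorrect judgements
SAMPLE_SIZE, RUNS = 50, 10_000

precisions = []
for _ in range(RUNS):
    sample = random.sample(pool, SAMPLE_SIZE)
    precisions.append(sum(sample) / SAMPLE_SIZE)

precisions.sort()
lo, hi = precisions[int(0.025 * RUNS)], precisions[int(0.975 * RUNS)]
print(f"mean precision ~ {sum(precisions) / RUNS:.3f}, 95% interval ~ [{lo:.2f}, {hi:.2f}]")
```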
- Author.66.1 type RoleDuringEvent.
- Author.66.1 label "Qingxia Liu, 1st Author for Paper 66".
- Author.66.1 withRole PublishingRole.
- Author.66.1 isHeldBy Qingxia_Liu.
- b0_g163 first Author.66.2.
- b0_g163 rest b0_g164.
- Author.66.2 type RoleDuringEvent.
- Author.66.2 label "Yue Chen, 2nd Author for Paper 66".
- Author.66.2 withRole PublishingRole.
- Author.66.2 isHeldBy Yue_Chen.
- b0_g164 first Author.66.3.
- b0_g164 rest b0_g165.
- b0_g165 first Author.66.4.
- b0_g165 rest b0_g166.
- Author.66.4 type RoleDuringEvent.
- Author.66.4 label "Evgeny Kharlamov, 4th Author for Paper 66".
- Author.66.4 withRole PublishingRole.