Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/www2012/phd/39> ?p ?o. }
Showing items 1 to 12 of 12 with 100 items per page.
- 39 creator cheng-wang.
- 39 type InProceedings.
- 39 label "AMBER: Turning Annotations into Knowledge".
- 39 sameAs 39.
- 39 abstract "Web extraction is the task of turning unstructured HTML into knowledge. Computers are able to generate annotations of unstructured HTML, but it is more important to turn those annotations into structured knowledge. Unfortunately, the current systems extracting knowledge from result pages lack accuracy. In this proposal, we present AMBER, a system fully automated turning annotations to structured knowledge from any result page of a given domain. AMBER observes basic domain attributes on a page and leverages repeated occurrences of similar attributes to group related attributes into records. This contrasts to previous approaches that analyze the repeated structure only of the HTML, as no domain knowledge is available. Our multi-domain experimental evaluation on hundreds of sites demonstrates that AMBER achieves accuracy (>98%) comparable to skilled human annotator.".
- 39 hasAuthorList authorList.
- 39 isPartOf proceedings.
- 39 isPartOf proceedings.
- 39 keyword "knowledge extraction".
- 39 keyword "vertical search".
- 39 keyword "web data extraction".
- 39 title "AMBER: Turning Annotations into Knowledge".
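
The pattern at the top of this listing fixes the subject and leaves `?p` and `?o` as variables, so every triple about that subject matches. The sketch below illustrates that matching rule on a few of the triples listed above. It is a minimal illustration, not how the ScholarlyData server is implemented. The `Triple` type, the `match` helper and the `https://example.org/other` subject are hypothetical names introduced for the example. Predicates and objects use the shortened labels shown in this listing rather than full IRIs.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


SUBJECT = "https://w3id.org/scholarlydata/inproceedings/www2012/phd/39"

# A subset of the triples listed above, using the shortened labels
# from the listing (assumption: the real data uses full IRIs).
TRIPLES = [
    Triple(SUBJECT, "creator", "cheng-wang"),
    Triple(SUBJECT, "type", "InProceedings"),
    Triple(SUBJECT, "title", "AMBER: Turning Annotations into Knowledge"),
    Triple(SUBJECT, "keyword", "knowledge extraction"),
    Triple(SUBJECT, "keyword", "vertical search"),
    Triple(SUBJECT, "keyword", "web data extraction"),
    # Hypothetical triple about another subject, to show it is filtered out.
    Triple("https://example.org/other", "type", "InProceedings"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None plays the role of a variable."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Equivalent of { <SUBJECT> ?p ?o. }: subject fixed, predicate and object free.
matches = match(TRIPLES, subject=SUBJECT)

# Narrowing the pattern to one predicate returns only the keyword triples.
keywords = [t.obj for t in match(TRIPLES, subject=SUBJECT, predicate="keyword")]
```

Fixing more positions of the pattern narrows the result, which is why the same subject can be listed under `keyword` three times and under `isPartOf` twice: each line is a separate triple.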