Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/365> ?p ?o. }
Showing items 1 to 18 of 18 with 100 items per page.
- 365 creator gertjan-van-noord.
- 365 creator ineke-schuurman.
- 365 creator martin-reynaert.
- 365 creator nelleke-oostdijk.
- 365 creator paola-monachesi.
- 365 creator roeland-ordelman.
- 365 creator vincent-vandeghinste.
- 365 type InProceedings.
- 365 label "From D-Coi to SoNaR: a reference corpus for Dutch".
- 365 sameAs 365.
- 365 abstract "The computational linguistics community in The Netherlands and Belgium has long recognized the dire need for a major reference corpus of written Dutch. In part to answer this need, the STEVIN programme was established. To pave the way for the effective building of a 500-million-word reference corpus of written Dutch, a pilot project was established. The Dutch Corpus Initiative project or D-Coi was highly successful in that it not only realized about 10% of the projected large reference corpus, but also established the best practices and developed all the protocols and the necessary tools for building the larger corpus within the confines of a necessarily limited budget. We outline the steps involved in an endeavour of this kind, including the major highlights and possible pitfalls. Once converted to a suitable XML format, further linguistic annotation based on the state-of-the-art tools developed either before or during the pilot by the consortium partners proved easily and fruitfully applicable. Linguistic enrichment of the corpus includes PoS tagging, syntactic parsing and semantic annotation, involving both semantic role labeling and spatiotemporal annotation. D-Coi is expected to be followed by SoNaR, during which the 500-million-word reference corpus of Dutch should be built.".
- 365 hasAuthorList authorList.
- 365 hasTopic Linguistics.
- 365 isPartOf proceedings.
- 365 keyword "Corpus (creation, annotation, etc.)".
- 365 keyword "LR national/international projects, organizational/policy issues".
- 365 keyword "Standards for LRs".
- 365 title "From D-Coi to SoNaR: a reference corpus for Dutch".
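
The pattern at the top of this page fixes the subject (paper 365) and leaves `?p` and `?o` as variables, so every triple about that paper matches. A minimal sketch of that matching logic follows, assuming an in-memory list of abbreviated triples copied from the listing above; the `match` helper and the `TRIPLES` list are hypothetical names for illustration, not part of ScholarlyData or any RDF library.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative subset of the triples listed above, using the page's short labels
# rather than full URIs (an assumption made to keep the sketch small).
TRIPLES: List[Triple] = [
    ("365", "creator", "gertjan-van-noord"),
    ("365", "creator", "ineke-schuurman"),
    ("365", "type", "InProceedings"),
    ("365", "hasTopic", "Linguistics"),
    ("365", "keyword", "Standards for LRs"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None plays the role of a variable."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Equivalent of { <.../papers/365> ?p ?o. }: subject fixed, predicate and object open.
all_for_paper = match(TRIPLES, s="365")

# Narrowing the pattern: fix the predicate as well to list only the creators.
creators = [obj for _, _, obj in match(TRIPLES, s="365", p="creator")]
```

Fixing more positions of the pattern narrows the result, which is how the same data could be queried for only the creators or only the keywords of the paper.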