Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/89> ?p ?o. }
Showing items 1 to 13 of 13 with 100 items per page.
- 89 creator simon-krek.
- 89 creator tomas-erjavec.
- 89 type InProceedings.
- 89 label "The JOS Morphosyntactically Tagged Corpus of Slovene".
- 89 sameAs 89.
- 89 abstract "The JOS morphosyntactic resources for Slovene consist of the specifications, lexicon, and two corpora: jos100k, a 100,000 word balanced monolingual sampled corpus annotated with hand validated morphosyntactic descriptions (MSDs) and lemmas, and jos1M, the 1 million-word partially hand validated corpus. The two corpora have been sampled from the 600M-word Slovene reference corpus FidaPLUS. The JOS resources have a standardised encoding, with the MULTEXT-East-type morphosyntactic specifications and the corpora encoded according to the Text Encoding Initiative Guidelines P5. JOS resources are available as a dataset for research under the Creative Commons licence and are meant to facilitate developments of HLT for Slovene.".
- 89 hasAuthorList authorList.
- 89 hasTopic Linguistics.
- 89 isPartOf proceedings.
- 89 keyword "Corpus (creation, annotation, etc.)".
- 89 keyword "Standards for LRs".
- 89 keyword "Tagging".
- 89 title "The JOS Morphosyntactically Tagged Corpus of Slovene".
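
The listing above is the result of a triple pattern with a fixed subject and the variables ?p and ?o left open. As a minimal sketch of how such a pattern selects triples, the Python below matches a pattern against a small in-memory list. The `match` helper, the `TRIPLES` sample, and the `example.org` entry are hypothetical and introduced only for illustration; predicates and objects use the short labels shown on this page rather than full IRIs, and this is not how the ScholarlyData server itself is implemented.

```python
from typing import Optional

Triple = tuple[str, str, str]

PAPER = "https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/89"

# A handful of the triples listed above, using the page's short labels.
# The last entry is a made-up triple about a different subject, added so
# the pattern has something to filter out.
TRIPLES: list[Triple] = [
    (PAPER, "creator", "simon-krek"),
    (PAPER, "creator", "tomas-erjavec"),
    (PAPER, "type", "InProceedings"),
    (PAPER, "keyword", "Tagging"),
    ("https://example.org/other-paper", "type", "InProceedings"),  # hypothetical
]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a variable."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Equivalent in spirit to the pattern { <PAPER> ?p ?o. }
for subject, predicate, obj in match(TRIPLES, s=PAPER):
    print(predicate, obj)
```

Fixing the predicate as well, for example `match(TRIPLES, s=PAPER, p="creator")`, narrows the result to the two creator triples in the sample.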