Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/120> ?p ?o. }
Showing items 1 to 13 of 13 with 100 items per page.
- 120 creator barbara-plank.
- 120 creator khalil-sima-an.
- 120 type InProceedings.
- 120 label "Subdomain Sensitive Statistical Parsing using Raw Corpora".
- 120 sameAs 120.
- 120 abstract "Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics, music), which implies that the statistics gathered by current statistical parsers are mixtures of subdomains of language use. In this paper we present a method that exploits raw subdomain corpora gathered from the web to introduce subdomain sensitivity into a given parser. We employ statistical techniques for creating an ensemble of domain sensitive parsers, and explore methods for amalgamating their predictions. Our experiments show that introducing domain sensitivity by exploiting raw corpora can improve over a tough, state-of-the-art baseline.".
- 120 hasAuthorList authorList.
- 120 hasTopic Linguistics.
- 120 isPartOf proceedings.
- 120 keyword "Acquisition, Machine Learning".
- 120 keyword "Parsing Systems".
- 120 keyword "Statistical methods".
- 120 title "Subdomain Sensitive Statistical Parsing using Raw Corpora".