Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/779> ?p ?o. }
Showing items 1 to 13 of 13 with 100 items per page.
- 779 creator kazuaki-maeda.
- 779 creator stephanie-strassel.
- 779 creator xiaoyi-ma.
- 779 type InProceedings.
- 779 label "Creating Sentence-Aligned Parallel Text Corpora from a Large Archive of Potential Parallel Text using BITS and Champollion".
- 779 sameAs 779.
- 779 abstract "Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has supported research on statistical machine translations and other NLP applications by creating and distributing a large amount of parallel text resources for the research communities. However, manual translations are very costly, and the number of known providers that offer complete parallel text is limited. This paper presents a cost effective approach to identify parallel document pairs from sources that provide potential parallel text - namely, sources that may contain whole or partial translations of documents in the source language - using the BITS and Champollion parallel text alignment systems developed by LDC.".
- 779 hasAuthorList authorList.
- 779 hasTopic Linguistics.
- 779 isPartOf proceedings.
- 779 keyword "Corpus (creation, annotation, etc.)".
- 779 keyword "Machine Translation, SpeechToSpeech Translation".
- 779 title "Creating Sentence-Aligned Parallel Text Corpora from a Large Archive of Potential Parallel Text using BITS and Champollion".
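
The listing above is the result of matching the pattern `{ <subject> ?p ?o. }`, where the subject is fixed and `?p` and `?o` are variables. A minimal sketch of that matching logic follows, assuming the triples are held in memory as plain tuples using the shortened labels shown on this page. The names `TRIPLES` and `match_pattern` are hypothetical and are not part of ScholarlyData or any library.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# A subset of the triples listed above, using the page's shortened labels
# (assumption: the full IRIs are omitted for readability).
TRIPLES: List[Triple] = [
    ("779", "creator", "kazuaki-maeda"),
    ("779", "creator", "stephanie-strassel"),
    ("779", "creator", "xiaoyi-ma"),
    ("779", "type", "InProceedings"),
    ("779", "hasTopic", "Linguistics"),
    ("779", "keyword", "Corpus (creation, annotation, etc.)"),
    ("779", "keyword", "Machine Translation, SpeechToSpeech Translation"),
]


def match_pattern(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that match the pattern; None acts as a variable."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Equivalent of { 779 ?p ?o. }: fix the subject, leave predicate and object open.
all_for_paper = match_pattern(TRIPLES, s="779")

# Narrow the pattern to one predicate to collect the creators.
creators = [obj for _, _, obj in match_pattern(TRIPLES, s="779", p="creator")]
```

Leaving a position as `None` plays the role of a variable such as `?p` or `?o`, so the same function covers both the full listing and narrower lookups.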