Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/458> ?p ?o. }
Showing items 1 to 15 of 15 with 100 items per page.
- 458 creator atsushi-fujii.
- 458 creator masao-utiyama.
- 458 creator mikio-yamamoto.
- 458 creator takehito-utsuro.
- 458 type InProceedings.
- 458 label "Producing a Test Collection for Patent Machine Translation in the Seventh NTCIR Workshop".
- 458 sameAs 458.
- 458 abstract "In aiming at research and development on machine translation, we produced a test collection for Japanese-English machine translation in the seventh NTCIR Workshop. This paper describes details of our test collection. From patent documents published in Japan and the United States, we extracted patent families as a parallel corpus. A patent family is a set of patent documents for the same or related invention and these documents are usually filed to more than one country in different languages. In the parallel corpus, we aligned Japanese sentences with their counterpart English sentences. Our test collection, which includes approximately 2,000,000 sentence pairs, can be used to train and test machine translation systems. Our test collection also includes search topics for cross-lingual patent retrieval and the contribution of machine translation to a patent retrieval task can also be evaluated. Our test collection will be available to the public for research purposes after the NTCIR final meeting.".
- 458 hasAuthorList authorList.
- 458 hasTopic Linguistics.
- 458 isPartOf proceedings.
- 458 keyword "Corpus (creation, annotation, etc.)".
- 458 keyword "Evaluation methodologies".
- 458 keyword "Machine Translation, SpeechToSpeech Translation".
- 458 title "Producing a Test Collection for Patent Machine Translation in the Seventh NTCIR Workshop".
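
The pattern at the top of this listing fixes the subject and leaves the predicate and object as variables (?p and ?o). As a rough illustration of what such a match does, here is a minimal in-memory sketch in Python. It is not how ScholarlyData answers the query: the `match` helper, the `TRIPLES` list, and the `example.org` subject are hypothetical, and the predicate and object values are the shortened labels shown in the listing rather than full IRIs.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

PAPER = "https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/458"

# A few triples copied from the listing above, using the shortened labels
# shown on the page (assumption: the real data uses full IRIs).
TRIPLES: List[Triple] = [
    (PAPER, "creator", "atsushi-fujii"),
    (PAPER, "creator", "masao-utiyama"),
    (PAPER, "creator", "mikio-yamamoto"),
    (PAPER, "creator", "takehito-utsuro"),
    (PAPER, "type", "InProceedings"),
    (PAPER, "keyword", "Evaluation methodologies"),
    # Hypothetical triple about a different subject, added only so the
    # filter has something to exclude.
    ("https://example.org/other-paper", "type", "InProceedings"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a variable."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Equivalent of { <PAPER> ?p ?o. } over the sample data.
matches = match(TRIPLES, s=PAPER)

# Narrowing the pattern to one predicate lists the paper's creators.
creators = [obj for _, _, obj in match(TRIPLES, s=PAPER, p="creator")]
```

Passing `None` for a position plays the role of a variable, so `match(TRIPLES, s=PAPER)` mirrors the subject-only pattern used for this page.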