Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/319> ?p ?o. }
Showing items 1 to 15 of 15 with 100 items per page.
- 319 creator chai-wutiwiwatchai.
- 319 creator koji-iwano.
- 319 creator markpong-jongtaveesataporn.
- 319 creator sadaoki-furui.
- 319 type InProceedings.
- 319 label "Thai Broadcast News Corpus Construction and Evaluation".
- 319 sameAs 319.
- 319 abstract "Large speech and text corpora are crucial to the development of a state-of-the-art speech recognition system. This paper reports on the construction and evaluation of the first Thai broadcast news speech and text corpora. Specifications and conventions used in the transcription process are described in the paper. The speech corpus contains about 17 hours of speech data while the text corpus was transcribed from around 35 hours of television broadcast news. The characteristics of the corpus were analyzed and shown in the paper. The speech corpus was split according to the evaluation focus condition used in the DARPA Hub-4 evaluation. An 18K-word Thai speech recognition system was setup to test with this speech corpus as a preliminary experiment. Acoustic model adaptations were performed to improve the system performance. The best system yielded a word error rate of about 20% for clean and planned speech, and below 30% for the overall condition.".
- 319 hasAuthorList authorList.
- 319 hasTopic Linguistics.
- 319 isPartOf proceedings.
- 319 keyword "Corpus (creation, annotation, etc.)".
- 319 keyword "Speech recognition and understanding".
- 319 keyword "Speech resource/database".
- 319 title "Thai Broadcast News Corpus Construction and Evaluation".
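
The pattern in the header fixes the subject and leaves the predicate and object as variables, which is why every match above starts with paper 319. As a minimal sketch, the same pattern could be written as a SPARQL query string in Python. The function name `build_subject_query` is hypothetical, and the snippet makes no assumption about which endpoint, if any, would accept the query.

```python
def build_subject_query(subject_iri: str) -> str:
    """Return a SPARQL SELECT query for the pattern { <subject> ?p ?o. }.

    Hypothetical helper for illustration; it only builds the query text.
    """
    # Characters that are not allowed inside a SPARQL IRI reference (<...>).
    if any(ch in subject_iri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a valid IRI for SPARQL: {subject_iri!r}")
    return f"SELECT ?p ?o WHERE {{ <{subject_iri}> ?p ?o . }}"


# Subject IRI taken from the header line of this listing.
PAPER = "https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/319"

query = build_subject_query(PAPER)
print(query)
```

Each row returned for `?p` and `?o` would correspond to one item in the list above, such as the `creator`, `keyword`, or `title` entries.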