Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/564> ?p ?o. }
Showing items 1 to 15 of 15 with 100 items per page.
- 564 creator chikara-hashimoto.
- 564 creator daisuke-kawahara.
- 564 creator keiji-shinzato.
- 564 creator sadao-kurohashi.
- 564 type InProceedings.
- 564 label "A Large-Scale Web Data Collection as a Natural Language Processing Infrastructure".
- 564 sameAs 564.
- 564 abstract "In recent years, language resources acquired from the Web are released, and these data improve the performance of applications in several NLP tasks. Although the language resources based on the web page unit are useful in NLP tasks and applications such as knowledge acquisition, document retrieval and document summarization, such language resources are not released so far. In this paper, we propose a data format for results of web page processing, and a search engine infrastructure which makes it possible to share approximately 100 million Japanese web data. By obtaining the web data, NLP researchers are enabled to begin their own processing immediately without analyzing web pages by themselves.".
- 564 hasAuthorList authorList.
- 564 hasTopic Linguistics.
- 564 isPartOf proceedings.
- 564 keyword "LR Infrastructures and Architectures".
- 564 keyword "LR web services".
- 564 keyword "Standards for LRs".
- 564 title "A Large-Scale Web Data Collection as a Natural Language Processing Infrastructure".
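
The triple pattern at the top of this listing, `{ <subject> ?p ?o. }`, can also be written as a SPARQL query. The following is a minimal sketch, not the method ScholarlyData itself uses to produce this page. It only builds the query string and a GET request URL following the standard SPARQL protocol `query` parameter. The endpoint `https://example.org/sparql` is a placeholder assumption, and the helper names `build_query` and `build_request_url` are hypothetical.

```python
from urllib.parse import urlencode

# Subject IRI taken from the triple pattern in this listing.
SUBJECT = "https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/564"

# Placeholder endpoint (an assumption); substitute a real SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"


def build_query(subject: str) -> str:
    """Return a SPARQL SELECT asking for every predicate/object of `subject`."""
    return f"SELECT ?p ?o WHERE {{ <{subject}> ?p ?o . }}"


def build_request_url(endpoint: str, subject: str) -> str:
    """Encode the query as a GET request URL using the `query` parameter."""
    return endpoint + "?" + urlencode({"query": build_query(subject)})


query = build_query(SUBJECT)
url = build_request_url(ENDPOINT, SUBJECT)
print(query)
print(url)
```

No network request is made here. Sending `url` to an actual endpoint would be expected to return the same predicate/object pairs listed above, assuming the endpoint serves the same dataset.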