Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/402> ?p ?o. }
Showing items 1 to 14 of 14 with 100 items per page.
- 402 creator silvia-moraes.
- 402 creator susana-azeredo.
- 402 creator vera-lima.
- 402 type InProceedings.
- 402 label "Keywords, k-NN and Neural Networks: a Support for Hierarchical Categorization of Texts in Brazilian Portuguese".
- 402 sameAs 402.
- 402 abstract "A frequent problem in automatic categorization applications involving Portuguese language is the absence of large corpora of previously classified documents, which permit the validation of experiments carried out. Generally, the available corpora are not classified or, when they are, they contain a very reduced number of documents. The general goal of this study is to contribute to the development of applications which aim at text categorization for Brazilian Portuguese. Specifically, we point out that keywords selection associated with neural networks can improve results in the categorization of Brazilian Portuguese texts. The corpus is composed of 30 thousand texts from the Folha de São Paulo newspaper, organized in 29 sections. In the process of categorization, the k-Nearest Neighbor (k-NN) algorithm and the Multilayer Perceptron neural networks trained with the backpropagation algorithm are used. It is also part of our study to test the identification of keywords parting from the log-likelihood statistical measure and to use them as features in the categorization process. The results clearly show that the precision is better when using neural networks than when using the k-NN.".
- 402 hasAuthorList authorList.
- 402 hasTopic Linguistics.
- 402 isPartOf proceedings.
- 402 keyword "Document Classification, Text categorisation".
- 402 keyword "Information Extraction, Information Retrieval".
- 402 keyword "Tools, systems, applications".
- 402 title "Keywords, k-NN and Neural Networks: a Support for Hierarchical Categorization of Texts in Brazilian Portuguese".
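
The pattern at the top of this page fixes the subject and leaves `?p` and `?o` as variables, so every triple about paper 402 matches. As a rough illustration only, the sketch below mimics that matching over a handful of the triples listed above, held in memory with the same shortened names the page displays. The `match` helper and the `TRIPLES` list are hypothetical and are not part of ScholarlyData or any of its client libraries.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# A subset of the triples shown above, using the page's shortened labels
# rather than full IRIs (an assumption made to keep the sketch compact).
TRIPLES: List[Triple] = [
    ("402", "creator", "silvia-moraes"),
    ("402", "creator", "susana-azeredo"),
    ("402", "creator", "vera-lima"),
    ("402", "type", "InProceedings"),
    ("402", "hasTopic", "Linguistics"),
    ("402", "isPartOf", "proceedings"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None plays the role of a variable."""
    pattern = (s, p, o)
    return [
        t
        for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Equivalent in spirit to { <.../papers/402> ?p ?o. }: subject fixed, rest free.
about_402 = match(TRIPLES, s="402")

# Narrowing the predicate as well yields only the creator triples.
creators = [obj for _, _, obj in match(TRIPLES, s="402", p="creator")]
```

Leaving all three positions as `None` returns every triple, which is why fixing only the subject produces the full listing for one resource.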