Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/552> ?p ?o. }
Showing items 1 to 13 of 13 with 100 items per page.
- 552 creator ben-allison.
- 552 creator louise-guthrie.
- 552 type InProceedings.
- 552 label "Authorship Attribution of E-Mail: Comparing Classifiers over a New Corpus for Evaluation".
- 552 sameAs 552.
- 552 abstract "The release of the Enron corpus provided a unique resource for studying aspects of email use, because it is largely unfiltered, and therefore presents a relatively complete collection of emails for a reasonably large number of correspondents. This paper describes a newly created subcorpus of the Enron emails which we suggest can be used to test techniques for authorship attribution, and further shows the application of three different classification methods to this task to present baseline results. Two of the classifiers used are standard, and have been shown to perform well in the literature, and one of the classifiers is novel and based on concurrent work that proposes a Bayesian hierarchical distribution for word counts in documents. For each of the classifiers, we present results using six text representations, including use of linguistic structures derived from a parser as well as lexical information.".
- 552 hasAuthorList authorList.
- 552 hasTopic Linguistics.
- 552 isPartOf proceedings.
- 552 keyword "Document Classification, Text categorisation".
- 552 keyword "Language modelling".
- 552 keyword "Statistical methods".
- 552 title "Authorship Attribution of E-Mail: Comparing Classifiers over a New Corpus for Evaluation".
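
The listing above is the result of matching a triple pattern with a fixed subject and variable predicate and object. As a rough illustration only, the sketch below does the same kind of matching over a small in-memory copy of some of these rows. It assumes the shortened display names shown on this page (such as `552` and `creator`) in place of full URIs, which the listing does not show for predicates. The `match` helper and the `TRIPLES` list are hypothetical names introduced for this example and are not part of ScholarlyData.

```python
# A few rows copied from the listing above, using the page's shortened
# display names (assumption: full predicate URIs are not shown, so the
# short labels stand in for them here).
TRIPLES = [
    ("552", "creator", "ben-allison"),
    ("552", "creator", "louise-guthrie"),
    ("552", "type", "InProceedings"),
    ("552", "hasTopic", "Linguistics"),
    ("552", "keyword", "Document Classification, Text categorisation"),
    ("552", "keyword", "Language modelling"),
    ("552", "keyword", "Statistical methods"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None behaves like a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Counterpart of the pattern { <.../552> ?p ?o }: subject fixed, rest variable.
all_552 = match(TRIPLES, s="552")

# Narrowing the predicate as well yields only the creator rows.
creators = [obj for _, _, obj in match(TRIPLES, s="552", p="creator")]
```

Leaving all three positions as `None` returns every triple, while fixing more positions narrows the result, which is the same idea as moving from `?p ?o` to a specific predicate in the pattern.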