Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/lrec2008/papers/335> ?p ?o. }
Showing items 1 to 12 of 12 with 100 items per page.
- 335 creator tomanek-katrin.
- 335 creator udo-hahn.
- 335 type InProceedings.
- 335 label "Approximating Learning Curves for Active-Learning-Driven Annotation".
- 335 sameAs 335.
- 335 abstract "Active learning (AL) is getting more and more popular as a methodology to considerably reduce the annotation effort when building training material for statistical learning methods for various NLP tasks. A crucial issue rarely addressed, however, is when to actually stop the annotation process to profit from the savings in efforts. This question is tightly related to estimating the classifier performance after a certain amount of data has already been annotated. While learning curves are the default means to monitor the progress of the annotation process in terms of classifier performance, this requires a labeled gold standard which - in realistic annotation settings, at least - is often unavailable. We here propose a method for committee-based AL to approximate the progression of the learning curve based on the disagreement among the committee members. This method relies on a separate, unlabeled corpus and is thus well suited for situations where a labeled gold standard is not available or would be too expensive to obtain. Considering named entity recognition as a test case we provide empirical evidence that this approach works well under simulation as well as under real-world annotation conditions.".
- 335 hasAuthorList authorList.
- 335 hasTopic Linguistics.
- 335 isPartOf proceedings.
- 335 keyword "Acquisition, Machine Learning".
- 335 keyword "Corpus (creation, annotation, etc.)".
- 335 title "Approximating Learning Curves for Active-Learning-Driven Annotation".
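
The pattern at the top of this listing fixes the subject and leaves ?p and ?o as variables, so every triple about the paper matches. A minimal sketch of that matching in Python follows, assuming the abbreviated names from the listing stand in for the full IRIs; `match` and `TRIPLES` are hypothetical names for illustration and are not part of ScholarlyData or any library.

```python
from typing import Optional

Triple = tuple[str, str, str]

# A few triples copied from the listing above. The short names are
# abbreviations standing in for full IRIs (assumption for illustration).
TRIPLES: list[Triple] = [
    ("335", "creator", "tomanek-katrin"),
    ("335", "creator", "udo-hahn"),
    ("335", "type", "InProceedings"),
    ("335", "hasTopic", "Linguistics"),
    ("335", "isPartOf", "proceedings"),
]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a variable."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Counterpart of { <335> ?p ?o. }: subject fixed, predicate and object open.
all_for_paper = match(TRIPLES, s="335")

# Narrowing the pattern to one predicate returns only the creators.
creators = [obj for _, _, obj in match(TRIPLES, s="335", p="creator")]
```

Leaving a position as None mirrors a variable in the pattern, so the same helper covers subject-only, predicate-only, and object-only lookups.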