Matches in UGent Biblio for { <https://biblio.ugent.be/publication/942452#aggregation> ?p ?o. }
Showing items 1 to 33 of 33 with 100 items per page.
- aggregation classification "C1".
- aggregation creator B66766.
- aggregation creator B66767.
- aggregation creator person.
- aggregation creator person.
- aggregation date "2010".
- aggregation format "application/pdf".
- aggregation hasFormat 942452.bibtex.
- aggregation hasFormat 942452.csv.
- aggregation hasFormat 942452.dc.
- aggregation hasFormat 942452.didl.
- aggregation hasFormat 942452.doc.
- aggregation hasFormat 942452.json.
- aggregation hasFormat 942452.mets.
- aggregation hasFormat 942452.mods.
- aggregation hasFormat 942452.rdf.
- aggregation hasFormat 942452.ris.
- aggregation hasFormat 942452.txt.
- aggregation hasFormat 942452.xls.
- aggregation hasFormat 942452.yaml.
- aggregation isPartOf urn:isbn:9780898717037.
- aggregation language "eng".
- aggregation publisher "SIAM".
- aggregation rights "I have transferred the copyright for this publication to the publisher".
- aggregation subject "Technology and Engineering".
- aggregation title "Estimation of topic cardinality in document collections".
- aggregation abstract "The exponential growth of the size and popularity of the world wide web has increased the interest in text analysis. One of the applications of text analysis consists in grouping (i.e. clustering) texts according to the main topic they deal with. This paper presents the first part of a fundamental new approach towards this problem. A competitive setting for production of documents by n data providers is introduced. It is reasoned that in this setting, production of documents about topics is not random, but obeys Zipf’s law. Under this assumption, the number of topics can be estimated with fairly high accuracy. Three main advantages of this technique are noticed. Firstly, the estimated number is not a fixed constant in terms of the size of the text collection. Secondly, the estimation does not make assumptions on the clustering method that is used. Thirdly, this method provides a dynamical instrument to verify the recall of clustering.".
- aggregation authorList BK168434.
- aggregation endPage "39".
- aggregation startPage "31".
- aggregation aggregates 942463.
- aggregation isDescribedBy 942452.
- aggregation similarTo LU-942452.
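
Each item above is one match for the pattern `?p ?o` on the aggregation subject, printed as `- subject predicate object.` with literal values in double quotes. As a minimal sketch, assuming that line shape holds for every item, the listing could be read back into triples as follows. The names `Triple` and `parse_listing_line` are hypothetical helpers introduced for illustration and are not part of UGent Biblio or of any library.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One row of the listing (hypothetical structure, for illustration only)."""
    subject: str
    predicate: str
    obj: str
    is_literal: bool


# Assumed line shape: "- <subject> <predicate> <object>."
# The object may itself contain dots (e.g. 942452.bibtex or a quoted abstract),
# so the pattern only treats the final dot on the line as the terminator.
_LINE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\.\s*$', re.DOTALL)


def parse_listing_line(line: str) -> Optional[Triple]:
    """Parse one listing line; return None for lines that are not items."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    subject, predicate, raw = match.groups()
    # A value wrapped in double quotes is treated as a literal; anything else
    # is treated as a shortened resource name.
    is_literal = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
    obj = raw[1:-1] if is_literal else raw
    return Triple(subject, predicate, obj, is_literal)
```

Header lines such as the match summary and the paging line do not start with `- `, so the sketch returns `None` for them instead of raising an error.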