DBpedia 2014

Matches in DBpedia 2014 for { <http://dbpedia.org/resource/Brown_clustering> ?p ?o. }

Brown_clustering abstract "In natural language processing, Brown clustering or IBM clustering is a form of hierarchical clustering of words based on the contexts in which they occur, proposed by Peter F. Brown of IBM in the context of language modeling. The intuition behind the method is that a class-based language model (also called cluster n-gram model), i.e. one where probabilities of words are based on the classes (clusters) of previous words, can overcome the data sparsity problem inherent in language modeling. Jurafsky and Martin give the example of a flight reservation system that needs to estimate the likelihood of the bigram "to Shanghai", without having seen this in a training set. The system can obtain a good estimate if it can cluster "Shanghai" with other city names, then make its estimate based on the likelihood of phrases such as "to London", "to Beijing" and "to Denver". Brown clustering is an agglomerative, bottom-up form of clustering that groups words (i.e., types) into a binary tree of classes, using a merging criterion based on the log-probability of a text under a class-based language model, i.e. a probability model that takes the clustering into account. This model has the same general form as a hidden Markov model. That is, given cluster membership indicators cᵢ for the tokens wᵢ in a text, the probability of wᵢ given wᵢ₋₁ is Pr(wᵢ | wᵢ₋₁) = Pr(cᵢ | cᵢ₋₁) · Pr(wᵢ | cᵢ). The cluster memberships of words resulting from Brown clustering can be used as features in a variety of machine-learned natural language processing tasks.".
Brown_clustering wikiPageID "42016100".
Brown_clustering wikiPageRevisionID "603576175".
Brown_clustering subject Category:Cluster_analysis.
Brown_clustering subject Category:Hidden_Markov_models.
Brown_clustering subject Category:Language_modeling.
Brown_clustering comment "In natural language processing, Brown clustering or IBM clustering is a form of hierarchical clustering of words based on the contexts in which they occur, proposed by Peter F. Brown of IBM in the context of language modeling. The intuition behind the method is that a class-based language model (also called cluster n-gram model), i.e. one where probabilities of words are based on the classes (clusters) of previous words, can overcome the data sparsity problem inherent in language modeling.".
Brown_clustering label "Brown clustering".
Brown_clustering sameAs m.0_v7l_x.
Brown_clustering sameAs Q17003931.
Brown_clustering wasDerivedFrom Brown_clustering?oldid=603576175.
Brown_clustering isPrimaryTopicOf Brown_clustering.
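
The class-based bigram model described in the abstract can be illustrated with a minimal sketch. It assumes a fixed, hand-written word-to-cluster mapping (the hypothetical `cluster_of` dictionary) and plain maximum-likelihood counts; it does not perform the agglomerative merging that Brown clustering uses to find the clusters. The function name `class_bigram_prob` and the toy token list are also illustrative assumptions, not part of the DBpedia record.

```python
from collections import Counter


def class_bigram_prob(prev_word, word, tokens, cluster_of):
    """Estimate Pr(word | prev_word) as Pr(c | c_prev) * Pr(word | c).

    `cluster_of` maps every word type to a cluster label (assumed given).
    """
    classes = [cluster_of[t] for t in tokens]
    word_counts = Counter(tokens)
    class_counts = Counter(classes)
    # Counts of adjacent cluster pairs, and of clusters that have a successor.
    class_bigrams = Counter(zip(classes, classes[1:]))
    prev_counts = Counter(classes[:-1])

    c_prev, c = cluster_of[prev_word], cluster_of[word]
    if prev_counts[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    transition = class_bigrams[(c_prev, c)] / prev_counts[c_prev]  # Pr(c | c_prev)
    emission = word_counts[word] / class_counts[c]                 # Pr(word | c)
    return transition * emission


# Toy corpus: the bigram "to Shanghai" never occurs, but "Shanghai" shares
# a cluster with the other city names, so the estimate is non-zero.
tokens = ["to", "London", "to", "Beijing", "to", "Denver", "Shanghai"]
cluster_of = {"to": "PREP", "London": "CITY", "Beijing": "CITY",
              "Denver": "CITY", "Shanghai": "CITY"}
p = class_bigram_prob("to", "Shanghai", tokens, cluster_of)
```

A word-level bigram estimate would assign "to Shanghai" zero probability on this corpus; the class-based estimate borrows evidence from "to London", "to Beijing" and "to Denver" through the shared cluster.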