Controlling the reinforcement in Bayesian non-parametric mixture models
The paper deals with the problem of determining the number of components in a mixture model. We take a Bayesian non-parametric approach and adopt a hierarchical model with a suitable non-parametric prior for the latent structure. A commonly used model for such a problem is the mixture of Dirichlet process model. Here, we replace the Dirichlet process with a more general non-parametric prior obtained from a generalized gamma process. The basic feature of this model is that it yields a partition structure for the latent variables which is of Gibbs type. This relates to the well-known (exchangeable) product partition models. Compared with the usual mixture of Dirichlet process model, the advantage of the generalization that we examine lies in the availability of an additional parameter σ belonging to the interval (0,1): it is shown that this parameter greatly influences the clustering behaviour of the model. A value of σ that is close to 1 generates a large number of clusters, most of which are of small size. A reinforcement mechanism driven by σ then acts on the mass allocation by penalizing clusters of small size and favouring those few groups that contain a large number of elements. These features turn out to be very useful in the context of mixture modelling. Since it is difficult to specify the reinforcement rate a priori, it is reasonable to specify a prior for σ. Hence, the strength of the reinforcement mechanism is controlled by the data.
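The clustering effect of the additional parameter can be illustrated with a small simulation. The sketch below is not the generalized gamma process prior studied in the paper: as a stand-in it uses the Pitman-Yor (two-parameter Poisson-Dirichlet) process, another Gibbs-type prior whose predictive rule is available in closed form. Under that assumption, an existing cluster of size n_j receives weight n_j - sigma and a new cluster receives weight theta + k * sigma, where k is the current number of clusters. The function name `sample_partition` and the parameters `sigma`, `theta` and `seed` are hypothetical names chosen for this illustration.

```python
import random


def sample_partition(n, sigma, theta=1.0, seed=0):
    """Allocate n items sequentially with Pitman-Yor predictive weights.

    Assumption: Pitman-Yor stand-in for a Gibbs-type prior, not the
    generalized gamma process of the paper. Returns the cluster sizes.
    """
    if not 0.0 <= sigma < 1.0:
        raise ValueError("sigma must lie in [0, 1)")
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        # i items are already allocated, so the weights sum to theta + i.
        k = len(sizes)
        new_weight = theta + k * sigma
        u = rng.random() * (theta + i)
        if u < new_weight:
            sizes.append(1)  # open a new cluster
            continue
        u -= new_weight
        for j, n_j in enumerate(sizes):
            # Existing cluster j has weight n_j - sigma: small clusters
            # lose proportionally more mass than large ones.
            u -= n_j - sigma
            if u < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


if __name__ == "__main__":
    for sigma in (0.1, 0.5, 0.9):
        sizes = sample_partition(500, sigma)
        singletons = sum(1 for s in sizes if s == 1)
        print(f"sigma={sigma}: {len(sizes)} clusters, "
              f"{singletons} singletons, largest={max(sizes)}")
```

Running the script for several values of sigma shows the behaviour described above: values close to 1 produce many clusters, most of them very small, while the discount of sigma applied to each existing cluster favours the groups that are already large. Setting sigma to 0 in this stand-in recovers the Dirichlet process allocation rule.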
Document Type: Research Article
Affiliations: 1: Università degli Studi di Pavia, Italy 2: Universidad Nacional Autónoma de México, Mexico City, Mexico 3: Università degli Studi di Torino, Collegio Carlo Alberto and International Centre for Economic Research, Turin, Italy
Publication date: September 1, 2007