Tree-Structured Stick Breaking Processes for Hierarchical Modeling

Abstract

Many data are naturally modeled by hierarchies, but often the hierarchy itself is unobserved. In this situation, it is appealing to construct a nonparametric prior on tree-structured partitions of data. Several such models have been proposed, but these typically have the property that the data live only at the bottom of the tree. This modeling assumption does not fit well with many data we would expect to model with hierarchies. For example, in topic modeling, cars might be a natural ancestor of Hondas, but we nevertheless expect to find some documents that are about cars generally and not about a specific brand. To remedy this shortcoming, we propose a distribution over tree-structured measures, based on the stick-breaking approach to the Dirichlet process. Our construction provides an intuitive and flexible model for hierarchical data, while maintaining tractability for inference.
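For readers unfamiliar with the stick-breaking approach to the Dirichlet process mentioned in the abstract, the following is a minimal sketch of the standard flat construction only, not the tree-structured process the paper proposes. A unit-length stick is broken repeatedly: each break takes a Beta(1, alpha) fraction of what remains, and the pieces become mixture weights. The function name `stick_breaking_weights`, the truncation level `num_sticks`, and the value of `alpha` are illustrative assumptions, not taken from the paper.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng=None):
    """Truncated stick-breaking weights for a (flat) Dirichlet process.

    Hypothetical helper for illustration; returns the first `num_sticks`
    weights and the length of stick left unassigned by the truncation.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)  # fraction of the remainder to break off
        weights.append(remaining * v)    # this piece becomes one mixture weight
        remaining *= 1.0 - v             # what is left for later pieces
    return weights, remaining


# Example: 20 weights with an assumed concentration parameter alpha = 2.0.
weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=20, rng=random.Random(0))
```

The weights plus the leftover stick always sum to one, so the truncation error is explicit. Smaller values of `alpha` concentrate mass on the first few pieces, while larger values spread it over many.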