PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Improvements to the Sequence Memoizer
Jan Gasthaus and Yee Whye Teh
In: NIPS 2010, 6-11 Dec 2010, Vancouver, Canada.

Abstract

The sequence memoizer is a model for sequence data with state-of-the-art performance on language modeling and compression. We propose a number of improvements to the model and inference algorithm, including an enlarged range of hyperparameters, a memory-efficient representation, and inference algorithms operating on the new representation. Our derivations are based on precise definitions of the various processes that will also allow us to provide an elementary proof of the “mysterious” coagulation and fragmentation properties used in the original paper on the sequence memoizer by Wood et al. (2009). We present some experimental results supporting our improvements.

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation; Natural Language Processing; Theory & Algorithms
ID Code: 8126
Deposited By: Yee Whye Teh
Deposited On: 24 April 2011