Learning Probabilistic Automata: A Study In State
Borja Balle, Jorge Castro and Ricard Gavaldà
Theoretical Computer Science
Known algorithms for learning PDFA can only be shown to run in time
polynomial in the so-called distinguishability μ of the target machine, in
addition to the number of states and the usual accuracy and confidence parameters.
We show that the dependence on μ is necessary in the worst case for every
algorithm whose structure resembles existing ones. As a technical tool, a
new variant of Statistical Queries termed L∞-queries is defined. We show
how to simulate L∞-queries using classical Statistical Queries and show that
known PAC algorithms for learning PDFA are in fact statistical query
algorithms. Our results include a lower bound: every algorithm to learn PDFA
with queries using a reasonable tolerance must make Ω(1/μ^{1−c}) queries for every c > 0.
Finally, an adaptive algorithm that PAC-learns w.r.t. another
measure of complexity is described. This yields better efficiency in many
cases, while retaining the same inevitable worst-case behavior. Our
algorithm requires fewer input parameters than previously existing ones and has a better sample bound.
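
For orientation, the distinguishability μ in the PDFA learning literature is usually taken to be the smallest L∞ distance between the distributions over strings generated from two distinct states of the target machine. The sketch below is only an illustration under stated assumptions, not the algorithm or the query model of this paper: it assumes that samples of strings have already been grouped by state, and the names `empirical_linf_distance` and `empirical_distinguishability` are hypothetical. It computes a plug-in estimate from empirical frequencies.

```python
from collections import Counter


def empirical_linf_distance(sample_a, sample_b):
    """Return max over observed strings x of |P_a(x) - P_b(x)|,
    where P_a and P_b are the empirical frequencies in each sample."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    counts_a = Counter(sample_a)
    counts_b = Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Only strings seen in at least one sample can have a non-zero difference.
    support = set(counts_a) | set(counts_b)
    # Counter returns 0 for strings that are absent from a sample.
    return max(abs(counts_a[x] / n_a - counts_b[x] / n_b) for x in support)


def empirical_distinguishability(samples_by_state):
    """Return the minimum pairwise empirical L-infinity distance over
    distinct states. `samples_by_state` maps a (hypothetical) state label
    to the list of strings assumed to have been generated from that state."""
    states = list(samples_by_state)
    if len(states) < 2:
        raise ValueError("need at least two states")
    return min(
        empirical_linf_distance(samples_by_state[p], samples_by_state[q])
        for i, p in enumerate(states)
        for q in states[i + 1:]
    )
```

A small value of this estimate means that some pair of states generates nearly identical distributions, which is the regime in which the dependence on μ described above becomes costly.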
Keywords: Distribution Learning, PAC Learning, Probabilistic
Automata, Statistical Queries