A bound for the convergence rate of parallel tempering for sampling restricted Boltzmann machines
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
A bound for the convergence rate of parallel tempering for sampling restricted Boltzmann machines. / Fischer, Asja; Igel, Christian.
In: Theoretical Computer Science, Vol. 598, 2015, p. 102-117.Research output: Contribution to journal › Journal article › Research › peer-review
Harvard
APA
Vancouver
Author
Bibtex
}
RIS
TY - JOUR
T1 - A bound for the convergence rate of parallel tempering for sampling restricted Boltzmann machines
AU - Fischer, Asja
AU - Igel, Christian
PY - 2015
Y1 - 2015
N2 - Abstract Sampling from restricted Boltzmann machines (RBMs) is done by Markov chain Monte Carlo (MCMC) methods. The faster the convergence of the Markov chain, the more efficiently can high quality samples be obtained. This is also important for robust training of RBMs, which usually relies on sampling. Parallel tempering (PT), an MCMC method that maintains several replicas of the original chain at higher temperatures, has been successfully applied for RBM training. We present the first analysis of the convergence rate of PT for sampling from binary RBMs. The resulting bound on the rate of convergence of the PT Markov chain shows an exponential dependency on the size of one layer and the absolute values of the RBM parameters. It is minimized by a uniform spacing of the inverse temperatures, which is often used in practice. Similarly as in the derivation of bounds on the approximation error for contrastive divergence learning, our bound on the mixing time implies an upper bound on the error of the gradient approximation when the method is used for RBM training.
AB - Abstract Sampling from restricted Boltzmann machines (RBMs) is done by Markov chain Monte Carlo (MCMC) methods. The faster the convergence of the Markov chain, the more efficiently can high quality samples be obtained. This is also important for robust training of RBMs, which usually relies on sampling. Parallel tempering (PT), an MCMC method that maintains several replicas of the original chain at higher temperatures, has been successfully applied for RBM training. We present the first analysis of the convergence rate of PT for sampling from binary RBMs. The resulting bound on the rate of convergence of the PT Markov chain shows an exponential dependency on the size of one layer and the absolute values of the RBM parameters. It is minimized by a uniform spacing of the inverse temperatures, which is often used in practice. Similarly as in the derivation of bounds on the approximation error for contrastive divergence learning, our bound on the mixing time implies an upper bound on the error of the gradient approximation when the method is used for RBM training.
KW - Restricted Boltzmann machines
KW - Parallel tempering
KW - Mixing time
U2 - 10.1016/j.tcs.2015.05.019
DO - 10.1016/j.tcs.2015.05.019
M3 - Journal article
VL - 598
SP - 102
EP - 117
JO - Theoretical Computer Science
JF - Theoretical Computer Science
SN - 0304-3975
ER -
ID: 144247294