Empirical analysis of the divergence of Gibbs sampling based learning algorithms for restricted Boltzmann machines
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Empirical analysis of the divergence of Gibbs sampling based learning algorithms for restricted Boltzmann machines. / Fischer, Asja; Igel, Christian.
Artificial Neural Networks – ICANN 2010: 20th International Conference, Thessaloniki, Greece, September 15-18, 2010, Proceedings, Part III. ed. / K. Diamantaras; W. Duch; L. S. Iliadis. Springer, 2010. p. 208-217 (Lecture notes in computer science, Vol. 6354).
RIS
TY - GEN
T1 - Empirical analysis of the divergence of Gibbs sampling based learning algorithms for restricted Boltzmann machines
AU - Fischer, Asja
AU - Igel, Christian
PY - 2010
Y1 - 2010
N2 - Learning algorithms relying on Gibbs sampling based stochastic approximations of the log-likelihood gradient have become a common way to train Restricted Boltzmann Machines (RBMs). We study three of these methods: Contrastive Divergence (CD) and its refined variants Persistent CD (PCD) and Fast PCD (FPCD). As the approximations are biased, the maximum of the log-likelihood is not necessarily obtained. Recently, it has been shown that CD, PCD, and FPCD can even lead to a steady decrease of the log-likelihood during learning. Taking artificial data sets from the literature, we study these divergence effects in more detail. Our results indicate that the log-likelihood seems to diverge especially if the target distribution is difficult for the RBM to learn. The decrease of the likelihood cannot be detected by an increase of the reconstruction error, which has been proposed as a stopping criterion for CD learning. Weight decay with a carefully chosen weight-decay parameter can prevent divergence.
AB - Learning algorithms relying on Gibbs sampling based stochastic approximations of the log-likelihood gradient have become a common way to train Restricted Boltzmann Machines (RBMs). We study three of these methods: Contrastive Divergence (CD) and its refined variants Persistent CD (PCD) and Fast PCD (FPCD). As the approximations are biased, the maximum of the log-likelihood is not necessarily obtained. Recently, it has been shown that CD, PCD, and FPCD can even lead to a steady decrease of the log-likelihood during learning. Taking artificial data sets from the literature, we study these divergence effects in more detail. Our results indicate that the log-likelihood seems to diverge especially if the target distribution is difficult for the RBM to learn. The decrease of the likelihood cannot be detected by an increase of the reconstruction error, which has been proposed as a stopping criterion for CD learning. Weight decay with a carefully chosen weight-decay parameter can prevent divergence.
U2 - 10.1007/978-3-642-15825-4_26
DO - 10.1007/978-3-642-15825-4_26
M3 - Article in proceedings
SN - 978-3-642-15824-7
T3 - Lecture notes in computer science
SP - 208
EP - 217
BT - Artificial Neural Networks – ICANN 2010
A2 - Diamantaras, K.
A2 - Duch, W.
A2 - Iliadis, L. S.
PB - Springer
Y2 - 15 September 2010 through 18 September 2010
ER -
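
For context on the update rule the abstract refers to, below is a minimal NumPy sketch of one CD-1 step for a binary RBM, including the weight-decay term and the reconstruction error discussed in the abstract. The function name, toy data, and all hyperparameters are illustrative assumptions; this is not the authors' implementation, and the paper studies CD, PCD, and FPCD empirically rather than providing code.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.01, weight_decay=0.0):
    """One CD-1 update for a binary RBM on a batch of visible vectors v0.

    W: (n_visible, n_hidden) weights; b: visible bias; c: hidden bias.
    Hypothetical helper for illustration only.
    """
    # Positive phase: hidden probabilities given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # One Gibbs step: reconstruct visibles, then hidden probabilities again
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)

    # Biased approximation of the log-likelihood gradient
    # (positive minus negative phase statistics)
    n = v0.shape[0]
    dW = (v0.T @ ph0 - v1.T @ ph1) / n
    db = (v0 - v1).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)

    # Gradient ascent step, optionally with weight decay on W
    W = W + lr * (dW - weight_decay * W)
    b = b + lr * db
    c = c + lr * dc

    # Reconstruction error, proposed in the literature as a stopping criterion
    recon_err = np.mean((v0 - pv1) ** 2)
    return W, b, c, recon_err

# Toy usage (hypothetical): 4 visible and 3 hidden units on random binary data
v = (rng.random((16, 4)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((4, 3))
b = np.zeros(4)
c = np.zeros(3)
for _ in range(100):
    W, b, c, err = cd1_update(v, W, b, c, lr=0.05, weight_decay=1e-4)

PCD and FPCD differ only in how the negative-phase samples (v1, ph1 above) are obtained: PCD keeps persistent Markov chains across updates instead of restarting at the data, and FPCD additionally uses a set of fast-changing weights to speed up mixing of those chains.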