Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy. / Wickstrøm, Kristoffer K.; Løkse, Sigurd; Kampffmeyer, Michael C.; Yu, Shujian; Príncipe, José C.; Jenssen, Robert.

In: Entropy, Vol. 25, No. 6, 899, 2023.


Harvard

Wickstrøm, KK, Løkse, S, Kampffmeyer, MC, Yu, S, Príncipe, JC & Jenssen, R 2023, 'Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy', Entropy, vol. 25, no. 6, 899. https://doi.org/10.3390/e25060899

APA

Wickstrøm, K. K., Løkse, S., Kampffmeyer, M. C., Yu, S., Príncipe, J. C., & Jenssen, R. (2023). Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy. Entropy, 25(6), Article 899. https://doi.org/10.3390/e25060899

Vancouver

Wickstrøm KK, Løkse S, Kampffmeyer MC, Yu S, Príncipe JC, Jenssen R. Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy. Entropy. 2023;25(6):899. https://doi.org/10.3390/e25060899

Author

Wickstrøm, Kristoffer K. ; Løkse, Sigurd ; Kampffmeyer, Michael C. ; Yu, Shujian ; Príncipe, José C. ; Jenssen, Robert. / Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy. In: Entropy. 2023 ; Vol. 25, No. 6.

Bibtex

@article{b1cf5acef5db42db8698d7a2ce27d58e,
title = "Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy",
abstract = "Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs{\textquoteright} generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based R{\'e}nyi{\textquoteright}s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.",
keywords = "deep learning, information plane, information theory, kernel methods",
author = "Wickstr{\o}m, {Kristoffer K.} and Sigurd L{\o}kse and Kampffmeyer, {Michael C.} and Shujian Yu and Pr{\'i}ncipe, {Jos{\'e} C.} and Robert Jenssen",
note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",
year = "2023",
doi = "10.3390/e25060899",
language = "English",
volume = "25",
journal = "Entropy",
issn = "1099-4300",
publisher = "MDPI AG",
number = "6",
}

RIS

TY - JOUR

T1 - Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy

AU - Wickstrøm, Kristoffer K.

AU - Løkse, Sigurd

AU - Kampffmeyer, Michael C.

AU - Yu, Shujian

AU - Príncipe, José C.

AU - Jenssen, Robert

N1 - Publisher Copyright: © 2023 by the authors.

PY - 2023

Y1 - 2023

N2 - Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.

AB - Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.

KW - deep learning

KW - information plane

KW - information theory

KW - kernel methods

UR - http://www.scopus.com/inward/record.url?scp=85163873094&partnerID=8YFLogxK

U2 - 10.3390/e25060899

DO - 10.3390/e25060899

M3 - Journal article

C2 - 37372243

AN - SCOPUS:85163873094

VL - 25

JO - Entropy

JF - Entropy

SN - 1099-4300

IS - 6

M1 - 899

ER -
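
Note

For readers of this record, the sketch below illustrates the matrix-based Rényi entropy estimator (Giraldo, Rao, and Príncipe) that the abstract builds on: entropy is computed from the eigenvalues of a trace-normalized Gram matrix, and mutual information from Hadamard products of Gram matrices, so no explicit density estimate is needed and the estimate is independent of the data dimensionality. This is a minimal illustration only; the paper's actual contribution, the tensor-kernel construction for convolutional feature maps, is not reproduced here, and the RBF kernel, bandwidth, and function names are illustrative assumptions, not the authors' code.

import numpy as np

def rbf_gram(X, sigma=1.0):
    # Gaussian (RBF) Gram matrix over the rows of X (one sample per row).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def matrix_renyi_entropy(K, alpha=1.01):
    # S_alpha(A) = 1/(1 - alpha) * log2(sum_i lambda_i(A)^alpha),
    # where A = K / trace(K), so trace(A) = 1 and the eigenvalues
    # behave like a probability distribution.
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)   # K is symmetric PSD
    lam = lam[lam > 1e-12]        # drop numerical noise
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def matrix_mutual_information(Kx, Ky, alpha=1.01):
    # Joint entropy uses the Hadamard (elementwise) product of the Gram
    # matrices; trace normalization happens inside matrix_renyi_entropy,
    # so constant scale factors cancel.
    return (matrix_renyi_entropy(Kx, alpha)
            + matrix_renyi_entropy(Ky, alpha)
            - matrix_renyi_entropy(Kx * Ky, alpha))

# Toy usage: MI between a batch of inputs and a random "layer output",
# as one would track per layer when tracing the information plane.
X = np.random.randn(64, 10)
T = np.tanh(X @ np.random.randn(10, 5))
print(matrix_mutual_information(rbf_gram(X), rbf_gram(T)))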
