Input Selection for Bandwidth-Limited Neural Network Inference
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Input Selection for Bandwidth-Limited Neural Network Inference. / Oehmcke, Stefan; Gieseke, Fabian.
Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022. SIAM, 2022. p. 280-288.
Bibtex
@inproceedings{Oehmcke2022InputSelection,
    title     = {Input Selection for Bandwidth-Limited Neural Network Inference},
    author    = {Oehmcke, Stefan and Gieseke, Fabian},
    booktitle = {Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022},
    publisher = {SIAM},
    year      = {2022},
    pages     = {280--288},
    doi       = {10.1137/1.9781611977172.32},
}
RIS
TY - GEN
T1 - Input Selection for Bandwidth-Limited Neural Network Inference
AU - Oehmcke, Stefan
AU - Gieseke, Fabian
N1 - Publisher Copyright: Copyright © 2022 by SIAM.
PY - 2022
Y1 - 2022
N2 - Data are often stored on centralized servers. This is the case, for instance, in remote sensing and astronomy, where projects produce several petabytes of data every year. While machine learning models are often trained on relatively small subsets of the data, the inference phase typically requires transferring significant amounts of data between servers and clients. In many cases, the bandwidth available per user is limited, which makes the data transfer one of the major bottlenecks. In this work, we propose a framework that automatically selects the relevant parts of the input data for a given neural network. The model and the associated selection masks are trained simultaneously, such that good model performance is achieved while only a minimal amount of data is selected. During the inference phase, only those parts of the data have to be transferred between the server and the client. We propose both instance-independent and instance-dependent selection masks: the former are identical for all instances, whereas the latter allow for variable transfer sizes per instance. Our experiments show that the amount of data to be transferred can often be reduced significantly without much loss in model quality.
UR - http://www.scopus.com/inward/record.url?scp=85131309611&partnerID=8YFLogxK
U2 - 10.1137/1.9781611977172.32
DO - 10.1137/1.9781611977172.32
M3 - Article in proceedings
AN - SCOPUS:85131309611
SP - 280
EP - 288
BT - Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022
PB - SIAM
T2 - 2022 SIAM International Conference on Data Mining, SDM 2022
Y2 - 28 April 2022 through 30 April 2022
ER -
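
Note on the instance-independent masks: the abstract describes jointly training a model together with a selection mask so that only a minimal fraction of the input needs to be transferred at inference time. A minimal sketch of how such a shared mask could be trained in PyTorch follows; the wrapper class, parameter names, and the sigmoid relaxation with a mean-sparsity penalty are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class MaskedModel(nn.Module):
    # Hypothetical wrapper: one trainable mask shared by all instances
    # (the "instance-independent" case from the abstract).
    def __init__(self, model: nn.Module, input_shape):
        super().__init__()
        self.model = model
        # One logit per input position; sigmoid yields a soft keep-probability.
        self.mask_logits = nn.Parameter(torch.zeros(*input_shape))

    def forward(self, x):
        mask = torch.sigmoid(self.mask_logits)  # soft mask in (0, 1)
        # Masked input mimics receiving only the selected parts of the data.
        return self.model(x * mask), mask

def training_step(wrapper, x, y, criterion, lam=1e-3):
    # Joint objective: task loss plus a sparsity penalty that shrinks the mask.
    out, mask = wrapper(x)
    return criterion(out, y) + lam * mask.mean()  # sigmoid output is >= 0

After training, the soft mask can be binarized (e.g., sigmoid(mask_logits) > 0.5), and at inference time only the selected input positions need to be transferred between server and client, which matches the bandwidth-reduction goal described in the abstract.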
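
Note on the instance-dependent masks: the abstract only states that these allow variable transfer sizes per instance. One common way to realize this, again an assumption rather than necessarily the paper's construction, is a small gating network that emits a hard per-instance mask through a straight-through estimator:

import torch
import torch.nn as nn

class GatingNetwork(nn.Module):
    # Hypothetical per-instance selector: one keep/drop decision per feature.
    def __init__(self, num_features, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_features),
        )

    def forward(self, x):
        probs = torch.sigmoid(self.net(x))
        hard = (probs > 0.5).float()
        # Straight-through estimator: hard 0/1 values in the forward pass,
        # sigmoid gradients in the backward pass.
        return hard + probs - probs.detach()

def joint_loss(model, gate, x, y, criterion, lam=1e-3):
    mask = gate(x)           # shape (batch, num_features); varies per instance
    out = model(x * mask)
    # Penalizing the mean mask value penalizes the expected transfer size.
    return criterion(out, y) + lam * mask.mean()

Because the mask now depends on the input, the number of selected features, and hence the amount of data transferred, differs from instance to instance. In a client-server setting, such a gate would have to run where the full data resides, so that only the selected parts cross the bandwidth-limited link.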