FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning. / Hafner, Dion; Gemmrich, Johannes; Jochum, Markus.

In: Journal of Atmospheric and Oceanic Technology, Vol. 38, No. 7, 23.07.2021, p. 1305-1322.

Harvard

Hafner, D, Gemmrich, J & Jochum, M 2021, 'FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning', Journal of Atmospheric and Oceanic Technology, vol. 38, no. 7, pp. 1305-1322. https://doi.org/10.1175/JTECH-D-20-0185.1

APA

Hafner, D., Gemmrich, J., & Jochum, M. (2021). FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning. Journal of Atmospheric and Oceanic Technology, 38(7), 1305-1322. https://doi.org/10.1175/JTECH-D-20-0185.1

Vancouver

Hafner D, Gemmrich J, Jochum M. FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning. Journal of Atmospheric and Oceanic Technology. 2021 Jul 23;38(7):1305-1322. https://doi.org/10.1175/JTECH-D-20-0185.1

Author

Hafner, Dion; Gemmrich, Johannes; Jochum, Markus. / FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning. In: Journal of Atmospheric and Oceanic Technology. 2021; Vol. 38, No. 7. pp. 1305-1322.

Bibtex

@article{606e1080b2354d3fbb62ae1dcff8b418,
title = "FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning",
abstract = "The occurrence of extreme (rogue) waves in the ocean is for the most part still shrouded in mystery, because the rare nature of these events makes them difficult to analyze with traditional methods. Modern data-mining and machine-learning methods provide a promising way out, but they typically rely on the availability of massive amounts of well-cleaned data. To facilitate the application of such data-hungry methods to surface ocean waves, we developed the Free Ocean Wave Dataset (FOWD), a freely available wave dataset and processing framework. FOWD describes the conversion of raw observations into a catalog that maps characteristic sea state parameters to observed wave quantities. Specifically, we employ a running-window approach that respects the nonstationary nature of the oceans, and extensive quality control to reduce bias in the resulting dataset. We also supply a reference Python implementation of the FOWD processing toolkit, which we use to process the entire Coastal Data Information Program (CDIP) buoy data catalog containing over 4 billion waves. In a first experiment, we find that, when the full elevation time series is available, surface elevation kurtosis and maximum wave height are the strongest univariate predictors for rogue wave activity. When just a spectrum is given, crest-trough correlation, spectral bandwidth, and mean period fill this role.",
keywords = "Wave properties, Waves, oceanic, Data mining, Data processing, Data quality control, Data science, Machine learning, ROGUE WAVES, KURTOSIS",
author = "Dion Hafner and Johannes Gemmrich and Markus Jochum",
year = "2021",
month = jul,
day = "23",
doi = "10.1175/JTECH-D-20-0185.1",
language = "English",
volume = "38",
pages = "1305--1322",
journal = "Journal of Atmospheric and Oceanic Technology",
issn = "0739-0572",
publisher = "American Meteorological Society",
number = "7",
}

RIS

TY - JOUR

T1 - FOWD: A Free Ocean Wave Dataset for Data Mining and Machine Learning

AU - Hafner, Dion

AU - Gemmrich, Johannes

AU - Jochum, Markus

PY - 2021/7/23

Y1 - 2021/7/23

N2 - The occurrence of extreme (rogue) waves in the ocean is for the most part still shrouded in mystery, because the rare nature of these events makes them difficult to analyze with traditional methods. Modern data-mining and machine-learning methods provide a promising way out, but they typically rely on the availability of massive amounts of well-cleaned data. To facilitate the application of such data-hungry methods to surface ocean waves, we developed the Free Ocean Wave Dataset (FOWD), a freely available wave dataset and processing framework. FOWD describes the conversion of raw observations into a catalog that maps characteristic sea state parameters to observed wave quantities. Specifically, we employ a running-window approach that respects the nonstationary nature of the oceans, and extensive quality control to reduce bias in the resulting dataset. We also supply a reference Python implementation of the FOWD processing toolkit, which we use to process the entire Coastal Data Information Program (CDIP) buoy data catalog containing over 4 billion waves. In a first experiment, we find that, when the full elevation time series is available, surface elevation kurtosis and maximum wave height are the strongest univariate predictors for rogue wave activity. When just a spectrum is given, crest-trough correlation, spectral bandwidth, and mean period fill this role.

AB - The occurrence of extreme (rogue) waves in the ocean is for the most part still shrouded in mystery, because the rare nature of these events makes them difficult to analyze with traditional methods. Modern data-mining and machine-learning methods provide a promising way out, but they typically rely on the availability of massive amounts of well-cleaned data. To facilitate the application of such data-hungry methods to surface ocean waves, we developed the Free Ocean Wave Dataset (FOWD), a freely available wave dataset and processing framework. FOWD describes the conversion of raw observations into a catalog that maps characteristic sea state parameters to observed wave quantities. Specifically, we employ a running-window approach that respects the nonstationary nature of the oceans, and extensive quality control to reduce bias in the resulting dataset. We also supply a reference Python implementation of the FOWD processing toolkit, which we use to process the entire Coastal Data Information Program (CDIP) buoy data catalog containing over 4 billion waves. In a first experiment, we find that, when the full elevation time series is available, surface elevation kurtosis and maximum wave height are the strongest univariate predictors for rogue wave activity. When just a spectrum is given, crest-trough correlation, spectral bandwidth, and mean period fill this role.

KW - Wave properties

KW - Waves, oceanic

KW - Data mining

KW - Data processing

KW - Data quality control

KW - Data science

KW - Machine learning

KW - ROGUE WAVES

KW - KURTOSIS

U2 - 10.1175/JTECH-D-20-0185.1

DO - 10.1175/JTECH-D-20-0185.1

M3 - Journal article

VL - 38

SP - 1305

EP - 1322

JO - Journal of Atmospheric and Oceanic Technology

JF - Journal of Atmospheric and Oceanic Technology

SN - 0739-0572

IS - 7

ER -
