Data-Driven Model Building for Life Course Epidemiology

Research output: Contribution to journal, Journal article, Research, peer-review

Life course epidemiology is useful for describing and analyzing complex etiological mechanisms for disease development, but existing statistical methods are essentially confirmatory, as they rely on a priori model specification. This limits the scope of causal inquiries that can be made: these methods are mostly suited to examining well-known hypotheses that do not question our established view of health, which may lead to confirmation bias. We propose an exploratory alternative. Instead of specifying a life course model prior to data analysis, our method infers the life course model directly from the data. The method extends the well-known PC algorithm for causal discovery (named after its authors, Peter and Clark) so that temporal information can be included when inferring a model from observational data. The extended algorithm is called temporal PC. The resulting life course model can afterwards be examined for interesting causal hypotheses. Our method complements classical confirmatory methods and guides researchers in expanding their models in new directions. We showcase the method on a dataset encompassing almost 3000 Danish men followed from birth until age 65. Using this dataset, we infer life course models for the role of socio-economic and health-related factors in the development of depression.
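To illustrate how temporal information can constrain a discovered model, the following is a minimal Python sketch of one ingredient only: orienting edges of an already-estimated undirected skeleton from earlier to later life periods. It is not the authors' implementation. The function name `orient_by_period`, the variable names, the period labels, and the example skeleton are all hypothetical, and the skeleton search by conditional independence tests is assumed to have been done elsewhere.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def orient_by_period(
    skeleton: Iterable[FrozenSet[str]],
    period: Dict[str, int],
) -> Tuple[List[Tuple[str, str]], List[FrozenSet[str]]]:
    """Orient skeleton edges using temporal order (illustrative assumption).

    An edge between variables measured in different periods is directed
    from the earlier variable to the later one, since a later variable
    cannot cause an earlier one. Edges within a period stay undirected.
    """
    directed: List[Tuple[str, str]] = []
    undirected: List[FrozenSet[str]] = []
    for edge in skeleton:
        a, b = sorted(edge)
        if period[a] < period[b]:
            directed.append((a, b))
        elif period[b] < period[a]:
            directed.append((b, a))
        else:
            undirected.append(frozenset((a, b)))
    return sorted(directed), undirected


# Hypothetical variables and periods (0 = birth, 1 = childhood, 2 = adulthood).
period = {
    "birth_weight": 0,
    "childhood_ses": 1,
    "adult_income": 2,
    "depression": 2,
}

# Hypothetical undirected skeleton, as if returned by a skeleton search.
skeleton = [
    frozenset({"birth_weight", "childhood_ses"}),
    frozenset({"childhood_ses", "adult_income"}),
    frozenset({"adult_income", "depression"}),
]

directed, undirected = orient_by_period(skeleton, period)
print("directed:", directed)
print("undirected:", undirected)
```

In this sketch the two cross-period edges are directed forward in time, while the edge between the two adulthood variables is left undirected, because temporal order alone says nothing about its direction.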

Original language: English
Journal: American Journal of Epidemiology
Issue number: 9
Pages (from-to): 1898–1907
Number of pages: 10
Publication status: Published - 2021

ID: 259558116