Multi-Faceted Analysis and Prediction for the Outbreak of Pediatric Respiratory Syncytial Virus

Abstract

Objective: Respiratory syncytial virus (RSV) is a significant cause of pediatric hospitalizations. This paper uses multi-source data and tensor methods to uncover distinct geographic clusters of RSV and to develop an accurate RSV prediction model for future seasons.

Materials and Methods: This study uses five years of RSV data from multiple sources, including medical claims, CDC surveillance data, and Google search trends. We conduct spatio-temporal tensor analysis and prediction (TAP) for pediatric RSV in the US by designing (i) a non-negative tensor factorization (NTF) model for clustering pediatric RSV diseases and locations, and (ii) a recurrent neural network tensor regression model for county-level trend prediction using the disease and location features.

Results: We identify a clustering hierarchy of pediatric diseases. Three common geographic clusters of RSV outbreaks were identified from independent sources, showing an annual RSV trend that shifts across US regions, from the South and Southeast to the Central and Northeast and then to the West and Northwest. Precipitation and temperature were found to be correlated factors, with the coefficient of determination , respectively. Our regression model accurately predicted the 2022-2023 RSV season at the county level, achieving a mean absolute error (MAE) below 0.4 and a Pearson correlation greater than 0.75, significantly outperforming the baselines with p-values < 0.05.

Conclusions: Our proposed framework provides a thorough analysis of RSV disease in the US, enabling healthcare providers to better prepare for potential outbreaks, anticipate increased demand for services and supplies, and save more lives through timely interventions.
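
The two accuracy metrics reported in the Results, mean absolute error and Pearson correlation, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and the `observed` and `predicted` series are hypothetical, and the numbers are made up for illustration rather than taken from the study.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length series."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical weekly RSV activity for one county (illustrative values only,
# not data from the study).
observed = [0.2, 0.5, 1.1, 1.8, 1.4, 0.7]
predicted = [0.3, 0.6, 0.9, 1.6, 1.5, 0.9]

mae = mean_absolute_error(observed, predicted)
r = pearson_correlation(observed, predicted)
print(f"MAE = {mae:.3f}, Pearson r = {r:.3f}")
```

A low MAE indicates that predicted values are close to the observed ones in magnitude, while a high Pearson correlation indicates that the predicted trend rises and falls with the observed trend; reporting both guards against a model that matches the shape of a season but not its scale, or the reverse.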

Publication
Journal of the American Medical Informatics Association