Influenza epidemic trend surveillance and prediction based on search engine data: deep learning model study
Background: influenza outbreaks pose a significant threat to global public health. Traditional surveillance systems and simple algorithms often struggle to predict influenza outbreaks in an accurate and timely manner. Big data and modern technology have offered new modalities for disease surveillance and prediction. Influenza-like illness can serve as a valuable surveillance tool for emerging respiratory infectious diseases like influenza and COVID-19, especially when reported case data may not fully reflect the actual epidemic curve.
Objective: this study aimed to develop a predictive model for influenza outbreaks by combining Baidu search query data with traditional virological surveillance data. The goal was to improve early detection and preparedness for influenza outbreaks in both northern and southern China, providing evidence for supplementing modern intelligence epidemic surveillance methods.
Methods: we collected virological data from the National Influenza Surveillance Network and Baidu search query data from January 2011 to July 2018, totaling 3,691,865 and 1,563,361 samples, respectively. Influenza-related search terms were identified, and their correlation with influenza-positive rates was analyzed using Pearson correlation analysis. A distributed lag nonlinear model was used to assess the lag correlation of the search terms with influenza activity. Subsequently, a predictive model based on the gated recurrent unit and multiple attention mechanisms was developed to forecast the influenza-positive trend.
Results: this study revealed a high correlation between specific Baidu search terms and influenza-positive rates in both northern and southern China, except for 1 term. The search terms were categorized into 4 groups: essential facts on influenza, influenza symptoms, influenza treatment and medicine, and influenza prevention, all of which showed correlation with the influenza-positive rate. The influenza prevention and influenza symptom groups had a lag correlation of 1.4-3.2 and 5.0-8.0 days, respectively. The Baidu search terms could help predict the influenza-positive rate 14-22 days in advance in southern China but interfered with influenza surveillance in northern China.
Conclusions: complementing traditional disease surveillance systems with information from web-based data sources can aid in detecting warning signs of influenza outbreaks earlier. However, supplementation of modern surveillance with search engine information should be approached cautiously. This approach provides valuable insights for digital epidemiology and has the potential for broader application in respiratory infectious disease surveillance. Further research should explore the optimization and customization of search terms for different regions and languages to improve the accuracy of influenza prediction models.
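The correlation step described in the Methods can be illustrated with a minimal sketch. It computes a Pearson correlation between a search-volume series and a positive-rate series at several lags and reports the lag with the highest correlation. This is a plain lagged Pearson correlation, not the distributed lag nonlinear model used in the study, and the function names (`pearson`, `lagged_correlations`) and the data values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def lagged_correlations(search, positive_rate, max_lag):
    """Correlate search[t] with positive_rate[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        xs = search[: len(search) - lag]  # drop the last `lag` search values
        ys = positive_rate[lag:]          # drop the first `lag` rate values
        result[lag] = pearson(xs, ys)
    return result


# Made-up illustrative series (not study data): the positive rate simply
# repeats the search index two time steps later.
search_index = [1, 3, 2, 5, 8, 6, 4, 7, 9, 5, 3, 2]
positive_rate = [0, 0] + search_index[:-2]

corrs = lagged_correlations(search_index, positive_rate, max_lag=4)
best_lag = max(corrs, key=corrs.get)
print(best_lag, round(corrs[best_lag], 3))
```

In practice the time step would match the surveillance reporting interval, and a lag found this way only suggests how far ahead search activity might lead the positive rate; it does not capture the nonlinear exposure-lag relationship that a distributed lag nonlinear model estimates.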
early warning, epidemic intelligence, infectious disease, influenza-like illness, surveillance
e45085
Yang, Liuyang
Zhang, Ting
Han, Xuan
Yang, Jiao
Sun, Yanxia
Ma, Libing
Chen, Jialong
Li, Yanming
Lai, Shengjie
Li, Wei
Feng, Luzhao
Yang, Weizhong
17 October 2023
Yang, Liuyang, Zhang, Ting, Han, Xuan, Yang, Jiao, Sun, Yanxia, Ma, Libing, Chen, Jialong, Li, Yanming, Lai, Shengjie, Li, Wei, Feng, Luzhao and Yang, Weizhong
(2023)
Influenza epidemic trend surveillance and prediction based on search engine data: deep learning model study.
Journal of Medical Internet Research, 25 (1), [e45085].
(doi:10.2196/45085).
Text: PDF (Version of Record)
More information
Submitted date: 15 December 2022
Accepted/In Press date: 4 August 2023
Published date: 17 October 2023
Additional Information:
Funding Information:
This study was supported by grants from the CAMS Innovation Fund for Medical Sciences (2021-I2M-1-044). All authors would like to thank Baidu and China CDC for the data publication and Sinosoft Company Limited for technical support.
Publisher Copyright:
©Liuyang Yang, Ting Zhang, Xuan Han, Jiao Yang, Yanxia Sun, Libing Ma, Jialong Chen, Yanming Li, Shengjie Lai, Wei Li, Luzhao Feng, Weizhong Yang.
Identifiers
Local EPrints ID: 484798
URI: http://eprints.soton.ac.uk/id/eprint/484798
ISSN: 1438-8871
PURE UUID: dec12764-aefa-4875-a6d3-9cd6a5f8c78d
Catalogue record
Date deposited: 22 Nov 2023 17:31
Last modified: 18 Mar 2024 03:48
Contributors
Author: Liuyang Yang
Author: Ting Zhang
Author: Xuan Han
Author: Jiao Yang
Author: Yanxia Sun
Author: Libing Ma
Author: Jialong Chen
Author: Yanming Li
Author: Shengjie Lai
Author: Wei Li
Author: Luzhao Feng
Author: Weizhong Yang