The University of Southampton
University of Southampton Institutional Repository

Influenza epidemic trend surveillance and prediction based on search engine data: deep learning model study


Background: influenza outbreaks pose a significant threat to global public health. Traditional surveillance systems and simple algorithms often struggle to predict influenza outbreaks in an accurate and timely manner. Big data and modern technology have offered new modalities for disease surveillance and prediction. Influenza-like illness can serve as a valuable surveillance tool for emerging respiratory infectious diseases like influenza and COVID-19, especially when reported case data may not fully reflect the actual epidemic curve.

Objective: this study aimed to develop a predictive model for influenza outbreaks by combining Baidu search query data with traditional virological surveillance data. The goal was to improve early detection of and preparedness for influenza outbreaks in both northern and southern China, providing evidence for supplementing modern, intelligence-based epidemic surveillance methods.

Methods: we collected virological data from the National Influenza Surveillance Network and Baidu search query data from January 2011 to July 2018, totaling 3,691,865 and 1,563,361 samples, respectively. Search terms related to influenza were identified, and their correlation with influenza-positive rates was analyzed using Pearson correlation analysis. A distributed lag nonlinear model was used to assess the lag correlation of the search terms with influenza activity. Subsequently, a predictive model based on the gated recurrent unit and multiple attention mechanisms was developed to forecast the influenza-positive trend.
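To make the last step of the Methods more concrete, the following is a minimal sketch of a forward pass through a gated recurrent unit followed by attention pooling over its hidden states. It is not the authors' model: the function names, the hidden size, the untrained random weights, the omission of bias terms, and the use of a single dot-product attention layer (rather than the multiple attention mechanisms the abstract mentions) are all assumptions made for illustration, and the input values are made up.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def gru_step(x, h, params):
    """One GRU update (biases omitted): x is the input, h the previous state."""
    xh = x + h  # concatenation [x, h]
    z = [sigmoid(a) for a in matvec(params["Wz"], xh)]  # update gate
    r = [sigmoid(a) for a in matvec(params["Wr"], xh)]  # reset gate
    x_rh = x + [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a) for a in matvec(params["Wh"], x_rh)]  # candidate state
    # Blend the previous state and the candidate according to the update gate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def attention_pool(states, query):
    """Dot-product attention: softmax-weighted average of the hidden states."""
    scores = [sum(q * s for q, s in zip(query, st)) for st in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * st[i] for w, st in zip(weights, states))
        for i in range(len(query))
    ]
    return context, weights


def forecast(series, hidden=4, seed=0):
    """Run the GRU over a sequence of feature vectors, return (y, weights).

    y is squashed into (0, 1) so it can be read as a positive rate.
    """
    rng = random.Random(seed)
    n_in = len(series[0])
    params = {k: init_matrix(hidden, n_in + hidden, rng) for k in ("Wz", "Wr", "Wh")}
    query = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    h = [0.0] * hidden
    states = []
    for x in series:
        h = gru_step(list(x), h, params)
        states.append(h)
    context, weights = attention_pool(states, query)
    y = sigmoid(sum(w * c for w, c in zip(w_out, context)))
    return y, weights


if __name__ == "__main__":
    # Hypothetical inputs: [positive rate, normalized search index] per step.
    demo = [[0.10, 0.30], [0.12, 0.35], [0.15, 0.50], [0.21, 0.70]]
    y, weights = forecast(demo)
    print(y, weights)
```

The attention weights show how much each time step contributes to the pooled representation, which is one reason attention layers are often paired with recurrent units in forecasting models. With untrained weights the output carries no predictive meaning; the sketch only illustrates the data flow.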

Results: this study revealed a high correlation between specific Baidu search terms and influenza-positive rates in both northern and southern China, except for 1 term. The search terms were categorized into 4 groups: essential facts on influenza, influenza symptoms, influenza treatment and medicine, and influenza prevention, all of which showed correlation with the influenza-positive rate. The influenza prevention and influenza symptom groups had a lag correlation of 1.4-3.2 and 5.0-8.0 days, respectively. The Baidu search terms could help predict the influenza-positive rate 14-22 days in advance in southern China but interfered with influenza surveillance in northern China.
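The lag figures above come from the study's distributed lag nonlinear model. As a much simpler stand-in that illustrates what a lag correlation measures, the sketch below scans Pearson correlations between a search series and a positive-rate series shifted by 0 to `max_lag` time steps. It is a swapped-in technique, not the authors' method; the function names and the toy series are hypothetical, and the lag is counted in time steps of whatever resolution the data have.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlations(search, positive_rate, max_lag):
    """Correlate search[t] with positive_rate[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        xs = search[: len(search) - lag]   # search values that have a partner
        ys = positive_rate[lag:]           # positive rate `lag` steps later
        result[lag] = pearson(xs, ys)
    return result


def best_lag(correlations):
    """Lag with the highest correlation."""
    return max(correlations, key=correlations.get)


if __name__ == "__main__":
    # Toy data: the positive rate repeats the search series 3 steps later.
    search = [1, 4, 2, 8, 5, 7, 3, 9, 6, 10, 2, 5]
    positive_rate = [0, 0, 0] + search[:-3]
    corrs = lagged_correlations(search, positive_rate, max_lag=5)
    print(best_lag(corrs), corrs)
```

A scan like this only captures linear association at fixed lags; a distributed lag nonlinear model additionally allows the effect to vary nonlinearly with both the exposure level and the lag.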

Conclusions: complementing traditional disease surveillance systems with information from web-based data sources can aid in detecting warning signs of influenza outbreaks earlier. However, supplementation of modern surveillance with search engine information should be approached cautiously. This approach provides valuable insights for digital epidemiology and has the potential for broader application in respiratory infectious disease surveillance. Further research should explore the optimization and customization of search terms for different regions and languages to improve the accuracy of influenza prediction models.


Yang, Liuyang, Zhang, Ting, Han, Xuan, Yang, Jiao, Sun, Yanxia, Ma, Libing, Chen, Jialong, Li, Yanming, Lai, Shengjie, Li, Wei, Feng, Luzhao and Yang, Weizhong (2023) Influenza epidemic trend surveillance and prediction based on search engine data: deep learning model study. Journal of Medical Internet Research, 25 (1), e45085, [e45085]. (doi:10.2196/45085).

Record type: Article

Full text: PDF, Version of Record. Available under License Creative Commons Attribution.

More information

Submitted date: 15 December 2022
Accepted/In Press date: 4 August 2023
Published date: 17 October 2023
Additional Information: Funding Information: This study was supported by grants from the CAMS Innovation Fund for Medical Sciences (2021-I2M-1-044). All authors would like to thank Baidu and China CDC for the data publication and Sinosoft Company Limited for technical support. Publisher Copyright: ©Liuyang Yang, Ting Zhang, Xuan Han, Jiao Yang, Yanxia Sun, Libing Ma, Jialong Chen, Yanming Li, Shengjie Lai, Wei Li, Luzhao Feng, Weizhong Yang.
Keywords: early warning, epidemic intelligence, infectious disease, influenza-like illness, surveillance

Identifiers

Local EPrints ID: 484798
URI: http://eprints.soton.ac.uk/id/eprint/484798
ISSN: 1438-8871
PURE UUID: dec12764-aefa-4875-a6d3-9cd6a5f8c78d
ORCID for Shengjie Lai: orcid.org/0000-0001-9781-8148

Catalogue record

Date deposited: 22 Nov 2023 17:31
Last modified: 18 Mar 2024 03:48

Contributors

Author: Liuyang Yang
Author: Ting Zhang
Author: Xuan Han
Author: Jiao Yang
Author: Yanxia Sun
Author: Libing Ma
Author: Jialong Chen
Author: Yanming Li
Author: Shengjie Lai
Author: Wei Li
Author: Luzhao Feng
Author: Weizhong Yang
