The prediction of influenza-like illness using national influenza surveillance data and Baidu query data

Background: Seasonal influenza and other respiratory tract infections are serious public health problems that need to be further addressed and investigated. Internet search data are recognized as a valuable source for forecasting influenza or other respiratory tract infection epidemics. However, the selection of internet search data and the application of forecasting methods are important for improving forecasting accuracy. The aim of the present study was to forecast influenza epidemics based on the long short-term memory neural network (LSTM) method, Baidu search index data, and the influenza-like-illness (ILI) rate.

Methods: The official weekly ILI% data for northern and southern mainland China were obtained from the Chinese Influenza Center from 2018 to 2021. Based on the Baidu Index, search indices related to influenza infection over the corresponding time period were obtained. Pearson correlation analysis was performed to explore the association between influenza-related search queries and the ILI% of southern and northern mainland China. The LSTM model was used to forecast the influenza epidemic within the same week and at lags of 1-4 weeks. The model performance was assessed by evaluation metrics, including the mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE).

Results: In total, 24 search queries in northern mainland China and 7 search queries in southern mainland China were found to be correlated and were used to construct the LSTM model, which included the same week and a lag of 1-4 weeks. The LSTM model showed that ILI% + mask with one lag week and ILI% + influenza name were good prediction modules, with reduced RMSE predictions of 16.75% and 4.20%, respectively, compared with the estimated ILI% for northern and southern mainland China.

Conclusions: The results illuminate the feasibility of using an internet search index as a complementary data source for influenza forecasting and the efficiency of using the LSTM model to forecast influenza epidemics.