Kim, S., Kim, MA., Kim, B. et al. Machine learning assessment of zoonotic potential in avian influenza viruses using PB2 segment. BMC Genomics 26, 395 (2025)
Background
Influenza A virus (IAV) is a major global health threat, causing seasonal epidemics and occasional pandemics. Particularly, Influenza A viruses from avian species pose significant zoonotic threats, with PB2 adaptation serving as a critical first step in cross-species transmission. A comprehensive risk assessment framework based on PB2 sequences is necessary, which should encompass detailed analyses of specific residues and mutations while maintaining sufficient generality for application to non-PB2 segments.
Results
In this study, we developed two complementary approaches: a regression-based model for accurately distinguishing among risk groups, and a SHAP-based risk assessment model for more meaningful risk analyses. For the regression-based risk models, we compared various methodologies, including tree ensemble methods, conventional regression models, and deep learning architectures. The optimized regression model, combined with SHAP value analysis, identified and ranked individual residues contributing to zoonotic potential. The SHAP-based risk model enabled intra-class analyses within the zoonotic risk assessment framework and quantified risk yields from specific mutations.
Conclusion
Experimental analyses demonstrated that the Random Forest regression model outperformed other models in most cases, and we validated the target value settings for risk regression through ablation studies. Our SHAP-based analysis identified key residues (271A, 627K, 591R, 588A, 292I, 684S, 684A, 81M, 199S, and 368Q) and mutations (T271A, Q368R/K, E627K, Q591R, A588T/I/V, and I292V/T) critical for zoonotic risk assessment. Using the SHAP-based risk assessment model, we found that influenza A viruses from Phasianidae showed elevated zoonotic risk scores compared to those from other avian species. Additionally, mutations I292V/T, Q368R, A588T/I, V598A/I/T, and E/V627K were identified as significant mutations in the Phasianidae. These PB2-focused quantitative methods provide a robust and generalizable framework for both rapid screening of avians’ zoonotic potential and analytical quantification of risks associated with specific residues or mutations.
Influenza A virus (IAV) is a major global health threat, causing seasonal epidemics and occasional pandemics. Particularly, Influenza A viruses from avian species pose significant zoonotic threats, with PB2 adaptation serving as a critical first step in cross-species transmission. A comprehensive risk assessment framework based on PB2 sequences is necessary, which should encompass detailed analyses of specific residues and mutations while maintaining sufficient generality for application to non-PB2 segments.
Results
In this study, we developed two complementary approaches: a regression-based model for accurately distinguishing among risk groups, and a SHAP-based risk assessment model for more meaningful risk analyses. For the regression-based risk models, we compared various methodologies, including tree ensemble methods, conventional regression models, and deep learning architectures. The optimized regression model, combined with SHAP value analysis, identified and ranked individual residues contributing to zoonotic potential. The SHAP-based risk model enabled intra-class analyses within the zoonotic risk assessment framework and quantified risk yields from specific mutations.
Conclusion
Experimental analyses demonstrated that the Random Forest regression model outperformed other models in most cases, and we validated the target value settings for risk regression through ablation studies. Our SHAP-based analysis identified key residues (271A, 627K, 591R, 588A, 292I, 684S, 684A, 81M, 199S, and 368Q) and mutations (T271A, Q368R/K, E627K, Q591R, A588T/I/V, and I292V/T) critical for zoonotic risk assessment. Using the SHAP-based risk assessment model, we found that influenza A viruses from Phasianidae showed elevated zoonotic risk scores compared to those from other avian species. Additionally, mutations I292V/T, Q368R, A588T/I, V598A/I/T, and E/V627K were identified as significant mutations in the Phasianidae. These PB2-focused quantitative methods provide a robust and generalizable framework for both rapid screening of avians’ zoonotic potential and analytical quantification of risks associated with specific residues or mutations.
See Also:
Latest articles in those days:
- Extended influenza seasons in Australia and New Zealand in 2025 due to the emergence of influenza A(H3N2) subclade K viruses 6 hours ago
- Dynamic ensemble deep learning with multi-source data for robust influenza forecasting in Yangzhou 6 hours ago
- Structural and immunological characterization of the H3 influenza hemagglutinin during antigenic drift 6 hours ago
- Novel Highly Pathogenic Avian Influenza A(H5N1) Virus, Argentina, 2025 9 hours ago
- Avian influenza overview September - November 2025 1 days ago
[Go Top] [Close Window]


