Rongye Ye, Lun Li, Shuhui Song. [preprint] Influ-BERT: An Interpretable Model for Enhancing Low-Frequency Influenza A Virus Subtype Recognition. https://doi.org/10.1101/2025.07.31.667841
Influenza A virus (IAV) poses a continuous threat to global public health due to its wide host adaptability, high-frequency antigenic variation, and potential for cross-species transmission. Accurate recognition of IAV subtypes is crucial for early pandemic warning. Here, we propose Influ-BERT, a domain-adaptive pretraining model based on the transformer architecture. Building on DNABERT-2, we constructed a dedicated corpus of approximately 900,000 influenza genome sequences, developed a custom Byte Pair Encoding (BPE) tokenizer, and employed a two-stage training strategy involving domain-adaptive pretraining followed by task-specific fine-tuning. This approach significantly enhanced recognition performance for low-frequency subtypes. Experimental results demonstrate that Influ-BERT outperforms traditional machine learning methods and general genomic language models (DNABERT-2, MegaDNA) in subtype recognition, achieving a substantial improvement in F1-score, particularly for subtypes H5N8, H5N1, H7N9, and H9N2. Furthermore, sliding window perturbation analysis revealed the model's specific focus on key regions of the IAV genome, providing interpretable evidence supporting the observed performance gains.
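The sliding window perturbation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the window and stride sizes, the use of "N" as the masking character, and the toy scoring function are all assumptions made for illustration. A real analysis would replace `toy_score` with the classifier's confidence for the predicted subtype.

```python
from typing import Callable, List, Tuple


def sliding_window_perturbation(
    seq: str,
    score: Callable[[str], float],
    window: int = 4,
    stride: int = 2,
    mask_char: str = "N",
) -> List[Tuple[int, float]]:
    """Mask one window at a time and record how much the score drops.

    Returns a list of (window_start, score_drop) pairs. A large drop
    suggests the masked region mattered to the scoring function.
    """
    baseline = score(seq)
    drops = []
    for start in range(0, len(seq) - window + 1, stride):
        # Replace the window with the mask character, leaving the rest intact.
        perturbed = seq[:start] + mask_char * window + seq[start + window:]
        drops.append((start, baseline - score(perturbed)))
    return drops


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a model's confidence: counts a toy motif."""
    return float(seq.count("GATTACA"))


# Toy sequence with the motif at positions 4-10 (0-based).
seq = "AAAAGATTACAAAAA"
drops = sliding_window_perturbation(seq, toy_score)
for start, drop in drops:
    print(f"window at {start:2d}: score drop = {drop}")
```

In this toy setting, only windows that overlap the motif produce a nonzero drop, which is the kind of signal such an analysis uses to localize the regions a model relies on.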
See Also:
Recent articles from the same period:
- T cell help is a limiting factor for rare anti-influenza memory B cells to reenter germinal centers and generate potent broadly neutralizing antibodies
- Wild birds drive the introduction, maintenance, and spread of H5N1 clade 2.3.4.4b high pathogenicity avian influenza viruses in Spain, 2021-2022
- [preprint] FluNexus: a versatile web platform for antigenic prediction and visualization of influenza A viruses
- Salpingitis and multiorgan lesions caused by highly pathogenic avian influenza A(H5N1) virus in a cat associated with consumption of recalled raw milk in California
- Detection of highly pathogenic avian influenza A(H5N1) virus 2.3.4.4b in alpacas


