HK Model 2 is a hybrid deep learning model that combines the strengths of convolutional neural networks (CNNs) and Transformers. It detects multiple base modifications from single-molecule real-time sequencing (SMRT-seq) data, including 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), and N6-methyladenine (6mA).
HK Model 2 demonstrates significantly improved sensitivity and specificity in detecting these modification types. It also enables strand-specific analysis of base modifications and detects modifications precisely even at sites close to the ends of DNA fragments.
Download
The sequence data for the training datasets used in this work have been deposited in the European Genome-Phenome Archive (EGA), hosted by the European Bioinformatics Institute (EBI), under accession no. EGAS50000000366. The computer code used to generate the results presented in the manuscript is proprietary information of the Centre for Novostics, a subsidiary of The Chinese University of Hong Kong. The code can be made available for evaluating the results presented in the study, subject to a Software and Data Access Agreement. If you are an authorized user, you can download the software and its full manual below.
Publications
Below is a list of publications related to this research work:
- Transformer-based deep learning for accurate detection of multiple base modifications using single molecule real-time sequencing
  Hu, X., Shi, Y., Cheng, S. H., Huang, Z., Zhou, Z., Shi, X., ... & Lo, Y. D., 2025, Published in Communications Biology
- Genome-wide detection of cytosine methylation by single molecule real-time sequencing
  Tse, O. O., Jiang, P., Cheng, S. H., Peng, W., Shang, H., Wong, J., ... & Lo, Y. D., 2021, Published in Proceedings of the National Academy of Sciences