Personal information

As a researcher working at the intersection of artificial intelligence and bioinformatics, I have conducted postdoctoral research at Stanford University and Princeton University. Over the past five years, I have published 13 high-impact research papers, including 5 as first author (3 in Briefings in Bioinformatics and 2 in Nature Machine Intelligence). My work has received 647 citations, with an H-index of 13. My research applies AI to complex problems in bioinformatics, particularly the prediction of biomolecular associations and vaccine design. My current work centers on RNA vaccine design using language and generative models, with applications to targeted therapy and vaccine design.

Education

  • 2022.06-Present: Stanford University & Princeton University
    Joint Postdoctoral Scholar

    Department of Pathology & Department of Electrical Engineering

    • Supervisor: Prof. Le Cong & Prof. Mengdi Wang

    • Thesis title: AI + Bioinformatics
  • 2017.09-2022.06: Shanghai Jiao Tong University
    Zhiyuan Honorary PhD candidate in Bioinformatics and Computational Biology

    • Supervisor: Prof. Dong-Qing Wei and Prof. Yi Xiong

    • Thesis title: The artificial intelligence method of biomolecule association prediction and neoantigen design
  • 2021.01-2022.04: University of Calgary
    Visiting PhD candidate in Cheminformatics and Molecular Simulation

    • Supervisor: Prof. Dennis Russell Salahub

    • Thesis title: A program to predict HLA-I peptide binding and optimize mutated peptides for vaccine design
  • 2013.09-2017.06: Shenyang Pharmaceutical University
    Bachelor of Pharmaceutics

    • Supervisor: Prof. Yihui Deng

    • Thesis title #1: Gene expression profile and protein interaction network in patients with acute myeloid leukemia caused by overexpression of EVI1

    • Thesis title #2: Accelerated Blood Clearance of Nanoemulsions Modified with PEG Derivatives in Rats

Research experience during PhD

Artificial intelligence methods for biomolecule association prediction, applied to:
  • Targeted therapy
    • Drug-target interaction (DTI) prediction: Update gold-standard datasets; apply a cascade deep forest model to DTI prediction (DTI-CDF); combine multi-label learning and community detection algorithms for DTI prediction (DTI-MLCD).
    • miRNA-disease association (MDA) prediction: Apply graph neural networks and propose a new graph-building strategy for MDA prediction (MDA-GCNFTG).
  • Immunotherapy
    • Peptide-HLA binding (pHLA) prediction: Apply a Transformer-derived self-attention model to pHLA prediction (TransPHLA).
    • Peptide-based vaccine design: Propose the first program for automatically optimizing mutated peptides (AOMP) for vaccine design, based on TransPHLA.
  • RNA vaccine design
    • 5’ UTR prediction: Propose a language model for the 5’ UTR (UTR-LM); outperform benchmarks in predicting translation efficiency and identify unannotated regions.
    • 5’ UTR design: Based on UTR-LM, develop an automated optimization tool tailored for RNA vaccine design, with a focus on BMD/DMD treatments.

Publications

  1. Yanyi Chu, Dan Yu, Yupeng Li, Kaixuan Huang, Yue Shen, Le Cong, Jason Zhang, and Mengdi Wang, A 5'UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions, Nature Machine Intelligence, https://doi.org/10.1038/s42256-024-00823-9 (IF = 23.8, JCR Q1)
  2. Yanyi Chu, Yan Zhang, Qiankun Wang, Lingfeng Zhang, Xuhong Wang, Yanjing Wang, Dennis Russell Salahub, Qin Xu, Jianmin Wang, Xue Jiang, Yi Xiong, Dong-Qing Wei, A transformer-based model to predict peptide-HLA class I binding and optimize mutated peptides for vaccine design, Nature Machine Intelligence, Volume 4, 23 March 2022, Pages 300-311, https://doi.org/10.1038/s42256-022-00459-7 (IF = 25.898, JCR Q1)
  3. Yanyi Chu, Xuhong Wang, Qiuying Dai, Yanjing Wang, Qiankun Wang, Shaoliang Peng, Xiaoyong Wei, Jingfei Qiu, Dennis Russell Salahub, Yi Xiong, Dong-Qing Wei, MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph, Briefings in Bioinformatics, Volume 22, Issue 6, November 2021, bbab165, https://doi.org/10.1093/bib/bbab165 (IF = 11.622, JCR Q1)
  4. Yanyi Chu, Xiaoqi Shan, Tianhang Chen, Mingming Jiang, Yanjing Wang, Qiankun Wang, Dennis Russell Salahub, Yi Xiong, Dong-Qing Wei, DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method, Briefings in Bioinformatics, Volume 22, Issue 3, May 2021, bbaa205, https://doi.org/10.1093/bib/bbaa205 (IF = 11.622, JCR Q1)
  5. Yanyi Chu, Aman Chandra Kaushik, Xiangeng Wang, Wei Wang, Yufang Zhang, Xiaoqi Shan, Dennis Russell Salahub, Yi Xiong, Dong-Qing Wei, DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features, Briefings in Bioinformatics, Volume 22, Issue 1, January 2021, Pages 451–462, https://doi.org/10.1093/bib/bbz152 (IF = 11.622, JCR Q1)
  6. Mengyang Liu, Yanyi Chu, Huan Liu, Yuqing Su, Qi Zhang, Jiao Jiao, Mingqi Liu, Junqiang Ding, Min Liu, Yawei Hu, Yueying Dai, Rongping Zhang, Xinrong Liu, Yihui Deng, Yanzhi Song, Accelerated Blood Clearance of Nanoemulsions Modified with PEG-Cholesterol and PEG-Phospholipid Derivatives in Rats: The Effect of PEG-Lipid Linkages and PEG Molecular Weights, Molecular Pharmaceutics, Volume 17, Issue 4, April 2020, Pages 1059–1070, https://doi.org/10.1021/acs.molpharmaceut.9b00770 (IF = 4.939, JCR Q1)
    Contribution: Project planning, experimental work, and data analysis before paper submission.
  7. Zhiwen Shi, Yanyi Chu, Yonghong Zhang, Yanjing Wang, Dong-Qing Wei, Prediction of Blood-Brain Barrier Permeability of Compounds by Fusing Resampling Strategies and eXtreme Gradient Boosting, IEEE Access, Volume 9, December 2020, Pages 9557-9566, http://doi.org/10.1109/ACCESS.2020.3047852 (IF = 3.367, JCR Q1)
    Contribution: Project planning and manuscript revision.
  8. Qiuying Dai, Yanyi Chu, Zhiqi Li, Yusong Zhao, Xueying Mao, Yanjing Wang, Yi Xiong, Dong-Qing Wei, MDA-CF: Predicting MiRNA-Disease associations based on a cascade forest model by fusing multi-source information, Computers in Biology and Medicine, Volume 136, September 2021, 104706, https://doi.org/10.1016/j.compbiomed.2021.104706 (IF = 4.589, JCR Q2)
    Contribution: Method discussion.
  9. Tianhang Chen, Xiangeng Wang, Yanyi Chu, Yanjing Wang, Mingming Jiang, Dong-Qing Wei, Yi Xiong, T4SE-XGB: Interpretable Sequence-Based Prediction of Type IV Secreted Effectors Using eXtreme Gradient Boosting Algorithm, Frontiers in Microbiology, Volume 11, Article 580382, September 2020, https://doi.org/10.3389/fmicb.2020.580382 (IF = 5.64, JCR Q1)
    Contribution: Method discussion.
  10. Shenggeng Lin, Yanjing Wang, Lingfeng Zhang, Yanyi Chu, Yatong Liu, Yitian Fang, Mingming Jiang, Qiankun Wang, Bowen Zhao, Yi Xiong, Dong-Qing Wei, MDF-SA-DDI: predicting drug-drug interaction events based on multi-source drug fusion, multi-source feature fusion and transformer self-attention mechanism, Briefings in Bioinformatics, Article bbab421, October 2021, https://doi.org/10.1093/bib/bbab421 (IF = 11.622, JCR Q1)
    Contribution: Method discussion.
  11. Xiaoqi Shan, Xiangeng Wang, Cheng-dong Li, Yanyi Chu, Yufang Zhang, Yi Xiong, and Dong-Qing Wei, Prediction of CYP450 Enzyme–Substrate Selectivity Based on the Network-Based Label Space Division Method, Journal of Chemical Information and Modeling, Volume 59, Issue 11, October 2019, Pages 4577-4586, https://doi.org/10.1021/acs.jcim.9b00749 (IF = 3.99, JCR Q1)
    Contribution: Method discussion.
  12. Yufang Zhang, Xiangeng Wang, Aman Chandra Kaushik, Yanyi Chu, Xiaoqi Shan, Ming-Zhu Zhao, Qin Xu, Dong-Qing Wei, SPVec: A Word2vec-Inspired Feature Representation Method for Drug-Target Interaction Prediction, Frontiers in Chemistry, Volume 7, Article 895, January 2020, https://doi.org/10.3389/fchem.2019.00895 (IF = 5.221, JCR Q1)
    Contribution: Method discussion.
  13. Mingming Jiang, Bowen Zhao, Shenggan Luo, Qiankun Wang, Yanyi Chu, Tianhang Chen, Xueying Mao, Yatong Liu, Yanjing Wang, Xue Jiang, Dong-Qing Wei, Yi Xiong, NeuroPpred-Fuse: an interpretable stacking model for prediction of neuropeptides by fusing sequence information and feature selection methods, Briefings in Bioinformatics, Volume 22, Issue 6, November 2021, bbab310, https://doi.org/10.1093/bib/bbab310 (IF = 11.622, JCR Q1)
    Contribution: Method discussion.

Academic conferences

  • 2021 Annual Meeting of Overseas Chinese Society of Microbiology & International Conference on Metabolic Sciences
    • Poster presentation
  • 2021 The 7th National Conference on Computational Biology and Bioinformatics
    • Excellent poster presentation

Internship experience

  • 2020.07-2020.09: Tencent
    Used graph neural networks to predict user loyalty
  • 2019.01-2019.12: Intel
    Used Monte Carlo tree search and deep learning for drug retrosynthesis
  • 2018.10-2018.12: Databerry
    Used data mining and recommendation algorithms to predict television advertising placement

Awards

  • 2021: National Scholarship, The Ministry of Education of China
  • 2020: National Scholarship, The Ministry of Education of China
  • 2016: Excellent College Student, The Working Committee for Education and Science of Shenyang Municipal Party Committee
  • 2015: 3rd Prize, National College Students Mathematical Modeling Competition, Liaoning Division


Skills

Language

Python, R, MATLAB, SQL, LaTeX, Markdown, C++, etc.

Development

Linux, Git, Shell, etc.

Framework

PyTorch, scikit-learn, TensorFlow, Keras, scikit-multilearn, XGBoost, Mlxtend, NetworkX, Deep Graph Library, Requests, etc.

Biological experiments

Biochemistry and Molecular Biology, Medicinal Chemistry, Pharmacology, Pharmaceutical Analysis, Pharmaceutics, Inorganic Chemistry, Organic Chemistry, Analytical Chemistry, Physical Chemistry, Principles of Chemical Engineering, rat administration, orbital blood sampling, and dissection, etc.