Molecular Structure Characterization Learning and Property Prediction

Developing equivariant geometry-enhanced graph neural networks and geometric transformers that elegantly extract geometric features and efficiently model molecular structures at low computational cost, for molecular property prediction, drug discovery, and molecular dynamics simulation

Research Overview

Geometric deep learning has been revolutionizing molecular modeling. Although state-of-the-art neural network models approach ab initio accuracy for molecular property prediction, their application to drug discovery and molecular dynamics simulation has been hindered by insufficient use of geometric information and high computational cost. Our research addresses these fundamental challenges through equivariant geometry-enhanced graph neural networks and geometric transformers.

Our key innovations include ViSNet, which elegantly extracts geometric features and efficiently models molecular structures at low computational cost; Geoformer, a geometric Transformer with Interatomic Positional Encoding that paves the way for Transformer-based molecular geometric modeling; and the Long-Short-Range Message-Passing framework, which captures non-local interactions for scalable molecular dynamics simulation. Together, these advances enable efficient exploration of conformational space, superior drug-target interaction prediction, and interpretable mappings from geometric representations back to molecular structures.

Research Framework Architecture

Comprehensive framework integrating equivariant geometry-enhanced graph neural networks and geometric transformers for accurate molecular property prediction and structure characterization

The molecular structure characterization framework combines equivariant geometry-enhanced graph neural networks with geometric transformers to achieve superior molecular property prediction and structure understanding.

Key Research Areas

Graph Neural Networks

Equivariant geometry-enhanced graph neural networks that elegantly extract geometric features for molecular structure characterization.

  • Vector-scalar interactive message passing (see the sketch after this list)
  • Long-short-range message passing
  • Conformational space exploration
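
To make the first bullet concrete, here is a minimal sketch of a message-passing layer with interacting scalar and vector channels, written in plain PyTorch. It is illustrative only and is not the ViSNet implementation: the layer name, gating scheme, and feature dimensions are assumptions, and a real model would add radial basis expansions, cutoffs, and normalization.

```python
import torch
import torch.nn as nn

class VectorScalarMessageLayer(nn.Module):
    """Toy equivariant message-passing layer with interacting scalar and
    vector channels (illustrative, not the ViSNet implementation)."""

    def __init__(self, hidden: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(hidden + 1, hidden), nn.SiLU(),
                                 nn.Linear(hidden, 3 * hidden))
        self.out_s = nn.Linear(hidden, hidden)

    def forward(self, s, v, edge_index, pos):
        # s: (N, F) invariant scalars; v: (N, 3, F) equivariant vectors
        src, dst = edge_index                     # edges j -> i
        r = pos[dst] - pos[src]                   # (E, 3) displacement
        d = r.norm(dim=-1, keepdim=True)          # (E, 1) distance (invariant)
        u = r / (d + 1e-9)                        # (E, 3) unit direction (equivariant)

        # Invariant edge filter from neighbor scalars and distance
        w_s, w_dir, w_gate = self.phi(torch.cat([s[src], d], dim=-1)).chunk(3, dim=-1)

        # Scalar messages are purely invariant; vector messages gate the
        # sender's vectors and add a directional term
        m_s = w_s
        m_v = w_gate.unsqueeze(1) * v[src] + w_dir.unsqueeze(1) * u.unsqueeze(-1)

        # Aggregate messages onto receiving atoms
        ds = torch.zeros_like(s).index_add_(0, dst, m_s)
        dv = torch.zeros_like(v).index_add_(0, dst, m_v)

        # Scalar channel also sees the (invariant) squared norm of the vector channel
        s = s + self.out_s(ds) + (v + dv).pow(2).sum(dim=1)
        v = v + dv
        return s, v
```

The design constraint the sketch respects is the one that matters for equivariance: scalar updates are built only from rotation-invariant quantities (distances, norms), while vector updates are built from directions and existing vectors scaled by invariant gates.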

Geometric Deep Learning

Geometric transformers with interatomic positional encoding that parameterize atomic environments for molecular structure modeling.

  • Interatomic positional encoding (see the sketch after this list)
  • Three-way Transformer architecture
  • Geometric feature extraction
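
The sketch below shows one way interatomic geometry can enter a Transformer as a pairwise attention bias, again in plain PyTorch. It illustrates the general idea of distance-derived positional encodings rather than Geoformer's actual Interatomic Positional Encoding; the radial basis, module names, and head-wise bias are assumptions.

```python
import torch
import torch.nn as nn

class DistanceBiasedSelfAttention(nn.Module):
    """Self-attention over atoms with a pairwise bias derived from interatomic
    distances (a sketch in the spirit of interatomic positional encoding,
    not the Geoformer implementation)."""

    def __init__(self, dim: int, heads: int, num_rbf: int = 32, cutoff: float = 5.0):
        super().__init__()
        self.heads, self.d = heads, dim // heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.register_buffer("centers", torch.linspace(0.0, cutoff, num_rbf))
        self.width = cutoff / num_rbf
        self.pe = nn.Sequential(nn.Linear(num_rbf, dim), nn.SiLU(), nn.Linear(dim, heads))

    def forward(self, x, pos):
        # x: (N, dim) atom features; pos: (N, 3) coordinates
        n = x.size(0)
        dist = torch.cdist(pos, pos)                                 # (N, N) pairwise distances
        rbf = torch.exp(-((dist.unsqueeze(-1) - self.centers) / self.width) ** 2)
        bias = self.pe(rbf).permute(2, 0, 1)                         # (heads, N, N) geometric bias

        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(n, self.heads, self.d).transpose(0, 1)            # (heads, N, d)
        k = k.view(n, self.heads, self.d).transpose(0, 1)
        v = v.view(n, self.heads, self.d).transpose(0, 1)

        attn = (q @ k.transpose(-2, -1)) / self.d ** 0.5 + bias      # geometry-aware logits
        out = torch.softmax(attn, dim=-1) @ v                        # (heads, N, d)
        return self.proj(out.transpose(0, 1).reshape(n, -1))
```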

ML Force Fields

Physics-informed frameworks for scalable molecular dynamics simulation that capture non-local interactions with low computational costs.

  • Long-range interaction modeling (see the sketch after this list)
  • Scalable MD simulation
  • Chemical accuracy preservation
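
The sketch below illustrates the long-short-range idea in plain PyTorch: a standard short-range atom-level message-passing step combined with a coarse fragment-level exchange that carries long-range information. It is not the LSR-MP implementation; the fragment pooling, attention readout, and layer names are assumptions, and fragment assignments are taken as given (e.g. from a fragmentation scheme).

```python
import torch
import torch.nn as nn

class LongShortRangeBlock(nn.Module):
    """Sketch of a long-short-range update: a short-range atom-level GNN step
    plus a fragment-level exchange for long-range effects (illustrative,
    not the LSR-MP implementation)."""

    def __init__(self, hidden: int):
        super().__init__()
        self.short = nn.Sequential(nn.Linear(2 * hidden + 1, hidden), nn.SiLU(),
                                   nn.Linear(hidden, hidden))
        self.long = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.SiLU(),
                                  nn.Linear(hidden, hidden))

    def forward(self, h, pos, edge_index, frag):
        # h: (N, F) atom features; edge_index: short-range neighbor list;
        # frag: (N,) fragment id per atom (assumed given by a fragmentation scheme)
        src, dst = edge_index
        d = (pos[dst] - pos[src]).norm(dim=-1, keepdim=True)
        m = self.short(torch.cat([h[dst], h[src], d], dim=-1))
        h_short = torch.zeros_like(h).index_add_(0, dst, m)          # short-range messages

        # Mean-pool atoms into fragment embeddings
        n_frag = int(frag.max()) + 1
        counts = torch.zeros(n_frag, device=h.device).index_add_(
            0, frag, torch.ones_like(frag, dtype=h.dtype))
        f = torch.zeros(n_frag, h.size(1), device=h.device).index_add_(0, frag, h)
        f = f / counts.clamp(min=1).unsqueeze(-1)

        # Long-range channel: each atom attends over all fragment embeddings
        att = torch.softmax(h @ f.t() / h.size(1) ** 0.5, dim=-1)    # (N, n_frag)
        h_long = self.long(torch.cat([h, att @ f], dim=-1))

        return h + h_short + h_long
```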

Technical Innovations

Novel Architectures

We develop cutting-edge graph neural network architectures that push the boundaries of molecular modeling:

Multi-Scale GNNs

Hierarchical representations from atoms to complexes

Attention Mechanisms

Learning important molecular interactions dynamically

Equivariant Layers

Preserving molecular symmetries in predictions
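
Equivariance is easiest to state as a testable property: rotating the input coordinates should leave the predicted energy unchanged and rotate the predicted forces accordingly. The generic check below makes that concrete; it assumes any energy model with a `model(z, pos)` interface (an assumed signature for illustration, not a specific library API).

```python
import torch

def check_rotation_symmetry(model, z, pos, atol=1e-4):
    """Check that energy is rotation-invariant and forces are equivariant
    (a generic sanity test, not tied to any particular model)."""
    # Random proper rotation via QR decomposition
    q, _ = torch.linalg.qr(torch.randn(3, 3, dtype=pos.dtype))
    if torch.det(q) < 0:            # enforce det = +1 (rotation, not reflection)
        q[:, 0] = -q[:, 0]

    pos = pos.clone().requires_grad_(True)
    e = model(z, pos)
    f = -torch.autograd.grad(e.sum(), pos)[0]            # forces = -dE/dpos

    pos_rot = (pos.detach() @ q.T).requires_grad_(True)  # rotate coordinates
    e_rot = model(z, pos_rot)
    f_rot = -torch.autograd.grad(e_rot.sum(), pos_rot)[0]

    assert torch.allclose(e, e_rot, atol=atol), "energy is not rotation-invariant"
    assert torch.allclose(f @ q.T, f_rot, atol=atol), "forces are not equivariant"
```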

Performance Achievements

Accuracy Improvements

  • 50% reduction in force prediction errors
  • 10x improvement in energy conservation
  • State-of-the-art on the QM9 benchmark
  • Superior performance on the MD17 dataset

Computational Efficiency

  • 100x faster than DFT calculations
  • Linear scaling with system size
  • GPU-optimized implementations
  • Distributed training capabilities

Applications & Future Prospects

Current Applications

Drug Discovery

  • Virtual screening acceleration
  • Lead optimization guidance
  • ADMET property prediction
  • Drug-target interaction modeling

Materials Science

  • Catalyst design and optimization
  • Battery material discovery
  • Polymer property prediction
  • Crystal structure prediction

Key Publications

Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing

Wang, Y., Wang, T., Li, S., He, X., Li, M., Wang, Z., Zheng, N., Shao, B., & Liu, T. Y.

Nature Communications, 2024, 15(1), 313 · Co-first author (2nd), Co-corresponding (primary) · Editors' Highlights

Innovation: Equivariant geometry-enhanced graph neural network that elegantly extracts geometric features and efficiently models molecular structures with low computational costs

Contribution: Outperforms SOTA on multiple MD benchmarks (MD17, revised MD17, MD22) and achieves excellent chemical property prediction on QM9 and Molecule3D datasets

Impact: Efficiently explores conformational space with reasonable interpretability, addressing key challenges in drug discovery and MD simulation applications
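
For force-field benchmarks such as MD17 and MD22, a common recipe is to predict a scalar energy and obtain forces as its negative gradient with respect to atomic positions, which keeps the force field conservative. The sketch below shows that recipe with PyTorch autograd; the `model(z, pos)` signature and the energy/force weighting are assumptions for illustration, not details taken from the paper.

```python
import torch

def energy_force_loss(model, z, pos, e_ref, f_ref, rho=0.95):
    """Joint energy/force objective with forces from autograd (a common recipe
    for training ML force fields; the weighting rho is illustrative)."""
    pos = pos.clone().requires_grad_(True)
    e_pred = model(z, pos)                                    # predicted energy
    f_pred = -torch.autograd.grad(e_pred.sum(), pos, create_graph=True)[0]
    loss_e = torch.nn.functional.mse_loss(e_pred, e_ref)
    loss_f = torch.nn.functional.mse_loss(f_pred, f_ref)
    return (1 - rho) * loss_e + rho * loss_f
```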


Geometric transformer with interatomic positional encoding

Wang, Y., Li, S., Wang, T., Shao, B., Zheng, N., & Liu, T. Y.

NeurIPS, 2023, 36, 55981-55994 · Sole corresponding author · Top-tier ML

Innovation: Novel geometric Transformer with Interatomic Positional Encoding (IPE) that parameterizes atomic environments as Transformer's positional encodings

Contribution: Outperforms SOTA algorithms on QM9 and achieves best performance on Molecule3D for both random and scaffold splits, compared with both Transformers and equivariant GNN models

Impact: Paves the way for molecular geometric modeling based on Transformer architecture, opening new avenues for molecular modeling applications


Long-short-range message-passing: A physics-informed framework to capture non-local interaction for scalable molecular dynamics simulation

Li, Y., Wang, Y., Huang, L., Yang, H., Wei, X., Zhang, J., Wang, T., Wang, Z., Shao, B., & Liu, T. Y.

ICLR, 2024 · Co-corresponding author (3rd) · Top-tier ML

Innovation: Physics-informed framework generalizing existing equivariant GNNs to incorporate long-range interactions efficiently and effectively, inspired by fragmentation-based methods

Contribution: Demonstrates SOTA results with up to 40% MAE reduction for molecules in MD22 and Chignolin datasets, with consistent improvements to various EGNNs

Impact: Provides a satisfactory description of long-range and many-body interactions, enabling scalable molecular dynamics simulation of chemical and biological systems


SAMF: a self-adaptive protein modeling framework

Ding, W., Xu, Q., Liu, S., Wang, T., Shao, B., Gong, H., & Liu, T. Y.

Bioinformatics, 2021, 37(22), 4075-4082 · Sole corresponding author

Innovation: Self-adaptive framework that eliminates redundant constraints, resolves conflicts, and folds protein structures iteratively with a deep quality-analysis system

Contribution: Achieves SOTA performance by exploiting cutting-edge deep learning techniques, without requiring complicated domain knowledge or numerous ad hoc patches

Impact: Modular design enables easy customization and extension, and its advantage grows as the quality of input constraints improves


Improved drug-target interaction prediction with intermolecular graph transformer

Liu, S., Wang, Y., Deng, Y., He, L., Shao, B., Yin, J., Wang, T., & Liu, T. Y.

Briefings in Bioinformatics, 2022, 23(5), bbac162 · Sole corresponding author

Innovation: Dedicated attention mechanism modeling intermolecular information with a three-way Transformer-based architecture, addressing topological and spatial limitations

Contribution: Outperforms SOTA by 9.1% and 20.5% for binding activity and pose prediction, with superior generalization to unseen receptor proteins

Impact: Exhibits promising drug-screening ability against SARS-CoV-2, identifying 83.1% of active drugs validated by wet-lab experiments, with near-native binding poses
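
One generic way to model intermolecular information explicitly is cross-attention between ligand-atom and protein-residue representations, so that each ligand atom can attend to its binding environment. The sketch below shows that pattern in plain PyTorch; it is illustrative only and is not the three-way architecture described in the paper.

```python
import torch
import torch.nn as nn

class IntermolecularCrossAttention(nn.Module):
    """Cross-attention from ligand atoms to protein residues (an illustrative
    sketch of modeling intermolecular information, not the published
    intermolecular graph transformer)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, ligand, protein):
        # ligand: (B, n_lig, dim) atom features; protein: (B, n_res, dim) residue features
        updated, _ = self.attn(query=ligand, key=protein, value=protein)
        return self.norm(ligand + updated)     # residual update of ligand atoms
```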


DSN-DDI: An accurate and generalized framework for drug-drug interaction prediction by dual-view representation learning

Li, Z., Zhu, S., Shao, B., Zeng, X., Wang, T., & Liu, T. Y.

Briefings in Bioinformatics, 2023, 24(1), bbac597 · Co-corresponding author (primary)

Innovation: Novel dual-view drug representation learning network that employs local and global modules iteratively, learning substructures from the single drug (intra-view) and the drug pair (inter-view)

Contribution: Achieves a 13.01% relative improvement and >99% accuracy under the transductive setting, with a 7.07% relative improvement for unseen drugs

Impact: Exhibits good transferability to synergistic drug combination prediction, serving as a generalized framework for drug discovery applications
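
A minimal sketch of the intra-/inter-view idea follows: message passing within each drug graph (intra-view), an attention-based exchange across the drug pair (inter-view), and a pair-level classifier. It is illustrative only and is not the DSN-DDI implementation; module names, pooling, and the classifier head are assumptions.

```python
import torch
import torch.nn as nn

class DualViewDDI(nn.Module):
    """Toy dual-view block: intra-view messages within each drug graph and an
    inter-view exchange across the drug pair (illustrative, not DSN-DDI)."""

    def __init__(self, dim: int, n_classes: int):
        super().__init__()
        self.intra = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU())
        self.inter = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def _intra_view(self, h, edge_index):
        # Message passing within a single drug graph
        src, dst = edge_index
        m = self.intra(torch.cat([h[dst], h[src]], dim=-1))
        return h + torch.zeros_like(h).index_add_(0, dst, m)

    def forward(self, h_a, edges_a, h_b, edges_b):
        # h_a, h_b: atom features of the two drugs, (n_a, dim) and (n_b, dim)
        h_a, h_b = self._intra_view(h_a, edges_a), self._intra_view(h_b, edges_b)

        # Inter-view: atoms of drug A attend to atoms of drug B, and vice versa
        a2b, _ = self.inter(h_a.unsqueeze(0), h_b.unsqueeze(0), h_b.unsqueeze(0))
        b2a, _ = self.inter(h_b.unsqueeze(0), h_a.unsqueeze(0), h_a.unsqueeze(0))
        h_a, h_b = h_a + a2b.squeeze(0), h_b + b2a.squeeze(0)

        # Pair-level readout: concatenate mean-pooled drug embeddings
        pair = torch.cat([h_a.mean(dim=0), h_b.mean(dim=0)], dim=-1)
        return self.classifier(pair)        # DDI class logits
```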
