3D Structure Prediction

This tutorial demonstrates how to use Boltz-2 for state-of-the-art biomolecular structure prediction, featuring examples from Nobel Prize-winning therapeutic breakthroughs that have saved millions of lives worldwide.

Tutorial Overview

Learning Objectives

By completing this tutorial, you will:

  • Master modern AI-based structure prediction workflows
  • Understand confidence metrics and quality assessment
  • Analyze structural predictions with integrated visualization
  • Connect structure prediction to downstream applications

Tutorial Examples

COVID-19 Spike RBD

Viral receptor binding domain structure prediction

FMC63 CAR-T Antibody

Therapeutic antibody structure prediction

Prerequisites

  • Basic understanding of protein structure
  • Protein sequence in FASTA format
  • Access to the Chiral Potter platform

Stage 1: COVID-19 Spike RBD Structure Prediction

Setting Up the Prediction

Create New Project and Workflow

  • Project name: "COVID-19-Spike-RBD-Prediction"
  • Workflow name: "Structure prediction"

Prepare Sequence Input

Upload the Spike RBD sequence (UniProt P0DTC2, residues 331-524):

>A|protein
NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFT
NVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLF
RKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLH
APATV
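Before uploading, it can help to sanity-check the record. The sketch below is a minimal standalone check, not part of Boltz-2 or the Chiral platform: it parses the `>A|protein` header, joins the wrapped sequence lines, and confirms that the length matches residues 331-524 and that only the 20 standard amino-acid letters appear. The helper name `parse_boltz_fasta` is hypothetical.

```python
# Hypothetical helper for illustration; not a Boltz-2 or Chiral API.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

SPIKE_RBD_FASTA = """>A|protein
NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFT
NVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLF
RKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLH
APATV
"""


def parse_boltz_fasta(text):
    """Return (chain_id, entity_type, sequence) for a single-record FASTA."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA record must start with a '>' header line")
    fields = lines[0][1:].split("|")
    if len(fields) < 2:
        raise ValueError("expected a header of the form '>CHAIN|ENTITY_TYPE'")
    chain_id, entity_type = fields[0], fields[1]
    # Join wrapped lines into one sequence and reject unexpected letters.
    sequence = "".join(lines[1:]).upper()
    unknown = set(sequence) - STANDARD_AA
    if unknown:
        raise ValueError(f"non-standard residue letters: {sorted(unknown)}")
    return chain_id, entity_type, sequence


chain_id, entity_type, sequence = parse_boltz_fasta(SPIKE_RBD_FASTA)
expected_length = 524 - 331 + 1  # residues 331-524, inclusive
print(chain_id, entity_type, len(sequence), len(sequence) == expected_length)
```

A length mismatch here usually points to a dropped or duplicated line when the sequence was copied.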

Configure Boltz-2 Parameters

  • Output Format: PDB (standard protein structure format)
  • Recycling Steps: 5 (enhanced accuracy through iterative refinement)
  • Sampling Steps: 200 (diffusion model denoising iterations)
  • Diffusion Samples: 10 (explore conformational space diversity)
  • Step Scale: 1.638 (diffusion step size optimization)
  • Devices: 1 (GPU allocation)
  • Accelerator: GPU (hardware acceleration)
  • MSA Server: Enabled (evolutionary information integration)
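For readers who run the open-source Boltz command-line tool instead of the platform form, the settings above correspond roughly to the invocation assembled below. This is a sketch under assumptions: the flag names are taken to match the `boltz predict` CLI (check `boltz predict --help` for your installed version), the input file `spike_rbd.fasta` and output directory `predictions` are hypothetical, and the Chiral platform may pass these settings differently. The code only builds and prints the command; it does not run it.

```python
import shlex

# Settings from the list above; keys are assumed to match `boltz predict` flag names.
params = {
    "output_format": "pdb",
    "recycling_steps": 5,
    "sampling_steps": 200,
    "diffusion_samples": 10,
    "step_scale": 1.638,
    "devices": 1,
    "accelerator": "gpu",
}


def build_boltz_command(input_path, out_dir, params, use_msa_server=True):
    """Assemble (but do not run) a `boltz predict` command as an argument list."""
    command = ["boltz", "predict", input_path, "--out_dir", out_dir]
    for name, value in params.items():
        command += [f"--{name}", str(value)]
    if use_msa_server:
        command.append("--use_msa_server")  # boolean flag, takes no value
    return command


# Hypothetical file and directory names.
command = build_boltz_command("spike_rbd.fasta", "predictions", params)
print(shlex.join(command))
```

Keeping the settings in one dictionary makes it easy to rerun the same prediction with a single value changed.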

Understanding Boltz-2 Configuration

Input Options

Sequence Input Formats

  • FASTA: Simple sequence input (recommended for single proteins)
  • YAML: Advanced format for complexes and constraints
  • MSA: Include evolutionary information directly
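As a rough illustration of the YAML route, the snippet below renders a minimal single-chain input. The layout (`version`, `sequences`, `protein`, `id`, `sequence`) is assumed to follow the open-source Boltz input schema; the helper name `protein_yaml` is hypothetical, the demo sequence is only the first ten residues of the Spike RBD fragment, and the platform may expect a different layout.

```python
def protein_yaml(chains):
    """Render a minimal Boltz-style YAML input from {chain_id: sequence}."""
    lines = ["version: 1", "sequences:"]
    for chain_id, sequence in chains.items():
        lines += [
            "  - protein:",
            f"      id: {chain_id}",
            f"      sequence: {sequence}",
        ]
    return "\n".join(lines) + "\n"


# Truncated demo sequence, for illustration only.
yaml_text = protein_yaml({"A": "NITNLCPFGE"})
print(yaml_text)
```

Passing several chain IDs produces one `protein` entry per chain, which is the usual starting point for describing a complex.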

MSA (Multiple Sequence Alignment)

  • Automatically generated from sequence databases
  • Provides evolutionary context
  • Improves prediction accuracy significantly

Quality Settings

Confidence Levels (Boltz-2 scale: 0-1)

  • pLDDT > 0.9: Very high confidence, near-atomic accuracy
  • pLDDT 0.7-0.9: Confident regions, reliable structure
  • pLDDT 0.5-0.7: Low confidence, interpret carefully
  • pLDDT < 0.5: Very low confidence, likely disordered
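The bands above translate directly into a small lookup. This is a minimal sketch; the function name `plddt_band` is hypothetical, and because the list does not say which band the boundary values belong to, the code assumes 0.9 and 0.7 fall in "confident" and 0.5 falls in "low".

```python
def plddt_band(plddt):
    """Map a pLDDT value on the 0-1 scale to the bands listed above."""
    if not 0.0 <= plddt <= 1.0:
        raise ValueError("pLDDT must be on the 0-1 scale")
    if plddt > 0.9:
        return "very high"
    if plddt >= 0.7:  # assumption: boundary values go to the band above
        return "confident"
    if plddt >= 0.5:
        return "low"
    return "very low"


for value in (0.95, 0.80, 0.60, 0.20):
    print(value, plddt_band(value))
```

The range check catches values reported on a 0-100 scale, which some tools use for the same metric.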

Boltz-2 Overall Confidence Score

The Boltz-2 confidence score is calculated as: 0.8 × overall pLDDT + 0.2 × interface pTM
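As a worked example of this weighting, the sketch below evaluates the formula for made-up inputs. The function name is hypothetical and the values 0.85 and 0.60 are illustrative, not results from a real run. For single-chain inputs such as the ones in this tutorial there is no interface, and how the second term is filled in that case is not stated here, so the function simply takes it as an argument.

```python
def boltz_confidence_score(overall_plddt, interface_ptm):
    """Weighted score from the text: 0.8 * overall pLDDT + 0.2 * interface pTM."""
    return 0.8 * overall_plddt + 0.2 * interface_ptm


# Illustrative inputs only.
score = boltz_confidence_score(0.85, 0.60)
print(round(score, 3))  # 0.8 * 0.85 + 0.2 * 0.60
```

Because pLDDT carries 80% of the weight, a well-folded model with a poorly placed interface can still score highly, so it is worth checking both terms separately.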

Recycling Steps

  • More steps can improve accuracy at the cost of longer runtime
  • 3-5 steps are typical for most proteins
  • 7+ steps for challenging cases

Diffusion Sampling

  • Controls conformational exploration
  • Higher values explore more states
  • Useful for flexible proteins
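When several diffusion samples are generated, a common next step is to rank them by confidence and look at how much they disagree. The sketch below assumes each sample has been summarized as a dictionary with a `confidence_score` entry; that field name and the scores are illustrative assumptions, not output from a real run.

```python
# Made-up scores standing in for per-sample confidence summaries.
samples = [
    {"model": 0, "confidence_score": 0.81},
    {"model": 1, "confidence_score": 0.77},
    {"model": 2, "confidence_score": 0.84},
]


def rank_samples(samples, key="confidence_score"):
    """Sort samples from highest to lowest score."""
    return sorted(samples, key=lambda sample: sample[key], reverse=True)


ranked = rank_samples(samples)
best = ranked[0]
# Difference between the best and worst sample scores.
spread = ranked[0]["confidence_score"] - ranked[-1]["confidence_score"]
print(best["model"], round(spread, 2))
```

A wide spread between the best and worst samples can hint at flexible regions or an uncertain prediction, which is where additional samples are most useful.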

Advanced Options

Advanced Parameters

  • Step Scale: Diffusion step size; lower values tend to increase diversity among samples (default: 1.638)
  • Devices: Number of GPUs to use
  • Accelerator: Hardware acceleration (GPU/CPU)
  • Checkpoint: Model checkpoint selection

Performance Optimization

  • Batch Size: Process multiple predictions
  • Memory Management: Automatic GPU memory handling
  • Parallel Processing: Multi-GPU support

Results Analysis

Quality Assessment

  • Review confidence scores (pLDDT values)
  • Identify high/low confidence regions
  • Compare with known structures if available
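Locating low-confidence regions can be automated once per-residue pLDDT values are available. The sketch below assumes they have already been extracted into a Python list on the 0-1 scale; the function name, the 0.7 threshold, the minimum run length of 3, and the example values are all illustrative choices.

```python
def low_confidence_segments(plddt, threshold=0.7, min_length=3):
    """Return (start, end) residue numbers (1-based, inclusive) of runs below threshold."""
    segments = []
    start = None
    for index, value in enumerate(plddt, start=1):
        if value < threshold:
            if start is None:
                start = index  # a new low-confidence run begins here
        elif start is not None:
            if index - start >= min_length:
                segments.append((start, index - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start + 1 >= min_length:
        segments.append((start, len(plddt)))
    return segments


# Made-up per-residue values, for illustration only.
scores = [0.95, 0.92, 0.60, 0.55, 0.40, 0.91, 0.65, 0.93, 0.50, 0.45, 0.30]
print(low_confidence_segments(scores))
```

The minimum run length filters out isolated dips, so the reported segments are more likely to correspond to loops, linkers, or disordered tails.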

Structural Visualization

  • Interactive 3D visualization with embedded Mol* viewer
  • Export options for further analysis

Stage 2: FMC63 CAR-T Antibody Prediction

The same workflow applies to complex multi-domain proteins such as the FMC63 CAR-T construct, with a few parameter adjustments.

Input Sequence

Upload the FMC63 CAR-T construct sequence containing multiple functional domains.

>A|protein
MLLLVTSLLLCELPHPAFLLIPDIQMTQTTSSLSASLGDRVTISCRASQDISKYLNWYQQKPD
GTVKLLIYHTSRLHSGVPSRFSGSGSGTDYSLTISNLEQEDIATYFCQQGNTLPYTFGGGTKL
EITGSTSGSGKPGSGEGSTKGEVKLQESGPGLVAPSQSLSVTCTVSGVSLPDYGVSWIRQPPR
KGLEWLGVIWGSETTYYNSALKSRLTIIKDNSKSQVFLKMNSLQTDDTAIYYCAKHYYYGGSY
AMDYWGQGTSVTVSSAAAIEVMYPPPYLDNEKSNGTIIHVKGKHLCPSPLFPGPSKPFWVLVV
VGGVLACYSLLVTVAFIIFWVRSKRSRLLHSDYMNMTPRRPGPTRKHYQPYAPPRDFAAYRSR
VKFSRSADAPAYQQGQNQLYNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNELQK
DKMAEAYSEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR

Adjusted Parameters

  • Output Format: PDB (standard protein structure format)
  • Recycling Steps: 7 (for complex multi-domain structure)
  • Sampling Steps: 200 (diffusion model denoising iterations)
  • Diffusion Samples: 15 (explore domain orientations)
  • Step Scale: 1.638 (diffusion step size optimization)
  • Devices: 1 (GPU allocation)
  • Accelerator: GPU (hardware acceleration)
  • MSA Server: Enabled (evolutionary information integration)

Expected Results

  • Structured domains (scFv) show high confidence
  • Flexible linkers show lower confidence
  • Overall structure suitable for downstream analysis

Stage 3: Structure Reporting and Visualization

Generated structures are automatically processed for analysis:

  1. Automated Reporting

    • Confidence score analysis and visualization
    • Quality metrics and validation
  2. Mol* Integration

    • Interactive 3D structure viewer
    • Export options for publication graphics

Stage 4: Comparative Studies and Benchmarking

Method Comparison

Understanding Boltz-2's performance vs other methods:

vs AlphaFold2

  • Speed: Boltz-2 is significantly faster
  • Accuracy: Comparable for most proteins
  • Complexes: Better handling of multi-protein systems
  • Flexibility: Enhanced conformational sampling

vs Experimental Methods

  • Resolution: AI predictions are roughly comparable to 2-4 Å effective resolution
  • Coverage: Can generate a prediction for any sequence, though confidence varies by target
  • Speed: Minutes vs months/years
  • Cost: Dramatically reduced

Conclusion

This tutorial demonstrates the transformative power of AI-based structure prediction for biomedical research. By predicting structures tied to Nobel Prize-winning therapeutic breakthroughs, we've shown how modern computational methods can accelerate drug discovery and deepen our understanding of biological mechanisms.

Key Takeaways

  1. AI prediction is mature for most protein structures
  2. Confidence metrics guide interpretation and downstream use
  3. Integration is seamless with other Chiral applications

Resources and References

Key Publications

  1. Ille, A. M. et al. (2025). Human protein interactome structure prediction at scale with Boltz-2. bioRxiv. https://doi.org/10.1101/2025.07.03.663068
  2. Sehnal, D. et al. (2021). Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Research, 49(W1), W431-W437.

This tutorial showcases how AI-powered structure prediction accelerates research on life-saving therapeutics, from COVID-19 vaccines to cancer immunotherapy.