3D Structure Prediction
This tutorial demonstrates how to use Boltz-2 for biomolecular structure prediction, using two examples drawn from major therapeutic advances: the SARS-CoV-2 spike receptor-binding domain (RBD) targeted by COVID-19 vaccines, and the FMC63 construct used in CAR-T cancer immunotherapy.
Tutorial Overview
Learning Objectives
By completing this tutorial, you will:
- Master modern AI-based structure prediction workflows
- Understand confidence metrics and quality assessment
- Analyze structural predictions with integrated visualization
- Connect structure prediction to downstream applications
Tutorial Examples
COVID-19 Spike RBD
Viral receptor binding domain structure prediction
FMC63 CAR-T Antibody
Therapeutic antibody structure prediction
Prerequisites
- Basic understanding of protein structure
- Protein sequence in FASTA format
- Access to Chiral Potter platform
Stage 1: COVID-19 Spike RBD Structure Prediction
Setting Up the Prediction
Create New Project and Workflow
- Project name: "COVID-19-Spike-RBD-Prediction"
- Workflow name: "Structure prediction"
Prepare Sequence Input
Upload the Spike RBD sequence (UniProt P0DTC2, region 331-524):
>A|protein
NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFT
NVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLF
RKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLH
APATV
Configure Boltz-2 Parameters
- Output Format: PDB (standard protein structure format)
- Recycling Steps: 5 (enhanced accuracy through iterative refinement)
- Sampling Steps: 200 (diffusion model denoising iterations)
- Diffusion Samples: 10 (explore conformational space diversity)
- Step Scale: 1.638 (diffusion step size optimization)
- Devices: 1 (GPU allocation)
- Accelerator: GPU (hardware acceleration)
- MSA Server: Enabled (evolutionary information integration)
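For readers who run Boltz-2 locally rather than through the platform interface, the settings above map roughly onto flags of the open-source `boltz predict` command line. The sketch below only assembles and prints such a command. The file names are placeholders, and the assumption that each platform field corresponds one-to-one to a CLI flag is not confirmed by this tutorial.

```python
import shlex


def build_boltz_command(input_path, out_dir, recycling_steps=5, sampling_steps=200,
                        diffusion_samples=10, step_scale=1.638, devices=1,
                        accelerator="gpu", output_format="pdb", use_msa_server=True):
    """Assemble (but do not run) an argument list for `boltz predict`."""
    cmd = [
        "boltz", "predict", str(input_path),
        "--out_dir", str(out_dir),
        "--recycling_steps", str(recycling_steps),
        "--sampling_steps", str(sampling_steps),
        "--diffusion_samples", str(diffusion_samples),
        "--step_scale", str(step_scale),
        "--devices", str(devices),
        "--accelerator", accelerator,
        "--output_format", output_format,
    ]
    if use_msa_server:
        # Ask Boltz to generate the MSA through a remote server.
        cmd.append("--use_msa_server")
    return cmd


# Placeholder paths; replace with your own input file and output directory.
cmd = build_boltz_command("spike_rbd.fasta", "results/spike_rbd")
print(shlex.join(cmd))
```

Building the command as a list keeps each value a separate argument, so it can be passed to `subprocess.run` without shell quoting issues if you choose to execute it.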
Understanding Boltz-2 Configuration
Input Options
Sequence Input Formats
- FASTA: Simple sequence input (recommended for single proteins)
- YAML: Advanced format for complexes and constraints
- MSA: Include evolutionary information directly
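As a small illustration of the FASTA option, the sketch below formats a sequence with the `>A|protein` header style used in this tutorial and rejects characters outside the 20 standard amino acids. The function name and the 63-character line width are illustrative choices, not platform requirements.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def format_fasta_record(chain_id, sequence, entity_type="protein", width=63):
    """Return a FASTA record with a `>CHAIN|TYPE` header and wrapped sequence lines."""
    # Remove whitespace and line breaks so pasted sequences are handled.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - VALID_RESIDUES)
    if invalid:
        raise ValueError(f"non-standard residue codes: {', '.join(invalid)}")
    lines = [f">{chain_id}|{entity_type}"]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


# First residues of the Spike RBD sequence shown earlier in this tutorial.
record = format_fasta_record("A", "NITNLCPFGEVFNATRF\nASVYAW", width=10)
print(record, end="")
```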
MSA (Multiple Sequence Alignment)
- Automatically generated from sequence databases
- Provides evolutionary context
- Improves prediction accuracy significantly
Quality Settings
Confidence Levels (Boltz-2 scale: 0-1)
- pLDDT > 0.9: Very high confidence, near-atomic accuracy
- pLDDT 0.7-0.9: Confident regions, reliable structure
- pLDDT 0.5-0.7: Low confidence, interpret carefully
- pLDDT < 0.5: Very low confidence, likely disordered
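The confidence bands above translate directly into a small lookup function. This is a minimal sketch: the label strings are illustrative, and boundary values (exactly 0.9, 0.7, or 0.5) are assigned to the band that lists them as its lower or upper limit.

```python
def classify_plddt(plddt):
    """Map a pLDDT value on the 0-1 scale to the confidence band listed above."""
    if not 0.0 <= plddt <= 1.0:
        raise ValueError("pLDDT must be on the 0-1 scale")
    if plddt > 0.9:
        return "very high"
    if plddt >= 0.7:
        return "confident"
    if plddt >= 0.5:
        return "low"
    return "very low"


for value in (0.95, 0.82, 0.61, 0.35):
    print(value, classify_plddt(value))
```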
Boltz-2 Overall Confidence Score
The Boltz-2 confidence score is calculated as: 0.8 × overall pLDDT + 0.2 × interface pTM (ipTM). For single-chain predictions, which have no interface, pTM is used in place of ipTM.
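The weighted sum above can be written as a one-line function. The argument names are illustrative, and the input values in the example are made up to show the arithmetic; they are not results from an actual prediction.

```python
def confidence_score(overall_plddt, interface_ptm):
    """Combine overall pLDDT and interface pTM (both 0-1) with 0.8/0.2 weights."""
    return 0.8 * overall_plddt + 0.2 * interface_ptm


# Illustrative inputs only: 0.8 * 0.85 + 0.2 * 0.70
score = confidence_score(0.85, 0.70)
print(round(score, 2))
```

Because pLDDT carries 80% of the weight, a prediction with well-resolved chains but a poor interface can still score highly, so it is worth checking both terms separately for complexes.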
Recycle Steps
- More steps can improve accuracy at the cost of longer runtime
- 3-5 steps typical for most proteins
- 7+ steps for challenging cases
Diffusion Sampling
- Controls conformational exploration
- Higher values explore more states
- Useful for flexible proteins
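When several diffusion samples are generated, a common follow-up is to rank them by their confidence score and inspect the top model first. The sketch below assumes each sample has already been loaded into a dictionary with `model` and `confidence_score` keys. Those key names and the example values are illustrative, and how the platform exposes per-sample scores is not specified here.

```python
def rank_samples(samples):
    """Return samples sorted from highest to lowest confidence score."""
    return sorted(samples, key=lambda s: s["confidence_score"], reverse=True)


# Made-up scores for three diffusion samples, for illustration only.
samples = [
    {"model": 0, "confidence_score": 0.81},
    {"model": 1, "confidence_score": 0.77},
    {"model": 2, "confidence_score": 0.86},
]

ranked = rank_samples(samples)
best = ranked[0]
print("best model:", best["model"])
```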
Advanced Options
Advanced Parameters
- Step Scale: Diffusion step size (default: 1.638)
- Devices: Number of GPUs to use
- Accelerator: Hardware acceleration (GPU/CPU)
- Checkpoint: Model checkpoint selection
Performance Optimization
- Batch Size: Process multiple predictions
- Memory Management: Automatic GPU memory handling
- Parallel Processing: Multi-GPU support
Results Analysis
Quality Assessment
- Review confidence scores (pLDDT values)
- Identify high/low confidence regions
- Compare with known structures if available
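One way to identify low-confidence regions is to scan the per-residue pLDDT values for contiguous stretches below a cutoff. The sketch below assumes the values are already available as a list on the 0-1 scale, one per residue. The 0.7 cutoff follows the confidence bands given earlier, while the minimum segment length of 3 is an arbitrary illustrative choice.

```python
def low_confidence_segments(plddt_values, threshold=0.7, min_length=3):
    """Return (start, end) residue ranges (1-based, inclusive) below the threshold."""
    segments = []
    start = None
    for position, value in enumerate(plddt_values, start=1):
        if value < threshold:
            if start is None:
                start = position  # open a new low-confidence stretch
        else:
            if start is not None and position - start >= min_length:
                segments.append((start, position - 1))
            start = None
    # Close a stretch that runs to the end of the chain.
    if start is not None and len(plddt_values) - start + 1 >= min_length:
        segments.append((start, len(plddt_values)))
    return segments


# Made-up per-residue values for an 11-residue toy chain.
toy_plddt = [0.9, 0.9, 0.6, 0.5, 0.4, 0.9, 0.6, 0.9, 0.3, 0.3, 0.3]
segments = low_confidence_segments(toy_plddt)
print(segments)
```

Isolated dips shorter than `min_length` are ignored, which keeps single noisy residues from being reported as disordered regions.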
Structural Visualization
- Interactive 3D visualization with embedded Mol* viewer
- Export options for further analysis
Stage 2: FMC63 CAR-T Antibody Prediction
For complex multi-domain proteins like FMC63 CAR-T:
Input Sequence
Upload the FMC63 CAR-T construct sequence containing multiple functional domains.
>A|protein
MLLLVTSLLLCELPHPAFLLIPDIQMTQTTSSLSASLGDRVTISCRASQDISKYLNWYQQKPD
GTVKLLIYHTSRLHSGVPSRFSGSGSGTDYSLTISNLEQEDIATYFCQQGNTLPYTFGGGTKL
EITGSTSGSGKPGSGEGSTKGEVKLQESGPGLVAPSQSLSVTCTVSGVSLPDYGVSWIRQPPR
KGLEWLGVIWGSETTYYNSALKSRLTIIKDNSKSQVFLKMNSLQTDDTAIYYCAKHYYYGGSY
AMDYWGQGTSVTVSSAAAIEVMYPPPYLDNEKSNGTIIHVKGKHLCPSPLFPGPSKPFWVLVV
VGGVLACYSLLVTVAFIIFWVRSKRSRLLHSDYMNMTPRRPGPTRKHYQPYAPPRDFAAYRSR
VKFSRSADAPAYQQGQNQLYNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNELQK
DKMAEAYSEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR
Adjusted Parameters
- Output Format: PDB (standard protein structure format)
- Recycling Steps: 7 (for complex multi-domain structure)
- Sampling Steps: 200 (diffusion model denoising iterations)
- Diffusion Samples: 15 (explore domain orientations)
- Step Scale: 1.638 (diffusion step size optimization)
- Devices: 1 (GPU allocation)
- Accelerator: GPU (hardware acceleration)
- MSA Server: Enabled (evolutionary information integration)
Expected Results
- Structured domains (scFv) show high confidence
- Flexible linkers show lower confidence
- Overall structure suitable for downstream analysis
Stage 3: Structure Reporting and Visualization
Generated structures are automatically processed for analysis:
Automated Reporting
- Confidence score analysis and visualization
- Quality metrics and validation
Mol* Integration
- Interactive 3D structure viewer
- Export options for publication graphics
Stage 4: Comparative Studies and Benchmarking
Method Comparison
Understanding Boltz-2's performance vs other methods:
vs AlphaFold2
- Speed: Boltz-2 is significantly faster
- Accuracy: Comparable for most proteins
- Complexes: Better handling of multi-protein systems
- Flexibility: Enhanced conformational sampling
vs Experimental Methods
- Resolution: AI predictions ~2-4Å effective resolution
- Coverage: Can predict any sequence
- Speed: Minutes vs months/years
- Cost: Dramatically reduced
Conclusion
This tutorial demonstrates how AI-based structure prediction can be applied in biomedical research. By predicting the structures of the SARS-CoV-2 spike RBD and the FMC63 CAR-T construct, it shows how modern computational methods can accelerate drug discovery and deepen our understanding of biological mechanisms.
Key Takeaways
- AI prediction is mature for most protein structures
- Confidence metrics guide interpretation and downstream use
- Integration is seamless with other Chiral applications
Next Steps
- Explore molecular docking: Use predicted structures in DiffDock or AutoDock Vina
- Analyze complexes: Use structures in protein-protein docking workflows
- Screen compounds: Apply structures in virtual screening workflows
Resources and References
Key Publications
- Ille, A. M. et al. (2025). Human protein interactome structure prediction at scale with Boltz-2. bioRxiv. https://doi.org/10.1101/2025.07.03.663068
- Sehnal, D. et al. (2021). Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Research, 49(W1), W431-W437.
Structural Databases
- PDB: https://www.rcsb.org/ (experimental structures)
- UniProt: https://www.uniprot.org/ (sequence data)
This tutorial showcases how AI-powered structure prediction accelerates research on life-saving therapeutics, from COVID-19 vaccines to cancer immunotherapy.