3D Structure Prediction

This tutorial demonstrates how to use Boltz-2 for biomolecular structure prediction, using two therapeutically important examples: the SARS-CoV-2 spike receptor-binding domain (RBD), the target of COVID-19 vaccines, and the FMC63 antibody fragment used in CAR-T cancer immunotherapy.

Tutorial Overview

Learning Objectives

By completing this tutorial, you will:

Master modern AI-based structure prediction workflows
Understand confidence metrics and quality assessment
Analyze structural predictions with integrated visualization
Connect structure prediction to downstream applications

Tutorial Examples

COVID-19 Spike RBD

Viral receptor binding domain structure prediction

FMC63 CAR-T Antibody

Therapeutic antibody structure prediction

Prerequisites

Basic understanding of protein structure
Protein sequence in FASTA format
Access to the Chiral Potter platform

Stage 1: COVID-19 Spike RBD Structure Prediction

Setting Up the Prediction

Create New Project and Workflow

Project name: "COVID-19-Spike-RBD-Prediction"
Workflow name: "Structure prediction"

Prepare Sequence Input

Upload the Spike RBD sequence (UniProt P0DTC2, residues 331-524):

>A|protein
NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFT
NVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLF
RKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLH
APATV
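
Before uploading, it can help to confirm that the sequence really covers the stated residue range. The following is a minimal sketch, not part of the platform: the helper names check_sequence and to_fasta are hypothetical, and the only facts taken from the tutorial are the sequence itself, the 331-524 range, and the ">A|protein" header.

```python
# Sketch: sanity-check the RBD sequence and format it as a FASTA record.
# Helper names are hypothetical; the sequence is copied from the tutorial.

RBD_SEQUENCE = (
    "NITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFT"
    "NVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLF"
    "RKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLH"
    "APATV"
)

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(sequence, first_residue, last_residue):
    """Raise ValueError if the length or the alphabet looks wrong."""
    expected = last_residue - first_residue + 1
    if len(sequence) != expected:
        raise ValueError(f"expected {expected} residues, got {len(sequence)}")
    unknown = set(sequence) - STANDARD_AMINO_ACIDS
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")


def to_fasta(chain_id, sequence, width=63):
    """Format one protein chain as a '>CHAIN|protein' FASTA record."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{chain_id}|protein\n" + "\n".join(lines) + "\n"


check_sequence(RBD_SEQUENCE, 331, 524)  # 524 - 331 + 1 = 194 residues
fasta_text = to_fasta("A", RBD_SEQUENCE)
```

The length check catches the most common copy-and-paste errors (a dropped line or a stray character) before any GPU time is spent.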

Configure Boltz-2 Parameters

Output Format: PDB (standard protein structure format)
Recycling Steps: 5 (enhanced accuracy through iterative refinement)
Sampling Steps: 200 (diffusion model denoising iterations)
Diffusion Samples: 10 (explore conformational space diversity)
Step Scale: 1.638 (diffusion step size optimization)
Devices: 1 (GPU allocation)
Accelerator: GPU (hardware acceleration)
MSA Server: Enabled (evolutionary information integration)
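
For readers who also run the open-source Boltz command-line tool, the settings above correspond roughly to the invocation sketched below. This is an illustration under assumptions: how the Chiral Potter platform launches Boltz-2 internally is not documented here, the file and directory names are hypothetical, and the option names are those of the public boltz predict command.

```python
import shlex


def build_boltz_command(input_path, out_dir, *, recycling_steps=5,
                        sampling_steps=200, diffusion_samples=10,
                        step_scale=1.638, devices=1, accelerator="gpu",
                        output_format="pdb", use_msa_server=True):
    """Assemble a 'boltz predict' argument list from the tutorial settings."""
    command = [
        "boltz", "predict", str(input_path),
        "--out_dir", str(out_dir),
        "--recycling_steps", str(recycling_steps),
        "--sampling_steps", str(sampling_steps),
        "--diffusion_samples", str(diffusion_samples),
        "--step_scale", str(step_scale),
        "--devices", str(devices),
        "--accelerator", accelerator,
        "--output_format", output_format,
    ]
    if use_msa_server:
        # Ask Boltz to build the MSA through a remote MSA server.
        command.append("--use_msa_server")
    return command


# Hypothetical file names; the command is only printed, not executed.
command = build_boltz_command("spike_rbd.fasta", "predictions")
print(shlex.join(command))
```

Building the argument list in one place makes it easy to record exactly which settings produced a given prediction.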

Understanding Boltz-2 Configuration

Input Options

Sequence Input Formats

FASTA: Simple sequence input (recommended for single proteins)
YAML: Advanced format for complexes and constraints
MSA: Include evolutionary information directly
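
To make the difference between the FASTA and YAML inputs concrete, the sketch below writes the YAML equivalent of a single-chain FASTA record. It assumes the input schema of the open-source Boltz project (a version field and a list of sequences); the function name is hypothetical and the sequence is shortened for readability.

```python
def boltz_yaml_for_protein(chain_id, sequence):
    """Return a minimal Boltz-style YAML document for one protein chain."""
    return (
        "version: 1\n"
        "sequences:\n"
        "  - protein:\n"
        f"      id: {chain_id}\n"
        f"      sequence: {sequence}\n"
    )


# Only the first residues of the Spike RBD, to keep the example short.
yaml_text = boltz_yaml_for_protein("A", "NITNLCPFGEVFNATRF")
print(yaml_text)
```

For a single protein the two formats carry the same information; YAML becomes necessary once additional chains, ligands, or constraints are listed.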

MSA (Multiple Sequence Alignment)

Automatically generated from sequence databases
Provides evolutionary context
Improves prediction accuracy significantly

Quality Settings

Confidence Levels (Boltz-2 scale: 0-1)

pLDDT > 0.9: Very high confidence, near-atomic accuracy
pLDDT 0.7-0.9: Confident regions, reliable structure
pLDDT 0.5-0.7: Low confidence, interpret carefully
pLDDT < 0.5: Very low confidence, likely disordered
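
The bands above can be applied programmatically when triaging many residues or models. A minimal sketch follows; the function name and the band labels are illustrative, and treating exactly 0.9 as "confident" and exactly 0.7 or 0.5 as the upper band is a choice made here because the list does not specify the boundaries.

```python
def plddt_band(plddt):
    """Map a pLDDT value on the 0-1 scale to the tutorial's confidence band."""
    if not 0.0 <= plddt <= 1.0:
        raise ValueError("pLDDT is expected on the 0-1 scale")
    if plddt > 0.9:
        return "very high"
    if plddt >= 0.7:
        return "confident"
    if plddt >= 0.5:
        return "low"
    return "very low"


for value in (0.95, 0.82, 0.61, 0.30):
    print(value, plddt_band(value))
```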

Boltz-2 Overall Confidence Score

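The Boltz-2 confidence score is calculated as: 0.8 × overall (complex) pLDDT + 0.2 × interface pTM (ipTM). For single-chain predictions such as the two examples in this tutorial there is no interface, so pTM takes the place of ipTM.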
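
As a worked example of this weighting, the sketch below computes the score for made-up inputs. The function name is hypothetical and the numbers are illustrative, not results from an actual prediction.

```python
def boltz_confidence_score(overall_plddt, interface_ptm):
    """Weighted confidence: 0.8 x overall pLDDT + 0.2 x interface pTM."""
    return 0.8 * overall_plddt + 0.2 * interface_ptm


# Illustrative inputs: 0.8 * 0.90 = 0.72 and 0.2 * 0.75 = 0.15.
score = boltz_confidence_score(0.90, 0.75)
print(round(score, 2))
```

Because pLDDT carries four times the weight of the interface term, a model with excellent local geometry can still score well even when the interface term is mediocre, so it is worth inspecting both components separately.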

Recycling Steps

More steps generally improve accuracy at the cost of longer run time
3-5 steps typical for most proteins
7+ steps for challenging cases

Diffusion Sampling

Controls conformational exploration
Higher values explore more states
Useful for flexible proteins

Advanced Options

Advanced Parameters

Step Scale: Diffusion step size (default: 1.638); lower values increase diversity among samples
Devices: Number of GPUs to use
Accelerator: Hardware acceleration (GPU/CPU)
Checkpoint: Model checkpoint selection

Performance Optimization

Batch Size: Process multiple predictions
Memory Management: Automatic GPU memory handling
Parallel Processing: Multi-GPU support

Results Analysis

Quality Assessment

Review confidence scores (pLDDT values)
Identify high/low confidence regions
Compare with known structures if available
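
Identifying low-confidence regions amounts to finding contiguous runs of residues whose pLDDT falls below a chosen cut-off. The sketch below shows one way to do this, assuming the per-residue scores have already been extracted into a list on the 0-1 scale; the function name, the 0.7 threshold default, and the sample values are all illustrative.

```python
def low_confidence_segments(plddt_per_residue, threshold=0.7, first_residue=1):
    """Return (start, end) residue ranges whose pLDDT is below threshold."""
    segments = []
    start = None
    for offset, score in enumerate(plddt_per_residue):
        number = first_residue + offset
        if score < threshold:
            if start is None:
                start = number  # open a new low-confidence run
        elif start is not None:
            segments.append((start, number - 1))  # close the current run
            start = None
    if start is not None:
        # The sequence ended while still inside a low-confidence run.
        segments.append((start, first_residue + len(plddt_per_residue) - 1))
    return segments


# Synthetic scores for six residues, numbered from 331 like the Spike RBD.
example_scores = [0.92, 0.88, 0.65, 0.55, 0.81, 0.45]
print(low_confidence_segments(example_scores, first_residue=331))
```

Passing first_residue keeps the reported ranges in the numbering of the source protein, which simplifies comparison with known structures.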

Structural Visualization

Interactive 3D visualization with embedded Mol* viewer
Export options for further analysis

Stage 2: FMC63 CAR-T Antibody Prediction

Multi-domain constructs such as the FMC63 CAR-T receptor follow the same workflow with adjusted parameters.

Input Sequence

Upload the FMC63 CAR-T construct sequence, which contains multiple functional domains:

>A|protein
MLLLVTSLLLCELPHPAFLLIPDIQMTQTTSSLSASLGDRVTISCRASQDISKYLNWYQQKPD
GTVKLLIYHTSRLHSGVPSRFSGSGSGTDYSLTISNLEQEDIATYFCQQGNTLPYTFGGGTKL
EITGSTSGSGKPGSGEGSTKGEVKLQESGPGLVAPSQSLSVTCTVSGVSLPDYGVSWIRQPPR
KGLEWLGVIWGSETTYYNSALKSRLTIIKDNSKSQVFLKMNSLQTDDTAIYYCAKHYYYGGSY
AMDYWGQGTSVTVSSAAAIEVMYPPPYLDNEKSNGTIIHVKGKHLCPSPLFPGPSKPFWVLVV
VGGVLACYSLLVTVAFIIFWVRSKRSRLLHSDYMNMTPRRPGPTRKHYQPYAPPRDFAAYRSR
VKFSRSADAPAYQQGQNQLYNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNELQK
DKMAEAYSEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR

Adjusted Parameters

Output Format: PDB (standard protein structure format)
Recycling Steps: 7 (for complex multi-domain structure)
Sampling Steps: 200 (diffusion model denoising iterations)
Diffusion Samples: 15 (explore domain orientations)
Step Scale: 1.638 (diffusion step size optimization)
Devices: 1 (GPU allocation)
Accelerator: GPU (hardware acceleration)
MSA Server: Enabled (evolutionary information integration)
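
Only two settings differ from Stage 1. The sketch below makes the comparison explicit; the dictionary keys are hypothetical names chosen to mirror the parameter labels, and the values are those listed in the two stages.

```python
STAGE_1 = {
    "output_format": "pdb",
    "recycling_steps": 5,
    "sampling_steps": 200,
    "diffusion_samples": 10,
    "step_scale": 1.638,
    "devices": 1,
    "accelerator": "gpu",
    "use_msa_server": True,
}

# Stage 2 reuses the Stage 1 settings and overrides two of them.
STAGE_2 = {**STAGE_1, "recycling_steps": 7, "diffusion_samples": 15}


def changed_settings(before, after):
    """Return {name: (old, new)} for settings present in both that differ."""
    shared = sorted(before.keys() & after.keys())
    return {k: (before[k], after[k]) for k in shared if before[k] != after[k]}


changes = changed_settings(STAGE_1, STAGE_2)
print(changes)
```

Both changes increase run time: more recycling steps lengthen each sample, and more diffusion samples multiply the number of models generated.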

Expected Results

Structured domains (scFv) show high confidence
Flexible linkers show lower confidence
Overall structure suitable for downstream analysis

Stage 3: Structure Reporting and Visualization

Generated structures are automatically processed for analysis:

Automated Reporting
- Confidence score analysis and visualization
- Quality metrics and validation

Mol* Integration
- Interactive 3D structure viewer
- Export options for publication graphics

Stage 4: Comparative Studies and Benchmarking

Method Comparison

How Boltz-2 compares with other methods:

vs AlphaFold2

Speed: Boltz-2 is significantly faster
Accuracy: Comparable for most proteins
Complexes: Better handling of multi-protein systems
Flexibility: Enhanced conformational sampling

vs Experimental Methods

Resolution: AI predictions reach roughly 2-4 Å effective resolution
Coverage: A prediction can be generated for any sequence, although confidence varies
Speed: Minutes versus months or years
Cost: Dramatically reduced

Conclusion

This tutorial demonstrates how AI-based structure prediction supports biomedical research. By predicting the structures of two therapeutically important proteins, the SARS-CoV-2 Spike RBD and the FMC63 CAR-T construct, we have shown how modern computational methods can accelerate drug discovery and deepen our understanding of biological mechanisms.

Key Takeaways

AI prediction is mature for most protein structures
Confidence metrics guide interpretation and downstream use
Predicted structures integrate directly with other Chiral applications

Next Steps

Explore molecular docking: Use predicted structures in DiffDock or AutoDock Vina
Analyze complexes: Use structures in protein-protein docking workflows
Screen compounds: Apply structures in virtual screening workflows

Resources and References

Key Publications

Ille, A. M. et al. (2025). Human protein interactome structure prediction at scale with Boltz-2. bioRxiv. https://doi.org/10.1101/2025.07.03.663068
Sehnal, D. et al. (2021). Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Research, 49(W1), W431-W437.
GitHub repositories:
- Boltz: https://github.com/jwohlwend/boltz
- Mol*: https://github.com/molstar/molstar

Structural Databases

PDB: https://www.rcsb.org/ (experimental structures)
UniProt: https://www.uniprot.org/ (sequence data)

This tutorial showcases how AI-powered structure prediction accelerates research on life-saving therapeutics, from COVID-19 vaccines to cancer immunotherapy.