Foundation Models for Drug Discovery

MolForge AI Foundry

The first unified 3-pillar foundation model stack built for big-pharma discovery programs. Fine-tune MoLFormer-XL on your HTS data. Rank binding poses with Uni-Mol. Orchestrate everything with ChemCrow + Llama-3. On-prem, VPC-isolated, or MolForge managed.

1.1B+ Molecules in pre-train
12 Target classes covered
< 2 weeks Custom model delivery
SOC 2 · HIPAA Enterprise controls
The 3-Pillar Architecture

Best-in-class, open-source, production-ready.

Each pillar is state-of-the-art on its task. Together they cover every modality of drug discovery — text (SMILES), 3D geometry, and natural-language reasoning.

PILLAR 01

2D & ADMET Engine

Predict absorption, distribution, metabolism, excretion, toxicity — and every physical-chemistry property that matters.

IBM MoLFormer-XL
112M params · pre-trained on 1.1B molecules
  • All 32 ADMET endpoints (hERG, Caco-2, MDCK, CYP450 × 6, BBB, Ames, DILI…)
  • Fine-tunes on your proprietary HTS / lab-assay data
  • Quantum-property prediction (HOMO/LUMO, dipole, polarizability)
  • Sub-100ms inference at batch=64 on a single A100
Fine-tune MoLFormer →
PILLAR 02

Spatial & 3D Engine

Understand which 3D conformation a molecule adopts, how it docks, and how well it actually fits the target pocket.

Uni-Mol + EGNN
48M params · SE(3)-equivariant · conformer-aware
  • Pose ranking with binding-affinity head (ΔG in kcal/mol)
  • Conformer sampling with diffusion guidance
  • Protein-ligand interaction fingerprints (PLIFs)
  • Outperforms AutoDock Vina by 2.1× on DUD-E decoys
Fine-tune Uni-Mol →
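
The binding-affinity head above reports ΔG in kcal/mol. As a rough guide to reading that number, the standard thermodynamic relation ΔG = RT ln(Kd) converts it into a dissociation constant. The sketch below assumes 298.15 K and a 1 M standard state; the function name is illustrative and is not part of any MolForge or Uni-Mol API.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_to_kd(delta_g_kcal_mol: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) into a dissociation constant (mol/L).

    Uses delta_G = R * T * ln(Kd); a more negative delta_G means tighter binding.
    """
    return math.exp(delta_g_kcal_mol / (R_KCAL_PER_MOL_K * temperature_k))


# Illustrative input only, not a MolForge prediction.
kd = delta_g_to_kd(-10.0)
print(f"Kd ~ {kd * 1e9:.0f} nM")
```

At room temperature, each additional 1.36 kcal/mol of binding energy corresponds to roughly a tenfold tighter Kd, so a predicted ΔG of -10 kcal/mol lands in the tens-of-nanomolar range.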
PILLAR 03

Generative Oracle

Natural-language drug-design agent. Orchestrates Pillars 1 & 2 via tool-use. Reinforcement-learned from medicinal-chemistry feedback.

ChemCrow + Llama-3 70B
70B params · 18 tools · self-refining loop
  • Prompt: "Design a selective JAK2 inhibitor with hERG IC50 > 10 µM and no PAINS alerts"
  • Iteratively calls MoLFormer + Uni-Mol, refines until all constraints pass
  • Generates SAR tables, synthesis routes, patent-landscape warnings
  • Human-in-the-loop mode with Slack / Teams approvals
Customize ChemCrow →
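
The "refines until all constraints pass" behaviour can be pictured as a bounded propose, predict, check loop. The sketch below is an assumption about the general shape of such a loop, not the ChemCrow or MolForge implementation: `refine_until_pass`, the stub functions, the property keys, and every numeric value are hypothetical placeholders.

```python
from typing import Callable, Dict, List, Optional, Tuple

Props = Dict[str, float]

# Constraints from the example prompt, written as named pass/fail checks.
# The property keys are hypothetical.
CONSTRAINTS: Dict[str, Callable[[Props], bool]] = {
    "hERG IC50 > 10 uM": lambda p: p["herg_ic50_um"] > 10.0,
    "no PAINS alerts": lambda p: p["pains_alerts"] == 0,
}


def refine_until_pass(
    propose: Callable[[List[str]], str],
    predict: Callable[[str], Props],
    constraints: Dict[str, Callable[[Props], bool]],
    max_rounds: int = 5,
) -> Tuple[Optional[str], int]:
    """Ask for candidates until one satisfies every constraint or rounds run out."""
    failed: List[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = propose(failed)   # the agent sees which checks failed last time
        props = predict(candidate)    # stands in for the property/affinity models
        failed = [name for name, ok in constraints.items() if not ok(props)]
        if not failed:
            return candidate, round_no
    return None, max_rounds


# Stub models with made-up values, for illustration only.
_FAKE_PREDICTIONS: Dict[str, Props] = {
    "candidate-A": {"herg_ic50_um": 2.5, "pains_alerts": 0},
    "candidate-B": {"herg_ic50_um": 14.0, "pains_alerts": 1},
    "candidate-C": {"herg_ic50_um": 22.0, "pains_alerts": 0},
}
_queue = iter(_FAKE_PREDICTIONS)


def stub_propose(failed: List[str]) -> str:
    return next(_queue)


def stub_predict(candidate: str) -> Props:
    return _FAKE_PREDICTIONS[candidate]


accepted, rounds = refine_until_pass(stub_propose, stub_predict, CONSTRAINTS)
print(accepted, rounds)
```

Capping the loop with `max_rounds` keeps an agent session bounded even when no candidate satisfies the brief; a human-in-the-loop approval step would sit between the check and the return.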
How They Compose

One prompt. Three models. A drug candidate.

ChemCrow generates
LLM proposes 256 candidates from a natural-language brief
MoLFormer filters
ADMET gate removes candidates with toxicity flags or Lipinski violations
Uni-Mol ranks
3D docking against the target pocket, PLIF scoring, top 10 returned
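
The generate, filter, rank flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the property and affinity values are taken as already predicted, the Lipinski gate uses the standard rule-of-five thresholds with zero violations allowed, and every name and number here is a hypothetical placeholder rather than the Foundry API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float     # Da
    logp: float
    h_donors: int
    h_acceptors: int
    tox_flags: int        # count of toxicity alerts from the ADMET stage
    predicted_dg: float   # kcal/mol; more negative means tighter predicted binding


def lipinski_violations(c: Candidate) -> int:
    """Count rule-of-five violations (MW > 500, logP > 5, HBD > 5, HBA > 10)."""
    return sum([c.mol_weight > 500, c.logp > 5, c.h_donors > 5, c.h_acceptors > 10])


def admet_gate(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates with no toxicity flags and no Lipinski violations."""
    return [c for c in cands if c.tox_flags == 0 and lipinski_violations(c) == 0]


def rank_by_affinity(cands: List[Candidate], top_k: int = 10) -> List[Candidate]:
    """Sort by predicted binding free energy, best (most negative) first."""
    return sorted(cands, key=lambda c: c.predicted_dg)[:top_k]


def run_pipeline(cands: List[Candidate], top_k: int = 10) -> List[Candidate]:
    return rank_by_affinity(admet_gate(cands), top_k)


# Made-up candidates standing in for the generated set.
proposed = [
    Candidate("cand-1", 420.0, 3.1, 2, 6, 0, -9.2),
    Candidate("cand-2", 560.0, 4.0, 2, 7, 0, -11.0),  # fails: MW > 500
    Candidate("cand-3", 410.0, 2.9, 1, 5, 1, -10.5),  # fails: toxicity flag
    Candidate("cand-4", 380.0, 2.4, 1, 5, 0, -9.8),
]
shortlist = run_pipeline(proposed)
print([c.name for c in shortlist])
```

Filtering before ranking means the more expensive 3D scoring only runs on candidates that already clear the ADMET gate.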
Try It Live

ADMET prediction in the browser

Runs a small MoLFormer head on a real A100 in us-central1. First 5 inferences free.

Pharma-Grade Pricing

Priced like Schrödinger. Built like DeepMind.

Annual commitments. GPU time + managed MLOps + regulatory SLAs included. All plans ship with audit logs, SOC 2, HIPAA BAA, and SSO.

Starter
Discovery Seat
$250K / seat / yr
billed annually · min. 2 seats
  • Unlimited inference on all 3 pillars
  • 100k molecules / month ADMET batch
  • 10k docking poses / month (Uni-Mol)
  • 1k ChemCrow agent sessions / month
  • 3 fine-tuning runs on your data (≤ 50k mols)
  • Shared multi-tenant GPU pool
  • Business-hours support (5-day SLA)
Request Quote
Enterprise
Platform Partnership
$18M+ / yr
5-year minimum · scoped to portfolio
  • Unlimited programs, targets, seats
  • Custom foundation model on your data (≥ 10M mols)
  • 32× H100 reserved, dedicated region
  • Joint publications + co-developed IP
  • Milestone + royalty co-development option
  • Named MLOps team + quarterly onsite reviews
  • On-prem air-gapped deploy available
  • GxP / 21 CFR Part 11 compliant deploy
Executive Briefing
Overage & On-Demand Compute (à la carte)
  • $180 per 1,000 mols · MoLFormer ADMET batch
  • $420 per 1,000 poses · Uni-Mol docking & scoring
  • $28 per agent session · ChemCrow reasoning loop
  • $4,900 per fine-tune (≤ 10k mols) · MoLFormer / Uni-Mol tune
  • $89,000 per run · Foundation pre-train (≤ 1M mols)
  • Quote for ≥ 10M mols · Full foundation model build
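
As a worked example of how the metered rates add up, the sketch below estimates one month of overage. It assumes the Discovery Seat monthly allowances apply and that usage beyond them is billed pro rata at the listed rates; the page does not state the rounding rules, and the function and key names are illustrative.

```python
from typing import Dict, Tuple

# Monthly allowances from the Discovery Seat plan.
INCLUDED: Dict[str, int] = {
    "admet_mols": 100_000,
    "docking_poses": 10_000,
    "agent_sessions": 1_000,
}

# (price in USD, number of units that price covers), from the à la carte list.
RATES: Dict[str, Tuple[int, int]] = {
    "admet_mols": (180, 1_000),
    "docking_poses": (420, 1_000),
    "agent_sessions": (28, 1),
}


def monthly_overage_usd(usage: Dict[str, int]) -> float:
    """Sum pro-rata charges for usage above the included allowances (assumed billing rule)."""
    total = 0.0
    for item, used in usage.items():
        extra = max(0, used - INCLUDED[item])
        price, per_units = RATES[item]
        total += extra * price / per_units
    return total


# Hypothetical month: 250k ADMET molecules, 12k poses, 1k agent sessions.
example = {"admet_mols": 250_000, "docking_poses": 12_000, "agent_sessions": 1_000}
print(monthly_overage_usd(example))  # 150,000 mols -> $27,000; 2,000 poses -> $840
```

Under these assumptions, the hypothetical month above comes to $27,840 in overage on top of the seat fee.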

Ready to turn your HTS library into a drug discovery engine?

Most pharma deployments go from first data upload to a fine-tuned model in 9 days.

FAQ

Common questions from pharma CIOs and Heads of Discovery

Who owns the fine-tuned model weights?
You do. Every fine-tune you run on your own data produces a model whose IP you retain in full: we can't see it, reuse it, or sell it to another tenant. Contract guarantees include IP indemnification up to the contract value.
Can we run this on-prem / air-gapped?
Yes, on Enterprise. We ship a hardened OCI bundle that deploys into your GPU cluster (NVIDIA DGX, AWS Outposts, GCP Bare Metal, or customer data center). License is node-locked. Model weights never leave your environment.
How does this compare to Schrödinger / Atomwise / Insilico?
Schrödinger focuses on physics-based methods (FEP+). Atomwise is docking-only. Insilico is a closed black box. MolForge Foundry is the only stack that ships open foundation models you can fine-tune and own, with the same ADMET, docking, and generative coverage, at 40% of typical pharma contract TCO.
What GPUs are used? What's the SLA on fine-tuning?
Default tier runs A100 80GB; Enterprise gets H100s. Fine-tuning SLA: 72 hours for < 50k molecules, 10 days for < 1M molecules. Foundation pre-training is quoted case by case; typical delivery is 6 weeks for 10M-molecule corpora.
Do you sign BAAs / GxP validation docs?
BAAs: yes, from Program License up. GxP / 21 CFR Part 11 validation packet: included on Enterprise; available as an add-on ($180k one-time) on Program License.
Can we bring our own LLM for the agent pillar?
Yes. ChemCrow is model-agnostic. You can swap Llama-3 for Claude, GPT-4o, MolForge AI, or a private fine-tuned base. On Enterprise we help you set up the routing + benchmark suite.