Foundation Models for Drug Discovery

MolForge AI Foundry

The first unified 3-pillar foundation model stack built for big-pharma discovery programs. Fine-tune MoLFormer-XL on your HTS data. Rank binding poses with Uni-Mol. Orchestrate everything with ChemCrow + Llama-3. On-prem, VPC-isolated, or MolForge managed.

Train Your Model · See Pharma Pricing

1.1B+ molecules in pre-training

12 target classes covered

Custom model delivery in < 2 weeks

Enterprise controls: SOC 2 · HIPAA

The 3-Pillar Architecture

Best-in-class, open-source, production-ready.

Each pillar is state-of-the-art on its task. Together they cover the three core modalities of drug discovery: text (SMILES), 3D geometry, and natural-language reasoning.

PILLAR 01

2D & ADMET Engine

Predict absorption, distribution, metabolism, excretion, and toxicity (ADMET), plus the key physicochemical properties.

IBM MoLFormer-XL

112M params · pre-trained on 1.1B molecules

All 32 ADMET endpoints (hERG, Caco-2, MDCK, CYP450 × 6, BBB, Ames, DILI…)
Fine-tunes on your proprietary HTS / lab-assay data
Quantum-property prediction (HOMO/LUMO, dipole, polarizability)
Sub-100ms inference at batch=64 on a single A100

Fine-tune MoLFormer →

PILLAR 02

Spatial & 3D Engine

Understand how a molecule folds, docks, and actually fits the target pocket in three dimensions.

Uni-Mol + EGNN

48M params · SE(3)-equivariant · conformer-aware

Pose ranking with binding-affinity head (ΔG in kcal/mol)
Conformer sampling with diffusion guidance
Protein-ligand interaction fingerprints (PLIFs)
Outperforms AutoDock Vina by 2.1× on DUD-E decoys
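
For readers interpreting the affinity head's output, a predicted ΔG in kcal/mol maps to a dissociation constant through the standard relation ΔG = RT ln(Kd). The sketch below is illustrative only: the function names and the 298.15 K default are assumptions, not part of the Uni-Mol or MolForge API.

```python
import math

# Gas constant expressed in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def kd_from_delta_g(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to a dissociation constant (mol/L).

    Uses delta_G = R * T * ln(Kd) with a 1 M standard state.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Inverse conversion: dissociation constant (mol/L) to free energy (kcal/mol)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)
```

Under these assumptions, a 1 µM binder corresponds to roughly -8.2 kcal/mol, and each further -1.36 kcal/mol is about a tenfold gain in affinity.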

Fine-tune Uni-Mol →

PILLAR 03

Generative Oracle

Natural-language drug-design agent. Orchestrates Pillars 1 & 2 via tool use. Trained with reinforcement learning from medicinal-chemistry feedback.

ChemCrow + Llama-3 70B

70B params · 18 tools · self-refining loop

Prompt: "Design a selective JAK2 inhibitor with hERG IC50 > 10 µM and no PAINS alerts"
Iteratively calls MoLFormer + Uni-Mol, refines until all constraints pass
Generates SAR tables, synthesis routes, patent-landscape warnings
Human-in-the-loop mode with Slack / Teams approvals
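
The refine-until-constraints-pass behaviour described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `propose`, `predict`, the property keys, and `max_rounds` are all hypothetical placeholders, not ChemCrow's or MolForge's actual interfaces.

```python
from typing import Callable, Dict, List, Optional

# A constraint inspects predicted properties and returns True when satisfied.
Constraint = Callable[[Dict[str, float]], bool]


def refine_until_valid(
    propose: Callable[[Optional[str], List[str]], str],
    predict: Callable[[str], Dict[str, float]],
    constraints: Dict[str, Constraint],
    max_rounds: int = 10,
) -> Optional[str]:
    """Propose candidates until every constraint passes or the round budget runs out.

    `propose` receives the previous candidate and the names of the constraints
    it failed, so the generator (hypothetically an LLM) can target its revision.
    """
    candidate: Optional[str] = None
    failed: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(candidate, failed)
        props = predict(candidate)
        failed = [name for name, check in constraints.items() if not check(props)]
        if not failed:
            return candidate
    return None  # no candidate satisfied all constraints within the budget
```

For the example prompt, the constraints might be expressed as `{"herg": lambda p: p["herg_ic50_um"] > 10, "pains": lambda p: p["pains_alerts"] == 0}`, where both property keys are again assumed names.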

Customize ChemCrow →

How They Compose

One prompt. Three models. A drug candidate.

ChemCrow generates

LLM proposes 256 candidates from a natural-language brief

MoLFormer filters

ADMET gate removes candidates with toxicity flags or Lipinski violations

Uni-Mol ranks

3D docking against the target pocket, PLIF scoring, top 10 returned
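
The filter and rank stages above can be sketched as follows. This is a rough illustration, not the Foundry implementation: the `Candidate` fields are hypothetical, descriptors are assumed to be precomputed upstream, and the score is assumed to be a predicted binding energy where lower is better. The Lipinski thresholds are the standard rule-of-five values.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    smiles: str
    mol_weight: float       # molecular weight in Da
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    tox_flags: int          # count of predicted toxicity alerts (hypothetical field)


def lipinski_violations(c: Candidate) -> int:
    """Count rule-of-five violations: MW > 500, logP > 5, HBD > 5, HBA > 10."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def admet_gate(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only candidates with no toxicity flags and no Lipinski violations."""
    return [c for c in candidates if c.tox_flags == 0 and lipinski_violations(c) == 0]


def rank_top_k(
    candidates: Iterable[Candidate],
    score: Callable[[Candidate], float],
    k: int = 10,
) -> List[Candidate]:
    """Sort ascending by score (assumed: lower is better) and return the best k."""
    return sorted(candidates, key=score)[:k]
```

A full run would chain the stages, for example `rank_top_k(admet_gate(proposals), score=predicted_delta_g)`, where `proposals` and `predicted_delta_g` stand in for the generator output and the docking model.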

Try It Live

ADMET prediction in the browser

Runs a small MoLFormer head on a real A100 in us-central1. First 5 inferences free.

Pharma-Grade Pricing

Priced like Schrödinger. Built like DeepMind.

Annual commitments. GPU time + managed MLOps + regulatory SLAs included. All plans ship with audit logs, SOC 2, HIPAA BAA, and SSO.

Starter

Discovery Seat

$250K / seat / yr

billed annually · min. 2 seats

Unlimited inference on all 3 pillars
100k molecules / month ADMET batch
10k docking poses / month (Uni-Mol)
1k ChemCrow agent sessions / month
3 fine-tuning runs on your data (≤ 50k mols)
Shared multi-tenant GPU pool
Business-hours support (5-day SLA)

Request Quote

Ready to turn your HTS library into a drug discovery engine?

Most pharma deployments go from first data upload to fine-tuned model in 9 days.

Start Training · Book a Demo

FAQ

Common questions from pharma CIOs and Heads of Discovery

Who owns the fine-tuned model weights?

You do. Every fine-tune you run on your own data produces a model whose IP you retain in full: we can't see it, reuse it, or sell it to another tenant. The contract includes IP indemnification up to contract value.

Can we run this on-prem / air-gapped?

Yes, on Enterprise. We ship a hardened OCI bundle that deploys into your GPU cluster (NVIDIA DGX, AWS Outposts, GCP Bare Metal, or customer data center). License is node-locked. Model weights never leave your environment.

How does this compare to Schrödinger / Atomwise / Insilico?

Schrödinger focuses on physics-based methods (FEP+). Atomwise is docking-only. Insilico is a closed black box. MolForge Foundry is the only stack that ships open foundation models you can fine-tune and own, with the same ADMET, docking, and generative coverage, at 40% of typical pharma contract TCO.

What GPUs are used? What's the SLA on fine-tuning?

The default tier runs on A100 80GB GPUs; Enterprise gets H100s. Fine-tuning SLA: 72 hours for fewer than 50k molecules, 10 days for fewer than 1M. Foundation-model pre-training is quoted case by case; delivery typically takes 6 weeks for a 10M-molecule corpus.

Do you sign BAAs / GxP validation docs?

BAAs: yes, on Program License and above. The GxP / 21 CFR Part 11 validation packet is included on Enterprise and available as a $180k one-time add-on on Program License.

Can we bring our own LLM for the agent pillar?

Yes. ChemCrow is model-agnostic. You can swap Llama-3 for Claude, GPT-4o, MolForge AI, or a privately fine-tuned base model. On Enterprise we help you set up the routing and a benchmark suite.
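
To illustrate what model-agnostic means in practice, the sketch below shows one way an agent could depend only on a narrow backend interface. Everything here is hypothetical: `ChatBackend`, `EchoBackend`, and `Agent` are illustrative names, not ChemCrow or MolForge classes, and a real backend would call an LLM API instead of echoing.

```python
from typing import Protocol


class ChatBackend(Protocol):
    """Hypothetical minimal interface the agent needs from any LLM."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend for local testing; it only echoes the prompt."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Agent:
    """Agent that talks to whichever backend it is given."""

    def __init__(self, backend: ChatBackend) -> None:
        self.backend = backend

    def ask(self, brief: str) -> str:
        return self.backend.complete(brief)
```

Swapping models then amounts to constructing the agent with a different backend object, which is also what makes side-by-side benchmarking straightforward.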