Abstract
We present BIOPTIC B1, an ultra-high-throughput ligand-based virtual screening system that evaluates multi-billion-compound libraries in minutes. In retrospective benchmarks, B1 performs on par with state-of-the-art machine learning methods; prospectively, it discovered multiple novel ligands for LRRK2 (including the G2019S mutant), the best with Kd = 110 nM. These results demonstrate fast hit identification and scaffold hopping across ultra-large chemical space.
Highlights
- Scale: 40B Enamine REAL Space compounds
- Cycle time: 134 predicted leads synthesized in 11 weeks (93% synthesis success rate)
- Results: 14 binders confirmed (KINOMEscan); best Kd = 110 nM (sub-µM)
- Expansion: 10 of 47 analogs active (21% hit rate)
- Novelty: ECFP4 Tanimoto similarity ≤ 0.4 to any BindingDB active
- Throughput & cost: CPU-only retrieval over 40B compounds in 2 min 15 s per query; estimated cost ~$5 per screen
Methods (one paragraph)
BIOPTIC B1 is a SMILES-based transformer (RoBERTa-style) pre-trained on ~160M molecules (PubChem + Enamine REAL) and fine-tuned on BindingDB to learn potency-aware embeddings. Each molecule is mapped to a 60-dimensional vector, and each query runs as a SIMD-optimized cosine-similarity search over pre-indexed libraries (the index is built once on GPU; searches then run on CPU). The LRRK2 campaign used diverse known inhibitors as queries, prioritized CNS-like chemistry and novelty, synthesized candidates via Enamine, and assayed binding with KINOMEscan (dose-response Kd).
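The retrieval step described above can be illustrated with a minimal NumPy sketch. This is not the BIOPTIC implementation: the function names (`normalize`, `top_k_cosine`), the random stand-in embeddings, and the brute-force matrix product are all assumptions made for illustration. Only the 60-dimensional embedding size and the use of cosine similarity come from the text.

```python
import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def top_k_cosine(query: np.ndarray, library: np.ndarray, k: int):
    """Return (indices, scores) of the k library rows most similar to the query.

    `library` is assumed to be row-normalized already (the "pre-indexed" step).
    """
    q = normalize(query.reshape(1, -1))[0]
    scores = library @ q                      # cosine similarity per library row
    k = min(k, scores.shape[0])
    idx = np.argpartition(-scores, k - 1)[:k]  # unordered top-k
    idx = idx[np.argsort(-scores[idx])]        # order best-first
    return idx, scores[idx]


# Hypothetical stand-in data: random 60-dimensional embeddings, not real molecules.
rng = np.random.default_rng(0)
library = normalize(rng.standard_normal((10_000, 60)).astype(np.float32))

# A query that is a slightly perturbed copy of library entry 123.
query = library[123] + 0.01 * rng.standard_normal(60).astype(np.float32)

idx, scores = top_k_cosine(query, library, k=5)
print(idx, scores)
```

Normalizing the library once up front means each query reduces to a single matrix-vector product, which is the kind of workload that vectorized (SIMD) CPU code handles well. At the 40B scale reported here, a production system would need sharding and a more compact index than this dense in-memory array.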
Parkinson’s case study: LRRK2 (incl. G2019S)
- Hit ID: 87 compounds tested → 4 with Kd ≤ 10 µM.
- Analog expansion: 47 compounds → 10 additional actives (21%).
- Top hits: three sub-µM binders; several show improved affinity on wild-type LRRK2.
- Outcome: rapid navigation to new chemical series ready for lead optimization.
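The hit rates quoted above follow from simple ratios. The short check below recomputes them, assuming that the 14 confirmed binders are the 4 primary hits plus the 10 analog actives, and that the 134 synthesized compounds are the 87 primary compounds plus the 47 analogs (the totals are consistent with this reading). The helper name `hit_rate` is illustrative.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested compounds that were confirmed as hits."""
    return hits / tested


primary = hit_rate(4, 87)             # initial hit identification round
expansion = hit_rate(10, 47)          # analog expansion round
overall = hit_rate(4 + 10, 87 + 47)   # both rounds combined (assumed additive)

print(f"primary:   {primary:.1%}")
print(f"expansion: {expansion:.1%}")
print(f"overall:   {overall:.1%}")
```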
Scientific rigor
- Competitive with Chemprop and other SOTA baselines across multiple targets (retrospective).
- Strict novelty and liability filters (REOS, PAINS; ≤0.4 Tanimoto to any BindingDB active).
- Full Supporting Information available for data, scripts, and protocols.
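The novelty criterion (Tanimoto ≤ 0.4 to any BindingDB active) can be sketched as follows. To stay self-contained, fingerprints are represented here as sets of on-bit indices. A real pipeline would compute ECFP4 fingerprints with a cheminformatics toolkit. The function names and the toy fingerprints are hypothetical and do not come from the paper's scripts.

```python
from typing import Iterable


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def is_novel(candidate: frozenset,
             known_actives: Iterable[frozenset],
             threshold: float = 0.4) -> bool:
    """True if the candidate is at most `threshold` similar to every known active."""
    return all(tanimoto(candidate, active) <= threshold for active in known_actives)


# Toy fingerprints (illustrative only, not real ECFP4 bits).
known_actives = [frozenset({1, 2, 3, 4, 5}), frozenset({10, 11, 12})]
close_analog = frozenset({1, 2, 3, 4, 6})       # shares 4 of 6 bits with the first active
new_scaffold = frozenset({1, 2, 20, 21, 22})    # shares 2 of 8 bits with the first active

print(is_novel(close_analog, known_actives))
print(is_novel(new_scaffold, known_actives))
```

The filter compares each candidate against every known active and rejects it as soon as one similarity exceeds the threshold, so a compound passes only if it is dissimilar to the whole reference set.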
Links & availability
- Open-access article (JCIM, Special Issue): https://pubs.acs.org/doi/10.1021/acs.jcim.5c00743
- Supporting Information: linked on the journal page (datasets, scripts, Table S1, protocols).
- Screen your target: https://pipeline.bioptic.io/
Citation
BIOPTIC B1 Ultra-High-Throughput Virtual Screening System Discovers LRRK2 Ligands in Vast Chemical Space. Journal of Chemical Information and Modeling (2025), Special Issue “Chemical Compound Space Exploration by Multiscale High-Throughput Screening and Machine Learning”. CC-BY-NC-ND 4.0.
Authors & acknowledgments
V. Vinogradov, K. T. Nguyen, S. Steshin, I. Izmailov, A. Doronichev.
We acknowledge collaborators and contributors as listed in the paper.
