Bioptic Explores the Frontier of Medicinal Chemistry with LLMs at the ASAP Discovery x OpenADMET Challenge
Gemini-powered ADME property prediction demonstrates that LLMs can compete with traditional models in real-world drug discovery tasks.
On March 25, 2025, Bioptic participated in the ASAP Discovery x OpenADMET Antiviral Challenge, a competitive benchmarking event that brought together the world’s best teams to tackle one of the hardest problems in early-stage drug development: predicting ADME (Absorption, Distribution, Metabolism, Excretion) properties of small molecules.
Our submission was unique — and bold. Instead of relying on the usual suspects like graph neural networks or gradient boosting machines, we asked a large language model to do medicinal chemistry. And it worked.
LLMs in Medicinal Chemistry: A Bold Experiment
Our experiment, led by Bioptic co-founder Vlad Vinogradov, involved fine-tuning Gemini 1.5 Flash (Google Cloud) using a very simple setup. For each query molecule, the model was provided with:
• The SMILES string of the molecule
• A few close analogs with known experimental ADME values
The model was then asked, with no elaborate prompts and no special tricks, to directly predict the floating-point value of the property (e.g., solubility or permeability).
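To make the setup concrete, here is a minimal sketch of how one training example could be assembled from a query molecule and its analogs. The text layout, the field names "input" and "output", the two-decimal rounding, and the helper names build_sft_example and parse_prediction are all assumptions made for illustration; the exact format used in the submission is in the linked repository, not reproduced here. The SMILES strings and values are made up.

```python
from typing import Dict, List, Optional, Tuple


def build_sft_example(
    query_smiles: str,
    analogs: List[Tuple[str, float]],
    property_name: str,
    target: Optional[float] = None,
    decimals: int = 2,
) -> Dict[str, str]:
    """Serialize a query molecule and its analogs into one text example.

    Hypothetical layout: one line per analog with its measured value,
    then the query SMILES with the value left open for the model to fill.
    """
    lines = [f"Property: {property_name}"]
    for smiles, value in analogs:
        lines.append(f"Analog: {smiles} -> {value:.{decimals}f}")
    lines.append(f"Query: {query_smiles} ->")
    example = {"input": "\n".join(lines)}
    if target is not None:
        # During fine-tuning the target is plain text, e.g. "0.73".
        example["output"] = f"{target:.{decimals}f}"
    return example


def parse_prediction(text: str) -> Optional[float]:
    """Turn the model's text completion back into a float, or None if it is not a number."""
    try:
        return float(text.strip())
    except ValueError:
        return None


# Illustrative values only, not experimental data.
example = build_sft_example(
    query_smiles="CCO",
    analogs=[("CCN", 1.5), ("CCC", 0.25)],
    property_name="LogD",
    target=0.731,
)
print(example["input"])
print(example["output"])
```

At inference time the same function would be called without a target, and the model's completion would be passed through parse_prediction to recover a numeric value.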
The training was not even framed as regression. It was standard Supervised Fine-Tuning (SFT) with cross-entropy loss on the tokenized numerical values. And yet, it worked surprisingly well.
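The difference from regression can be seen in a toy calculation. The sketch below assumes a character-level tokenization of the target number, which is a simplification (a production tokenizer such as Gemini's may split numbers differently), and the probability tables are invented for illustration. It computes the mean cross-entropy of a target string and shows that the loss only cares about how much probability lands on the correct token, not about how numerically close the wrong tokens are.

```python
import math
from typing import Dict, List


def token_cross_entropy(target: str, step_probs: List[Dict[str, float]]) -> float:
    """Mean negative log-likelihood of each target character.

    step_probs[i] is the (hypothetical) predicted distribution over
    characters at position i of the output string.
    """
    if len(target) != len(step_probs):
        raise ValueError("need exactly one distribution per target character")
    eps = 1e-12  # guards against log(0) when the correct token gets no mass
    nll = 0.0
    for ch, probs in zip(target, step_probs):
        nll -= math.log(max(probs.get(ch, 0.0), eps))
    return nll / len(target)


target = "1.35"

# Model A hesitates between "1.35" and the nearby value "1.25".
near_miss = [{"1": 0.9}, {".": 0.9}, {"3": 0.5, "2": 0.5}, {"5": 0.9}]
# Model B hesitates between "1.35" and the distant value "1.95".
far_miss = [{"1": 0.9}, {".": 0.9}, {"3": 0.5, "9": 0.5}, {"5": 0.9}]

loss_near = token_cross_entropy(target, near_miss)
loss_far = token_cross_entropy(target, far_miss)

# Both losses are identical: cross-entropy treats digits as unordered classes,
# whereas a squared-error regression loss would penalize 1.95 more than 1.25.
print(loss_near, loss_far)
```

Any sense of numerical closeness therefore has to be learned from the data rather than being built into the loss, which is what makes the reported results notable.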
Results: Competitive Performance with Minimal Tuning
Without ensembling and with only minimal external data (unlike many other teams), our LLM-only solution achieved:
• Top-3 prediction on MDR1-MDCKII permeability
• Shared Top-2 on KSOL solubility
These are Tier-1 ADME properties, essential for assessing drug-like behavior early in the pipeline. Competing directly with traditional, heavily tuned models — and performing at this level — is a major step forward.
🔗 View the public report https://lnkd.in/ebnZT67t
🔗 Explore the code https://github.com/Alicegaz/AgenticADMET
What This Means for Drug Discovery
The implications are huge. Imagine directly fitting LLMs with heterogeneous, per-assay ADME data and predicting readouts in a unified framework. No handcrafted features. No ensemble pipelines. Just one scalable foundation model for property prediction — turning AI model-building into a commodity and putting the focus where it belongs: on data quality and benchmarking.
Our model did not perform equally well across all five tasks, which we attribute to its minimalist setup, but the experiment supports a key point: LLMs are ready to take on medicinal chemistry.
Looking Ahead
At Bioptic, we’re integrating this approach into our broader AI drug discovery platform. From hit identification to PK profiling, LLMs will help streamline decisions and make model development dramatically faster and more flexible.
This is just the beginning. Stay with us as we continue to apply generative AI to every corner of the drug discovery process.