AI scientists are becoming a new interface for scientific computing. These agents read papers, write code, generate hypotheses, call APIs, and inspect files. But science is not software engineering. No test suite turns green when a hypothesis is correct. Discovery stays iterative, uncertain, and grounded in the physical world.
That gap is what NVIDIA is targeting. NVIDIA published a hands-on walkthrough for its BioNeMo Agent Toolkit. The argument is direct. A general coding agent pointed at biology will not produce new medicines. In biomolecular research, an agent’s ceiling is set by the tools it can use reliably, correctly, and efficiently.
TL;DR
BioNeMo Agent Toolkit packages NVIDIA biomolecular models as documented, callable agent skills.
Skills span protein folding, docking, generative chemistry, genomics, and protein design.
NVIDIA reports task completion rising from 57.1% to 100% with skills.
Agents averaged 2x more passing assertions per 1,000 tokens.
Hosted NIM endpoints suit quick access; local NIM suits repeated iteration.
What is BioNeMo Agent Toolkit
The BioNeMo Agent Toolkit is an open-source repository of ‘skills’ for AI agents. Each skill turns an NVIDIA biomolecular model into a tool an agent can call. The toolkit packages protein folding, molecular docking, generative chemistry, genomics analysis, protein design, and biomarker discovery.
NVIDIA frames the platform in two parts. The first is an accelerated tool layer. NVIDIA NIM (NVIDIA Inference Microservices) and BioNeMo open models deliver core capabilities as callable services. These are accelerated by libraries such as cuEquivariance for structure models and Parabricks for genomics. The second part is agent-ready interfaces. BioNeMo Skills package each capability so an agent can use it.
A skill documents the model’s purpose, required inputs, optional parameters, expected artifacts, and failure modes. Model Context Protocol (MCP) server wrappers expose open models not yet packaged as NIM. Together, this lets an agent discover, select, invoke, and interpret biomolecular models on its own.
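The five things a skill documents map naturally onto a small record. The sketch below is an illustration only: `SkillSpec`, its field names, and the example values are hypothetical and are not taken from the toolkit's actual schema. It shows how an agent-side check could catch a missing required input before any model is invoked.

```python
from dataclasses import dataclass, field


@dataclass
class SkillSpec:
    """Hypothetical in-memory view of what a skill documents."""
    name: str
    purpose: str
    required_inputs: list[str]
    optional_parameters: dict[str, object] = field(default_factory=dict)
    expected_artifacts: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)

    def missing_inputs(self, provided: dict[str, object]) -> list[str]:
        """Return the required inputs that the caller did not supply."""
        return [key for key in self.required_inputs if key not in provided]


# Example values are illustrative, not copied from a real SKILL.md.
folding = SkillSpec(
    name="example-folding-skill",
    purpose="Predict a 3D structure from a protein sequence",
    required_inputs=["sequence", "endpoint"],
    optional_parameters={"num_samples": 1},
    expected_artifacts=["structure.cif"],
    failure_modes=["sequence contains non-amino-acid characters"],
)

print(folding.missing_inputs({"sequence": "MKTVRQERLKSIVR"}))
```

Keeping failure modes next to the inputs means the same record can drive both pre-flight checks and the caveats an agent reports afterwards.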
The repository groups skills into nim-skills, open-models-skills, and library-skills. A workflows folder holds multi-step meta-skills. One example is generative_protein_binder_design, which chains RFdiffusion → ProteinMPNN → OpenFold3.
How a BioNeMo Skill Works
Every skill is a directory with a SKILL.md file. It holds YAML frontmatter plus instructions, optional references, and optional scripts. An agent reads it like documentation, then acts on it.
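Because a SKILL.md file is YAML frontmatter followed by instructions, an agent runtime has to separate the two before it can use either. Here is a minimal sketch of that split, assuming the common convention of `---` delimiter lines and flat `key: value` pairs. The `name` and `description` fields and the sample text are hypothetical, and a real loader would use a full YAML parser.

```python
def split_skill_md(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter, body).

    Assumes frontmatter is delimited by '---' lines and holds only flat
    'key: value' pairs. Nested YAML is out of scope for this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter present
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("frontmatter opened with '---' but never closed")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical sample, not copied from the repository.
SAMPLE = """---
name: example-folding-skill
description: Fold a protein sequence and return a CIF file
---
# Instructions
Send the sequence to the configured endpoint.
"""

meta, body = split_skill_md(SAMPLE)
print(meta["name"])
print(body.splitlines()[0])
```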
The prompt pattern stays the same across models. NVIDIA's post uses OpenFold3, but the same shape applies to other NIMs for biology, including Boltz-2, DiffDock, GenMol, ProteinMPNN, MSA Search, RFdiffusion, and Evo 2. You name the skill, the input, and the endpoint.
# Hosted NIM endpoint
Use the OpenFold3 BioNeMo Skill to fold MKTVRQERLKSIVR
with the NVIDIA API endpoint at https://build.nvidia.com/openfold3
# Local NIM deployment
Use the OpenFold3 BioNeMo Skill to fold MKTVRQERLKSIVR
with the local NIM endpoint at http://localhost:8000
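The three-part pattern (skill, input, endpoint) is regular enough to template. The helper below is a sketch, not part of the toolkit: `build_prompt` is a hypothetical name, and the check against the 20 standard amino-acid letters is an added safeguard rather than something the post prescribes.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_prompt(skill: str, sequence: str, endpoint: str, local: bool = False) -> str:
    """Build a fold request in the same wording as the examples above."""
    sequence = sequence.strip().upper()
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if not sequence or invalid:
        raise ValueError(f"not a plain protein sequence, unexpected characters: {invalid}")
    kind = "local NIM endpoint" if local else "NVIDIA API endpoint"
    return f"Use the {skill} BioNeMo Skill to fold {sequence}\nwith the {kind} at {endpoint}"


print(build_prompt("OpenFold3", "MKTVRQERLKSIVR", "http://localhost:8000", local=True))
```

Rejecting malformed sequences before the prompt is sent saves a model call that would fail anyway.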
Installation pulls skills through the open-source skills CLI:
# Browse and pick a skill interactively
npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit
# Or install one skill for a specific agent
npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit --skill boltz2-nim --agent claude-code
Deployment is a choice, not a default. Use hosted NIM endpoints for fast access without managing infrastructure. Move selected models local when you need lower warm latency, data locality, or repeated iteration.
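That trade-off reduces to a small decision rule. This is a sketch under stated assumptions: the function name and the three boolean flags are hypothetical, and the two URLs are simply the ones used in the prompt examples earlier in this article.

```python
HOSTED_ENDPOINT = "https://build.nvidia.com/openfold3"  # from the hosted prompt example
LOCAL_ENDPOINT = "http://localhost:8000"                # from the local prompt example


def choose_endpoint(*, needs_low_warm_latency: bool,
                    needs_data_locality: bool,
                    iterates_repeatedly: bool) -> str:
    """Pick local NIM if any local-deployment reason applies, else hosted."""
    if needs_low_warm_latency or needs_data_locality or iterates_repeatedly:
        return LOCAL_ENDPOINT
    return HOSTED_ENDPOINT


print(choose_endpoint(needs_low_warm_latency=False,
                      needs_data_locality=True,
                      iterates_repeatedly=False))
```

The arguments are keyword-only with no defaults, so the caller has to state each requirement explicitly instead of inheriting a silent default.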
Benchmark
NVIDIA measured whether skills actually improve an agent’s loop. All reported metrics came from Codex CLI running GPT-5.5 fast. The team compared the same agent with and without each skill.
Task completion was the first metric. Without skills, the agent completed 57.1% of required tasks on average. With access to NIM skills, completion reached 100%.
Efficiency was the second metric. NVIDIA counted passing assertions, the individual steps that compose a task. With skills, an agent produced 2x more passing assertions per 1,000 tokens. That gain held across all ten NIM skills tested.
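Both metrics are simple ratios, which makes the comparison easy to reproduce on your own runs. The sketch below shows the arithmetic only. The function names and all run counts are made up for illustration and are not NVIDIA's raw data.

```python
def completion_rate(completed: int, required: int) -> float:
    """Percentage of required tasks the agent completed."""
    if required <= 0:
        raise ValueError("required must be positive")
    return 100.0 * completed / required


def assertions_per_1k_tokens(passing: int, tokens: int) -> float:
    """Passing assertions normalized per 1,000 tokens consumed."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return 1000.0 * passing / tokens


# Illustrative numbers only: same token budget, twice the passing assertions.
baseline = assertions_per_1k_tokens(passing=6, tokens=40_000)
with_skill = assertions_per_1k_tokens(passing=12, tokens=40_000)
print(f"efficiency ratio: {with_skill / baseline:.1f}x")
print(f"completion: {completion_rate(3, 5):.1f}%")
```

Normalizing by tokens matters because an agent without skills can spend many tokens guessing formats and still fail the assertion.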
Use Cases With Examples
Protein structure prediction: An agent folds a peptide sequence with Boltz-2 or OpenFold3. It returns a CIF file for downstream inspection.
Multiple sequence alignment: An agent generates an MSA with MMseqs2 through the MSA Search skill. The artifact is an A3M file.
Generative chemistry: An agent generates candidate molecules with GenMol. Outputs arrive as SDF or SMILES for filtering.
Protein binder design: The generative_protein_binder_design workflow chains three models. RFdiffusion builds a backbone, ProteinMPNN designs the sequence, and OpenFold3 validates the fold.
Each loop follows the same shape: The agent selects a model, prepares inputs, runs it, inspects outputs, and explains results with caveats.
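The binder-design workflow can be pictured as a chain of stages that pass artifacts forward. The sketch below is a structural illustration only: the three stage functions are stubs that return placeholder strings and do not call RFdiffusion, ProteinMPNN, or OpenFold3, and all function names and artifact keys are hypothetical.

```python
from typing import Callable

Artifacts = dict[str, str]
Stage = Callable[[Artifacts], Artifacts]


def design_backbone(a: Artifacts) -> Artifacts:
    # Stub for the RFdiffusion step: would produce a binder backbone.
    return {**a, "backbone": f"backbone-for-{a['target']}"}


def design_sequence(a: Artifacts) -> Artifacts:
    # Stub for the ProteinMPNN step: would design a sequence for the backbone.
    return {**a, "sequence": f"sequence-for-{a['backbone']}"}


def validate_fold(a: Artifacts) -> Artifacts:
    # Stub for the OpenFold3 step: would fold the sequence for validation.
    return {**a, "structure": f"structure-for-{a['sequence']}"}


def run_workflow(stages: list[Stage], inputs: Artifacts) -> tuple[Artifacts, list[str]]:
    """Run stages in order and record which artifacts each one added."""
    artifacts: Artifacts = dict(inputs)
    log: list[str] = []
    for stage in stages:
        before = set(artifacts)
        artifacts = stage(artifacts)
        added = sorted(set(artifacts) - before)
        log.append(f"{stage.__name__}: added {added}")
    return artifacts, log


result, log = run_workflow(
    [design_backbone, design_sequence, validate_fold],
    {"target": "example-target"},
)
print("\n".join(log))
```

Recording what each stage added gives the agent a trail to inspect and explain, which mirrors the inspect and explain steps of the loop.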
How It Compares: Agent With vs Without Skills
| Dimension | General agent (no skills) | Agent + BioNeMo Skills |
|---|---|---|
| Task completion | 57.1% average | 100% average |
| Token efficiency | Baseline | 2x passing assertions per 1k tokens |
| Model selection | Guesses tool, format, and inputs | Reads purpose, inputs, and artifacts |
| Deployment | Manual setup from source | Hosted or local NIM, documented |
| Failure handling | Unknown failure modes | Documented failure modes per skill |
| Workflows | Isolated single calls | Multi-step meta-skills (binder design) |
Getting Started
The prerequisites are minimal. You need an agent runtime such as Claude or Codex. You need an NVIDIA API key for hosted BioNeMo NIM endpoints. A GPU node is optional, for local NIM deployment.
Point the agent at the repository first. Let it enumerate the available capabilities before it acts. Then hand it a single skill to operate one model.
NVIDIA flags two cautions. The build.nvidia.com endpoints are for small-scale development and testing only. They are not production-grade inference. NVIDIA also stresses validation: check low-confidence structures and filter generated molecules before trusting them.
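The validation advice can be made mechanical with a confidence gate that routes weak results to review instead of passing them downstream. This is a minimal sketch, assuming each prediction carries a single confidence score on a 0 to 100 scale. The `Prediction` type, the field names, and the threshold of 70 are assumptions chosen for illustration, not values from NVIDIA's post, and the right cutoff depends on the model and the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical result record; real skills may report confidence differently."""
    name: str
    confidence: float  # assumed 0-100 scale, higher is better


def triage(predictions: list[Prediction],
           threshold: float = 70.0) -> tuple[list[Prediction], list[Prediction]]:
    """Split predictions into (accepted, needs_review) by confidence."""
    accepted = [p for p in predictions if p.confidence >= threshold]
    needs_review = [p for p in predictions if p.confidence < threshold]
    return accepted, needs_review


preds = [Prediction("design_a", 91.2), Prediction("design_b", 48.5)]
accepted, needs_review = triage(preds)
print([p.name for p in accepted], [p.name for p in needs_review])
```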
The post NVIDIA BioNeMo Agent Toolkit Turns Biomolecular Models Into Callable Skills for AI Agents in Drug Discovery appeared first on MarkTechPost.