This post is co-authored with the Biomni team from Stanford.
Biomedical researchers spend roughly 90% of their time manually processing large volumes of scattered information. This is evidenced by Genentech’s challenge of processing 38 million biomedical publications in PubMed, public repositories like the Human Protein Atlas, and their internal repository of hundreds of millions of cells across hundreds of diseases. There is a rapid proliferation of specialized databases and analytical tools across different modalities, including genomics, proteomics, and pathology. Researchers must stay current with this large landscape of tools, leaving less time for the hypothesis-driven work that drives breakthrough discoveries.
AI agents powered by foundation models offer a promising solution by autonomously planning, executing, and adapting complex research tasks. Stanford researchers built Biomni, which exemplifies this potential. Biomni is a general-purpose biomedical AI agent that integrates 150 specialized tools, 105 software packages, and 59 databases to execute sophisticated analyses such as gene prioritization, drug repurposing, and rare disease diagnosis.
However, deploying such agents in production requires robust infrastructure capable of handling computationally intensive workflows and multiple concurrent users while maintaining security and performance standards. Amazon Bedrock AgentCore is a set of comprehensive services to deploy and operate highly capable agents using any framework or model, with enterprise-grade security and scalability.
In this post, we show you how to implement a research agent using AgentCore with access to over 30 specialized biomedical database tools from Biomni, thereby accelerating scientific discovery while maintaining enterprise-grade security and production scale. The code for this solution is available in the open-source toolkit repository of starter agents for life sciences on Amazon Web Services (AWS). The step-by-step instructions help you deploy your own tools and infrastructure, including AgentCore components and examples.
Prototype-to-production complexity gap
Moving from a local biomedical research prototype to a production system accessible by multiple research teams requires addressing complex infrastructure challenges.
Agent deployment with enterprise security
Enterprise security challenges include OAuth-based authentication, secure tool sharing through scalable gateways, comprehensive observability for research audit trails, and automatic scaling to handle concurrent research workloads. Many promising prototypes fail to reach production because of the complexity of implementing these enterprise-grade requirements while maintaining the specialized domain expertise needed for accurate biomedical analysis.
Session-aware research context management
Biomedical research workflows often span multiple conversations and require persistent memory of previous analyses, experimental parameters, and research preferences across extended research sessions. Research agents must maintain contextual awareness of ongoing projects and remember specific protein targets, experimental conditions, and analytical preferences. All of this must be done while ensuring proper session isolation between different researchers and research projects in a multi-tenant production environment.
Scalable tool gateway
Implementing a reusable tool gateway that can handle concurrent requests from research agents, proper authentication, and consistent performance becomes critical at scale. The gateway must enable agents to discover and use tools through secure endpoints, help agents find the right tools through contextual search capabilities, and manage both inbound authentication (verifying agent identity) and outbound authentication (connecting to external biomedical databases) in a unified service. Without this architecture, research teams face authentication complexity and reliability issues that prevent effective scaling.
Solution overview
We use Strands Agents, an open source agent framework, to build a research agent with a local tool implementation for PubMed biomedical literature search. We extended the agent’s capabilities by integrating Biomni database tools, providing access to over 30 specialized biomedical databases.
The overall architecture is shown in the following diagram.

The AgentCore Gateway service centralizes the Biomni database tools as secure, reusable endpoints with semantic search capabilities. The AgentCore Memory service maintains contextual awareness across research sessions using specialized strategies for research context. Security is handled by the AgentCore Identity service, which manages authentication for both users and tool access control. Deployment is streamlined with the AgentCore Runtime service, providing scalable, managed deployment with session isolation. Finally, the AgentCore Observability service enables comprehensive monitoring and auditing of research workflows, which is critical for scientific reproducibility.
Step 1 – Creating tools such as the Biomni database tools using AgentCore Gateway
In real-world use cases, we need to connect agents to different data sources. Each agent might duplicate the same tools, leading to extensive code, inconsistent behavior, and maintenance nightmares. The AgentCore Gateway service streamlines this process by centralizing tools into reusable, secure endpoints that agents can access. Combined with the AgentCore Identity service for authentication, AgentCore Gateway creates an enterprise-grade tool sharing infrastructure. To give more context to the agent with reusable tools, we provided access to over 30 specialized public database APIs through the Biomni tools registered on the gateway. The gateway exposes Biomni’s database tools through the Model Context Protocol (MCP), allowing the research agent to discover and invoke these tools alongside local tools like PubMed. It handles authentication, rate limiting, and error handling, providing a seamless research experience.
We use an AWS Lambda function to host the Biomni integration code. The Lambda function is automatically configured as an MCP target in the AgentCore Gateway. The Lambda function exposes its available tools through the API specification (api_spec.json).
# Gateway target configuration
lambda_target_config = {
    "mcp": {
        "lambda": {
            "lambdaArn": get_ssm_parameter("/app/researchapp/agentcore/lambda_arn"),
            "toolSchema": {"inlinePayload": api_spec},
        }
    }
}

# Create the target
create_target_response = gateway_client.create_gateway_target(
    gatewayIdentifier=gateway_id,
    name="LambdaUsingSDK",
    description="Lambda Target using SDK",
    targetConfiguration=lambda_target_config,
    credentialProviderConfigurations=[{
        "credentialProviderType": "GATEWAY_IAM_ROLE"
    }],
)
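On the Lambda side, the handler receives the tool arguments as the event and must determine which registered tool was requested. The following is a minimal sketch of such a handler, not the repository's implementation; the bedrockAgentCoreToolName client-context key, the target-name prefix convention, and the query_uniprot helper are assumptions used for illustration.
import json
from urllib import parse, request

def query_uniprot(params):
    # Illustrative helper: search the UniProt REST API for a term supplied by the agent
    url = (
        "https://rest.uniprot.org/uniprotkb/search?format=json&size=5&query="
        + parse.quote(params.get("query", ""))
    )
    with request.urlopen(url) as resp:
        return json.loads(resp.read())

# Map tool names declared in api_spec.json to Python callables
TOOL_DISPATCH = {"query_uniprot": query_uniprot}

def lambda_handler(event, context):
    # Assumption: the gateway passes the resolved tool name through the Lambda client
    # context, prefixed with the target name (for example "LambdaUsingSDK___query_uniprot")
    tool_name = context.client_context.custom["bedrockAgentCoreToolName"]
    short_name = tool_name.split("___")[-1]
    handler = TOOL_DISPATCH.get(short_name)
    if handler is None:
        return {"error": f"Unknown tool: {short_name}"}
    # The event carries the arguments defined in the tool's input schema
    return handler(event)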
The full list of Biomni database tools included on the gateway is shown in the following table:
| Group | Tool | Description |
| --- | --- | --- |
| Protein and structure databases | UniProt | Query the UniProt REST API for comprehensive protein sequence and functional information |
| | AlphaFold | Query the AlphaFold Database API for AI-predicted protein structure predictions |
| | InterPro | Query the InterPro REST API for protein domains, families, and functional sites |
| | PDB (Protein Data Bank) | Query the RCSB PDB database for experimentally determined protein structures |
| | STRING | Query the STRING protein interaction database for protein-protein interaction networks |
| | EMDB (Electron Microscopy Data Bank) | Query for 3D macromolecular structures determined by electron microscopy |
| Genomics and variants | ClinVar | Query NCBI's ClinVar database for clinically relevant genetic variants and their interpretations |
| | dbSNP | Query the NCBI dbSNP database for single nucleotide polymorphisms and genetic variations |
| | gnomAD | Query gnomAD for population-scale genetic variant frequencies and annotations |
| | Ensembl | Query the Ensembl REST API for genome annotations, gene information, and comparative genomics |
| | UCSC Genome Browser | Query the UCSC Genome Browser API for genomic data and annotations |
| Expression and omics | GEO (Gene Expression Omnibus) | Query NCBI's GEO for RNA-seq, microarray, and other gene expression datasets |
| | PRIDE | Query the PRIDE database for proteomics identifications and mass spectrometry data |
| | Reactome | Query the Reactome database for biological pathways and molecular interactions |
| Clinical and drug data | cBioPortal | Query the cBioPortal REST API for cancer genomics data and clinical information |
| | ClinicalTrials.gov | Query the ClinicalTrials.gov API for information about clinical studies and trials |
| | OpenFDA | Query the OpenFDA API for FDA drug, device, and food safety data |
| | GtoPdb (Guide to PHARMACOLOGY) | Query the Guide to PHARMACOLOGY database for drug targets and pharmacological data |
| Disease and phenotype | OpenTargets | Query the OpenTargets Platform API for disease-target associations and drug discovery data |
| | Monarch Initiative | Query the Monarch Initiative API for phenotype and disease information across species |
| | GWAS Catalog | Query the GWAS Catalog API for genome-wide association study results |
| | RegulomeDB | Query the RegulomeDB database for regulatory variant annotations and functional predictions |
| Specialized databases | JASPAR | Query the JASPAR REST API for transcription factor binding site profiles and motifs |
| | WoRMS (World Register of Marine Species) | Query the WoRMS REST API for marine species taxonomic information |
| | Paleobiology Database (PBDB) | Query the PBDB API for fossil occurrence and taxonomic data |
| | MPD (Mouse Phenome Database) | Query the Mouse Phenome Database for mouse strain phenotype data |
| | Synapse | Query the Synapse REST API for biomedical datasets and collaborative research data |
The following are examples of how individual tools get triggered through MCP from our test suite:
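The repository's test scripts handle this wiring for you; conceptually, an MCP client connects to the gateway endpoint with a bearer token, lists the available tools, and invokes one. The following is a minimal sketch using the MCP Python SDK, where the gateway URL, token, and tool name are placeholders rather than values from this solution.
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

GATEWAY_URL = "https://<gateway-id>.gateway.bedrock-agentcore.<region>.amazonaws.com/mcp"  # placeholder
BEARER_TOKEN = "<access-token-from-agentcore-identity>"  # placeholder

async def main():
    headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}
    async with streamablehttp_client(GATEWAY_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover every tool registered on the gateway, including the Biomni Lambda target
            tools = await session.list_tools()
            print([t.name for t in tools.tools])
            # Invoke a single Biomni database tool through MCP (tool name is illustrative)
            result = await session.call_tool(
                "LambdaUsingSDK___query_uniprot",
                arguments={"query": "human insulin"},
            )
            print(result.content)

asyncio.run(main())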
As the tool collection grows, the agent can use built-in semantic search capabilities to discover and select tools based on the task context. This improves agent performance and reduces development complexity at scale. For example, the user asks, “tell me about HER2 variant rs1136201.” Instead of listing all 30 or more tools from the gateway back to the agent, semantic search returns the ‘n’ most relevant tools, such as Ensembl, GWAS Catalog, ClinVar, and dbSNP. The agent then uses this smaller subset of tools as input to the model to return a more efficient and faster response.
The following graphic illustrates using AgentCore Gateway for tool search.

You can now test your deployed AgentCore gateway using the following test scripts and compare how semantic search narrows down the list of relevant tools based on the search query.
uv run tests/test_gateway.py --prompt "What tools are available?"
uv run tests/test_gateway.py --prompt "Find information about human insulin protein" --use-search
Step 2 – Strands research agent with a local tool
The following code snippet shows model initialization and the PubMed local tool that is declared using the Strands @tool decorator. We implemented the PubMed tool in research_tools.py, which calls PubMed APIs to enable biomedical literature search capabilities within the agent's execution context.
class ResearchAgent:
    def __init__(
        self,
        bearer_token: str,
        memory_hook: MemoryHook = None,
        session_manager: AgentCoreMemorySessionManager = None,
        bedrock_model_id: str = "us.anthropic.claude-sonnet-4-20250514-v1:0",
        # bedrock_model_id: str = "openai.gpt-oss-120b-1:0",  # Alternative
        system_prompt: str = None,
        tools: List[callable] = None,
    ):
        self.model_id = bedrock_model_id
        # For Anthropic Sonnet 4 interleaved thinking
        self.model = BedrockModel(
            model_id=self.model_id,
            additional_request_fields={
                "anthropic_beta": ["interleaved-thinking-2025-05-14"],
                "thinking": {"type": "enabled", "budget_tokens": 8000},
            },
        )
        self.system_prompt = (
            system_prompt
            if system_prompt
            else """
You are a **Comprehensive Biomedical Research Agent** specialized in conducting
systematic literature reviews and multi-database analyses to answer complex biomedical research
questions. Your primary mission is to synthesize evidence from both published literature
(PubMed) and real-time database queries to provide comprehensive, evidence-based insights for
pharmaceutical research, drug discovery, and clinical decision-making.
Your core capabilities include literature analysis and **extracting data from 30+ specialized
biomedical databases** through the Biomni gateway, enabling comprehensive data analysis. The
database tool categories include genomics and genetics, protein structure and function, pathways
and systems biology, clinical and pharmacological data, expression and omics data, and other
specialized databases.
"""
        )
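For reference, a local tool of this kind can be as simple as a decorated function that calls NCBI's public E-utilities endpoints. The following is a minimal sketch in the spirit of the PubMed tool in research_tools.py; the actual repository implementation, its parameters, and its error handling may differ.
import json
from urllib import parse, request

from strands import tool

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

@tool
def pubmed_search(query: str, max_results: int = 5) -> str:
    """Search PubMed for biomedical literature and return matching article summaries."""
    # Step 1: esearch returns the PubMed IDs that match the query
    search_url = f"{EUTILS}/esearch.fcgi?db=pubmed&retmode=json&retmax={max_results}&term=" + parse.quote(query)
    with request.urlopen(search_url) as resp:
        ids = json.loads(resp.read())["esearchresult"]["idlist"]
    if not ids:
        return "No PubMed results found."
    # Step 2: esummary returns title, journal, and date metadata for those IDs
    summary_url = f"{EUTILS}/esummary.fcgi?db=pubmed&retmode=json&id=" + ",".join(ids)
    with request.urlopen(summary_url) as resp:
        summaries = json.loads(resp.read())["result"]
    lines = []
    for pmid in ids:
        doc = summaries.get(pmid, {})
        lines.append(f"PMID {pmid}: {doc.get('title', '')} ({doc.get('source', '')}, {doc.get('pubdate', '')})")
    return "\n".join(lines)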
In addition, we implemented citations that use a structured system prompt to enforce numbered in-text citations [1], [2], [3] with standardized reference formats for both academic literature and database queries, making sure each data source is properly attributed. This allows researchers to quickly access and reference the scientific literature that supports their biomedical research queries and findings.
"""
- ALWAYS use numbered in-text citations [1], [2], [3], and many others. when referencing any knowledge supply
- Present a numbered "References" part on the finish with full supply particulars
- For educational literature: format as "1. Creator et al. Title. Journal. 12 months. ID: [PMID/DOI], obtainable at: [URL]"
- For database sources: format as "1. Database Title (Device: tool_name), Question: [query_description], Retrieved: [current_date]"
- Use numbered in-text citations all through your response to assist all claims and knowledge factors
- Every software question and every literature supply have to be cited with its personal distinctive reference quantity
- When instruments return tutorial papers, cite them utilizing the tutorial format with full bibliographic particulars
- Construction: Format every reference on a separate line with correct numbering - NO bullet factors
- Current the References part as a clear numbered checklist, not a complicated paragraph
- Keep sequential numbering throughout all reference sorts in a single "References" part
"""
You can now test your agent locally:
uv run tests/test_agent_locally.py --prompt "Find information about human insulin protein"
uv run tests/test_agent_locally.py --prompt "Find information about human insulin protein" --use-search
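To combine the gateway's Biomni tools with the local PubMed tool when running the agent, the Strands MCP client can wrap the gateway endpoint. The following is a minimal sketch, assuming the Strands MCPClient helper; the gateway URL and bearer token are placeholders, and pubmed_search refers to the local tool sketched earlier, not necessarily how the repository wires things together.
from mcp.client.streamable_http import streamablehttp_client
from strands import Agent
from strands.tools.mcp import MCPClient

GATEWAY_URL = "https://<gateway-id>.gateway.bedrock-agentcore.<region>.amazonaws.com/mcp"  # placeholder
BEARER_TOKEN = "<access-token-from-agentcore-identity>"  # placeholder

# Wrap the gateway's MCP endpoint so its tools can be handed to a Strands agent
gateway_client = MCPClient(
    lambda: streamablehttp_client(GATEWAY_URL, headers={"Authorization": f"Bearer {BEARER_TOKEN}"})
)

with gateway_client:
    gateway_tools = gateway_client.list_tools_sync()
    # Combine the local PubMed tool with every tool discovered on the gateway
    agent = Agent(tools=[pubmed_search] + gateway_tools)
    agent("Find information about human insulin protein")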
Step 3 – Add Persistent Memory for contextual research assistance
The research agent implements the AgentCore Memory service with three strategies: semantic for factual research context, user_preference for research methodologies, and summary for session continuity. The AgentCore Memory session manager is integrated with Strands session management; it retrieves relevant context before queries and saves interactions after responses. This enables the agent to remember research preferences, ongoing projects, and domain expertise across sessions without manual context re-establishment.
# Test memory functionality with research conversations
python tests/test_memory.py load-conversation
python tests/test_memory.py load-prompt "My preferred response format is detailed explanations"
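For reference, the following is a minimal sketch of how the three strategies might be provisioned and queried with the MemoryClient from the bedrock-agentcore Python SDK; the strategy payloads, namespaces, actor and session IDs, and region are assumptions for illustration and may differ from the repository's configuration.
from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-east-1")  # region is an assumption

# Create a memory resource with the three strategies described above
memory = client.create_memory_and_wait(
    name="ResearchAgentMemory",
    strategies=[
        {"semanticMemoryStrategy": {"name": "ResearchFacts", "namespaces": ["research/{actorId}/facts"]}},
        {"userPreferenceMemoryStrategy": {"name": "ResearchPreferences", "namespaces": ["research/{actorId}/preferences"]}},
        {"summaryMemoryStrategy": {"name": "SessionSummaries", "namespaces": ["research/{actorId}/{sessionId}/summary"]}},
    ],
)

# Store a conversational turn, then retrieve relevant context for a later query
client.create_event(
    memory_id=memory["id"],
    actor_id="researcher-1",
    session_id="session-1",
    messages=[("My preferred response format is detailed explanations", "USER")],
)
memories = client.retrieve_memories(
    memory_id=memory["id"],
    namespace="research/researcher-1/preferences",
    query="response format preference",
)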
Step 4 – Deploy with AgentCore Runtime
To deploy our agent, we use AgentCore Runtime to configure and launch the research agent as a managed service. The deployment process configures the runtime with the agent's main entrypoint (agent/main.py), assigns an IAM execution role for AWS service access, and supports both OAuth and IAM authentication modes. After deployment, the runtime becomes a scalable, serverless agent that can be invoked using API calls. The agent automatically handles session management, memory persistence, and tool orchestration while providing secure access to the Biomni gateway and local research tools.
agentcore configure --entrypoint agent/main.py -er arn:aws:iam::<Account-Id>:role/<Role> --name researchapp<AgentName>
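After configuration, you can launch the runtime and send a test invocation from the same CLI. The following commands are a sketch; the payload shape assumes the agent entrypoint reads a prompt field.
agentcore launch
agentcore invoke '{"prompt": "Find information about human insulin protein"}'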
For more information about deploying with AgentCore Runtime, see Get started with AgentCore Runtime in the Amazon Bedrock AgentCore Developer Guide.
Agents in action
The following are three representative research scenarios that showcase the agent's capabilities across different domains: drug mechanism analysis, genetic variant investigation, and pathway exploration. For each query, the agent autonomously determines which combination of tools to use, formulates appropriate sub-queries, analyzes the returned data, and synthesizes a comprehensive research report with proper citations. The accompanying demo video shows the complete agent workflow, including tool selection, reasoning, and response generation.
- Conduct a comprehensive analysis of trastuzumab (Herceptin) mechanism of action and resistance mechanisms. You’ll need:
- HER2 protein structure and binding sites
- Downstream signaling pathways affected
- Known resistance mechanisms from clinical data
- Current clinical trials investigating combination therapies
- Biomarkers for treatment response prediction
Query relevant databases to provide a comprehensive research report.
- Analyze the clinical significance of BRCA1 variants in breast cancer risk and treatment response. Investigate:
- Population frequencies of pathogenic BRCA1 variants
- Clinical significance and pathogenicity classifications
- Associated cancer risks and penetrance estimates
- Treatment implications (PARP inhibitors, platinum agents)
- Current clinical trials for BRCA1-positive patients
Use multiple databases to provide comprehensive evidence.
The following video is a demonstration of the biomedical research agent:
Scalability and observability
One of the most critical challenges in deploying sophisticated AI agents is making sure they scale reliably while maintaining comprehensive visibility into their operations. Biomedical research workflows are inherently unpredictable: a single genomic analysis might process thousands of files, while a literature review could span millions of publications. Traditional infrastructure struggles with these dynamic workloads, particularly when handling sensitive research data that requires strict isolation between different research projects. In this deployment, we use Amazon Bedrock AgentCore Observability to visualize each step in the agent workflow. You can use this service to inspect an agent's execution path, audit intermediate outputs, and debug performance bottlenecks and failures. For biomedical research, this level of transparency is not just helpful; it is essential for regulatory compliance and scientific reproducibility.
Sessions, traces, and spans form a three-tiered hierarchy in the observability framework. A session contains multiple traces, with each trace representing a discrete interaction within the broader context of the session. Each trace contains multiple spans that capture fine-grained operations. The following screenshot shows the usage of one agent: number of sessions, token usage, and error rate in production.

The following screenshot shows the agents in production and their usage (number of sessions and number of invocations).

The built-in dashboards surface performance bottlenecks and identify why certain interactions might fail, enabling continuous improvement and reducing the mean time to detect (MTTD) and mean time to repair (MTTR). For biomedical applications, where failed analyses can delay critical research timelines, this rapid issue resolution capability makes sure that research momentum is maintained.
Future directions
While this implementation focuses on only a subset of tools, the AgentCore Gateway architecture is designed for extensibility. Research teams can seamlessly add new tools without requiring code changes by using MCP. Newly registered tools are automatically discoverable by agents, allowing your research infrastructure to evolve alongside rapidly changing tool sets.
For computational analysis that requires code execution, the AgentCore Code Interpreter service can be integrated into the research workflow. With AgentCore Code Interpreter, the research agent can retrieve data and execute Python-based analysis using domain-specific libraries like BioPython, scikit-learn, or custom genomics packages.
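As an illustration, the following is a minimal sketch using the code interpreter client from the bedrock-agentcore Python SDK; the import path, method names, and response shape are assumptions to verify against the SDK documentation.
from bedrock_agentcore.tools.code_interpreter_client import code_session

# Run a short Python analysis inside the managed sandbox (region is an assumption)
with code_session("us-east-1") as interpreter:
    response = interpreter.invoke("executeCode", {
        "language": "python",
        "code": "from collections import Counter\nprint(Counter('MKTAYIAKQR'))",
    })
    # Stream back stdout and any structured results from the sandbox
    for event in response["stream"]:
        print(event)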
Future extensions could support multiple research agents collaborating on complex projects, with specialized agents for literature review, experimental design, data analysis, and result interpretation working together through multi-agent collaboration. Organizations can also develop specialized research agents tailored to specific therapeutic areas, disease domains, or research methodologies that share the same enterprise infrastructure and tool gateway.
Looking ahead with Biomni
“Biomni today is already useful for academic research and open exploration. But to enable real discovery, like advancing drug development, we need to move beyond prototypes and make the system enterprise-ready. Embedding Biomni into the workflows of biotech and pharma is critical to turn research potential into tangible impact.
That’s why we’re excited to integrate the open-source ecosystem with Amazon Bedrock AgentCore, bridging the gap from research to production. Looking ahead, we’re also excited about extending these capabilities with the Biomni A1 agent architecture and the Biomni-R0 model, which will unlock even more sophisticated biomedical reasoning and analysis. At the same time, Biomni will remain a thriving open-source ecosystem, where researchers and industry teams alike can contribute tools, share workflows, and push the frontier of biomedical AI together with AgentCore.”
Conclusion
This implementation demonstrates how organizations can use Amazon Bedrock AgentCore to transform biomedical research prototypes into production-ready systems. By integrating Biomni's comprehensive collection of over 150 specialized tools through the AgentCore Gateway service, we illustrate how teams can create enterprise-grade tool sharing infrastructure that scales across multiple research domains. By combining Biomni's biomedical tools with the enterprise infrastructure of Amazon Bedrock AgentCore, organizations can build research agents that maintain scientific rigor while meeting production requirements for security, scalability, and observability. Biomni's diverse tool collection, spanning genomics, proteomics, and clinical databases, exemplifies how specialized research capabilities can be centralized and shared across research teams through a secure gateway architecture.
To start building your own biomedical research agent with Biomni tools, explore the implementation by visiting our GitHub repository for the complete code and documentation. You can follow the step-by-step implementation guide to set up your research agent with local tools, gateway integration, and Amazon Bedrock AgentCore deployment. As your needs evolve, you can extend the system with your organization's proprietary databases and analytical tools. We encourage you to join the growing ecosystem of life sciences AI agents and tools by sharing your extensions and improvements.
About the authors
Hasan Poonawala is a Senior AI/ML Solutions Architect at AWS, working with Healthcare and Life Sciences customers. Hasan helps design, deploy, and scale generative AI and machine learning applications on AWS. He has over 15 years of combined work experience in machine learning, software development, and data science on the cloud. In his spare time, Hasan loves to explore nature and spend time with family and friends.
Pierre de Malliard is a Senior AI/ML Solutions Architect at Amazon Web Services and helps customers in the Healthcare and Life Sciences industry. He is currently based in New York City.
Necibe Ahat is a Senior AI/ML Specialist Solutions Architect at AWS, working with Healthcare and Life Sciences customers. Necibe helps customers advance their generative AI and machine learning journey. She has a background in computer science with 15 years of industry experience helping customers ideate, design, build, and deploy solutions at scale. She is a passionate inclusion and diversity advocate.
Kexin Huang is a final-year PhD student in Computer Science at Stanford University, advised by Prof. Jure Leskovec. His research applies AI to enable interpretable and deployable biomedical discoveries, addressing core challenges in multi-modal modeling, uncertainty, and reasoning. His work has appeared in Nature Medicine, Nature Biotechnology, Nature Chemical Biology, Nature Biomedical Engineering, and top ML venues (NeurIPS, ICML, ICLR), earning six best paper awards. His research has been highlighted by Forbes, WIRED, and MIT Technology Review, and he has contributed to AI research at Genentech, GSK, Pfizer, IQVIA, Flatiron Health, Dana-Farber, and Rockefeller University.

