As of early 2026, AWS has a number of related but distinct services that make up its agentic and LLM offerings.
- Bedrock is the model layer that provides access to large language models.
- Agents for Bedrock is the managed application layer. In other words, AWS runs the agents for you based on your requirements.
- Bedrock AgentCore is an infrastructure layer that lets AWS run agents you develop using third-party frameworks such as CrewAI and LangGraph.
Apart from these three services, AWS also has Strands, an open-source Python library for building agents outside of the Bedrock service, which can then be deployed on other AWS services such as ECS and Lambda.
It can be confusing because all three agentic services have the term “Bedrock” in their names, but in this article, I’ll focus on the standard Bedrock service and show how and why you’d use it.
As a service, Bedrock has only been available on AWS since early 2023. That should give you a clue as to why it was launched. Amazon could clearly see the rise of Large Language Models and their impact on IT architecture and the systems development process. That’s AWS’s meat and potatoes, and they were keen that nobody was going to eat their lunch.
And although AWS has developed a few LLMs of its own, it realised that to stay competitive, it would have to make the very top models, such as those from Anthropic, available to users. And that’s where Bedrock steps in. As they said in their own blurb on their website,
… Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications, simplifying development while maintaining privacy and security.
How do I access Bedrock?
OK, so that’s the theory behind the why of Bedrock, but how do we get access to it and actually use it? Not surprisingly, the first thing you need is an AWS account. I’m going to assume you already have one, but if not, click the following link to set one up.
https://aws.amazon.com/account
Usefully, after you sign up for a new AWS account, a good number of the services you use will fall under the so-called “free tier” at AWS, which means your costs should be minimal for one year following your account creation – assuming you don’t go crazy and start firing up massive compute servers and the like.
There are three main ways to use AWS services.
- Via the console. If you’re a beginner, this will probably be your preferred route as it’s the easiest way to get started.
- Via an API. If you’re handy at coding, you can access all of AWS’s services through an API. For example, for Python programmers, AWS provides the boto3 library. There are similar libraries for other languages, such as JavaScript, etc. (see the short sketch after this list).
- Via the command line interface (CLI). The CLI is an additional tool you can download from AWS that lets you interact with AWS services directly from your terminal.
Note that, to use the latter two methods, you must have login credentials set up on your local system.
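To give a flavour of the API route, here is a minimal boto3 sketch. The S3 call is just an illustrative, read-only example, and it assumes your credentials and default region are already configured locally (we cover that below).

import boto3

# boto3 picks up the credentials and region you configure locally (see below)
s3 = boto3.client("s3")

# A simple read-only call: print the names of the S3 buckets in the account
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])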
What can I do with Bedrock?
The short answer is that you can do most of the things you can with regular chat models from OpenAI, Anthropic, Google, and so on. Underlying Bedrock are numerous foundation models that you can use with it, such as:-
- Kimi K2 Thinking. A deep reasoning model
- Claude Opus 4.5. To many people, this is the top LLM available to date.
- GPT-OSS. OpenAI’s open-source LLM
And many, many others besides. For a full list, check out the following link.
https://aws.amazon.com/bedrock/model-choice
How do I use Bedrock?
To use Bedrock, we will use a combination of the AWS CLI and the Python API provided by the boto3 library. Make sure you have the following set up as prerequisites.
- An AWS account.
- The AWS CLI downloaded and installed on your system.
- An Identity and Access Management (IAM) user set up with appropriate permissions and access keys. You can do this via the AWS console.
- Your user credentials configured via the AWS CLI like this. Typically, three pieces of information need to be supplied, all of which you can get from the previous step. You’ll be prompted to enter the relevant information,
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]:
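Once that’s done, a quick way to confirm the CLI can see your credentials is to ask AWS who you are. The values below are placeholders, not real output.

$ aws sts get-caller-identity
{
    "UserId": "AIDAEXAMPLEUSERID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/your-user-name"
}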
Giving Bedrock access to a model
Back in the day (a few months ago!), you had to use the AWS management console to request access to particular models from Bedrock, but now access is automatically granted when you invoke a model for the first time.
Note that for Anthropic models, first-time users may need to submit use case details before they can access the model. Also note that access to top models from Anthropic and other providers will incur costs, so please make sure you monitor your billing regularly and remove any model access you no longer need.
However, we still need to know the name of the model we want to use. To get a list of all Bedrock-compatible models, we can use the following AWS CLI command.
aws bedrock list-foundation-models
This will return a JSON result set listing various properties of each model, like this.
{
    "modelSummaries": [
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/nvidia.nemotron-nano-12b-v2",
            "modelId": "nvidia.nemotron-nano-12b-v2",
            "modelName": "NVIDIA Nemotron Nano 12B v2 VL BF16",
            "providerName": "NVIDIA",
            "inputModalities": [
                "TEXT",
                "IMAGE"
            ],
            "outputModalities": [
                "TEXT"
            ],
            "responseStreamingSupported": true,
            "customizationsSupported": [],
            "inferenceTypesSupported": [
                "ON_DEMAND"
            ],
            "modelLifecycle": {
                "status": "ACTIVE"
            }
        },
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
...
...
...
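If you prefer to stay in Python, the same catalogue is available from boto3’s bedrock control-plane client (note that this is the bedrock client, not the bedrock-runtime client we use later to invoke models). A minimal sketch, with an optional provider filter:

import boto3

# The "bedrock" client is the control plane (model catalogue, etc.);
# "bedrock-runtime" is the client used later for actually invoking models.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# byProvider is optional; here we only list Anthropic models
resp = bedrock.list_foundation_models(byProvider="Anthropic")

for model in resp["modelSummaries"]:
    print(model["modelId"], model["inferenceTypesSupported"])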
Choose the model you want and note its modelId from the JSON output, as we’ll need it in our Python code later. An important caveat to this is that you’ll often see the following in a model description,
...
...
"inferenceTypesSupported": [
"INFERENCE_PROFILE"
]
...
...
This is reserved for models that:
- Are large or in high demand
- Require reserved or managed capacity
- Need explicit cost and throughput controls
For these models, we can’t simply reference the modelId in our code. Instead, we need to reference an inference profile. An inference profile is a Bedrock resource that is bound to one or more foundation LLMs and a region.
There are two ways to obtain an inference profile you can use. The first is to create one yourself. These are known as Application Profiles. The second way is to use one of AWS’s Supported Profiles. This is the easier option, as it’s pre-built for you and you just need to obtain the relevant profile ID associated with the inference profile to use in your code.
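You can also discover the supported (system-defined) profiles available in your region programmatically. A minimal sketch, assuming a reasonably recent boto3 that includes the list_inference_profiles operation on the bedrock client:

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# "SYSTEM_DEFINED" lists AWS's pre-built profiles; "APPLICATION" lists ones you created
resp = bedrock.list_inference_profiles(typeEquals="SYSTEM_DEFINED")

for profile in resp["inferenceProfileSummaries"]:
    print(profile["inferenceProfileId"], "-", profile["inferenceProfileName"])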
If you want to take the route of creating your own Application Profile, check out the appropriate AWS documentation, but I’m going to use a supported profile in my example code.
For a list of Supported Profiles in AWS, check out the link below:
For my first code example, I want to use Claude’s Sonnet 3.5 v2 model, so I clicked the link above and saw the following description.

I took note of the profile ID ( us.anthropic.claude-3-5-sonnet-20241022-v2:0 ) and one of the valid source regions ( us-east-1 ).
For my second and third example code snippets, I’ll use OpenAI’s open-source LLM for text output and AWS’s Titan Image Generator for images. Neither of these models requires an inference profile, so you can just use the regular modelId for them in your code.
NB: Whichever model(s) you choose, make sure your AWS region is set to the correct value for each.
Setting Up a Development Environment
As we’ll be doing some coding, it’s best to isolate our environment so we don’t interfere with any of our other projects. So let’s do that now. I’m using Windows and the UV package manager for this, but use whichever tools you’re most comfortable with. My code will run in a Jupyter notebook.
uv init bedrock_demo --python 3.13
cd bedrock_demo
uv add boto3 jupyter
# To run the notebook, type this in
uv run jupyter notebook
Using Bedrock from Python
Let’s see Bedrock in action with a few examples. The first will be simple, and we’ll gradually increase the complexity as we go.
Example 1: A simple question and answer using an inference profile
This example uses the Claude Sonnet 3.5 v2 model we talked about earlier. As mentioned, to invoke this model, we use the profile ID associated with its inference profile.
import json
import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

profile_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 200,
    "temperature": 0.2,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is the capital of France?"}
            ]
        }
    ]
})

resp = brt.invoke_model(
    modelId=profile_id,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Claude responses come back as a "content" array, not OpenAI-style "choices"
print(data["content"][0]["text"])
#
# Output
#
The capital of France is Paris.
Note that invoking this model (and others like it) creates an implied subscription between you and AWS’s marketplace. This is not a recurring regular charge; it only costs you when the model is actually used, but it’s best to keep an eye on it to avoid unexpected bills. You should receive an email outlining the subscription agreement, with a link to manage and/or cancel any existing model subscriptions that are set up.
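As an aside, invoke_model isn’t the only way to call a chat model on Bedrock. The runtime client also offers the provider-agnostic Converse API, which uses the same request and response shape whichever model you point it at. Here’s a minimal sketch that repeats the question above via converse, reusing the same profile ID and region:

import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = brt.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {"role": "user", "content": [{"text": "What is the capital of France?"}]}
    ],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)

# Converse returns the same provider-agnostic structure for every model
print(resp["output"]["message"]["content"][0]["text"])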
Example 2: Create an image
A simple image creation using AWS’s own Titan model. This model is not associated with an inference profile, so we can reference it directly using its modelId.
import json
import base64
import boto3

brt_img = boto3.client("bedrock-runtime", region_name="us-east-1")

model_id_img = "amazon.titan-image-generator-v2:0"

prompt = "A hippo riding a motorcycle."

body = json.dumps({
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": prompt
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,
        "height": 1024,
        "width": 1024,
        "cfgScale": 7.0,
        "seed": 0
    }
})

resp = brt_img.invoke_model(
    modelId=model_id_img,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Titan returns base64-encoded images in the "images" array
img_b64 = data["images"][0]
img_bytes = base64.b64decode(img_b64)

out_path = "titan_output.png"
with open(out_path, "wb") as f:
    f.write(img_bytes)

print("Saved:", out_path)
On my system, the output image looked like this.

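Since the code runs in a Jupyter notebook, you can also render the saved file inline rather than opening it separately; a small snippet using IPython’s display helpers:

from IPython.display import Image, display

# Render the saved PNG inline in the notebook
display(Image(filename="titan_output.png"))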
Example 3: A technical support triage assistant using OpenAI’s OSS model
This is a more complex and useful example. Here, we set up an assistant that takes problems reported to it by non-technical users and outputs further questions you may need the user to answer, as well as the most likely causes of the issue and what further steps to take. Like our previous example, this model is not associated with an inference profile.
import json
import re
import boto3
from pydantic import BaseModel, Field
from typing import List, Literal, Optional

# ----------------------------
# Bedrock setup
# ----------------------------
REGION = "us-east-2"
MODEL_ID = "openai.gpt-oss-120b-1:0"

brt = boto3.client("bedrock-runtime", region_name=REGION)

# ----------------------------
# Output schema
# ----------------------------
Severity = Literal["low", "medium", "high"]
Category = Literal["account", "billing", "device", "network", "software", "security", "other"]

class TriageResponse(BaseModel):
    category: Category
    severity: Severity
    summary: str = Field(description="One-sentence restatement of the problem.")
    likely_causes: List[str] = Field(description="Top plausible causes, concise.")
    clarifying_questions: List[str] = Field(description="Ask only what is needed to proceed.")
    safe_next_steps: List[str] = Field(description="Step-by-step actions safe for a non-technical user.")
    stop_and_escalate_if: List[str] = Field(description="Clear red flags that require a professional/helpdesk.")
    recommended_escalation_target: Optional[str] = Field(
        default=None,
        description="If severity is high, who to contact (e.g., IT admin, bank, ISP)."
    )

# ----------------------------
# Helpers
# ----------------------------
def invoke_chat(messages, max_tokens=800, temperature=0.2) -> dict:
    body = json.dumps({
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature
    })
    resp = brt.invoke_model(
        modelId=MODEL_ID,
        body=body,
        accept="application/json",
        contentType="application/json"
    )
    return json.loads(resp["body"].read())

def extract_content(data: dict) -> str:
    return data["choices"][0]["message"]["content"]

def extract_json_object(text: str) -> dict:
    """
    Extract the first JSON object from model output.
    Handles common cases like fenced code blocks or extra text.
    """
    # Remove any markdown code-fence markers the model may have wrapped around the JSON
    text = re.sub(r"`{3}(?:json)?", "", text).strip()
    start = text.find("{")
    if start == -1:
        raise ValueError("No JSON object found.")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i+1])
    raise ValueError("Unbalanced JSON braces; could not parse.")

# ----------------------------
# The useful function
# ----------------------------
def triage_issue(user_problem: str) -> TriageResponse:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a careful technical support triage assistant for non-technical users. "
                "You must be conservative and safety-first. "
                "Return ONLY valid JSON matching the given schema. No extra text."
            )
        },
        {
            "role": "user",
            "content": f"""
User problem:
{user_problem}
Return JSON that matches this schema:
{TriageResponse.model_json_schema()}
""".strip()
        }
    ]
    raw = invoke_chat(messages)
    text = extract_content(raw)
    parsed = extract_json_object(text)
    return TriageResponse.model_validate(parsed)

# ----------------------------
# Example
# ----------------------------
if __name__ == "__main__":
    problem = "My laptop is connected to Wi-Fi but websites will not load, and Zoom keeps saying unstable connection."
    result = triage_issue(problem)
    print(result.model_dump_json(indent=2))
Here is the output.
{
"category": "network",
"severity": "medium",
"summary": "Laptop shows Wi‑Fi connection but cannot load websites and Zoom reports an unstable connection.",
"likely_causes": [
"Router or modem malfunction",
"DNS resolution failure",
"Local Wi‑Fi interference or weak signal",
"IP address conflict on the network",
"Firewall or security software blocking traffic",
"ISP outage or throttling"
],
"clarifying_questions": [
"Are other devices on the same Wi‑Fi network able to access the internet?",
"Did the problem start after any recent changes (e.g., new software, OS update, VPN installation)?",
"Have you tried moving closer to the router or using a wired Ethernet connection?",
"Do you see any error codes or messages in the browser or Zoom besides "unstable connection"?"
],
"safe_next_steps": [
"Restart the router and modem by unplugging them for 30 seconds, then power them back on.",
"On the laptop, forget the Wi‑Fi network, then reconnect and re-enter the password.",
"Run the built‑in Windows network troubleshooter (Settings → Network & Internet → Status → Network troubleshooter).",
"Disable any VPN or proxy temporarily and test the connection again.",
"Open a command prompt and run `ipconfig /release` followed by `ipconfig /renew`.",
"Flush the DNS cache with `ipconfig /flushdns`.",
"Try accessing a simple website (e.g., http://example.com) and note if it loads.",
"If possible, connect the laptop to the router via Ethernet to see if the issue persists."
],
"stop_and_escalate_if": [
"The laptop still cannot reach any website after completing all steps.",
"Other devices on the same network also cannot access the internet.",
"You receive error messages indicating hardware failure (e.g., Wi‑Fi adapter not found).",
"The router repeatedly restarts or shows error lights.",
"Zoom continues to report a poor or unstable connection despite a working internet test."
],
"recommended_escalation_target": "IT admin"
}
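One last aside: the model listing earlier showed responseStreamingSupported for many models. If you’d rather print tokens as they arrive instead of waiting for the whole response, the runtime client has streaming variants. Here’s a minimal sketch using converse_stream against the same GPT-OSS model (this assumes the model supports the Converse API, which most Bedrock chat models do):

import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-2")

resp = brt.converse_stream(
    modelId="openai.gpt-oss-120b-1:0",
    messages=[
        {"role": "user", "content": [{"text": "Explain DNS in one paragraph."}]}
    ],
    inferenceConfig={"maxTokens": 300, "temperature": 0.2},
)

# The stream yields events; generated text arrives in contentBlockDelta events
for event in resp["stream"]:
    if "contentBlockDelta" in event:
        print(event["contentBlockDelta"]["delta"].get("text", ""), end="")
print()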
Summary
This article introduced AWS Bedrock, AWS’s managed gateway to foundation large language models, explaining why it exists, how it fits into the broader AWS AI stack, and how to use it in practice. We covered model discovery, region and credential setup, and the key distinction between on-demand models and those that require inference profiles – a common source of confusion for developers.
Through practical Python examples, we demonstrated text and image generation using both standard on-demand models and those that require an inference profile.
At its core, Bedrock reflects AWS’s long-standing philosophy: abstract infrastructure complexity without removing control. Rather than pushing a single “best” model, Bedrock treats foundation models as managed infrastructure components – swappable, governable, and region-aware. This suggests a future where Bedrock evolves less as a chat interface and more as a model orchestration layer, tightly integrated with IAM, networking, cost controls, and agent frameworks.
Over time, we might expect Bedrock to move further towards standardised inference contracts (subscriptions) and a clearer separation between experimentation and production capacity. And with their Agents and AgentCore services, we’re already seeing deeper integration of agentic workflows with Bedrock, positioning models not as products in themselves but as durable building blocks within AWS systems.
For the avoidance of doubt, apart from being a sometime user of their services, I have no connection or affiliation with Amazon Web Services.

