AWS Bedrock
ALL Bedrock models (Anthropic, Meta, Mistral, Amazon, etc.) are Supported
LiteLLM requires boto3
to be installed on your system for Bedrock requests
pip install boto3>=1.28.57
For Amazon Nova Models: Bump to v1.53.5+
LiteLLM uses boto3 to handle authentication. All these options are supported - https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#credentials.
Usage​
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)
LiteLLM Proxy Usage​
Here's how to call Bedrock with the LiteLLM Proxy Server
1. Setup config.yaml​
model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
aws_access_key_id: os.environ/CUSTOM_AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/CUSTOM_AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/CUSTOM_AWS_REGION_NAME
All possible auth params:
aws_access_key_id: Optional[str],
aws_secret_access_key: Optional[str],
aws_session_token: Optional[str],
aws_region_name: Optional[str],
aws_session_name: Optional[str],
aws_profile_name: Optional[str],
aws_role_name: Optional[str],
aws_web_identity_token: Optional[str],
2. Start the proxy​
litellm --config /path/to/config.yaml
3. Test it​
- Curl Request
- OpenAI v1.0.0+
- Langchain
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "bedrock-claude-v1",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}
'
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-v1", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
])
print(response)
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000", # set openai_api_base to the LiteLLM Proxy
model = "bedrock-claude-v1",
temperature=0.1
)
messages = [
SystemMessage(
content="You are a helpful assistant that im using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)
print(response)
Set temperature, top p, etc.​
- SDK
- PROXY
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
temperature=0.7,
top_p=1
)
Set on yaml
model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
temperature: <your-temp>
top_p: <your-top-p>
Set on request
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-v1", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
temperature=0.7,
top_p=1
)
print(response)
Pass provider-specific params​
If you pass a non-openai param to litellm, we'll assume it's provider-specific and send it as a kwarg in the request body. See more
- SDK
- PROXY
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
top_k=1 # 👈 PROVIDER-SPECIFIC PARAM
)
Set on yaml
model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
top_k: 1 # 👈 PROVIDER-SPECIFIC PARAM
Set on request
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-v1", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
temperature=0.7,
extra_body={
top_k=1 # 👈 PROVIDER-SPECIFIC PARAM
}
)
print(response)
Usage - Function Calling​
LiteLLM uses Bedrock's Converse API for making tool calls
from litellm import completion
# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
},
}
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=messages,
tools=tools,
tool_choice="auto",
)
# Add any assertions, here to check response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
response.choices[0].message.tool_calls[0].function.arguments, str
)
Usage - Vision​
from litellm import completion
# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
def encode_image(image_path):
import base64
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = litellm.completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Whats in this image?"},
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64," + base64_image
},
},
],
}
],
)
print(f"\nResponse: {resp}")
Usage - Bedrock Guardrails​
Example of using Bedrock Guardrails with LiteLLM
- LiteLLM SDK
- Proxy on request
- Proxy on config.yaml
from litellm import completion
# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = completion(
model="anthropic.claude-v2",
messages=[
{
"content": "where do i buy coffee from? ",
"role": "user",
}
],
max_tokens=10,
guardrailConfig={
"guardrailIdentifier": "ff6ujrregl1q", # The identifier (ID) for the guardrail.
"guardrailVersion": "DRAFT", # The version of the guardrail.
"trace": "disabled", # The trace behavior for the guardrail. Can either be "disabled" or "enabled"
},
)
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="anthropic.claude-v2", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
temperature=0.7,
extra_body={
"guardrailConfig": {
"guardrailIdentifier": "ff6ujrregl1q", # The identifier (ID) for the guardrail.
"guardrailVersion": "DRAFT", # The version of the guardrail.
"trace": "disabled", # The trace behavior for the guardrail. Can either be "disabled" or "enabled"
},
}
)
print(response)
- Update config.yaml
model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
aws_access_key_id: os.environ/CUSTOM_AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/CUSTOM_AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/CUSTOM_AWS_REGION_NAME
guardrailConfig: {
"guardrailIdentifier": "ff6ujrregl1q", # The identifier (ID) for the guardrail.
"guardrailVersion": "DRAFT", # The version of the guardrail.
"trace": "disabled", # The trace behavior for the guardrail. Can either be "disabled" or "enabled"
}
- Start proxy
litellm --config /path/to/config.yaml
- Test it!
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-v1", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
temperature=0.7
)
print(response)
Usage - "Assistant Pre-fill"​
If you're using Anthropic's Claude with Bedrock, you can "put words in Claude's mouth" by including an assistant
role message as the last item in the messages
array.
[!IMPORTANT] The returned completion will not include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
messages = [
{"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
{"role": "assistant", "content": "{"},
]
response = completion(model="bedrock/anthropic.claude-v2", messages=messages)
Example prompt sent to Claude​
Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:
{ "Hello": "Hallo" }
Assistant: {
Usage - "System" messages​
If you're using Anthropic's Claude 2.1 with Bedrock, system
role messages are properly formatted for you.
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
messages = [
{"role": "system", "content": "You are a snarky assistant."},
{"role": "user", "content": "How do I boil water?"},
]
response = completion(model="bedrock/anthropic.claude-v2:1", messages=messages)
Example prompt sent to Claude​
You are a snarky assistant.
Human: How do I boil water?
Assistant:
Usage - Streaming​
import os
from litellm import completion
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
stream=True
)
for chunk in response:
print(chunk)
Example Streaming Output Chunk​
{
"choices": [
{
"finish_reason": null,
"index": 0,
"delta": {
"content": "ase can appeal the case to a higher federal court. If a higher federal court rules in a way that conflicts with a ruling from a lower federal court or conflicts with a ruling from a higher state court, the parties involved in the case can appeal the case to the Supreme Court. In order to appeal a case to the Sup"
}
}
],
"created": null,
"model": "anthropic.claude-instant-v1",
"usage": {
"prompt_tokens": null,
"completion_tokens": null,
"total_tokens": null
}
}
Cross-region inferencing​
LiteLLM supports Bedrock cross-region inferencing across all supported bedrock models.
- SDK
- PROXY
from litellm import completion
import os
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
litellm.set_verbose = True # 👈 SEE RAW REQUEST
response = completion(
model="bedrock/us.anthropic.claude-3-haiku-20240307-v1:0",
messages=messages,
max_tokens=10,
temperature=0.1,
)
print("Final Response: {}".format(response))
1. Setup config.yaml​
model_list:
- model_name: bedrock-claude-haiku
litellm_params:
model: bedrock/us.anthropic.claude-3-haiku-20240307-v1:0
aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/AWS_REGION_NAME
2. Start the proxy​
litellm --config /path/to/config.yaml
3. Test it​
- Curl Request
- OpenAI v1.0.0+
- Langchain
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "bedrock-claude-haiku",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}
'
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-haiku", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
])
print(response)
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000", # set openai_api_base to the LiteLLM Proxy
model = "bedrock-claude-haiku",
temperature=0.1
)
messages = [
SystemMessage(
content="You are a helpful assistant that im using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)
print(response)
Set 'converse' / 'invoke' route​
Supported from LiteLLM Version v1.53.5
LiteLLM defaults to the invoke
route. LiteLLM uses the converse
route for Bedrock models that support it.
To explicitly set the route, do bedrock/converse/<model>
or bedrock/invoke/<model>
.
E.g.
- SDK
- PROXY
from litellm import completion
completion(model="bedrock/converse/us.amazon.nova-pro-v1:0")
model_list:
- model_name: bedrock-model
litellm_params:
model: bedrock/converse/us.amazon.nova-pro-v1:0
Alternate user/assistant messages​
Use user_continue_message
to add a default user message, for cases (e.g. Autogen) where the client might not follow alternating user/assistant messages starting and ending with a user message.
model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"
user_continue_message: {"role": "user", "content": "Please continue"}
OR
just set litellm.modify_params=True
and LiteLLM will automatically handle this with a default user_continue_message.
model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"
litellm_settings:
modify_params: true
Test it!
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-claude",
"messages": [{"role": "assistant", "content": "Hey, how's it going?"}]
}'
Usage - PDF / Document Understanding​
LiteLLM supports Document Understanding for Bedrock models - AWS Bedrock Docs.
url​
- SDK
- PROXY
from litellm.utils import supports_pdf_input, completion
# set aws credentials
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
# pdf url
image_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
# model
model = "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"
image_content = [
{"type": "text", "text": "What's this file about?"},
{
"type": "image_url",
"image_url": image_url, # OR {"url": image_url}
},
]
if not supports_pdf_input(model, None):
print("Model does not support image input")
response = completion(
model=model,
messages=[{"role": "user", "content": image_content}],
)
assert response is not None
- Setup config.yaml
model_list:
- model_name: bedrock-model
litellm_params:
model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/AWS_REGION_NAME
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-model",
"messages": [
{"role": "user", "content": {"type": "text", "text": "What's this file about?"}},
{
"type": "image_url",
"image_url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
}
]
}'
base64​
- SDK
- PROXY
from litellm.utils import supports_pdf_input, completion
# set aws credentials
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
# pdf url
image_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
response = requests.get(url)
file_data = response.content
encoded_file = base64.b64encode(file_data).decode("utf-8")
base64_url = f"data:application/pdf;base64,{encoded_file}"
# model
model = "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"
image_content = [
{"type": "text", "text": "What's this file about?"},
{
"type": "image_url",
"image_url": base64_url, # OR {"url": base64_url}
},
]
if not supports_pdf_input(model, None):
print("Model does not support image input")
response = completion(
model=model,
messages=[{"role": "user", "content": image_content}],
)
assert response is not None
- Setup config.yaml
model_list:
- model_name: bedrock-model
litellm_params:
model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/AWS_REGION_NAME
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-model",
"messages": [
{"role": "user", "content": {"type": "text", "text": "What's this file about?"}},
{
"type": "image_url",
"image_url": "data:application/pdf;base64,{b64_encoded_file}",
}
]
}'
Boto3 - Authentication​
Passing credentials as parameters - Completion()​
Pass AWS credentials as parameters to litellm.completion
import os
from litellm import completion
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
)
Passing extra headers + Custom API Endpoints​
This can be used to override existing headers (e.g. Authorization
) when calling custom api endpoints
- SDK
- PROXY
import os
import litellm
from litellm import completion
litellm.set_verbose = True # 👈 SEE RAW REQUEST
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
aws_bedrock_runtime_endpoint="https://my-fake-endpoint.com",
extra_headers={"key": "value"}
)
- Setup config.yaml
model_list:
- model_name: bedrock-model
litellm_params:
model: bedrock/anthropic.claude-instant-v1
aws_access_key_id: "",
aws_secret_access_key: "",
aws_region_name: "",
aws_bedrock_runtime_endpoint: "https://my-fake-endpoint.com",
extra_headers: {"key": "value"}
- Start proxy
litellm --config /path/to/config.yaml --detailed_debug
- Test it!
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-model",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
]
}'
SSO Login (AWS Profile)​
- Set
AWS_PROFILE
environment variable - Make bedrock completion call
import os
from litellm import completion
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)
or pass aws_profile_name
:
import os
from litellm import completion
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_profile_name="dev-profile",
)
STS (Role-based Auth)​
- Set
aws_role_name
andaws_session_name
LiteLLM Parameter | Boto3 Parameter | Description | Boto3 Documentation |
---|---|---|---|
aws_access_key_id | aws_access_key_id | AWS access key associated with an IAM user or role | Credentials |
aws_secret_access_key | aws_secret_access_key | AWS secret key associated with the access key | Credentials |
aws_role_name | RoleArn | The Amazon Resource Name (ARN) of the role to assume | AssumeRole API |
aws_session_name | RoleSessionName | An identifier for the assumed role session | AssumeRole API |
Make the bedrock completion call
- SDK
- PROXY
from litellm import completion
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)
If you also need to dynamically set the aws user accessing the role, add the additional args in the completion()/embedding() function
from litellm import completion
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_region_name=aws_region_name,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)
model_list:
- model_name: bedrock/*
litellm_params:
model: bedrock/*
aws_role_name: arn:aws:iam::888602223428:role/iam_local_role # AWS RoleArn
aws_session_name: "bedrock-session" # AWS RoleSessionName
aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID # [OPTIONAL - not required if using role]
aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY # [OPTIONAL - not required if using role]
Passing an external BedrockRuntime.Client as a parameter - Completion()​
This is a deprecated flow. Boto3 is not async. And boto3.client does not let us make the http call through httpx. Pass in your aws params through the method above 👆. See Auth Code Add new auth flow
Experimental - 2024-Jun-23:
aws_access_key_id
, aws_secret_access_key
, and aws_session_token
will be extracted from boto3.client and be passed into the httpx client
Pass an external BedrockRuntime.Client object as a parameter to litellm.completion. Useful when using an AWS credentials profile, SSO session, assumed role session, or if environment variables are not available for auth.
Create a client from session credentials:
import boto3
from litellm import completion
bedrock = boto3.client(
service_name="bedrock-runtime",
region_name="us-east-1",
aws_access_key_id="",
aws_secret_access_key="",
aws_session_token="",
)
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)
Create a client from AWS profile in ~/.aws/config
:
import boto3
from litellm import completion
dev_session = boto3.Session(profile_name="dev-profile")
bedrock = dev_session.client(
service_name="bedrock-runtime",
region_name="us-east-1",
)
response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)
Provisioned throughput models​
To use provisioned throughput Bedrock models pass
model=bedrock/<base-model>
, examplemodel=bedrock/anthropic.claude-v2
. Setmodel
to any of the Supported AWS modelsmodel_id=provisioned-model-arn
Completion
import litellm
response = litellm.completion(
model="bedrock/anthropic.claude-instant-v1",
model_id="provisioned-model-arn",
messages=[{"content": "Hello, how are you?", "role": "user"}]
)
Embedding
import litellm
response = litellm.embedding(
model="bedrock/amazon.titan-embed-text-v1",
model_id="provisioned-model-arn",
input=["hi"],
)
Supported AWS Bedrock Models​
Here's an example of using a bedrock model with LiteLLM. For a complete list, refer to the model cost map
Model Name | Command |
---|---|
Anthropic Claude-V3.5 Sonnet | completion(model='bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0', messages=messages) |
Anthropic Claude-V3 sonnet | completion(model='bedrock/anthropic.claude-3-sonnet-20240229-v1:0', messages=messages) |
Anthropic Claude-V3 Haiku | completion(model='bedrock/anthropic.claude-3-haiku-20240307-v1:0', messages=messages) |
Anthropic Claude-V3 Opus | completion(model='bedrock/anthropic.claude-3-opus-20240229-v1:0', messages=messages) |
Anthropic Claude-V2.1 | completion(model='bedrock/anthropic.claude-v2:1', messages=messages) |
Anthropic Claude-V2 | completion(model='bedrock/anthropic.claude-v2', messages=messages) |
Anthropic Claude-Instant V1 | completion(model='bedrock/anthropic.claude-instant-v1', messages=messages) |
Meta llama3-1-405b | completion(model='bedrock/meta.llama3-1-405b-instruct-v1:0', messages=messages) |
Meta llama3-1-70b | completion(model='bedrock/meta.llama3-1-70b-instruct-v1:0', messages=messages) |
Meta llama3-1-8b | completion(model='bedrock/meta.llama3-1-8b-instruct-v1:0', messages=messages) |
Meta llama3-70b | completion(model='bedrock/meta.llama3-70b-instruct-v1:0', messages=messages) |
Meta llama3-8b | completion(model='bedrock/meta.llama3-8b-instruct-v1:0', messages=messages) |
Amazon Titan Lite | completion(model='bedrock/amazon.titan-text-lite-v1', messages=messages) |
Amazon Titan Express | completion(model='bedrock/amazon.titan-text-express-v1', messages=messages) |
Cohere Command | completion(model='bedrock/cohere.command-text-v14', messages=messages) |
AI21 J2-Mid | completion(model='bedrock/ai21.j2-mid-v1', messages=messages) |
AI21 J2-Ultra | completion(model='bedrock/ai21.j2-ultra-v1', messages=messages) |
AI21 Jamba-Instruct | completion(model='bedrock/ai21.jamba-instruct-v1:0', messages=messages) |
Meta Llama 2 Chat 13b | completion(model='bedrock/meta.llama2-13b-chat-v1', messages=messages) |
Meta Llama 2 Chat 70b | completion(model='bedrock/meta.llama2-70b-chat-v1', messages=messages) |
Mistral 7B Instruct | completion(model='bedrock/mistral.mistral-7b-instruct-v0:2', messages=messages) |
Mixtral 8x7B Instruct | completion(model='bedrock/mistral.mixtral-8x7b-instruct-v0:1', messages=messages) |
Bedrock Embedding​
API keys​
This can be set as env variables or passed as params to litellm.embedding()
import os
os.environ["AWS_ACCESS_KEY_ID"] = "" # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2
Usage​
from litellm import embedding
response = embedding(
model="bedrock/amazon.titan-embed-text-v1",
input=["good morning from litellm"],
)
print(response)
Supported AWS Bedrock Embedding Models​
Model Name | Usage | Supported Additional OpenAI params |
---|---|---|
Titan Embeddings V2 | embedding(model="bedrock/amazon.titan-embed-text-v2:0", input=input) | here |
Titan Embeddings - V1 | embedding(model="bedrock/amazon.titan-embed-text-v1", input=input) | here |
Titan Multimodal Embeddings | embedding(model="bedrock/amazon.titan-embed-image-v1", input=input) | here |
Cohere Embeddings - English | embedding(model="bedrock/cohere.embed-english-v3", input=input) | here |
Cohere Embeddings - Multilingual | embedding(model="bedrock/cohere.embed-multilingual-v3", input=input) | here |
Advanced - Drop Unsupported Params​
Advanced - Pass model/provider-specific Params​
Image Generation​
Use this for stable diffusion on bedrock
Usage​
import os
from litellm import image_generation
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
)
print(f"response: {response}")
Set optional params
import os
from litellm import image_generation
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
### OPENAI-COMPATIBLE ###
size="128x512", # width=128, height=512
### PROVIDER-SPECIFIC ### see `AmazonStabilityConfig` in bedrock.py for all params
seed=30
)
print(f"response: {response}")
Supported AWS Bedrock Image Generation Models​
Model Name | Function Call |
---|---|
Stable Diffusion 3 - v0 | embedding(model="bedrock/stability.stability.sd3-large-v1:0", prompt=prompt) |
Stable Diffusion - v0 | embedding(model="bedrock/stability.stable-diffusion-xl-v0", prompt=prompt) |
Stable Diffusion - v0 | embedding(model="bedrock/stability.stable-diffusion-xl-v1", prompt=prompt) |
Rerank API​
Use Bedrock's Rerank API in the Cohere /rerank
format.
Supported Cohere Rerank Params
model
- the foundation model ARNquery
- the query to rerank againstdocuments
- the list of documents to reranktop_n
- the number of results to return
- SDK
- PROXY
from litellm import rerank
import os
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""
response = rerank(
model="bedrock/arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0", # provide the model ARN - get this here https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock/client/list_foundation_models.html
query="hello",
documents=["hello", "world"],
top_n=2,
)
print(response)
- Setup config.yaml
model_list:
- model_name: bedrock-rerank
litellm_params:
model: bedrock/arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0
aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/AWS_REGION_NAME
- Start proxy server
litellm --config config.yaml
# RUNNING on http://0.0.0.0:4000
- Test it!
curl http://0.0.0.0:4000/rerank \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"model": "bedrock-rerank",
"query": "What is the capital of the United States?",
"documents": [
"Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. is the capital of the United States.",
"Capital punishment has existed in the United States since before it was a country."
],
"top_n": 3
}'