
Control Model Access

Restrict models by Virtual Key

Set allowed models for a key using the models param

curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"]}'
info

This key can only make requests to the gpt-3.5-turbo and gpt-4 models.

Verify this is set correctly by making a request with the new key:

curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
  "model": "gpt-4",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}'
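To confirm the restriction is enforced, you can also request a model that is not on the key's allowed list; the proxy should reject the request. (The exact error body may vary by LiteLLM version, and claude-3-opus here is just an arbitrary disallowed model.)

```shell
# Request a model NOT in the key's allowed list -- the proxy is
# expected to reject this with a model-access error
curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
  "model": "claude-3-opus",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}'
```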

API Reference

Restrict models by team_id

In this example, the litellm-dev team can only access the azure-gpt-3.5 model.

1. Create a team via /team/new

curl --location 'http://localhost:4000/team/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "team_alias": "litellm-dev",
  "models": ["azure-gpt-3.5"]
}'

# returns {...,"team_id": "my-unique-id"}

2. Create a key for the team

curl --location 'http://localhost:4000/key/generate' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data-raw '{"team_id": "my-unique-id"}'

3. Test it

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-qo992IjKOC2CHKZGRoJIGA' \
--data '{
  "model": "BEDROCK_GROUP",
  "messages": [
    {"role": "user", "content": "hi"}
  ]
}'
{"error":{"message":"Invalid model for team litellm-dev: BEDROCK_GROUP. Valid models for team are: ['azure-gpt-3.5']","type":"None","param":"None","code":500}}
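For comparison, the same team-scoped key should succeed when it requests a model on the team's allowed list (a sketch, assuming the azure-gpt-3.5 deployment from step 1 is configured on the proxy):

```shell
# Request an allowed model with the team-scoped key -- expected to succeed
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-qo992IjKOC2CHKZGRoJIGA' \
--data '{
  "model": "azure-gpt-3.5",
  "messages": [
    {"role": "user", "content": "hi"}
  ]
}'
```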

API Reference

Model Access Groups

Use model access groups to give users access to a set of models, and add new models to the group over time (e.g. mistral, llama-2, etc.)

Step 1. Assign the model to an access group in config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
    model_info:
      access_groups: ["beta-models"] # 👈 Model Access Group
  - model_name: fireworks-llama-v3-70b-instruct
    litellm_params:
      model: fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct
      api_key: "os.environ/FIREWORKS"
    model_info:
      access_groups: ["beta-models"] # 👈 Model Access Group

Step 2. Create a key with the access group

curl --location 'http://localhost:4000/key/generate' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{"models": ["beta-models"], "max_budget": 0}' # 👈 Model Access Group

Step 3. Test the key

curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-<key-from-previous-step>" \
-d '{
"model": "gpt-4",
"messages": [
{"role": "user", "content": "Hello"}
]
}'
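To grant keys in this group access to a new model later, add the model to config.yaml with the same access group; existing keys created with "beta-models" pick it up without being re-issued. A sketch, using a hypothetical mistral deployment:

```yaml
  - model_name: mistral-7b
    litellm_params:
      model: mistral/mistral-7b-instruct  # hypothetical deployment
      api_key: os.environ/MISTRAL_API_KEY
    model_info:
      access_groups: ["beta-models"] # same group -> existing keys gain access
```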

✨ Control Access on Wildcard Models

Control access to all models with a specific prefix (e.g. openai/*).

You can also use this to give users access to all models except a few you don't want them to use (e.g. openai/o1-*).

info

Setting model access groups on wildcard models is an Enterprise feature.

See pricing here

Get a trial key here

1. Set up config.yaml
model_list:
  - model_name: openai/*
    litellm_params:
      model: openai/*
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      access_groups: ["default-models"]
  - model_name: openai/o1-*
    litellm_params:
      model: openai/o1-*
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      access_groups: ["restricted-models"]
2. Generate a key with access to default-models
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
  "models": ["default-models"]
}'
3. Test the key
curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-<key-from-previous-step>" \
-d '{
  "model": "openai/gpt-4",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}'
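Since this key only has the default-models access group, a request to a model matching the restricted openai/o1-* pattern should be rejected (expected to fail; the exact error text may vary by version):

```shell
# openai/o1-* is in the "restricted-models" group, which this key lacks --
# the proxy is expected to reject the request
curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-<key-from-previous-step>" \
-d '{
  "model": "openai/o1-preview",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}'
```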

View Available Fallback Models

Use the /v1/models endpoint to discover available fallback models for a given model. This helps you understand which backup models are available when your primary model is unavailable or restricted.

Extension Point

The include_metadata parameter serves as an extension point for exposing additional model metadata in the future. While currently focused on fallback models, this approach will be expanded to include other model metadata such as pricing information, capabilities, rate limits, and more.

Basic Usage

Get all available models:

curl -X GET 'http://localhost:4000/v1/models' \
-H 'Authorization: Bearer <your-api-key>'

Get Fallback Models with Metadata

Include metadata to see fallback model information:

curl -X GET 'http://localhost:4000/v1/models?include_metadata=true' \
-H 'Authorization: Bearer <your-api-key>'

Get Specific Fallback Types

You can specify the type of fallbacks you want to see:

curl -X GET 'http://localhost:4000/v1/models?include_metadata=true&fallback_type=general' \
-H 'Authorization: Bearer <your-api-key>'

General fallbacks are alternative models that can handle the same types of requests.

Example Response

When include_metadata=true is specified, the response includes fallback information:

{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai",
      "fallbacks": {
        "general": ["gpt-3.5-turbo", "claude-3-sonnet"],
        "context_window": ["gpt-4-turbo", "claude-3-opus"],
        "content_policy": ["claude-3-haiku"]
      }
    }
  ]
}
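A client can use this response to pick a backup model programmatically. A minimal sketch in Python (the response dict below mirrors the example above; get_fallbacks is a hypothetical client-side helper, not part of LiteLLM):

```python
# Pick fallback models for a model id from a
# /v1/models?include_metadata=true response.
response = {
    "data": [
        {
            "id": "gpt-4",
            "object": "model",
            "fallbacks": {
                "general": ["gpt-3.5-turbo", "claude-3-sonnet"],
                "context_window": ["gpt-4-turbo", "claude-3-opus"],
                "content_policy": ["claude-3-haiku"],
            },
        }
    ]
}

def get_fallbacks(models_response: dict, model_id: str, fallback_type: str = "general") -> list:
    """Return the fallback list for model_id, or [] if none is advertised."""
    for model in models_response.get("data", []):
        if model.get("id") == model_id:
            return model.get("fallbacks", {}).get(fallback_type, [])
    return []

print(get_fallbacks(response, "gpt-4"))                    # -> ['gpt-3.5-turbo', 'claude-3-sonnet']
print(get_fallbacks(response, "gpt-4", "content_policy"))  # -> ['claude-3-haiku']
```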

Use Cases

  • High Availability: Identify backup models to ensure service continuity
  • Cost Optimization: Find cheaper alternatives when primary models are expensive
  • Content Filtering: Discover models with different content policies
  • Context Length: Find models that can handle larger inputs
  • Load Balancing: Distribute requests across multiple compatible models

API Parameters

| Parameter | Type | Description |
|---|---|---|
| include_metadata | boolean | Include additional model metadata, including fallbacks |
| fallback_type | string | Filter fallbacks by type: general, context_window, or content_policy |

Role Based Access Control (RBAC)