Skip to main content

v1.74.7-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this versionโ€‹

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.7-stable.patch.1

Key Highlightsโ€‹

  • Vector Stores - Support for Vertex RAG Engine, PG Vector, OpenAI & Azure OpenAI Vector Stores.
  • Bulk Editing Users - Bulk editing users on the UI.
  • Health Check Improvements - Prevent unnecessary pod restarts during high traffic.
  • New LLM Providers - Added Moonshot AI and Vercel v0 provider support.

Vector Stores APIโ€‹

This release introduces support for using VertexAI RAG Engine, PG Vector, Bedrock Knowledge Bases, and OpenAI Vector Stores with LiteLLM.

This is ideal for use cases requiring external knowledge sources with LLMs.

This brings the following benefits for LiteLLM users:

Proxy Admin Benefits:

  • Fine-grained access control: determine which Keys and Teams can access specific Vector Stores
  • Complete usage tracking and monitoring across all vector store operations

Developer Benefits:

  • Simple, unified interface for querying vector stores and using them with LLM API requests
  • Consistent API experience across all supported vector store providers

Get started


Bulk Editing Usersโ€‹

v1.74.7-stable introduces Bulk Editing Users on the UI. This is useful for:

  • granting all existing users to a default team (useful for controlling access / tracking spend by team)
  • controlling personal model access for existing users

Read more


Health Check Serverโ€‹

This release brings reliability improvements that prevent unnecessary pod restarts during high traffic. Previously, when the main LiteLLM app was busy serving traffic, health endpoints would timeout even when pods were healthy.

Starting with this release, you can run health endpoints on an isolated process with a dedicated port. This ensures liveness and readiness probes remain responsive even when the main LiteLLM app is under heavy load.

Read More


New Models / Updated Modelsโ€‹

Pricing / Context Window Updatesโ€‹

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)
Azure AIazure_ai/grok-3131k$3.30$16.50
Azure AIazure_ai/global/grok-3131k$3.00$15.00
Azure AIazure_ai/global/grok-3-mini131k$0.25$1.27
Azure AIazure_ai/grok-3-mini131k$0.275$1.38
Azure AIazure_ai/jais-30b-chat8k$3200$9710
Groqgroq/moonshotai-kimi-k2-instruct131k$1.00$3.00
AI21jamba-large-1.7256k$2.00$8.00
AI21jamba-mini-1.7256k$0.20$0.40
Together.aitogether_ai/moonshotai/Kimi-K2-Instruct131k$1.00$3.00
v0v0/v0-1.0-md128k$3.00$15.00
v0v0/v0-1.5-md128k$3.00$15.00
v0v0/v0-1.5-lg512k$15.00$75.00
Moonshotmoonshot/moonshot-v1-8k8k$0.20$2.00
Moonshotmoonshot/moonshot-v1-32k32k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k131k$2.00$5.00
Moonshotmoonshot/moonshot-v1-auto131k$2.00$5.00
Moonshotmoonshot/kimi-k2-0711-preview131k$0.60$2.50
Moonshotmoonshot/moonshot-v1-32k-043032k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k-0430131k$2.00$5.00
Moonshotmoonshot/moonshot-v1-8k-04308k$0.20$2.00
Moonshotmoonshot/kimi-latest131k$2.00$5.00
Moonshotmoonshot/kimi-latest-8k8k$0.20$2.00
Moonshotmoonshot/kimi-latest-32k32k$1.00$3.00
Moonshotmoonshot/kimi-latest-128k131k$2.00$5.00
Moonshotmoonshot/kimi-thinking-preview131k$30.00$30.00
Moonshotmoonshot/moonshot-v1-8k-vision-preview8k$0.20$2.00
Moonshotmoonshot/moonshot-v1-32k-vision-preview32k$1.00$3.00
Moonshotmoonshot/moonshot-v1-128k-vision-preview131k$2.00$5.00

Featuresโ€‹

Bugsโ€‹


LLM API Endpointsโ€‹

Featuresโ€‹

Bugsโ€‹


MCP Gatewayโ€‹

Featuresโ€‹

Bugsโ€‹

  • Fix to update object permission on update/delete key/team - PR #12701
  • Include /mcp in list of available routes on proxy - PR #12612

Management Endpoints / UIโ€‹

Featuresโ€‹

  • Keys
    • Regenerate Key State Management improvements - PR #12729
  • Models
    • Wildcard model filter support - PR #12597
    • Fixes for handling team only models on UI - PR #12632
  • Usage Page
    • Fix Y-axis labels overlap on Spend per Tag chart - PR #12754
  • Teams
    • Allow setting custom key duration + show key creation stats - PR #12722
    • Enable team admins to update member roles - PR #12629
  • Users
  • Logs Page
    • Add end_user filter on UI Logs Page - PR #12663
  • MCP Servers
    • Copy MCP Server name functionality - PR #12760
  • Vector Stores
    • UI support for clicking into Vector Stores - PR #12741
    • Allow adding Vertex RAG Engine, OpenAI, Azure through UI - PR #12752
  • General
    • Add Copy-on-Click for all IDs (Key, Team, Organization, MCP Server) - PR #12615
  • SCIM
    • Add GET /ServiceProviderConfig endpoint - PR #12664

Bugsโ€‹

  • Teams
    • Ensure user id correctly added when creating new teams - PR #12719
    • Fixes for handling team-only models on UI - PR #12632

Logging / Guardrail Integrationsโ€‹

Featuresโ€‹

Bugsโ€‹


Performance / Loadbalancing / Reliability improvementsโ€‹

Featuresโ€‹

  • Health Checks
    • Separate health app for liveness probes - PR #12669
    • Health check app on separate port - PR #12718
  • Caching
  • Router
    • Handle ZeroDivisionError with zero completion tokens in lowest_latency strategy - PR #12734

Bugsโ€‹

  • Database
    • Use upsert for managed object table to avoid UniqueViolationError - PR #11795
    • Refactor to support use_prisma_migrate for helm hook - PR #12600
  • Cache
    • Fix: redis caching for embedding response models - PR #12750

Helm Chartโ€‹

  • DB Migration Hook: refactor to support use_prisma_migrate - for helm hook PR
  • Add envVars and extraEnvVars support to Helm migrations job - PR #12591

General Proxy Improvementsโ€‹

Featuresโ€‹

  • Control Plane + Data Plane Architecture
    • Control Plane + Data Plane support - PR #12601
  • Proxy CLI
    • Add "keys import" command to CLI - PR #12620
  • Swagger Documentation
    • Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses - PR #12618
  • Dependencies
    • Loosen rich version from ==13.7.1 to >=13.7.1 - PR #12704

Bugsโ€‹

  • Verbose log is enabled by default fix - PR #12596

  • Add support for disabling callbacks in request body - PR #12762

  • Handle circular references in spend tracking metadata JSON serialization - PR #12643


New Contributorsโ€‹

Full Changelogโ€‹