TensorZero 2025.11.5 is out! 📌 This release brings native support for Anthropic's Beta Structured Outputs, enhanced flexibility for runtime configuration with `extra_body` and `extra_headers`, and expanded tool and thought signature support across providers.
— Changelog
🚨 Moving forward, explicit `tensorzero::params` will take precedence over conflicting native parameters when using the OpenAI-compatible inference endpoint.
⚠️ [Planned Deprecation] Rename `json_mode="implicit_tool"` to `json_mode="tool"`.
⚠️ [Planned Deprecation] Set `model_name` and optionally `provider_name` instead of `model_provider_name` in `extra_body` and `extra_headers` objects supplied at inference time. Alternatively, don't include a scope filter at all.
🆕 Support Anthropic's Beta Structured Outputs feature natively (`beta_structured_outputs`). `extra_headers` is no longer necessary.
🆕 Support `json_mode="tool"` in chat inferences that don't otherwise include tools.
🆕 Support `extra_body` and `extra_headers` supplied at inference time without scope filters.
🆕 Support `extra_body` and `extra_headers` supplied at inference time with `model_name` and optional `provider_name` scope filters.
🆕 Support thought signatures for the GCP Vertex model providers.
🆕 Support custom tools for the OpenAI model provider.
🆕 Add `description` fields to evaluation and evaluator configuration.
& multiple under-the-hood and UI improvements
https://lnkd.in/g5UCrR8f
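The new `model_name`/`provider_name` scope filters can be sketched as follows. This is a minimal sketch, assuming an `extra_body` payload shaped as a list of JSON-pointer overrides; the exact schema, model name, and provider name below are illustrative assumptions, not taken from the official docs.

```python
# Hypothetical inference-time `extra_body` entries using the new scope filters.
extra_body = [
    {
        # Scope the override to one model (and optionally one provider)
        # instead of the deprecated combined `model_provider_name` string.
        "model_name": "claude-sonnet",  # hypothetical model name
        "provider_name": "anthropic",   # optional; hypothetical provider name
        "pointer": "/max_tokens",       # JSON pointer into the provider request body
        "value": 1024,
    },
    {
        # With no scope filter at all, the override applies everywhere.
        "pointer": "/temperature",
        "value": 0.2,
    },
]
```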
About us
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
- Website: https://www.tensorzero.com/
- Industry: Technology, Information and Internet
- Company size: 2-10 employees
- Type: Privately Held
Updates
TensorZero 2025.11.4 is out! 📌 This release introduces new features including adaptive stopping for evaluations, audio input support, and expanded Python SDK methods.
— Changelog
🚨 [Breaking Change] Moving forward, `allowed_tools` must include dynamic tools (tools specified at inference time rather than in configuration). This matches the OpenAI API behavior. Previously, TensorZero assumed that dynamic tools were always allowed.
⚠️ [Planned Deprecation] Use `limit` instead of `page_size` with the programmatic observability methods. Previously, the methods mixed these two fields.
⚠️ [Planned Deprecation] Don't nest fields in `metadata` or `tool_params` when calling `PATCH /v1/datasets/{dataset_name}/datapoints` or `update_datapoints`. Moving forward, please place them in the root.
⚠️ [Completed Deprecation] Require `template_filesystem_access.base_path` when `template_filesystem_access.enabled` is true.
⚠️ [Completed Deprecation] Removed many deprecated experimental types and methods from the TensorZero Python SDK.
🆕 Add adaptive stopping for evaluations in the UI and Python SDK.
🆕 Support explicit `candidate_variants` and `fallback_variants` when using uniform sampling.
🆕 Support the `input_audio` content block in the OpenAI-compatible inference endpoint.
🆕 Support the `input_audio` content block in the OpenAI, Azure, GCP Vertex Gemini, Google AI Studio, and OpenRouter model providers.
🆕 Add optional `filename` field for input files.
🆕 Move closer to parity between the GCP Vertex Anthropic model provider and the Anthropic model provider.
🆕 Expose new observability and dataset management endpoints as methods in the TensorZero Python SDK.
🆕 Add optional `postgres.enabled` field to the configuration.
🆕 Handle missing usage information from model providers that don't report it.
🆕 Add experimental method for searching inferences programmatically (`search_query_experimental`).
🆕 Add a native OpenRouter embedding model provider.
& multiple under-the-hood and UI improvements
https://lnkd.in/gpep2wAp
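The `input_audio` content block on the OpenAI-compatible endpoint follows the OpenAI chat message format. A minimal sketch, assuming that format carries over directly; the audio bytes are a placeholder.

```python
import base64

# Placeholder audio payload, base64-encoded as the OpenAI chat format expects.
audio_b64 = base64.b64encode(b"\x00\x01fake-wav-bytes").decode("ascii")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe this clip."},
            {
                # `input_audio` content block (OpenAI-compatible shape).
                "type": "input_audio",
                "input_audio": {"data": audio_b64, "format": "wav"},
            },
        ],
    }
]
```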
TensorZero 2025.11.3 is out!
🔨 Enable TLS support for Postgres connections.
🔨 Fix handling of user-defined tags in batch inference.
& multiple under-the-hood and UI improvements
https://lnkd.in/gKNcq5fJ
TensorZero 2025.11.2 is out!
🚨 [Breaking Change] Moving forward, the gateway will attempt any `fallback_variants` in order rather than randomly sample them.
🔨 Fix a bug that prevented some model inferences from being rendered correctly in the UI.
🔨 Handle non-image base64 file inputs consistently in the OpenAI-compatible inference endpoint.
🔨 Handle `raw_response` correctly for batch inference with GCP Vertex AI Gemini.
🆕 Apply the `tensorzero::api_key_public_id` tag to inference and feedback when using auth.
🆕 Add updated HTTP endpoint for creating datapoints (`POST /v1/datasets/{dataset_name}/datapoints`).
🆕 Add `gateway.global_outbound_http_timeout_ms` configuration setting.
& multiple under-the-hood and UI improvements (thanks omarraf!)
https://lnkd.in/gngFHpba
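The new timeout setting would be set in the gateway configuration. A minimal sketch, assuming the usual `[gateway]` TOML section; the value is illustrative.

```toml
[gateway]
# Cap every outbound HTTP request the gateway makes (value is illustrative).
global_outbound_http_timeout_ms = 30000
```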
TensorZero 2025.11.1 is out! 📌 This release improves compatibility with the OpenAI API spec, adds new endpoints for programmatic observability, and introduces rate limiting by API keys.
— Changelog
🔨 Fix a regression that prevented batch inferences from being rendered in the UI.
🔨 Handle missing Postgres credentials gracefully in the UI.
🆕 Support rate limiting by API key (`api_key_public_id`).
🆕 Add native `service_tier` inference parameter (supported providers: Anthropic, Azure, Groq, OpenAI). `extra_body` is no longer necessary.
🆕 Add native `detail` parameter for input images (supported providers: Azure, OpenAI, xAI). `extra_body` is no longer necessary.
🆕 Add updated HTTP endpoint for querying inferences by ID (`POST /v1/inferences/get_inferences`).
🆕 Add updated HTTP endpoint for querying inferences with filters (`POST /v1/inferences/list_inferences`).
& multiple under-the-hood and UI improvements
https://lnkd.in/gn5GFhqq
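A request to the new filtered-query endpoint might look like the following. This is a hypothetical sketch: the field names (`function_name`, `limit`, the tag-filter shape) are assumptions for illustration; consult the endpoint documentation for the real schema.

```python
import json

# Hypothetical body for `POST /v1/inferences/list_inferences`.
request_body = {
    "function_name": "extract_entities",  # hypothetical function name
    "limit": 20,
    "filters": {
        # Filter on the tag the gateway applies when auth is enabled.
        "type": "tag",
        "key": "tensorzero::api_key_public_id",
        "value": "pk-example",  # placeholder API key public ID
    },
}

# Serialize the body as it would be sent over HTTP.
payload = json.dumps(request_body)
```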
TensorZero 2025.11.0 is out! 📌 This release introduces two major new features: adaptive experimentation (A/B testing) and TensorZero API keys.
— Changelog
⚠️ Completed the planned deprecation of the configuration field `enable_template_filesystem_access` in favor of `template_filesystem_access.enabled`.
🔨 Handle the `global` region correctly for GCP Vertex Anthropic.
🔨 Fix `output` format for JSON functions in the new endpoint for updating datapoints (`PATCH /v1/{dataset_name}/datapoints`). The `output` field now matches the inference endpoint (an object with a `raw` field; `parsed` is ignored and recomputed internally).
🆕 Add automated experimentation feature (automated A/B testing).
🆕 Add authentication for the TensorZero Gateway (virtual API keys).
🆕 Add native inference parameters to enable reasoning for every supported model provider (`reasoning_effort` or `thinking_budget_tokens` depending on the provider). `extra_body` is no longer necessary.
🆕 Add native `verbosity` inference parameter. `extra_body` is no longer necessary.
🆕 Support token inputs in the embeddings endpoint.
🆕 Support input thought content blocks for GCP Vertex Anthropic.
🆕 Improve handling of JSON Schemas for GCP Vertex Gemini and Google AI Studio.
& multiple under-the-hood and UI improvements
https://lnkd.in/gz4tHhae
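The native reasoning parameters replace `extra_body` overrides. A minimal sketch of an inference request carrying them; the parameter names come from the changelog, but the surrounding request shape and function name are illustrative assumptions.

```python
# Hypothetical inference request using the native reasoning parameter.
inference_request = {
    "function_name": "my_function",  # hypothetical function name
    "input": {
        "messages": [{"role": "user", "content": "Prove it step by step."}]
    },
    "params": {
        "chat_completion": {
            # Which field applies depends on the provider:
            "reasoning_effort": "high",          # e.g. effort-based providers
            # "thinking_budget_tokens": 2048,    # e.g. budget-based providers
        }
    },
}
```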
🗞️ [Blog Post] Bandits in your LLM Gateway: Improve LLM Applications Faster with Adaptive Experimentation (A/B Testing)
• Experimentation (A/B testing) with production traffic is the most reliable way to identify the best prompts and models for your task, but traditional approaches have significant limitations: you must either fix the experiment length in advance (risking wasted data or inconclusive results) or repeatedly check for significance (inflating error rates through p-hacking).
• TensorZero now provides adaptive experimentation directly in its open-source LLM gateway. This multi-armed bandit algorithm overcomes the p-hacking problem, running experiments precisely until there's enough evidence to pick a winner while dynamically allocating LLM inference traffic for maximum efficiency.
• Across a diverse set of realistic and challenging environments, adaptive experimentation reduced the average time to correctly identify the best LLM variants (prompts, models, etc.) by 37% compared to simple A/B testing.
Read more: https://lnkd.in/gpWtYQ3D
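The multi-armed-bandit idea behind adaptive traffic allocation can be illustrated with a toy Thompson-sampling allocator over two variants with pass/fail feedback. This is a generic textbook sketch, not TensorZero's actual algorithm, and the variant names and counts are made up.

```python
import random

def pick_variant(stats):
    """Thompson sampling: draw from each variant's Beta posterior and
    route this request to the variant with the highest draw."""
    # stats maps variant -> [successes, failures]; Beta(s+1, f+1) is the
    # posterior under a uniform prior.
    draws = {v: random.betavariate(s + 1, f + 1) for v, (s, f) in stats.items()}
    return max(draws, key=draws.get)

random.seed(0)
stats = {"variant_a": [90, 10], "variant_b": [55, 45]}  # made-up feedback
picks = [pick_variant(stats) for _ in range(1000)]

# The variant with clearly better feedback should absorb most of the traffic.
share_a = picks.count("variant_a") / len(picks)
```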
Shuyang Li was previously a staff software engineer at Google focused on next-generation search infrastructure, LLM-based search, and many other specialized search products (local, travel, maps, etc.). Before that, he worked on ML/analytics products at Palantir and graduated summa cum laude from Notre Dame. Welcome to the team, Shuyang!
TensorZero 2025.10.9 is out! 📌 This release introduces a suite of dataset HTTP endpoints and UI editing for messages/content blocks, plus enhanced observability with new OpenTelemetry spans and dynamic OTLP attributes. It also includes a small breaking change to programmatic observability/dataset APIs and completes a Helm ingress deprecation.
🚨 Notice on 2025.10.8: We ran into a technical issue during the release process for 2025.10.8 that resulted in a broken build for the TensorZero Python SDK on PyPI. We've yanked that release and recommend upgrading to this version.
— Changelog
🚨 [Breaking Change] This release includes small breaking changes to the programmatic observability/dataset APIs (e.g. `list_datapoints`, `experimental_list_inferences`) and the underlying data schema. Moving forward, TensorZero will store and return the new format for text (`{"type": "text", "text": "..."}`), template (`{"type": "template", "name": "...", "arguments": { ... }}`), and file (`{"type": "file", "file_type": "...", ...}`) content blocks. Note: These changes do not affect the inference APIs or the legacy data stored in ClickHouse.
⚠️ [Completed Deprecation] The TensorZero Helm chart will no longer support the legacy gateway ingress. The `createLegacyIngress` variable was removed. Moving forward, the only supported gateway ingress is `tensorzero-gateway`.
🔨 Fix an issue that prevented comments from being rendered in the workflow evaluation UI.
🆕 Add HTTP endpoint for querying datapoints by ID (`POST /v1/datasets/get_datapoints`).
🆕 Add HTTP endpoint for querying datapoints with filters (`POST /v1/datasets/{dataset_name}/list_datapoints`).
🆕 Add HTTP endpoint for creating datapoints from inferences (`POST /v1/datasets/{dataset_id}/from_inferences`).
🆕 Add HTTP endpoint for updating datapoints (`PATCH /v1/{dataset_name}/datapoints`).
🆕 Add HTTP endpoint for updating datapoint metadata (`PATCH /v1/datasets/{dataset_name}/datapoints/metadata`).
🆕 Add HTTP endpoint for deleting datapoints (`DELETE /v1/datasets/{dataset_id}/datapoints`).
🆕 Add HTTP endpoint for deleting datasets (`DELETE /v1/datasets/{dataset_id}`).
🆕 Enable users to create, update, and delete messages and content blocks in the dataset editor in the UI.
🆕 Emit OpenTelemetry spans for rate limiting queries.
🆕 Add support for deployment service accounts in the Helm chart (thanks jinnovation!).
🆕 Add support for dynamic extra attributes for OTLP spans (`TensorZero-OTLP-Traces-Extra-Attribute-*`).
& multiple under-the-hood and UI improvements
https://lnkd.in/gaARGhY4
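The three new stored content-block shapes from the breaking-change note, written out as Python dicts. The field names come from the changelog; all values (template name, arguments, URL) are placeholders.

```python
# New text content-block format.
text_block = {"type": "text", "text": "Hello, world."}

# New template content-block format.
template_block = {
    "type": "template",
    "name": "greeting",                   # hypothetical template name
    "arguments": {"user_name": "Ada"},    # placeholder arguments
}

# New file content-block format, tagged with `file_type`.
file_block = {
    "type": "file",
    "file_type": "url",
    "url": "https://example.com/cat.png",  # placeholder URL
}
```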
TensorZero 2025.10.7 is out! 📌 This release introduces an API endpoint for batch datapoint updates and improves the handling of input files for multimodal inference.
— Changelog
🚨 [Breaking Change] The default value for `fetch_and_encode_input_files_before_inference` is changing from `true` to `false`. As a result, the gateway will no longer fetch input files before inference, but instead will fetch them in parallel with inference (for observability). In rare cases, this may cause the gateway to receive different input files than those received by model providers.
⚠️ [Planned Deprecation] Migrate file content blocks from untagged enums to tagged enums. Moving forward, you should provide a field `file_type` with a value of `"url"`, `"base64"`, or `"object_storage"`. Untagged enums are still accepted for backwards compatibility but will be deprecated in `2026.2+`.
⚠️ [Planned Deprecation] Rename the TensorZero Python SDK type `InferenceFilterTreeNode` to `InferenceFilter` for consistency with related types. Both types will be available as aliases until `2026.2+`.
🔨 Send a user agent when fetching input files to avoid restrictions from websites that require it (e.g. Wikimedia).
🆕 Add a new endpoint for batch datapoint updates (`PATCH /v1/{dataset_name}/datapoints`).
🆕 Expose thought summaries in the TensorZero Python SDK.
🆕 Add additional semantic tags when exporting traces using the OpenInference format (thanks jinnovation!).
& multiple under-the-hood and UI improvements
https://lnkd.in/gD4dDUqC
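The untagged-to-tagged migration for file content blocks amounts to adding an explicit `file_type` discriminant. A sketch of the before/after shapes; the URL is a placeholder, and the untagged shape is inferred from the tagged one described in the changelog.

```python
# Legacy untagged form: the variant ("url" vs. "base64" vs. "object_storage")
# must be inferred from which fields are present. Deprecated in 2026.2+.
legacy_file_block = {
    "type": "file",
    "url": "https://example.com/report.pdf",  # placeholder URL
}

# Tagged form: `file_type` names the variant explicitly.
tagged_file_block = {
    "type": "file",
    "file_type": "url",  # one of "url", "base64", "object_storage"
    "url": "https://example.com/report.pdf",
}
```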