Cluster Toolkit release notes

This page documents production updates to Cluster Toolkit. Check this page for announcements about new or updated features, bug fixes, known issues, and deprecated functionality.

You can see the latest product updates for all of Google Cloud on the Google Cloud page, browse and filter all release notes in the Google Cloud console, or programmatically access release notes in BigQuery.

November 19, 2025

v1.73.0
Feature

Cluster Toolkit version v1.73.0 is available. This release adds support for the GKE Inference Gateway and adds a new blueprint for A3 High machines that automates the process of building a custom image with a TCPx-patched kernel for enhanced network performance. This version also includes an initial blueprint for G4 machine types and parameterizes the gIB NCCL RDMA plugin installer in the gke-a4x.yaml blueprint. For more information about this release, see the Release announcement on GitHub.

November 13, 2025

v1.72.0
Feature

Cluster Toolkit version v1.72.0 is available. This release adds support for Google Cloud Managed Lustre as an optional storage solution for the gke-tpu-v6-advanced blueprint. This release also adds four example blueprints to support the deployment of Sycomp storage. In addition, this release makes improvements to the gke-node-pool module and a4xhigh-slurm-blueprint.yaml blueprint. For more information about this release, see the Release announcement on GitHub.

November 03, 2025

v1.71.0
Feature

Cluster Toolkit version v1.71.0 is available. This release fixes a munge mount failure on login nodes caused by slow controller setup in Slurm v6, and adds Managed Lustre support to the gke-a4x blueprint. For more information about this release, see the Release announcement on GitHub.

October 24, 2025

v1.70.0
Feature

Cluster Toolkit version v1.70.0 is available. This release adds automated TPU support and Cloud Storage FUSE mounts in the TPU v6 blueprint and refactors the H4D blueprint. This version also includes breaking changes, such as removing support for the maintenance_interval field for reservations created by Technical Account Managers (TAMs) and migrating JobSet from static manifests to a Helm chart. For a complete list of changes, see the Release announcement on GitHub.

October 21, 2025

v1.69.0
Feature

Cluster Toolkit version v1.69.0 is available. This release adds NUMA-aware scheduling in GKE clusters for G4 machines and adds a new module that provides mount scripts for WEKA filesystems. This version also includes private service access (PSA) updates and adds a GKE sample for running the nvidia-bug-report shell script. For details, see the Release announcement on GitHub.

October 10, 2025

v1.68.0
Feature

Cluster Toolkit version v1.68.0 is available. This release lets you download the NVIDIA Collective Communications Library (NCCL) software packages libnccl2 and libnccl-dev for A3U and A4H machine types. For more information about this release, see the Release announcement on GitHub.

This release supports the generally available, open-source IBM Spectrum Symphony HostFactory connectors for Google Compute Engine and Google Kubernetes Engine, which you can deploy through Cluster Toolkit to extend your on-premises cluster or run entirely within Google Cloud. For more information, see Run IBM Spectrum Symphony workloads.

September 19, 2025

v1.67.0
Feature

Cluster Toolkit version v1.67.0 is available. This release adds support for aarch64-based architecture. For more information about this release, see the Release announcement on GitHub.

September 15, 2025

v1.66.0
Feature

Cluster Toolkit version v1.66.0 is available. This release lets you use Cloud Storage FUSE for H4D machine types and sets the default cluster availability to zonal. For more information about this release, see the Release announcement on GitHub.

September 09, 2025

v1.65.0
Feature

Cluster Toolkit version v1.65.0 is available. This release expands support for Managed Lustre on A4X instances and provides an improved GPU network wait solution for A-family machine types. This version also deprecates Debian-based blueprints for A3 Mega GPUs. For a complete list of changes, see the Release announcement on GitHub.

September 01, 2025

v1.64.0
Feature

Cluster Toolkit version v1.64.0 is available. This release integrates GKE Managed Lustre, which provides high-performance, scalable storage for your GKE clusters. The storage for A3 Ultra machine types now uses basic SSD for improved performance. This version also improves support for alternative services for private service access. For details about these changes and other updates, see the Release announcement on GitHub.

August 26, 2025

v1.63.0
Feature

Cluster Toolkit version v1.63.0 is available. This release upgrades Slurm image versions to the 6-11 iteration, for example, slurm-gcp-6-11-hpc-rocky-linux-8. This version migrates Dynamic Workload Scheduler (DWS) Flex-start to regional managed instance groups (MIGs). This version also includes breaking changes, for example, updating the file storage for A3 Ultra machine types to basic HDD. For a complete list of new features, improvements, and bug fixes, see the Release announcement on GitHub.

August 14, 2025

v1.62.0
Feature

Cluster Toolkit version v1.62.0 is available. This release adds new blueprints for A4X instances and adds a community scheduler module for Slinky (Slurm on Kubernetes). For details, see the Release announcement on GitHub.

August 04, 2025

v1.61.0
Change

Cluster Toolkit version v1.61.0 is available. This release adds a namespace for GKE modules and optimizes Cloud Storage FUSE configurations for A3 Ultra and A4 blueprints. For details, see the Release announcement on GitHub.

July 22, 2025

v1.59.0
Change

Cluster Toolkit version v1.59.0 is available. This release adds a configurable number of IP addresses per NAT for A-family blueprints and fixes an issue with additional disks for login nodes. For details, see the Release announcement on GitHub.

July 15, 2025

v1.58.0
Feature

Cluster Toolkit version v1.58.0 is available. This release adds a new blueprint for deploying GKE clusters with H4D instances and deprecates blueprints for deploying Parallelstore. For details, see the Release announcement on GitHub.

June 30, 2025

v1.57.0
Feature

Cluster Toolkit version v1.57.0 is available. This release integrates Cluster Health Scripts (CHS) with GKE blueprints for A3 Mega, A3 Ultra, and A4 instances. For details, see the Release announcement on GitHub.

June 23, 2025

v1.56.0
Change

Cluster Toolkit version v1.56.0 is available. This release improves SlurmGCP Resume functionality and includes several bug fixes. For details, see the Release announcement on GitHub.

June 16, 2025

v1.55.0
Feature

Cluster Toolkit version v1.55.0 is available. This release adds a new blueprint for a high-throughput AlphaFold 3 execution environment and aligns Cloud Storage FUSE configurations with best practices. For details, see the Release announcement on GitHub.

June 10, 2025

v1.54.0
Change

Cluster Toolkit version v1.54.0 is available. This release adds Managed Lustre support for non-default ports in GKE and adds a network blocking script for A3 High instances. For details, see the Release announcement on GitHub.

June 05, 2025

v1.53.0
Change

Cluster Toolkit version v1.53.0 is available. This release standardizes the A3 Mega Slurm solution on Ubuntu and adds a new GKE blueprint for A4X instances. For details, see the Release announcement on GitHub.

May 22, 2025

v1.52.0
Breaking

Cluster Toolkit version v1.52.0 is available. This release improves Cloud SQL support with database flags and query insights, and includes several bug fixes. For details, see the Release announcement on GitHub.

May 13, 2025

v1.51.0
Feature

Cluster Toolkit version v1.51.0 is available. This release adds GPU health-check epilogs for A3 High and A3 Mega Slurm blueprints and adds a new GKE TPU v6e example. For details, see the Release announcement on GitHub.

May 05, 2025

v1.50.0
Feature

Cluster Toolkit version v1.50.0 is available. This release adds new blueprints for Managed Lustre attached to VMs and Slurm clusters. For details, see the Release announcement on GitHub.

April 24, 2025

v1.49.0
Feature

Cluster Toolkit version v1.49.0 is available. This release adds TPU support to the GKE node pool module and adds support for Managed Lustre. For details, see the Release announcement on GitHub.

April 01, 2025

v1.48.0
Feature

Cluster Toolkit version v1.48.0 is available. This release updates the GKE node pool module to support multiple node pools and adds automatic GPU health checks for Slurm. For details, see the Release announcement on GitHub.

February 27, 2025

v1.47.0
Feature

Cluster Toolkit version v1.47.0 is available. This release adds support for the A4 machine family in GKE and Slurm blueprints and adds Dynamic Workload Scheduler Flex support for GKE. For details, see the Release announcement on GitHub.

February 07, 2025

v1.46.0
Feature

Cluster Toolkit version v1.46.0 is available. This release officially supports Kueue as the workload scheduler for A3U and adds new blueprints for A3U, H4D VMs, and Slurm. For details, see the Release announcement on GitHub.

January 15, 2025

v1.45.0
Change

Cluster Toolkit version v1.45.0 is available. This release updates A3 Ultra GKE blueprints to newer versions of Kueue and JobSet, and adds support for cluster deletion protection. For details, see the Release announcement on GitHub.

For information about previous releases of Cluster Toolkit, see the Announcements page on GitHub.