Installation Issues with sglang==0.4.6post1 on GH200 (ARM/NGC PyTorch)
I am attempting to install a modified version of sglang==0.4.6post1 from eric-ai-lab/Soft-Thinking on a server node equipped with 4x NVIDIA GH200 GPUs (aarch64/ARM CPU).
For optimal performance, I am working within the recommended NVIDIA NGC PyTorch 25.06 container (nvcr.io/nvidia/pytorch:25.06-py3). This environment, with its pre-compiled CUDA/PyTorch stack, causes several dependency and build failures when I attempt to install the package from source.
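For context, this is roughly how I launch the container; the mounts and resource flags below are an approximation of my cluster setup rather than the literal command:

```bash
# Approximate launch command -- mounts/paths are specific to my cluster
docker run --gpus all -it --rm \
    --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -v "$PWD":/workspace \
    nvcr.io/nvidia/pytorch:25.06-py3 bash
```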
Problems
1. Failure to Build sgl-kernel==0.1.0 from Source
The required dependency sgl-kernel==0.1.0 lacks a pre-compiled wheel for the aarch64 (ARM) architecture. Attempting to build it from source with make build consistently fails: compilation errors are emitted for almost every file, producing a flood of error messages, and the build process eventually brings down the compute node. I attempted to limit parallelism with MAX_JOBS=2, but the issue persists.
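The exact invocation (with the parallelism cap mentioned above) is essentially:

```bash
cd sglang_soft_thinking_pkg/sgl-kernel
# Cap parallel compilation jobs; the build still fails and eventually takes the node down
MAX_JOBS=2 make build
```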
2. flashinfer-python Dependency Conflict
The dependency chain requires flashinfer-python, which, in turn, requires nvidia-cudnn-frontend>=1.13.0. The NGC container is pre-installed with an older version: nvidia-cudnn-frontend==1.12.0. I am unable to upgrade or reinstall a newer version of this package due to the constraints of the controlled NGC environment.
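For reference, this is how the conflict shows up inside the container; the pip commands are standard, and the version numbers in the comments are simply the ones described above:

```bash
# Inside the running NGC container: confirm the preinstalled frontend version
pip show nvidia-cudnn-frontend    # reports Version: 1.12.0 in nvcr.io/nvidia/pytorch:25.06-py3
# Installing flashinfer-python makes pip try to pull nvidia-cudnn-frontend>=1.13.0,
# which I cannot upgrade in this managed environment
pip install flashinfer-python
```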
Questions
I would greatly appreciate any insights or assistance on the following points:
NGC/GH200 Installation Experience: Has anyone successfully installed sglang (or a similar high-performance kernel library) from source within an NVIDIA NGC PyTorch container on an ARM-based GH200 node? I'm specifically looking for environment configuration tips or known workarounds for this setup.
sgl-kernel Version Compatibility: Since sgl-kernel>=0.3.12 appears to offer official aarch64 wheels, would it be possible to force the modified sglang==0.4.6post1 to use one of these newer versions? What modifications would be needed in the sglang source to support the newer sgl-kernel API? (A rough sketch of what I mean is included after these questions.)
flashinfer-python Conflict Resolution: Is there a known way to install flashinfer-python while bypassing or resolving the hard dependency check on nvidia-cudnn-frontend>=1.13.0? Alternatively, is there a compatible version of flashinfer-python that works with the pre-installed 1.12.0 version in the container? (Again, see the sketch below.)
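To make the last two questions concrete, these are the kinds of untested workaround sketches I have in mind. The pip flags are standard, but the file named in the first comment is an assumption on my part, and whether either approach is actually viable is exactly what I am asking:

```bash
# sgl-kernel question: install a newer release that ships aarch64 wheels, then relax the
# version pin in the sglang source (which file holds the pin, e.g. python/pyproject.toml,
# is my assumption; API differences would still need patching)
pip install "sgl-kernel>=0.3.12"

# flashinfer-python question: install without letting pip touch the preinstalled
# nvidia-cudnn-frontend==1.12.0, accepting that runtime compatibility is unverified
pip install flashinfer-python --no-deps
```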
Thank you for your help in advance!
Reproduction steps

cd sglang_soft_thinking_pkg/sgl-kernel
make build

Packages preinstalled in the NGC container: