Labels: help wanted (Extra attention is needed), onnx (Related to ONNX or ONNXRuntime), question (Further information is requested)
Description
The question is how to free GPU memory.
triton-inference-server/onnxruntime_backend#103
When the model is deployed on a single GPU, I can enable real-time release of GPU memory with the configuration below, but when the model is deployed across multiple GPUs, I don't know what the parameter format should look like:
parameters {
  key: "memory.enable_memory_arena_shrinkage"
  value: { string_value: "gpu:3" }
}

instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 3 ]
  }
]
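For reference, a minimal sketch of what a multi-GPU variant might look like, assuming the arena-shrinkage value keeps the device:device_id pattern and accepts a ';'-separated list of arenas (as in ONNX Runtime's run-option documentation); the GPU ids 2 and 3 are placeholders for illustration, and this format should be confirmed against the backend docs:

# Hypothetical config.pbtxt for a model served on GPUs 2 and 3.
# Assumption: "memory.enable_memory_arena_shrinkage" accepts a
# ';'-separated list of <device>:<device_id> arenas, one entry per
# GPU that hosts an instance.
parameters {
  key: "memory.enable_memory_arena_shrinkage"
  value: { string_value: "gpu:2;gpu:3" }
}

instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 2, 3 ]   # one instance on each listed GPU
  }
]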