Memory occupation on GPU 

Hi all,
I am seeing strange behavior in GPU memory usage.
The situation is the following, on a server with a T4 card:
I have a custom model (called _net_) that occupies 2081 MB. I want to instantiate a segmentation model on the same GPU, to run after my model. I chose two segmentation models from your model zoo:
- [segformer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/segformer/segformer_mit-b1_512x512_160k_ade20k.py) (expected memory occupation 2.6GB)
- [convnext](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py) (expected memory occupation 4.23GB)

I instantiate both models with the following lines of code:
```
from mmseg.apis import inference_segmentor, init_segmentor

config_file = "path_to_config_file"
checkpoint_file = "path_to_ckpt"

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
```

Then I run inference, as in the demo you provided, with:
`result = inference_segmentor(model, img)`

Now I have a critical issue: GPU memory. To evaluate the memory impact of the segmentation model, I set up a simple snippet and monitor it with nvidia-smi:

```
import numpy
from mmseg.apis import inference_segmentor, init_segmentor

net = my_model()  # custom TensorRT model
config_file = "path_to_config_file"
checkpoint_file = "path_to_ckpt"
model = init_segmentor(config_file, checkpoint_file, device='cuda:0')

x = numpy.zeros((720, 1280, 3))  # I have to work with this resolution
for i in range(1000):
    print(i, flush=True, end="\r")
    res = net.infer(x)  # call infer on the instance created above
    result = inference_segmentor(model, x)
```
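For reference, the nvidia-smi readings could also be collected programmatically rather than read off the terminal. This is only a minimal sketch, assuming `nvidia-smi` is on the PATH; `parse_memory_used` and `query_memory_used` are hypothetical helper names of mine, not part of mmsegmentation.

```python
import subprocess


def parse_memory_used(csv_text):
    """Turn the output of the query below into a list of integers, one per GPU."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_memory_used():
    """Return the memory currently in use on every visible GPU (requires nvidia-smi)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_used(out)
```

Calling `query_memory_used()` once after instantiation and once inside the loop would give the two kinds of readings I report below.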
If I just instantiate my model, it occupies 2081 MB. If I run inference with my model, the usage stays the same (this model runs on TRT).
For the segformer:

- If I only instantiate the segmentation model, it occupies 1483 MB
- If I run only the segmentation model, it occupies 3237 MB (!!)
- If I instantiate both models, they occupy 2133 MB
- If I run inference with both, they occupy 10543 MB (!!!)

For the convnext:

- If I only instantiate the segmentation model, it occupies 1665 MB
- If I run only the segmentation model, it occupies 4125 MB (!!)
- If I instantiate both models, they occupy 2315 MB
- If I run inference with both, they occupy 8907 MB (!!!)
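To make the pattern in these readings easier to see, here is the arithmetic as a small script. It only uses the numbers listed above; `summarize` is a hypothetical helper name.

```python
# nvidia-smi readings on the T4, in MB, copied from the lists above.
CUSTOM_MODEL = 2081  # custom TRT model, same before and during inference

readings = {
    "segformer": {"init": 1483, "run": 3237, "both_init": 2133, "both_run": 10543},
    "convnext": {"init": 1665, "run": 4125, "both_init": 2315, "both_run": 8907},
}


def summarize(r, custom=CUSTOM_MODEL):
    """Return (growth from init to inference, excess of the joint run over the separate runs)."""
    inference_growth = r["run"] - r["init"]
    joint_excess = r["both_run"] - (custom + r["run"])
    return inference_growth, joint_excess


for name, r in readings.items():
    growth, excess = summarize(r)
    print(f"{name}: +{growth} MB when running inference, "
          f"{excess} MB more than the two models running separately")
```

So inference alone adds 1754 MB (segformer) and 2460 MB (convnext) on top of instantiation, and running both models together costs 5225 MB and 2701 MB more than the sum of the two separate runs.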

I don't understand this growth pattern in memory usage and, most importantly, I cannot reproduce the memory figures listed on the config pages of the models above.

Is there another way to instantiate the models and run inference?
Why is there such a large difference in memory between just instantiating a segmentation model and running inference with it?
Why do the models occupy less memory when run separately than when run together?
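One way to narrow this down might be to log PyTorch's own allocator counters next to the nvidia-smi value, since nvidia-smi also counts the CUDA context and memory that PyTorch's caching allocator has reserved but is not currently using. The sketch below is an assumption about how that could be done, not something I have run on the setup above; `cuda_memory_report` is a hypothetical helper, and the custom TRT model would presumably not show up in these counters because it does not allocate through PyTorch.

```python
try:
    import torch  # assumption: the PyTorch build that mmsegmentation runs on
except ImportError:
    torch = None

MB = 1024 ** 2


def cuda_memory_report(device=0):
    """Return PyTorch's allocator counters in MB, or None if torch or CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return None
    return {
        # memory held by live tensors right now
        "allocated": torch.cuda.memory_allocated(device) / MB,
        # highest value "allocated" has reached so far
        "peak_allocated": torch.cuda.max_memory_allocated(device) / MB,
        # memory the caching allocator keeps from the driver, used or not
        "reserved": torch.cuda.memory_reserved(device) / MB,
    }
```

Printing `cuda_memory_report()` right after `init_segmentor` and again after a few iterations of the loop should show whether the growth comes from live tensors or from cached blocks.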

Thanks in advance

P.S.: I tried this not only on the T4 but also on V100, A30 and RTX A6000, and the memory usage is NEVER the same across cards. If you think it is related to this problem, I can add more information about it. For now I just want to fix this issue on the T4.

Memory occupation on GPU #1758

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Memory occupation on GPU #1758

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions