
Memory occupation on GPU  #1758

Description

@lodm94

Hi all,
I am seeing strange GPU memory usage.
The setup is the following, on a server with a T4 card:
I have a custom model (called net) that occupies 2081 MB. I want to instantiate a segmentation model on the same GPU, to run after my model. I chose two segmentation models from your model zoo:

  • segformer (expected memory occupation 2.6GB)
  • convnext (expected memory occupation 4.23GB)

I instantiate both models with the following lines of code:

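from mmseg.apis import inference_segmentor, init_segmentor

config_file = "path_to_config_file"
checkpoint_file = "path_to_ckpt"

# Build the segmentor from its config and load the checkpoint onto the GPU
model = init_segmentor(config_file, checkpoint_file, device='cuda:0')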

Then I run inference, as in the demo you provide, with:
result = inference_segmentor(model, img)
Now I have a critical issue: GPU memory. To evaluate the memory impact of the segmentation model, I set up a simple snippet that I monitor with nvidia-smi:

import numpy
from mmseg.apis import inference_segmentor, init_segmentor

net = my_model()  # my custom TRT model
config_file = "path_to_config_file"
checkpoint_file = "path_to_ckpt"
model = init_segmentor(config_file, checkpoint_file, device='cuda:0')

x = numpy.zeros((720, 1280, 3))  # I have to work with this resolution
for i in range(1000):
    print(i, flush=True, end="\r")
    res = net.infer(x)  # custom model first
    result = inference_segmentor(model, x)  # then the segmentation model
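
For reference, the nvidia-smi readings can also be collected from Python instead of being read off the terminal. This is only a minimal sketch: the helper names `parse_memory_used` and `gpu_memory_used_mib` are made up for illustration, and it assumes nvidia-smi is on the PATH. It reports the same per-GPU "memory used" figure (in MiB) that nvidia-smi shows.

```python
import subprocess


def parse_memory_used(output):
    """Parse nvidia-smi CSV output: one integer (MiB) per non-empty line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Return the used memory in MiB for each visible GPU (hypothetical helper).

    Assumes the nvidia-smi binary is available on the PATH.
    """
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_used(output)
```

Calling `gpu_memory_used_mib()` after each stage (instantiation, first inference, end of the loop) would give readings comparable to the ones listed in this post.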

If I only instantiate my model, it occupies 2081 MB. If I run inference with my model, the usage stays the same (this model runs on TensorRT).
For the segformer:

  • If I only instantiate the segmentation model, it occupies 1483 MB
  • If I run only the segmentation model, it occupies 3237 MB (!!)
  • If I instantiate both models, they occupy 2133 MB
  • If I run inference with both, they occupy 10543 MB (!!!)

For the convnext:

  • If I only instantiate the segmentation model, it occupies 1665 MB
  • If I run only the segmentation model, it occupies 4125 MB (!!)
  • If I instantiate both models, they occupy 2315 MB
  • If I run inference with both, they occupy 8907 MB (!!!)
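
To make the growth pattern explicit, the arithmetic on the readings above can be sketched as follows. All numbers are copied from this post; the dictionary layout and the `summarize` helper are only illustrative names. It assumes, as observed, that my own model stays at 2081 MB during inference.

```python
# Readings in MB, copied from the two lists above.
OWN_MODEL = 2081  # custom model; unchanged between instantiation and inference

readings = {
    "segformer": {"init": 1483, "run": 3237, "both_init": 2133, "both_run": 10543},
    "convnext": {"init": 1665, "run": 4125, "both_init": 2315, "both_run": 8907},
}


def summarize(r, own=OWN_MODEL):
    """Compare the combined reading with the sum of the separate readings."""
    separate_total = own + r["run"]
    return {
        # extra memory taken by inference over plain instantiation
        "inference_overhead": r["run"] - r["init"],
        # what I would expect if the two models simply added up
        "sum_of_separate_runs": separate_total,
        # how far the combined run exceeds that expectation
        "excess_when_combined": r["both_run"] - separate_total,
    }


for name, r in readings.items():
    print(name, summarize(r))
```

For the segformer this gives an inference overhead of 1754 MB and an excess of 5225 MB over the sum of the separate runs (5318 MB); for the convnext, 2460 MB and 2701 MB (over 6206 MB).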

I don't understand this growth in memory usage and, most importantly, I cannot reproduce the memory figures listed on the config pages of the models above.

Is there another way to instantiate the models and run inference?
Why is there such a large difference in memory between only instantiating a segmentation model and running inference with it?
Why do the models occupy less memory when run separately than when run together?

Thanks in advance

P.S.: I tried this not only on the T4, but also on V100, A30 and RTX A6000 cards, and the memory usage is NEVER the same across cards. If you think this is related to the problem, I can add more information about it. For now I just want to fix the issue on the T4.
