vgpu: GPU memory allocations get mixed up when pods are scheduled concurrently #60

Description

@singeleaf

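Running the command below schedules two pods at the same time, one requesting 24576M of GPU memory and the other 600M. Once the pods are up, running nvidia-smi inside each container shows that the two allocations are swapped: container ubuntu-container-24576 was given 600M of GPU memory, and container ubuntu-container-600 was given 24576M.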

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-24576-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-24576
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 24576
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-600-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-600
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 600
EOF
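
To make the swap easier to confirm, the nvidia-smi check can be scripted. The following is only a sketch: the helper names `mem_mib`, `check_mem`, and `reported_mem` are hypothetical, and it assumes each container sees a single GPU and that the total reported by nvidia-smi equals the requested `volcano.sh/vgpu-memory` value exactly.

```shell
#!/usr/bin/env bash
# Sketch: compare the requested vGPU memory with what nvidia-smi reports.
# Helper names are hypothetical; pod names and limits come from the manifest above.

# Extract the leading integer from a value such as "24576 MiB" or "24576".
mem_mib() {
  printf '%s\n' "$1" | awk '{print $1 + 0}'
}

# Print OK or MISMATCH for one pod: check_mem <name> <requested MiB> <reported value>
check_mem() {
  local name="$1" requested="$2" reported
  reported="$(mem_mib "$3")"
  if [ "$reported" -eq "$requested" ]; then
    echo "OK $name: ${reported} MiB"
  else
    echo "MISMATCH $name: requested ${requested} MiB, reported ${reported} MiB"
  fi
}

# Query the GPU memory total visible inside a pod (needs access to the cluster).
reported_mem() {
  kubectl exec "$1" -- nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
}

# Usage against a live cluster:
#   check_mem gpu-pod-1v-24576-1 24576 "$(reported_mem gpu-pod-1v-24576-1)"
#   check_mem gpu-pod-1v-600-1   600   "$(reported_mem gpu-pod-1v-600-1)"
```

If the behaviour described above reproduces, both calls would report a mismatch, each pod showing the value requested by the other.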
