NVIDIA MIG 설정하고 생성, 삭제할 수 있다.
Rocky-9.2
NVIDIA A100 80GB PCIe
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | 0 |
| N/A 43C P0 68W / 300W | 4MiB / 81920MiB | 24% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# nvidia-smi -i 0 -mig 1
# nvidia-smi –gpu-reset
GPU 00000000:03:00.0 was successfully reset.
All done.
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | On |
| N/A 44C P0 73W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+--------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+================================+===========+=======================|
| No MIG devices found |
+---------------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 7/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.20gb 15 4/4 19.50 No 14 1 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 3g.40gb 9 2/2 39.25 No 42 2 0 |
| 3 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 4g.40gb 5 1/1 39.25 No 56 2 0 |
| 4 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 7g.80gb 0 1/1 78.75 No 98 5 0 |
| 7 1 1 |
+-----------------------------------------------------------------------------+
# nvidia-smi mig -lgipp
GPU 0 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 15 Placements: {0,2,4,6}:2
GPU 0 Profile ID 14 Placements: {0,2,4}:2
GPU 0 Profile ID 9 Placements: {0,4}:4
GPU 0 Profile ID 5 Placement : {0}:4
GPU 0 Profile ID 0 Placement : {0}:8
MIG 프로필 ID로 생성
# nvidia-smi mig -cgi 15
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 6/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.20gb 15 3/4 19.50 No 14 1 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 3g.40gb 9 1/2 39.25 No 42 2 0 |
| 3 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 4g.40gb 5 1/1 39.25 No 56 2 0 |
| 4 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 7g.80gb 0 0/1 78.75 No 98 5 0 |
| 7 1 1 |
+-----------------------------------------------------------------------------+
MIG 이름으로 생성
# nvidia-smi mig -cgi 1g.10gb,”MIG 1g.10gb”
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -cci -gi 11
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
# nvidia-smi mig -cci -gi 12,6
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
# nvidia-smi mig -cgi 19 -C
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
# nvidia-smi mig -dci -ci 8 -gi 11
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 11
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
# nvidia-smi mig -igi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -dgi -gi 11
Successfully destroyed GPU instance ID 11 from GPU 0
# nvidia-smi mig -igi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
# nvidia-smi mig -dcl
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 7
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 12
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 6
# nvidia-smi mig -dgi
Successfully destroyed GPU instance ID 7 from GPU 0
Successfully destroyed GPU instance ID 12 from GPU 0
Successfully destroyed GPU instance ID 6 from GPU 0
# nvidia-smi mig -lci
No GPU Instances found: Not Found
# nvidia-smi mig -lgi
No GPU Instances found: Not Found
# nvidia-smi -i 0 -mig 0
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | 0 |
| N/A 43C P0 68W / 300W | 4MiB / 81920MiB | 24% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
1. 개요 Rocky Linux는 엔터프라이즈 환경에서 사용되는 RHEL(Red Hat Enterprise Linux)과 완전히 호환되는 오픈소스 Linux…
https://youtu.be/XwG4jBWakzQ 1. 개요 Supermicro IPMIView는 Supermicro에서 제공하는 IPMI (Intelligent Platform Management Interface) 기반의 통합 관리…
1. 개요 이 문서는 두 개의 NIC (enp5s0f0, enp5s0f1)를 bonding(active-backup) 방식으로 구성하고, 해당 bond 장치를 브리지(br0) 와 연결하여 KVM 가상머신에서…
1. 개요 KVM에서 NVIDIA GPU를 Passthrough 설정하여 VM에 할당할 때 RmInitAdapter failed 오류를 자주 접하게…
1. 개요 Proxmox에서 pGPU(Physical GPU)와 vGPU(Virtual GPU)를 동일한 서버에서 동시에 사용하는 방법을 정리합니다. 2. 버전…
1. 개요 Proxmox에서 vGPU를 설정하는 방법을 정리합니다. 2. 버전 Proxmox 8.2 3. vGPU란? vGPU(Virtual GPU)는…
댓글 보기