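1. Overview
This post shows how to enable NVIDIA MIG (Multi-Instance GPU), then create and delete GPU instances (GI) and compute instances (CI).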
2. Versions and specifications
Rocky-9.2
NVIDIA A100 80GB PCIe
3. Reference links
3-1. [Rocky] What is NVIDIA MIG (Multi-Instance GPU)? (1)
3-2. [Rocky] Installing the NVIDIA graphics driver
4. MIG
4-1. Enabling MIG
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | 0 |
| N/A 43C P0 68W / 300W | 4MiB / 81920MiB | 24% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# nvidia-smi -i 0 -mig 1
# nvidia-smi --gpu-reset
GPU 00000000:03:00.0 was successfully reset.
All done.
4-2. Verifying the MIG setting
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | On |
| N/A 44C P0 73W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+--------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+================================+===========+=======================|
| No MIG devices found |
+---------------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
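The table above is easy to read but awkward to check from a script. A minimal sketch, assuming the `mig.mode.current` and `mig.mode.pending` query fields (listed by `nvidia-smi --help-query-gpu`) and assuming their CSV output looks like `Enabled, Enabled`. The helper name `mig_mode_status` is hypothetical, and the sample strings below are illustrative, not captured output.

```shell
#!/bin/sh
# Hypothetical helper: interpret one line that is assumed to come from
#   nvidia-smi -i 0 --query-gpu=mig.mode.current,mig.mode.pending --format=csv,noheader
# e.g. "Enabled, Enabled" (assumed format: two values separated by ", ").
mig_mode_status() (
    current=${1%%,*}        # text before the first comma
    pending=${1#*,}         # text after the first comma
    pending=${pending# }    # drop the single leading space
    if [ "$current" = "$pending" ]; then
        echo "MIG mode: $current"
    else
        # The pending value is the one requested with "-mig 0|1" but not yet applied.
        echo "MIG mode: $current (pending: $pending)"
    fi
)

# Illustrative sample values only.
mig_mode_status "Enabled, Enabled"
mig_mode_status "Disabled, Enabled"
```

On a real host the argument would be the command substitution of the `nvidia-smi` query shown in the comment. A pending value that differs from the current one suggests the GPU reset from 4-1 has not taken effect yet.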
4-3. Listing MIG profiles
# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 7/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.20gb 15 4/4 19.50 No 14 1 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 3g.40gb 9 2/2 39.25 No 42 2 0 |
| 3 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 4g.40gb 5 1/1 39.25 No 56 2 0 |
| 4 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 7g.80gb 0 1/1 78.75 No 98 5 0 |
| 7 1 1 |
+-----------------------------------------------------------------------------+
# nvidia-smi mig -lgipp
GPU 0 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 15 Placements: {0,2,4,6}:2
GPU 0 Profile ID 14 Placements: {0,2,4}:2
GPU 0 Profile ID 9 Placements: {0,4}:4
GPU 0 Profile ID 5 Placement : {0}:4
GPU 0 Profile ID 0 Placement : {0}:8
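The `-lgipp` lines pack several placements into one `{start,start,...}:size` expression. The sketch below expands each line into one `start`/`size` pair per possible placement, which makes it easier to compare against the `Start:Size` column of `-lgi`. The function name `expand_placements` is hypothetical, and the parser assumes the line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: expand "nvidia-smi mig -lgipp" lines read from stdin.
# Assumes each line contains "ID <profile-id>" and a "{a,b,c}:<size>" field.
expand_placements() {
    awk '{
        id = ""; spec = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "ID") id = $(i + 1)                      # profile ID follows "ID"
            if ($i ~ /^[{][0-9,]+[}]:[0-9]+$/) spec = $i      # e.g. {0,2,4,6}:2
        }
        if (id == "" || spec == "") next
        split(spec, parts, ":")          # parts[1] = "{0,2,4,6}", parts[2] = "2"
        gsub(/[{}]/, "", parts[1])
        n = split(parts[1], starts, ",")
        for (j = 1; j <= n; j++)
            printf "profile=%s start=%s size=%s\n", id, starts[j], parts[2]
    }'
}

# Two lines copied from the output above.
expand_placements <<'EOF'
GPU 0 Profile ID 15 Placements: {0,2,4,6}:2
GPU 0 Profile ID 5 Placement : {0}:4
EOF
```

On a real host the input would be `nvidia-smi mig -lgipp | expand_placements`. For profile 15 the expansion includes `start=6 size=2`, which is the `6:2` placement the GI receives in 5-2.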
5. Creating a MIG GI (GPU Instance)
5-1. GI (GPU Instance) creation, method 1
Create the GI by MIG profile ID.
# nvidia-smi mig -cgi 15
5-2. Verifying GI (GPU Instance) creation
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
5-3. Checking how many more GIs can be created
# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
| 0 MIG 1g.10gb 19 6/7 9.50 No 14 0 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.10gb+me 20 1/1 9.50 No 14 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 0 MIG 1g.20gb 15 3/4 19.50 No 14 1 0 |
| 1 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 2g.20gb 14 3/3 19.50 No 28 1 0 |
| 2 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 3g.40gb 9 1/2 39.25 No 42 2 0 |
| 3 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 4g.40gb 5 1/1 39.25 No 56 2 0 |
| 4 0 0 |
+-----------------------------------------------------------------------------+
| 0 MIG 7g.80gb 0 0/1 78.75 No 98 5 0 |
| 7 1 1 |
+-----------------------------------------------------------------------------+
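The `Instances Free/Total` column is the part of this table that changes after each `-cgi`. A small sketch that pulls only that column out, keyed by profile ID. The function name `profile_free_slots` is hypothetical, and the parser assumes the row layout shown above, where the `free/total` field directly follows the profile ID.

```shell
#!/bin/sh
# Hypothetical helper: read "nvidia-smi mig -lgip" output on stdin and print
# "<profile-id> <free> <total>" for every profile row.
profile_free_slots() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+\/[0-9]+$/) {     # the Free/Total field, e.g. 3/4
                split($i, ft, "/")
                print $(i - 1), ft[1], ft[2]   # the profile ID is the field before it
            }
        }
    }'
}

# Rows copied from the table above; the second line of each row is skipped.
profile_free_slots <<'EOF'
| 0 MIG 1g.10gb 19 6/7 9.50 No 14 0 0 |
| 1 0 0 |
| 0 MIG 1g.20gb 15 3/4 19.50 No 14 1 0 |
| 0 MIG 7g.80gb 0 0/1 78.75 No 98 5 0 |
EOF
```

On a real host this would be `nvidia-smi mig -lgip | profile_free_slots`. A profile whose free count is 0, such as profile 0 (7g.80gb) here, can no longer be created until existing GIs are destroyed.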
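5-4. GI (GPU Instance) creation, methods 2 and 3
Create the GI by MIG profile name, using either the short name (1g.10gb) or the full name ("MIG 1g.10gb").
# nvidia-smi mig -cgi 1g.10gb,"MIG 1g.10gb"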
5-5. Verifying GI (GPU Instance) creation
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
6. Creating a MIG CI (Compute Instance)
6-1. Listing GIs (GPU Instances)
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
6-2. Creating a CI (Compute Instance) on a single GI
# nvidia-smi mig -cci -gi 11
6-3. Verifying CI (Compute Instance) creation
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
6-4. Creating CIs (Compute Instances) on several GIs at once
# nvidia-smi mig -cci -gi 12,6
6-5. Verifying CI (Compute Instance) creation
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
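The GI IDs passed to `-cci -gi` in this section were read off the `-lgi` table by hand. A minimal sketch that collects them into the comma-separated form used in 6-4. The function name `list_gi_ids` is hypothetical, and the parser assumes the row layout shown in 6-1, where the instance ID is the field just before `Start:Size`.

```shell
#!/bin/sh
# Hypothetical helper: read "nvidia-smi mig -lgi" output on stdin and print
# the GPU instance IDs as one comma-separated list.
list_gi_ids() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-9]+:[0-9]+$/)    # the placement field, e.g. 4:1
                print $(i - 1)             # the instance ID is the field before it
    }' | paste -sd, -
}

# Rows copied from the table in 6-1.
ids=$(list_gi_ids <<'EOF'
| 0 MIG 1g.10gb 19 11 4:1 |
| 0 MIG 1g.10gb 19 12 5:1 |
| 0 MIG 1g.20gb 15 6 6:2 |
EOF
)

# Only print the command; nothing is executed in this sketch.
echo "nvidia-smi mig -cci -gi $ids"
```

On a real host the list would come from `nvidia-smi mig -lgi | list_gi_ids`. The sketch only prints the resulting command so that it can be reviewed before running it.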
7. Creating a MIG GI (GPU Instance) and CI (Compute Instance) together
7-1. Creating a GI (GPU Instance) and its CI (Compute Instance) in one command
# nvidia-smi mig -cgi 19 -C
7-2. Verifying GI (GPU Instance) creation
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
7-3. Verifying CI (Compute Instance) creation
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
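Since `-cgi` accepts a comma-separated list (as in 5-4) and `-C` creates the matching CIs, a whole layout can be written as one command. A small sketch that builds that command line from a list of profile IDs or names. The function name `build_cgi_command` is hypothetical, and the sketch only prints the command; it does not check whether the profiles actually fit on the GPU.

```shell
#!/bin/sh
# Hypothetical helper: join the given profile IDs/names with commas and print
# the corresponding "nvidia-smi mig -cgi ... -C" command line.
# The function body runs in a subshell so the IFS change does not leak out.
build_cgi_command() (
    IFS=,
    echo "nvidia-smi mig -cgi $* -C"
)

# The layout built step by step in sections 5 to 7: one 1g.20gb and three 1g.10gb.
build_cgi_command 15 19 19 19
```

Whether a given combination fits depends on the free counts from `-lgip` and the placements from `-lgipp`, so those are worth checking before running the printed command.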
8. Deleting a MIG CI (Compute Instance)
8-1. Listing CIs (Compute Instances)
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
8-2. Deleting a CI (Compute Instance)
# nvidia-smi mig -dci -ci 0 -gi 11
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 11
8-3. Verifying CI (Compute Instance) deletion
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 0 7 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 12 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 0 6 MIG 1g.20gb 0 0 0:1 |
+--------------------------------------------------------------------+
9. Deleting a MIG GI (GPU Instance)
9-1. Listing GIs (GPU Instances)
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
9-2. Deleting a GI (GPU Instance)
# nvidia-smi mig -dgi -gi 11
Successfully destroyed GPU instance ID 11 from GPU 0
9-3. Verifying GI (GPU Instance) deletion
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 0 MIG 1g.10gb 19 7 0:1 |
+-------------------------------------------------------+
| 0 MIG 1g.10gb 19 12 5:1 |
+-------------------------------------------------------+
| 0 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
10. Deleting all GIs (GPU Instances) and CIs (Compute Instances)
10-1. Deleting all CIs (Compute Instances)
# nvidia-smi mig -dci
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 7
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 12
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 6
10-2. Deleting all GIs (GPU Instances)
# nvidia-smi mig -dgi
Successfully destroyed GPU instance ID 7 from GPU 0
Successfully destroyed GPU instance ID 12 from GPU 0
Successfully destroyed GPU instance ID 6 from GPU 0
10-3. Listing CIs (Compute Instances)
# nvidia-smi mig -lci
No GPU Instances found: Not Found
10-4. Listing GIs (GPU Instances)
# nvidia-smi mig -lgi
No GPU Instances found: Not Found
11. MIG
11-1. Disabling MIG
# nvidia-smi -i 0 -mig 0
11-2. Verifying that MIG is disabled
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10 Driver Version: 535.86.10 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:03:00.0 Off | 0 |
| N/A 43C P0 68W / 300W | 4MiB / 81920MiB | 24% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
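Sections 10 and 11 follow a fixed order: CIs first, then GIs, then MIG mode itself. The sketch below records that order in one function with a dry-run default, so the commands can be reviewed before anything is destroyed. The names `run`, `mig_teardown` and the `DRY_RUN` variable are hypothetical, and the sketch has no error handling.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN is 1 (the default),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: tear down MIG on one GPU in the order used in this post.
mig_teardown() {
    gpu=${1:-0}
    run nvidia-smi mig -dci              # 10-1: destroy all compute instances
    run nvidia-smi mig -dgi              # 10-2: destroy all GPU instances
    run nvidia-smi -i "$gpu" -mig 0      # 11-1: disable MIG mode
}

# Dry run: only prints the three commands.
mig_teardown 0
```

Running it as `DRY_RUN=0 mig_teardown 0` would execute the commands for real, which requires root as in the rest of this post.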