
[Rocky] NVIDIA MIG (Multi-Instance GPU): Enabling MIG and Creating and Deleting Instances (2)






1. Overview

This post walks through enabling NVIDIA MIG and then creating and deleting MIG instances.







2. Versions and Hardware

Rocky-9.2
NVIDIA A100 80GB PCIe







3. Reference Links





3-1. [Rocky] What Is NVIDIA MIG (Multi-Instance GPU)? (1)

BLOG
YouTube




3-2. [Rocky] Installing the NVIDIA Graphics Driver

BLOG
YouTube







4. MIG





4-1. Enabling MIG

# nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100 80GB PCIe          Off | 00000000:03:00.0 Off |                    0 |
| N/A   43C    P0              68W / 300W |      4MiB / 81920MiB |     24%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+


# nvidia-smi -i 0 -mig 1
# nvidia-smi --gpu-reset

GPU 00000000:03:00.0 was successfully reset.
All done.
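
The enable step can also be verified from a script instead of reading the full table. The following is a minimal sketch, assuming the `mig.mode.current` field of `nvidia-smi --query-gpu` is available in your driver version; the helper name `mig_is_enabled` is hypothetical and not part of nvidia-smi.

```shell
# Hypothetical helper: succeeds when the given mode string is "Enabled".
mig_is_enabled() {
    # Strip whitespace so " Enabled" and "Enabled" compare equal.
    mode=$(printf '%s' "$1" | tr -d '[:space:]')
    [ "$mode" = "Enabled" ]
}

# Only query the GPU when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Assumption: the mig.mode.current query field exists on this driver.
    current=$(nvidia-smi -i 0 --query-gpu=mig.mode.current --format=csv,noheader)
    if mig_is_enabled "$current"; then
        echo "MIG is enabled on GPU 0"
    else
        echo "MIG is not enabled on GPU 0 (current: $current)"
    fi
fi
```

If the mode is still reported as disabled after `-mig 1`, the change is probably pending until the GPU is reset, which is what the `--gpu-reset` step is for.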




4-2. Verifying That MIG Is Enabled

# nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100 80GB PCIe          Off | 00000000:03:00.0 Off |                   On |
| N/A   44C    P0              73W / 300W |      0MiB / 81920MiB |     N/A      Default |
|                                         |                      |              Enabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| MIG devices:                                                                          |
+------------------+--------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                   Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                     BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                |        ECC|                       |
|==================+================================+===========+=======================|
|  No MIG devices found                                                                 |
+---------------------------------------------------------------------------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+




4-3. Listing MIG Profiles

# nvidia-smi mig -lgip

+-----------------------------------------------------------------------------+
| GPU instance profiles:                                                      |
| GPU   Name             ID    Instances   Memory     P2P    SM    DEC   ENC  |
|                              Free/Total   GiB              CE    JPEG  OFA  |
|=============================================================================|
|   0  MIG 1g.10gb       19     7/7        9.50       No     14     0     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.10gb+me    20     1/1        9.50       No     14     1     0   |
|                                                             1     1     1   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.20gb       15     4/4        19.50      No     14     1     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 2g.20gb       14     3/3        19.50      No     28     1     0   |
|                                                             2     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 3g.40gb        9     2/2        39.25      No     42     2     0   |
|                                                             3     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 4g.40gb        5     1/1        39.25      No     56     2     0   |
|                                                             4     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 7g.80gb        0     1/1        78.75      No     98     5     0   |
|                                                             7     1     1   |
+-----------------------------------------------------------------------------+


# nvidia-smi mig -lgipp

GPU  0 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1
GPU  0 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1
GPU  0 Profile ID 15 Placements: {0,2,4,6}:2
GPU  0 Profile ID 14 Placements: {0,2,4}:2
GPU  0 Profile ID  9 Placements: {0,4}:4
GPU  0 Profile ID  5 Placement : {0}:4
GPU  0 Profile ID  0 Placement : {0}:8
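
Each placement entry reads as {start slices}:size, the same Start:Size pair that `-lgi` prints later for a created instance. As a rough sketch, the helper below (the name `expand_placements` is made up for illustration) expands every line of `-lgipp` output into one row per possible start position.

```shell
# Hypothetical helper: read `nvidia-smi mig -lgipp` output on stdin and
# print one line per (profile, start slice) combination.
expand_placements() {
    awk '
    /Profile ID/ && /Placement/ {
        gpu = $2
        # The profile ID is the field right after the word "ID".
        for (i = 1; i <= NF; i++) if ($i == "ID") profile = $(i + 1)
        # The last field looks like {0,2,4,6}:2
        spec = $NF
        size = spec;   sub(/^.*:/, "", size)
        starts = spec; sub(/^[{]/, "", starts); sub(/[}].*$/, "", starts)
        n = split(starts, s, ",")
        for (i = 1; i <= n; i++)
            printf "gpu=%s profile=%s start=%s size=%s\n", gpu, profile, s[i], size
    }'
}

# Only call nvidia-smi when it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi mig -lgipp | expand_placements
fi
```

For profile 15 (1g.20gb) this yields four rows with start slices 0, 2, 4, and 6, each of size 2, which matches the 4/4 free count shown by `-lgip`.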







5. Creating a MIG GI (GPU Instance)





5-1. Creating a GI (GPU Instance): Method 1

Create by MIG profile ID


# nvidia-smi mig -cgi 15




5-2. Verifying GI (GPU Instance) Creation

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+




5-3. Checking How Many GIs Can Still Be Created

# nvidia-smi mig -lgip

+-----------------------------------------------------------------------------+
| GPU instance profiles:                                                      |
| GPU   Name             ID    Instances   Memory     P2P    SM    DEC   ENC  |
|                              Free/Total   GiB              CE    JPEG  OFA  |
|=============================================================================|
|   0  MIG 1g.10gb       19     6/7        9.50       No     14     0     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.10gb+me    20     1/1        9.50       No     14     1     0   |
|                                                             1     1     1   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.20gb       15     3/4        19.50      No     14     1     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 2g.20gb       14     3/3        19.50      No     28     1     0   |
|                                                             2     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 3g.40gb        9     1/2        39.25      No     42     2     0   |
|                                                             3     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 4g.40gb        5     1/1        39.25      No     56     2     0   |
|                                                             4     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 7g.80gb        0     0/1        78.75      No     98     5     0   |
|                                                             7     1     1   |
+-----------------------------------------------------------------------------+
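
The Free/Total column is the number to check before creating another instance. A small sketch of how a script might read it, assuming the column layout shown in the table above; the helper name `free_instances` is hypothetical.

```shell
# Hypothetical helper: read `nvidia-smi mig -lgip` output on stdin and
# print the free-instance count for the given profile ID.
free_instances() {
    # Row layout: | GPU  MIG <name>  <ID>  <free>/<total>  ...
    awk -v id="$1" '$3 == "MIG" && $5 == id { split($6, n, "/"); print n[1] }'
}

# Only call nvidia-smi when it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "Free 1g.20gb slots: $(nvidia-smi mig -lgip | free_instances 15)"
fi
```

With the table above as input, `free_instances 15` prints 3 and `free_instances 0` prints 0, since the single 1g.20gb instance already blocks the full-GPU 7g.80gb profile.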




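5-4. Creating a GI (GPU Instance): Methods 2 and 3

Create by MIG profile name, using either the short name (1g.10gb) or the full name ("MIG 1g.10gb")


# nvidia-smi mig -cgi 1g.10gb,"MIG 1g.10gb"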




5-5. Verifying GI (GPU Instance) Creation

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.10gb         19       11          4:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       12          5:1     |
+-------------------------------------------------------+
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+







6. Creating a MIG CI (Compute Instance)





6-1. Listing GIs (GPU Instances)

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.10gb         19       11          4:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       12          5:1     |
+-------------------------------------------------------+
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+




6-2. Creating a Single CI (Compute Instance)

# nvidia-smi mig -cci -gi 11




6-3. Verifying CI (Compute Instance) Creation

# nvidia-smi mig -lci

+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0     11       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+




6-4. Creating Multiple CIs (Compute Instances) at Once

# nvidia-smi mig -cci -gi 12,6




6-5. Verifying CI (Compute Instance) Creation

# nvidia-smi mig -lci

+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0     11       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     12       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0      6       MIG 1g.20gb          0         0          0:1     |
+--------------------------------------------------------------------+
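
The comma-separated list passed to `-gi` can be built from the `-lgi` table rather than typed by hand. This is only a sketch under the assumption that the `-lgi` layout matches the output shown earlier; `gi_id_list` is a hypothetical helper name.

```shell
# Hypothetical helper: read `nvidia-smi mig -lgi` output on stdin and
# print the GPU instance IDs as one comma-separated list.
gi_id_list() {
    # Row layout: | GPU  MIG <name>  <profile ID>  <instance ID>  <start:size> |
    awk '$3 == "MIG" { ids = (ids == "" ? $6 : ids "," $6) } END { print ids }'
}

# Example use on a MIG-enabled host (run as root; this creates CIs):
#   nvidia-smi mig -cci -gi "$(nvidia-smi mig -lgi | gi_id_list)"
```

Note that this targets every listed GI, including ones that already hold a compute instance, so nvidia-smi may report an error for those.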







7. Creating a MIG GI (GPU Instance) and CI (Compute Instance) Together




7-1. Creating a GI (GPU Instance) and CI (Compute Instance) in One Step

# nvidia-smi mig -cgi 19 -C




7-2. Verifying GI (GPU Instance) Creation

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.10gb         19        7          0:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       11          4:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       12          5:1     |
+-------------------------------------------------------+
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+




7-3. Verifying CI (Compute Instance) Creation

# nvidia-smi mig -lci

+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0      7       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     11       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     12       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0      6       MIG 1g.20gb          0         0          0:1     |
+--------------------------------------------------------------------+







8. Deleting a MIG CI (Compute Instance)





8-1. Listing CIs (Compute Instances)

# nvidia-smi mig -lci

+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0      7       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     11       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     12       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0      6       MIG 1g.20gb          0         0          0:1     |
+--------------------------------------------------------------------+




8-2. Deleting a CI (Compute Instance)

# nvidia-smi mig -dci -ci 0 -gi 11

Successfully destroyed compute instance ID 0 from GPU 0 instance ID 11




8-3. Verifying CI (Compute Instance) Deletion

# nvidia-smi mig -lci

+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0      7       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0     12       MIG 1g.10gb          0         0          0:1     |
+--------------------------------------------------------------------+
|   0      6       MIG 1g.20gb          0         0          0:1     |
+--------------------------------------------------------------------+
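
Deleting a single CI needs both the compute instance ID (the Instance ID column) and the ID of the GPU instance that owns it. As an illustration, the hypothetical helper `ci_delete_commands` below turns `-lci` output into the matching `-dci` command lines without running them, assuming the column layout shown above.

```shell
# Hypothetical helper: read `nvidia-smi mig -lci` output on stdin and
# print one delete command per compute instance (nothing is executed).
ci_delete_commands() {
    # Row layout: | GPU  <GI ID>  MIG <name>  <profile ID>  <CI ID>  <start:size> |
    awk '$4 == "MIG" { printf "nvidia-smi mig -dci -ci %s -gi %s\n", $7, $3 }'
}

# Only call nvidia-smi when it is installed; this only prints commands.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi mig -lci | ci_delete_commands
fi
```

Printing the commands first makes it easy to review which instances would be removed before running any of them as root.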







9. Deleting a MIG GI (GPU Instance)





9-1. Listing GIs (GPU Instances)

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.10gb         19        7          0:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       11          4:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       12          5:1     |
+-------------------------------------------------------+
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+




9-2. Deleting a GI (GPU Instance)

# nvidia-smi mig -dgi -gi 11

Successfully destroyed GPU instance ID 11 from GPU 0




9-3. Verifying GI (GPU Instance) Deletion

# nvidia-smi mig -lgi

+-------------------------------------------------------+
| GPU instances:                                        |
| GPU   Name             Profile  Instance   Placement  |
|                          ID       ID       Start:Size |
|=======================================================|
|   0  MIG 1g.10gb         19        7          0:1     |
+-------------------------------------------------------+
|   0  MIG 1g.10gb         19       12          5:1     |
+-------------------------------------------------------+
|   0  MIG 1g.20gb         15        6          6:2     |
+-------------------------------------------------------+







10. Deleting GIs (GPU Instances) and CIs (Compute Instances)





10-1. Deleting All CIs (Compute Instances)

# nvidia-smi mig -dci

Successfully destroyed compute instance ID 0 from GPU 0 instance ID 7
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 12
Successfully destroyed compute instance ID 0 from GPU 0 instance ID 6




10-2. Deleting All GIs (GPU Instances)

# nvidia-smi mig -dgi

Successfully destroyed GPU instance ID 7 from GPU 0
Successfully destroyed GPU instance ID 12 from GPU 0
Successfully destroyed GPU instance ID 6 from GPU 0
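
The order matters here: compute instances are removed first, then the GPU instances that contained them, and only then is MIG mode switched off. The sketch below only prints that sequence so it can be reviewed; `mig_teardown_plan` is a hypothetical helper, and the GPU index defaults to 0 as in this post.

```shell
# Hypothetical helper: print the teardown commands in the order this post uses.
mig_teardown_plan() {
    gpu="${1:-0}"
    echo "nvidia-smi mig -dci"          # 1. destroy every compute instance
    echo "nvidia-smi mig -dgi"          # 2. destroy every GPU instance
    echo "nvidia-smi -i $gpu -mig 0"    # 3. disable MIG mode on the GPU
}

# Print the plan for GPU 0; nothing is executed.
mig_teardown_plan 0
```

After reviewing the output, the same lines could be run as root, for example with `mig_teardown_plan 0 | sh`.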




10-3. Listing CIs (Compute Instances)

# nvidia-smi mig -lci

No GPU Instances found: Not Found




10-4. Listing GIs (GPU Instances)

# nvidia-smi mig -lgi

No GPU Instances found: Not Found







11. MIG





11-1. Disabling MIG

# nvidia-smi -i 0 -mig 0





11-2. Verifying That MIG Is Disabled

# nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100 80GB PCIe          Off | 00000000:03:00.0 Off |                    0 |
| N/A   43C    P0              68W / 300W |      4MiB / 81920MiB |     24%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+



seuheu
