This post walks through setting up NVIDIA vGPU on Proxmox.
Environment: Proxmox 8.2
Add the no-subscription repository, then install the kernel, headers, and build tools:
# echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" >> /etc/apt/sources.list
# apt update
# apt -y install proxmox-kernel-6.5 proxmox-headers-6.5 build-essential dkms mdevctl zip
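Because the repository line is appended with `>>`, running the step twice leaves a duplicate entry in /etc/apt/sources.list. A minimal sketch of an idempotent alternative follows; `append_once` is a hypothetical helper name, not part of Proxmox or apt.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_once() (
    file=$1
    line=$2
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
)

# Example usage (as root on the Proxmox host):
# append_once /etc/apt/sources.list \
#     "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription"
```

The same helper could be used for the later `>>` appends to /etc/modules and /etc/modprobe.d/blacklist.conf.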
List the installed kernels, then pin a specific version:
# proxmox-boot-tool kernel list
# proxmox-boot-tool kernel pin 6.5.13-6-pve
Enable the IOMMU on the kernel command line:
# nano /etc/default/grub
Add or modify the following line:
+ GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
Apply the change:
# update-grub
# reboot
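After the reboot it can be worth confirming that the new parameters actually reached the running kernel. The sketch below checks a command line string for exact parameters; `has_kernel_param` is a hypothetical helper, and the parameter names are the ones set in GRUB_CMDLINE_LINUX_DEFAULT above.

```shell
# Hypothetical helper: succeed if a kernel command line contains a
# parameter as an exact, whitespace-separated word.
has_kernel_param() (
    cmdline=$1
    param=$2
    set -f                      # disable globbing before word splitting
    for word in $cmdline; do
        [ "$word" = "$param" ] && exit 0
    done
    exit 1
)

# Example usage on the host:
# cmdline=$(cat /proc/cmdline)
# for p in intel_iommu=on iommu=pt; do
#     if has_kernel_param "$cmdline" "$p"; then
#         echo "$p: present"
#     else
#         echo "$p: MISSING"
#     fi
# done
```

Matching whole words avoids a false positive where `iommu=on` would be found inside `intel_iommu=on`.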
Load the VFIO modules at boot and blacklist the nouveau driver:
# echo -e "vfio\nvfio_iommu_type1\nvfio_pci\nvfio_virqfd" >> /etc/modules
# echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
Rebuild the initramfs, then reboot the system:
# update-initramfs -u -k all
# reboot
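Once the host is back up, the module changes can be checked against `lsmod`. The following is a sketch only: `report_modules` is a hypothetical helper that reads `lsmod`-style text on stdin, so it can also be tried on saved output. On recent kernels `vfio_virqfd` may not show up as a separate module, which is an assumption worth verifying on your own host.

```shell
# Hypothetical helper: read `lsmod`-style text on stdin and report, for
# each module name given as an argument, whether it is loaded.
report_modules() (
    # First column of every line after the "Module Size Used by" header
    loaded=$(awk 'NR > 1 { print $1 }')
    for mod in "$@"; do
        if printf '%s\n' "$loaded" | grep -qxF -- "$mod"; then
            echo "$mod loaded"
        else
            echo "$mod not-loaded"
        fi
    done
)

# Example usage on the host:
# lsmod | report_modules vfio vfio_iommu_type1 vfio_pci nouveau
```

The expectation is that the vfio modules are reported as loaded and `nouveau` as not loaded.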
After the reboot, verify that the IOMMU is enabled:
# dmesg | grep -e DMAR -e IOMMU
Example output:
[ 1.714214] DMAR: IOMMU enabled
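Beyond the dmesg line, the kernel exposes the resulting IOMMU groups under /sys/kernel/iommu_groups. A minimal sketch for listing them is shown below; `list_iommu_groups` is a hypothetical helper, and the optional directory argument exists only so the function can be pointed at a test tree.

```shell
# Hypothetical helper: print "group <n>: <pci address>" for every device
# found under an iommu_groups directory (default: the sysfs location).
list_iommu_groups() (
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # no groups: the glob stayed literal
        group=${dev#"$root"/}          # strip the root prefix
        group=${group%%/*}             # keep only the group number
        echo "group $group: ${dev##*/}"
    done
)

# Example usage on the host:
# list_iommu_groups | grep -i -e 0d:00 -e b5:00
```

An empty result usually means the IOMMU is not active, in which case the BIOS settings and kernel command line are the first things to recheck.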
Download the NVIDIA GRID vGPU driver package and extract it:
# cd ~/NVIDIA
# unzip NVIDIA-GRID-Linux-KVM-550.54.16-550.54.15-551.78.zip
Install the host driver:
# cd Host_Drivers
# sh NVIDIA-Linux-x86_64-550.54.16-vgpu-kvm.run
Verify the installation:
# nvidia-smi
Example output:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16              Driver Version: 550.54.16      CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    Off |   00000000:0D:00.0 Off |                    0 |
| N/A   36C    P0            122W /  350W |       0MiB /  46068MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L40S                    Off |   00000000:B5:00.0 Off |                    0 |
| N/A   37C    P0            124W /  350W |       0MiB /  46068MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
List the vGPU types supported by the GPU:
# nvidia-smi vgpu --supported
Example output:
GPU 00000000:0D:00.0
NVIDIA L40S-1B
NVIDIA L40S-2B
NVIDIA L40S-1Q
NVIDIA L40S-2Q
NVIDIA L40S-3Q
NVIDIA L40S-4Q
NVIDIA L40S-6Q
NVIDIA L40S-8Q
NVIDIA L40S-12Q
NVIDIA L40S-16Q
NVIDIA L40S-24Q
NVIDIA L40S-48Q
NVIDIA L40S-1A
NVIDIA L40S-2A
NVIDIA L40S-3A
NVIDIA L40S-4A
NVIDIA L40S-6A
NVIDIA L40S-8A
NVIDIA L40S-12A
NVIDIA L40S-16A
NVIDIA L40S-24A
NVIDIA L40S-48A
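Each type name in the list ends with a frame buffer size in GB followed by a series letter (Q, B, or A in this output). As an illustration of how the list could be filtered when choosing a profile, here is a small sketch; `parse_vgpu_type` and `sizes_for_series` are hypothetical helpers that only parse text and do not call the driver.

```shell
# Hypothetical helper: split a type name such as "NVIDIA L40S-12Q" into
# "<frame buffer GB> <series letter>".
parse_vgpu_type() (
    suffix=${1##*-}             # text after the last "-", e.g. "12Q"
    size=${suffix%?}            # everything except the last character
    series=${suffix#"$size"}    # the last character
    echo "$size $series"
)

# Hypothetical helper: read `nvidia-smi vgpu --supported` style lines on
# stdin and print the frame buffer sizes offered for one series letter.
sizes_for_series() (
    want=$1
    while IFS= read -r line; do
        case $line in
            *NVIDIA*-*[0-9]"$want")
                parsed=$(parse_vgpu_type "$line")
                echo "${parsed%% *}"
                ;;
        esac
    done
)

# Example usage on the host:
# nvidia-smi vgpu --supported | sizes_for_series Q
```

Lines that are not type names, such as the `GPU 00000000:0D:00.0` header, are skipped by the `case` pattern.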
Enable the SR-IOV virtual functions on the GPU:
# /usr/lib/nvidia/sriov-manage -e b5:00.0
After enabling them, check the GPU list again:
# lspci | grep NVIDIA
Example output:
0d:00.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:00.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:00.4 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:00.5 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:00.6 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:00.7 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.1 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.2 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.3 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.4 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.5 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.6 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:01.7 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.1 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.2 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.3 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.4 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.5 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.6 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:02.7 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.1 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.2 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.3 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.4 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.5 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.6 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:03.7 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:04.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:04.1 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:04.2 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
b5:04.3 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
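In the listing above, bus b5 shows the physical function b5:00.0 plus 32 virtual functions (b5:00.4 through b5:04.3), while 0d:00.0 still has none because SR-IOV was enabled only on b5:00.0. Counting by eye is error-prone, so a sketch like the following could do it; `count_vfs` is a hypothetical helper and assumes the physical function is always function 00.0 on its bus.

```shell
# Hypothetical helper: read lspci output on stdin and count the devices on
# one bus (e.g. "b5"), excluding 00.0, which is assumed to be the
# physical function.
count_vfs() (
    bus=$1
    awk -v bus="$bus" '
        (index($1, (bus ":")) == 1) && ($1 != (bus ":00.0")) { n++ }
        END { print n + 0 }
    '
)

# Example usage on the host:
# lspci | grep NVIDIA | count_vfs b5
```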
Create a systemd service so that SR-IOV is enabled automatically at every boot:
# cat << EOF > /etc/systemd/system/nvidia-sriov.service
[Unit]
Description=Enable NVIDIA SR-IOV
After=network.target nvidia-vgpud.service nvidia-vgpu-mgr.service
Before=pve-guests.service
[Service]
Type=oneshot
ExecStart=/usr/lib/nvidia/sriov-manage -e ALL
ExecStartPre=/bin/sleep 5
[Install]
WantedBy=multi-user.target
EOF
Reload systemd and enable the service:
# systemctl daemon-reload
# systemctl enable --now nvidia-sriov.service
Reboot to confirm the configuration survives a restart:
# reboot
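After this final reboot, one way to confirm that the service did its job is to read the `sriov_numvfs` attribute that the kernel exposes in sysfs for SR-IOV capable devices. The sketch below is an assumption-laden check, not part of the NVIDIA tooling: `show_sriov_vfs` is a hypothetical helper, and it lists every SR-IOV capable device on the host, not only GPUs.

```shell
# Hypothetical helper: print "<pci address> <number of VFs>" for every
# device that exposes sriov_numvfs under a PCI devices directory
# (default: the sysfs location).
show_sriov_vfs() (
    root=${1:-/sys/bus/pci/devices}
    for f in "$root"/*/sriov_numvfs; do
        [ -r "$f" ] || continue        # no match: the glob stayed literal
        dev=${f%/sriov_numvfs}
        echo "${dev##*/} $(cat "$f")"
    done
)

# Example usage on the host:
# systemctl is-enabled nvidia-sriov.service
# show_sriov_vfs | grep -i -e 0d:00.0 -e b5:00.0
```

Since the unit runs `sriov-manage -e ALL`, both GPUs would be expected to report a non-zero VF count here, unlike the manual step earlier, which enabled only b5:00.0.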