
Friday, November 10, 2023

Fine-tuning Whisper in a Google Colab

Reference: https://research.google.com/colaboratory/local-runtimes.html
This lets Colab use a local CPU and GPU.
The documentation says either Docker or Jupyter can be used,
but only Jupyter worked for me.
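
For the Jupyter route, the steps on the local machine boil down to roughly the following (a sketch based on the referenced page; port 8888 is the default):

$ pip install jupyter_http_over_ws
$ jupyter serverextension enable --py jupyter_http_over_ws
$ jupyter notebook --NotebookApp.allow_origin='https://colab.research.google.com' --port=8888 --NotebookApp.port_retries=0

Then copy the printed URL (including its token) into Colab via Connect to a local runtime.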

Create a Hugging Face account and log in.
Open https://huggingface.co/settings/tokens
Click New token
Choose a Role (read or write)
Click copy
Paste the token when running the following command:

$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) Y
Token is valid (permission: read).
$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
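
From inside a notebook (for example the Colab session itself), the same login can be done with huggingface_hub directly; a small sketch:

from huggingface_hub import notebook_login
notebook_login()  # opens a widget; paste the same token there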

Thursday, October 26, 2023

Recording and playback on Ubuntu, using arecord, aplay, and ffmpeg

List capture devices
$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALCS1200A Analog [ALCS1200A Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: PCH [HDA Intel PCH], device 2: ALCS1200A Alt Analog [ALCS1200A Alt Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
Specify card 0, device 0, request a 16000 Hz sample rate, and record for 10 seconds
$ arecord -Dhw:0,0 -d 10 -f cd -r 16000 -c 2 -t wav test.wav
Recording WAVE 'test.wav' : Signed 16 bit Little Endian, Rate 16000 Hz, Stereo
Warning: rate is not accurate (requested = 16000Hz, got = 44100Hz)
         please, try the plug plugin 
$ arecord -D mono --device=hw:0,0 -d 10 -f cd -r 16000 -c 2 -t wav test.wav
Recording WAVE 'test.wav' : Signed 16 bit Little Endian, Rate 16000 Hz, Stereo
Warning: rate is not accurate (requested = 16000Hz, got = 44100Hz)
         please, try the plug plugin 
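
As the warning suggests, using the plug plugin lets ALSA resample instead of forcing the hardware rate; a sketch with plughw on the same card/device:

$ arecord -D plughw:0,0 -d 10 -f S16_LE -r 16000 -c 1 -t wav test.wav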

List playback devices
$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALCS1200A Analog [ALCS1200A Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: PCH [HDA Intel PCH], device 1: ALCS1200A Digital [ALCS1200A Digital]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: PCH [HDA Intel PCH], device 3: HDMI 0 [HDMI 0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
Pipe the capture device straight to playback
$ arecord -Dhw:0,0 -d 10 -f cd -r 16000 | aplay -Dhw:0,0 -r 16000
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Stereo
Warning: rate is not accurate (requested = 16000Hz, got = 44100Hz)
         please, try the plug plugin 
Playing WAVE 'stdin' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo


Install ffmpeg
$ sudo add-apt-repository ppa:savoury1/ffmpeg4  
$ sudo apt-cache policy ffmpeg  
$ sudo apt-get install ffmpeg  
$ ffmpeg -version  
$ sudo add-apt-repository --remove ppa:savoury1/ffmpeg4  

Reference: https://ffmpeg.org/ffmpeg-devices.html
List devices
$ ffmpeg -devices
Devices:
 D. = Demuxing supported
 .E = Muxing supported
 --
 DE alsa            ALSA audio output
  E caca            caca (color ASCII art) output device
 DE fbdev           Linux framebuffer
 D  iec61883        libiec61883 (new DV1394) A/V input device
 D  jack            JACK Audio Connection Kit
 D  kmsgrab         KMS screen capture
 D  lavfi           Libavfilter virtual input device

$ cat /proc/asound/cards
 0 [PCH            ]: HDA-Intel - HDA Intel PCH
                      HDA Intel PCH at 0xa7230000 irq 148
 1 [NVidia         ]: HDA-Intel - HDA NVidia
                      HDA NVidia at 0xa5080000 irq 17

Record
$ ffmpeg -f alsa -i hw:0 test.wav
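
To ask ffmpeg for a specific rate and channel count on the ALSA input (a sketch; options placed before -i apply to the input):

$ ffmpeg -f alsa -ac 1 -ar 16000 -i hw:0 test16k.wav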

Wednesday, October 11, 2023

Install Ubuntu 20.04

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install ssh

$ sudo vi /etc/fstab
# separate the fields with tabs
ip:/share_folder /mnt/mount_folder nfs defaults,bg 0 0
$ cd /mnt
$ sudo mkdir QNAP_A QNAP_B
$ sudo mount -a

$ mkdir -p ~/.config/autostart
$ cp /usr/share/applications/vino-server.desktop ~/.config/autostart/
$ gsettings set org.gnome.Vino prompt-enabled false
$ gsettings set org.gnome.Vino require-encryption false
$ gsettings set org.gnome.Vino authentication-methods "['vnc']"
$ gsettings set org.gnome.Vino vnc-password $(echo -n 'ChangeToYourPasswd'|base64)
$ sudo vi /etc/gdm3/custom.conf
WaylandEnable=false
AutomaticLoginEnable = true
AutomaticLogin = UserLoginName
$ vi vino.sh
DISP=`ps -u $(id -u) -o pid= | \
    while read pid; do
        cat /proc/$pid/environ 2>/dev/null | tr '\0' '\n' | grep '^DISPLAY=:'
    done | grep -o ':[0-9]*' | sort -u`
echo $DISP
/usr/lib/vino/vino-server --display=$DISP
$ chmod +x vino.sh

Use the latest driver version, based on:
CUDA Toolkit and Corresponding Driver Versions
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
dGPU Setup for Ubuntu
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Quickstart.html
Ubuntu 20.04
GStreamer 1.16.3
NVIDIA driver 525.125.06
CUDA 12.1
TensorRT 8.5.3.1

$ sudo ubuntu-drivers devices
$ sudo apt-get install nvidia-driver-535
$ sudo reboot
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt-get update
$ sudo apt-get -y install cuda-12-2
$ sudo apt-get -y install cuda-12-1
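
The packages install under /usr/local/cuda-12.x and do not add anything to PATH; a quick check (assuming the default /usr/local/cuda symlink, which update-alternatives manages when several versions are installed):

$ echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
$ echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
$ source ~/.bashrc
$ nvcc --version
$ nvidia-smi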

Install cuDNN
Reference: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html
Go to 2.2. Downloading cuDNN for Linux (https://developer.nvidia.com/cudnn)
Download the Local Installer for Ubuntu20.04 x86_64 (Deb)
$ sudo apt-get install zlib1g
$ sudo dpkg -i cudnn-local-repo-ubuntu2004-8.9.5.29_1.0-1_amd64.deb
$ sudo cp /var/cudnn-local-repo-ubuntu2004-8.9.5.29/cudnn-local-98C06E99-keyring.gpg /usr/share/keyrings/
$ sudo apt-get update
$ apt list -a libcudnn8
$ sudo apt-get install libcudnn8=8.9.5.29-1+cuda12.2
$ sudo apt-get install libcudnn8-dev=8.9.5.29-1+cuda12.2
$ sudo apt-get install libcudnn8-samples=8.9.5.29-1+cuda12.2
$ update-alternatives --display libcudnn
$ cp -r /usr/src/cudnn_samples_v8/ .
$ cd cudnn_samples_v8/mnistCUDNN/
$ sudo apt-get install libfreeimage3 libfreeimage-dev
$ make clean && make
$ ./mnistCUDNN
...
Test passed!

Install TensorRT 8.6.1
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-861/install-guide/index.html
$ sudo apt-get install python3-pip
$ sudo apt-get install python3.8-venv
$ python3 -m venv envs/tensorrt
$ source envs/tensorrt/bin/activate
$ pip3 install --upgrade pip
$ python3 -m pip install --extra-index-url https://pypi.nvidia.com tensorrt_libs
$ python3 -m pip install --extra-index-url https://pypi.nvidia.com tensorrt_bindings
$ python3 -m pip install --upgrade tensorrt
$ python3 -m pip install --upgrade tensorrt_lean
$ python3 -m pip install --upgrade tensorrt_dispatch
Test the TensorRT Python installation
$ python3
>>> import tensorrt
>>> print(tensorrt.__version__)
>>> assert tensorrt.Builder(tensorrt.Logger())
>>> import tensorrt_lean as trt
>>> print(trt.__version__)
>>> assert trt.Builder(trt.Logger())
>>> import tensorrt_dispatch as trt
>>> print(trt.__version__)
>>> assert trt.Builder(trt.Logger())

Go to https://developer.nvidia.com/tensorrt and click GET STARTED
Go to https://developer.nvidia.com/tensorrt-getting-started and click DOWNLOAD NOW
Select TensorRT 8
Select TensorRT 8.6 GA
TensorRT 8.6 GA for Ubuntu 20.04 and CUDA 12.0 and 12.1 DEB local repo Package
$ sudo dpkg -i nv-tensorrt-local-repo-ubuntu2004-8.6.1-cuda-12.0_1.0-1_amd64.deb
$ sudo cp /var/nv-tensorrt-local-repo-ubuntu2004-8.6.1-cuda-12.0/nv-tensorrt-local-9A1EDFBA-keyring.gpg /usr/share/keyrings/
$ sudo apt-get update
$ sudo apt-get install tensorrt
$ sudo apt-get install libnvinfer-lean8
$ sudo apt-get install libnvinfer-vc-plugin8
$ sudo apt-get install python3-libnvinfer-lean
$ sudo apt-get install python3-libnvinfer-dispatch
$ python3 -m pip install numpy
$ sudo apt-get install python3-libnvinfer-dev
$ python3 -m pip install protobuf
$ sudo apt-get install uff-converter-tf
$ python3 -m pip install numpy onnx
$ sudo apt-get install onnx-graphsurgeon
Verify the installation
$ dpkg-query -W tensorrt
tensorrt        8.6.1.6-1+cuda12.0

Install DeepStream
$ sudo apt-get install libssl1.1
$ sudo apt-get install libgstreamer1.0-0
$ sudo apt-get install gstreamer1.0-tools
$ sudo apt-get install gstreamer1.0-plugins-good
$ sudo apt-get install gstreamer1.0-plugins-bad
$ sudo apt-get install gstreamer1.0-plugins-ugly
$ sudo apt-get install gstreamer1.0-libav
$ sudo apt-get install libgstreamer-plugins-base1.0-dev
$ sudo apt-get install libgstrtspserver-1.0-0
$ sudo apt-get install libjansson4
$ sudo apt-get install libyaml-cpp-dev
$ sudo apt-get install libjsoncpp-dev
$ sudo apt-get install protobuf-compiler
$ sudo apt-get install gcc
$ sudo apt-get install make
$ sudo apt-get install git
$ sudo apt-get install python3

$ git clone https://github.com/edenhill/librdkafka.git
$ cd librdkafka
$ git reset --hard 7101c2310341ab3f4675fc565f64f0967e135a6a
$ ./configure
$ make
$ sudo make install
$ sudo mkdir -p /opt/nvidia/deepstream/deepstream-6.3/lib
$ sudo cp /usr/local/lib/librdkafka* /opt/nvidia/deepstream/deepstream-6.3/lib

https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream
Download deepstream-6.3_6.3.0-1_amd64.deb
$ wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/nvidia/deepstream/versions/6.3/files/deepstream-6.3_6.3.0-1_amd64.deb'
$ sudo apt-get install ./deepstream-6.3_6.3.0-1_amd64.deb
$ cd /opt/nvidia/deepstream/deepstream-6.3/samples/configs/deepstream-app
$ deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt 

Install Docker
https://docs.docker.com/engine/install/ubuntu/
$ sudo apt-get update
$ sudo apt-get install ca-certificates curl gnupg
$ sudo install -m 0755 -d /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ sudo chmod a+r /etc/apt/keyrings/docker.gpg
$ echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo docker run --rm hello-world

Install the NVIDIA Container Toolkit
Reference: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
  && \
    sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
$ sudo groupadd docker
$ sudo usermod -a -G docker $USER
$ docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
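
The docker group membership only takes effect for new login sessions; to pick it up immediately without logging out (a sketch):

$ newgrp docker
$ docker run --rm hello-world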

Install the NGC CLI
Reference: https://ngc.nvidia.com/setup/installers/cli
$ wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.30.1/files/ngccli_linux.zip -O ngccli_linux.zip && unzip ngccli_linux.zip
$ find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5
$ sha256sum ngccli_linux.zip
$ chmod u+x ngc-cli/ngc
$ echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile
$ ngc config set
# just press Enter at each prompt
$ docker login nvcr.io
Username: $oauthtoken
Password: <Your API Key>

Develop DeepStream 6.3 in Docker
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_docker_containers.html
$ sudo docker pull nvcr.io/nvidia/deepstream:6.3-gc-triton-devel
$ export DISPLAY=:0
$ xhost +
$ docker run -it --rm --net=host --gpus all -e DISPLAY=$DISPLAY --device /dev/snd -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/deepstream:6.3-gc-triton-devel
# cd samples/configs/deepstream-app
# deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt 
# exit
$ sudo docker ps -a
$ sudo docker stop container_id
$ sudo docker rm container_id
$ sudo docker image list
$ sudo docker image rm image_id

Thursday, September 28, 2023

Install exiv2 on Ubuntu for developing EXIF-related programs

Reference: https://github.com/Exiv2/exiv2/tree/main
Reference: https://github.com/Exiv2/exiv2/tree/main#PlatformLinux

download cmake from https://cmake.org/download/
$ tar xvfz cmake-3.27.6.tar.gz
$ cd cmake-3.27.6
$ sudo apt-get install libssl-dev
$ ./bootstrap
$ make -j4
$ sudo make install

$ git clone https://github.com/Exiv2/exiv2.git
$ cd exiv2
$ sudo apt-get install --yes build-essential ccache clang cmake git google-mock libbrotli-dev libcurl4-openssl-dev libexpat1-dev libgtest-dev libinih-dev libssh-dev libxml2-utils libz-dev python3 zlib1g-dev
$ cmake -S . -B build -G "Unix Makefiles"
$ cmake --build build
$ ctest --test-dir build --verbose
$ sudo cmake --install build

$ g++ -o exifprint exifprint.cpp -lexiv2
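
exifprint.cpp here is the sample program shipped in exiv2's samples directory. If the linker cannot find the freshly installed library under /usr/local/lib at runtime, refresh the cache first; usage is then simply (the image name is just an example):

$ sudo ldconfig
$ ./exifprint test.jpg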

Monday, August 28, 2023

Use sudo without entering a password

Add a user myuser that is allowed to reboot

$ sudo deluser myuser
$ sudo adduser myuser
$ sudo gpasswd -a myuser sudo
$ echo "myuser ALL = NOPASSWD: /usr/sbin/reboot" | sudo tee /etc/sudoers.d/60_myuser
$ sudo chmod 0440 /etc/sudoers.d/60_myuser

If you unfortunately make a typo, errors like the following appear:
>>> /etc/sudoers: syntax error near line 24 <<<
sudo: parse error in /etc/sudoers near line 24
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin

Repair it as follows:
$ pkexec visudo
It will stop at the What now? prompt; don't be afraid to press Enter
Options are:
  (e)dit sudoers file again
  e(x)it without saving changes to sudoers file
  (Q)uit and save changes to sudoers file (DANGER!)

Use ssh without entering a password

hostA$ ssh-keygen
hostA$ ssh-copy-id "user@hostB -p 22"
hostA$ ssh user@hostB "command.sh arg1 arg2"

Tuesday, July 18, 2023

gstreamer fpsdisplaysink videorate

$ export URI=rtsp://root:A1234567@192.168.112.202:554/live1s1.sdp
$ export GST_DEBUG=fpsdisplaysink:5

Use videotestsrc to test fpsdisplaysink
$ gst-launch-1.0 videotestsrc ! 'video/x-raw,width=1280,height=720,framerate=60/1' ! videoconvert ! fpsdisplaysink text-overlay=true

fpsdisplaysink without text-overlay
$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! decodebin  ! nvvideoconvert ! nvegltransform ! fpsdisplaysink text-overlay=0 video-sink=nveglglessink

fpsdisplaysink with text-overlay
$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! decodebin  ! nvvideoconvert ! fpsdisplaysink text-overlay=1 video-sink=autovideosink

Use videorate to set the framerate
$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! decodebin ! nvvideoconvert ! videorate ! video/x-raw,framerate=60/1 ! nvvideoconvert ! fpsdisplaysink text-overlay=1 video-sink=autovideosink

Add rtpjitterbuffer, though it does not seem to help
$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! rtpjitterbuffer latency=0 ! decodebin  ! nvvideoconvert ! fpsdisplaysink text-overlay=1 video-sink=autovideosink

No display, but the fps can be read from the log
$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! decodebin  ! nvvideoconvert ! fpsdisplaysink text-overlay=0 video-sink=fakesink
Output
0:00:02.590692019 1665816 0xffff6001d700 DEBUG         fpsdisplaysink fpsdisplaysink.c:372:display_current_fps:<fpsdisplaysink0> Updated max-fps to 1.102534
0:00:02.590778644 1665816 0xffff6001d700 DEBUG         fpsdisplaysink fpsdisplaysink.c:376:display_current_fps:<fpsdisplaysink0> Updated min-fps to 1.102534


$ gst-launch-1.0 rtspsrc location=$URI protocols=tcp+udp ! application/x-rtp, media=video ! decodebin  ! nvvideoconvert ! videorate ! video/x-raw,framerate=60/1 ! nvvideoconvert ! fpsdisplaysink text-overlay=0 video-sink=fakesink

Friday, July 14, 2023

Default route on Ubuntu with multiple NICs

$ cd /etc/NetworkManager/system-connections/
Edit the file for the relevant NIC and add a route under [ipv4]
$ sudo vi 'Wired connection 1.nmconnection'
[ipv4]
route1=0.0.0.0/0,192.168.0.254,1
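
The same default route can be set with nmcli instead of editing the file by hand (a sketch; the connection name and gateway are the ones above):

$ sudo nmcli connection modify 'Wired connection 1' ipv4.gateway 192.168.0.254 ipv4.route-metric 1
$ sudo nmcli connection up 'Wired connection 1'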

Wednesday, July 12, 2023

Install PyTorch on Ubuntu


$ python3 -m venv pytorch
$ source pytorch/bin/activate
$ pip3 install --upgrade --no-cache-dir pip
$ sudo update-alternatives --config cuda
$ pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
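
To confirm the CUDA build is actually picked up (same style as the TensorRT check above):

$ python3
>>> import torch
>>> print(torch.__version__)
>>> print(torch.cuda.is_available())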

Wednesday, July 5, 2023

DeepStream nvinfer (primary mode) running a classifier on an ROI

After a lot of trying, I could not get an nvdspreprocess ROI to work in front of nvinfer.
I found that using the ROI of nvvideoconvert directly works fine.

Set the parameter src-crop="left:top:width:height", e.g.
g_object_set(G_OBJECT(pre_proc), "src-crop", "50:0:320:240", NULL);


Monday, July 3, 2023

YOLOv8 and TensorRT

Reference: the official YOLOv8 GitHub repository

1. Download DeepStream-Yolo and Ultralytics YOLOv8
git clone https://github.com/marcoslucianops/DeepStream-Yolo.git
git clone https://github.com/ultralytics/ultralytics.git /mnt/Data/DeepStream/DeepStream-Yolo/ultralytics

2. Create the deepstream_yolo Docker container
docker_run.sh
xhost +
docker run --name='deepstream_yolo' --gpus all -it --net=host --privileged \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -v /etc/localtime:/etc/localtime \
  -v /mnt/Data/DeepStream/DeepStream-Yolo/DeepStream-Yolo:/home/DeepStream-Yolo \
  -v /mnt/Data/DeepStream/DeepStream-Yolo/ultralytics:/home/ultralytics \
  -v /mnt/Data/DeepStream/DeepStream-Yolo/read_me:/home/read_me \
  -v /mnt/Data/DeepStream/DeepStream-Yolo/datasets:/home/datasets \
  -v /mnt/CT1000SSD/ImageData/Light:/home/Light \
  -e DISPLAY=$DISPLAY \
  -w /home/read_me \
  nvcr.io/nvidia/deepstream:6.2-devel
  
3. Inside Docker, install DeepStream-Yolo
apt-get install build-essential
/opt/nvidia/deepstream/deepstream/user_additional_install.sh
cd /home/DeepStream-Yolo
CUDA_VER=11.8 make -C nvdsinfer_custom_impl_Yolo

4. Inside Docker, install Ultralytics YOLOv8
#python3 -m pip install --upgrade pip
pip3 install --upgrade pip
pip3 install protobuf numpy
cd /home/ultralytics
#pip install -e .
pip3 install -r requirements.txt
python3 setup.py install
pip3 install onnx onnxsim onnxruntime

5. Inside Docker, download, convert, and test the yolov8s.pt and yolov8n.pt models
cd /home/ultralytics
cp /home/DeepStream-Yolo/utils/export_yoloV8.py /home/ultralytics
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt
python3 export_yoloV8.py -w yolov8s.pt --dynamic
python3 export_yoloV8.py -w yolov8n.pt --dynamic
cp yolov8s.onnx labels.txt /home/DeepStream-Yolo
cp yolov8n.onnx labels.txt /home/DeepStream-Yolo

6. Remove the deepstream_yolo container
$ docker container rm deepstream_yolo

7. Re-enter Docker
docker_attach.sh
xhost +
docker start deepstream_yolo
docker attach deepstream_yolo

8. Convert the model to ONNX format
yolov8n.py
from ultralytics import YOLO

# Load a model
model = YOLO("yolov8n.yaml")  # build a new model from scratch
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)

# Use the model
model.train(data="coco128.yaml", epochs=3)  # train the model
metrics = model.val()  # evaluate model performance on the validation set
results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image
path = model.export(format="onnx")  # export the model to ONNX format

Running python3 yolov8n.py produced the following error:
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Fix:
$ sudo systemctl stop docker
Get the container id
$ docker inspect deepstream_yolo | grep Id
"Id": "???????"
Edit the container's ShmSize
$ sudo vi /var/lib/docker/containers/your_container_id/hostconfig.json
"ShmSize":8589934592
$ sudo systemctl restart docker
$ ./docker_attach.sh
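
Alternatively, if recreating the container is acceptable, the size can be set up front by adding --shm-size to the docker run command in step 2:

docker run --shm-size=8g ... (other options as in docker_run.sh above)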

9. Test the ONNX model in DeepStream
# cd /home/DeepStream-Yolo

# vi config_infer_primary_yoloV8.txt
onnx-file=yolov8s.onnx
onnx-file=yolov8n.onnx

# vi deepstream_app_config.txt
uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
uri=rtsp://root:A1234567@192.168.0.107:554/live1s1.sdp
live-source=0
live-source=1
config-file=config_infer_primary.txt
config-file=config_infer_primary_yoloV8.txt
file-loop=0
file-loop=1

# deepstream-app -c deepstream_app_config.txt

10. Prepare your own image data: convert PASCAL VOC (xml produced by LabelImg) to YOLO txt format
prepare_detect.py
import cv2
import os
import random
import re
import xml.etree.ElementTree as ET

import numpy as np

LIGHT_CLASSES_LIST = [
    'forward_right',
    'others',
    'red',
    'red_left',
    'yellow',
    ]
        
def save_false_positives(img_org, iName, xName, tag, classIdx, 
        clip_x0, clip_y0, clip_x1, clip_y1):
    img_new = img_org[clip_y0:clip_y1, clip_x0:clip_x1]
    fPath, fName = os.path.split(iName)
    fName, fExt = os.path.splitext(fName)
    fName = fName + tag + fExt
    rndPaths = ['train', 'val', 'test']
    rndPath = random.choices(rndPaths, weights=(8,1,1))[0]
    iName = os.path.join('/home/datasets/Light/images', rndPath, fName)
    cv2.imwrite(iName, img_new)
        
def convert_box(size, box):
    dw, dh = 1. / size[0], 1. / size[1]
    x, y, w, h = (box[0] + box[1]) / 2.0 - 1, (box[2] + box[3]) / 2.0 - 1, box[1] - box[0], box[3] - box[2]
    return x * dw, y * dh, w * dw, h * dh
          
def save_file(img_org, iName, xName, tag, classIdx, 
        p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, 
        img_w, img_h, xmin, ymin, xmax, ymax,
        clip_x0, clip_y0, clip_x1, clip_y1):
    img_new = img_org[clip_y0:clip_y1, clip_x0:clip_x1]
    fPath, fName = os.path.split(iName)
    fName, fExt = os.path.splitext(fName)
    fName = fName + tag + fExt
    rndPaths = ['train', 'val', 'test']
    rndPath = random.choices(rndPaths, weights=(8,1,1))[0]
    iName = os.path.join('/home/datasets/Light/images', rndPath, fName)
    cv2.imwrite(iName, img_new)
    
    w = clip_x1 - clip_x0
    h = clip_y1 - clip_y0
    xmin = xmin - clip_x0
    ymin = ymin - clip_y0
    xmax = xmax - clip_x0
    ymax = ymax - clip_y0
    bb = convert_box((w, h), (xmin, xmax, ymin, ymax))
    fPath, fName = os.path.split(xName)
    fName, fExt = os.path.splitext(fName)
    fName = fName + tag + '.txt'
    tName = os.path.join('/home/datasets/Light/labels', rndPath, fName)
    with open(tName, 'w') as f:
        f.write(" ".join([str(a) for a in (classIdx, *bb)]) + '\n')
        
def gen_img_yolo(iName, xName):
    tree = ET.parse(open(xName))
    root = tree.getroot()
    img_w = int(root.find('size').find('width').text)
    img_h = int(root.find('size').find('height').text)
    for idx, object in enumerate(root.findall('object')):
        name = object.find('name').text
        classIdx = LIGHT_CLASSES_LIST.index(name)
        #print(classIdx, name)
        bndbox = object.find('bndbox')
        p0x = int(bndbox.find('p0x').text)
        p0y = int(bndbox.find('p0y').text)
        p1x = int(bndbox.find('p1x').text)
        p1y = int(bndbox.find('p1y').text)
        p2x = int(bndbox.find('p2x').text)
        p2y = int(bndbox.find('p2y').text)
        p3x = int(bndbox.find('p3x').text)
        p3y = int(bndbox.find('p3y').text)
        xmin = int(bndbox.find('xmin').text)
        ymin = int(bndbox.find('ymin').text)
        xmax = int(bndbox.find('xmax').text)
        ymax = int(bndbox.find('ymax').text)
        if xmin != p0x or xmin != p3x or ymin != p0y or ymin != p1y or \
                xmax != p1x or xmax != p2x or ymax != p2y or ymax != p3y:
            print('error:bndbox', xName)
            exit()
        if idx > 0:
            print('error:object', xName)
            exit()
    img_org = cv2.imread(iName)
    if img_org.shape[0] != img_h or img_org.shape[1] != img_w:
        print(img_org.shape, (img_h, img_w))
        exit()
    img = np.copy(img_org)

    clip_x0 = random.randrange(0, int(xmin*0.5))
    clip_y0 = random.randrange(0, int(ymin*0.5))
    clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.5), img_w+1)
    clip_y1 = random.randrange(int(ymax + (img_h-ymax)*0.5), img_h+1)
    save_file(img_org, iName, xName, '', classIdx, 
            p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, 
            img_w, img_h, xmin, ymin, xmax, ymax,
            clip_x0, clip_y0, clip_x1, clip_y1)
    ratio = (xmax - xmin) / img_w
    if ratio < 0.3:
        clip_x0 = random.randrange(int(xmin*0.3), int(xmin*0.8))
        clip_y0 = random.randrange(int(ymin*0.3), int(ymin*0.8))
        clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.2), int(xmax + (img_w-xmax)*0.7))
        clip_y1 = random.randrange(int(ymax + (img_h-ymax)*0.2), int(ymax + (img_h-ymax)*0.7))
        save_file(img_org, iName, xName, '_a', classIdx, 
                p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, 
                img_w, img_h, xmin, ymin, xmax, ymax,
                clip_x0, clip_y0, clip_x1, clip_y1)
        clip_x0 = random.randrange(int(xmin*0.5), int(xmin*0.9))
        clip_y0 = random.randrange(int(ymin*0.5), int(ymin*0.9))
        clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.1), int(xmax + (img_w-xmax)*0.5))
        clip_y1 = random.randrange(int(ymax + (img_h-ymax)*0.1), int(ymax + (img_h-ymax)*0.5))
        save_file(img_org, iName, xName, '_b', classIdx, 
                p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, 
                img_w, img_h, xmin, ymin, xmax, ymax,
                clip_x0, clip_y0, clip_x1, clip_y1)
        if xmin > (img_w - xmax):
            if ymin > (img_h - ymax):
                clip_x0 = random.randrange(0, int(xmin*0.8))
                clip_y0 = random.randrange(0, int(ymin*0.8))
                clip_x1 = random.randrange(int(xmin), int(xmin+(xmax-xmin)*0.8))
                clip_y1 = random.randrange(int(ymin), int(ymin+(ymax-ymin)*0.8))
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f0', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
            else:
                clip_x0 = random.randrange(0, int(xmin*0.8))
                clip_y0 = random.randrange(int(ymin+(ymax-ymin)*0.2), int(ymax))
                clip_x1 = random.randrange(int(xmin), int(xmin + (xmax-xmin)*0.8))
                clip_y1 = random.randrange(int(ymax+(img_h-ymax)*0.2), img_h)
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f1', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
        else:
            if ymin > (img_h - ymax):
                clip_x0 = random.randrange(int(xmin+(xmax-xmin)*0.2), int(xmax))
                clip_y0 = random.randrange(0, int(ymin*0.8))
                clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.2), img_w)
                clip_y1 = random.randrange(int(ymin), int(ymin+(ymax-ymin)*0.8))
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f2', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
            else:
                clip_x0 = random.randrange(int(xmin+(xmax-xmin)*0.2), int(xmax))
                clip_y0 = random.randrange(int(ymin+(ymax-ymin)*0.2), int(ymax))
                clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.2), img_w)
                clip_y1 = random.randrange(int(ymax+(img_h-ymax)*0.2), img_h)
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f3', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
    elif ratio < 0.7:
        clip_x0 = random.randrange(int(xmin*0.1), int(xmin*0.7))
        clip_y0 = random.randrange(int(ymin*0.1), int(ymin*0.7))
        clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.3), int(xmax + (img_w-xmax)*0.9))
        clip_y1 = random.randrange(int(ymax + (img_h-ymax)*0.3), int(ymax + (img_h-ymax)*0.9))
        save_file(img_org, iName, xName, '_c', classIdx, 
                p0x, p0y, p1x, p1y, p2x, p2y, p3x, p3y, 
                img_w, img_h, xmin, ymin, xmax, ymax,
                clip_x0, clip_y0, clip_x1, clip_y1)
        if xmin > (img_w - xmax):
            if ymin > (img_h - ymax):
                clip_x0 = random.randrange(0, int(xmin*0.8))
                clip_y0 = random.randrange(0, int(ymin*0.8))
                clip_x1 = random.randrange(int(xmin), int(xmin+(xmax-xmin)*0.8))
                clip_y1 = random.randrange(int(ymin), int(ymin+(ymax-ymin)*0.8))
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f4', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
            else:
                clip_x0 = random.randrange(0, int(xmin*0.8))
                clip_y0 = random.randrange(int(ymin+(ymax-ymin)*0.2), int(ymax))
                clip_x1 = random.randrange(int(xmin), int(xmin + (xmax-xmin)*0.8))
                clip_y1 = random.randrange(int(ymax+(img_h-ymax)*0.2), img_h)
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f5', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
        else:
            if ymin > (img_h - ymax):
                clip_x0 = random.randrange(int(xmin+(xmax-xmin)*0.2), int(xmax))
                clip_y0 = random.randrange(0, int(ymin*0.8))
                clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.2), img_w)
                clip_y1 = random.randrange(int(ymin), int(ymin+(ymax-ymin)*0.8))
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f6', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
            else:
                clip_x0 = random.randrange(int(xmin+(xmax-xmin)*0.2), int(xmax))
                clip_y0 = random.randrange(int(ymin+(ymax-ymin)*0.2), int(ymax))
                clip_x1 = random.randrange(int(xmax + (img_w-xmax)*0.2), img_w)
                clip_y1 = random.randrange(int(ymax+(img_h-ymax)*0.2), img_h)
                root.remove(object)
                save_false_positives(img_org, iName, xName, '_f7', classIdx, 
                        clip_x0, clip_y0, clip_x1, clip_y1)
    elif ratio < 1.0:
        pass
    return

def recursive_folder(path):
    files = os.listdir(path)
    files.sort()
    for file in files:
        fullName = os.path.join(path, file)
        if os.path.isfile(fullName):
            fPath, fName = os.path.split(fullName)
            fName, fExt = os.path.splitext(fName)
            if fExt in ('.jpg',):
                xPath = fPath + '.xml'
                xName = fName + '.xml'
                xFName = os.path.join(xPath, xName)
                if os.path.isfile(xFName):
                    gen_img_yolo(fullName, xFName)
                else:
                    print(xFName)
        else:
            recursive_folder(fullName)

def main():
    recursive_folder('/home/Light')

if __name__ == '__main__':
    main()

11. Train your own model
from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

# Train the model
model.train(data='VOC.yaml', epochs=100, imgsz=640)
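
The data argument points at a dataset YAML. For the Light dataset prepared in step 10, it would look roughly like this (a sketch; the file name Light.yaml is hypothetical, and the paths and class order follow prepare_detect.py):

# Light.yaml (hypothetical)
path: /home/datasets/Light
train: images/train
val: images/val
test: images/test
names:
  0: forward_right
  1: others
  2: red
  3: red_left
  4: yellow

Then train with model.train(data='Light.yaml', epochs=100, imgsz=640).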

12. Query the input and output layers of the ONNX model
import onnx

model = onnx.load('yolov8n.onnx')
g_in = model.graph.input
g_out = model.graph.output
print(g_in)   # input tensors: names, element types, shapes
print(g_out)  # output tensors


Wednesday, June 21, 2023

How to use a TAO-generated YOLOv4 model in Python

I produced a yolov4-tiny model by following the NVIDIA TAO Computer Vision Sample Workflows.
This model is different from a model produced the usual way;
a model produced the usual way can be used from Python simply by following tensorrt_demos.

The model (.etlt) exported by tao yolo_v4_tiny export can only be used with DeepStream,
while the trt.engine produced by tao converter can be used neither with DeepStream nor from Python.

There is documentation on how to use TAO-generated models on a Triton server.
In yolov3_postprocessor.py I found that the YOLO produced by TAO
already applies NMS to its outputs and places the results in:
BatchNMS(-1,1): number of detections
BatchNMS_1(-1,200,4): coordinates
BatchNMS_2(-1,200): confidences
BatchNMS_3(-1,200): classes
The input handling also changes:
the image read by cv2 needs no cvtColor and no division by 255.0;
only a BHWC-to-BCHW transpose is needed:
img = img.transpose((2, 0, 1)).astype(np.float32)

TAO runs inside Docker, so it is hard to debug.
I found that the following command drops you straight into the Docker container, where you can run python, check versions, inspect the environment, and so on:
docker run -it --rm --gpus all \
  -v "/mnt/Data/tao/yolo_v4_tiny_1.4.1":"/workspace/tao-experiments" \
  -v "/mnt/Data/TensorRT/tensorrt_demos":"/workspace/tensorrt_demos" \
  -v "/mnt/CT1000SSD/ImageData/Light":"/workspace/Light" \
  nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5 \
  bash

To convert the model to TensorRT, besides using
!tao converter -k $KEY \
                   -p Input,1x3x416x416,8x3x416x416,16x3x416x416 \
                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
                   -t fp32 \
                   $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt
you can also use
!tao-deploy yolo_v4_tiny gen_trt_engine \
  -m $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt \
  -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
  -k $KEY \
  --data_type fp32 \
  --batch_size 1 \
  --engine_file $USER_EXPERIMENT_DIR/export/yolov4_tao_deplay.trt
But to convert on a different platform,
refer to TAO Converter, download and install it, and run the conversion:
./tao-converter_v4.0.0_trt8.5.1.7 \
  -k nvidia_tlt \
  -p Input,1x3x416x416,2x3x416x416,4x3x416x416 \
  -e yolo_v4_tiny_1.4.1/yolo_v4_tiny/export/yolov4_tao_converter_fp32.engine \
  -t fp32 \
  yolo_v4_tiny_1.4.1/yolo_v4_tiny/export/yolov4_cspdarknet_tiny_epoch_080.etlt

Following tensorrt_demos, modify utils/yolo_with_plugins.py and rename it triton_yolo_with_plugins.py, as follows:
"""yolo_with_plugins.py
Implementation of TrtYOLO class with the yolo_layer plugins.
"""
from __future__ import print_function
import ctypes
import numpy as np
import cv2
import tensorrt as trt
import pycuda.driver as cuda

try:
    ctypes.cdll.LoadLibrary('./plugins/libyolo_layer.so')
except OSError as e:
    raise SystemExit('ERROR: failed to load ./plugins/libyolo_layer.so.  '
                     'Did you forget to do a "make" in the "./plugins/" '
                     'subdirectory?') from e

def _preprocess_yolo(img, input_shape, letter_box=False):
    """Preprocess an image before TRT YOLO inferencing.
    # Args
        img: int8 numpy array of shape (img_h, img_w, 3)
        input_shape: a tuple of (H, W)
        letter_box: boolean, specifies whether to keep aspect ratio and
                    create a "letterboxed" image for inference
    # Returns
        preprocessed img: float32 numpy array of shape (3, H, W)
    """
    if letter_box:
        img_h, img_w, _ = img.shape
        new_h, new_w = input_shape[0], input_shape[1]
        offset_h, offset_w = 0, 0
        if (new_w / img_w) <= (new_h / img_h):
            new_h = int(img_h * new_w / img_w)
            offset_h = (input_shape[0] - new_h) // 2
        else:
            new_w = int(img_w * new_h / img_h)
            offset_w = (input_shape[1] - new_w) // 2
        resized = cv2.resize(img, (new_w, new_h))
        img = np.full((input_shape[0], input_shape[1], 3), 127, dtype=np.uint8)
        img[offset_h:(offset_h + new_h), offset_w:(offset_w + new_w), :] = resized
    else:
        img = cv2.resize(img, (input_shape[1], input_shape[0]))

    #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.transpose((2, 0, 1)).astype(np.float32)
    #img /= 255.0
    return img

class HostDeviceMem(object):
    """Simple helper data class that's a little nicer to use than a 2-tuple."""
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()

def get_input_shape(engine):
    """Get input shape of the TensorRT YOLO engine."""
    binding = engine[0]
    assert engine.binding_is_input(binding)
    binding_dims = engine.get_binding_shape(binding)
    if len(binding_dims) == 4:
        return tuple(binding_dims[2:])
    elif len(binding_dims) == 3:
        return tuple(binding_dims[1:])
    else:
        raise ValueError('bad dims of binding %s: %s' % (binding, str(binding_dims)))

def allocate_buffers(engine, context):
    """Allocates all host/device in/out buffers required for an engine."""
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        binding_dims = engine.get_binding_shape(binding)
        binding_dtype = engine.get_tensor_dtype(binding)
        binding_format = engine.get_tensor_format_desc(binding)
        binding_loc = engine.get_tensor_location(binding)
        binding_mode = engine.get_tensor_mode(binding)
        binding_shape = engine.get_tensor_shape(binding)
        binding_shape_inference = engine.is_shape_inference_io(binding)
        print('binding_dims:{} {} {}'.format(binding, binding_dims, binding_dtype))
        print('  {}'.format(binding_format))
        print('  {} {} {} {}'.format(binding_loc, binding_mode, binding_shape, binding_shape_inference))
        size = trt.volume(binding_dims)
        if size < 0: size *= -1  # a dynamic batch dim (-1) makes the volume negative
        print('  size:{}'.format(size))
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            #binding_pro_shape = engine.get_profile_shape(0, binding)
            #print('  {}'.format(binding_pro_shape))
            if binding_dims[0] == -1:
                alloc_dims = np.copy(binding_dims)
                alloc_dims[0] = 1
                context.set_binding_shape(0, alloc_dims)
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream

def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    """do_inference (for TensorRT 6.x or lower)
    This function is generalized for multiple inputs/outputs.
    Inputs and outputs are expected to be lists of HostDeviceMem objects.
    """
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size,
                          bindings=bindings,
                          stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

def do_inference_v2(context, bindings, inputs, outputs, stream):
    """do_inference_v2 (for TensorRT 7.0+)
    This function is generalized for multiple inputs/outputs for full
    dimension networks.
    Inputs and outputs are expected to be lists of HostDeviceMem objects.
    """
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

class TrtYOLO(object):
    """TrtYOLO class encapsulates things needed to run TRT YOLO."""
    def _load_engine(self):
        TRTbin = 'yolo/%s.trt' % self.model
        TRTbin = self.model
        with open(TRTbin, 'rb') as f, trt.Runtime(self.trt_logger) as runtime:
            return runtime.deserialize_cuda_engine(f.read())

    def __init__(self, model, category_num=80, letter_box=False, cuda_ctx=None):
        """Initialize TensorRT plugins, engine and conetxt."""
        self.model = model
        self.category_num = category_num
        self.letter_box = letter_box
        self.cuda_ctx = cuda_ctx
        if self.cuda_ctx:
            self.cuda_ctx.push()

        self.inference_fn = do_inference if trt.__version__[0] < '7' \
                                         else do_inference_v2
        self.trt_logger = trt.Logger(trt.Logger.INFO)
        # add for errors
        # IPluginCreator not found in Plugin Registry
        # getPluginCreator could not find plugin: BatchedNMSDynamic_TRT version: 1
        # Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed
        trt.init_libnvinfer_plugins(self.trt_logger, namespace="")
        self.engine = self._load_engine()

        self.input_shape = get_input_shape(self.engine)

        try:
            self.context = self.engine.create_execution_context()
            self.inputs, self.outputs, self.bindings, self.stream = \
                allocate_buffers(self.engine, self.context)
        except Exception as e:
            raise RuntimeError('fail to allocate CUDA resources') from e
        finally:
            if self.cuda_ctx:
                self.cuda_ctx.pop()

    def __del__(self):
        """Free CUDA memories."""
        del self.outputs
        del self.inputs
        del self.stream

    def detect(self, img, letter_box=None):
        """Detect objects in the input image."""
        letter_box = self.letter_box if letter_box is None else letter_box
        img_h, img_w, _ = img.shape
        img_resized = _preprocess_yolo(img, self.input_shape, letter_box)
        #print(img_resized.shape, img_resized.dtype)

        # Set host input to the image. The do_inference() function
        # will copy the input to the GPU before executing.
        self.inputs[0].host = np.ascontiguousarray(img_resized)
        if self.cuda_ctx:
            self.cuda_ctx.push()
        trt_outputs = self.inference_fn(
            context=self.context,
            bindings=self.bindings,
            inputs=self.inputs,
            outputs=self.outputs,
            stream=self.stream)
        if self.cuda_ctx:
            self.cuda_ctx.pop()

        y_pred = [i.reshape(1, -1,)[:1] for i in trt_outputs]
        keep_k, boxes, scores, cls_id = y_pred
        #print(keep_k.shape)
        #print(boxes.shape)
        keep_k[0,0] = 1
        locs = np.empty((0,4), dtype=np.uint)
        cids = np.empty((0,1), dtype=np.uint)
        confs = np.empty((0,1), dtype=np.float32)
        for idx, k in enumerate(keep_k.reshape(-1)):
            mul = np.array([img_w,img_h,img_w,img_h])
            loc = boxes[idx].reshape(-1, 4)[:k] * mul
            loc = loc.astype(np.uint)
            cid = cls_id[idx].reshape(-1, 1)[:k]
            cid = cid.astype(np.uint)
            conf = scores[idx].reshape(-1, 1)[:k]
            locs = np.concatenate((locs, loc), axis=0)
            cids = np.concatenate((cids, cid), axis=0)
            confs = np.concatenate((confs, conf), axis=0)
        #print(locs.shape, cids.shape, confs.shape)
        #print(locs, cids, confs)
        return locs, confs, cids

The following program uses the module above:
import cv2
import numpy as np
import tensorrt as trt
import pycuda.autoinit # This is needed for initializing CUDA driver
import pycuda.driver as cuda
from utils.triton_yolo_with_plugins import TrtYOLO

#MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/yolov4_tao_convert.engine'
MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/yolov4_tao_deplay.trt'
#MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/trt.engine'
        
def main():
    trt_yolo = TrtYOLO(MODEL_PATH, 5, True)
    img_org = cv2.imread('bb.jpg')
    img = np.copy(img_org)
    print(img.shape, img.dtype)
    boxes, confs, clss = trt_yolo.detect(img, False)
    print(boxes.shape, confs.shape, clss.shape)
    print(boxes, confs, clss)
    for box, conf, clss in zip(boxes, confs, clss):
        x_min, y_min, x_max, y_max = box[0], box[1], box[2], box[3]
        cv2.rectangle(img, (x_min, y_min), (x_max, y_max), (255, 255, 255), 2)
        print(box, conf, clss)
    cv2.imwrite('aa.jpg', img)
    print('aaa')

if __name__ == '__main__':
    main()

Debugging notes
Message: Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed
Fix: the TensorRT versions do not match; install the matching version, or use Docker
Message: IPluginCreator not found in Plugin Registry
Message: getPluginCreator could not find plugin: BatchedNMSDynamic_TRT version: 1
Fix: TensorRT OSS must be installed,
and before _load_engine() add
trt.init_libnvinfer_plugins(self.trt_logger, namespace="")

Tuesday, June 6, 2023

Set up a VNC server on a Jetson

Reference: https://developer.nvidia.com/embedded/learn/tutorials/vnc-setup
or L4T-README/README-vnc.txt

$ cd /usr/lib/systemd/user/graphical-session.target.wants
$ sudo ln -s ../vino-server.service ./.

$ gsettings set org.gnome.Vino prompt-enabled false
$ gsettings set org.gnome.Vino require-encryption false

$ gsettings set org.gnome.Vino authentication-methods "['vnc']"
$ gsettings set org.gnome.Vino vnc-password $(echo -n 'YourPassword'|base64)

$ vi vino.sh
DISP=`ps -u $(id -u) -o pid= | \
    while read pid; do
        cat /proc/$pid/environ 2>/dev/null | tr '\0' '\n' | grep '^DISPLAY=:'
    done | grep -o ':[0-9]*' | sort -u`
echo $DISP
/usr/lib/vino/vino-server --display=$DISP

Reboot