
Tuesday, April 15, 2025

PTZ dome camera with ONVIF

Reference: https://github.com/FalkTannhaeuser/python-onvif-zeep
Reference: https://www.onvif.org/onvif/ver20/util/operationIndex.html

$ python -m venv --system-site-packages /mnt/Data/envs/onvif
$ source /mnt/Data/envs/onvif/bin/activate
$ pip install --upgrade onvif_zeep
$ git clone https://github.com/FalkTannhaeuser/python-onvif-zeep.git

$ onvif-cli devicemgmt GetHostname --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80

Query the ProfileToken:
$ onvif-cli media GetProfiles --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80 | grep -o "'token': '[^']*'" | awk -F': ' 'END {print $2}'
'MediaProfile00002'
$ onvif-cli ptz GotoPreset "{'ProfileToken':'MediaProfile00002', 'PresetToken':'9'}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
$ onvif-cli ptz GetPresets "{'ProfileToken':'MediaProfile00002'}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
$ onvif-cli ptz AbsoluteMove "{'ProfileToken':'MediaProfile00002', 'Position':{'PanTilt':{'x': -0.05, 'y': 0.6}, 'Zoom':0.5}}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80

Relative movement: 0 = no movement; positive = up, right, zoom in; negative = down, left, zoom out
Upper left
$ onvif-cli ptz RelativeMove "{'ProfileToken':'MediaProfile00002', 'Translation':{'PanTilt':{'x': 0.105, 'y': 0.22}, 'Zoom':0.3}}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
Lower right
$ onvif-cli ptz RelativeMove "{'ProfileToken':'MediaProfile00002', 'Translation':{'PanTilt':{'x': -0.115, 'y': -0.201}, 'Zoom':0.3}}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
Upper left
$ onvif-cli ptz RelativeMove "{'ProfileToken':'MediaProfile00002', 'Translation':{'PanTilt':{'x': 0.105, 'y': 0.21}, 'Zoom':0.15}}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
Lower right
$ onvif-cli ptz RelativeMove "{'ProfileToken':'MediaProfile00002', 'Translation':{'PanTilt':{'x': -0.105, 'y': -0.21}, 'Zoom':0.15}}" --user 'admin' --password 'sh22463458' --host '192.168.113.203' --port 80
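
The same operations can also be scripted directly with python-onvif-zeep. A minimal sketch, reusing the host, credentials, and positions from the CLI examples above (illustrative only, not a tested program):

from onvif import ONVIFCamera

cam = ONVIFCamera('192.168.113.203', 80, 'admin', 'sh22463458')

# The same ProfileToken that the grep/awk pipeline extracts above
media = cam.create_media_service()
profile_token = media.GetProfiles()[-1].token

ptz = cam.create_ptz_service()

# Equivalent of the AbsoluteMove CLI call
req = ptz.create_type('AbsoluteMove')
req.ProfileToken = profile_token
req.Position = {'PanTilt': {'x': -0.05, 'y': 0.6}, 'Zoom': {'x': 0.5}}
ptz.AbsoluteMove(req)

# Equivalent of the GotoPreset CLI call
req = ptz.create_type('GotoPreset')
req.ProfileToken = profile_token
req.PresetToken = '9'
ptz.GotoPreset(req)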

Wednesday, October 16, 2024

Jetson Container usage notes, part 2

Because the system was updated to JetPack 6.0 this time, testing went very smoothly.

Reference: https://jetsonhacks.com/2023/08/07/speech-ai-on-nvidia-jetson-tutorial/
Register an NGC account
Generate an API key
$ ngc config set
$ ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.17.0
$ cd riva_quickstart
$ vi config.sh
service_enabled_nlp=false
service_enabled_nmt=false
asr_language_code=("en-US")
#asr_language_code=("zh-CN")
tts_language_code=("en-US")
#tts_language_code=("zh-CN")
$ sudo bash riva_init.sh
$ sudo bash riva_start.sh
Install https://github.com/nvidia-riva/python-clients.git

### Text interface
$ jetson-containers run --env HUGGINGFACE_TOKEN=hf_abc123def \
  $(./autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model meta-llama/Llama-2-7b-chat-hf \
    --quantization q4f16_ft

### llamaspeak
$ jetson-containers run --env HUGGINGFACE_TOKEN=hf_xyz123abc456 \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --language-code=en-US --tts=piper \
    --voice=en_US-libritts-high --web-port 8050

$ jetson-containers run --env HUGGINGFACE_TOKEN=hf_xyz123abc456 \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --language-code=zh-CN --tts=piper \
    --voice=zh_CN-huayan-x_low --web-port 8050

Reference: https://ithelp.ithome.com.tw/articles/10347557
Chinese speech-recognition input: edit /opt/NanoLLM/nano_llm/plugins/speech/riva_asr.py and change language_code= from en-US to zh-CN.
Chinese speech output: edit /opt/NanoLLM/nano_llm/plugins/speech/piper_tts.py and change the en_US-libritts-high model (it appears in two places) to the zh_CN-huayan-medium model. The available Chinese models are listed in /data/models/piper/voices.json; search for the keyword zh-CN and you will find two usable Chinese voices, zh_CN-huayan-medium and zh_CN-huayan-x_low.

You also need to set up NAT port forwarding for the ws-port (49000).
can you describe the image?

### Multimodality
$ jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model Efficient-Large-Model/VILA-7b \
    --asr=riva --tts=piper

### Agent Studio
# python3 -m nano_llm.studio
https://IP_ADDRESS:8050

### text-generation-webui
$ jetson-containers run $(autotag text-generation-webui)
http://<IP_ADDRESS>:7860

Chinese models
ckip-joint_bloom-3b-zh
Reference: https://www.atyun.com/models/info/ckip-joint/bloom-3b-zh.html
taide_Llama3-TAIDE-LX-8B-Chat-Alpha1-4bit
Reference: https://taide.tw/index
yentinglin_Llama-3-Taiwan-8B-Instruct
Reference: https://medium.com/@simon3458/project-tame-llama-3-taiwan-1b249b88ab67

### ollama (Open WebUI)
$ jetson-containers run --name ollama $(autotag ollama)
# /bin/ollama run llama3
$ docker run -it --rm --network=host --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main
http://JETSON_IP:8080
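
Besides the Open WebUI front end, the ollama server can also be called from Python over its REST API. A minimal sketch, assuming ollama's default port 11434 and the llama3 model pulled above:

import requests

OLLAMA_URL = 'http://localhost:11434/api/generate'  # change the host if calling from another machine

payload = {
    'model': 'llama3',
    'prompt': 'Briefly introduce the NVIDIA Jetson platform.',
    'stream': False,  # return one JSON object instead of a token stream
}
resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()['response'])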

### Stable Diffusion
$ jetson-containers run $(autotag stable-diffusion-webui)
http://<IP_ADDRESS>:7860
txt2img: a person sitting at an office desk working
img2img: first load an image
1. Below the Generate button at the top right are two buttons, Interrogate CLIP and Interrogate DeepBooru, which extract a text description from the image.
2. Edit the extracted description and press Generate to regenerate the image.
3. Choose Resize and fill, try different Denoising strength values, and press Generate to regenerate the image.
4. Choose Just resize, try different Denoising strength values, and press Generate to regenerate the image.
5. Inpaint: enter sunglasses in the prompt, paint over the eyes in the image, and press Generate to regenerate the image.

### Stable Diffusion XL
$ CONTAINERS_DIR=`pwd`
$ MODEL_DIR=$CONTAINERS_DIR/data/models/stable-diffusion/models/Stable-diffusion/
$ sudo chown -R $USER $MODEL_DIR
$ wget -P $MODEL_DIR https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
$ wget -P $MODEL_DIR https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/resolve/main/sd_xl_refiner_1.0.safetensors
In the Stable Diffusion checkpoint selector at the top left, choose sd_xl_base_1.0.safetensors
In the Generation tab, click Refiner and choose sd_xl_refiner_1.0.safetensors
Change width/height to 1024
txt2img: photograph of a friendly robot alongside a person climbing a mountain 

### Jetson Copilot
git clone https://github.com/NVIDIA-AI-IOT/jetson-copilot/
cd jetson-copilot
./setup_environment.sh
./launch_jetson_copilot.sh
http://JETSON_IP:8501


$ virtualenv --system-site-packages tensorrt

$ docker ps -a
$ docker kill wqerqwer

Thursday, September 26, 2024

Model optimization: the assignment problem (linear sum assignment problem)

Reference: the Hungarian algorithm
Python functions to use (see the sketch after this list):
scipy.optimize.linear_sum_assignment
lap.lapjv
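
A minimal sketch of how the SciPy solver is typically used in tracking, with a made-up cost matrix (for example 1 - IoU between tracks and detections):

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j] = cost of assigning track i to detection j (values here are made up)
cost = np.array([
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
])

row_ind, col_ind = linear_sum_assignment(cost)
print(list(zip(row_ind, col_ind)))   # optimal track-detection pairs
print(cost[row_ind, col_ind].sum())  # minimum total cost (5.0 here)

# The lap package solves the same problem:
#   import lap
#   total_cost, x, y = lap.lapjv(cost)  # x[i] = column assigned to row i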


MMTracking study notes

Reference: https://mmtracking.readthedocs.io/en/latest/index.html
Reference: https://github.com/open-mmlab/mmtracking

$ git clone https://github.com/open-mmlab/mmtracking.git

$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:21.05-py3
$ docker run --gpus all -it --name MMTracking nvcr.io/nvidia/pytorch:21.05-py3
$ docker start MMTracking
$ docker attach MMTracking
# <ctrl+p><ctrl+q>
$ docker attach MMTracking
$ docker stop MMTracking
$ docker rm MMTracking

$ echo $DISPLAY
$ export DISPLAY=:0
$ xhost +
$ docker run --gpus all -it --name MMTracking --shm-size=8G \
  --net=host -e DISPLAY=$DISPLAY -e XAUTHORITY=/tmp/xauth \
  -e QT_X11_NO_MITSHM=1 \
  -v /tmp/.X11-unix/:/tmp/.X11-unix \
  -v ~/.Xauthority:/tmp/xauth \
  -v /etc/localtime:/etc/localtime \
  -v /mnt/Data/MMTracking/mmtracking:/workspace/mmtracking \
  nvcr.io/nvidia/pytorch:21.05-py3

# pip install git+https://github.com/votchallenge/toolkit.git
# nvidia-smi
CUDA Version: 12.2
# pip list
opencv-python                    4.10.0.84
torch                            1.9.0a0+2ecb2c7

# pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu122/torch1.9.0/index.html
#### The following error occurred
####    cv.gapi.wip.GStreamerPipeline = cv.gapi_wip_gst_GStreamerPipeline
####AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
#### Fix
# pip install opencv-python==4.5.1.48

# pip install mmdet==2.28.2
# pip install mmengine==0.10.4
# pip list
mmcv-full                        1.7.2
mmdet                            2.28.2
mmengine                         0.10.4
pillow                           10.4.0
# cd /workspace/mmtracking/
# pip install -r requirements/build.txt
# pip install -v -e .
# pip install git+https://github.com/JonathonLuiten/TrackEval.git
# pip install git+https://github.com/lvis-dataset/lvis-api.git
# pip install git+https://github.com/TAO-Dataset/tao.git

# python demo/demo_mot_vis.py configs/mot/deepsort/deepsort_faster-rcnn_fpn_4e_mot17-private-half.py \
  --input demo/demo.mp4 --output mot.mp4 --show
#### The following error occurred
####    from .cv2 import *
####ImportError: libGL.so.1: cannot open shared object file: No such file or directory
#### Fix
# apt-get update
# apt-get install libgl1
#### The following error occurred
####    NumpyArray = npt.NDArray[Any]
####AttributeError: module 'numpy.typing' has no attribute 'NDArray'
#### Fix
# pip install Pillow==9.5.0
#### The following error occurred
#### Could not load the Qt platform plugin "xcb" in "/opt/conda/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
####This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
#### Find out where the error comes from
# export QT_DEBUG_PLUGINS=1
####Cannot load library /opt/conda/lib/python3.8/site-packages/cv2/qt/plugins/platforms/libqxcb.so: (libSM.so.6: cannot open shared object file: No such file or directory)
#### Fix
# apt-get install -y libsm6 libxext6 libxrender-dev
# pip install opencv-contrib-python==4.5.1.48

# wget https://download.openmmlab.com/mmtracking/mot/ocsort/mot_dataset/ocsort_yolox_x_crowdhuman_mot17-private-half_20220813_101618-fe150582.pth
# python demo/demo_mot_vis.py configs/mot/ocsort/ocsort_yolox_x_crowdhuman_mot17-private-half.py \
  --checkpoint ocsort_yolox_x_crowdhuman_mot17-private-half_20220813_101618-fe150582.pth \
  --input demo/demo.mp4 --output mot.mp4 --show

# wget https://download.openmmlab.com/mmtracking/vid/selsa/selsa_faster_rcnn_r101_dc5_1x_imagenetvid/selsa_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172724-aa961bcc.pth
# python demo/demo_vid.py configs/vid/selsa/selsa_faster_rcnn_r101_dc5_1x_imagenetvid.py \
  --checkpoint selsa_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172724-aa961bcc.pth \
  --input demo/demo.mp4 --show

# python demo/demo_sot.py configs/sot/siamese_rpn/siamese_rpn_r50_20e_lasot.py \
  --input demo/demo.mp4 --output sot.mp4 --show
#### Draw a box around the object to track with the mouse, then press the space bar

MMOCR study notes

Reference: https://mmocr.readthedocs.io/en/dev-1.x/
Reference: https://github.com/open-mmlab/mmocr

$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:21.05-py3
$ docker run --gpus all -it --name MMOCR nvcr.io/nvidia/pytorch:21.05-py3
$ docker start MMOCR
$ docker attach MMOCR
# <ctrl+p><ctrl+q>
$ docker attach MMOCR
$ docker stop MMOCR
$ docker rm MMOCR
$ docker run --gpus all -it --name MMOCR --shm-size=8G \
  -v /mnt/Data/MMOCR/mmocr:/workspace/mmocr \
  -v /mnt/QNAP_A/ImageData/ICDAR:/mmocr/data \
  nvcr.io/nvidia/pytorch:21.05-py3

# pip install -U openmim
#### The following error occurred
####ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
####We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
#### Safe to ignore

Check the version table at the bottom of https://mmocr.readthedocs.io/en/dev-1.x/get_started/install.html
and install the matching package versions.
# mim list
# mim install mmengine==
# mim install mmengine
# mim install mmcv==2.0.1
#### The following error occurred
####    cv.gapi.wip.GStreamerPipeline = cv.gapi_wip_gst_GStreamerPipeline
####AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
#### Fix
#### The installed opencv-python is 4.10.0.84, so downgrade it
# pip install opencv-python==4.5.1.48
# mim install mmcv==2.0.1
# mim install mmdet==3.1.0
# cd /workspace/mmocr/
# pip install -v -e .
# pip install opencv-python-headless==4.5.1.48
# pip install -r requirements/albu.txt
# pip install -r requirements.txt
# python tools/infer.py demo/images --det DBNet --rec CRNN --print-result \
  --save_pred --save_vis --out-dir='results/' --batch-size=2

Wednesday, August 28, 2024

SwinTransformer study notes

Reference: https://github.com/microsoft/Swin-Transformer?tab=readme-ov-file
Reference: https://github.com/SwinTransformer/Swin-Transformer-Object-Detection
Reference: https://mmdetection.readthedocs.io/en/latest/get_started.html
Reference: https://github.com/open-mmlab/mmdetection/blob/master/docs/en/get_started.md
Reference: https://github.com/open-mmlab/mmcv
Reference: https://github.com/open-mmlab/mim?tab=readme-ov-file

Swin-Transformer is the original repository, mainly for image classification.
Swin-Transformer-Object-Detection targets object detection (built on mmdetection).
mmdetection targets object detection and includes not just Swin-Transformer but a wide range of state-of-the-art algorithms.
In mmdetection, the required mmcv version can be looked up in mmdet/__init__.py.
Swin-Transformer-Object-Detection requires mmcv 1.4.0,
but the last mmdetection version that matches mmcv 1.4.0 is v2.18.1.
So below I drop Swin-Transformer-Object-Detection and use the latest mmdetection directly.

$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:21.05-py3
$ docker run --gpus all -it --name SwinTransformer nvcr.io/nvidia/pytorch:21.05-py3
$ docker start SwinTransformer
$ docker attach SwinTransformer
# <ctrl+p><ctrl+q>
$ docker attach SwinTransformer
$ docker stop SwinTransformer
$ docker rm SwinTransformer
$ git clone https://github.com/microsoft/Swin-Transformer.git
$ git clone https://github.com/SwinTransformer/Swin-Transformer-Object-Detection.git
$ git clone https://github.com/open-mmlab/mmdetection.git
$ docker run --gpus all -it --name SwinTransformer --shm-size=8G \
  -v /mnt/Data/SwinTransformer/Swin-Transformer:/workspace/Swin-Transformer \
  -v /mnt/Data/SwinTransformer/mmdetection:/workspace/mmdetection \
  -v /mnt/Data/SwinTransformer/Swin-Transformer-Object-Detection:/workspace/Swin-Transformer-Object-Detection \
  -v /mnt/QNAP_A/ImageData/ImageNet:/workspace/ImageNet \
  nvcr.io/nvidia/pytorch:21.05-py3

# pip install timm==0.4.12
# pip install opencv-python==4.4.0.46 termcolor==1.1.0 yacs==0.1.8 pyyaml scipy

# cd /workspace/Swin-Transformer/kernels/window_process/
# python setup.py install

# cd /workspace/Swin-Transformer/
# wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
# python -m torch.distributed.launch --nproc_per_node 1 --master_port 12345 main.py \
  --eval --cfg configs/swin/swin_tiny_patch4_window7_224.yaml \
  --resume swin_tiny_patch4_window7_224.pth --data-path /workspace/ImageNet \
  --batch_size=64
#### The following error occurred
####  File "/opt/conda/lib/python3.8/site-packages/PIL/_typing.py", line 10, in <module>
####    NumpyArray = npt.NDArray[Any]
####AttributeError: module 'numpy.typing' has no attribute 'NDArray'
#### Fix
# pip install Pillow==9.5.0
#### The following error occurred
####RuntimeError: Found 0 files in subfolders of: /workspace/ImageNet/val
####Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp
#### Fix
$ cd /mnt/QNAP_A/ImageData/ImageNet/
$ mv val val_a ;mkdir val; mv val_a val
#### The following error occurred
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
#### Fix: add --shm-size=8G to the docker run command

# cd /workspace/mmdetection
# pip install -U openmim
#### The following error occurred
####ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
####We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
#### Safe to ignore

# mim list
# mim install mmengine
# mim install "mmcv>=2.0.0"
# pip install -v -e .
    # "-v" means verbose, or more output
    # "-e" means installing a project in editable mode,
    # thus any local modifications made to the code will take effect without reinstallation.
# mim download mmdet --config yolov3_mobilenetv2_8xb24-320-300e_coco --dest .
# mim list
# ls
# python demo/image_demo.py demo/demo.jpg yolov3_mobilenetv2_8xb24-320-300e_coco.py \
  --weights yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth
#### The following error occurred
####    import cv2
####  File "/opt/conda/lib/python3.8/site-packages/cv2/__init__.py", line 5, in <module>
####    from .cv2 import *
####ImportError: libGL.so.1: cannot open shared object file: No such file or directory
#### The following error occurred
####    cv.gapi.wip.GStreamerPipeline = cv.gapi_wip_gst_GStreamerPipeline
####AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
#### Fix
# pip install opencv-python-headless==4.4.0.46
#### The following error occurred
####    assert (mmcv_version >= digit_version(mmcv_minimum_version)
####AssertionError: MMCV==2.2.0 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.2.0.
#### Fix
# mim install "mmcv==2.0.0rc4"

# mim download mmdet --config mask-rcnn_swin-t-p4-w7_fpn_1x_coco --dest .
# python demo/image_demo.py demo/demo.jpg mask-rcnn_swin-t-p4-w7_fpn_1x_coco.py \
  --weights mask_rcnn_swin-t-p4-w7_fpn_1x_coco_20210902_120937-9d6b7cfa.pth

Jetson Container usage notes

mic-733ao@ubuntu:~/Data/AgentStudio/jetson-containers$ ./run.sh --workdir=/opt/text-generation-webui $(./autotag text-generation-webui) python3 server.py --model-dir=/data/models/text-generation-webui --listen --verbose --trust-remote-code

mic-733ao@ubuntu:~/Data/AgentStudio/jetson-containers$ ./run.sh --workdir=/opt/text-generation-webui $(./autotag text-generation-webui) /bin/bash -c 'python3 download-model.py --output=/data/models/text-generation-webui/THUDM_cogvlm2-llama3-chinese-chat-19B-int4 THUDM/cogvlm2-llama3-chinese-chat-19B-int4'

An authorization error occurs:
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/taide/TAIDE-LX-7B/resolve/main/README_en.md
Go to https://huggingface.co/, open the user menu at the top right, then Settings / Access Tokens.
Create a Fine-grained token and check all permissions.
$ export HUGGINGFACE_TOKEN=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

chrome://flags/#unsafely-treat-insecure-origin-as-secure

Friday, November 10, 2023

Fine-tuning Whisper in a Google Colab

Reference: https://research.google.com/colaboratory/local-runtimes.html
This lets Colab use the local CPU and GPU.
The documentation says either Docker or Jupyter can be used,
but only Jupyter worked for me.

Create a Hugging Face account and log in.
Open https://huggingface.co/settings/tokens
Press New token
Choose a Role (read or write)
Press copy
Paste the token when running the following command.

$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) Y
Token is valid (permission: read).
$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
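
The same login can also be done from Python with huggingface_hub (a sketch; the token string is a placeholder for one generated at https://huggingface.co/settings/tokens):

from huggingface_hub import login

# Equivalent to `huggingface-cli login` with a pasted token
login(token='hf_xxx', add_to_git_credential=True)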

Wednesday, July 12, 2023

install pytorch in ubuntu


$ python3 -m venv pytorch
$ source pytorch/bin/activate
$ pip3 install --upgrade --no-cache-dir pip
$ sudo update-alternatives --config cuda
$ pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

Wednesday, June 21, 2023

How to use a YOLOv4 model produced by TAO in Python

Follow the NVIDIA TAO Computer Vision Sample Workflows to produce a yolov4-tiny model.
This model is not the same as a model produced the usual way;
a model produced the usual way can be used in Python by following tensorrt_demos.

Only the model (.etlt) exported by tao yolo_v4_tiny export can be used with DeepStream,
while the trt.engine produced by tao converter can be used neither with DeepStream nor in Python.

There is documentation on how to serve TAO-produced models on a Triton server.
From yolov3_postprocessor.py you can see that a TAO-produced YOLO
has already applied NMS to its outputs and places the results in:
BatchNMS (-1, 1): number of detections
BatchNMS_1 (-1, 200, 4): box coordinates
BatchNMS_2 (-1, 200): confidence scores
BatchNMS_3 (-1, 200): class IDs
The input format also changes:
an image read with cv2 needs no cvtColor and no division by 255.0;
it only has to be transposed from BHWC to BCHW:
img = img.transpose((2, 0, 1)).astype(np.float32)

TAO runs inside Docker, which makes debugging hard.
The following command drops you straight into the container, where you can run Python, check versions, inspect the environment, and so on:
docker run -it --rm --gpus all \
  -v "/mnt/Data/tao/yolo_v4_tiny_1.4.1":"/workspace/tao-experiments" \
  -v "/mnt/Data/TensorRT/tensorrt_demos":"/workspace/tensorrt_demos" \
  -v "/mnt/CT1000SSD/ImageData/Light":"/workspace/Light" \
  nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5 \
  bash

To convert the model to TensorRT, besides using
!tao converter -k $KEY \
                   -p Input,1x3x416x416,8x3x416x416,16x3x416x416 \
                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
                   -t fp32 \
                   $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt
you can also use
!tao-deploy yolo_v4_tiny gen_trt_engine \
  -m $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt \
  -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
  -k $KEY \
  --data_type fp32 \
  --batch_size 1 \
  --engine_file $USER_EXPERIMENT_DIR/export/yolov4_tao_deplay.trt
But to convert on a different platform,
download and install the TAO Converter (see its documentation) and run the conversion:
./tao-converter_v4.0.0_trt8.5.1.7 \
  -k nvidia_tlt \
  -p Input,1x3x416x416,2x3x416x416,4x3x416x416 \
  -e yolo_v4_tiny_1.4.1/yolo_v4_tiny/export/yolov4_tao_converter_fp32.engine \
  -t fp32 \
  yolo_v4_tiny_1.4.1/yolo_v4_tiny/export/yolov4_cspdarknet_tiny_epoch_080.etlt

Based on tensorrt_demos, modify utils/yolo_with_plugins.py and rename it triton_yolo_with_plugins.py, as follows:
"""yolo_with_plugins.py
Implementation of TrtYOLO class with the yolo_layer plugins.
"""
from __future__ import print_function
import ctypes
import numpy as np
import cv2
import tensorrt as trt
import pycuda.driver as cuda

try:
    ctypes.cdll.LoadLibrary('./plugins/libyolo_layer.so')
except OSError as e:
    raise SystemExit('ERROR: failed to load ./plugins/libyolo_layer.so.  '
                     'Did you forget to do a "make" in the "./plugins/" '
                     'subdirectory?') from e

def _preprocess_yolo(img, input_shape, letter_box=False):
    """Preprocess an image before TRT YOLO inferencing.
    # Args
        img: int8 numpy array of shape (img_h, img_w, 3)
        input_shape: a tuple of (H, W)
        letter_box: boolean, specifies whether to keep aspect ratio and
                    create a "letterboxed" image for inference
    # Returns
        preprocessed img: float32 numpy array of shape (3, H, W)
    """
    if letter_box:
        img_h, img_w, _ = img.shape
        new_h, new_w = input_shape[0], input_shape[1]
        offset_h, offset_w = 0, 0
        if (new_w / img_w) <= (new_h / img_h):
            new_h = int(img_h * new_w / img_w)
            offset_h = (input_shape[0] - new_h) // 2
        else:
            new_w = int(img_w * new_h / img_h)
            offset_w = (input_shape[1] - new_w) // 2
        resized = cv2.resize(img, (new_w, new_h))
        img = np.full((input_shape[0], input_shape[1], 3), 127, dtype=np.uint8)
        img[offset_h:(offset_h + new_h), offset_w:(offset_w + new_w), :] = resized
    else:
        img = cv2.resize(img, (input_shape[1], input_shape[0]))

    #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.transpose((2, 0, 1)).astype(np.float32)
    #img /= 255.0
    return img

class HostDeviceMem(object):
    """Simple helper data class that's a little nicer to use than a 2-tuple."""
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()

def get_input_shape(engine):
    """Get input shape of the TensorRT YOLO engine."""
    binding = engine[0]
    assert engine.binding_is_input(binding)
    binding_dims = engine.get_binding_shape(binding)
    if len(binding_dims) == 4:
        return tuple(binding_dims[2:])
    elif len(binding_dims) == 3:
        return tuple(binding_dims[1:])
    else:
        raise ValueError('bad dims of binding %s: %s' % (binding, str(binding_dims)))

def allocate_buffers(engine, context):
    """Allocates all host/device in/out buffers required for an engine."""
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        binding_dims = engine.get_binding_shape(binding)
        binding_dtype = engine.get_tensor_dtype(binding)
        binding_format = engine.get_tensor_format_desc(binding)
        binding_loc = engine.get_tensor_location(binding)
        binding_mode = engine.get_tensor_mode(binding)
        binding_shape = engine.get_tensor_shape(binding)
        binding_shape_inference = engine.is_shape_inference_io(binding)
        print('binding_dims:{} {} {}'.format(binding, binding_dims, binding_dtype))
        print('  {}'.format(binding_format))
        print('  {} {} {} {}'.format(binding_loc, binding_mode, binding_shape, binding_shape_inference))
        size = trt.volume(binding_dims)
        if size < 0:  # a dynamic dimension (-1); allocate for batch size 1
            size = -size
        print('  size:{}'.format(size))
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            #binding_pro_shape = engine.get_profile_shape(0, binding)
            #print('  {}'.format(binding_pro_shape))
            if binding_dims[0] == -1:
                alloc_dims = np.copy(binding_dims)
                alloc_dims[0] = 1
                context.set_binding_shape(0, alloc_dims)
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream

def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    """do_inference (for TensorRT 6.x or lower)
    This function is generalized for multiple inputs/outputs.
    Inputs and outputs are expected to be lists of HostDeviceMem objects.
    """
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size,
                          bindings=bindings,
                          stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

def do_inference_v2(context, bindings, inputs, outputs, stream):
    """do_inference_v2 (for TensorRT 7.0+)
    This function is generalized for multiple inputs/outputs for full
    dimension networks.
    Inputs and outputs are expected to be lists of HostDeviceMem objects.
    """
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

class TrtYOLO(object):
    """TrtYOLO class encapsulates things needed to run TRT YOLO."""
    def _load_engine(self):
        # self.model already holds the full path to the engine file
        TRTbin = self.model
        with open(TRTbin, 'rb') as f, trt.Runtime(self.trt_logger) as runtime:
            return runtime.deserialize_cuda_engine(f.read())

    def __init__(self, model, category_num=80, letter_box=False, cuda_ctx=None):
        """Initialize TensorRT plugins, engine and conetxt."""
        self.model = model
        self.category_num = category_num
        self.letter_box = letter_box
        self.cuda_ctx = cuda_ctx
        if self.cuda_ctx:
            self.cuda_ctx.push()

        self.inference_fn = do_inference if trt.__version__[0] < '7' \
                                         else do_inference_v2
        self.trt_logger = trt.Logger(trt.Logger.INFO)
        # add for errors
        # IPluginCreator not found in Plugin Registry
        # getPluginCreator could not find plugin: BatchedNMSDynamic_TRT version: 1
        # Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed
        trt.init_libnvinfer_plugins(self.trt_logger, namespace="")
        self.engine = self._load_engine()

        self.input_shape = get_input_shape(self.engine)

        try:
            self.context = self.engine.create_execution_context()
            self.inputs, self.outputs, self.bindings, self.stream = \
                allocate_buffers(self.engine, self.context)
        except Exception as e:
            raise RuntimeError('fail to allocate CUDA resources') from e
        finally:
            if self.cuda_ctx:
                self.cuda_ctx.pop()

    def __del__(self):
        """Free CUDA memories."""
        del self.outputs
        del self.inputs
        del self.stream

    def detect(self, img, letter_box=None):
        """Detect objects in the input image."""
        letter_box = self.letter_box if letter_box is None else letter_box
        img_h, img_w, _ = img.shape
        img_resized = _preprocess_yolo(img, self.input_shape, letter_box)
        #print(img_resized.shape, img_resized.dtype)

        # Set host input to the image. The do_inference() function
        # will copy the input to the GPU before executing.
        self.inputs[0].host = np.ascontiguousarray(img_resized)
        if self.cuda_ctx:
            self.cuda_ctx.push()
        trt_outputs = self.inference_fn(
            context=self.context,
            bindings=self.bindings,
            inputs=self.inputs,
            outputs=self.outputs,
            stream=self.stream)
        if self.cuda_ctx:
            self.cuda_ctx.pop()

        y_pred = [i.reshape(1, -1,)[:1] for i in trt_outputs]
        keep_k, boxes, scores, cls_id = y_pred
        #print(keep_k.shape)
        #print(boxes.shape)
        keep_k[0,0] = 1
        locs = np.empty((0,4), dtype=np.uint)
        cids = np.empty((0,1), dtype=np.uint)
        confs = np.empty((0,1), dtype=np.float32)
        for idx, k in enumerate(keep_k.reshape(-1)):
            mul = np.array([img_w,img_h,img_w,img_h])
            loc = boxes[idx].reshape(-1, 4)[:k] * mul
            loc = loc.astype(np.uint)
            cid = cls_id[idx].reshape(-1, 1)[:k]
            cid = cid.astype(np.uint)
            conf = scores[idx].reshape(-1, 1)[:k]
            locs = np.concatenate((locs, loc), axis=0)
            cids = np.concatenate((cids, cid), axis=0)
            confs = np.concatenate((confs, conf), axis=0)
        #print(locs.shape, cids.shape, confs.shape)
        #print(locs, cids, confs)
        return locs, confs, cids

The following program uses the module above:
import cv2
import numpy as np
import tensorrt as trt
import pycuda.autoinit # This is needed for initializing CUDA driver
import pycuda.driver as cuda
from utils.triton_yolo_with_plugins import TrtYOLO

#MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/yolov4_tao_convert.engine'
MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/yolov4_tao_deplay.trt'
#MODEL_PATH = '/workspace/tao-experiments/yolo_v4_tiny/export/trt.engine'
        
def main():
    trt_yolo = TrtYOLO(MODEL_PATH, 5, True)
    img_org = cv2.imread('bb.jpg')
    img = np.copy(img_org)
    print(img.shape, img.dtype)
    boxes, confs, clss = trt_yolo.detect(img, False)
    print(boxes.shape, confs.shape, clss.shape)
    print(boxes, confs, clss)
    for box, conf, cls in zip(boxes, confs, clss):
        x_min, y_min, x_max, y_max = int(box[0]), int(box[1]), int(box[2]), int(box[3])
        cv2.rectangle(img, (x_min, y_min), (x_max, y_max), (255, 255, 255), 2)
        print(box, conf, cls)
    cv2.imwrite('aa.jpg', img)
    print('aaa')

if __name__ == '__main__':
    main()

Debugging notes
Message: Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed
Fix: the TensorRT versions do not match; install a matching version or run inside Docker.
Message: IPluginCreator not found in Plugin Registry
Message: getPluginCreator could not find plugin: BatchedNMSDynamic_TRT version: 1
Fix: TensorRT OSS must be installed,
and add the following before _load_engine():
trt.init_libnvinfer_plugins(self.trt_logger, namespace="")

Tuesday, March 7, 2023

install opencv4.6.0 with CUDA for Jetson

Reference: https://forums.developer.nvidia.com/t/best-way-to-install-opencv-with-cuda-on-jetpack-5-xavier-nx-opencv-for-tegra/222777
Download install_opencv4.6.0_jetson.sh; a backup copy follows:
#!/bin/bash
#
# Copyright (c) 2022, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.
#

version="4.6.0"
folder="workspace"

for (( ; ; ))
do
    echo "Do you want to remove the default OpenCV (yes/no)?"
    read rm_old

    if [ "$rm_old" = "yes" ]; then
        echo "** Remove other OpenCV first"
        sudo apt -y purge *libopencv*
break
    elif [ "$rm_old" = "no" ]; then
break
    fi
done


echo "------------------------------------"
echo "** Install requirement (1/4)"
echo "------------------------------------"
sudo apt-get update
sudo apt-get install -y build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt-get install -y python3.8-dev python-dev python-numpy python3-numpy
sudo apt-get install -y libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev
sudo apt-get install -y libv4l-dev v4l-utils qv4l2 v4l2ucp
sudo apt-get install -y curl


echo "------------------------------------"
echo "** Download opencv "${version}" (2/4)"
echo "------------------------------------"
mkdir $folder
cd ${folder}
curl -L https://github.com/opencv/opencv/archive/${version}.zip -o opencv-${version}.zip
curl -L https://github.com/opencv/opencv_contrib/archive/${version}.zip -o opencv_contrib-${version}.zip
unzip opencv-${version}.zip
unzip opencv_contrib-${version}.zip
rm opencv-${version}.zip opencv_contrib-${version}.zip
cd opencv-${version}/


echo "------------------------------------"
echo "** Build opencv "${version}" (3/4)"
echo "------------------------------------"
mkdir release
cd release/
cmake -D WITH_CUDA=ON -D WITH_CUDNN=ON -D CUDA_ARCH_BIN="7.2,8.7" -D CUDA_ARCH_PTX="" -D OPENCV_GENERATE_PKGCONFIG=ON -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-${version}/modules -D WITH_GSTREAMER=ON -D WITH_LIBV4L=ON -D BUILD_opencv_python3=ON -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j$(nproc)


echo "------------------------------------"
echo "** Install opencv "${version}" (4/4)"
echo "------------------------------------"
sudo make install
echo 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PYTHONPATH=/usr/local/lib/python3.8/site-packages/:$PYTHONPATH' >> ~/.bashrc
source ~/.bashrc


echo "** Install opencv "${version}" successfully"
echo "** Bye :)"

Add cv2 to python3:
$ export PYTHONPATH=$PYTHONPATH:/usr/local/lib/python3.8/site-packages
$ python3
>>> import cv2
>>> cnt = cv2.cuda.getCudaEnabledDeviceCount()
>>> cnt
1 means CUDA is available
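
A quick way to confirm the CUDA modules actually work is to run one operation on the GPU. A minimal sketch (the image path is only an example):

import cv2

print(cv2.cuda.getCudaEnabledDeviceCount())   # expect 1 on the Jetson

img = cv2.imread('test.jpg')                  # any local image

gpu = cv2.cuda_GpuMat()
gpu.upload(img)                               # copy the frame to GPU memory
gpu_small = cv2.cuda.resize(gpu, (640, 360))  # CUDA-accelerated resize
small = gpu_small.download()                  # copy the result back to the host
print(small.shape)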

Tuesday, October 25, 2022

tensorflow predict memory leak

Memory usage keeps growing until the system crashes.
$ top
shows VIRT and RES growing over time.
$ jtop
shows Mem growing over time as well.

Check how much memory the current process is using:
import psutil
print(psutil.Process().memory_info().rss / (1024 * 1024 * 1024))  # resident set size, in GB
print(psutil.Process().memory_info().vms / (1024 * 1024 * 1024))  # virtual memory size, in GB

Profile memory usage line by line:
from memory_profiler import profile
@profile(precision=4, stream=open('memory_profiler.log', 'w+'))
def function():
    ...
@profile  # print the report directly to stdout
def function():
    ...
but this did not reveal anything useful.

A common suggestion online is that the leak comes from implicit numpy-to-tensor conversion, so convert explicitly:
state = tf.convert_to_tensor(state)
model.predict(state)
states = tf.convert_to_tensor(states)
model.fit(states)
This did not help.

Garbage collection:
import gc
gc.collect()
This did not help either.

The last resort, and it worked:
import tensorflow as tf
tf.keras.backend.clear_session()
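
For reference, a minimal sketch of where the call can sit in a prediction loop; the tiny model and random data below are only stand-ins for the real training code:

import gc
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])  # stand-in model

for step in range(10000):
    state = tf.convert_to_tensor(np.random.rand(1, 8).astype(np.float32))  # stand-in state
    action = model.predict(state, verbose=0)
    if step % 1000 == 0:
        tf.keras.backend.clear_session()  # drop Keras' accumulated global state
        gc.collect()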

Thursday, October 20, 2022

python: cannot allocate memory in static TLS block

This problem is actually caused by a conflict between gym and tensorflow.
Once it is fixed, the following error no longer appears:

OSError: /usr/lib/xxxx/libxxxx.so.0: cannot allocate memory in static TLS block

The fix is:
$ export LD_PRELOAD=/usr/lib/xxxx/libxxxx.so.0:$LD_PRELOAD

tensorflow on Xavier: cannot allocate memory in static TLS block

This problem is actually caused by a conflict between gym and tensorflow.
Once it is fixed, the following error no longer appears:

Traceback (most recent call last):
  File "/home/UserName/envs/tf2.10.0/lib/python3.8/site-packages/tensorflow/python/pywrap_tensorflow.py", line 62, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /home/UserName/envs/tf2.10.0/lib/python3.8/site-packages/tensorflow/python/../../tensorflow_cpu_aws.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

$ vi .bashrc
export LD_PRELOAD=/home/UserName/envs/tf2.10.0/lib/python3.8/site-packages/tensorflow/python/../../tensorflow_cpu_aws.libs/libgomp-d22c30c5.so.1.0.0

Thursday, September 15, 2022

OpenAI Gym

OpenAI Gym is now at v0.21.0 and has moved to Python 3.7, which breaks a lot of older programs.
$ pip install gym==v0.20.0
$ pip install pygame

The following error appears:
AttributeError: module 'ale_py.gym' has no attribute 'ALGymEnv'
$ pip install ale-py==0.7

If the game ROMs cannot be found:
$ pip install autorom
$ AutoROM

On Xavier:
$ pip install gym==v0.19.0
$ pip install gym[atari]
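
A minimal smoke test against the old-style reset/step API that these pinned versions still use (CartPole just as an example):

import gym  # gym==0.20.0 (or 0.19.0 on Xavier) as installed above

env = gym.make('CartPole-v1')
obs = env.reset()                       # old API: reset() returns only the observation
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()  # random policy, just to exercise the API
    obs, reward, done, info = env.step(action)  # old API: 4-tuple
    total_reward += reward
print('episode reward:', total_reward)
env.close()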

Friday, January 14, 2022

deepstream with python

Reference: DeepStream Python Apps
$ git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps.git
$ cd deepstream_python_apps/bindings/
$ git submodule update --init
$ cd ../3rdparty/gst-python/
$ ./autogen.sh
$ make
$ make  install
$ cd  ../../bindings/
$ mkdir build
$ cd build
$ cmake ..
$ make
$ cd ..
$ mkdir export_pyds
$ cp build/pyds*.whl export_pyds
$ pip3 install export_pyds/pyds-1.1.0-py3-none-linux_x86_64.whl



Monday, December 13, 2021

python onvif

https://github.com/quatanium/python-onvif
only works with Python 2;
https://github.com/FalkTannhaeuser/python-onvif-zeep
is the one that works with Python 3.


Monday, June 7, 2021

python: pip install --upgrade pip fails

Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 290, in run
    with self._build_session(options) as session:
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 69, in _build_session
    if options.cache_dir else None
  File "/usr/lib/python3.6/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not int

If the pip upgrade fails like this, try:
$ pip install --upgrade --no-cache-dir pip
$ python3 -m pip install --upgrade --no-cache-dir pip -i https://pypi.python.org/simple

Thursday, March 11, 2021

pyinstaller

Reference: the pyinstaller usage documentation

(tensorflow) D:\PlateOcr> pyinstaller --add-binary d:\your_path\opencv_ffmpeg340_64.dll;opencv_ffmpeg340_64.dll --hidden-import opencv-python --hidden-import cv2 --hidden-import another --paths D:\your_path_to_opencv_lib:D:\your_path_to_cv2.xxx.pyd -F PlateOcrEval.py

The command that finally worked:
(tensorflow) D:\PlateOcr> pyinstaller --hidden-import cv2 --paths D:\your_path_to_opencv_lib:D:\your_path_to_cv2.xxx.pyd -F PlateOcrEval.py

Friday, January 29, 2021

install pycuda

sudo pip3 install --global-option=build_ext --global-option="-I/usr/local/cuda-10.2/targets/aarch64-linux/include/" --global-option="-L/usr/local/cuda-10.2/targets/aarch64-linux/lib/" pycuda

https://docs.donkeycar.com/guide/robot_sbc/tensorrt_jetson_nano/
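
A quick check that pycuda can see the GPU (a minimal sketch):

import pycuda.autoinit            # initializes the CUDA driver and creates a context
import pycuda.driver as cuda

dev = cuda.Device(0)
print(dev.name(), dev.compute_capability())
print('total memory (MB):', dev.total_memory() // (1024 * 1024))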