Monday, June 7, 2021

Reinstalling Ubuntu 18.04

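Download the ISO file and write it to a USB drive with Rufus.
Open the Disks utility (search for it in the launcher) to confirm which device is the hard drive.
Back up the data, preserving file attributes: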
$ sudo cp -a /path/source/. /path/dest
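A minimal sketch of the same backup step with a quick check afterwards. backup_tree is a hypothetical helper name, not something from the original notes; it assumes a shell that can read the source (for example a root shell). Note that diff -r only compares file contents, not ownership or timestamps.

```shell
# backup_tree SRC DEST: copy the contents of SRC into DEST with cp -a,
# then compare the two trees. Hypothetical helper for illustration.
backup_tree() (
    src=$1
    dest=$2
    mkdir -p "$dest" || exit 1
    # The trailing "/." copies the contents of SRC, including dotfiles.
    cp -a "$src"/. "$dest" || exit 1
    # diff -r exits non-zero if a file is missing or its content differs.
    diff -r "$src" "$dest" >/dev/null
)
```

Usage: backup_tree /path/source /path/dest && echo "backup verified"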

Gitea backup and restore
/var/lib/gitea
$ sudo cp /etc/gitea/app.ini .
$ sudo -u git gitea dump -c /etc/gitea/app.ini
$ unzip gitea-dump-xxxx.zip
$ cd gitea-dump-xxxx
$ mv data/conf/app.ini /etc/gitea/conf/app.ini
$ mv data/* /var/lib/gitea/data/
$ mv log/* /var/lib/gitea/log/
$ mv repos/* /var/lib/gitea/repositories/
$ chown -R gitea:gitea /etc/gitea/conf/app.ini /var/lib/gitea
$ mysql --default-character-set=utf8mb4 -u$USER -p$PASS $DATABASE <gitea-db.sql
$ service gitea restart
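Before moving files around, it may help to confirm that the unpacked dump has the entries the restore steps expect. This is only a sketch: check_gitea_dump is a hypothetical helper, and the entry names are taken from the commands above, so a different Gitea version may lay out its dump differently.

```shell
# check_gitea_dump DIR: report which expected entries are missing from an
# unpacked dump directory. Hypothetical helper; the entry list is an
# assumption based on the restore commands in these notes.
check_gitea_dump() (
    dir=$1
    status=0
    for entry in data/conf/app.ini gitea-db.sql log repos; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $entry" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Usage: check_gitea_dump gitea-dump-xxxx && echo "dump looks complete"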

MariaDB backup and restore
/var/lib/mysql
$ sudo mysql
> SHOW DATABASES;
$ sudo mysqldump --all-databases > all.sql
$ sudo mysql --one-database db_name < all.sql
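To pick the db_name for --one-database, it can be handy to see which databases a dump actually contains. The sketch below relies on the "-- Current Database:" comment lines that mysqldump writes with --all-databases; list_dump_databases is a hypothetical helper name.

```shell
# list_dump_databases FILE: print the database names found in a dump
# created with mysqldump --all-databases. Hypothetical helper.
list_dump_databases() {
    # Each database section starts with: -- Current Database: `name`
    sed -n 's/^-- Current Database: `\(.*\)`$/\1/p' "$1"
}
```

Usage: list_dump_databases all.sql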

$ sudo cp -a /opt/tomcat /backup/opt/tomcat
$ sudo cp -a /opt/nvidia /backup/opt/nvidia
$ sudo cp -a /etc/nginx /backup/etc/nginx
$ sudo cp -a /etc/systemd /backup/etc/systemd
$ sudo cp -a /etc/udev/rules.d /backup/udev
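The copies above fail if the parent directory under /backup does not exist yet. One possible alternative, assuming GNU cp, is the --parents option, which recreates the full source path under the destination. backup_paths is a hypothetical helper; unlike the last command above, it would place the udev rules in /backup/etc/udev/rules.d rather than /backup/udev.

```shell
# backup_paths DEST PATH...: copy each PATH below DEST, keeping its full
# directory structure. Requires GNU cp (--parents). Hypothetical helper.
backup_paths() (
    dest=$1
    shift
    mkdir -p "$dest" || exit 1
    for path in "$@"; do
        # With --parents, /etc/nginx ends up in DEST/etc/nginx.
        cp -a --parents "$path" "$dest" || exit 1
    done
)
```

Usage (from a root shell): backup_paths /backup /opt/tomcat /opt/nvidia /etc/nginx /etc/systemd /etc/udev/rules.d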

$ sudo vi /etc/default/grub
$ sudo update-grub

Updating the NVIDIA driver
$ sudo apt-get install gcc
$ sudo apt-get install make
Download the driver from https://www.nvidia.com/download/driverResults.aspx/168347/en-us
$ chmod 755 NVIDIA-Linux-x86_64-460.32.03.run
$ sudo ./NVIDIA-Linux-x86_64-460.32.03.run
The installer complains that nvidia-drm is already loaded,
so the old driver has to be removed first:
$ sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" \
 "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*"
$ sudo apt-get purge 'nvidia*'
$ sudo apt-get autoremove
$ sudo reboot
$ sudo ./NVIDIA-Linux-x86_64-460.32.03.run
Use the --no-opengl-files option; otherwise DeepStream only shows a black window briefly and fails with the following error:
cuGraphicsGLRegisterBuffer failed with error(304) gst_eglglessink_cuda_init texture = 1
$ sudo ./NVIDIA-Linux-x86_64-460.32.03.run --no-opengl-files --dkms --no-drm
Ubuntu is still using another driver, so the new driver cannot be installed directly.
The installer can disable that driver, though: answer its questions carefully, then run it again.
Would you like to register the kernel module sources with DKMS? Answer Yes
Install NVIDIA's 32-bit compatibility libraries? Answer No
$ nvidia-smi
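The notes do not name the driver Ubuntu is still using; on a stock install it is usually nouveau, which is an assumption here. A small sketch for checking which kernel modules are loaded before and after running the installer (module_loaded is a hypothetical helper):

```shell
# module_loaded NAME [FILE]: succeed if kernel module NAME is listed in
# FILE (default /proc/modules, the same data lsmod prints).
# Hypothetical helper for illustration.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

Usage: module_loaded nouveau && echo "nouveau is still loaded"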

Installing CUDA: do not use the deb (network) installer; it seems to cause problems with which version gets installed.
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
$ sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-ubuntu1804-11-1-local/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get -y install cuda
$ sudo apt-get -y install cuda-11-1
Specifying the version (cuda-11-1) is important; otherwise the latest version gets installed.
$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
Rebooting fixes this,
but the Driver and CUDA Version that nvidia-smi reports both change afterwards.
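A rough way to see whether a reboot is still pending is to compare the version of the kernel module that is currently loaded with the one installed on disk. This is a sketch under assumptions: it takes /proc/driver/nvidia/version and modinfo as the two sources, and both function names are hypothetical.

```shell
# first_version: print the first dotted version number found on stdin.
first_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# driver_versions_match: succeed if the loaded NVIDIA kernel module has
# the same version as the module installed on disk. Hypothetical helper.
driver_versions_match() (
    loaded=$(first_version < /proc/driver/nvidia/version) || exit 2
    on_disk=$(modinfo -F version nvidia) || exit 2
    echo "loaded: $loaded, on disk: $on_disk"
    [ "$loaded" = "$on_disk" ]
)
```

Usage: driver_versions_match || echo "versions differ, a reboot is probably needed"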

Installing TensorRT
https://developer.nvidia.com/nvidia-tensorrt-download
From this page, pick the version you need and choose the deb package.
Open the documentation from the same page
and go to the TensorRT Installation Guide:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-723/install-guide/index.html
Jump to 4.1. Debian Installation
and follow the steps there.

Set the environment variables:
PATH
LD_LIBRARY_PATH
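The notes do not say which directories to add. A plausible sketch, assuming CUDA 11.1 lives in /usr/local/cuda-11.1 (the prefix that appears in the xpra section of these notes), is to put lines like these in ~/.bashrc; CUDA_HOME is just a convenience variable here.

```shell
# Assumed install prefix; adjust if CUDA was installed elsewhere.
CUDA_HOME=/usr/local/cuda-11.1

# Prepend the CUDA directories. The ${VAR:+:$VAR} form avoids a stray
# trailing colon when the variable was previously empty.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

After opening a new shell, nvcc --version should then find the compiler.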
pycuda only works with Python 3.7
$ sudo apt-get install python3.7
$ sudo apt-get install python3.7-dev
$ python3.7 -m pip install 'pycuda>=2019.1.1'

Install Nvidia Docker
$ curl https://get.docker.com | sh && sudo systemctl --now enable docker
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
$ sudo groupadd docker
$ sudo usermod -a -G docker $USER
$ sudo reboot
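The distribution variable in the repository setup above is built by sourcing /etc/os-release and joining two of its fields. A small sketch that shows the value without touching the apt configuration (distribution_id is a hypothetical helper; the optional file argument exists only to make it easy to try on a sample file):

```shell
# distribution_id [FILE]: print ID followed by VERSION_ID from an
# os-release style file (default /etc/os-release). Runs in a subshell so
# the sourced variables do not leak into the calling shell.
distribution_id() (
    . "${1:-/etc/os-release}"
    echo "$ID$VERSION_ID"
)
```

On Ubuntu 18.04 this should print ubuntu18.04, which is the directory name used in the nvidia-docker repository URL.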

Install TensorRT 7.2 OSS
Reference: https://github.com/NVIDIA/TensorRT/tree/master
$ git clone -b master https://github.com/nvidia/TensorRT TensorRT_OSS-7.2.3.4
$ cd TensorRT_OSS-7.2.3.4/
$ git submodule update --init --recursive
$ cd /pathto/TensorRT-7.2.2.3/
$ export TRT_LIBPATH=`pwd`
$ cd /pathto/TensorRT_OSS-7.2.3.4/
$ ./docker/build.sh --file docker/ubuntu-18.04.Dockerfile --tag tensorrt-ubuntu-1804 --cuda 11.1
$ ./docker/launch.sh --tag tensorrt-ubuntu-1804 --gpus all
trtuser@c2936e108d43:/workspace$ cd $TRT_OSSPATH
trtuser@c2936e108d43:/workspace/TensorRT$ mkdir -p build && cd build
trtuser@c2936e108d43:/workspace/TensorRT/build$ cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out
trtuser@c2936e108d43:/workspace/TensorRT/build$ make -j$(nproc)
trtuser@c2936e108d43:/workspace/TensorRT/build$ exit
$ cd ..
$ mkdir backup
$ mv TensorRT-7.2.2.3/targets/x86_64-linux-gnu/lib/libnvinfer_plugin.so.7.2.2 backup
$ cp TensorRT_OSS-7.2.3.4/build/out/libnvinfer_plugin.so.7.2.3 TensorRT-7.2.2.3/targets/x86_64-linux-gnu/lib/
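The last two commands swap the stock plugin library for the freshly built one. A sketch of the same idea as a reusable step that refuses to touch anything if the new file is missing; replace_with_backup is a hypothetical helper, not part of TensorRT.

```shell
# replace_with_backup NEW OLD BACKUP_DIR: move OLD into BACKUP_DIR, then
# copy NEW into the directory where OLD used to live.
# Hypothetical helper for illustration.
replace_with_backup() (
    new=$1
    old=$2
    backup_dir=$3
    # Stop early so the old library is not removed without a replacement.
    [ -f "$new" ] || { echo "not found: $new" >&2; exit 1; }
    mkdir -p "$backup_dir" || exit 1
    if [ -e "$old" ]; then
        mv "$old" "$backup_dir"/ || exit 1
    fi
    cp "$new" "$(dirname "$old")"/
)
```

Usage: replace_with_backup TensorRT_OSS-7.2.3.4/build/out/libnvinfer_plugin.so.7.2.3 TensorRT-7.2.2.3/targets/x86_64-linux-gnu/lib/libnvinfer_plugin.so.7.2.2 backup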

Install xpra
https://github.com/Xpra-org/xpra/blob/master/docs/Build/Debian.md
$ git clone https://github.com/Xpra-org/xpra.git
$ cd xpra
$ sudo ./setup.py install
Exception: ERROR: cannot find a valid pkg-config entry for nvjpeg-11.4 using PKG_CONFIG_PATH=(empty)
$ vi setup.py
/if nvjpeg_ENABLED:
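A little below that line there are two loops of the form for v in ("11.4", "11.3"...):
Change both to for v in ("11.1",): (the trailing comma matters; without it Python iterates over the characters of the string).
The linker may also fail with ld: cannot find -lcuda; linking the stub library works around it: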
$ sudo ln -s /usr/local/cuda-11.1/lib64/stubs/libcuda.so /usr/lib

Errors when upgrading pip
$ pip install --upgrade --no-cache-dir pip
$ python3 -m pip install --upgrade --no-cache-dir pip -i https://pypi.python.org/simple
