$ docker pull nvcr.io/nvidia/tritonserver:22.04-py3
$ git clone https://github.com/triton-inference-server/server.git server.22.04
$ cd server.22.04/docs/examples/
$ ./fetch_models.sh
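The fetch_models.sh script downloads the sample model files (including the densenet_onnx model used later) into docs/examples/model_repository. Before starting the server you can verify the repository layout (illustrative check; the exact contents depend on the release):
$ ls model_repository/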
Start Triton Server
$ docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 \
-v/your_path_to/server.22.04/docs/examples/model_repository:/models \
nvcr.io/nvidia/tritonserver:22.04-py3 tritonserver \
--model-repository=/models
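The three published ports are Triton's standard endpoints: 8000 for HTTP/REST, 8001 for gRPC, and 8002 for Prometheus metrics. Once the server log shows the models as READY, a quick sanity check from the host is to read the metrics endpoint (illustrative):
$ curl localhost:8002/metrics | head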
Test Triton Server
$ curl -v localhost:8000/v2/health/ready
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host localhost left intact
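The 200 OK response means the server itself is ready. The v2 protocol also exposes per-model readiness and metadata endpoints, which are useful for confirming that a specific model loaded correctly (model name taken from the example repository):
$ curl localhost:8000/v2/models/densenet_onnx/ready
$ curl localhost:8000/v2/models/densenet_onnx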
Use the Triton Client
You can refer to GitHub: Triton Client Libraries and Examples, and use Docker directly:
$ docker pull nvcr.io/nvidia/tritonserver:22.04-py3-sdk
$ docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:22.04-py3-sdk
/workspace# /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '/workspace/images/mug.jpg':
15.349570 (504) = COFFEE MUG
13.227468 (968) = CUP
10.424897 (505) = COFFEEPOT
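The image_client flags select the model (-m), the number of top classes to return (-c), and the image scaling mode (-s). The same client can also talk to the server over gRPC by switching the protocol and pointing at port 8001 (shown as a sketch; the output should match the HTTP run above):
/workspace# /workspace/install/bin/image_client -i grpc -u localhost:8001 -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg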
/workspace# exit