Reference: https://docs.nvidia.com/nim/riva/asr/latest/overview.html
# Use your own key from ngc.nvidia.com — never publish a real one
$ export NGC_API_KEY="<your-ngc-api-key>"
$ echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
Image availability on this (arm64) machine:
parakeet-0-6b-ctc-en-us         no arm64 build
parakeet-1-1b-ctc-en-us         ok
parakeet-tdt-0.6b-v2            no arm64 build
parakeet-1-1b-rnnt-multilingual failed
parakeet-ctc-0.6b-zh-cn         no arm64 build
parakeet-ctc-0.6b-zh-tw         no arm64 build
$ export CONTAINER_ID=parakeet-ctc-0.6b-zh-tw
$ export NIM_TAGS_SELECTOR="mode=str,vad=silero,diarizer=sortformer"
$ docker run -it --rm --name=$CONTAINER_ID \
--gpus '"device=0"' \
--shm-size=8GB \
-e NGC_API_KEY \
-e NIM_HTTP_API_PORT=9001 \
-e NIM_GRPC_API_PORT=50052 \
-p 9001:9001 \
-p 50052:50052 \
-e NIM_TAGS_SELECTOR \
nvcr.io/nim/nvidia/$CONTAINER_ID:latest
# Create the cache directory on the host machine:
$ export LOCAL_NIM_CACHE=$(pwd)/cache/nim
$ mkdir -p $LOCAL_NIM_CACHE
$ chmod 777 $LOCAL_NIM_CACHE
# Set the appropriate values
$ export CONTAINER_ID=parakeet-1-1b-ctc-en-us
$ export NIM_TAGS_SELECTOR="name=parakeet-1-1b-ctc-en-us,mode=all,vad=silero,diarizer=sortformer,model_type=prebuilt"
# Run the container with the cache directory mounted in the appropriate location:
$ docker run -it --rm --name=$CONTAINER_ID \
--gpus '"device=0"' \
--shm-size=8GB \
-e NGC_API_KEY \
-e NIM_TAGS_SELECTOR \
-e NIM_HTTP_API_PORT=9001 \
-e NIM_GRPC_API_PORT=50052 \
-p 9001:9001 \
-p 50052:50052 \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
nvcr.io/nim/nvidia/$CONTAINER_ID:latest
$ curl -X 'GET' 'http://localhost:9001/v1/health/ready'
{"status":"ready"}
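The readiness endpoint returns a small JSON body. A minimal stdlib-only sketch for checking and polling it (`is_ready` and `wait_for_ready` are hypothetical helpers; the URL and port mirror the run command above):

```python
import json
import time
import urllib.request

def is_ready(body: str) -> bool:
    """Return True when the health endpoint's JSON body reports readiness."""
    try:
        return json.loads(body).get("status") == "ready"
    except json.JSONDecodeError:
        return False

def wait_for_ready(url: str = "http://localhost:9001/v1/health/ready") -> None:
    """Poll the health endpoint until it answers {"status":"ready"}."""
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if is_ready(resp.read().decode()):
                    return
        except OSError:
            pass  # container still starting; retry
        time.sleep(2)
```

Polling is useful because the container can take a while to pull and load models on first start.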
$ uv init riva-client
$ cd riva-client/
$ rm .python-version
$ uv venv --python 3.10
$ uv add nvidia-riva-client
$ git clone https://github.com/nvidia-riva/python-clients.git
$ sudo apt-get install python3-pip
$ pip install -U nvidia-riva-client
$ docker cp $CONTAINER_ID:/opt/riva/wav/zh-TW_sample.wav .
$ python3 python-clients/scripts/asr/transcribe_file.py \
--server 0.0.0.0:50052 \
--list-models
Available ASR models
{'en-US': [{'model': ['parakeet-1.1b-en-US-asr-streaming-silero-vad-sortformer']}]}
$ python3 python-clients/scripts/asr/transcribe_file_offline.py \
--server 0.0.0.0:50052 \
--list-models
Available ASR models
{}
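The `--list-models` output is a dict keyed by language code, where each entry carries a `model` list. A small hypothetical helper to flatten that structure into plain model names per language:

```python
def flatten_models(listing: dict) -> dict:
    """Map each language code to a flat list of model names.

    `listing` has the shape printed by --list-models, e.g.
    {'en-US': [{'model': ['parakeet-1.1b-en-US-asr-streaming-...']}]}.
    """
    return {
        lang: [name for entry in entries for name in entry.get("model", [])]
        for lang, entries in listing.items()
    }
```

With the streaming listing above this yields `{'en-US': ['parakeet-1.1b-en-US-asr-streaming-silero-vad-sortformer']}`; the empty offline listing simply flattens to `{}`.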
$ curl -s http://0.0.0.0:9001/v1/audio/transcriptions -F language=zh-TW \
-F file="@zh-TW_sample.wav"
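The same HTTP transcription call can be made from Python with no third-party dependencies by assembling the multipart/form-data body by hand. A sketch (endpoint and field names mirror the curl command above; `build_multipart` and `transcribe` are hypothetical helpers):

```python
import uuid
import urllib.request

def build_multipart(fields: dict, file_field: str, filename: str, file_bytes: bytes):
    """Build a multipart/form-data body plus its Content-Type header value."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        (f'--{boundary}\r\nContent-Disposition: form-data; '
         f'name="{file_field}"; filename="{filename}"\r\n'
         f'Content-Type: audio/wav\r\n\r\n').encode() + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def transcribe(path: str, language: str = "zh-TW",
               url: str = "http://localhost:9001/v1/audio/transcriptions") -> str:
    """POST a WAV file to the NIM HTTP transcription endpoint."""
    with open(path, "rb") as f:
        body, ctype = build_multipart({"language": language}, "file", path, f.read())
    req = urllib.request.Request(url, data=body, headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```

For example, `transcribe("zh-TW_sample.wav")` sends the same request as the curl command above.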
$ docker stop $CONTAINER_ID
$ docker rm $CONTAINER_ID
Support matrix: https://docs.nvidia.com/nim/riva/asr/latest/support-matrix.html#parakeet-0-6b-ctc-taiwanese-mandarin-english
Available modes are streaming low latency (str),
streaming high throughput (str-thr), and offline (ofl);
setting the mode to all deploys every applicable inference mode.
Profiles with silero and sortformer use Silero VAD to detect the start and end
of each utterance and the Sortformer model for speaker diarization.
CONTAINER_ID=parakeet-ctc-0.6b-zh-tw
Supported NIM_TAGS_SELECTOR values:
mode=ofl,vad=default,diarizer=disabled
mode=str,vad=default,diarizer=disabled
mode=str-thr,vad=default,diarizer=disabled
mode=all,vad=default,diarizer=disabled
mode=ofl,vad=silero,diarizer=sortformer
mode=str,vad=silero,diarizer=sortformer
mode=str-thr,vad=silero,diarizer=sortformer
mode=all,vad=silero,diarizer=sortformer
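The selector strings above follow a simple key=value grammar. A hypothetical helper that composes one while validating the mode against the four documented values:

```python
def tags_selector(mode: str, vad: str = "default", diarizer: str = "disabled") -> str:
    """Compose a NIM_TAGS_SELECTOR string from the documented options."""
    valid_modes = {"str", "str-thr", "ofl", "all"}
    if mode not in valid_modes:
        raise ValueError(f"mode must be one of {sorted(valid_modes)}, got {mode!r}")
    return f"mode={mode},vad={vad},diarizer={diarizer}"
```

For example, `tags_selector("str", vad="silero", diarizer="sortformer")` reproduces the selector exported before the first docker run above.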