In this tutorial we are going to use different pre-built models from NVIDIA's NeMo toolkit. If you haven't downloaded and installed NeMo yet, you can go back to my previous blog post, "Getting Started with NVIDIA NeMo ASR", and follow the installation step by step.
Pre-built Models
There are two main methods to use NeMo's pre-built models: locally or from the cloud. You can find all the pre-trained models here or by printing list_available_models():
import nemo.collections.asr as nemo_asr
print(nemo_asr.models.EncDecCTCModel.list_available_models())
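The call returns a list of PretrainedModelInfo entries rather than plain strings. If you only want the model names, a minimal sketch like this should work (pretrained_model_name is the attribute that holds the name in NeMo 1.x; check your version if it differs):
# Print only the model names from the list of PretrainedModelInfo entries
for model_info in nemo_asr.models.EncDecCTCModel.list_available_models():
    print(model_info.pretrained_model_name)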
Cloud
NVIDIA GPU Cloud (NGC) lets you download NVIDIA's pre-trained models directly by using the from_pretrained() method:
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-Zh")
Then just pass the location of your WAV file (up to 20 seconds long):
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
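Putting the cloud workflow together, here is a minimal end-to-end sketch. The WAV path is just a placeholder, and QuartzNet models expect 16 kHz mono audio:
import nemo.collections.asr as nemo_asr

# Download the checkpoint from NGC (it is cached locally after the first call)
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-Zh")

# Transcribe one or more WAV files (16 kHz, mono)
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
print(transcriptions)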
Locally
The other option is to download the model, save it to a local folder, and load it with the restore_from() method.
Let's, for example, download a pre-trained checkpoint and its network configuration file, such as Multidataset-QuartzNet15x5, from NVIDIA NGC. First, you need to choose from the list here the language whose pre-trained model you want to use. For example, let's choose the Spanish model (stt_es_quartznet15x5).
Open the link, and on the next page copy the wget command into your terminal in order to download the zip file:
wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/nemo/stt_es_quartznet15x5/versions/1.0.0rc1/zip -O stt_es_quartznet15x5_1.0.0rc1.zip
Save it wherever you want and unzip the file:
unzip stt_es_quartznet15x5_1.0.0rc1.zip
Now you have a file with a .nemo extension, so just pass the location of the pre-trained model to restore_from():
quartznet = nemo_asr.models.EncDecCTCModel.restore_from('/home/user/pretrained/stt_es_quartznet15x5.nemo')
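NeMo models are regular PyTorch modules, so if you have a GPU available you can move the model there before transcribing. A minimal sketch:
import torch

# Use the GPU for faster inference if one is available
if torch.cuda.is_available():
    quartznet = quartznet.cuda()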
Then add the WAV file location and transcribe as before:
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
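If you have many files, transcribe() also accepts a list of paths and a batch_size argument (an assumption based on the NeMo 1.x signature; check your version). The file names here are placeholders:
# Transcribe several files at once; batch_size controls how many are processed per step
files = ['/home/user/wav1.wav', '/home/user/wav2.wav', '/home/user/wav3.wav']
transcriptions = quartznet.transcribe(paths2audio_files=files, batch_size=2)
for path, text in zip(files, transcriptions):
    print(path, '->', text)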
End Notes
In this tutorial we used pre-built QuartzNet models in different languages and learned two ways to use them: from the cloud and locally. Soon I will publish another blog post with more information about training and fine-tuning, so keep following ;)