In this tutorial we are going to use different pre-built models from NVIDIA's NeMo toolkit. If you haven't downloaded and installed NeMo yet, you can go back to my previous blog post, "Getting Started with NVIDIA NeMo ASR", and follow the installation step by step.
Pre-built Models
There are two main methods to use NeMo's pre-built models: locally or from the cloud. You can find all the pre-trained models here or by printing list_available_models():
import nemo.collections.asr as nemo_asr
print(nemo_asr.models.EncDecCTCModel.list_available_models())
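The call returns a list of PretrainedModelInfo entries rather than plain strings. If you only want the model names, a minimal sketch like this should work (pretrained_model_name is the attribute that holds the name in NeMo 1.x; check your version if it differs):
# Print only the model names from the list of PretrainedModelInfo entries
for model_info in nemo_asr.models.EncDecCTCModel.list_available_models():
    print(model_info.pretrained_model_name)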
Cloud
NVIDIA GPU Cloud (NGC) lets you download NVIDIA's pre-trained models directly by using the from_pretrained() method:
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-Zh")
Then just pass the location of your WAV file (up to 20 seconds long):
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
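Putting the cloud workflow together, here is a minimal end-to-end sketch. The WAV path is just a placeholder, and QuartzNet models expect 16 kHz mono audio:
import nemo.collections.asr as nemo_asr

# Download the checkpoint from NGC (it is cached locally after the first call)
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-Zh")

# Transcribe one or more WAV files (16 kHz, mono)
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
print(transcriptions)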
Locally
The other option is to download the model, save it to a local folder, and load it with the restore_from() method.
Let's, for example, download a pre-trained checkpoint and its network configuration file, such as Multidataset-QuartzNet15x5, from NVIDIA NGC. First, you need to choose from the list here the language whose pre-trained model you want to use. For example, let's choose the Spanish model (stt_es_quartznet15x5).
Open the link, and on the next page copy the wget command into your terminal in order to download the zip file:
wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/nemo/stt_es_quartznet15x5/versions/1.0.0rc1/zip -O stt_es_quartznet15x5_1.0.0rc1.zip
Save it wherever you want and unzip the file:
unzip stt_es_quartznet15x5_1.0.0rc1.zip
Now you have a file with a .nemo extension, so just pass the location of the pre-trained model to restore_from():
quartznet = nemo_asr.models.EncDecCTCModel.restore_from('/home/user/pretrained/stt_es_quartznet15x5.nemo')
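NeMo models are regular PyTorch modules, so if you have a GPU available you can move the model there before transcribing. A minimal sketch:
import torch

# Use the GPU for faster inference if one is available
if torch.cuda.is_available():
    quartznet = quartznet.cuda()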
Then add the WAV file location and transcribe as before:
transcriptions = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
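If you have many files, transcribe() also accepts a list of paths and a batch_size argument (an assumption based on the NeMo 1.x signature; check your version). The file names here are placeholders:
# Transcribe several files at once; batch_size controls how many are processed per step
files = ['/home/user/wav1.wav', '/home/user/wav2.wav', '/home/user/wav3.wav']
transcriptions = quartznet.transcribe(paths2audio_files=files, batch_size=2)
for path, text in zip(files, transcriptions):
    print(path, '->', text)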
End Notes
In this tutorial we used pre-built QuartzNet models in different languages and learned two ways to use them: from the cloud and locally. Soon I will publish another blog post with more information about training and fine-tuning, so keep following ;)