Getting Started with NVIDIA NeMo ASR
NVIDIA NeMo — Quick Start Guide
This guide covers the basic steps for getting started with the NeMo toolkit.
What is NeMo?
NVIDIA’s framework for Automatic Speech Recognition (ASR) is called NeMo, and it contains a collection of pre-built acoustic models for automatically transcribing spoken language.
Besides the acoustic models, NVIDIA also offers pre-built models for Natural Language Processing (NLP) and Text-to-Speech (TTS), but in this tutorial I’m going to write only about ASR.
First of all, make sure you have Python 3.6+ before you start. Then install a few libraries and NeMo itself from the “main” branch (not “master”) on GitHub. For more details, read the NeMo GitHub repository.
apt-get update && apt-get install -y libsndfile1 ffmpeg
pip install Cython
python -m pip install git+https://github.com/NVIDIA/NeMo.git@main
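The Python requirement mentioned above can be confirmed from inside the interpreter before installing anything. This is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical, and the (3, 6) floor is simply the one stated in this guide:

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this guide


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print("Python {} is new enough".format(current))
    else:
        print("Python {} is too old; this guide needs 3.6+".format(current))
```

The script avoids f-strings on purpose, so it still runs (and reports the problem) on interpreters older than 3.6.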
Then install the NeMo toolkit with pip in your virtual environment:
pip install nemo-asr
python -m pip install -r requirements.txt
python -m pip install -r requirements_asr.txt
The next step is to install PyTorch. Make sure you install version 1.7.1 and not the newest one.
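After installing PyTorch, it may be worth checking which version actually ended up in the environment. The sketch below compares `torch.__version__` against the 1.7.1 release named above; the helper names `base_version` and `is_expected_torch` are hypothetical, and the suffix handling assumes a local build tag such as `+cu110` may be appended to the version string:

```python
EXPECTED_TORCH = "1.7.1"  # version this guide asks for


def base_version(version_string):
    """Drop a local build suffix such as '+cu110' from a version string."""
    return version_string.split("+", 1)[0]


def is_expected_torch(version_string, expected=EXPECTED_TORCH):
    """Return True if the version string matches the expected release."""
    return base_version(version_string) == expected


if __name__ == "__main__":
    try:
        import torch  # only needed for the live check
    except ImportError:
        print("PyTorch is not installed")
    else:
        if is_expected_torch(torch.__version__):
            print("torch {} matches {}".format(torch.__version__, EXPECTED_TORCH))
        else:
            print("torch {} differs from {}".format(torch.__version__, EXPECTED_TORCH))
```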
Inference with NeMo
We are done and ready for the final step ;) Let’s try NeMo!
The next line downloads the pre-trained QuartzNet15x5 model from NVIDIA GPU Cloud (NGC) and instantiates it for you:
import nemo.collections.asr as nemo_asr
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")
Download an English WAV file that is up to 20 seconds long and pass its location:
transcription = quartznet.transcribe(paths2audio_files=['/home/user/wavfile.wav'])
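Since the guide suggests keeping the clip under 20 seconds, the file length can be checked before it is handed to the model. The following is a sketch under a few assumptions: the file is an uncompressed PCM WAV (which is what the standard `wave` module can read), the helper names `wav_duration_seconds` and `transcribe_short_wav` are hypothetical, and the NeMo calls are the same two shown above, imported lazily so the duration check works even without NeMo installed:

```python
import wave

MAX_SECONDS = 20.0  # upper bound suggested in this guide


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def transcribe_short_wav(path, max_seconds=MAX_SECONDS):
    """Transcribe `path` with QuartzNet, refusing clips that are too long."""
    duration = wav_duration_seconds(path)
    if duration > max_seconds:
        raise ValueError(
            "{} is {:.1f}s long; expected at most {:.0f}s".format(
                path, duration, max_seconds
            )
        )
    # Imported here so the duration helper above has no NeMo dependency.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.EncDecCTCModel.from_pretrained(
        model_name="QuartzNet15x5Base-En"
    )
    # Same call as in the guide, with a single input file.
    return model.transcribe(paths2audio_files=[path])
```

Calling `transcribe_short_wav('/home/user/wavfile.wav')` and printing the result should then show the transcription for that file.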
If you have problems with the editdistance package and get this error:
ModuleNotFoundError: No module named 'editdistance'
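To confirm whether the package is really missing from the active environment (rather than, say, installed into a different one), the import system can be asked directly. A small sketch using only the standard library; the helper name `is_installed` is hypothetical:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "editdistance"
    print(name, "found" if is_installed(name) else "missing")
```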
The solution depends on your operating system:
For Windows, you need to download Visual Studio 2019 and the Visual C++ Build Tools.
For Ubuntu, the solution also depends on your Python version. For Python 3.8, use:
sudo apt-get install python3.8-dev
If you use another version, change the command accordingly.
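For other Python versions, the package name follows the same pattern with the version number swapped in. The sketch below builds that name from the running interpreter; the helper name `dev_package_name` is hypothetical, and whether a given package is available depends on your Ubuntu release:

```python
import sys


def dev_package_name(version_info=None):
    """Build the apt package name for the Python headers, e.g. 'python3.8-dev'."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return "python{}.{}-dev".format(major, minor)


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print("sudo apt-get install " + dev_package_name())
```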
To build NVIDIA Apex from source with its C++ and CUDA extensions:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
We went over the basic steps for getting a working environment for inference with the NeMo toolkit. If you want to read more about the toolkit, you can check out one of the tutorials that NVIDIA suggests. If you wish to try other ASR models in more languages, you can continue to my next blog post.