Speech synthesis (TTS) in low-resource languages by training from scratch with FastPitch and fine-tuning with HiFi-GAN
Updated Dec 5, 2023 - Python
LLM tutorial materials, including but not limited to NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.
Automatic transcriber built with the NVIDIA NeMo AI toolkit. Transcribes speech to text in real time from any source. Requires a CUDA-capable GPU on the local machine; when set up with virtual audio cables, it can transcribe audio as it is played, with no other requirements.
The simplest & most comprehensible tutorial on speaker identification with NVIDIA's `Nemo`.
Module for Russian speech recognition using NVIDIA NeMo.
FastAPI-based Hindi ASR app using NVIDIA NeMo + ONNX, with Docker support for easy deployment.
Implementation of a Kazakh speech-to-text model using the NVIDIA NeMo toolkit for efficient transcription of spoken Kazakh into text.
Diarizer system that takes .wav files as input and transcribes the audio, indicating who spoke and when. Built with the NVIDIA NeMo framework and pre-trained models.
Audio profanity detector desktop app developed with PyQt5 and NVIDIA NeMo.
Autonomous Data Agent that cleans, analyzes, and models datasets using Python, RAPIDS, PyTorch, TensorFlow, XGBoost, LightGBM, CatBoost, SHAP/LIME, NeMo, and Streamlit, delivering GPU-accelerated, explainable insights.