A family of diffusion models for text-to-audio generation.
Updated Jul 3, 2024 - Python
A web UI for various audio-related neural networks
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch
OpenMusic: SOTA Text-to-music (TTM) Generation
Mustango: Toward Controllable Text-to-Music Generation
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Subtitle to audio: generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing to the subtitle timestamps.
PyTorch implementation of SoundCTM
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
text-to-audio-latent-diffusion
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24
A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.
Python program to convert text to speech.
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.
Text to Audio (Voice, Music), with ChatGPT support
Generative AI version of the GeoGuessr game.