Enemyx-net / VibeVoice-ComfyUI

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.

text-to-speech tts voice-cloning ai-voice voice-generation ai-audio t2s ai-tts ai-voice-clone ai-voice-clonining voice-generator comfyui-nodes comfyui-custom-node comfyui-custom-nodes-text-to-speech vibevoice vibevoice-microsoft

Updated Oct 2, 2025
Python

diodiogod / TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine, multi-language Text-to-Speech and Voice Conversion. Supports RVC, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio 2, and Microsoft VibeVoice, with unlimited text length, SRT timing, Character support, and many audio tools.

Updated Dec 30, 2025
Python

RhythrosaLabs / soundstorm

Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.

midi chatbot sound sound-processing gpt algorithmic-music algorithmic-composition sounds audio-processing random-music audio-tools sound-design text-to-audio audio-toolbox ai-audio gpt-4 chatgpt chat-gpt ai-audio-generation

Updated May 4, 2024
Python

Ali-Shariati-Najafabadi / Real-Time-Deepfake-Pipeline

Real-Time Deepfake Pipeline

audio real-time video ai skype realtime faceswap gan webcam zoom microsoft-teams audio-processing deepfake deepfake-detection ai-audio real-time-deepfake

Updated Jun 5, 2025
Python

soumya997 / Music-Generation-Using-Deep-Learning

Music Generation Using Deep Learning🎶🎵

nlp machine-learning deep-learning tensorflow2 musicgeneration ai-audio

Updated Jun 26, 2021
Jupyter Notebook

Yuan-ManX / ai-voice-agents

AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧

ai deep-learning ai-agents ai-voice meachine-learning ai-audio ai-agents-framework

Updated Aug 30, 2024

gabrielsenadev / audioinsight

AudioInsight is a web application that processes audio, generates transcriptions, and allows users to ask questions about the related audio.

full-stack webdev whisper audio-processing audio-to-text ai-audio cloudflare-ai

Updated May 26, 2024
TypeScript

ALucek / companion-guide-challenge

An approach to Andrej Karpathy's LLM challenge, as outlined here: https://twitter.com/karpathy/status/1760740503614836917

audio blogs video-to-text ai-audio ai-video

Updated Mar 13, 2024
Jupyter Notebook

Dineshkumar-Ponnusamy / maya-voice-ai

Maya Voice AI is an open-source project that demonstrates the Maya1 model, capable of generating realistic voice audio from text input with rich emotional and descriptive control. This repository provides a demo for text-to-speech synthesis using advanced language models and the SNAC codec, focusing on high-quality audio at 24kHz.

python open-source text-to-speech deep-learning speech-synthesis maya voice-ai emotional-voice-conversion ai-audio audio-generation snac-codec

Updated Nov 10, 2025
Python

DynamicDevices / meta-dynamicdevices

Professional Yocto BSP Layer for Dynamic Devices Edge Computing Platforms - AI Audio Processing, E-Ink Displays, Power Management, Wireless Connectivity, i.MX8MM/i.MX93 Support

Updated Dec 27, 2025
Shell

Yuan-ManX / SoundHub

AI Audio Framework 🎵

deep-learning ai-framework ai-audio ai-audio-generation

Updated Apr 28, 2024
Python

zelosleone / Audiobook-Generator

A GPU-accelerated Python application that converts PDF and TXT documents into high-quality MP4 audio files using WhisperSpeech technology.

python machine-learning text-to-speech cuda pdf-converter pytorch audiobook speech-synthesis gpu-acceleration text-processing ai-audio

Updated Jun 2, 2025
Python

awzucker / capstone

A project attempting to generate and extract features from music to make comparisons with popular artists, and examine where and with what demographics those artists are popular in order to craft a DIY marketing solution for aspiring artists.

music feature-extraction recommender-system feature-engineering diy-solutions new-media ai-audio

Updated Jan 28, 2021
Jupyter Notebook

ninuxi / acoustic-space-analyzer-ai-pro

Acoustic Space Analyzer AI Pro is a professional acoustic analysis tool that leverages artificial intelligence to generate optimized DSP processing chains for any acoustic environment. This innovative application combines real-time spectral analysis, 3D spatial scanning, and AI-powered audio processing to deliver precise acoustic corrections.

signal-processing audio-analysis audio-processing frequency-analysis web-audio-api real-time-analysis room-correction acoustic-analysis ai-audio sound-engineering professional-audio dsp-chain ai-dsp

Updated Sep 8, 2025
JavaScript

hamzaelmarjani / sympho

Open source AI speech generation solution

android java api rust ios text-to-speech ai api-server tts springboot flutter nestjs actix-web nestjs-backend ai-audio audio-generation shadcn-ui elevenlabs-api rust-actix-web

Updated Sep 13, 2025
Dart

lord-lethris / ComfyUI-lethris-dia2

ComfyUI custom nodes for the Dia2 TTS model — generate speech, timestamps, and captions directly inside ComfyUI.

text-to-speech cuda pytorch captions ass tts subtitles srt vtt asr ai-voice ai-audio audio-generation dia2 comfyui comfyui-nodes audio-generation-ai comfyui-extension

Updated Dec 12, 2025
Python

PaulMarisOUMary / 2022-Project-Artificial-Intelligence-Group-D

kaggle google-colab ai-audio prediction-audio

Updated Jun 26, 2022
Jupyter Notebook

akramnasreddine / ComfyUI-lethris-dia2

🎤 Generate TTS audio and captions easily within ComfyUI, supporting multiple speakers and various caption formats for flexible content creation.

text-to-speech cuda pytorch captions ass tts subtitles srt vtt asr ai-voice ai-audio audio-generation dia2 comfyui comfyui-nodes audio-generation-ai comfyui-extension

Updated Dec 31, 2025
Python

UDA-IIT-Mandi / UDA-Gradient-Reversal-Layer

This repository implements Unsupervised Domain Adaptation using Gradient Reversal Layer with PaSST feature extractors for cross-device acoustic scene classification on DCASE TAU 2020 dataset.

ai grl audio-classification ai-audio

Updated Jun 30, 2025
Jupyter Notebook

ai-audio

Here are 19 public repositories matching this topic...
