This project is inspired by silero-api-server and utilizes XTTSv2. This server was created for SillyTavern, but you can use it for your own needs. Feel free to make PRs or use the code for your own needs ...
Abstract: Speech-to-Text (STT) and Text-to-Speech (TTS) technologies have advanced significantly in recent years, transforming various industries and applications. STT allows ...
Kokoro Web is powered by hexgrad/Kokoro-82M, an open-weight, 82-million-parameter Text-to-Speech model available on Hugging Face. Despite its lightweight architecture, it delivers quality comparable to ...