~ Things are starting to take shape!
These blog posts can also be found on the Sugar Labs website. The content is the same in both places.~
Project: Speak Activity
Mentors: Chihurumnaya Ibiam, Kshitij Shah
Assisting Mentors: Walter Bender, Devin Ulibarri
Reporting Period: 2025-06-22 - 2025-06-29
Note: I was on leave this week until the 26th because of my final exams, but I still managed to get a bunch of cool stuff done after that.
Kokoro meets Speak - A new chapter
One of the three major parts of my proposal was to integrate a more modern, natural-sounding TTS model into Speak.
The current implementation integrates Kokoro in a rather hacky way. I say this because the audio pipeline currently looks like this:
Text → Kokoro → Outputs a temporary WAV file → Read by GStreamer → Audio output can be heard
This is not ideal: Kokoro writes an audio file to disk on every request, only for GStreamer to read it straight back. It is also slow, because Kokoro has to process the entire text and convert it to a WAV before GStreamer can read the file and play it. For short text inputs this is still fine, but it's not optimal.
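To make the second half of that hand-off concrete, here is a minimal dry-run sketch of playing a WAV file from disk through GStreamer on the command line. It is only an illustration, not how Speak itself drives GStreamer: the temporary file path and the `play_cmd_for` helper are hypothetical, and the script prints the `gst-launch-1.0` command rather than running it.

```sh
#!/bin/sh
# Dry-run sketch of the file-based hand-off: only prints the playback command.
# Assumption: the temporary WAV path and this CLI pipeline are illustrative;
# they are not taken from the Speak code base.

play_cmd_for() {
    # $1: path to the WAV file that the TTS step wrote to disk.
    echo "gst-launch-1.0 filesrc location=$1 ! wavparse ! audioconvert ! audioresample ! autoaudiosink"
}

# Hypothetical temporary file that Kokoro would finish writing before playback.
wav_path="${TMPDIR:-/tmp}/speak_kokoro_$$.wav"
play_cmd_for "$wav_path"
```

Nothing in this pipeline can start until the whole file exists on disk, which is where the extra latency comes from.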
Video demo:
Note that the recording has a slight echo, but that's an issue with the recording; it sounds perfectly fine inside Speak.
The script pulls a model hosted on 🤗 → sets up all local dependencies → quantizes the model → exports it as a GGUF → and uses a plugin script (model dependent) to run it in chat mode.
# Model Config
MODEL_REPO="hfusername/modelname"
GGUF_OUT="output_model_name.gguf"
GGUF_QUANT="output_model_name-q4.gguf"
N_CTX=2048
BUILD_DIR="build"
SAVED_DIR_NAME_HF="output_dir_name"
# Another thing to note is the URL to the plugin inference script:
RAW_URL="https://raw.githubusercontent.com/mebinthattil/template_llama_chat_python/main/chatapp.py"
This script tries to be OS-agnostic: it attempts to detect which OS you're on and runs commands accordingly. It's not fully comprehensive yet. It works well on macOS, which is the only platform I've tested it on.
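As a rough illustration of how OS detection and the stages above could fit together, here is a dry-run sketch that reuses the config values shown earlier and only prints the commands it would run. Several parts are assumptions rather than details from the actual script: the `detect_os` and `plan_pipeline` helpers are hypothetical, llama.cpp tooling (`convert_hf_to_gguf.py`, `llama-quantize`) is assumed for the GGUF steps, Homebrew and apt are assumed for dependency setup, and `Q4_K_M` is an assumed quantization type. With llama.cpp, the GGUF conversion runs before quantization, so the sketch orders the steps that way.

```sh
#!/bin/sh
# Dry-run sketch: prints the commands each stage might run; nothing is executed.
# Assumptions (not from the post): llama.cpp tooling, Homebrew/apt, Q4_K_M.

MODEL_REPO="hfusername/modelname"
GGUF_OUT="output_model_name.gguf"
GGUF_QUANT="output_model_name-q4.gguf"
BUILD_DIR="build"
SAVED_DIR_NAME_HF="output_dir_name"
RAW_URL="https://raw.githubusercontent.com/mebinthattil/template_llama_chat_python/main/chatapp.py"

detect_os() {
    # Accepts a kernel name for testing; defaults to the running system's.
    kernel="${1:-$(uname -s)}"
    case "$kernel" in
        Darwin) echo "macos" ;;
        Linux) echo "linux" ;;
        MINGW*|MSYS*|CYGWIN*) echo "windows" ;;
        *) echo "unknown" ;;
    esac
}

plan_pipeline() {
    os="$(detect_os "$1")"
    # 1. Pull the model from Hugging Face into a local directory.
    echo "huggingface-cli download $MODEL_REPO --local-dir $SAVED_DIR_NAME_HF"
    # 2. Set up build dependencies; this is the OS-specific part.
    case "$os" in
        macos) echo "brew install cmake" ;;
        linux) echo "sudo apt-get install -y cmake build-essential" ;;
        *) echo "# unsupported OS ($os): install cmake and a C++ compiler manually" ;;
    esac
    # 3. Convert the downloaded checkpoint to GGUF.
    echo "python convert_hf_to_gguf.py $SAVED_DIR_NAME_HF --outfile $GGUF_OUT"
    # 4. Quantize the GGUF file to 4 bits.
    echo "$BUILD_DIR/bin/llama-quantize $GGUF_OUT $GGUF_QUANT Q4_K_M"
    # 5. Fetch the model-dependent plugin script used for chat mode.
    echo "curl -fsSL $RAW_URL -o chatapp.py"
}

plan_pipeline
```

Passing a kernel name, for example `plan_pipeline Linux`, previews the plan for another platform without needing access to one.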
Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for their ongoing support.