Week 05 Progress Report

Project: Speak Activity
Mentors: Chihurumnaya Ibiam, Kshitij Shah
Assisting Mentors: Walter Bender, Devin Ulibarri
Reporting Period: 2025-06-29 - 2025-07-06

Goals for This Week

This Week's Progress

1. Hey Kokoro, you sound different today

This week, I tried out Kokoro's different voices in two ways:

  1. I tested them inside Speak, within Sugar, and they worked. Speak still takes the hacky route of writing a temporary WAV file and then playing it via GStreamer, but it works. Streaming will be introduced soon.

    Under-the-hood changes:

    Text → Kokoro → handle phonemes via G2P engine → Misaki (primary G2P) → fallback → Espeak-ng

  2. I deployed a web app that lets you generate and mix audio. You can try it out here.

    UI of web app
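To make the in-Speak path from item 1 more concrete, here is a minimal sketch of the two ideas involved: a primary G2P with a fallback, and the temporary-WAV-then-GStreamer playback. The names `primary_g2p`, `fallback_g2p`, `fake_synthesize`, and the toy lexicon are hypothetical stand-ins, not the real Misaki, espeak-ng, or Kokoro APIs, and the 24000 Hz sample rate is an assumption. Only the `wave`, `tempfile`, and GStreamer `playbin` calls are real library calls.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path

# Hypothetical stand-ins for the two G2P engines; these are NOT the real
# Misaki or espeak-ng APIs, just placeholders playing the same roles.
TOY_LEXICON = {"hello": "həlˈoʊ", "world": "wˈɜːld"}


def primary_g2p(word):
    """Primary G2P: return phonemes for a known word, or None if unknown."""
    return TOY_LEXICON.get(word.lower())


def fallback_g2p(word):
    """Fallback G2P: always returns something (here, the spelled-out letters)."""
    return "-".join(word.lower())


def to_phonemes(text):
    """Try the primary engine per word; use the fallback only when it gives up."""
    phonemes = []
    for word in text.split():
        result = primary_g2p(word)
        phonemes.append(result if result is not None else fallback_g2p(word))
    return " ".join(phonemes)


def fake_synthesize(phonemes, sample_rate=24000):
    """Placeholder for the model: a 440 Hz tone, 50 ms per phoneme character."""
    n = (sample_rate // 20) * len(phonemes)
    return [int(12000 * math.sin(2 * math.pi * 440 * i / sample_rate)) for i in range(n)]


def write_temp_wav(samples, sample_rate=24000):
    """Write 16-bit mono PCM samples to a temporary WAV file; return its path."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        path = tmp.name
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return path


def play_with_gstreamer(path):
    """Play a file with GStreamer's playbin and block until it finishes."""
    import gi  # imported lazily so the rest runs without PyGObject installed

    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    player = Gst.ElementFactory.make("playbin", "player")
    player.set_property("uri", Path(path).resolve().as_uri())
    player.set_state(Gst.State.PLAYING)
    # Wait for end-of-stream or an error before tearing the pipeline down.
    player.get_bus().timed_pop_filtered(
        Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
    )
    player.set_state(Gst.State.NULL)
```

On a machine with PyGObject and GStreamer installed, the pieces chain together as `play_with_gstreamer(write_temp_wav(fake_synthesize(to_phonemes("hello world"))))`. The round trip through disk is what streaming would eventually remove.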

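A Kokoro backend that exposes an OpenAI-compatible endpoint can be called from Python with the OpenAI client:

from openai import OpenAI

# Point the client at the Kokoro backend instead of OpenAI's servers.
# The API key is a placeholder; this backend does not check it.
client = OpenAI(
    base_url="http://my_kokoro_backend:8880/v1", api_key="not-needed"
)

# Stream the synthesized speech straight into a file.
with client.audio.speech.with_streaming_response.create(
    model="kokoro",
    voice="af_sky+af_bella",  # single or multiple voicepack combo
    input="Hello world!"
) as response:
    response.stream_to_file("output.mp3")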
Understanding and playing with Kokoro:

Links:

2. New brains for Speak

But...

Next Week's Roadmap

Acknowledgments

Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for their ongoing support.



