Week 03 Progress Report by Mebin J Thattil

~ Damn it's been 3 weeks!
Also, I'd recommend that you view these GSoC blogs on the SugarLabs website itself, as the markdown is written in the format their SSG uses. That said, the content is the same in both places.~



Project: Speak Activity
Mentors: Chihurumnaya Ibiam, Kshitij Shah
Assisting Mentors: Walter Bender, Devin Ulibarri
Reporting Period: 2025-06-14 - 2025-06-21


Goals for This Week

This Week’s Achievements

Note: I'm officially on leave this week and next, but I have still been taking calls, attending meetings, and doing some light work in the background.

1. Re-formatted the dataset to avoid generating chains of responses
- Previously, the dataset consisted of records of conversations between a student and a teacher. Each record had around 5-10 back-and-forth questions and interactions between the student and teacher.
- Since we were training on this dataset format, the model would also try to replicate it, i.e. it would start generating a whole chain of question-answer exchanges between the student and teacher instead of a single reply. This is obviously something that we don't want.
- I initially kept it this way to teach the model better conversational flow, but this approach did more harm than good.
- So I have broken up the conversations and re-structured them.
- Next, I will fine-tune the model again on a subset of the re-formatted dataset and deploy it just to test it (this is yet to be done).
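
To make the re-formatting step concrete, here is a minimal sketch of one way to break multi-turn records into single exchanges. The record shape (a list of turns with `role` and `content` keys), the role names `student` and `teacher`, and the helper names `split_conversation` and `reformat_dataset` are all assumptions made for illustration; the actual dataset schema and script may differ.

```python
from typing import Dict, List

# Hypothetical turn shape: {"role": "student" | "teacher", "content": "..."}
Turn = Dict[str, str]


def split_conversation(conversation: List[Turn]) -> List[List[Turn]]:
    """Break one multi-turn conversation into single student/teacher exchanges."""
    pairs: List[List[Turn]] = []
    pending_student = None
    for turn in conversation:
        if turn["role"] == "student":
            # Remember the latest student message; an earlier unanswered one is dropped.
            pending_student = turn
        elif turn["role"] == "teacher" and pending_student is not None:
            # Emit exactly one question and one answer per training record.
            pairs.append([pending_student, turn])
            pending_student = None
    return pairs


def reformat_dataset(records: List[List[Turn]]) -> List[List[Turn]]:
    """Flatten every conversation in the dataset into single-exchange records."""
    return [pair for conversation in records for pair in split_conversation(conversation)]


# Made-up sample record, only to show the shape of the input.
sample = [
    {"role": "student", "content": "Why is the sky blue?"},
    {"role": "teacher", "content": "Because air scatters blue light more than red light."},
    {"role": "student", "content": "Then why are sunsets red?"},
    {"role": "teacher", "content": "At sunset the light travels through more air, so more blue is scattered away."},
]

single_turn_records = reformat_dataset([sample])
```

With records shaped like this, each training example ends after one teacher reply, so the model no longer sees long chains it could learn to imitate. The trade-off is that context from earlier turns is lost, which matches the decision above to give up some conversational flow.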



Key Learnings

Next Week’s Roadmap

Acknowledgments

Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for ongoing support.




