~~Week 02 progress report goes live! Also, I'd recommend viewing these GSoC blogs on the Sugar Labs website itself, as the markdown is written in the format their SSG uses. That said, the content is the same in both places.~~
Project: Speak Activity
Mentors: Chihurumnaya Ibiam, Kshitij Shah
Assisting Mentors: Walter Bender, Devin Ulibarri
Reporting Period: 2025-06-08 - 2025-06-14
Provisioned GPUs on AWS SageMaker to fine-tune the Llama3-1B foundation model.
## Dataset & Cleaning
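Since the run below sets `chat_dataset=True` with the `Llama3.1` chat template, the cleaned examples ultimately need to land in the JSONL layout that SageMaker JumpStart's chat fine-tuning consumes: one JSON object per line holding a `dialog` list of role/content turns. A minimal sketch, assuming hypothetical `(prompt, reply)` pairs in place of the real cleaned data:

```python
import json

# Hypothetical cleaned (prompt, reply) pairs -- placeholders, not the real dataset.
pairs = [
    ("Hello, who are you?", "I'm Speak, Sugar's talking-face activity!"),
    ("Can you tell me about Sugar?", "Sugar is a learning platform for children."),
]

# One JSON object per line, each with a "dialog" list of role/content turns,
# which is the layout JumpStart's chat fine-tuning (chat_dataset=True) expects.
with open("train.jsonl", "w") as f:
    for prompt, reply in pairs:
        f.write(json.dumps({
            "dialog": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": reply},
            ]
        }) + "\n")
```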
Fine-tuning ran on an `ml.g5.2xlarge` instance. Hyperparameters:

| Name | Value |
|----------------------------------|----------------------------------------------------|
| add_input_output_demarcation_key | True |
| chat_dataset | True |
| chat_template | Llama3.1 |
| enable_fsdp | False |
| epoch | 5 |
| instruction_tuned | False |
| int8_quantization | True |
| learning_rate | 0.0001 |
| lora_alpha | 8 |
| lora_dropout | 0.08 |
| lora_r | 2 |
| max_input_length | -1 |
| max_train_samples | -1 |
| max_val_samples | -1 |
| per_device_eval_batch_size | 1 |
| per_device_train_batch_size | 4 |
| preprocessing_num_workers | None |
| sagemaker_container_log_level | 20 |
| sagemaker_job_name | jumpstart-dft-meta-textgeneration-l-20250607-200133|
| sagemaker_program | transfer_learning.py |
| sagemaker_region | ap-south-1 |
| sagemaker_submit_directory | /opt/ml/input/data/code/sourcedir.tar.gz |
| seed | 10 |
| target_modules | q_proj,v_proj |
| train_data_split_seed | 0 |
| validation_split_ratio | 0.2 |
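
For reference, a launch sketch using the SageMaker Python SDK's `JumpStartEstimator`; the model ID and training-data path are assumptions (the job name above only shows a truncated `meta-textgeneration-l…` model), and the hyperparameters mirror the table:

```python
from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-3-2-1b",  # assumed 1B JumpStart model ID
    environment={"accept_eula": "true"},          # Meta models require accepting the EULA
    instance_type="ml.g5.2xlarge",
)

# Mirror the key hyperparameters from the table above.
estimator.set_hyperparameters(
    chat_dataset="True",
    chat_template="Llama3.1",
    epoch="5",
    learning_rate="0.0001",
    int8_quantization="True",
    lora_r="2",
    lora_alpha="8",
    lora_dropout="0.08",
    target_modules="q_proj,v_proj",
    per_device_train_batch_size="4",
    per_device_eval_batch_size="1",
    validation_split_ratio="0.2",
    seed="10",
)

# "training" channel pointing at the cleaned JSONL (bucket path is a placeholder).
estimator.fit({"training": "s3://my-bucket/speak-chat-data/"})
```

The small LoRA rank (`lora_r=2`) over just `q_proj,v_proj` keeps the trainable adapter tiny, which is what makes a 1B fine-tune practical on a single `g5.2xlarge`.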
The fine-tuned model artifacts were saved to `s3://sagemaker-ap-south-1-021891580293/jumpstart-run2/output/model/`.
## Testing the model
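A quick way to smoke-test the result is to deploy the fine-tuned weights to a real-time endpoint and send a prompt. A minimal sketch, reusing the hypothetical `estimator` from the launch sketch above (instance type and generation parameters are illustrative):

```python
# Deploy the fine-tuned model to a real-time endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    accept_eula=True,
)

# TGI-style payload: a prompt plus generation parameters.
response = predictor.predict({
    "inputs": "Hello! What can the Speak activity do?",
    "parameters": {"max_new_tokens": 64, "temperature": 0.7},
})
print(response)

# Tear the endpoint down when done to avoid idle charges.
predictor.delete_endpoint()
```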
## Evaluation
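Since `validation_split_ratio=0.2` holds out a fifth of the data, the job reports an eval loss alongside the train loss; if the job publishes metric definitions to CloudWatch, the curves can be pulled with `TrainingJobAnalytics`. A minimal sketch, again assuming the `estimator` from above:

```python
from sagemaker.analytics import TrainingJobAnalytics

# Fetch the metrics the training job emitted to CloudWatch.
metrics = TrainingJobAnalytics(estimator.latest_training_job.name).dataframe()

# Keep only the loss curves (train vs. eval on the 20% validation split).
print(metrics[metrics["metric_name"].str.contains("loss", case=False)])
```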
Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for ongoing support.
Powered by Not An SSG 😎