Preserving Macedonian Culinary Heritage: Fine-Tuning a Large Language Model for Recipe Generation in a Low-Resource Language
Journal
2025 IEEE International Conference on Big Data (BigData)
Date Issued
2025-12-08
Author(s)
Peshevski, Dimitar
Ss. Cyril and Methodius University
Sasanski, Darko
Saints Cyril and Methodius University of Skopje, Ss. Cyril and Methodius University
DOI
10.1109/bigdata66926.2025.11401620
Abstract
We introduce the first fine-tuned large language model for recipe instruction generation in Macedonian. We fine-tune VezilkaLLM-Instruct, a 4-billion-parameter model, on a curated dataset of 36,000 recipes with detailed cooking instructions. Our key contributions are: (1) the development of a domain-adapted language model for a low-resource language; (2) the demonstration that relatively small LLMs can be effectively adapted to specialized culinary tasks; and (3) the proposal of a dual evaluation framework that combines semantic similarity and verb overlap analyses to assess both content generalization and procedural accuracy. Fine-tuning yields a mean cosine similarity of 0.90 and significantly increases the overlap of domain-specific cooking verbs, indicating improved generation quality. These results highlight the potential of targeted fine-tuning for domain-specific applications in underrepresented languages and provide a foundation for further research in computational culinary heritage.
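The two measures in the dual evaluation framework can be illustrated with a minimal sketch. The abstract does not specify the embedding model, the verb lexicon, or the exact overlap formula, so everything below is an assumption made for illustration: the vectors stand in for sentence embeddings of a generated and a reference recipe, the verb set is a hypothetical lexicon (English stand-ins rather than Macedonian verbs), and overlap is computed as Jaccard similarity, which may differ from the authors' definition.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def extract_verbs(text, cooking_verbs):
    """Return the set of lexicon verbs found in whitespace-split, lowercased text."""
    tokens = (tok.strip(".,;:!?") for tok in text.lower().split())
    return {tok for tok in tokens if tok in cooking_verbs}


def verb_overlap(generated, reference, cooking_verbs):
    """Jaccard overlap of cooking verbs (an assumed definition of 'verb overlap')."""
    gen = extract_verbs(generated, cooking_verbs)
    ref = extract_verbs(reference, cooking_verbs)
    if not gen and not ref:
        return 0.0  # no cooking verbs on either side
    return len(gen & ref) / len(gen | ref)


# Hypothetical lexicon and texts, for illustration only.
COOKING_VERBS = {"chop", "fry", "bake", "boil", "stir"}
generated = "Chop the onion, then fry."
reference = "Fry and chop the onion."

semantic_score = cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.65, 0.05])
procedural_score = verb_overlap(generated, reference, COOKING_VERBS)
```

In this sketch the cosine score stands for content similarity and the verb score for procedural agreement. Averaging the cosine score over a test set would give a figure comparable in kind to the mean similarity reported above, though not necessarily computed the same way.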
