Out-of-tree kernel patches are essential for adapting the Linux kernel to new hardware or enabling specific functionalities. Maintaining and updating these patches across different kernel versions demands significant effort from experienced engineers. Large language models (LLMs) have shown remarkable progress across various domains, suggesting their potential for automating out-of-tree kernel patch migration. However, our findings reveal that LLMs, while promising, struggle with incomplete code context understanding and inaccurate migration point identification. In this work, we propose MigGPT, a framework that employs a novel code fingerprint structure to retain code snippet information and incorporates three meticulously designed modules to improve the migration accuracy and efficiency of out-of-tree kernel patches. Furthermore, we establish a robust benchmark using real-world out-of-tree kernel patch projects to evaluate LLM capabilities. Evaluations show that MigGPT significantly outperforms the direct application of vanilla LLMs, achieving an average completion rate of 74.07% for migration tasks.
Through analyzing LLM behavior and results, we identify key challenges hindering their success in out-of-tree kernel patch migration:
MigGPT works in two stages: identifying target code snippets in the new version and migrating the out-of-tree patch. MigGPT consists of three core modules: the Retrieval Augmentation Module (addressing Challenges 1 and 3), the Retrieval Alignment Module (addressing Challenge 2), and the Migration Enhancement Module (addressing Challenge 4). Each module uses a Code Fingerprint (CFP) structure to enhance LLM performance and migration accuracy.
Figure 5. Overview of the MigGPT system architecture.
The Code Fingerprint (CFP) is a lightweight sequential data structure for analyzing code snippets, offering three key advantages:
Figure 6. Compared to AST, CFP extracts key code structures, and its linear representation enables clearer localization of code modification points.
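To make the idea of a linear, line-indexed representation concrete, here is a minimal Python sketch. It is not MigGPT's actual CFP definition: the `Entry` fields, the `classify` labels, and the use of `difflib` to compare two sequences are all assumptions made for illustration. It flattens a C snippet into a sequence of statement entries and then reports which source lines differ between two versions, which is roughly what "localization of code modification points" means for a flat sequence as opposed to a tree.

```python
import difflib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One element of the illustrative fingerprint (hypothetical layout)."""
    line: int   # 1-based line number in the original snippet
    kind: str   # coarse statement category (labels are assumptions)
    text: str   # whitespace-normalized statement text


def classify(stmt: str) -> str:
    """Assign a coarse category to a statement using simple regex heuristics."""
    if stmt.startswith("#"):
        return "preproc"
    if re.match(r"(if|else|for|while|switch|do)\b", stmt):
        return "control"
    if re.match(r"return\b", stmt):
        return "return"
    if re.match(r"[A-Za-z_]\w*\s*\(", stmt):
        return "call"
    return "stmt"


def fingerprint(code: str) -> list[Entry]:
    """Flatten a snippet into a sequence, skipping blanks, lone braces,
    and single-line comments (multi-line comments are not handled)."""
    entries = []
    for number, raw in enumerate(code.splitlines(), start=1):
        stmt = " ".join(raw.split())
        if not stmt or stmt in ("{", "}") or stmt.startswith(("//", "/*")):
            continue
        entries.append(Entry(number, classify(stmt), stmt))
    return entries


def locate_changes(old_fp: list[Entry], new_fp: list[Entry]):
    """Return (tag, old_lines, new_lines) for every non-matching region."""
    matcher = difflib.SequenceMatcher(
        a=[e.text for e in old_fp],
        b=[e.text for e in new_fp],
        autojunk=False,
    )
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag,
                            [e.line for e in old_fp[i1:i2]],
                            [e.line for e in new_fp[j1:j2]]))
    return changes


# Hypothetical kernel-style snippets, invented for this example.
OLD = """\
static int foo_init(struct foo *f)
{
    int ret;

    ret = foo_setup(f);
    if (ret)
        return ret;
    return 0;
}
"""

NEW = """\
static int foo_init(struct foo *f)
{
    int ret;

    /* new lock */
    mutex_lock(&f->lock);
    ret = foo_setup(f);
    if (ret)
        return ret;
    return 0;
}
"""

if __name__ == "__main__":
    for change in locate_changes(fingerprint(OLD), fingerprint(NEW)):
        print(change)
```

Because each entry keeps its original line number, a difference found in the flat sequence maps directly back to a position in the source file, with no tree traversal needed. Comments and blank lines are dropped here only to keep the sketch short; whether and how a real fingerprint retains them is a design decision this example does not settle.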
@article{dang2025miggpt,
title={MigGPT: Harnessing Large Language Models for Automated Migration of Out-of-Tree Linux Kernel Patches Across Versions},
author={Dang, Pucheng and Huang, Di and Li, Dong and Chen, Kang and Wen, Yuanbo and Guo, Qi and Hu, Xing},
journal={arXiv preprint arXiv:2504.09474},
year={2025}
}