Hi, I am Jian. Currently, I work at Monash University as a PhD candidate, under the supervision of Prof. Aldeida Aleti, Prof. Chunyang Chen, and Prof. Hongyu Zhang.

I was working as a PhD candidate (research assistant) at University of Zurich, supervised by Prof. Harald C. Gall. Before that, I obtained my master’s degree in machine learning, at KTH Royal Institute of Technology, supervised by Prof. Martin Monperrus. Further, I completed the bachelor’s study in computer science (the elite class), at Shandong University, supervised by Prof. Jun Ma.

My research interests are on the intersection between software engineering and machine learning. The focus is adapting the idea of program repair to language models, namely LM Repair. If you are seeking any forms of academic communication, welcome to contact me via email.

🔥 News

2025.03: ⭐ Proudly announce our 3rd work on LM repair: Optimization for Repair!
2024.10: 🚁 I started a visiting trip to NLP Lab @ Tsinghua University …
2024.06: 🎉 Happy to have our 2nd work on LM semantics: Transition on Semantics accepted by ACL!
2024.01: ⭐ Proudly announce our 1st work on LM semantics: Vocabulary-Defined Semantics!
2023.09: ⭐ Proudly announce our 2nd work on LM repair: Token-Aware Repair!
2023.07: 🎉 Happy to have our paper on top-down code generation accepted by FSE!
2023.03: ⭐ Proudly announce our 1st work on LM repair: Neuron-Targeted Repair!
2022.09: 🚁 The LLM age is coming, I moved from AI4SE (University of Zurich) to SE4AI (Monash University) …

💻 Featured Work

Semantic-based Optimization for Repairing LLMs: Case Study on Code Generation
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

STAR is a novel semantic-based optimization approach for LM repair that efficiently locates and patches buggy neurons using statistical insights and analytical formulas, outperforming prior methods in effectiveness, efficiency, and minimizing side effects.

ACL'25 @ Vienna

Semantic-Aware Layer-Freezing for Computation-Efficient Fine-Tuning of LMs
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

Our semantic-based layer freezing approach improves the efficiency of language model finetuning by determining where to finetune, outperforming existing methods through a detailed semantic analysis of the model’s inference process.

Vocabulary-Defined Semantics: Latent Space Clustering for Beyond-Context Learning
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

We propose “vocabulary-defined semantics” to reformulate in-context learning as a clustering problem, aligning semantic properties of language models with downstream data, outperforming state-of-the-art methods in effectiveness, efficiency and robustness.

Neuron Patching: Semantic-based Neuron-level LM Repair for Code Generation
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

MINT is an efficient and reliable technique for repairing large language models in software engineering. It can successfully solve model failures by patching merely 1 or 2 neurons, outperforming state-of-the-art methods in coding tasks.

FSE'23 @ San Francisco

Towards Top-Down Automated Development in Limited Scopes: A Neuro-Symbolic Framework from Expressibles to Executables
Jian Gu, Harald C. Gall

Deep code generation integrates neural models into software engineering for generating code but requires enhancements for project-level tasks, suggesting a taxonomy on code data and introducing a semantic pyramid framework to improve software development processes.

📝 Publications

Software Engineering for Deep Learning (SE4AI)

ArXiv SeMe: Training-Free Language Model Merging via Semantic Alignment
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
ArXiv Semantic-based Optimization for Repairing LLMs: Case Study on Code Generation
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
ACL'25 Semantic-Aware Layer-Freezing for Computation-Efficient Fine-Tuning of LMs
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
ArXiv Vocabulary-Defined Semantics: Latent Space Clustering for Beyond-Context Learning
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
ArXiv Focus-aware Neurons: Robust LM Repair leveraging Selective Attention
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
ArXiv Neuron Patching: Semantic-based Neuron-level LM Repair for Code Generation
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

Deep Learning for Software Engineering (AI4SE)

ArXiv Semantic-based Memory Augmentation for Continual Code Understanding
Jian Gu, Harald C. Gall
FSE'23 Towards Top-Down Deep Code Generation in Limited Scopes
Jian Gu, Harald C. Gall
SANER'22 Assemble Foundation Models for Automatic Code Summarization
Jian Gu, Pasquale Salza, Harald C. Gall
ICSME'21 Multimodal Representation for Neural Code Search
Jian Gu, Zimin Chen, Martin Monperrus
TSE 2022 On the Effectiveness of Transfer Learning for Code Search
Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall
TSE 2021 Automated Classification of Overfitting Patches with Statically Extracted Code Features
He Ye, Jian Gu, Matias Martinez, Thomas Durieux, Martin Monperrus

“Machine intelligence is the last invention that humanity will ever need to make. Machines will then be better at inventing than we are, and they’ll be doing so on digital timescales.” – Nick Bostrom

🌵 Jian Gu

🔥 News

💻 Featured Work

📝 Publications

Software Engineering for Deep Learning (SE4AI)

Deep Learning for Software Engineering (AI4SE)