Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts

Apr 23, 2026 · Kartik Sharma, Yiqiao Jin, Vineeth Rakesh, Yingtong Dou, Menghai Pan, Mahashweta Das, Srijan Kumar
Abstract
Aligning frozen large language models without modifying their weights is a key challenge for safe and adaptive deployment. We introduce Sysformer, a system that learns adaptive system prompts to safeguard frozen LLMs across diverse risk scenarios. Sysformer treats the system prompt as a learnable, query-conditioned intervention, enabling fine-grained safety control without parameter updates and improving robustness across multiple LLM families. ICLR 2026 acceptance rate: 28%.
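To make "learnable, query-conditioned intervention" concrete, here is a minimal sketch of one way such a module could look. It is an illustration under stated assumptions, not the paper's implementation: the `PromptAdapter` class, the single-head cross-attention design, the residual update, and the toy dimensions are all hypothetical choices made for this example. The idea shown is that only the small adapter holds trainable parameters, while the LLM that would consume the adapted system-prompt embeddings stays frozen.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(weight, vec):
    """Multiply a (d_out x d_in) matrix by a d_in-vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class PromptAdapter:
    """Hypothetical adapter: the only trainable part; the LLM is never modified."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.w_q = random_matrix(dim, dim, rng)  # projects system-prompt tokens
        self.w_k = random_matrix(dim, dim, rng)  # projects user-query tokens
        self.w_v = random_matrix(dim, dim, rng)  # projects user-query tokens

    def adapt(self, system_emb, query_emb):
        """Shift each system-prompt embedding by attention over the user query."""
        keys = [project(self.w_k, tok) for tok in query_emb]
        values = [project(self.w_v, tok) for tok in query_emb]
        adapted = []
        for token in system_emb:
            q = project(self.w_q, token)
            scores = [
                sum(a * b for a, b in zip(q, k)) / math.sqrt(self.dim)
                for k in keys
            ]
            weights = softmax(scores)
            context = [
                sum(w * v[i] for w, v in zip(weights, values))
                for i in range(self.dim)
            ]
            # Residual update keeps the original system prompt as the baseline.
            adapted.append([t + c for t, c in zip(token, context)])
        return adapted


# Toy embeddings (assumed values): three system-prompt tokens, two different queries.
adapter = PromptAdapter(dim=4)
system_emb = [[0.1] * 4, [0.2] * 4, [0.3] * 4]
query_a = [[1.0, 0.0, 0.0, 0.0]]
query_b = [[0.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]]

out_a = adapter.adapt(system_emb, query_a)
out_b = adapter.adapt(system_emb, query_b)
```

In this sketch the same system prompt yields different adapted embeddings for different queries, which is the sense in which the intervention is query-conditioned. In a real setup the adapter weights would be trained against a safety objective and the adapted embeddings passed to the frozen model in place of the static system prompt; both of those steps are outside the scope of this example.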
Type
Publication
International Conference on Learning Representations (ICLR) 2026

Authors
Yiqiao Jin
Ph.D. Candidate in Computer Science
My research focuses on adaptive and efficient AI systems, with emphasis on LLM agents, agent memory, self-distillation, multimodal LLMs, and structured multi-agent intelligence.