Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts

Apr 23, 2026

Kartik Sharma

Yiqiao Jin

Vineeth Rakesh

Yingtong Dou

Menghai Pan

Mahashweta Das

Srijan Kumar

Abstract

Aligning frozen large language models without modifying their weights is a key challenge for safe and adaptive deployment. We introduce Sysformer, a system that learns adaptive system prompts to safeguard frozen LLMs across diverse risk scenarios. Sysformer treats the system prompt as a learnable, query-conditioned intervention, enabling fine-grained safety control without parameter updates and improving robustness across multiple LLM families. ICLR 2026 acceptance rate: 28%.
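The idea of a query-conditioned system prompt can be illustrated with a minimal sketch. The abstract does not specify the mechanism, so this example assumes single-head cross-attention in which each system-prompt embedding attends over the user-query embeddings and receives a residual update. All names here (`adapt_system_prompt`, `Wq`, `Wk`, `Wv`) are hypothetical, the weights are random rather than learned, and the frozen LLM that would consume the adapted embeddings is left out.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def adapt_system_prompt(sys_emb, query_emb, Wq, Wk, Wv):
    """Assumed mechanism: cross-attention from system-prompt vectors
    (queries) to user-query vectors (keys/values), plus a residual."""
    d = len(sys_emb[0])
    keys = [matvec(Wk, q) for q in query_emb]
    values = [matvec(Wv, q) for q in query_emb]
    adapted = []
    for s in sys_emb:
        q = matvec(Wq, s)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        update = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
        # Residual connection: the original system prompt is shifted, not replaced.
        adapted.append([si + ui for si, ui in zip(s, update)])
    return adapted


def random_matrix(d, rng, scale=0.1):
    """Stand-in for learned projection weights (illustrative only)."""
    return [[rng.gauss(0, scale) for _ in range(d)] for _ in range(d)]


rng = random.Random(0)
d = 4
Wq, Wk, Wv = (random_matrix(d, rng) for _ in range(3))

# Toy embeddings: 3 system-prompt tokens and two different user queries.
sys_emb = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(3)]
query_a = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(2)]
query_b = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(5)]

adapted_a = adapt_system_prompt(sys_emb, query_a, Wq, Wk, Wv)
adapted_b = adapt_system_prompt(sys_emb, query_b, Wq, Wk, Wv)
```

In this sketch the same base system prompt yields different adapted embeddings for different queries, while the output keeps the shape of the original prompt. In a trained setup only the projection weights would be updated, and the LLM's own parameters would stay untouched.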

Type

Conference paper

Publication

International Conference on Learning Representations (ICLR) 2026

Last updated on Apr 23, 2026

LLM Safety, System Prompts, Large Language Models, Alignment

Authors

Yiqiao Jin

Ph.D. Candidate in Computer Science

My research focuses on adaptive and efficient AI systems, with emphasis on LLM agents, agent memory, self-distillation, multimodal LLMs, and structured multi-agent intelligence.
