Publications

A complete and continuously updated list is also available on Google Scholar and Semantic Scholar. For the canonical record, see my CV.

* denotes equal contribution.

(2026). SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression. ACL'26.
(2026). Reasoning Is Not All You Need: Examining LLMs for Multi-Turn Mental Health Conversations. ACL'26.
(2026). MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A. ACL'26 Industry.
(2026). MASCOT: Towards Multi-Agent Socio-Collaborative Companion Systems. ACL'26 TrustNLP Workshop.
(2026). MedHalu: Hallucinations in Responses to Healthcare Queries by Large Language Models. ICWSM'26.
(2026). UniSD: Towards a Unified Self-Distillation Framework for Large Language Models. Preprint.
(2026). TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization. Preprint.
(2026). Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts. ICLR'26.
(2026). Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models. ICLR'26.
(2026). AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent. Preprint.
(2026). Consistency Should Be the Priority for Unified Multimodal Models. Preprint.
(2026). Efficient Knowledge Probing of Large Language Models by Adapting Pre-trained Embeddings. Preprint.
(2025). SlideAgent: Hierarchical Agentic Framework for Multi-Page Visual Document Understanding. ACL'26.
(2025). Protein Large Language Models: A Comprehensive Survey. EMNLP'25.
(2025). Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories. KDD'25 TGL Workshop.
(2025). Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas. AI and Ethics.
(2025). A Survey on Efficient LLM Training: From Data-centric Perspectives. ACL'25.
(2025). ProteinGPT: Multimodal LLM for Protein Property Prediction and Structure Understanding. ICML'25 FM4LS Workshop.
(2025). Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems. Preprint.
(2025). CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries. CVPR'25 VLMs4All Workshop.
(2025). ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction. WebConf'25 MM4SG.
(2025). UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models. AAAI'25 DAI Workshop.
(2024). RNA-GPT: Multimodal Generative System for RNA Sequence Understanding. NeurIPS'24 MLSB Workshop.
(2024). PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners. EMNLP'24.
(2024). AgentReview: Exploring Peer Review Dynamics with LLM Agents. EMNLP'24.
(2024). Towards Fair Graph Anomaly Detection: Problem, Benchmark Datasets, and Evaluation. CIKM'24.
(2024). CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents. ICML'24.
(2024). Prototypical Reward Network for Data-Efficient RLHF. ACL'24.
(2024). MM-SOC: Benchmarking Multimodal Large Language Models in Social Media Platforms. ACL'24.
(2024). Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries. WWW'24.
(2023). Semi-Offline Reinforcement Learning for Optimized Text Generation. ICML'23.
(2023). Code Recommendation for Open Source Project Developers. WWW'23.
(2023). Prototypical Fine-Tuning: Towards Robust Performance under Varying Data Sizes. AAAI'23.
(2023). Predicting Information Pathways Across Online Communities. KDD'23.
(2022). Reinforcement Subgraph Reasoning for Fake News Detection. KDD'22.
(2022). Towards Fine-Grained Reasoning for Fake News Detection. AAAI'22.