Projects | Yiqiao Jin CS PhD @ Georgia Tech

Research Projects

A selection of research projects spanning LLM agents and agent memory, self-distillation and efficient adaptation, multimodal LLMs, and structured multi-agent intelligence.

Self-Distillation

UniSD

A unified self-distillation framework for large language models that consolidates fragmented self-distillation directions into a single, modular formulation across data, representation, and decoding levels. Under review at NeurIPS 2026.

May 30, 2026

Sysformer

Safeguards frozen large language models by learning adaptive, query-conditioned system prompts. Enables fine-grained safety control without modifying model weights. Accepted at ICLR 2026.

Apr 23, 2026

Retrieval-Augmented Generation

SARA

Selective and Adaptive Retrieval-augmented Generation with Context Compression. A unified RAG framework that combines fine-grained natural-language spans with compact semantic compression vectors under strict context budgets. Accepted at ACL 2026 main conference.

Mar 15, 2026

Multimodal LLMs

SlideAgent

Hierarchical agentic framework for multi-page visual document understanding. Decomposes reasoning into global, page, and element levels to handle slide decks, financial reports, and infographics. Accepted at ACL 2026 main conference.

Mar 15, 2026

AgentArk

Distilling multi-agent intelligence into a single LLM agent. Decomposes multi-agent trajectories into role-conditioned skills and trains a single agent to reproduce the collaborative behavior of the original ensemble. Under review at NeurIPS 2026.

Feb 4, 2026

Multimodal LLMs

ScreenLLM

A multimodal LLM for Graphical User Interface understanding and action prediction. Introduces a stateful screen schema that summarizes dynamic UI sessions as time-aware text and a key-frame extractor for significant UI transitions. WebConf 2025 MM4SG Workshop.

Apr 30, 2025

SciEvo

A cross-disciplinary dataset of over two million publications spanning 30 years, built for temporal scientometric analysis. Reveals disparities in epistemic cultures, citation practices, and knowledge production modes across fields. Best Paper Award at Good-Data @ AAAI 2025 Workshop.

Mar 3, 2025

AgentReview

First LLM-based peer review simulation framework that disentangles latent factors driving reviewer decisions. Reveals a 37.1% variation in paper decisions due to reviewer biases. EMNLP 2024 Oral.

Nov 12, 2024

Multimodal LLMs

MM-Soc

A benchmark for evaluating multimodal LLMs on social media platforms, covering misinformation, sentiment, hate speech, and humor across image-text content. ACL 2024 (Findings).

May 15, 2024

CompeteAI

Studies competition dynamics among LLM-based agents in a simulated virtual town with restaurant and customer agents. Reveals emergent behaviors and strategic patterns aligned with market and sociological theories. ICML 2024 Oral.

May 1, 2024

Large Language Models

XLingEval

Cross-lingual evaluation framework that exposes substantial multilingual gaps in LLM healthcare responses. Featured by Scientific American, The World, and Georgia Tech News. WebConf 2024 Oral.

Apr 1, 2024

INPAC

Continuous-time dynamic graph framework for predicting information pathways across online communities. Models the cross-platform diffusion of YouTube videos through Reddit using multimodal signals. KDD 2023 Oral.

Aug 1, 2023

Recommender Systems

CODER

Graph-based code recommendation framework for open-source developers. Leverages heterogeneous OSS contribution networks (repositories, users, issues, pull requests) to deliver multimodal recommendations. WebConf 2023 Oral.

Apr 30, 2023

Misinformation Detection

FinerFact

A fine-grained reasoning framework for fake news detection that mirrors the human information-processing model, using a prior-aware bi-channel kernel graph network over evidence types. AAAI 2022 Oral.

Feb 1, 2022