Paper-Conference

SciEvo: A 2 Million, 30-Year Cross-disciplinary Dataset for Temporal Scientometric Analysis

SciEvo is a comprehensive dataset containing 2 million+ papers spanning 30 years (1995-2024) for temporal scientometric analysis.

Mar 3, 2025

AgentReview: Exploring Peer Review Dynamics with LLM Agents

Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analyses often rely on exploration and statistics of existing peer review data...

Nov 12, 2024

Towards Fair Graph Anomaly Detection: Problem, Benchmark Datasets, and Evaluation

We address fairness issues in graph anomaly detection, providing benchmark datasets and comprehensive evaluation frameworks for fair anomaly detection on graphs.

Jul 4, 2024

CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents

This work studies the competition dynamics among LLM-based agents, revealing emergent behaviors and strategic patterns in multi-agent systems.

Apr 30, 2024

Prototypical Reward Network for Data-Efficient RLHF

We propose a prototypical reward network that enables data-efficient reinforcement learning from human feedback (RLHF) for large language models.

Jan 1, 2024

MM-SOC: Benchmarking Multimodal Large Language Models in Social Media Platforms

Social media platforms are hubs for multimodal information exchange, encompassing text, images, and videos, making it challenging for machines to comprehensively understand the information. Multimodal...

Jan 1, 2024

Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries

We present a framework and benchmark to evaluate LLMs' multilingual capabilities in healthcare queries, revealing significant performance gaps across languages and providing insights for improving hea...

Jan 1, 2024

Semi-Offline Reinforcement Learning for Optimized Text Generation

We propose a semi-offline reinforcement learning approach for optimizing text generation in language models, balancing exploration and exploitation effectively.

May 1, 2023

Prototypical Fine-Tuning: Towards Robust Performance under Varying Data Sizes

We propose prototypical fine-tuning, a novel framework for fine-tuning pretrained language models that maintains robust performance across varying data sizes.

Jan 1, 2023

Predicting Information Pathways Across Online Communities

We develop methods to predict how information spreads across different online communities, revealing patterns in cross-platform information diffusion.

Jan 1, 2023