Publications

ECCV 2026

O-VAD: Industrial Video Anomaly Detection through Object-Centric State Tracking and Reasoning

Mei Yuan, Qi Long, Qifeng Wu, Zhenyang Li, Yizhou Zhao, Lei Wang, Yang Liu, Min Xu

Accepted by ECCV 2026

TL;DR: A VLM-based reasoning framework that elevates video anomaly detection from pattern matching to cognitive-level understanding. By simulating human spatial perception and representing scene dynamics via object-centric state tracking, our approach achieves state-of-the-art performance on industrial benchmarks, pioneering explainable anomaly detection for robotic laboratories.

ongoing

Time-STaR: Self-Taught Reasoners Augmented with Tools for Reliable Time Series Analysis

ongoing

📄 Paper 💻 Code

TL;DR: A reasoning-centric framework that repurposes LLMs for time series forecasting. By curating the Time-STaR-CoTT dataset and implementing GRPO-style reinforcement learning, we enable models to identify causal relationships, detect regime changes, and generate interpretable forecasts, achieving state-of-the-art results across weather, traffic, and finance domains.

Guiding Grasp and Growth: Multi-Modal Detection and Feedback on Accented Mispronunciation

Mei Yuan, Boting Li

Brief version of master's thesis

📄 Paper 🎯 Demo

TL;DR: An interactive text-vision-audio pronunciation coaching system combining LLM-powered assessment, Neural TTS exemplars, and viseme animations. Validated with 82 students, who reported 90%+ satisfaction, the system was adopted as an intelligent teaching assistant in a graduate-level English course at Peking University.

EMNLP 2023

DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models

Shengguang Wu, Mei Yuan, Qi Su

Findings of EMNLP 2023

📄 Paper 🌐 Project

TL;DR: A non-autoregressive DiffusionLM-based storytelling model that generates coherent narratives around visual sequences. Trained with weighted conditions on global vision-language history, DiffuVST achieves superior performance with 10× faster inference than autoregressive models.

ACM MM 2022

A Person Re-identification Approach Focusing on the Occlusion Problem and Ranking Optimization

Wenkai Zheng, Mei Yuan

ACM Multimedia 2022 (MMSports Workshop)

📄 Paper 🏆 2nd Place