Opening Remarks
Overview of VQualA 2025 goals and structure
Title: Evaluate and Improve AIGC via Modeling Human Feedback and Behavior
Abstract
In this talk, I will present our recent work on the evaluation and post-training of AIGC. In particular, I will discuss how to build a rich human feedback (auto-rater) model that predicts raters' rich feedback on generated images, which can serve as an interpretable AIGC evaluation and reward model. Moreover, I will show how to improve image generation models by fine-tuning with our auto-rater model's predictions, e.g., achieving region-aware fine-tuning for T2I models to fix problematic regions (CVPR 2025 paper), or fine-tuning with multiple rewards. Finally, we will also discuss a rich human behavior model spanning various kinds of visual content, and give some examples of how to use it to improve visual content for a better user experience.
Paper Session I — Image and Video Quality Assessment
Each: 8 min talk + 2 min Q&A
- AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images
- AIGVQA: A Unified Framework for Multi-Dimensional Quality Assessment of AI-Generated Video
- Understanding Perceptual Quality in CCTV Images: A Benchmark Dataset and Entropy-based Insights
- Perceptual Classifiers: Detecting Generative Images using Perceptual Features
- Hybrid Vision Transformer and Convolutional Neural Network for Super-Resolution Image Quality Assessment
- DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment
- Better Supervised Fine-tuning for VQA: Integer-Only Loss
Paper Session II — Face and Multimodal Quality Assessment
Each: 8 min talk + 2 min Q&A
- MSPT: A Lightweight Face Image Quality Assessment Method with Multi-stage Progressive Training
- Efficient Face Image Quality Assessment via Self-training and Knowledge Distillation
- RankCORE: Self-Supervised Ranking-Aware Correlation Optimized Regression for Face Image Quality Assessment
- A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss
- MMF-QE: Advanced Multi-Modal Fusion for Quality Assessment and Engagement Prediction in User-Generated Short Videos
- Engagement Prediction of Short Videos with Large Multimodal Models
- Exploring MLLM in Fine-Grained Visual Quality Comparison with Quality Token
- QualiVision: Multi-Modal Video Quality Assessment with Quality-Aware Fusion and Discriminative Learning Strategies
VQualA 2025 — Overview of Challenges and Closing Remarks
Overview of challenge tracks and winning solutions
- ISRGC-Q: Image Super-Resolution Generated Content Quality Assessment Challenge
- FIQA: Face Image Quality Assessment Challenge
- EVQA-SnapUGC: Engagement Prediction for Short Videos Challenge
- Visual Quality Comparison for Large Multimodal Models Challenge
- DIQA: Document Image Enhancement Quality Assessment Challenge
- GenAI-Bench AIGC Video Quality Assessment Challenge
Closing Remarks