Lin Chen   陈林

Researcher

ByteDance Seed

Email: chlin@mail.ustc.edu.cn
Google Scholar: Link
Github: https://github.com/xiaoachen98/
HuggingFace: https://huggingface.co/Lin-Chen


Biography

I am a researcher at ByteDance Seed, working on multimodal foundation models. Before that, I received my Ph.D. from the University of Science and Technology of China (USTC) (Sep. 2020 - June 2026), advised by Prof. Feng Zhao. I also lead the vision-language model group at USTC-BIVLab.

✨ NOTE: Our lab [Link] welcomes excellent students and researchers. Master's, Ph.D., and postdoc positions are open. If you are interested in our research and would like to join us, please contact me!

News

Experience

Selected Publications

* denotes equal contribution.

Preprint Papers

Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity
ByteDance Seed, 2026
[PDF] [Project]
Seed1.8 Model Card: Towards Generalized Real-World Agency
ByteDance Seed, 2025
[PDF] [Github]
Seed1.5-VL Technical Report
ByteDance Seed, 2025
[PDF] [Code]
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang, Xiaoyi Dong, et al., 2024
[PDF] [Code]

Published Papers

♠ (Co-)First author papers
Are We on the Right Way for Evaluating Large Vision-Language Models?
Lin Chen*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao
NeurIPS, 2024 — Top 10 Most Influential NeurIPS 2024 Papers
[PDF] [Project] [Code]
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin
ECCV, 2024 — Top 5 Most Influential ECCV 2024 Papers
[PDF] [Project] [Demo] [Code]
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Lin Chen*, Xilin Wei*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Bin Lin, Zhenyu Tang, Li Yuan, Yu Qiao, Dahua Lin, Feng Zhao, Jiaqi Wang
NeurIPS, 2024
[PDF] [Project] [Code]
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Zhixiang Wei*, Lin Chen*, Yi Jin*, Xiaoxiao Ma, Tianle Liu, Pengyang Ling, Ben Wang, Huaian Chen, Jinjin Zheng
CVPR, 2024
[PDF] [Project] [Code]
FreeDrag: Point Tracking is Not What You Need for Interactive Point-based Image Editing
Pengyang Ling*, Lin Chen*, Pan Zhang, Huaian Chen, Yi Jin
CVPR, 2024
[PDF] [Project] [Demo] [Code]
Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
Zhixiang Wei*, Lin Chen*, Tao Tu, Huaian Chen, Pengyang Ling, Yi Jin
ICCV, 2023
[PDF] [Code]
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation
Lin Chen*, Zhixiang Wei*, Xin Jin*, Huaian Chen, Miao Zheng, Kai Chen, Yi Jin
NeurIPS, 2022 — Spotlight
[PDF] [Code]
Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
Lin Chen*, Huaian Chen*, Zhixiang Wei, Xin Jin, Xiao Tan, Yi Jin, Enhong Chen
CVPR, 2022
[PDF] [Code]
♠ Co-author papers
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
ICML, 2026
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
ICML, 2026
Unbiased Principles, Robust Rewards
ICML, 2026
Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models
ACL, 2026
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision
ACL, 2026
CompBench: Benchmarking Complex Instruction-guided Image Editing
CVPR, 2026
Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models
ICLR, 2026
V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction
ICLR, 2026
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
NeurIPS, 2025
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios
EMNLP Main, 2025
Enhancing Large Vision-Language Models with Ultra-Detailed Image Caption Generation
EMNLP Main, 2025
VFM-Adapter: Adapting Visual Foundation Models for Dense Prediction with Dynamic Hybrid Operation Mapping
AAAI, 2025
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
NeurIPS, 2024