Lin Chen 陈林
Researcher, ByteDance Seed
Email: chlin@mail.ustc.edu.cn
I am a researcher at ByteDance Seed, working on multimodal foundation models. Before that, I received my Ph.D. from the University of Science and Technology of China (USTC) (Sep. 2020 - June 2026), advised by Prof. Feng Zhao. I also lead the vision-language model group at USTC-BIVLab.
✨ NOTE: Our Lab [Link] is looking for talented students and researchers to join us. Positions for Master's students, Ph.D. students, and postdocs are open. If you are interested in our research and want to join us, just contact me!
* denotes equal contribution.
| Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity
ByteDance Seed, 2026 [PDF] [Project] |
| Seed1.8 Model Card: Towards Generalized Real-World Agency
ByteDance Seed, 2025 [PDF] [Github] |
| Seed1.5-VL Technical Report
ByteDance Seed, 2025 [PDF] [Code] |
| InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang, Xiaoyi Dong, et al., 2024 [PDF] [Code] |
| ♠ (Co-) First author papers |
| Are We on the Right Way for Evaluating Large Vision-Language Models?
Lin Chen*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao
NeurIPS, 2024 [PDF] [Project] [Code] |
| ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Conghui He, Jiaqi Wang, Feng Zhao, Dahua Lin
ECCV, 2024 [PDF] [Project] [Demo] [Code] |
| ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Lin Chen*, Xilin Wei*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Bin Lin, Zhenyu Tang, Li Yuan, Yu Qiao, Dahua Lin, Feng Zhao, Jiaqi Wang
NeurIPS, 2024 [PDF] [Project] [Code] |
| Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Zhixiang Wei*, Lin Chen*, Yi Jin*, Xiaoxiao Ma, Tianle Liu, Pengyang Ling, Ben Wang, Huaian Chen, Jinjin Zheng
CVPR, 2024 [PDF] [Project] [Code] |
| FreeDrag: Point Tracking is Not What You Need for Interactive Point-based Image Editing
Pengyang Ling*, Lin Chen*, Pan Zhang, Huaian Chen, Yi Jin
CVPR, 2024 [PDF] [Project] [Demo] [Code] |
| Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
Zhixiang Wei*, Lin Chen*, Tao Tu, Huaian Chen, Pengyang Ling, Yi Jin
ICCV, 2023 [PDF] [Code] |
| Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation
Lin Chen*, Zhixiang Wei*, Xin Jin*, Huaian Chen, Miao Zheng, Kai Chen, Yi Jin
NeurIPS, 2022 [PDF] [Code] |
| Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
Lin Chen*, Huaian Chen*, Zhixiang Wei, Xin Jin, Xiao Tan, Yi Jin, Enhong Chen
CVPR, 2022 [PDF] [Code] |
| ♠ Co-author papers |
| Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
ICML, 2026 |
| VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
ICML, 2026 |
| Unbiased Principles, Robust Rewards
ICML, 2026 |
| Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models
ACL, 2026 |
| UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision
ACL, 2026 |
| CompBench: Benchmarking Complex Instruction-guided Image Editing
CVPR, 2026 |
| Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models
ICLR, 2026 |
| V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction
ICLR, 2026 |
| VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
NeurIPS, 2025 |
| CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios
EMNLP Main, 2025 |
| Enhancing Large Vision-Language Models with Ultra-Detailed Image Caption Generation
EMNLP Main, 2025 |
| VFM-Adapter: Adapting Visual Foundation Models for Dense Prediction with Dynamic Hybrid Operation Mapping
AAAI, 2025 |
| Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
NeurIPS, 2024 |