Bowen Cheng 程博文

Bowen is a Member of Technical Staff at OpenAI working on multimodal foundation models and agents.

Before that, Bowen was a Senior Research Scientist on the Autopilot team at Tesla, where he helped build the Full Self-Driving (FSD) system. Bowen received his Ph.D. in Electrical and Computer Engineering (ECE) at the University of Illinois Urbana-Champaign (UIUC). His Ph.D. advisors are Prof. Alexander Schwing and Prof. Thomas Huang (2017-2020). Bowen is doing research in computer vision and machine learning. Before commencing his graduate studies, he received his B.S. in ECE at UIUC in 2017.

Bowen has interned at FAIR NYC (Facebook AI Research, New York City), FAIR MPK (Facebook AI Research, Menlo Park), Google Research (Los Angeles), Microsoft Research (Redmond), and Microsoft Research Asia (Beijing, China).

Research Interests

As a researcher in computer vision and machine learning, I am interested in multimodal embodied agents. In particular, I would like to build an AI assistant (e.g., a self-driving car or a chatbot) that (1) takes human commands in arbitrary forms; (2) executes and generates outputs in desired forms, either with its internal knowledge or by using “tools”; and (3) learns from commonsense knowledge and human feedback.

Please refer to my Google Scholar profile for a full list of my publications.

News

[2025.04] o3 and o4-mini are out! I initiated the research on Thinking with images; the model can now add images to its CoT.
[2024.05] GPT-4o is out! Check our blog post.
[2024.03] New journey @ OpenAI!
[2023.12] FSD v12 is out! It is the first end-to-end self-driving model from Tesla, and I’m so proud to have contributed to this project.

Projects

Mask2Former codebase based on Detectron2 [code]
MaskFormer codebase based on Detectron2 [code]
Pointly-supervised instance segmentation implementation in Detectron2 [code]
Panoptic-DeepLab implementation in Detectron2 [code]
DeepLab implementation in Detectron2 [code]
Panoptic-DeepLab implementation in PyTorch from scratch [code]
HigherHRNet implementation in PyTorch [code]