Bowen Cheng 程博文

Bowen is a Member of Technical Staff at OpenAI working on multimodal foundation models and agents.

Before that, he was a Senior Research Scientist on the Autopilot team at Tesla, building the Full Self-Driving (FSD) system. Bowen received his Ph.D. in Electrical and Computer Engineering (ECE) at the University of Illinois Urbana-Champaign (UIUC). His Ph.D. advisors were Prof. Alexander Schwing and Prof. Thomas Huang (2017-2020). Bowen does research in computer vision and machine learning. Before commencing his graduate studies, he received his B.S. in ECE at UIUC in 2017.

Bowen has interned at FAIR NYC (Facebook AI Research, New York City), FAIR MPK (Facebook AI Research, Menlo Park), Google Research (Los Angeles), Microsoft Research (Redmond), and Microsoft Research Asia (Beijing, China).

Research Interests

As a researcher in computer vision and machine learning, I am interested in multimodal embodied agents. In particular, I would like to build an AI assistant (e.g., self-driving car, chatbot, etc.) that (1) takes human commands in arbitrary forms; (2) executes and generates outputs in desired forms, either with its internal knowledge or by using “tools”; and (3) learns from commonsense knowledge and human feedback.

Please refer to my Google Scholar profile for a full list of my publications.

News

  • [2024.03] New journey @ OpenAI!
  • [2023.12] FSD v12 is out! It is the first end-to-end self-driving model from Tesla, and I’m so proud to have contributed to this project.
  • [2023.08] One paper accepted to ICCV 2023!
  • [2022.08] Day 1 @ Tesla!
  • [2022.06] Defended my Ph.D. thesis :)
  • [2022.03] Two papers accepted to CVPR 2022!
  • [2021.12] Check out our Mask2Former, which for the first time outperforms specialized architectures on panoptic, instance, and semantic segmentation with a single universal architecture. New SOTAs: 57.8 PQ on COCO panoptic segmentation, 50.1 AP on COCO instance segmentation, and 57.7 mIoU on ADE20K semantic segmentation!
  • [2021.10] I received a NeurIPS 2021 Outstanding Reviewer Award.
  • [2021.09] MaskFormer accepted to NeurIPS 2021 as spotlight!
  • [2021.07] Check out our MaskFormer, which seamlessly unifies semantic- and instance-level segmentation tasks by treating semantic segmentation as a mask classification problem. MaskFormer achieves new SOTA on both semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO).

Projects

  • Mask2Former codebase based on Detectron2 [code]
  • MaskFormer codebase based on Detectron2 [code]
  • Pointly-supervised instance segmentation implementation in Detectron2 [code]
  • Panoptic-DeepLab implementation in Detectron2 [code]
  • DeepLab implementation in Detectron2 [code]
  • Panoptic-DeepLab implementation in PyTorch from scratch [code]
  • HigherHRNet implementation in PyTorch [code]