Mosam Dabhi

I am a Ph.D. student at the Robotics Institute, Carnegie Mellon University (CMU), where I investigate representations that can serve as implicit priors and stand in for human visual intelligence.

My goal is to understand how out-of-distribution (OOD) reasoning (generalization) can be achieved to build such general multi-modal intelligence in machines. To this end, I am currently focusing on extracting geometric reasoning via graph-based representations.


News

  • Feb 2024: 3D-LFM has been accepted to CVPR 2024. Looking forward to presenting it in Seattle.
  • June 2023: I will be joining Apple’s AI Research team as a research scientist intern.
  • Oct 2022: I am happy to announce that I won the NeurIPS 2022 Scholar Award to attend the in-person NeurIPS conference in New Orleans, Louisiana!
  • Oct 2022: MBW has been accepted to NeurIPS 2022.
  • Aug 2022: MV-NRSfM, our work from 3DV 2021, was featured in this post, titled “New AI tech to bring human-like understanding of our 3D world.”
  • May 2022: Joining Apple for the third consecutive summer to advance self-supervised and meta-learning.
  • Oct 2021: MV-NRSfM has been accepted to 3DV 2021. We have also released the code.
  • May 2021: Rejoining Apple this summer to work on deep-learning-driven computer vision.
  • May 2021: I defended my Master’s thesis, Multi-view NRSfM: Affordable Setup for High-Fidelity 3D Reconstruction.
  • May 2020: Joining Apple as a Machine Learning Research Scientist Intern.

Publications

3D-LFM: Lifting Foundation Model

CVPR, 2024

A universal 2D-3D lifting model that processes diverse objects without category-specific knowledge. It leverages the permutation equivariance of transformers and geometric consistency to handle camera rotations, standardizing shape representation in a canonical frame.

MBW: Multi-view Bootstrapping in the Wild

NeurIPS, 2022

By enforcing temporal and spatial consistency via neural priors, MBW performs out-of-distribution (OOD) detection for auto-labeling at scale in a low-shot learning setting.

High Fidelity 3D Reconstructions with Limited Physical Views

3DV, 2021

Enforcing multi-view equivariance with modern deep 3D lifting enables high-fidelity 3D reconstructions from just 2-3 cameras, compared to setups requiring more than 100 cameras.

Real-Time Information-Theoretic Exploration with Gaussian Mixture Model Maps

RSS, 2019

Representing the environment with Gaussian Mixture Models (GMMs) rather than voxel grids enables map transfer from Mars to Earth in 21 seconds, compared to 1260 seconds.

Fast and agile vision-based flight with teleoperation and collision avoidance on a multirotor

ISER, 2018

Aggressive autonomous flight and collision-free teleoperation in unstructured, GPS-denied environments at speeds exceeding 12 m/s.

Aggressive Flight Performance using Robust Experience-driven Predictive Control Strategies: Experimentation and Analysis

Robotics Institute, CMU (Technical Report), 2018

Storing crucial control policies allows them to be reused later without expending valuable compute on a resource-constrained Micro Air Vehicle (MAV).