Mosam Dabhi

I am a PhD student at the Robotics Institute, Carnegie Mellon University (CMU), working with Simon Lucey and Laszlo Jeni. Before that, I worked with Nathan Michael at CMU on making drones operate intelligently.

Purpose: Solving spatial intelligence by merging existing data streams on the internet with spatial (3D) signals, to lay the foundation for intelligence that is capable of physical reasoning.


News

  • June 2024: Back at Apple (5th summer) as a research scientist intern - making next-gen stuff on Vision Pro.
  • Feb. 2024: 3D-LFM has been accepted at CVPR 2024. Looking forward to presenting it in Seattle.
  • June 2023: At Apple (4th summer) as a research scientist intern - making next-gen stuff on Vision Pro.
  • Oct 2022: Won the NeurIPS 2022 scholar award in New Orleans, Louisiana!
  • Oct 2022: MBW has been accepted to NeurIPS 2022.
  • Aug 2022: MV-NRSfM from 3DV 2021 was featured in this post, titled “New AI tech to bring human-like understanding of our 3D world.”
  • May 2022: Joining Apple for a 3rd consecutive summer - Vision Pro.
  • Oct 2021: MV-NRSfM has been accepted to 3DV 2021. We have also released the code.
  • May 2021: Rejoining Apple (2nd consecutive summer) - Vision Pro.
  • May 2021: I have defended my Master's thesis on Multi-view NRSfM: Affordable Setup for High-Fidelity 3D Reconstruction.
  • May 2020: Joining Apple as a Scientist Intern - Vision Pro.

Publications

RAT4D: Rig and Animate any object without Templates in 4D

Coming soon...

Create 3D/4D rigs and models of any deformable object in the world without templates from a single video captured on a smartphone - a tool for mapping the spatial world.

3D-LFM: Lifting Foundation Model

CVPR, 2024

A universal 2D-to-3D lifting model that processes diverse objects without category-specific knowledge. It uses transformers' permutation equivariance and geometric consistency to handle camera rotations, standardizing shape representation in a canonical frame.

MBW: Multi-view Bootstrapping in the Wild

NeurIPS, 2022

By enforcing temporal and spatial consistency via neural priors, MBW performs out-of-distribution (OOD) detection for auto-labeling at scale in a low-shot learning fashion.

High Fidelity 3D Reconstructions with Limited Physical Views

3DV, 2021

Enforcing multi-view equivariance with modern deep 3D lifting enables generation of high-fidelity 3D reconstructions using just 2-3 cameras instead of more than 100.

Real-Time Information-Theoretic Exploration with Gaussian Mixture Model Maps

RSS, 2019

Representing the environment using Gaussian Mixture Models (GMMs) instead of voxel grids enables map transfer from Mars to Earth in 21 seconds, compared to 1260 seconds.

Fast and agile vision‑based flight with teleoperation and collision avoidance on a multirotor

ISER, 2018

Aggressive autonomous flight and collision-free teleoperation in unstructured, GPS-denied environments at speeds exceeding 12 m/s.

Aggressive Flight Performance using Robust Experience-driven Predictive Control Strategies: Experimentation and Analysis

Robotics Institute, CMU (Technical Report), 2018

By storing crucial control policies, you can re-use them at a later stage without spending valuable compute resources on a resource-constrained Micro Air Vehicle (MAV).