[R] Treating Depth Sensor Failures as Learning Signal: Masked Depth Modeling outperforms industry-grade RGB-D cameras
Been reading through “Masked Depth Modeling for Spatial Perception” from Ant Group and the core idea clicked for me. RGB-D cameras fail on reflective and transparent surfaces, and most methods just discard these missing values as noise. This paper does the opposite: sensor failures happen exactly where geometry is hardest (specular reflections, glass, textureless walls), so why not use them as natural masks for self-supervised learning? The setup takes full RGB as context, masks depth tokens where the […]