Physically Embodied Gaussian Splatting: A Visually Learnt and Physically Grounded 3D Representation for Robotics

¹QUT Centre for Robotics, ²University of Adelaide

Our Gaussian-Particle representation constantly corrects itself to match the real world.

Abstract

For robots to robustly understand and interact with the physical world, it is highly beneficial to have a comprehensive representation, one that models geometry, physics, and visual observations, to inform perception, planning, and control algorithms. We propose a novel dual "Gaussian-Particle" representation that models the physical world while (i) enabling predictive simulation of future states and (ii) allowing online correction from visual observations in a dynamic world. Our representation comprises particles that capture the geometry of objects in the world and can be used alongside a particle-based physics system to anticipate physically plausible future states. Attached to these particles are 3D Gaussians that render images from any viewpoint through a splatting process, thus capturing the visual state. By comparing the predicted and observed images, our approach generates "visual forces" that correct the particle positions while respecting known physical constraints. By integrating predictive physical modelling with continuous visually derived corrections, our unified representation reasons about the present and future while synchronizing with reality. We validate our approach on 2D and 3D tracking tasks as well as on photometric reconstruction quality. Our system runs in realtime at 30 Hz using only 3 cameras.
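The predict-then-correct loop described in the abstract can be illustrated with a deliberately small toy. The sketch below is not the authors' implementation: it assumes a single particle on a 1D row of pixels, a constant-velocity prior standing in for the physics system, a clamp to the image bounds standing in for the physical constraints, and "visual forces" taken as the negative gradient of a squared photometric error, applied directly as position updates. All names and values (`PIXELS`, `SIGMA`, `render`, `visual_forces`, `run`, the velocities and gain) are hypothetical.

```python
import numpy as np

PIXELS = np.arange(64, dtype=float)  # hypothetical 1D "image" with 64 pixels
SIGMA = 2.0                          # assumed Gaussian width, in pixels


def render(positions):
    """Splat unit-amplitude 1D Gaussians centred at `positions` onto PIXELS."""
    diff = PIXELS[None, :] - positions[:, None]           # shape (N, P)
    return np.exp(-0.5 * (diff / SIGMA) ** 2).sum(axis=0)  # shape (P,)


def visual_forces(positions, observed):
    """Negative gradient of 0.5 * ||render(positions) - observed||^2."""
    diff = PIXELS[None, :] - positions[:, None]
    weights = np.exp(-0.5 * (diff / SIGMA) ** 2)
    residual = weights.sum(axis=0) - observed              # predicted - observed
    # d render_p / d x_i = weights[i, p] * (pixel_p - x_i) / SIGMA^2
    grad = (residual[None, :] * weights * diff / SIGMA**2).sum(axis=1)
    return -grad


def run(correct, frames=40, true_vel=0.5, prior_vel=0.3, gain=1.0, iters=5):
    """Track one particle; return the final |estimate - truth| in pixels."""
    truth = np.array([20.0])
    estimate = truth.copy()
    for frame in range(frames):
        truth += true_vel                # the real object moves
        if frame == 20:
            truth += 3.0                 # sudden disturbance (assumed, toy "user push")
        observed = render(truth)         # stand-in for a camera image

        estimate += prior_vel            # predict with an imperfect motion prior
        if correct:
            for _ in range(iters):       # correct using the visual forces
                estimate += gain * visual_forces(estimate, observed)
                # toy constraint: the particle must stay inside the image
                estimate = np.clip(estimate, PIXELS[0], PIXELS[-1])
    return float(abs(estimate[0] - truth[0]))
```

Comparing `run(correct=True)` with `run(correct=False)` shows the intended effect under these assumptions: with the correction step the estimate stays close to the moving target and recovers after the disturbance, while the prediction-only estimate drifts by roughly 11 pixels over the 40 frames.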

Realtime Operation

Our system calculating visual forces in realtime and being disrupted by the user.

Deformables

The particles of a rope in a simulated scene.

The particles of a rope in a real scene.

Pushover

The TBlock is pushed over by the robot. The gravity prior keeps the block's predicted state close enough to reality for the visual forces to correct it.

Pickup

A mug is slightly pushed by the robot before being picked up. Both the push and the pickup are captured by the representation.

Rope

The Gaussian-Particle representation synchronizing with a rope as it is deformed.

Multiple Objects

Multiple objects being tracked by our system.

Complex Object

The particles and Gaussians can be built for complex shapes.

Simulated Scenes

Related Links

ParticleNeRF introduces the idea of using particles as both the primitive for rendering NeRFs and the primitive on which a physics system acts.

Dynamic 3D Gaussians uses Gaussian Splatting with auxiliary structural losses to track elements in the scene.

Citation

@inproceedings{abou-chakra2024physically,
  title={Physically Embodied Gaussian Splatting: A Realtime Correctable World Model for Robotics},
  author={Jad Abou-Chakra and Krishan Rana and Feras Dayoub and Niko Suenderhauf},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024},
  url={https://openreview.net/forum?id=AEq0onGrN2}
}