Academic
paper-conference
Leveraging VLM-Based Pipelines to Annotate 3D Objects
We improve pretrained captioning and classification of 3D objects via visually grounded aggregation of VLM responses.
Rishabh Kabra, Loic Matthey, Alexander Lerchner, Niloy J. Mitra
SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition
A video scene model which separates the time-invariant, object-level contents of the scene from global time-varying elements such as viewpoint.
Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matt Botvinick, Alexander Lerchner, Chris Burgess