Research
I am interested in Computer Vision, Robotics, and Deep Learning in general.
My current research aims to improve the pre-training of Vision Language Models (VLMs)
and to infuse spatial understanding into these models.
I also have a strong track record and experience in deep-learning-based local feature
detection and description, and their applications to visual localization and
sparse 3D reconstruction.
CoPE-VideoLM: Leveraging Codec Primitives For Efficient Video Language Modeling
Sayan Deb Sarkar, Rémi Pautrat, Ondrej Miksik, Marc Pollefeys, Iro Armeni, Mahdi Rad*, Mihai Dusmanu*
arXiv, 2026
project page / arXiv
CoPE-VideoLM leverages codec primitives to efficiently process videos for large video language models.
LightGlueStick: a Fast and Robust Glue for Joint Point-Line Matching
Aidyn Ubingazhibov, Rémi Pautrat, Iago Suárez, Shaohui Liu, Marc Pollefeys, Viktor Larsson
International Conference on Computer Vision (ICCV) Workshop, 2025
code / arXiv
A fast joint point-line matcher with graph neural networks.
Relative Pose Estimation through Affine Corrections of Monocular Depth Priors
Yifan Yu, Shaohui Liu, Rémi Pautrat, Marc Pollefeys, Viktor Larsson
Conference on Computer Vision and Pattern Recognition (CVPR), 2025 (Highlight)
code / arXiv
New solvers for relative pose estimation leveraging monocular depth priors and affine corrections.
Robust Incremental Structure-From-Motion With Hybrid Features
Shaohui Liu*, Yidan Gao*, Tianyi Zhang*, Rémi Pautrat, Johannes Schönberger, Viktor Larsson, Marc Pollefeys
European Conference on Computer Vision (ECCV), 2024
code / arXiv
An open-source system that runs Structure-from-Motion (SfM) with point, line, and vanishing point features.
3D Neural Edge Reconstruction
Lei Li, Songyou Peng, Zehao Yu, Shaohui Liu, Rémi Pautrat, Xiaochuan Yin, Marc Pollefeys
Conference on Computer Vision and Pattern Recognition (CVPR), 2024
project page / video / code / arXiv
A learned approach to estimate an edge density and extract 3D edges from a scene.
Handbook on Leveraging Lines for Two-View Relative Pose Estimation
Petr Hruby, Shaohui Liu, Rémi Pautrat, Marc Pollefeys, Dániel Béla Baráth
International Conference on 3D Vision (3DV), 2024 (Spotlight)
arXiv
A complete classification of all solvers for relative pose estimation based on point and line correspondences.
GlueStick: Robust Image Matching by Sticking Points and Lines Together
Rémi Pautrat*, Iago Suárez*, Yifan Yu, Marc Pollefeys, Viktor Larsson
International Conference on Computer Vision (ICCV), 2023
project page / code / arXiv
A joint point-line matcher with graph neural networks.
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction
Rémi Pautrat, Shaohui Liu, Petr Hruby, Marc Pollefeys, Dániel Béla Baráth
International Conference on Computer Vision (ICCV), 2023
project page / code / arXiv
Solvers to extract the three orthogonal vanishing points of an uncalibrated image (i.e., with unknown focal length), given a prior on the gravity direction.
3D Line Mapping Revisited
Shaohui Liu, Yifan Yu, Rémi Pautrat, Marc Pollefeys, Viktor Larsson
Conference on Computer Vision and Pattern Recognition (CVPR), 2023 (Highlight)
project page / code / arXiv
An open-source system that robustly and efficiently constructs 3D line maps from multi-view images.
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
Rémi Pautrat, Dániel Béla Baráth, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
project page / video / code / arXiv
A generic line detector that combines the robustness of deep learning with the accuracy of handcrafted detectors.
SOLD²: Self-supervised Occlusion-aware Line Description and Detection
Rémi Pautrat*, Juan-Ting Lin*, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
Conference on Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)
project page / video / code / arXiv
A deep line detector and descriptor able to match partially occluded line segments.
Online Invariance Selection for Local Feature Descriptors
Rémi Pautrat, Viktor Larsson, Martin R. Oswald, Marc Pollefeys
European Conference on Computer Vision (ECCV), 2020 (Oral)
project page / teaser / oral / code / arXiv
A learned feature descriptor able to adapt its invariance to illumination and rotation at matching time.
Object Finding in Cluttered Scenes Using Interactive Perception
Tonci Novkovic*, Rémi Pautrat*, Fadri Furrer, Michel Breyer, Roland Siegwart, Juan Nieto
International Conference on Robotics and Automation (ICRA), 2020
project page / teaser / oral / arXiv
We leverage reinforcement learning and computer vision to perform interactive perception:
a robot manipulator has to find a hidden target object in a scene by interacting with
its environment.
Bayesian Optimization with Automatic Prior Selection for Data-Efficient Direct Policy Search
Rémi Pautrat, Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret
International Conference on Robotics and Automation (ICRA), 2018
video / arXiv
We propose a new acquisition function for Bayesian Optimization that combines the
likelihood of prior information with the expected improvement. We apply it to the
task of damage recovery in robotics and automatic adaptation to new environments.