Computer Vision / Machine Learning Intern - Ground-to-Aerial Registration

Job description

Maps rule the world. They inform and underpin some of the most crucial decisions that are made in society—but the world is moving so fast that maps everywhere get outdated quicker than any single player can update them. That’s where Mapillary comes in. We allow anyone, anywhere to update maps at scale, using nothing but cameras. We apply computer vision to street-level imagery to generate map data at scale so that people and organizations everywhere can build better maps.


We are looking for an intern to join our computer vision team at Mapillary. You will be working with a team of engineers and researchers with extensive experience in computer vision and deep learning.


The goal of the project is to develop methods for registering Mapillary’s street-level images with aerial images. 

Mapillary’s street-level images are tagged with their GPS positions, which are the only source of global positioning available. The accuracy of the GPS varies between 1 and 30 meters depending on the device used and the capture conditions. Mapillary currently uses Structure from Motion to register multiple images of the same area together, effectively fusing information from their GPS positions and reducing the location uncertainty. To improve the global positioning further, aerial images can be used as an additional source of global positioning, provided that one can register the ground images to them. Registering ground and aerial images is challenging because of the large change of perspective: for example, while we see building facades from the ground, we only see the roofs from the sky.


In this project, we will explore techniques based on deep learning and semantics to register ground and aerial images. These can include end-to-end methods that predict the registration directly from the images, as well as methods that exploit the geometry and semantics of the scene to find correspondences between the two views.


This is a challenging, unsolved problem, and the results of the project could also lead to a publication at a high-impact computer vision conference.


  • PhD student or MSc in computer science, computer vision, or a related field, with experience in deep learning or 3D reconstruction

  • Self-motivated and ready to lead project development  

  • Strong in Python and C/C++, and experienced in PyTorch

  • Previous publications at any of CVPR, ICCV, ECCV, NeurIPS, or ICML are a plus

The internship period is 3 months, with the possibility to extend if needed. When you apply, please include your CV and a short description of your previous work, e.g. links to your GitHub account or papers.


We're a remote-first company, so you can apply to this position from anywhere in the world. Become part of our mission to help others visualize and understand the world!