Seamless High-Resolution Terrain Reconstruction: A Prior-Based Vision Transformer Approach

Osher Rafaeli¹,* Tal Svoray¹ Ariel Nahlieli¹

¹Ben-Gurion University of the Negev

*Corresponding author: osher@bgu.ac.il

Abstract

High-resolution elevation data is essential for hydrological modeling, hazard assessment, and environmental monitoring; however, globally consistent, fine-scale Digital Elevation Models (DEMs) remain unavailable. Very high-resolution single-view imagery enables the extraction of topographic information at the pixel level, allowing the reconstruction of fine terrain details over large spatial extents. In this paper, we present a single-view DEM reconstruction method shown to support practical analysis in GIS environments across multiple sub-national jurisdictions. Specifically, we produce high-resolution DEMs for large-scale basins, representing a substantial improvement over the 30 m resolution of globally available Shuttle Radar Topography Mission (SRTM) data. The DEMs are generated using a prior-based monocular depth estimation (MDE) foundation model, extended in this work to the remote sensing height domain for high-resolution, globally consistent elevation reconstruction. We fine-tune the model by integrating low-resolution SRTM data as a global prior with high-resolution RGB imagery from the National Agriculture Imagery Program (NAIP), producing DEMs with near LiDAR-level accuracy. Our method achieves a 100× resolution enhancement (from 30 m to 30 cm), exceeding existing super-resolution approaches by an order of magnitude. Across two diverse landscapes, the model generalizes robustly, resolving fine-scale terrain features with a mean absolute error of less than 5 m relative to LiDAR and improving upon SRTM by up to 18%. Hydrological analyses at both catchment and hillslope scales confirm the method's utility for hazard assessment and environmental monitoring, demonstrating improved streamflow representation and catchment delineation. Finally, we demonstrate the scalability of the framework by applying it across large geographic regions.
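The accuracy figures above can be made concrete with a small sketch. It assumes that the "up to 18%" improvement is the relative reduction in mean absolute error (MAE) against LiDAR when moving from SRTM to the predicted DEM; the page does not spell out the formula, so this reading, the function names, and the toy numbers are all illustrative assumptions.

```python
import numpy as np

def mae(pred: np.ndarray, ref: np.ndarray) -> float:
    """Mean absolute error in metres, ignoring no-data (NaN) pixels."""
    valid = np.isfinite(pred) & np.isfinite(ref)
    return float(np.mean(np.abs(pred[valid] - ref[valid])))

def relative_improvement(mae_baseline: float, mae_model: float) -> float:
    """Percentage reduction in MAE relative to the baseline (assumed definition)."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline

# Toy, made-up rasters (not results from the paper): LiDAR reference at 0 m,
# a baseline that is off by 5 m everywhere, a prediction that is off by 4.1 m.
lidar = np.zeros((4, 4))
srtm_on_fine_grid = np.full((4, 4), 5.0)
predicted = np.full((4, 4), 4.1)

gain = relative_improvement(mae(srtm_on_fine_grid, lidar), mae(predicted, lidar))
```

With these toy values the baseline MAE is 5 m, the model MAE is 4.1 m, and `gain` comes out at 18%.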

Key Features

100× Resolution Enhancement

We enhance the spatial resolution of predicted DEMs by a factor of 100, from 30 m to 30 cm, surpassing existing super-resolution approaches by an order of magnitude.
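As a quick check of what the factor of 100 means on the grid, the arithmetic below works in whole centimetres to avoid floating-point rounding. The constant names are illustrative.

```python
SRTM_CELL_CM = 3000    # 30 m SRTM cell edge, in centimetres
OUTPUT_CELL_CM = 30    # 30 cm output cell edge, in centimetres

# Linear scale factor between the two grids.
scale_factor = SRTM_CELL_CM // OUTPUT_CELL_CM

# Number of output pixels that fall inside one SRTM cell (factor squared).
pixels_per_srtm_cell = scale_factor ** 2

print(scale_factor, pixels_per_srtm_cell)  # 100 10000
```

Each SRTM cell therefore covers 10,000 output pixels, so the fine detail within a cell has to come from the RGB imagery rather than from the prior.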

Global Prompting

We leverage freely available SRTM DEMs as absolute-height prompts, ensuring a globally consistent elevation context.
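To illustrate the idea of an absolute-height prompt, the sketch below resamples a coarse SRTM tile onto a finer grid so that it could sit alongside the RGB pixels. It assumes the prior is brought to the image resolution with separable linear (bilinear) interpolation; the page does not state how the prior is actually resampled or fed to the model, and `upsample_prior` is a hypothetical helper name.

```python
import numpy as np

def upsample_prior(coarse: np.ndarray, factor: int) -> np.ndarray:
    """Bilinearly resample a coarse DEM tile by an integer factor (illustrative)."""
    rows, cols = coarse.shape
    # Centres of the fine pixels, expressed in coarse-pixel index coordinates.
    fine_r = (np.arange(rows * factor) + 0.5) / factor - 0.5
    fine_c = (np.arange(cols * factor) + 0.5) / factor - 0.5
    # Interpolate along columns first; np.interp clamps outside the data range,
    # which replicates the edge values at the tile border.
    along_cols = np.stack(
        [np.interp(fine_c, np.arange(cols), row) for row in coarse]
    )
    # Then interpolate along rows, one fine column at a time.
    return np.stack(
        [np.interp(fine_r, np.arange(rows), col) for col in along_cols.T],
        axis=1,
    )

# Toy 2 x 2 tile (metres); factor 100 mirrors the 30 m to 30 cm ratio.
prior = upsample_prior(np.array([[0.0, 30.0], [0.0, 30.0]]), factor=100)
```

The resampled prior is smooth and carries only the coarse absolute elevation; in the approach described here, the fine-scale relief is expected to come from the imagery.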

Seamless Terrain Products

We blend patch-wise Vision Transformer predictions into seamless mosaics that are ready for slope, aspect, and flow-routing analyses.
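A common way to turn overlapping patch predictions into a seamless mosaic is a weighted average in which each patch counts most at its centre and least at its edges. The sketch below shows that generic feathering scheme; the page does not describe the blending actually used, so the triangular window and the name `blend_patches` are assumptions.

```python
import numpy as np

def blend_patches(patches, origins, mosaic_shape, patch_size):
    """Feather overlapping square patches into one mosaic (illustrative)."""
    # 1-D triangular ramp, e.g. [1, 2, 2, 1] for patch_size 4; always positive.
    ramp = np.minimum(
        np.arange(1, patch_size + 1), np.arange(patch_size, 0, -1)
    ).astype(float)
    window = np.outer(ramp, ramp)  # 2-D weight, highest at the patch centre

    acc = np.zeros(mosaic_shape)   # weighted sum of predictions
    wsum = np.zeros(mosaic_shape)  # sum of weights
    for patch, (r, c) in zip(patches, origins):
        acc[r:r + patch_size, c:c + patch_size] += patch * window
        wsum[r:r + patch_size, c:c + patch_size] += window

    # Pixels that no patch covers are returned as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(wsum > 0, acc / wsum, np.nan)

# Two toy 4 x 4 patches that overlap by two columns.
mosaic = blend_patches(
    patches=[np.full((4, 4), 10.0), np.full((4, 4), 20.0)],
    origins=[(0, 0), (0, 2)],
    mosaic_shape=(4, 6),
    patch_size=4,
)
```

In the overlap the mosaic moves gradually from the first patch's value to the second's instead of jumping at a seam, which matters for slope and flow-routing products because they are computed from local elevation differences.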

Resource-Efficient

The pipeline processes approximately 150 km² per hour on a single GPU and achieves up to an 18% improvement in vertical accuracy over the original SRTM dataset.

Visual Preview

Urban: RGB, Elevation, Aspect, Hillshade, Slope

Vegetated: RGB, Elevation, Aspect, Hillshade, Slope

Bare: RGB, Elevation, Aspect, Hillshade, Slope
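The slope, aspect, and hillshade panels are standard derivatives of an elevation raster. The sketch below shows one conventional way to compute them with NumPy finite differences. It assumes a north-up raster (row 0 is the northern edge) with square cells and uses the usual hillshade formula with the sun at azimuth 315° and altitude 45°; the tools and parameters used to render the panels on this page are not stated, and `terrain_derivatives` is a hypothetical name.

```python
import numpy as np

def terrain_derivatives(dem, cell_size=0.3, sun_azimuth=315.0, sun_altitude=45.0):
    """Return slope (deg), aspect (deg clockwise from north), hillshade (0..1)."""
    # np.gradient returns derivatives along rows (southward) and columns (eastward).
    d_row, d_col = np.gradient(dem, cell_size)
    dz_dx = d_col    # elevation change per metre towards the east
    dz_dy = -d_row   # elevation change per metre towards the north

    slope = np.arctan(np.hypot(dz_dx, dz_dy))
    # Aspect is the compass bearing of the downslope direction.
    # It is not meaningful on perfectly flat cells.
    aspect = np.arctan2(-dz_dx, -dz_dy)

    zenith = np.radians(90.0 - sun_altitude)
    azimuth = np.radians(sun_azimuth)
    shade = (np.cos(zenith) * np.cos(slope)
             + np.sin(zenith) * np.sin(slope) * np.cos(azimuth - aspect))

    return np.degrees(slope), np.degrees(aspect) % 360.0, np.clip(shade, 0.0, 1.0)

# Toy DEM: a plane rising 1 m per metre towards the east.
east_rising = np.tile(np.arange(5) * 0.3, (5, 1))
slope_deg, aspect_deg, shade = terrain_derivatives(east_rising)
```

For this toy plane the slope is 45° everywhere and the aspect is 270°, since the surface faces west.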

Acknowledgment

We thank the Ministry of Agriculture Chief Scientist (grant 16-17-0005, 2022) and the Negev Scholarship of the Kreitman School, Ben-Gurion University of the Negev, for supporting Osher Rafaeli’s PhD studies.

Cite Us


@misc{rafaeli2025prompt2demhighresolutiondemsurban,
      title={Prompt2DEM: High-Resolution DEMs for Urban and Open Environments from Global Prompts Using a Monocular Foundation Model}, 
      author={Osher Rafaeli and Tal Svoray and Ariel Nahlieli},
      year={2025},
      eprint={2507.09681},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.09681}, 
}
