3D Reconstruction Techniques for Construction Sites

Problem Statement

Recent construction machinery offers semi-autonomous control, supported by high-precision localization sensors and control systems, together with data links for exchanging design plans and operational records with network servers. These machines, often referred to as ICT construction equipment, are increasingly fitted with RGB cameras for monitoring their surroundings. The cameras are used mainly for object detection and operator-assist warnings. Safer operation and full automation, however, require robust 3D scene understanding of the construction environment as a foundational technology.

Conventional 3D sensors such as LiDAR face challenges in cost and durability, both of which are critical for construction machinery. As a result, their adoption has remained limited.

Approach

We focus on two technological trends that have progressed rapidly in recent years.
The first is deep-learning-based 3D reconstruction from RGB images. Beyond monocular depth estimation (MDE), which infers 3D geometry from a single image, multi-view reconstruction yields more robust and consistent results by using multiple images of the same scene. The fusion of RGB imagery with LiDAR point clouds is also an active research area that aims to combine the strengths of both modalities.

The second trend is digital transformation (DX) in construction. Here, drone-based sensing with LiDAR and cameras enables the creation of digital twins of construction sites, which serve as a foundation for project optimization.

Our approach develops technologies that allow construction machinery equipped with RGB cameras to use external data, such as drone-captured RGB images and point clouds, to achieve a dynamic and comprehensive understanding of the construction environment.

Ongoing Work

3D reconstruction based solely on RGB images is inherently up to scale, because the images carry no absolute depth reference. To address this, we propose a model, XVADER, that incorporates metric depth priors from a pre-trained metric monocular depth estimation (Metric MDE) model into multi-view reconstruction, enabling metric-scale 3D estimation.
We are currently working to further improve reconstruction accuracy by fusing LiDAR point cloud data captured by drones. This integration aims to achieve the precision and robustness needed for practical deployment on construction machinery.
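
To illustrate the scale ambiguity mentioned above, the following is a minimal sketch of one simple way a metric depth prior can fix the scale of an up-to-scale depth estimate: a single global scale factor taken as the median ratio between the two. This is not the XVADER model, which incorporates the priors into the multi-view reconstruction itself; the function names and sample depth values are hypothetical and chosen only for illustration.

```python
from statistics import median


def estimate_metric_scale(relative_depths, metric_prior_depths, min_depth=1e-6):
    """Estimate a global scale s such that s * relative depth ~ metric prior.

    Both inputs hold depths for the same pixels. Pairs where either value is
    not strictly positive are ignored, and the median ratio is used so that a
    few outliers in the prior do not dominate the estimate.
    """
    ratios = [
        m / r
        for r, m in zip(relative_depths, metric_prior_depths)
        if r > min_depth and m > min_depth
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to estimate a scale from")
    return median(ratios)


def to_metric(relative_depths, scale):
    """Apply the estimated scale to every up-to-scale depth value."""
    return [scale * d for d in relative_depths]


# Hypothetical sample values: up-to-scale depths from a multi-view method and
# metric depths (in meters) from a monocular prior at the same pixels.
relative = [0.5, 1.0, 2.0, 4.0]
prior = [1.6, 3.0, 6.0, 12.0]  # the first prior value is slightly off

scale = estimate_metric_scale(relative, prior)
metric = to_metric(relative, scale)
print(scale, metric)
```

A single global scale is the simplest possible choice and assumes the up-to-scale reconstruction is otherwise consistent; per-region or learned corrections would be needed where that assumption fails.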

Figure 1 – Overview of ICT construction equipment. Source: Nakagawa, T. (2017). State-of-the-Art Construction Sites Realized with ICT Construction Machines.

Figure 2 – Overview of XVADER: 3D reconstruction from RGB images.

Relevant Publications

Students

Shun Sasaki