Low Feature 3D Reconstruction Benchmark
Introduction
This data set is designed to evaluate the accuracy and robustness of online 3D reconstruction frameworks. It provides data with different levels of sensor noise, scenes with few depth features, and ground truth for ego motion and geometry. The low-feature depth data was acquired with two Kinect sensors: the Structured Light based Kinect™ (Kinect-SL) and the Time-of-Flight based Kinect™ (Kinect-ToF).
We provide the following scenes, some acquired in a registered manner using a turn-table and some acquired in free-hand mode:
LEGO™-PAMI-TT:
- A scene constructed with LEGO™ pieces comprising the letter sequence "IEEE PAMI". We use a rigid setup in which the camera is fixed about 80 cm above the target scene. The scene is mounted on a turn-table controlled by a precise stepper motor, which yields ground truth camera poses. The ground plate measures 80×80 cm.
The turn-table makes a complete 360° revolution, yielding 1601 acquired frames, where the last frame corresponds to the initial turn-table position.
We use two different scales of the "IEEE PAMI" scene: LEGO™-PAMI-TT× 1, built with a single block width and height, and a second version, LEGO™-PAMI-TT× 2, uniformly scaled by a factor of 2.
LEGO™-PAMI-Free:
- The same two scenes as in LEGO™-PAMI-TT, acquired with free-hand, uncontrolled camera motion at a distance of about 50-100 cm from the target, yielding some 1100 frames. In this setup, only geometric ground truth is available.
Stone-Wall:
- This data set, presented by Zhou and Koltun, has approximate dimensions of 5.8×2.8×0.7 m. It comprises some 2700 input frames acquired with an Asus Xtion Pro Live range camera and includes a prominent loop closure.
The data set and the reconstruction based on their global optimization approach are provided here.
Brick-Wall:
- This scene comprises a nearly planar wall with very thin depth features (< 4 mm) present only at the wall's brick interstices. The wall was acquired using a hand-guided range camera at a distance of about 50-100 cm, yielding some 800 depth frames. The scene covers an area of approximately 1.80×1.70 m.
Neither ground truth camera poses nor ground truth geometry is available.
Each of our sequences comprises:
- A list of depth image files with associated camera intrinsic parameters.
- Camera positions, extracted for each turn-table scene.
- Reconstructed (original) geometry including all surface attributes, i.e., position, radius, normal, and curvature (first principal curvature direction and curvature values), reconstructed with our algorithm for all scenes. Furthermore, for the LEGO™ scenes we provide:
- groundtruth geometry built with the LEGO™ designer software, and
- reconstructed (registered) point cloud geometry (only position and normal) registered with its corresponding groundtruth geometry.
Data Format
All scenes include a reconstructed model in .ply file format (plus the ground truth model for the LEGO™ sequences), a depth video sequence stored as 16-bit single-channel PNG files (values in millimeters), and a computed camera trajectory (plus the ground truth camera path for the turn-table sequences).
Each depth video sequence was captured with a near-range Kinect Structured-Light (PC version) at VGA resolution (640x480) and with a Kinect v2 prototype at a custom resolution (524x412). Both sensors recorded at full speed (30 Hz), except for the turn-table sequences, which were captured in precise stop motion to obtain exact ground truth ego motion and to avoid motion artifacts in the data.
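As a rough illustration of the depth encoding described above, the following Python sketch converts one raw depth value (in millimeters) into a 3D point in the camera frame using a pinhole model. The intrinsic values are the sample values shown for intrinsic.txt on this page. The function name `backproject`, the treatment of a stored 0 as a missing measurement, and the omission of lens distortion are assumptions made for this example, not part of the data set specification.

```python
# Sketch: back-project one depth pixel (16-bit value in millimeters) to a
# 3D point in the camera frame with a pinhole model. Distortion is ignored.

# Sample intrinsics copied from the intrinsic.txt example on this page.
FX, FY = 583.45663288636297, 583.45663288636297  # focal lengths in pixels
CX, CY = 318.58229917101085, 251.55208565453225  # principal point in pixels


def backproject(u, v, depth_mm, fx=FX, fy=FY, cx=CX, cy=CY):
    """Return (x, y, z) in meters for pixel (u, v), or None if depth is invalid.

    Assumption: a stored value of 0 (or less) marks a missing measurement.
    """
    if depth_mm <= 0:
        return None
    z = depth_mm / 1000.0      # millimeters -> meters
    x = (u - cx) * z / fx      # horizontal offset from the principal point
    y = (v - cy) * z / fy      # vertical offset from the principal point
    return (x, y, z)


if __name__ == "__main__":
    # A pixel near the image center, observed at 800 mm.
    print(backproject(320, 240, 800))
```

Applying the function to every pixel of a decoded PNG frame yields a point cloud for that frame; decoding the PNG itself is left to an image library of your choice.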
- Camera intrinsic parameters are provided as an intrinsic.txt file in each sequence folder, in the following form:
fx: 583.45663288636297 // focal-length in pixels (x-axis)
fy: 583.45663288636297 // focal-length in pixels (y-axis)
cx: 318.58229917101085 // principal point in pixels (x-axis)
cy: 251.55208565453225 // principal point in pixels (y-axis)
depthScaling: 1.000 // not used for this data set
depthFactor: 1.000 // not used for this data set
d0: -0.073768415722531969 // distortion coefficients k_1 (OpenCV format)
d1: 0.16408477247405578 // distortion coefficients k_2 (OpenCV format)
d2: 0.0 // distortion coefficients p_1 (OpenCV format)
d3: 0.0 // distortion coefficients p_2 (OpenCV format)
d4: 0.0 // distortion coefficients k_3 (OpenCV format)
- Camera trajectories are given in a cameraPoses.txt file in each sequence folder, in the following form:
0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
1 0.999 -0.003 0.001 0 0.003 0.999 0.000 0 -0.001 -0.000 0.999 0 -0.001 -0.000 0.000 1
2 0.999 -0.005 0.002 0 0.005 0.999 0.000 0 -0.002 -0.000 0.999 0 -0.001 -0.000 0.000 1
3 0.999 -0.010 0.003 0 0.010 0.999 0.000 0 -0.003 -0.000 0.999 0 -0.002 -0.000 0.000 1
4 0.999 -0.016 0.004 0 0.016 0.999 0.000 0 -0.004 -0.000 0.999 0 -0.003 -0.000 0.000 1
5 0.999 -0.021 0.005 0 0.021 0.999 0.000 0 -0.005 -0.000 0.999 0 -0.004 -0.000 0.000 1
6 0.999 -0.024 0.005 0 0.024 0.999 0.000 0 -0.005 -0.000 0.999 0 -0.004 -0.000 0.000 1
7 0.999 -0.027 0.005 0 0.027 0.999 0.000 0 -0.005 -0.000 0.999 0 -0.005 -0.000 0.000 1
8 0.999 -0.031 0.006 0 0.031 0.999 0.000 0 -0.006 -0.000 0.999 0 -0.006 -0.000 0.000 1
9 0.999 -0.035 0.007 0 0.035 0.999 0.000 0 -0.007 -0.000 0.999 0 -0.007 -0.000 0.000 1
The first value is the frame index. The remaining 16 values are the 4×4 transformation matrix T (in column-major order) that maps camera coordinates to world coordinates.
- The ego motion ground truth is not explicitly given in a file. It is only valid for the turn-table sequences, where the camera path should describe a perfect circle. A MATLAB code file will also be provided to compare the current cameraPoses.txt file to the ground truth. In practice, a plane is first fitted to the path of the camera centers using the RANSAC algorithm. A circle is then fitted, and the ground truth ego motion is generated from the stepper motor angle of each frame.
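The pose format and the ground truth generation described above can be sketched in Python as follows. This is not the MATLAB code mentioned above, only a minimal illustration under stated assumptions. The RANSAC plane fit and the circle fit are not implemented; their results (the circle center `center`, the unit plane normal `axis`, and the first camera center `start`) are hypothetical inputs. The step count of 1600 per revolution is inferred from the 1601 turn-table frames whose last frame coincides with the first, and the counter-clockwise rotation direction is an assumption (negate `axis` for the opposite direction). All function names are illustrative.

```python
import math


def parse_pose_line(line):
    """Parse 'index v0 ... v15' into (index, 4x4 matrix as nested lists).

    The 16 values are stored column-major, so value k belongs to
    row k % 4 and column k // 4.
    """
    tokens = line.split()
    if len(tokens) != 17:
        raise ValueError("expected a frame index followed by 16 values")
    index = int(tokens[0])
    vals = [float(t) for t in tokens[1:]]
    matrix = [[vals[c * 4 + r] for c in range(4)] for r in range(4)]
    return index, matrix


def camera_center(matrix):
    """Camera center in world coordinates: the translation column of T."""
    return (matrix[0][3], matrix[1][3], matrix[2][3])


def rotate_about_axis(p, center, axis, angle):
    """Rotate point p by angle (radians) about the line through `center`
    with unit direction `axis`, using Rodrigues' rotation formula."""
    v = [p[i] - center[i] for i in range(3)]
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(axis[i] * v[i] for i in range(3))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(center[i] + v[i] * c + cross[i] * s + axis[i] * dot * (1 - c)
                 for i in range(3))


def ground_truth_center(frame, start, center, axis, steps_per_turn=1600):
    """Ideal camera center for a frame: `start` rotated by the motor angle.

    Assumption: equal angular steps, 1600 steps per 360 degree revolution.
    """
    angle = 2.0 * math.pi * frame / steps_per_turn
    return rotate_about_axis(start, center, axis, angle)


def position_error(matrix, frame, start, center, axis):
    """Euclidean distance between an estimated and an ideal camera center."""
    est = camera_center(matrix)
    ref = ground_truth_center(frame, start, center, axis)
    return math.sqrt(sum((est[i] - ref[i]) ** 2 for i in range(3)))


if __name__ == "__main__":
    line = ("1 0.999 -0.003 0.001 0 0.003 0.999 0.000 0 "
            "-0.001 -0.000 0.999 0 -0.001 -0.000 0.000 1")
    frame, T = parse_pose_line(line)
    # Hypothetical fit results, for illustration only.
    print(position_error(T, frame, (0.0, 0.0, 0.0), (0.5, 0.0, 0.0),
                         (0.0, 0.0, 1.0)))
```

With 1600 steps per revolution, consecutive frames are 0.225° apart, so frame 400 corresponds to a quarter turn and frame 1600 returns to the starting position.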
Download
Video
Publication