Welcome to the Old 4D Light Field Benchmark.

A complete description of the datasets and the acquisition process is available in the VMV 2013 paper:
“Datasets and Benchmarks for Densely Sampled 4D Light Fields”


[Download all datasets]


Blender scenes

Conversion between depth and disparity. To compare disparity results to the ground truth depth, the latter first has to be converted to disparity. Given a depth Z, the disparity d (the slope of the epipolar lines, in pixels per grid unit) is

d = (B · f) / Z − Δx,

where B is the baseline, i.e. the distance between two cameras, f the focal length in pixels, and Δx the shift between two neighboring images relative to an arbitrary rectification plane (for light fields generated with Blender, this plane passes through the scene origin). The parameters in the above equation are given by the following attributes in the main HDF5 file:

Symbol   Attribute   Description
B        dH          distance between two cameras
f        f           focal length in pixels
Δx       shift       shift between neighboring images
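As a minimal sketch of this conversion (only the formula and the attribute names come from the text above; how the attributes are read from the HDF5 file, e.g. via h5py, is an assumption, and the numeric values are made up):

```python
import numpy as np

def depth_to_disparity(Z, B, f, shift):
    """Convert ground-truth depth Z to disparity in pixels per grid unit.

    B     -- baseline (HDF5 attribute 'dH'), distance between two cameras
    f     -- focal length in pixels (HDF5 attribute 'f')
    shift -- shift between neighboring images (HDF5 attribute 'shift')
    """
    return B * f / Z - shift

# Illustrative values, not taken from any benchmark scene:
Z = np.array([150.0, 200.0, 300.0])      # ground-truth depth values
d = depth_to_disparity(Z, B=0.5, f=800.0, shift=1.5)
```

With h5py, reading the attributes would look like `B = h5file.attrs['dH']`, assuming they are stored on the root group of the file.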


Conversion between Blender depth units and disparity. The above HDF5 camera attributes in the main file for conversion from Blender depth units to disparity are calculated from the Blender parameters via

f = w / (2 · tan(fov / 2)),    B = b,    Δx = (B · f) / Z0,

where w is the image width in pixels, Z0 is the distance between the Blender camera and the scene origin in [BE] (Blender units), fov is the field of view in radians, and b the distance between two cameras in [BE]. Since all light fields are rendered or captured on a regular, equidistant grid, it is sufficient to use only the horizontal distance between two cameras to define the baseline.
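A short sketch of these relations (the image width w in pixels is an assumption needed to express the focal length in pixels; all numbers are made up):

```python
import math

def blender_camera_attributes(w, fov, b, Z0):
    """Compute the HDF5 camera attributes from Blender parameters.

    w   -- image width in pixels (assumed square pixels)
    fov -- field of view in radians
    b   -- distance between two cameras in Blender units [BE]
    Z0  -- distance from the camera to the scene origin in [BE]
    """
    f = w / (2.0 * math.tan(fov / 2.0))  # focal length in pixels
    B = b                                # baseline
    shift = B * f / Z0                   # shift relative to the scene origin
    return B, f, shift

# With fov = 90 degrees, tan(fov / 2) = 1, so f = w / 2:
B, f, shift = blender_camera_attributes(w=768, fov=math.pi / 2, b=0.05, Z0=2.0)
```

A point at depth Z0 then gets disparity B·f/Z0 − Δx = 0, which matches the choice of the scene origin as the rectification plane.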


Buddha 2







Gantry Acquisitions

For the real-world light fields, a Nikon D800 digital camera is mounted on a stepper-motor-driven gantry manufactured by Physical Instruments. A picture of the setup can be seen in figure 4. The accuracy and repositioning error of the gantry are well within the micrometer range. The capture time for a complete light field depends on the number of images; about 15 seconds are required per image. As a consequence, this acquisition method is limited to static scenes. The internal camera matrix must be estimated beforehand by capturing images of a calibration pattern and invoking the camera calibration algorithms of the OpenCV library; see the next section for details. Experiments have shown that the positioning accuracy of the gantry actually surpasses the pattern-based external calibration, as long as the differences between the sensor and movement planes are kept minimal.







Ground Truth

Ground truth for the real-world scenes was generated using standard pose estimation techniques. First, we acquired 3D polygon meshes for an object in the scene using a Breuckmann SmartscanHE structured light scanner. The meshes contain between 2.5 and 8 million faces, with a stated accuracy of down to 50 microns. The object-to-camera pose was estimated by hand-picking 2D-to-3D feature correspondences between the light field center view and the 3D mesh, and then calculating the external camera matrix using an iterative Levenberg-Marquardt approach from the OpenCV library [Bra00]. This method is used for both the internal and external calibration.
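The pinhole model that this pose estimation inverts can be sketched as follows; OpenCV's `cv2.solvePnP` with its iterative Levenberg-Marquardt solver recovers R and t from such 2D-to-3D correspondences. All numbers below are illustrative, not taken from the benchmark:

```python
import numpy as np

# Pinhole model used for pose estimation: x ~ K [R | t] X.
K = np.array([[800.0,   0.0, 384.0],
              [  0.0, 800.0, 384.0],
              [  0.0,   0.0,   1.0]])   # internal camera matrix (from calibration)
R = np.eye(3)                           # external rotation (identity for simplicity)
t = np.array([0.0, 0.0, 2.0])           # external translation

def project(X):
    """Project a 3D mesh point X into the center view, in pixels."""
    x_cam = R @ X + t                   # mesh/world -> camera coordinates
    x_img = K @ x_cam                   # camera -> homogeneous image coordinates
    return x_img[:2] / x_img[2]         # dehomogenize to pixel coordinates

# A hand-picked 3D feature point and its predicted 2D location:
X = np.array([0.1, -0.05, 0.0])
u, v = project(X)
```

Pose estimation runs this mapping in reverse: given several (X, (u, v)) pairs, it finds the R and t that minimize the reprojection error.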

[Bra00] Bradski G.: The OpenCV Library. Dr. Dobb’s Journal of Software Tools (2000).