Light Field Cameras for 3D Capture and Reconstruction (Part 1 of 2)

Plenoptic cameras, or light field cameras, use an array of individual lenses (a microlens array) to capture the 4D light field of a scene. This lens arrangement means that multiple light rays can be associated with each sensor pixel, and synthetic cameras (created in software) can then process that information.

Phew, that's a mouthful, right? It's actually easier to visualize.

Image from a Raytrix GmbH presentation delivered at NVIDIA GTC 2012

This light field information can be used to address a variety of computer vision challenges: for example, refocusing images after they are captured, substantially improving low-light performance while maintaining an acceptable signal-to-noise ratio, or creating a 3D depth map of a scene. Of course, the plenoptic approach is not restricted to still images; plenoptic video cameras (with a corresponding increase in captured data) have been developed as well.
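To make the refocusing idea concrete, here is a minimal sketch of the classic "shift and add" synthetic-aperture approach, assuming the light field has already been decoded into a 4D NumPy array indexed as L[u, v, s, t] (a hypothetical layout where (u, v) select the sub-aperture view and (s, t) the pixels within each view). Real pipelines use sub-pixel interpolation rather than the integer shifts used here:

```python
import numpy as np

def refocus(light_field: np.ndarray, alpha: float) -> np.ndarray:
    """Render a synthetic photograph focused at the plane implied by alpha.

    light_field has shape (n_u, n_v, n_s, n_t). Each sub-aperture view is
    shifted in proportion to its offset from the central view, then all
    views are averaged; alpha = 1.0 reproduces the nominal focal plane.
    """
    n_u, n_v, n_s, n_t = light_field.shape
    cu, cv = (n_u - 1) / 2.0, (n_v - 1) / 2.0
    image = np.zeros((n_s, n_t), dtype=np.float64)
    for u in range(n_u):
        for v in range(n_v):
            # Standard refocus parameterization: views farther from
            # the center of the lens array shift more.
            du = (1.0 - 1.0 / alpha) * (u - cu)
            dv = (1.0 - 1.0 / alpha) * (v - cv)
            image += np.roll(light_field[u, v],
                             (int(round(du)), int(round(dv))), axis=(0, 1))
    return image / (n_u * n_v)
```

Sweeping alpha over a range of values produces a focal stack, which is also the starting point for some of the depth-estimation approaches discussed below.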

The underlying algorithms and concepts behind a plenoptic camera have been around for quite some time. A great technical backgrounder on this technology can be found in Dr. Ren Ng's 2005 Stanford publication titled Light Field Photography with a Hand-Held Plenoptic Camera. He reviews the (then) current state of the art before proposing his solution targeted at synthetic image formation. Dr. Ng ultimately went on to commercialize his research by founding Lytro, which I discuss later. Another useful backgrounder is the technical presentation prepared by Raytrix (profiled below) and delivered at the NVIDIA GPU Technology Conference 2012.

In late 2010, at the NVIDIA GPU Technology Conference, Adobe demonstrated a plenoptic camera system (hardware and software) they had been working on. While dated, it is a useful video to watch, as it explains both the hardware and software technologies involved in light field imaging as well as the computing horsepower required. Finally, another interesting source of information and recent news on developments in light field technology is the Light Field Forum.

Light field cameras have only become truly practical because of advances in lens and sensor manufacturing techniques coupled with the massive computational horsepower unlocked by GPU-compute-based solutions. To me, light field cameras represent a very interesting step in the evolution of digital imaging, which until now has largely been focused on improving what was a typical analog workflow.

Light Field Cameras and 3D Reconstructions

Much of the recent marketing around the potential of plenoptic synthetic cameras focuses on the ability of a consumer to interact with and share images in an entirely different fashion (i.e., changing the focal point of a captured scene). While that is certainly interesting in its own right, I am personally much more excited about the potential of extracting depth map information from light field cameras, and then using that depth map to create 3D surface reconstructions.

Pelican Imaging (profiled below) recently published a paper at SIGGRAPH Asia 2013 detailing exactly that: the creation of a depth map, which was then surfaced and manipulated, using their own plenoptic hardware and software solution called the PiCam. The paper is published in full on the Pelican Imaging site; see especially pages 10-12.
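As a rough illustration of the reconstruction step, here is a minimal sketch of back-projecting a depth map into a 3D point cloud using an assumed pinhole camera model (the intrinsics fx, fy, cx, cy are hypothetical inputs, not values from the PiCam paper). Surfacing the resulting cloud, for example with Poisson reconstruction, would be a separate step:

```python
import numpy as np

def depth_to_points(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Back-project an HxW depth map (in meters) into an Nx3 point cloud."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (us - cx) * z / fx
    y = (vs - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no valid depth
```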

There is a lot of ongoing research in this space; some of it uses traditional stereo imaging methods acting on the data generated from the plenoptic lens array, while other work takes entirely different technical approaches to depth map extraction. A very interesting paper presented at ICCV 2013 in early December, titled Depth from Combining Defocus and Correspondence Using Light Field Cameras and authored by researchers from the University of California, Berkeley and Adobe, proposes a novel method for extracting depth data from light field cameras by combining two methods of depth estimation. The authors have made their sample code and representative examples available, and note in the Introduction:

The images in this paper were captured from a single passive shot of the $400 consumer Lytro camera in different scenarios, such as high ISO, outdoors and indoors. Most other methods for depth acquisition are not as versatile or too expensive and difficult for ordinary users; even the Kinect is an active sensor that does not work outdoors. Thus, we believe our paper takes a step towards democratizing creation of depth maps and 3D content for a range of real-world scenes.
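The paper itself combines the two cues with per-pixel confidences in a global optimization, but the basic intuition can be sketched in a few lines: for each candidate depth (refocus parameter alpha), a defocus cue measures how sharp the refocused image is, while a correspondence cue measures how well the shifted sub-aperture views agree. The sketch below is a simplified illustration under those assumptions, not the authors' implementation, and reuses the shift logic from the earlier refocusing example:

```python
import numpy as np

def shifted_views(light_field: np.ndarray, alpha: float):
    """Yield each sub-aperture view shifted toward the focal plane alpha."""
    n_u, n_v, _, _ = light_field.shape
    cu, cv = (n_u - 1) / 2.0, (n_v - 1) / 2.0
    for u in range(n_u):
        for v in range(n_v):
            du = (1.0 - 1.0 / alpha) * (u - cu)
            dv = (1.0 - 1.0 / alpha) * (v - cv)
            yield np.roll(light_field[u, v],
                          (int(round(du)), int(round(dv))), axis=(0, 1))

def depth_from_cues(light_field: np.ndarray, alphas):
    """Return per-pixel best-depth indices from the two cues."""
    defocus, correspondence = [], []
    for alpha in alphas:
        views = np.stack(list(shifted_views(light_field, alpha)))
        refocused = views.mean(axis=0)
        # Defocus cue: an in-focus region shows high local contrast
        # (gradient magnitude) in the refocused image.
        gy, gx = np.gradient(refocused)
        defocus.append(np.hypot(gx, gy))
        # Correspondence cue: at the correct depth the shifted views
        # agree, so the variance across views is low.
        correspondence.append(views.var(axis=0))
    # Per pixel: defocus picks the alpha with maximum contrast,
    # correspondence the alpha with minimum cross-view variance.
    return np.argmax(defocus, axis=0), np.argmin(correspondence, axis=0)
```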

In Part 2 (which can be found here) I will take a look at the technology providers.

About the Author

Tom Kurke

Tom Kurke is the former President and Chief Operating Officer of Geomagic, a specialist supplier of 3D reconstruction and interaction software and hardware solutions, which was acquired by 3D Systems Corporation (NYSE: DDD) earlier this year. Prior to Geomagic he spent more than a decade with Bentley Systems, a leading provider of solutions to the AECO and GIS industries. He recently joined the Board of Advisors of Paracosm (www.paracosm.io), whose mission is to “3D-ify the World.” When not supporting his two sons' various sporting activities, or writing on topics of interest in the areas of 3D printing, digital reality capture, intellectual property, AEC/GIS, or unmanned aerial systems at www.3dsolver.com, you might see him finding new ways to crash his quadcopter.