Making Three-dimensional Imaging a Reality

August 28, 2010

Making Three-dimensional Imaging a Reality using Mass-Market Digital Cameras, Camcorders and Phones (A short movie of a 3D head model built by Visualsize is available at http://www.visualsize.com/3ddemo/videos/amr/amr.htm.)

Imaging technologies have advanced significantly over the past 20 years or so. Film cameras have largely been replaced by digital ones, and a significant percentage of cell phones sold these days are equipped with a camera. However, these cameras record color pictures and videos, not the depth (distance) profile of the 3D scene. While specialized range sensors that use active laser and infrared projection can obtain such 3D distance profiles, they are significantly more expensive, have much lower spatial resolution, and can be bulky and unwieldy. With over one billion camera phones and another billion digital cameras sold worldwide each year, can clever algorithms make 3D imaging a reality on these mass-market digital cameras? This is the question Professor Yuan-Fang Wang’s research is trying to answer.

The fundamental principles of three-dimensional computer vision are well understood. To see in three dimensions with a regular digital camera, the key is to take more than one picture, and often many of them. There are also many possible formulations of the 3D imaging problem. At one extreme, the input images can be stitched together to build a panorama without explicitly inferring the 3D depth. At the other extreme, the 3D depth and texture can be densely recovered at every pixel location. Middle-of-the-road solutions are also possible, such as inferring only the camera motion but not the object structure (useful for browsing large collections of photographs based on their spatial relationship), or inferring discrete, sparse 3D structures (useful for dimension measurement).
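
To illustrate why more than one picture is needed, the sketch below triangulates a single 3D point from its pixel positions in two views, using the standard textbook linear (DLT) method. It is not a description of Visualsize's software. The camera intrinsics, the 0.1-unit sideways camera motion, the test point, and the function names are all assumptions made up for the example, and both camera poses are taken as known, which a real system would have to estimate from the photographs.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P; returns pixel (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at x1 in view 1 and x2 in view 2."""
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical camera: focal length 800 pixels, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# View 1 at the origin; view 2 after moving the camera 0.1 units to the right.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 2.0])   # made-up scene point, 2 units away
x1 = project(P1, X_true)              # where it appears in picture 1
x2 = project(P2, X_true)              # where it appears in picture 2

X_est = triangulate(P1, P2, x1, x2)
print("recovered point:", X_est)
```

A single picture fixes only the ray along which the point lies; the second picture supplies the extra constraints that pin down its depth along that ray. With noisy measurements and many pictures, the same idea is usually embedded in a larger least-squares problem.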

Professor Wang’s research, funded by the U.S. Army, has focused on 3D behavior modeling, especially modeling the deformation behaviors of soft tissues and organs in computer-assisted colonoscopy. The research results were widely published and showed the potential of making cameras see in 3D, but commercial development was needed to make 3D imaging a viable technology. The desire to see the fruits of the basic research directly benefit society led Professor Wang to found Visualsize Inc., which developed the IP needed to turn the basic research into robust commercial products.

Professor Wang has taken the sponsored research in multiple, exciting directions to develop a suite of 3D technologies, ranging from panorama building, to spatially-aware image browsing (similar to Microsoft’s Photosynth), to 3D metrology and dimension measurement, to 3D model building (similar to Autodesk’s Project Photofly). All of these 3D technologies (1) work with consumer-market digital cameras, camcorders, and camera phones of any make/model, (2) use no special equipment (e.g., lenses or tripods), active projection, calibration, or markers, (3) require no user training (just point and shoot), (4) are fully automated and end-to-end (from photographs to end results), requiring no manual intervention, (5) are software-based solutions that do not involve modifying the existing device hardware, (6) do not use proprietary, third-party code, and (7) run on back-end commodity PCs (Windows or Linux) without the need for special hardware acceleration (GPU, DSP, etc.).

The 3D software stands out in that a diligent Web search has unearthed no other company in the world (apart from Autodesk and Microsoft) that offers such a comprehensive suite of 3D technologies. Furthermore, in unbiased comparison studies, Visualsize’s technologies compare very favorably against those from Microsoft and Autodesk:

(A) Spatially-aware image browsing: Visualsize’s PhotoNav3D was compared against Microsoft’s Photosynth. The results are summarized in http://www.visualsize.com/photonav3d/summary.html

(B) 3D model building: Visualsize’s PhotoModel3D was compared against:

  1. Bundler – the core of the Photo Tourism line of research at Microsoft (Dr. Richard Szeliski) and the University of Washington (Drs. Noah Snavely and Steven Seitz), arguably the best-known R&D project in 3D modeling. The results are summarized in http://www.visualsize.com/3Ddemo/true3d/comparison/ .
  2. Bundler and PMVS2 combo – PMVS2 was developed by Drs. Yasu Furukawa and Jean Ponce at the University of Illinois. The results are summarized in http://www.visualsize.com/3Ddemo/true3d/comparison/ .
  3. Project Photofly from Autodesk – Project Photofly is based on the technologies of RealViz, acquired by Autodesk in 2008. RealViz’s technologies were acquired from INRIA’s RobotVis Research Group, headed by Dr. Olivier Faugeras. The results are summarized in http://www.visualsize.com/3Ddemo/comparison/.

There are myriad real-world applications of such 3D technologies. The 3D modeling technology can be used in civilian applications such as Internet showrooms and Web sales, social networks, and entertainment (movies and games), and in the military arena for realistic event simulation, surveillance, situation study, threat assessment, and campaign planning. The spatially-aware photo browsing and panorama building technologies can facilitate virtual tourism and museums, insurance claim processing, and crime scene analysis. 3D metrology allows remote 3D dimension measurements in construction (roofing, flooring, etc.), home improvement (remodeling), urban development, and city planning.

Professor Wang is grateful for the funding support of the U.S. Army and for the opportunity to engage in this entrepreneurial endeavor accorded by the Computer Science Department and the University of California.