
Innovative technology enables cars to see around corners

By Natasha Kumar, Jul 7, 2024


Researchers use shadows to model 3D scenes, including objects blocked from view. This technique could lead to safer autonomous vehicles, more efficient AR/VR headsets or faster warehouse robots.

Imagine driving through a tunnel in an autonomous vehicle, but unbeknownst to you, an accident has stopped the traffic ahead. You usually have to rely on the car in front of you to know when to start braking. But what if your vehicle could see the car ahead and brake even earlier?

Researchers at the Massachusetts Institute of Technology and Meta have developed a computer vision technique that could one day enable an autonomous vehicle to do just that. They introduced a method that creates physically accurate 3D models of an entire scene, including occluded areas, using images from a single camera. Their technique uses shadows to determine what lies in the occluded parts of the scene.


PlatoNeRF is a computer vision system that combines lidar measurements with machine learning to reconstruct a 3D scene, including hidden objects, from a single camera view by exploiting shadows. Here, the system accurately models a rabbit in a chair, even though that rabbit is blocked from view. Credit: Provided by the researchers, edited by MIT News

They call their approach PlatoNeRF, after Plato's allegory of the cave, a passage from the Greek philosopher's "Republic" in which prisoners chained in a cave discern the reality of the outside world from shadows cast on the cave wall.

By combining lidar (light detection and ranging) with machine learning, PlatoNeRF can generate more accurate reconstructions of 3D geometry than some existing AI methods. In addition, PlatoNeRF does a better job of smoothly reconstructing scenes where shadows are hard to see, such as those with high ambient light or dark backgrounds.

Improving AR/VR and Robotics with PlatoNeRF

In addition to improving the safety of autonomous vehicles, PlatoNeRF can make AR/VR headsets more efficient by allowing the user to model the geometry of a room without having to walk around taking measurements. It can also help warehouse robots find items faster in a cluttered environment.

"Our key idea was to take these two things that were done in different disciplines before and bring them together — multibounce lidar and machine learning. It turns out that when you combine these two worlds, you find many new opportunities to explore and get the best of both worlds, – says Zofi Klinghoffer, an MIT graduate student in media arts and sciences, research assistant in the Camera Culture Group at the MIT Media Lab, and lead author of the PlatoNeRF paper.

Klinghoffer wrote the paper with his advisor, Ramesh Raskar, associate professor of media arts and sciences and leader of the Camera Culture Group at MIT; senior author Rakesh Ranjan, a director of AI research at Meta Reality Labs; as well as Siddharth Somasundaram, a research fellow in the Camera Culture Group, and Xiaoyu Xiang, Yuchen Fan, and Christian Richardt of Meta. The research will be presented at the Conference on Computer Vision and Pattern Recognition (CVPR).

Advanced 3D reconstruction using lidar and machine learning

Reconstructing a complete 3D scene from a single camera viewpoint is a challenging problem. Some machine learning approaches use generative artificial intelligence models that try to guess what lies in occluded regions, but these models can hallucinate objects that aren't actually there. Other approaches infer the shapes of hidden objects from shadows in a color image, but these methods can struggle when the shadows are hard to see.

For PlatoNeRF, the MIT researchers built on these approaches using a new sensing modality called single-photon lidar. Lidars map a 3D scene by emitting pulses of light and measuring the time it takes that light to bounce back to the sensor. Because single-photon lidars can detect individual photons, they provide higher-resolution data.
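The depth computation behind that timing is simple in principle. Below is a minimal sketch in Python (not the authors' code; the photon arrival time is an invented example value) of how a round-trip time-of-flight measurement becomes a depth estimate.

```python
# Minimal sketch: depth from a lidar time-of-flight measurement.
# The 20 ns arrival time below is an invented example value.

C = 299_792_458.0  # speed of light, m/s

def depth_from_round_trip(t_seconds: float) -> float:
    """One-way depth to a surface from the round-trip travel time of a pulse.

    The pulse travels to the surface and back, so the one-way distance
    is half the total path length.
    """
    return C * t_seconds / 2.0

# A photon detected 20 nanoseconds after the pulse was emitted
# corresponds to a surface roughly 3 meters away.
print(depth_from_round_trip(20e-9))  # ~2.998 m
```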

The researchers use a single-photon lidar to illuminate a target point in the scene. Some of the light reflects off that point and returns directly to the sensor. However, most of the light scatters and bounces off other objects before returning to the sensor. PlatoNeRF relies on these second bounces of light.

By calculating how long it takes light to bounce twice and then return to the lidar sensor, PlatoNeRF captures additional information about the scene, including depth. The second bounce of light also carries information about shadows.
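To make the geometry concrete, here is a hedged sketch of the two-bounce travel time, with invented coordinates and assuming a co-located laser and sensor. A single measurement constrains the second scene point to an ellipsoid with foci at the sensor and the illuminated point, which is why many measurements must be combined.

```python
# Sketch of the two-bounce timing geometry; the coordinates are invented.
# Assumes the laser and sensor are co-located at the origin.

import numpy as np

C = 299_792_458.0  # speed of light, m/s
sensor = np.zeros(3)            # co-located laser and sensor (assumption)
x = np.array([0.0, 0.0, 2.0])   # illuminated target point, known from the first bounce

def two_bounce_time(p: np.ndarray) -> float:
    """Travel time for the path sensor -> x -> p -> sensor, for a candidate point p."""
    d1 = np.linalg.norm(x - sensor)   # laser to the illuminated point
    d2 = np.linalg.norm(p - x)        # illuminated point to a second scene point
    d3 = np.linalg.norm(sensor - p)   # second point back to the sensor
    return (d1 + d2 + d3) / C

# Any p with the same d2 + d3 yields the same arrival time, so one
# measurement pins p to an ellipsoid rather than a single location.
p_example = np.array([1.0, 0.5, 1.5])
print(two_bounce_time(p_example))  # about 17 nanoseconds for this example
```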

The system tracks secondary rays of light — those reflected from the target point to other points in the scene — to determine which points lie in shadow (due to lack of light). Based on the location of these shadows, PlatoNeRF can infer the geometry of hidden objects.

The lidar sequentially illuminates 16 points, capturing multiple images that are used to reconstruct the entire 3D scene.

"Every time we light a point in the scene, we create new shadows. Because we have all these different light sources, we have a lot of light rays shooting around, so we carve out an area that is occluded and lies outside the visible eye», — says Klinghoffer.

Combining multibounce lidar and neural radiance fields

The key to PlatoNeRF is the combination of multibounce lidar with a special type of machine learning model known as a neural radiance field (NeRF). A NeRF encodes scene geometry into the weights of a neural network, which gives the model a powerful ability to interpolate, or estimate, novel views of a scene. This interpolation ability also leads to highly accurate scene reconstructions when combined with multibounce lidar, Klinghoffer says.
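As a rough picture of what a NeRF stores, here is a deliberately tiny PyTorch sketch; it is a simplification under stated assumptions, not the PlatoNeRF architecture. An MLP maps a 3D point to a density, which a renderer would integrate along camera or lidar rays.

```python
# Tiny NeRF-style model: a simplified sketch, not the PlatoNeRF architecture.
# The network weights themselves act as the scene representation: querying
# a 3D point returns how opaque the scene is there.

import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # density (opacity) at the queried point
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        return self.mlp(xyz)

# Query the model at sample points along one ray; a renderer would
# accumulate these densities into depth or transmittance estimates that
# can be compared against the lidar's measured returns during training.
model = TinyNeRF()
ray_samples = torch.rand(64, 3)   # 64 sample points along a hypothetical ray
density = model(ray_samples)
print(density.shape)              # torch.Size([64, 1])
```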

"The biggest challenge was figuring out how to combine these two things. We really had to think about the physics of how light transports with multibounce lidar, and how to model that with machine learning," he says.

They compared PlatoNeRF with two common alternative methods: one that uses only lidar and one that uses only a NeRF with a color image.

They found that their method outperformed both, especially when the lidar sensor had lower resolution. This would make their approach more practical for real-world deployment, where lower-resolution sensors are common in commercial devices.

"About 15 years ago, our group invented the first camera that "sees" at angles, which works by using multiple light reflections or “light echoes”. These methods used special lasers and sensors, and used three reflections of light. Since then, lidar technology has become more popular, leading to our research into cameras that see through fog. This new work uses only two light reflections, which means that the signal-to-noise ratio is very high and the quality of the 3D reconstruction is impressive.», — says Raskar.

In the future, the researchers want to try tracking more than two bounces of light to see how that could improve scene reconstruction. In addition, they are interested in applying deep learning techniques and combining PlatoNeRF with color-image measurements to capture texture information.

"While camera-captured shadow images have long been studied as a means of 3D reconstruction, this work revisits the problem in the context of lidar, showing significant improvements in the accuracy of reconstructed hidden geometry. The work shows how clever algorithms can enable extraordinary capabilities when combined with ordinary sensors, including the lidar systems that many of us now carry in our pockets," says David Lindell, an assistant professor of computer science at the University of Toronto, who was not involved with this work.
