Just as we shift from burning fossil fuels to electrification to power our vehicles, we are also moving steadily towards a greater number of autonomous functions.
The journey began with advanced driver-assistance systems, and while the destination is yet to be decided (will we really have fully autonomous, driverless cars on public roads?), the technology is locked in to become part of our lives.
But even though there has been plenty of fanfare about vehicles already on the road that can take over the majority of driving – Tesla and Waymo are rarely out of the headlines in this field – significant challenges remain on the path to the highest levels of automated driving.
Perhaps one of the biggest issues to overcome is developing autonomous driving systems that can predict what is going to happen in a given scenario and adapt – something humans are incredibly good at.
Who gives way?
No doubt many people have debated how a self-driving vehicle would cope if it arrived at a multi-exit junction at the same time as other vehicles – who gives way to whom?
For humans it isn’t really an issue. When you’re behind the wheel of a car you can make eye contact with other drivers, use gestures or accrued knowledge to predict how things should proceed. An autonomous vehicle could find it more difficult.
Silicon firm Nvidia and others are developing technologies that, it is hoped, will be able to overcome these kinds of dilemmas.
Its PredictionNet system uses a deep neural network (DNN) that can be used to predict future behaviour and vehicle trajectories. Neural networks can learn to use context and environment to improve prediction, and Nvidia’s DNN uses a rasterised top-down view of the world provided by onboard perception systems and computes predictions from past observations. This form of machine learning is key to autonomous vehicles being able to reach their full potential.
Nvidia’s approach uses recurrent neural networks (RNN) that can leverage past insights to make future predictions. The system uses temporal sequences of images to produce prediction results. RNNs can also use contextual information, such as how an object appears to be moving relative to its static surroundings, when predicting its future motion.
Neda Cvijetic, autonomous vehicle software developer at Nvidia, said: “For PredictionNet, we adopt an RNN-based architecture that uses two-dimensional convolutions. This structure is highly scalable for arbitrary input sizes, including the number of road users and prediction horizons.”
“As is typically the case with any RNN, different time steps are fed into the DNN sequentially. Each time step is represented by a top-down view image that shows the vehicle surroundings, including dynamic obstacles observed via live perception, and fixed landmarks provided by a map.
“This top-down view image is processed by a set of 2D convolutions before being passed to the RNN. In the current implementation, PredictionNet is able to confidently predict one to five seconds into the future, depending on the complexity of the scene, for example, highway versus urban.”
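The pipeline described above – a rasterised top-down image per time step, processed by 2D convolutions and fed sequentially into a recurrent network that outputs a future trajectory – can be sketched in miniature. This is not Nvidia’s PredictionNet implementation; it is a minimal NumPy illustration with made-up sizes (16×16 frames, four past time steps, a single convolution kernel, a 32-unit recurrent state) and random, untrained weights, included only to show how the pieces fit together.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, kernel):
    """Valid 2D convolution of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical sizes: 16x16 rasterised top-down frames, 4 past time steps
T, H, W = 4, 16, 16
kernel = rng.standard_normal((3, 3)) * 0.1
feat_dim = (H - 2) * (W - 2)              # flattened conv-feature size
hidden = 32

# Untrained stand-in weights for the recurrent cell and prediction head
Wx = rng.standard_normal((hidden, feat_dim)) * 0.01
Uh = rng.standard_normal((hidden, hidden)) * 0.01
Wo = rng.standard_normal((2, hidden)) * 0.1    # head: future (x, y) offset

h = np.zeros(hidden)
frames = rng.standard_normal((T, H, W))        # stand-in for perception output
for frame in frames:                           # time steps fed sequentially
    x = conv2d(frame, kernel).ravel()          # 2D conv features per frame
    h = np.tanh(Wx @ x + Uh @ h)               # recurrent state update

pred_offset = Wo @ h                           # predicted (x, y) displacement
print(pred_offset.shape)                       # (2,)
```

In a real system the convolutional stack, recurrent cell and output head would all be learned from logged driving data, and the head would emit a sequence of future positions per road user rather than a single offset.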
Being able to predict the future is incredibly important as autonomous vehicles take to the road because, initially at least, they will have to mix with human-controlled vehicles. Predicting how often-irrational humans drive will be challenging, but necessary.
Content published by Professional Engineering does not necessarily represent the views of the Institution of Mechanical Engineers.