Video caption: this talk on spatial AI is instructive in several ways
This many years into the new millennium, we’re still waiting for the self-driving car. Why is that, exactly?
Keith Kirkpatrick, writing in Communications of the ACM, cites repeated announcements from Elon Musk of autonomous vehicles that never really arrived on American streets in any great numbers.
“Elon Musk has predicted his company Tesla would deliver fully autonomous vehicles by the end of 2021, but he made similar predictions in 2020, 2019, and 2017,” Kirkpatrick writes. “Each prediction has fallen flat, largely due to real-world safety concerns, particularly related to how self-driving cars perform in adverse conditions or situations.”
That last part holds the key to why we still have to drive our cars ourselves!
You could call it the ‘crossing guard problem’ – and it has to do with the small ways that people use intuition to avoid obstacles in the real world.
Here’s how it works: in addition to using all of our spatial senses to figure out what’s around us, we can also read dynamic social cues from other people who are driving other vehicles, or biking, or walking around us. And we rely on those other inputs to keep us safe.
By contrast, AI isn't quite there yet. It can generalize some principles about road safety, but it easily gets confused by any complex scene.
You can see that in the beginning of this presentation by Ayush Tewari at IIA – he's showing the limits of spatial AI in figuring out what's around a vehicle, and how to drive safely.
And as we know, it only takes one mistake to create a tragedy.
So we’re playing it close to the vest when it comes to self-driving vehicles. And that makes sense.
If you follow the rest of the video, which outlines the current state of the field, Tewari talks about three different kinds of advancement – graphics, computer vision, and robotics.
These have different applications, and different underpinning goals and objectives.
First, take a look at how we move from 2D to 3D: analyzing the way an eye or camera takes in images, and using convolutional neural networks and other tools to build detailed 3D models.
(Tewari goes over this, too: you can take a look at his example of two-dimensional fire hydrant images generating a three-dimensional result.)
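To make the 2D-to-3D jump a little more concrete, here's a minimal sketch of the geometric step underneath all of these systems: lifting a pixel with a known depth into a 3D point. This is not Tewari's method – it assumes a simple pinhole camera model with hypothetical focal lengths and principal point, and real pipelines learn depth and shape with neural networks rather than receiving them for free.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a 2D pixel (u, v) with a known depth into a 3D point in
    camera space, assuming an ideal pinhole camera with focal lengths
    (fx, fy) and principal point (cx, cy) – all hypothetical values here."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# A pixel at the principal point lands on the camera's optical axis:
point = backproject(320.0, 240.0, depth=2.0,
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
# point is [0.0, 0.0, 2.0]
```

The hard part, of course, is that a single photo doesn't come with depth – that's exactly what the neural networks are trained to infer.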
In a way, it's a lot like those early virtual tours, where people took a series of camera photographs from different angles, and strung them together to create three-dimensional motion.
But that was what you might call “purely deterministic” rendering – like a primitive flip book, making video out of frames. The new stuff is of a different caliber entirely, and you start to understand that when you see the AI guessing what a fire hydrant or other object would look like from the back!
Here, we’re dealing with the imaginative side of AI, and it's pretty impressive. Actually, that's an understatement – if you look at too much of this stuff, you might find it hard to sleep at night. AI is outpacing us in leaps and bounds when it comes to prediction and forecasting, because it has such an unbounded ability to take in large amounts of information and sift through it for insights.
Going back to the IIA presentation at hand: when it comes to robotics, Tewari is talking about the sort of thing we heard in Russ Tedrake's talk on ‘robots doing the dishes.’ That's a good one, if you haven't checked it out yet.
Anyway, we’re seeing that spatial AI is creating more dexterous robots that can do more in the real world. Where this really ties together is in putting capable AIs into robotic bodies. When you combine the physical capabilities with the mental ones, you are, again, starting to see technologies that can imitate humanity in bold and sometimes disturbing ways.
That's one view into the limits of spatial AI – and, on the other hand, into how it's moving forward. How long do you think it will be before we get the self-driving vehicle, and by that time, what else do you think computers will be able to do?