re: The Battle of the Sensors | LIDAR vs Camera Imaging in the race for self-driving supremacy

Having worked with both cameras and LIDAR, I think the real challenge is going to be compute power.
Just as any problem can be solved with enough money, any computational challenge can be solved by throwing enough compute at it. Sadly, self-driving needs more compute than is currently available on cars.
For current Level 4 self-driving cars (specifically Waymo's tech stack) there is the LIDAR, the RADAR and the cameras. Each one covers up the weaknesses of the others: LIDAR is crappy in rain, cameras have poor depth perception and can be fooled by things like printed images, radar is less accurate but has the longest range. All three systems together generate huge quantities of high-quality data in real time. Waymo also uses HD maps, which the vehicle uses to localize itself in its surroundings. It's up to the onboard computers (CPU + GPU) to clean up this data, fuse it, process it and then act on it. Doing all this in real time is possible; industries already do it in closed, controlled environments.
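To make the "each sensor covers for the others" idea concrete, here is a minimal sketch (purely illustrative, not Waymo's actual pipeline) of one common fusion idea: weight each sensor's estimate by how much you trust it. All sensor names and numbers below are made up.

```python
# Illustrative sensor fusion sketch: combine distance estimates from multiple
# sensors using inverse-variance weighting (the idea behind Kalman-style fusion).
# Numbers are hypothetical, not real sensor specs.

def fuse_estimates(measurements):
    """measurements: list of (distance_m, std_dev_m) tuples, one per sensor."""
    weights = [1.0 / (sigma ** 2) for _, sigma in measurements]
    fused = sum(w * d for w, (d, _) in zip(weights, measurements)) / sum(weights)
    fused_sigma = (1.0 / sum(weights)) ** 0.5
    return fused, fused_sigma

# Hypothetical readings for one obstacle: LIDAR is precise, the camera's depth
# estimate is noisy, radar is coarser but works at long range and in rain.
readings = [
    (42.3, 0.05),   # LIDAR
    (44.0, 2.00),   # camera depth estimate
    (42.8, 0.50),   # radar
]

distance, sigma = fuse_estimates(readings)
print(f"fused distance: {distance:.2f} m (+/- {sigma:.2f} m)")
```

The fused estimate leans on whichever sensor is most reliable at that moment, which is why losing one modality (say LIDAR in heavy rain) degrades the system but doesn't blind it.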
Performing all these operations in real time, in unpredictable surroundings, with limited compute is really tough. That's one of the biggest reasons Waymo is so slow with its rollout. They first generate HD maps, then train their software stack on a per-city basis, test, test, test some more, release in beta and then release to the public.
This approach takes time, all these sensors are extremely expensive, and Waymo seems to be following a 'technology will catch up' strategy. They are hoping sensors will become cheap in the future. They are hoping compute will become cheap in the future. A Waymo car from San Francisco cannot operate directly in Arizona; it needs a software update with Arizona data/models. The cars are geo-fenced. They are not profitable right now.
Coming to vision-only systems (Tesla, Comma.AI): these use only cameras (in Tesla's case, cheap cameras). With enough processing, vision can be used at night and in bad weather. But training the AI model to drive using vision alone needs a lot of data and becomes extremely compute-intensive. That's why Tesla needs its supercomputer. Their onboard Hardware 3.0 computer, while quite capable, is nowhere near capable enough to handle every scenario the supercomputer-trained AI model has been trained on. As a result, Tesla focuses only on California (and now Texas) streets. Their FSD system performs far better in these areas than in other cities/localities in terms of interventions and disengagements. No matter how much Elon sings praises of his 'generalized AI model' that can handle any and every scenario, it can't. Not with the current hardware.

A small note about AI/ML models: every company (Waymo, Tesla, Zoox, etc.) has realized AI is the way to make the self-driving car a reality. For this discussion, I am using Tesla's software as an example as they have the most training data (6 billion+ miles).
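As a rough sense of scale (with entirely made-up numbers, since nobody publishes these), even a modest slice of fleet video balloons into an enormous number of frames to store, label and train on:

```python
# Back-of-envelope only: every figure here is hypothetical, just to show how
# quickly vision-only training data piles up.

cameras_per_car = 8                  # assumed camera count per car
fps = 30                             # assumed frames per second per camera
hours_of_fleet_video = 1_000_000     # a made-up slice of fleet footage

frames = cameras_per_car * fps * 3600 * hours_of_fleet_video
print(f"frames to process: {frames:,}")   # ~8.6e11 frames for this slice alone
```

Hundreds of billions of frames for even a small slice of the fleet's driving is why the training side needs a supercomputer, regardless of how capable the onboard chip is.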
Tesla teaches its models what is good behavior and what is bad behavior. If they trained their models on all the data they have available, the models would have trillions of parameters, would be terabytes in size and would be completely unmanageable. It would be next to impossible to run such models on the car's computer, and even sending them OTA would be a nightmare.
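For context, here is the quick arithmetic behind that size claim (a sketch with round, hypothetical numbers): a model's on-disk size is roughly its parameter count times the bytes per parameter, so trillion-parameter models are terabytes almost by definition.

```python
# Rough arithmetic linking parameter count to model size (hypothetical numbers).

params = 1_000_000_000_000       # 1 trillion parameters
bytes_per_param = 2              # e.g. fp16 weights: 2 bytes each

size_tb = params * bytes_per_param / 1e12
print(f"model size: {size_tb:.0f} TB")   # 2 TB per trillion fp16 parameters
```

A multi-terabyte model neither fits on an automotive computer nor ships over a cellular OTA update in any reasonable time, which is why the deployed model has to be far smaller than what the training cluster can handle.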
That's why Tesla is extremely selective about its training data. Sometimes they miss a few scenarios, and people see regressions in behavior.
In summary, barring cost, software has proven that both approaches can work; the math behind both checks out. The race now is to prove who can get there faster and profitably. Without significant improvements in compute capacity and cost reductions in sensors, neither approach is going to work at scale.