
Machine learning development for autonomous vehicles

AI experts from the autonomous vehicle technology industry discuss automation, spotting flaws in and speeding up machine learning systems, and accounting for changes in real time.

With AI at the core of autonomous vehicle software technology, machine learning developers are focused on accelerating the rate of model building and innovation.

For many machine learning experts, the data set is the first step in accelerating development when a new output needs to be added to the perception stack: the computing resources that enable the AI hardware and software of autonomous vehicle control systems to "see."

For autonomous and self-driving vehicles, a key challenge arises when the design team needs the vehicle to start detecting traffic cones or, say, amber lights for the first time.

Starting with the data

The first step in attacking that problem is building a continuous learning framework, said Sammy Omari, vice president of engineering and head of autonomy at Motional, an autonomous vehicle maker that is a joint venture between Hyundai Motor Group and Aptiv.

Omari, with other AI experts, spoke on a panel at the Scale TransformX AI conference on Oct. 6.

The framework starts with labeling or detecting crash and other accident scenarios and then transferring those scenarios into new training sets. Once that is done, developers need an effective training framework to train the new models. He said the final step is to understand how the new output will affect the overall end-to-end performance of the autonomous vehicle system.
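
In code, that loop might look something like the following sketch. The function names and data structures here are hypothetical stand-ins for illustration, not Motional's actual framework.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    """A logged driving segment flagged for review (hypothetical structure)."""
    log_id: str
    label: str    # e.g., "amber_light" or "traffic_cone"
    frames: list

def continuous_learning_cycle(drive_logs, detect, train, evaluate):
    """One pass through the loop Omari describes: detect scenarios,
    build a training set, train a new model, measure end-to-end impact."""
    # 1. Label or detect the scenarios of interest in raw drive logs.
    scenarios = [s for log in drive_logs for s in detect(log)]
    # 2. Transfer those scenarios into a new training set.
    training_set = [(s.frames, s.label) for s in scenarios]
    # 3. Train a new model with an effective training framework.
    model = train(training_set)
    # 4. Understand how the new output affects end-to-end performance.
    report = evaluate(model)
    return model, report
```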

During a panel discussion at the Scale TransformX conference, machine learning experts discuss ways they accelerate ML development on their different teams.

In the traffic light situation, teams may start with what Omari called "naïve" data. This data set spans about six to 12 months of driving data and includes potential traffic light scenarios in which an amber light could have been present.

Developers can run that data through an offline system, such as in the cloud. The next step is to send the results to human annotators to determine where inconsistencies remain in the data set.
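
A minimal sketch of that mining-and-annotation step, assuming a hypothetical offline model that returns a label and a confidence score, might look like this:

```python
def mine_amber_light_candidates(frames, offline_model, confidence_floor=0.8):
    """Run a large offline model over logged frames and queue ambiguous
    results for human annotation. The threshold is illustrative."""
    auto_labeled, needs_annotation = [], []
    for frame in frames:
        prediction = offline_model(frame)  # e.g., {"label": "amber", "score": 0.62}
        if prediction["score"] >= confidence_floor:
            auto_labeled.append((frame, prediction["label"]))
        else:
            # Low-confidence or inconsistent cases go to human annotators.
            needs_annotation.append(frame)
    return auto_labeled, needs_annotation
```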

"Each of these stage in the pipeline, we are in the process of optimizing both throughput as well as the actual quality of each of these components," Omari said.

Balancing automation and targeted improvements

Because each stage of the machine learning (ML) development process often involves multiple teams focusing on different parts of the workflow, the challenge is balancing automation of the entire process with the need to make targeted improvements.

Building an effective mining system could solve this problem, Omari said. At Motional, this means creating a scenario search and mining framework. This framework enables developers to compute an extremely large set of attributes after every autonomous vehicle training mission they drive.
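
The idea can be illustrated with a toy version of such an index, where attributes are computed once per mission and queried later. The attribute names and query below are invented:

```python
def compute_attributes(frame):
    """Derive searchable attributes from one logged frame (a dict here)."""
    return {
        "num_traffic_lights": frame.get("lights", 0),
        "ego_speed_mps": frame.get("speed", 0.0),
        "raining": frame.get("rain", False),
    }

def index_mission(mission_frames):
    """Compute attributes for every frame after a mission completes."""
    return [compute_attributes(f) for f in mission_frames]

def search(index, predicate):
    """Find frames matching a scenario query, e.g., a traffic light in rain."""
    return [i for i, attrs in enumerate(index) if predicate(attrs)]

mission = [{"lights": 1, "speed": 12.0, "rain": True},
           {"lights": 0, "speed": 15.0, "rain": False}]
idx = index_mission(mission)
hits = search(idx, lambda a: a["num_traffic_lights"] > 0 and a["raining"])
print(hits)  # [0]
```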

Accelerating ML production is also critical, said Yanbing Li, senior vice president of engineering at Aurora, a vendor of autonomous vehicle control systems.

Li said her team takes the friction out of building core ML technology by applying automation, which makes launching experiments "a truly push-button experience so that your ML developers focus on making small changes around their ML code."

"But they get this automated experience of running experimentations and getting results," she continued.

By making the validation process smoother and keeping the AI infrastructure invisible and behind the scenes, Li said, her team at Aurora can reduce complexity and enable ML developers to focus on validating models.

Being able to change data in real time is another way of using automation while making targeted changes to the system, said Gonen Barkan, group manager for autonomous vehicle radar at General Motors.

"When we look forward to how sensors will behave in the near future, it's not going to be fixed," Barkan said. "For radars today you can control the way they operate on the fly."

He added that when an ML team is not flexible about changing data, it ends up losing a lot of capability.

"Having a very flexible ML pipeline, to digest, train, adapt the noise modeling, adapt the way you treat the data, is extremely critical to being able to utilize the sensor effectively," he said.

The result of changing the data set

But changing the data may end up setting ML engineers back and could disrupt the experiment.

One way to avoid this is by building a simulation system, Omari said. This enables ML teams to automate the evaluation of each data set change at scale and to get the same signal, or a highly similar one, to what a human driver would have received.
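
A toy formulation of that evaluation, assuming a reference signal can be recovered from the original drive, might be:

```python
def evaluate_dataset_change(scenarios, new_model, reference_policy, tolerance=0.05):
    """Replay logged scenarios in simulation and check that the retrained
    model's output stays close to the reference signal."""
    deviations = [abs(new_model(s) - reference_policy(s)) for s in scenarios]
    mean_dev = sum(deviations) / max(len(deviations), 1)
    return {"mean_deviation": mean_dev, "passed": mean_dev <= tolerance}
```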

"I think that's one of the biggest challenges for us in the industry as a whole," Omari said.

At Aurora, one way machine learning teams deal with constantly changing data is by automating the ML side of their experimentation to keep it running smoothly. Li said her team focuses on managing the CI/CD cycle, the series of steps performed to deliver a new version of software correctly.
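
A simplified view of such a gated cycle, with generic stage names rather than Aurora's actual stages, might look like:

```python
def ml_cicd_cycle(stages, deploy):
    """Run each gating stage in order; any failure stops the release,
    so a model change ships only after every check passes."""
    for name, check in stages:
        if not check():
            raise RuntimeError(f"CI/CD stopped at: {name}")
    deploy()

# Example wiring with trivial stand-in checks.
ml_cicd_cycle(
    stages=[("build", lambda: True),
            ("unit tests", lambda: True),
            ("training smoke test", lambda: True),
            ("regression suite", lambda: True)],
    deploy=lambda: print("new model version released"),
)
```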

"Every day we're trying to really increase the amount of time we spent for the trucks to be in an autonomous stage because that gives us the maximum feedback," Li said.

Avoiding regressions

ML teams must also make sure they build a validation framework that catches models that improve in one scenario but regress in others.

According to Li, multiple modalities of testing can help solve this problem.

"It is extremely important that we have a framework that allows us to test one thing at a time," she said.

Siva Gurumurthy, senior vice president of engineering at vehicle fleet management AI vendor KeepTruckin, said the vendor's platform combats regression by creating different versions of each model and data set.

KeepTruckin then inputs the different versions into an automated engine that shows when the model performed well and when it did not.
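
The versioning idea can be sketched as scoring every pairing of model version and data version, making visible where a model performed well and where it did not. This illustrates the concept rather than KeepTruckin's engine:

```python
from itertools import product

def evaluate_version_grid(model_versions, data_versions, score):
    """Score each (model, data) pairing with a caller-supplied metric."""
    return {(m, d): score(m, d) for m, d in product(model_versions, data_versions)}

# Example with a stand-in scoring function.
grid = evaluate_version_grid(["m1", "m2"], ["d1", "d2"],
                             score=lambda m, d: hash((m, d)) % 100 / 100)
worst = min(grid, key=grid.get)  # the pairing that performed worst
```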

Another key for machine learning teams is figuring out which results are false positives.
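
That check boils down to measuring the false positive rate: the share of negative examples the model flags as positive. A minimal helper:

```python
def false_positive_rate(predictions, ground_truth):
    """Fraction of true-negative examples the model incorrectly flagged."""
    fp = sum(1 for p, t in zip(predictions, ground_truth) if p and not t)
    negatives = sum(1 for t in ground_truth if not t)
    return fp / negatives if negatives else 0.0

print(false_positive_rate([1, 1, 0, 1], [1, 0, 0, 0]))  # 2 FPs / 3 negatives ≈ 0.67
```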

Gurumurthy noted that each ML team approaches testing and data differently.

"Everybody is trying to figure out what works best for their environment and there isn't a traditional set of practices like, 'Hey, here's how you do your model development, here's how you do your code reviews, model reviews,'" Gurumurthy continued. "Everybody is throwing lots of ideas and seeing what sticks to these ML engineers."
