Collins Aero reducing flight delays with Databricks platform

The data lakehouse vendor's tools form the foundation of analytics products designed to help airlines predict and prevent maintenance problems that result in delays and cancellations.

Collins Aerospace is trying to do something about the frequency of flight cancellations and delays, and it's using the Databricks lakehouse platform to do it.

Delays and cancellations are the bane of any traveler's existence. They ruin vacations, cause meetings to be missed, and usually lead to frustration and fatigue.

Few can do anything about it. For most, cancellations and delays just mean trying to make the best of a bad situation.

Collins Aerospace, however, can do something about it.

Based in Charlotte, N.C., the company is one of the United States' largest suppliers of aerospace and defense products, including many of the parts that are used to build commercial aircraft. It doesn't make every part of a given plane, but from the radar system at the nose to the stabilizers on the rear wings and tail, it makes many of them.

In addition, Collins Aerospace now develops technology products -- including airport systems, data and analytics products, and flight support services -- at its Connected Aviation Solutions unit aimed at driving digital transformation in the aviation industry.

As both a builder of airline parts and developer of data-driven technology tools, it is uniquely positioned to provide airlines with information that can lead to preventative maintenance.

In 2017, Collins Aerospace sought to develop a software system to help airlines predict maintenance problems before they arise and take preventative measures.

Over the following five years, it honed that system -- with its 2019 deployment of the Databricks lakehouse platform playing a key role -- and the result now helps customers reduce delays by nearly one-third.

The problem

Flight delays and cancellations are on the rise.

According to the Bureau of Transportation Statistics, more than 1.1 million flights were delayed at least 15 minutes in 2022, up from just over 800,000 in 2021. Meanwhile, cancellations have risen substantially since the 2020 start of the COVID-19 pandemic, with Reuters reporting that the number of flight cancellations through just the first six months of 2022 exceeded the total number of cancellations for all of 2019.

The primary culprit for both delays and cancellations is the airlines, the BTS reports.

Before the pandemic, about half of all delays were due to the National Aviation System. Now, however, more than half are due to the airlines themselves. Maintenance problems especially are causing more delays and cancellations than ever, according to Sanket Amin, a senior manager at Collins Aerospace leading the company's data science and analytics efforts.

Collins Aerospace is trying to change that with its Ascentia platform, a service developed on Databricks designed to enable customers to take proactive measures to prevent flight delays and cancellations.

"We're looking at solutions to help airlines deal with the problem of delays and cancellations," Amin said in June during a session at Databricks' Data + AI Summit, the vendor's user conference in San Francisco. "While we work on a variety of solutions to help airlines in this way, Ascentia helps avoid unscheduled maintenance events."

Amin himself has suffered the frustration of ruined plans resulting from repeated flight delays.

For his 40th birthday, he planned a trip with his wife to drive a convertible down Route 1 in Florida from Miami to Key West.

"It was a bucket-list dream trip of mine," he said.

On the morning of the trip, however, Amin got a text alert that the early morning flight he and his wife planned to take to Miami was delayed an hour. Then came another alert that the flight was further delayed. He called the airline at that point to get more information, but the airline had none. He and his wife left for the airport only to get yet another alert as they drove that the flight was being pushed back even later.

With the flight now scheduled to depart in the afternoon and the 4-hour drive from Miami to Key West planned for that same day, the whole reason for taking the plane was disappearing.

Ultimately, the flight was delayed again and again, pushing their arrival to the evening, when it was too late to take the drive.

"The trip was no longer worth it," Amin said. "So we canceled our flight and went back home."

He later learned that the flight did leave, taking off 12 hours later than its initially scheduled departure. The reason for the delay?

"In this particular case, it was due to an aircraft maintenance issue," Amin said.

But delays and cancellations don't only affect travelers. They cost airlines more than $8 million per month in losses, according to Amin. With regulations emerging in the European Union and U.S. that propose to fine airlines for delays and cancellations on a per-passenger basis, those losses could increase significantly.

Collins Aerospace, a manufacturer of airline parts and developer of data products, is attempting to reduce flight cancellations and delays with predictive analytics tools engineered using the Databricks lakehouse platform.

Solution and results

Ascentia is a tool designed to monitor and predict the health of aircraft parts to help airlines avoid delays and cancellations due to maintenance issues.

"It's about becoming proactive rather than reactive with aircraft maintenance -- forward looking rather than rearview looking," Amin said.

Unlike some similar tools built by specific airlines or developers of specific components used to build airplanes, Ascentia is airline agnostic and not limited, for example, to monitoring the health of just the engine or another single part of a plane, he continued.

"Many of the analytics we've deployed come from combining [the work] of an engineering expert for a component with a data scientist," Amin said.

Despite Ascentia's promise, it has not yet gained the industry-wide traction needed to materially affect the massive number of delays and cancellations. But it is having an effect when deployed.

Those that use the platform can predict problems and reduce potential delays by about 30% and decrease unplanned maintenance by about 20%, according to Amin.

The idea for Ascentia began in 2017, before the current iteration of Collins Aerospace was formed in 2018 with United Technologies' acquisition of Rockwell Collins and subsequent merger with UTC Aerospace Systems.

The first members of the team tasked with developing tools to predict and prevent aircraft maintenance problems used their standard company-issued laptops with Matlab as the underlying technology.

That meant the compute power used to perform data management and analysis was limited to what the laptops could provide, which amounted to the ability to process data from two aircraft in 30 minutes, according to Amin.

"Trying to analyze two months' worth of data for a handful of aircraft was doable but not scalable," he said. "We quickly realized we needed a cloud environment to store, process and consume large volumes of data. We needed a solution that was going to make it easy to prototype analytics and test it against a large dataset."

It needed a platform that could handle 50 gigabytes of data per day and terabytes of data overall coming in real time from IoT sensors, weather reports, aircraft position tracking, maintenance logs and more.
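To make the scaling problem concrete, the figures above can be turned into a rough estimate. The sketch below is illustrative only: it assumes processing time grows linearly with the number of aircraft and that every aircraft generates a similar amount of data, neither of which the article confirms. The fleet size of 200 is a hypothetical stand-in for "hundreds of aircraft," and the function names are invented for this example.

```python
# Back-of-the-envelope scaling of the laptop workflow.
# Reported figures come from the article; everything else is an assumption.

LAPTOP_AIRCRAFT_PER_BATCH = 2   # reported: data from two aircraft ...
LAPTOP_MINUTES_PER_BATCH = 30   # ... processed in 30 minutes
DAILY_INGEST_GB = 50            # reported: 50 gigabytes of data per day


def laptop_hours(fleet_size: int) -> float:
    """Hours a single laptop would need for a fleet, assuming linear scaling."""
    # Ceiling division: a partial batch still takes a full 30-minute run.
    batches = -(-fleet_size // LAPTOP_AIRCRAFT_PER_BATCH)
    return batches * LAPTOP_MINUTES_PER_BATCH / 60


def yearly_ingest_tb(daily_gb: float = DAILY_INGEST_GB) -> float:
    """Approximate data volume per year in decimal terabytes."""
    return daily_gb * 365 / 1000


HYPOTHETICAL_FLEET = 200  # assumption, not a figure from the article

print(laptop_hours(HYPOTHETICAL_FLEET))  # 50.0 hours for one pass
print(yearly_ingest_tb())                # 18.25 TB per year
```

Under these assumptions, a single pass over a 200-aircraft fleet would occupy one laptop for more than two days, while roughly 18 terabytes of new data would arrive each year.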

In 2019, Collins Aerospace deployed Ascentia on the Databricks lakehouse platform.

That enabled data scientists to code in Python and use data science notebooks to analyze large amounts of data. Data integration, meanwhile, was connected to Delta Lake tables to enable efficient batch processing.

The transition to Databricks' lakehouse platform quickly resulted in more efficient and cost-effective use of a tool within Ascentia named Automated New Signature Evaluation Reporting (ANSER).

Before Databricks, ANSER could analyze 168,000 data points per flight at a cost of nearly $14,000 over more than four days. After deploying on Databricks, through automation and other process improvements, the cost to analyze the same 168,000 data points per flight dropped to less than $3,000, and the time to do so was reduced to about 22 hours.

"The majority of our improvements came with a few lines of code to optimize the way we structured our data in Delta Lake," Amin said. "After implementing those improvements, we achieved a nearly 80% reduction in cost while speeding up the processing on the same dataset."
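The "nearly 80%" figure can be checked against the numbers reported for ANSER. The short sketch below treats the article's rounded figures as exact, which is an assumption: the source says "nearly $14,000," "less than $3,000," "more than four days" and "about 22 hours," so the results are approximations rather than Collins Aerospace's own calculations.

```python
# Rough check of the reported ANSER improvements.
# Inputs are the article's rounded figures, treated as exact for illustration.

def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


COST_BEFORE_USD = 14_000   # "nearly $14,000"
COST_AFTER_USD = 3_000     # "less than $3,000"
HOURS_BEFORE = 4 * 24      # "more than four days"
HOURS_AFTER = 22           # "about 22 hours"

cost_cut = percent_reduction(COST_BEFORE_USD, COST_AFTER_USD)
time_cut = percent_reduction(HOURS_BEFORE, HOURS_AFTER)
speedup = HOURS_BEFORE / HOURS_AFTER

print(round(cost_cut, 1))  # 78.6
print(round(time_cut, 1))  # 77.1
print(round(speedup, 1))   # 4.4
```

With these inputs, the cost falls by about 79%, consistent with the "nearly 80%" Amin cited, and the same workload finishes at least four times faster.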

By 2022, Collins Aerospace had developed a data operation that begins with data ingestion into the Databricks lakehouse platform and Microsoft's Azure Data Factory and continues through refinement and curation in Delta Lake -- a tool initially developed by Databricks before being made open source. The process ultimately ends with consumption in Ascentia after the data is moved to Databricks lakehouses and SQL databases.

"Databricks is leveraged all across the architecture, from ingestion to data processing to data serving," Amin said. "We went from having zero analytics and a handful of data sets to an application that today is serving hundreds of aircraft with over 50 analytics, all within a period of five years."

Future plans

While Ascentia now helps users predict and prevent maintenance problems, enabling them to significantly reduce delays and cancellations, Collins Aerospace has plans to improve the platform.

One of the company's goals for 2023 is to achieve multi-cloud interoperability, according to Amin.

Collins Aerospace has data distributed across numerous cloud environments, with most in AWS and Azure. It has now set up Databricks' lakehouse platform in both and is in the process of setting up Delta Sharing to enable a connection between AWS and Azure.

The desired result will be richer datasets to drive analysis.

In addition, Amin and his team are targeting better data governance. Collins Aerospace has employees throughout the world and collects data from customers and governments in numerous countries with different regulations.

Still another focus this year is to improve machine learning operations. Currently, Collins Aerospace's MLOps are manually executed.

By year's end, Amin said he hopes to at least automate the training, testing and deployment of models. Beyond 2023, the goal is to also automate the monitoring and updating of machine learning models.

"Ultimately, translating data into value is our mission," Amin said. "We believe that will lead to a more sustainable, more efficient, more reliable and more enjoyable aerospace experience."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
