Data scientist shortage leaves organizations uncertain

Despite their desire to use data science in their decision-making process, some organizations can't find qualified data scientists to develop and run their data science initiatives.

Organizations across varied industries see the benefits of data science, but a data scientist shortage is stopping them before they can get started using data to their advantage.

They see what Netflix has been able to accomplish using data science, building a programming empire based on machine learning algorithms, and how Amazon and Google are employing data science to drive engagement.

And organizations see how competitors in their own industries are using data science to gain advantages, using machine learning and augmented intelligence to identify and profile potential customers, develop recommendations, predict supply chains, identify fraud and develop predictive models.

But when they attempt to hire their own data scientists, they can't find them.

Nearly a decade ago, the Harvard Business Review called data scientist the sexiest job of the 21st century, and in 2011, according to dataversity.net, job postings for data scientists increased 15,000% over the previous year.

Now, job services such as Indeed.com, LinkedIn and Glassdoor are inundated with listings for data scientists. Indeed.com currently has more than 7,000 listings for data scientists, while Glassdoor and LinkedIn each have more than 10,000, and the average salary for data scientists is now over $100,000, according to the U.S. Bureau of Labor Statistics.

Consulting firm QuantHub, meanwhile, compiled data from the three job services -- along with Burtch Works, CIO, Computer Weekly, Harnham, McKinsey and Women in Data Science -- and found a shortage of 250,000 data scientists in 2020. The estimate was based on the number of job postings for data scientists/analysts and the number of searches for those terms by job seekers.

"We know there's a shortage of data scientists because they've been difficult to recruit," said Donald Farmer, principal of TreeHive Strategy. "Also, the salaries have been astronomical. Those are good indicators."

Similarly, Rebecca Kelly, technical evangelist at streaming analytics vendor KX Systems, said that the sheer volume of job listings for data scientists indicates an increase in demand without a corresponding increase in supply.

"In the last week alone there are something like 50 that have been added -- it's very much increasing," she said. "If you just think about the kinds of questions people are asking internally in companies these days, they're the sorts of questions you need data scientists to answer."

But the situation is changing.

Industry insiders said that within a handful of years the dearth of data scientists might disappear. A combination of education and technology designed to make data science more accessible has the potential to enable organizations to fulfill their ambition to turn data into insights.

Right now, however, those organizations are in a state of uncertainty and waiting.

How data scientists differ from data analysts.

The problem

With Netflix, Amazon and other tech giants reaping massive gains using data science, organizations want their own success.

They want their own team of data scientists to develop algorithms that lead to recommendations, troubleshoot potential problems, predict the future and ultimately lead to the data-driven decisions that power increased profits.

And they don't want to be using personal experience and gut instinct to make significant decisions when their competitors are applying augmented intelligence and machine learning to their key decisions.

But many enterprises are stuck. They want to take advantage of data science, but they can't find qualified data scientists to develop the algorithms and build predictive models.

Supply is lagging behind demand.

"From a company perspective, the problem is they just aren't able to be as agile as those that do have [data scientists]," Kelly said. "The great thing about data science is being able to identify issues that need to be resolved and ways that you can add value, generate additional revenue, so the companies that aren't doing that are very concerned."

That concern referenced by Kelly, however, may be the biggest problem at this point.

Despite the desire to hire data scientists and use serious data science to transform their decision-making process, many organizations aren't ready, according to industry experts. They want the idea of data science, but they don't have the focused approach needed to make hiring data scientists meaningful.

"I think this is starting from a standpoint of, 'I want to be like Netflix,'" said Joe DosSantos, chief data officer at Qlik. "Every CEO is being told they need to be more predictive, they need to be more like Netflix, and it stirs anxiety."

Anxiety, however, isn't a good enough reason to hire a data scientist, or a team of them, DosSantos continued.

In fact, he said, while there is certainly a data scientist shortage, organizations that have a focused approach to data science and know what they're looking for when they post a job are not having the same difficulty hiring data scientists as are those with only a vague idea of what they want to do with data science.

"Data scientists are looking for challenges, and if you have interesting challenges for them to take on, you don't seem to have a shortage of applicants," DosSantos said. "I think that if you don't know what you're doing and have just a vague sense of what you may or [may] not be doing and don't have a culture to support data science, that's going to [be] problematic."

More important than trying to hire data scientists is developing a data strategy, he maintained.

"First, people need to think about the analytics strategy, what are the use cases that bind them, and then they can start thinking about what's next," DosSantos said. "When that happens, will we find ourselves short 10,000 data scientists? Possibly, likely. It's a problem that's there, but it's so easy to read about other people's data science successes and feel like you're falling behind."

Similarly, Farmer, who in his role as a consultant advises organizations in the hiring process, said that there are now enough qualified applicants to meet the demands of organizations that know what they're doing with data science.

It was different a few years ago, however, when the data scientist shortage was even more severe than it is now.

"I recently interviewed 20 candidates for one data scientist role," Farmer said. "Two years ago, we wouldn't have found 20 candidates, never mind 20 that made it through to the interview. That's a big shift."

Playing catch-up

While the shortage of data scientists persists, the sense among analysts is that supply is starting to close the gap on demand.

The gap remains, and is likely to persist, but at some point in the next five to 10 years there will be equilibrium.

And the key is education.

A decade ago, data science wasn't taught at colleges and universities. With the rise of big data and the evolution of analytics into a major driver of business decision-making, data science is now a common field of study, and many courses in the subject are full.

As students graduate with degrees in data science, they'll help reduce the shortage of data scientists.

"Every course in data science, machine learning and artificial intelligence is oversubscribed," Farmer said.

Likewise, DosSantos said colleges and universities are playing a key role in helping develop a new generation of data scientists.

"All the schools have data science programs, and they didn't five years ago," he said. "I think the education system is rallying around this. I think it's known that you get compensated fairly well, and if we play this right with partnerships between businesses and academic institutions, we will be able to meet the demand."

But colleges and universities aren't alone in developing data scientists.

People already in the workforce are taking it upon themselves to become more data literate and develop an expertise in data science through certification programs. Organizations such as Coursera offer online programs, as do analytics vendors including Qlik and Tableau and technology giants such as IBM.

"What I see happening is a push by people who aren't data scientists to educate themselves about data science," Kelly said. "That's been a real benefit. Now they're better equipped to analyze data sets and use some of the tools to identify outliers or anomalies in the data."

Technology itself, meanwhile, can play a role in reducing the demand for data scientists.

Ease of use has become a mantra for many analytics software vendors, and low-code/no-code tools featuring automated machine learning and AI capabilities now proliferate, enabling users without coding skills and data science expertise to at least dabble in data science.

These tools don't eliminate the need for data scientists -- especially for dealing with ethical issues in which untrained users might do more harm than good -- but they can enable an organization to hire a chief data officer who develops and oversees data strategy that includes business users working with data.

"Most of these [data science] applications are pretty good," Farmer said. "The ones that are built into BI tools are pretty good at finding trends -- such as time-series analysis -- they're good at finding outliers and they're good at serving up recommendations. But more advanced work does require a better understanding of what's happening inside the system and the complexities of handling data."

The outlook

Eventually, the data scientist shortage will disappear.

The deficit created by the surge in job listings for data scientists a decade ago, and the shortfall that still exists, will be eliminated as more data scientists are trained and as technology continues to advance, making at least some data science accessible to users without degrees in the subject.

The consensus is that supply will meet demand in no more than 10 years and perhaps sooner, but that prospect isn't imminent.

"Eventually, it will catch up, but I don't think it's going to happen particularly soon," Kelly said. "We're probably looking at another five years. There are still inefficiencies in organizations in general."

Farmer, meanwhile, noted that he's already seeing an increase in the supply of data scientists and that those organizations with a focused data strategy are able to find qualified candidates from which to choose.

He predicted that within a few years there will be a surplus of data scientists, and that after a period of oversupply, the market will find its level. With data science, machine learning and AI among the most popular educational courses, he noted, the field could soon be in oversupply mode.

"There is a demand that can't be met," Farmer said. "A supply will result and will find its level by oversupply, and once there's oversupply, the market dynamics work out how many data scientists we need."

According to DosSantos, data science will evolve over the next decade or so to a point at which it becomes part of every department in an organization rather than a department unto itself, and data strategies will evolve in small increments rather than mark abrupt strategic overhauls.

"Data science is almost like when you bring in a personal trainer," he said. "You need someone to tell you how to do it and to get you doing your exercises, but over time, if your personal trainer is good enough, you shouldn't need your personal trainer. Hopefully, 10 years from now it's replacing the curtains in your living room as opposed to building a whole new house."
