A future data scientist needs business, deep learning skills

As automation grows, data scientists will focus more on business needs, strategic oversight and deep learning and less on model creation and other routine tasks.

As automation increases and business needs evolve, so will the ranks of data scientists. But while the future data scientist role may look a little different -- with a heavier focus on business operations and oversight -- it will be no less important to enterprises.

"As the adoption of automated machine learning platforms ... spreads, the role of the data scientist will become less about building models and more about implementing them in a meaningful way," said Forrester analyst Brandon Purcell.

Automation is arguably the biggest disruptor to the data scientist role, according to experts. Automated machine learning vendor platforms like DataRobot, H2O's Driverless AI, dotData Inc. and Edgeverve already offer a slew of automated and semi-automated capabilities, including feature engineering; model selection and training; extract, transform and load; data preparation; and model deployment and monitoring.

As those automated capabilities mature, the focus for the future data scientist will shift away from the more tedious, rote tasks that machines can do well and toward more strategy and oversight, said Forrester analyst Kjell Carlsson.

"While the need for developing custom models from scratch will never go away, it'll be increasingly more important for [the future data scientist] to understand the business need, the data, the models suggested by tools, how the model is performing in production, and the degree to which it is meeting the needs of end users," Carlsson said.

Future data scientists focus on business need

Echoing that sentiment, Ryohei Fujimaki, founder and CEO of data science and machine learning platform vendor dotData, said that understanding the business context of data science will be a core responsibility of the future data scientist -- even more so than traditional programming.

"In five to 10 years, as data science democratizes as a result of the automation tools, the ability to program to execute data science projects will be less critical," Fujimaki said. "Instead, the ability to translate a business problem into a data science problem, and the ability to leverage data science automation tools to solve the business problems will be critical."

From a soft skills perspective, that means there will be a greater need for storytelling skills on data science teams, according to Purcell.

"Data scientists increasingly need to convince business stakeholders to trust how their models make decisions and that those decisions will add business value," he said.

For that reason, knowing which models are appropriate for the task at hand will also become more integral to quickly achieving practical business results, said Pedro Alves Nogueira, director of engineering and head of artificial intelligence and data science for freelancing platform Toptal.

[Chart: Most in-demand data science skills, based on job posting data from Dice.com]

Traditional skills will still be necessary

In terms of technical skills, future data scientists will need to focus more on the machine learning operations process -- sometimes called MLOps -- to ensure the reliability of data pipelines and the scalability of models, Purcell said. Other technical skills will still be critical as well.

"Since there will always be new and novel data sources to analyze, data scientists will still [need] Hadoop and other big data technical skills," Purcell said. "Leading data scientists will continue to leverage R and Python for two reasons: One, new algorithms generally become available in open source first, and two, companies will still want to build custom models using these languages anytime data science is a core differentiator."

Data scientists can expect Python proficiency to become an even more predominant data science skill in the next five to 10 years, Nogueira said.

The future data scientist will also need to be competent in deep learning, according to Carlsson. He advised data scientists to start devoting time to studying deep learning models and neural networks for fast-growing areas like generative adversarial networks, multitask learning, federated learning, neuromorphic computing and edge computing. The number of job postings for deep learning skills has more than doubled year over year, according to Dice.

"There is an incredible array of untapped applications that have just recently been opened up, and we are really just getting started in terms of understanding their potential," Carlsson said.
