Data scientist vs. machine learning engineer careers

There are many careers under the data science umbrella, including data scientist and machine learning engineer. But what's the difference between the two? Read on to find out.

Whether you're newly entering the workforce, have been recently laid off, are worried about keeping your current job or have been temporarily furloughed and have some time on your hands, there's no better time to pick up some AI-related skills than right now.

According to LinkedIn, artificial intelligence and machine learning jobs have grown 74% annually over the past four years. Job titles in this category include data scientist and machine learning engineer, but if you're confused about the differences between a data scientist and a machine learning engineer, you're not the only one.

"To begin with, there was no distinction between the two roles," said Pragyansmita Nayak, chief data scientist at Hitachi Vantara Federal, which provides technology services to federal agencies.

When the two jobs first started growing, companies advertised for data scientists regardless of whether the job was closer to the data scientist side or the machine learning engineer side.

"That confusion [still] exists today," Nayak said.

What is your background?

The biggest difference between a data scientist and a machine learning engineer, experts said, is that they come from very different places.

"Data science has its foundations in statistics and in the business side," said Justin Richie, data science director at Nerdery, a digital services consultancy.

For example, a data scientist working at a bank might be asked to find out why customers are leaving, he said. The data scientist would decide on what data and analytics are needed and come up with a way to identify customers who are likely to leave.
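As a rough illustration of what such a prototype might look like, the sketch below fits a small logistic regression to made-up customer records and ranks current customers by their estimated risk of leaving. The features (tenure and complaint count), the toy data and all function names are assumptions chosen for illustration; they do not come from Richie or from any real bank.

```python
import math

# Hypothetical toy records: (months_as_customer, complaints, left_bank).
HISTORY = [
    (48, 0, 0), (36, 1, 0), (60, 0, 0), (24, 1, 0),
    (6, 3, 1), (12, 4, 1), (3, 2, 1), (9, 5, 1),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(months, complaints):
    """Scale tenure to years so both inputs have a similar range."""
    return (months / 12.0, float(complaints))


def train(rows, epochs=2000, learning_rate=0.1):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for months, complaints, left in rows:
            x = features(months, complaints)
            predicted = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)
            error = predicted - left
            weights[0] -= learning_rate * error * x[0]
            weights[1] -= learning_rate * error * x[1]
            bias -= learning_rate * error
    return weights, bias


def churn_probability(model, months, complaints):
    """Estimated probability that a customer leaves."""
    weights, bias = model
    x = features(months, complaints)
    return sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)


def rank_by_risk(model, customers):
    """Return (name, probability) pairs, most likely to leave first."""
    scored = [(name, churn_probability(model, m, c)) for name, m, c in customers]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


model = train(HISTORY)
current = [("A", 54, 0), ("B", 4, 4), ("C", 18, 2)]  # hypothetical customers
for name, risk in rank_by_risk(model, current):
    print(f"customer {name}: churn risk {risk:.2f}")
```

In practice a data scientist would more likely reach for R or a Python library than hand-written gradient descent; the point here is only the shape of the work: pick the data, fit a model, and surface the customers most at risk.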

Machine learning engineers, however, come from the other direction -- from software development.

"They're more focused on the production of the models and embedding them into applications," Richie said.

In the bank example, a machine learning engineer might take the model created by the data scientist and turn it into production code to embed into a mobile banking application. With that, the insights can become actionable, with the bank taking immediate steps to change the minds of customers looking to jump ship.
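To make that handoff concrete, here is a minimal sketch of the kind of wrapper a machine learning engineer might put around a churn model before an application calls it. It assumes the model arrives as a handful of fixed coefficients; the coefficient values, field names, version label and the 0.7 threshold are all hypothetical, and a real service would sit behind a web framework rather than a bare function.

```python
import json
import logging
import math

logger = logging.getLogger("churn_service")

# Hypothetical model artifact handed over by the data scientist.
MODEL = {
    "version": "churn-v1",
    "bias": -1.0,
    "w_tenure_years": -0.8,
    "w_complaints": 1.2,
}
RETENTION_THRESHOLD = 0.7  # assumed business rule, not from the article


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def handle_request(body):
    """Validate a JSON request body, score it, and return a JSON response."""
    try:
        payload = json.loads(body)
        months = float(payload["months_as_customer"])
        complaints = float(payload["complaints"])
    except (ValueError, KeyError, TypeError) as exc:
        logger.warning("rejected request: %r", exc)
        return json.dumps({"error": "invalid input"})

    values_ok = math.isfinite(months) and math.isfinite(complaints)
    if not values_ok or months < 0 or complaints < 0:
        logger.warning("rejected out-of-range input")
        return json.dumps({"error": "invalid input"})

    z = (MODEL["bias"]
         + MODEL["w_tenure_years"] * months / 12.0
         + MODEL["w_complaints"] * complaints)
    probability = _sigmoid(z)

    # Log the model version with every score so results can be traced later.
    logger.info("model=%s probability=%.3f", MODEL["version"], probability)
    return json.dumps({
        "model_version": MODEL["version"],
        "churn_probability": round(probability, 3),
        "offer_retention_deal": probability >= RETENTION_THRESHOLD,
    })


print(handle_request('{"months_as_customer": 4, "complaints": 4}'))
```

The model math is a few lines; most of the code deals with bad input, logging and version tagging, which is roughly the division of labor Richie describes.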

[Figure: Some key parts in the machine learning process]

Key skills for data scientists

According to Ira Cohen, chief data scientist at Anodot, an autonomous business monitoring platform, "data scientist" is still often used as the overall umbrella term, with machine learning engineer being a narrower subset of it.

But, increasingly, data scientist is becoming a more specialized job category, analyzing business data using machine learning or artificial intelligence, he said. "Similar to the role of business analysts."

Data scientists often start out as business analysts and boost their math and analytics skills with additional courses or on-the-job training. Some also start out right in data science, with academic backgrounds in statistics or artificial intelligence.

In addition to math and business domain knowledge, data scientists typically need programming skills to be able to develop prototypes of their models. R and Python are the most common programming languages for the job, but Scala, Julia, JavaScript, Swift, Matlab and Go can also be useful. Data scientists should also be familiar with data visualization tools like Power BI, Tableau and Qlik.

Andrew Stevenson, CTO at Lenses.io, a company that offers data platform monitoring technology, once worked on a project with data scientists from an energy trading desk.

"They were able to build the models, test and run locally," Stevenson said. And then they hit the limit of their expertise, he said. "The models were not production-grade. They had no monitoring, they weren't version controlled, they were not easily developed in a repeatable way. They were black boxes and if the desktop got rebooted, they had a production incident."

This is the point where machine learning engineers step in.

"Data scientists are typically mathematical but literate in programming," Stevenson said. "Data scientists in a financial trading firm -- the quants -- usually have Ph.D.s in mathematics but are also technically savvy with tooling such as R and Matlab, but they rely on highly skilled, hard-to-find programmers to implement their algorithms and bring them to production."

Key skills for machine learning engineers

Machine learning engineers typically start out on the software development side and add machine learning skills through on-the-job training or additional study, though some are now graduating from specialized degree programs.

It's never been easier for a software developer to become a machine learning engineer, said Sachin Gupta, co-founder and CEO at HackerEarth.

"With more and more open source libraries from tech giants like TensorFlow from Google bringing pretrained models for various use cases, it's becoming simpler for machine learning engineers to experiment with a multitude of models," he said.

Then the machine learning engineer deploys these models, builds APIs and web interfaces, and builds data pipelines, he said.

Machine learning engineers have some overlap with data scientists in terms of skills. Both may be using R or Python, for example, and both need advanced math skills like linear algebra and statistics.

But machine learning engineers are expected to be more highly skilled when it comes to programming, said Alex Ough, senior architect CTO at Sungard Availability Services. Machine learning engineers also need to know production platforms such as AWS, Azure and GCP and their AI services, he said.

Where the 2 jobs overlap

Larger companies typically see data scientists and machine learning engineers as two separate job functions, Richie said. But at smaller and midsize companies, one person may be doing both jobs, with the company hiring for only one role or the other. That can be a mistake, he said.

"Hiring a single person to do all of those things is not signing up that person for success," Richie said.

He suggested that companies that can afford to hire only one person should hire for the specific job they need most.

"Then cross-train other people at the company," he said. "That's what I've been advising the customers we work with. For example, business analysts are good for learning data science skills. Use the existing skill sets and hire only the specific niche vertical that you need."

Working hand in hand

For companies that hire both data scientists and machine learning engineers, the two typically work closely on projects.

"Think of it as a data scientist being the architect of a building," Nayak said. "And the machine learning engineer is the general contractor who actually builds the building."

Data scientists start out with the data, the goals and the algorithms, she said, while the machine learning engineer starts with the code. But the two work together on many tasks. Data scientists usually choose the best machine learning algorithm for a particular project, but machine learning engineers have a better idea about the frameworks used by the organization.

"I would talk to the machine learning engineer," Nayak said. "I would ask what the different options are, what they would recommend."

Then, after the machine learning engineer has done the development work and put the application into a production environment, the data scientist may be needed again.

"That's where the data scientist goes back to the end users and works with them and makes sure they are comfortable with the systems," Nayak said.

How to get the skills

Many online educational platforms are making courses available for free or at a low cost, including those that offer credentials that can be added to a resume. And, to put those skills to some use, there are volunteer opportunities available to write code for nonprofits or help with open source projects.

Even during the pandemic, there are opportunities to network. Many local data science groups and larger conferences and other events are going virtual. Look for opportunities to do presentations about your ongoing projects, find volunteer opportunities and pick up industry gossip about who's hiring.

Data science is not pandemic proof

At first, it seemed people working in AI and analytics would be spared from the brunt of this crisis. After all, these are jobs that can very easily transition to a work-from-home model.

According to a survey that recruiting firm Burtch Works conducted in partnership with the International Institute for Analytics, 56% of respondents said in the first week of social distancing in the U.S. that they saw no staffing impact as a result of the pandemic. But by the third week, only 40% of respondents could say the same, and 38% reported "some or substantial" cuts at their companies.

There's also been a dramatic decline in LinkedIn data scientist job postings, Burtch Works reported. Listings went from more than 21,000 postings in late March to fewer than 17,000 a month later -- though the listings rebounded somewhat to 18,000 job postings by mid-April.

Long term, however, there is significant optimism about this sector, so you may want to consider training to become a data scientist or a machine learning engineer, or to improve the skills you already have.
