The nine roles you need on your data science research team

Data science research has become an essential element of modern companies' success in the digital economy. Here's how to build a data science team to get the most bang for your buck.

It's easy to focus too much on building a data science research team loaded with Ph.D.s to do machine learning at the expense of developing other data science skills needed to compete in today's data-driven, digital economy. While high-end, specialty data science skills for machine learning are important, they can also get in the way of a more pragmatic and useful adoption of data science. That's the view of Cassie Kozyrkov, chief decision scientist at Google and a proponent of the democratization of data-based organizational decision-making.

To start, CIOs need to expand their thinking about the types of roles involved in implementing data science programs, Kozyrkov said at the recent Rev Data Science Leaders Summit in San Francisco.

For example, it's important to think about data science research as a specialty role developed to provide intelligence for important business decisions. "If an answer involves one or more important decisions, then you need to bring in the data scientists," said Kozyrkov, who designed Google's analytics program and trained more than 15,000 Google employees in statistics, decision-making and machine learning.

But other tasks related to data analytics, like making informational charts, testing out various algorithms and making better decisions, are best handled by other data science team members with entirely different skill sets.

Data science roles: The nine must-haves

There are a variety of data science research roles for an organization to consider and certain characteristics best suited for each. Most enterprises already have correctly filled several of these data science positions, but most will also have people with the wrong skills or motivations in certain data science roles. This mismatch can slow things down or demotivate others throughout the enterprise, so it's important for CIOs to carefully consider who staffs these roles to get the most from their data science research.

Here is Kozyrkov's rundown of the essential data science roles and the part each plays in helping organizations make more intelligent business decisions.

Data engineers are people who have the skills and ability to get data required for analysis at scale.

Basic analysts could be anyone in the organization with a willingness to explore data and plot relationships using various tools. Kozyrkov suggested it may be hard for data scientists to cede some responsibility for basic analysis to others. But, in the long run, the value of data scientists will grow, as more people throughout the company are already doing basic analytics.

Expert analysts, on the other hand, should be able to search through data sets quickly. You don't want to put a software engineer or very methodical person in this role, because they are too slow.

"The expert software engineer will do something beautiful, but won't look at much of your data sets," Kozyrkov said. You want someone who is sloppy and will run around your data. Caution is warranted in buffering expert analysts from software developers inclined to complain about sloppy -- yet quickly produced -- code.

Statisticians are the spoilsports who will explain how your latest theory does not hold up for 20 different reasons. These people can kill motivation and excitement. But they are also important for coming to conclusions safely for important decisions.

A machine learning engineer is not a researcher who builds algorithms. Instead, these AI-focused computer programmers excel at moving a lot of data sets through a variety of software packages to decide if the output looks promising. The best person for this job is not a perfectionist who would slow things down by looking for the best algorithm.

A good machine learning engineer, in Kozyrkov's view, is someone who doesn't know what they are doing and will try out everything quickly. "The perfectionist needs to have the perfection encouraged out of them," she said.

Too many businesses are trying to staff the team with a bunch of Ph.D. researchers. These folks want to do research, not solve a business problem.
Cassie Kozyrkovchief decision scientist at Google

A data scientist is an expert who is well-trained in statistics and also good at machine learning. They tend to be expensive, so Kozyrkov recommended using them strategically.

A data science manager is a data scientist who wakes up one day and decides he or she wants to do something different to benefit the bottom line. These folks can connect the decision-making side of business with the data science of big data. "If you find one of these, grab them and never let them go," Kozyrkov said.

A qualitative expert is a social scientist who can assess decision-making. This person is good at helping decision-makers set up a problem in a way that can be solved with data science. They tend to have better business management training than some of the other roles.

A data science researcher has the skills to craft customized data science and machine learning algorithms. Data science researchers should not be an early hire. "Too many businesses are trying to staff the team with a bunch of Ph.D. researchers. These folks want to do research, not solve a business problem," Kozyrkov said. "This is a hire you only need in a few cases."

Prioritize data science research projects

For CIOs looking to build their data science research team, develop a strategy for prioritizing and assigning data science projects. (See the aforementioned advice on hiring data science researchers.)

Decisions about what to prioritize should involve front-line business managers, who can decide what data science projects are worth pursuing.

In the long run, some of the most valuable skills lie in learning how to bridge the gap between business decision-makers and other roles. Doing this in a pragmatic way requires training in statistics, neuroscience, psychology, economic management, social sciences and machine learning, Kozyrkov said. 

Dig Deeper on IT applications, infrastructure and operations