What key roles should a data management team include?

These 10 roles, with different responsibilities, are commonly a part of the data management teams that organizations rely on to make sure their data is ready to use.

Regardless of company size or industry, there's one common attribute that separates high-performance organizations from their less successful counterparts: high-quality decision-making. More and more, that's driven by data. From front-line operational units to the senior executive team, data analysis helps remove guesswork from the decision-making process at all levels of an organization.

But raw data, on its own, provides limited value. The value of data to decision-making increases substantially when organizations transform it into information that can be easily visualized and interpreted. The data itself also must be high-quality -- clean, consistent and correct. The responsibility for ensuring that data can be turned into actionable insights falls to the data management team.

Why is an effective data management team important for businesses?

Data management is a practice that focuses on the effective administration of an organization's data assets. From data creation and collection to enabling its effective use throughout the enterprise, managing data requires a wide range of roles and responsibilities. In smaller organizations, a single IT professional often assumes multiple roles. But as data management workloads grow, organizations commonly distribute those activities to a team of people.

Without a skilled data management team, an organization can find itself with low-quality data that hampers strategic planning, business operations and the BI, reporting and data science applications that help inform decision-making. For example, errors, inconsistencies and other data issues might skew operational actions and analytics results. Separate data silos hide relevant data from users in different departments. Ultimately, getting the expected business value from data sets becomes a big challenge.

Goals and overall duties of a data management team

A data management team's main goal is to make sure enterprise data assets meet business requirements and the information needs of users. To achieve that, the team works to make data available and accessible for the planned uses and ensure that it's accurate, reliable and properly organized. The team also combines data sets from different systems to give users a full view of operations, customers, financial performance and other areas of business interest.

At a high level, the duties of a data management team include the following:

  • administration of databases, data warehouses and other data repositories;
  • development of a data architecture to document data assets and map data flows;
  • data modeling to create diagrams of data structures and associated business rules;
  • data quality management to identify and fix issues in data sets;
  • data integration work to pull together and consolidate different data sets; and
  • data engineering, which includes building data pipelines between source and target systems.

Many data management teams also do data analysis themselves, handling both BI and advanced analytics applications instead of having those functions be part of separate analytics teams.

10 roles that data management teams include

Now, let's look at the most common data management roles and their responsibilities. It's important to note, though, that the size of the organization and the amount of data it collects and stores will affect how data management responsibilities are assigned to different personnel.

In addition, the technologies the IT department deploys for an organization will affect the specific activities that some of these roles involve. Nonetheless, although the work done by the members of a data management team might differ based on the platform being used, the core responsibilities outlined here are common across all technologies and products.

1. Database administrator

As the job title indicates, database administrators (DBAs) are responsible for managing the organization's databases. Regardless of the specific database management system (DBMS) platform an administrator supports, the core DBA responsibilities are the same: system availability and database performance, security, monitoring and recovery.

In addition to their back-end support duties, DBAs ensure that data is efficiently organized and stored. For databases that require a predefined schema, they work with application developers and data administrators to design and create the physical objects for storing the data; they also create index structures and logical objects to provide efficient database access. Other tasks for a DBA include creating user accounts and assigning access privileges, as well as helping developers and end users debug and optimize applications.

As DBA workloads increase, organizations often split the functions into separate system and application support roles. System DBAs are responsible for supporting the back-end DBMS infrastructure, while their application DBA counterparts work with data and objects inside the platform and assist developers and users. Data warehouse DBA is another specialized role.

The key responsibilities of a DBA include the following:

  • Take administrative ownership of the organization's back-end database systems.
  • Organize and maintain database data throughout its lifecycle.
  • Help application developers and end users with debugging and data access.
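
As a rough illustration of the physical design work described above, the following minimal sketch uses Python's built-in sqlite3 module to create a table, add an index and check that a lookup actually uses it. The orders table, its columns and the idx_orders_customer index are hypothetical names chosen for this example, and the commands in a production DBMS will differ.

```python
import sqlite3


def build_orders_db() -> sqlite3.Connection:
    """Create a hypothetical orders table with an index on customer_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL,"
        " amount REAL NOT NULL)"
    )
    # Sample data: 1,000 orders spread evenly across 50 customers.
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 50, float(i)) for i in range(1000)],
    )
    # The index lets lookups by customer avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    conn.commit()
    return conn


def query_plan(conn: sqlite3.Connection, customer_id: int) -> str:
    """Return SQLite's description of how it would run the lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT amount FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[3] for row in rows)


if __name__ == "__main__":
    db = build_orders_db()
    print(query_plan(db, 7))
```

Checking the query plan is one simple way to confirm that an index is doing its job before an application depends on it.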

2. Data administrator

Data administrators view data from both a business and technical perspective and interact with end users and developers to define, describe, organize and categorize data sets. For example, a standard practice of a data administrator is to interview business users to identify the data they interact with and understand how it applies to the organization.

With that information, data administrators build data models that are based on the business meaning of data sets, the relationships between different data elements and the business rules that govern the data. The modeling creates the framework that enables different business units to share timely and accurate data. Data administrators also develop naming conventions, storage definitions and business rule constraints for the data elements.

In organizations that don't have a data governance team, a data administrator will often take the lead in developing and implementing governance policies, procedures and best practices designed to ensure the security, quality and proper use of data.

The key responsibilities of a data administrator include the following:

  • Help business units to define, organize and categorize data and document associated business rules.
  • Create and maintain conceptual, logical and physical data models.
  • Establish and maintain data management and governance best practices.

3. Data modeler

Although data administrators often take responsibility for data modeling, many businesses -- especially larger ones -- create a specific role for that activity: the data modeler. In such cases, data modelers handle the process of building the conceptual, logical and physical models that become the foundation for the organization's data stores.

It's an important role: The quality of all subsequent development and management processes that interact with the modeled data elements depends directly on the quality of the initial logical design and physical implementation. In addition, a data modeler updates and revises models on an ongoing basis to keep them current as data sets and business needs change in an organization.

The key responsibilities of a data modeler include the following:

  • Work with data management and business users to gather the information required to design and document data models.
  • Use modeling tools to build conceptual, logical and physical data models.
  • Develop best practices to ensure the ongoing consistency of data models and verify their relationship to ever-changing business operations.

4. Data architect

Data architect is a senior-level position that typically works across an enterprise. Data architects often have advanced technology degrees and possess a strong understanding of the business. They're responsible for developing a data architecture, which is the blueprint for an organization's data management framework. Many also help select data platforms and systems that best meet the business and technical needs of applications.

As a result, data architects need to stay abreast of both current and emerging technologies. In addition to their architectural skills, data architects are often adept at system and database design and data modeling, and they often have a strong understanding of the systems development life cycle model and project management best practices.

The key responsibilities of a data architect include the following:

  • Develop and maintain the architectural blueprint for an organization's data management framework at an enterprise level.
  • Stay abreast of the latest data management, storage and processing technologies.
  • Help business users and IT personnel select and implement systems.

5. ETL developer

Integrating data sets, converting them from the source data type to a different target one, cleansing the data and applying business rules to standardize it can be a complex process. As a result, many organizations create a separate data integration role that focuses on extract, transform and load (ETL) processing. ETL developer is the most common job title, but data integration developer and data integration engineer are also used in some cases.

This role often involves loading data from source systems into a data warehouse; it also can include integrating different data sources for operational applications. The code used to cleanse and standardize data can range from simple, parameterized processes built into ETL tools to complex programming logic. Data transformation requires knowledge of programming languages like Python, Perl and SQL, and ETL developers must also have a strong understanding of data analysis and data access techniques. Knowledge of other data integration methods besides ETL might be required, too.

The key responsibilities of an ETL developer include the following:

  • Use ETL tools to extract and process data from source systems and load it into target ones.
  • Do data cleansing and validation to ensure that data quality levels meet requirements.
  • Create documentation of ETL and other data integration processes and update it as needed.

6. Data quality analyst

Businesses are generating more data than ever before. As a result, establishing and maintaining high levels of data quality is a constant challenge for many organizations. Data quality analysts are responsible for identifying errors, anomalies and other defects that compromise the quality of data and, ultimately, its business value to the organization.

They evaluate data sets on various dimensions of data quality, such as accuracy, completeness, consistency, conformity and lack of duplicate data. Once data issues are identified, a data quality analyst works to resolve them in order to improve data reliability and ensure that end users have access to trusted data. This role often also involves tracking data quality metrics and educating users on best practices to help prevent quality problems upfront.

The key responsibilities of a data quality analyst include the following:

  • Identify data issues and determine their severity and the scope of their business impact to aid in planning data quality improvement initiatives.
  • Take the necessary steps to fix incorrect data values and other issues and to address their root cause.
  • Establish data quality guidelines and best practices for end users to minimize future problems.
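
As one way to picture how data sets are scored on dimensions such as completeness, conformity and duplication, the sketch below profiles a small list of records. The profile function, the field names and the simplified email pattern are hypothetical and only illustrate the idea; data quality tools apply much richer rule sets.

```python
import re
from collections import Counter

# Deliberately simple pattern for illustration, not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(records, key, required, email_field="email"):
    """Compute basic data quality metrics for a list of dict records."""
    total = len(records)
    # Completeness: share of records with every required field populated.
    complete = sum(
        1 for r in records if all(r.get(f) not in (None, "") for f in required)
    )
    # Duplication: count extra records that share the same key value.
    key_counts = Counter(r.get(key) for r in records)
    duplicates = sum(count - 1 for count in key_counts.values() if count > 1)
    # Conformity: share of populated email values that match the pattern.
    emails = [r[email_field] for r in records if r.get(email_field)]
    conforming = sum(1 for e in emails if EMAIL_PATTERN.match(e))
    return {
        "completeness": complete / total if total else 1.0,
        "duplicate_rows": duplicates,
        "email_conformity": conforming / len(emails) if emails else 1.0,
    }


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "a@example.com", "name": "Ann"},
        {"id": 2, "email": "not-an-email", "name": "Ben"},
        {"id": 2, "email": "b@example.com", "name": "Ben"},
        {"id": 3, "email": "", "name": "Cy"},
    ]
    print(profile(sample, key="id", required=("id", "email", "name")))
```

Metrics like these can be tracked over time to show whether data quality improvement efforts are working.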

7. Data engineer

Data engineers are the data transfer and storage experts of the IT profession. They're responsible for moving data into analytical data stores and preparing it for use by application developers, data scientists and other analysts. To do so, data engineers use software that ranges from data pipeline products focused on moving data from one system to another to more advanced ETL tools.

A data engineer must have a strong understanding of data warehouses and big data platforms and of how they store data. In addition, the job requires a working knowledge of the APIs that enable access to big data systems and of data administration and analysis best practices. Because data engineers are responsible for pipelines that move large volumes of data across the enterprise, they also need expertise in performance monitoring and troubleshooting.

The key responsibilities of a data engineer include the following:

  • Build data pipelines that retrieve data from source systems, standardize and transform it as needed and load the data into target systems.
  • Evaluate, implement, use and administer data pipeline and ETL tools.
  • Monitor the performance and quality of data transfers between source and target systems.

8. Data scientist

Data scientists are data analytics experts -- highly sought-after, senior-level professionals who usually have master's or doctoral degrees in data science, statistics, mathematics or computer science. The technologies that data scientists use include statistical analysis, predictive analytics, AI, machine learning and deep learning tools, plus programming languages that include Python, R, Scala, SQL and Julia.

In addition to analyzing data to improve current business operations, data scientists develop predictive models to forecast future trends and answer what-if questions. To help business users understand the data insights uncovered by their analytics efforts, most data scientists also use data visualization tools to create graphics, reports and dashboards that present the findings in easy-to-digest formats.

The key responsibilities of a data scientist include the following:

  • Develop analytical and statistical models to analyze data and improve them as needed.
  • Use machine learning and other advanced analytics technologies to uncover hidden insights in data and predict customer behavior and future business trends.
  • Build reports, dashboards and data visualizations to present the insights to business executives and other data consumers.

9. Data analyst

Although this role is sometimes viewed as an entry-level position in the field of data management, a data analyst is often the go-to team member when assistance is needed to turn raw data into meaningful insights. Like data scientists, data analysts cleanse, organize and analyze data sets and build dashboards and reports to help business users understand the results of analytics applications.

In some cases, data analysts work on their own; in others, they support and are overseen by data scientists. They typically aren't as experienced or technically skilled as data scientists. To be effective, though, a data analyst needs to have a strong knowledge of statistics and be able to quickly understand how data applies to complex business operations, in addition to being proficient in using analytics and data visualization tools and programming languages such as Python and R.

The key responsibilities of a data analyst include the following:

  • Model, cleanse, organize and categorize data sets for use in analytics applications.
  • Analyze data to find actionable insights for operational and strategic decision-making.
  • Create reports, dashboards and data visualizations to help business users interpret analytics results.

10. Business intelligence analyst

A business intelligence analyst, or BI analyst for short, focuses more on data analysis and visualization than initial data design and analytical modeling. BI analysts typically work with structured data stored in data marts and data warehouses, as opposed to the more varied data lakes that data scientists and data analysts use. In general, they also run less-complex queries to track business KPIs, customer buying habits, plant-floor production and other operational issues.

Nevertheless, BI analysts often access and aggregate large volumes of data to identify patterns that can help optimize operations and influence business strategies. To effectively make data useful in the decision-making process, they need to understand its business meaning. In addition to their own data analysis work, BI analysts sometimes also help train and support users of self-service BI tools in business units.

The key responsibilities of a BI analyst include the following:

  • Work with business managers to identify data assets that can help them improve current operations and forecast future business needs and trends.
  • Analyze the data to find information that will help drive high-quality business decisions.
  • Build dashboards, reports and data visualizations to communicate findings to business users.

Managing a data management team

In practice, how the roles outlined above are grouped into teams will vary widely. The actual makeup of a data management team depends on the organization's preference. The same goes for the management layers and reporting structure set up to oversee the team.

Company size can be a factor. In midsize organizations, you'll often find DBAs, data administrators and data modelers on the same team, which reports to a midlevel data manager. In larger organizations, these three roles might be separate teams, with each having its own manager; all three report to the data manager.

Data engineers, data scientists and data analysts are often assigned to a specific team that focuses on supporting and using a data lake for advanced analytics. In some shops, data warehouse DBAs, ETL developers and BI analysts might also be assigned to that group. In others, they can be part of a separate BI team. Data quality analysts, meanwhile, might belong to a data governance team that has its own reporting structure rather than to the data management team.

On IT organizational charts, data management teams often report to senior IT or data managers, who in large organizations might be responsible for overseeing multiple teams. The senior managers, in turn, likely report directly to the CIO or the chief data officer, as opposed to the CTO, who focuses more on IT innovation and the implementation of new technologies.
