How to identify master data in a multi-domain MDM program

In an excerpt from their book on managing multi-domain master data management programs, Mark Allen and Dalton Cervo explain how to identify MDM domains and your master data.

This is an excerpt from Chapter 2 of Multi-Domain Master Data Management: Advanced MDM and Data Governance in Practice, a book by Mark Allen and Dalton Cervo. Allen is manager of enterprise data governance at medical insurer Anthem Inc., which was called WellPoint Inc. before changing its name in December 2014; Cervo is president and founder of Data Gap Consulting, which works with corporate clients on data management initiatives.

This section of the chapter explains the concept of master data management (MDM) domains and discusses how to identify, inventory and analyze master data as part of a multi-domain MDM program aimed at improving data quality and consistency throughout an organization. Later in the chapter, Allen and Cervo look at cross-domain dependencies and how to determine an optimal domain implementation order; they also cover issues that can occur if the implementation order is not executed correctly.

The term master data management domain refers to a specific data domain where identification and control of the master data is focused. Customer, Product, Locations, Finance and Employee have been among the domains most commonly targeted by MDM initiatives. The focus on these domains has evolved from data management practices associated with customer data integration, product integration management, accounts receivable and human resources practices. These data management practices have paved the road for the introduction of MDM as a more common discipline using a data domain-based approach.

Companies will typically begin their MDM focus in one domain area, and then expand to more domains with implementation of a multi-domain program model. In some cases there may be a multi-domain strategy from the start, but usually a single domain like Customer or Product will still be the starting point and set the tone for subsequent domains. When taking this multi-domain leap, it is critical to determine each domain's master data elements and what aspects of MDM planning and execution are repeatable and scalable across the domains. With that in mind, this chapter offers guidance and questions that are important to consider when pursuing a multi-domain model.

Copyright info

Multi-Domain Master Data Management: Advanced MDM and Data Governance in Practice

This excerpt is from the book Multi-Domain Master Data Management: Advanced MDM and Data Governance in Practice, by Mark Allen and Dalton Cervo. Published by Morgan Kaufmann Publishers, Burlington, Mass. ISBN 9780128008355. Copyright 2015, Elsevier BV. To download the full book or other books for 25% off the list price, visit the Elsevier store and use the discount code PBTY15.

Ideally, the domains where master data analysis and MDM practices can be applied will be clearly defined in a company's enterprise architecture, such as in an Enterprise Information Model. However, such an architecture and its models often reflect a target state that is only partially implemented, without firm plans for how other key pieces of the architecture design will be implemented. To be successful, an MDM program needs to lock into and provide value to current-state operations and to areas where enterprise-level or operational initiatives are in progress. Master data exists regardless of how advanced a company is with its enterprise architecture strategies. If an enterprise architecture design cannot provide a firm point of reference for defining MDM domains, the domains can instead be derived from other reference points, such as a review of subject areas in an enterprise data warehouse or the operational model and functional architecture.

Identifying domains

Although certain domains such as Customer, Product, Locations and Employee are the most commonly referenced, the domain types and mix can vary due to a company's industry orientation and business model. They may also be influenced by system architecture, such as if a company has implemented an integrated business suite of applications that has an underlying data architecture and predefined data models.

Here are some industry-oriented examples of how domains are often defined:

  • Manufacturing domains: Customers, Product, Suppliers, Materials, Items, Locations
  • Healthcare domains: Members, Providers, Products, Claims, Clinical, Actuarial
  • Financial services domains: Customers, Accounts, Products, Locations, Actuarial
  • Education domains: Students, Faculty, Locations, Materials, Courses

Identifying master data

Regardless of how the domains are determined, the concept of master data and how this data is identified and managed needs to be consistent across a company's systems and processes. Master data should be clearly defined and distinguished from or related to other types of data, such as reference data and transactional data. Here are definitions for these types of data:

  • Master data: Data representing key data entities critical to a company's operations and analytics because of how it interacts with and provides context to transactional data.
  • Transactional data: Data associated with or resulting from specific business transactions.
  • Reference data: Data typically represented by code set values used to classify or categorize other types of data, such as master data and transactional data.
  • Metadata: Descriptive information about data entities and elements such as the definition, type, structure, lineage, usage, changes and so on.
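
As a rough illustration of how these four types of data relate, the following minimal Python sketch models each one. All names and values here (`Customer`, `Order`, `CUSTOMER_SEGMENTS`, `CUSTOMER_METADATA`, the sample records) are hypothetical and are not taken from the book; they only show a transaction drawing its context from master data, which is in turn classified by reference data.

```python
from dataclasses import dataclass

# Reference data: a code set used to classify other data (hypothetical values).
CUSTOMER_SEGMENTS = {"RET": "Retail", "WHS": "Wholesale"}


@dataclass(frozen=True)
class Customer:
    """Master data: a key business entity shared across systems."""
    customer_id: str
    name: str
    segment_code: str  # qualified by the CUSTOMER_SEGMENTS code set


@dataclass(frozen=True)
class Order:
    """Transactional data: the result of a specific business transaction."""
    order_id: str
    customer_id: str  # points at the master data record
    amount: float


# Metadata: descriptive information about a data element (hypothetical entry).
CUSTOMER_METADATA = {
    "segment_code": {
        "definition": "Market segment assigned to the customer",
        "type": "str",
        "allowed_values": sorted(CUSTOMER_SEGMENTS),
    }
}


def describe_order(order, customers):
    """Give a transaction its context by looking up master and reference data."""
    customer = customers[order.customer_id]
    segment = CUSTOMER_SEGMENTS[customer.segment_code]
    return f"Order {order.order_id}: {order.amount:.2f} from {customer.name} ({segment})"


customers = {"C1": Customer("C1", "Acme Ltd", "WHS")}
print(describe_order(Order("O1", "C1", 250.0), customers))
```

The order record carries only an identifier and an amount; who the customer is and how it is classified come from the master and reference data.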

While each of these types of data will be used together for operational and analytical purposes, and all may be in the scope of a data governance charter, the source control and quality management of the master data will have different priorities, requirements, challenges and practices than will the other data types. MDM is the application of discipline and control over master data to achieve a consistent, trusted, and shared representation of the master data. Therefore, reference data and metadata associated with a master data element should also be included in the MDM scope if any control or consistency problems with the reference data or metadata will affect the integrity of the master data element. For example, many code sets act as reference data to qualify a master data element or provide a list of values expected to be populated in the master data field. In such cases, the reference data associated with the master data element should be in the MDM scope.
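
To make the code set example concrete, here is a small Python sketch of checking a master data field against its reference data. The function name `find_invalid_codes`, the `country_code` field and the sample records are assumptions made for illustration only; a real MDM platform would apply such rules through its own data quality tooling.

```python
def find_invalid_codes(records, field, code_set):
    """Return (record id, value) pairs whose field value is not in the code set."""
    return [
        (record["id"], record.get(field))
        for record in records
        if record.get(field) not in code_set
    ]


# Hypothetical reference data: the list of values expected in the field.
COUNTRY_CODES = {"US", "CA", "BR"}

# Hypothetical master data records for a Customer domain.
customer_records = [
    {"id": "C1", "name": "Acme Ltd", "country_code": "US"},
    {"id": "C2", "name": "Globex", "country_code": "USA"},  # not in the code set
    {"id": "C3", "name": "Initech"},  # value missing
]

invalid = find_invalid_codes(customer_records, "country_code", COUNTRY_CODES)
print(invalid)
```

If the code set itself is uncontrolled, for instance when one system uses two-letter codes and another uses three-letter codes, this kind of check will flag otherwise valid master records, which is why the reference data belongs in the MDM scope.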

The MDM and data governance programs work together to focus on managing and controlling the elements, definitions and business processes that influence the creation and change of master data. Clearly recognizing and defining these elements, definitions and processes is perhaps the most challenging and foundational task within an MDM program. MDM and data governance efforts can be and often are initiated with objectives to pull the business entities and definitions together, but this is a much more complicated process that requires attention from many resources to ensure a coordinated approach. If the MDM Program Management Office (PMO) and data governance are not prepared with sufficient resources and support to help pull this information together and coordinate the analysis process, the progress and maturity of the MDM program will be impeded until this work can be completed. The MDM PMO scope and its relationship with data governance are discussed in more detail in Chapter 4.

To fully evaluate the master data and the master data characteristics within a domain, the following artifacts should be inventoried, gathered and reviewed for each domain:

  • Data models: Conceptual, logical and physical models that organize and document business concepts, data entities and data elements and the relationships between them
  • Data dictionary: A listing of the data elements, definitions and other metadata information associated with a data model
  • Functional architecture: Depicts how systems and processes interact with each other within a functional scope
  • Source to target mapping: Describes the data element mapping between a target system and source system
  • Data lifecycle: Depicts the flow of data across application and process areas from data creation to retirement
  • CRUD analysis: Indicates where permissions to create, read, update and delete have been assigned to various groups for certain types of data
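
As one example of how such an artifact might be captured for review, the sketch below represents a CRUD analysis as a simple matrix in Python and lists which groups hold a given permission on each entity. The group names, entities and the `groups_with_permission` helper are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical CRUD matrix: (group, entity) -> permissions granted,
# where C = create, R = read, U = update, D = delete.
crud_matrix = {
    ("Sales", "Customer"): "CRU",
    ("Billing", "Customer"): "RU",
    ("Support", "Customer"): "R",
    ("Procurement", "Supplier"): "CRUD",
}


def groups_with_permission(matrix, permission):
    """Map each entity to the sorted list of groups holding the given permission."""
    result = defaultdict(list)
    for (group, entity), permissions in matrix.items():
        if permission in permissions:
            result[entity].append(group)
    return {entity: sorted(groups) for entity, groups in result.items()}


# Which groups can change each entity? Several updaters on one master
# entity may point to a control question worth reviewing.
print(groups_with_permission(crud_matrix, "U"))
```

A view like this makes it easier to see where create and update rights on a master data entity are spread across several groups.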

All of these artifacts are extremely valuable for evaluating the scope, consistency and use of master data. Unfortunately, not all these artifacts are likely to be available or in a complete form for each data domain in scope. The data governance or MDM program management office should consider opportunities to assist with the initiation or completion of any of these artifacts where needed.

These artifacts should also be the basis for defining key metrics that demonstrate the value and progress of MDM practices in driving the alignment and consistency of master data across these areas and artifacts. For example, an initial analysis of these types of artifacts and data assets is likely to reveal many gaps or conflicts with master data definitions, alignment, lineage, usage and control. From this type of assessment, current state baselines can be determined, leading to quality improvement objectives that can be implemented and tracked as ongoing quality metrics for each domain. This type of data asset inventory and analysis by domain should also be leveraged to help scope the data governance, data quality management and metadata management practices needed for the MDM plan and approach for each domain.
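
A current state baseline of this kind could be derived in many ways. The following Python sketch shows one simple possibility, assuming each system's data dictionary is available as a mapping of element names to definitions. The system names, definitions and the `conflict_baseline` function are hypothetical, and the metric (the share of shared elements with conflicting definitions) is only one of many a program might choose.

```python
def shared_elements(dictionaries):
    """Collect normalized definitions for elements that appear in more than one system."""
    by_element = {}
    for system, elements in dictionaries.items():
        for element, definition in elements.items():
            by_element.setdefault(element, {})[system] = definition.strip().lower()
    return {name: defs for name, defs in by_element.items() if len(defs) > 1}


def conflict_baseline(dictionaries):
    """Return the conflicting element names and their share of all shared elements."""
    shared = shared_elements(dictionaries)
    conflicts = sorted(
        name for name, defs in shared.items() if len(set(defs.values())) > 1
    )
    rate = len(conflicts) / len(shared) if shared else 0.0
    return conflicts, rate


# Hypothetical data dictionaries from two systems in the Customer domain.
data_dictionaries = {
    "CRM": {
        "customer_name": "Legal name of the customer",
        "status": "Active or inactive flag",
    },
    "ERP": {
        "customer_name": "Legal name of the customer",
        "status": "Credit hold status",
    },
}

conflicts, rate = conflict_baseline(data_dictionaries)
print(conflicts, rate)
```

Recomputing the same figure after each round of definition alignment gives a trackable metric for the domain.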
