Data Governance series Pt.3 - Metadata as path to your Data Zen

In the first two articles of this series, I introduced what data governance is and why it is an organizational change rather than a technical one. In this article, I will focus more on some of the data involved in data governance processes. And by that I do not mean the data itself that sits in your ERP or CRM database, but rather the data that facilitates data governance. To differentiate between the data that facilitates data governance and the data that is to be governed, I will from now on refer to the facilitating data as metadata.
The sections that follow describe the different types of metadata you will encounter when implementing Data Governance. They follow the DAMA DMBOK terminology to make it easier for you to research them if you want to dig deeper. We will walk through the Core Knowledge Areas of the DAMA DMBOK (11 in total) and describe how they interact with Data Governance.
1. Data Architecture
The focus of this knowledge area lies on the design and maintenance of data structures and frameworks that support business strategy.
Why it matters: To understand the governance needs of a data object, you need to understand the legal requirements as well as the business requirements. This means that the “why” of storing a data object is important, and the “how” should always support the “why”.
This is where data governance typically requires input from traditional enterprise architecture, where the modeling of the business side of things is done. It is rooted deeply in the conceptual side of working with data and provides the foundation for other building blocks.
2. Data Modeling and Design
Here, representations of the data are developed to support business operations and analytics. This means creating the data models, which is of course closely linked to data architecture, where the conceptual models are developed.
Why it matters: To govern your data, you need to understand, on a logical layer, where it is saved and why. This differs from data architecture insofar as it describes the “how” of data storage, not the “why”.
Architecture sets the constraints and standards that modeling follows (naming, canonical reference models, security patterns). Modeling feeds back practical issues (performance, unmodelled cases) that can refine architecture. Architecture delivers the guardrails and vision. Modeling is concerned with the detailed engineering within those guardrails. It is rooted in the Logical/Modeling side of working with data.
3. Data Storage and Operations
By now you have delivered the guardrails of why and how you want to store data and derived a model that describes the structures to use for storing it. Finally, it is time to turn to the technology for storing data.
Why it matters: Describing the “why” and “how” does not define the “where” of data management. And if you want to enforce certain security checks and data quality metrics, the location of your data changes the processes and your ability to enforce certain policies.
This is where your data governance leaves the theoretical space and becomes something that has an actual effect on your data. If you want data governance to be tangible in your organization, you will have to include the technical layer in its considerations.
4. Data Security
Data security is an area that spans the conceptual, the logical and the technical side of data handling, and security issues are often one of the drivers for implementing data governance in an organization.
Why it matters: When governing your data, security considerations should of course be part of any action you take. If your data governance activities breach security, you will be worse off than without governance, and if they do not help you enforce security measures, your efforts will yield little benefit.
Data security on the conceptual layer means considering the regulations and standards that you need to adhere to, your guardrails when working with data. You will then need to translate them into security concepts (making this a building block in modeling) that can be implemented later (on the technical side).
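To make the step from regulation to security concept to implementation a little more tangible, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the classification levels, the column names and the masking rule are hypothetical and not taken from DAMA DMBOK or any specific product.

```python
from enum import Enum


class Classification(Enum):
    """Hypothetical sensitivity levels defined on the modeling layer."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical column tags, derived from the conceptual layer
# (for example from a privacy regulation you have to adhere to).
COLUMN_CLASSIFICATION = {
    "customer.name": Classification.INTERNAL,
    "customer.iban": Classification.CONFIDENTIAL,
    "product.title": Classification.PUBLIC,
}


def mask(value):
    """Hide all but the last four characters; short values are hidden fully."""
    text = str(value)
    if len(text) <= 4:
        return "*" * len(text)
    return "*" * (len(text) - 4) + text[-4:]


def read_column(column, value, clearance):
    """Return the value unchanged only if the reader's clearance suffices."""
    # Columns without a tag are treated as confidential (deny by default).
    required = COLUMN_CLASSIFICATION.get(column, Classification.CONFIDENTIAL)
    if clearance.value >= required.value:
        return value
    return mask(value)


masked = read_column("customer.iban", "DE89370400440532013000", Classification.INTERNAL)
plain = read_column("product.title", "Widget", Classification.PUBLIC)
```

The classification tags are metadata: they live next to the data, not in it, and the technical layer only enforces what the modeling layer has defined.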
5. Data Integration and Interoperability
If data is strictly siloed, governance needs differ. In modern enterprises, however, operations tend to be more and more integrated and automated. Enabling that integration is the key feature of this knowledge area.
Why it matters: If you cannot include data integration in your data governance considerations, you will not know what data is exchanged where and how. This means you will always have a blind spot on your data map. That blind spot is a risk for your governance undertakings, because anything from diluting your data to leaking it might happen there, putting your whole data landscape at risk.
This is typically a block that can be subdivided into the modeling and the technical fields. A good definition of your APIs and interfaces should be part of your application architecture, placing it in the modeling area, whereas building the interfaces and ensuring they actually match these models is part of the technical area.
6. Reference and Master Data Management
To ensure consistency in your data entities across business units, Master Data Management (MDM) and Reference Data Management provide a “single source of truth” for the most important business information.
Why it matters: In terms of data governance, Master Data Management can be seen as a measure taken to improve data quality and make sure data is consistent. Improving data is an important result of any data governance effort, so reference and master data management is a building block that should be part of any implementation.
For data governance, this block can be seen as part of the conceptual area, as it provides the blueprint data for the processes that improve the data.
7. Data Warehousing and Business Intelligence
From a data governance point of view, this knowledge area is similar to master data management and can be seen as a means of making the governance efforts and the data quality visible to all users. The data warehouse is a good place to obtain and report an overview of the data estate of an enterprise, as it collects many data points and knows about their quality issues.
Why it matters: Nobody likes to reinvent the wheel. Most enterprises today already have a means for collecting data and delivering reports into the organization, so it makes sense to use the existing data warehouse for reporting on data quality rather than duplicating data and data access just for that purpose.
This knowledge area is also located on the technical side from a data governance perspective, whereas, of course, the data modeling for the data warehouse is part of the modeling area and the business requirements for the data quality reports belong to the conceptual layer.
8. Metadata Management
This area of knowledge deals with managing “data about data”. This is typically done to improve the understanding and accessibility of data. In the context of data governance, however, it is also one of the key benefits that a data governance initiative provides to an enterprise.
Why it matters: Understanding what data means, where it comes from and how it is transformed is central to the organization’s ability to effectively govern its data. Only data that is understood by all stakeholders can be dealt with as needed.
This knowledge area covers all three aspects. It translates the business terminology and links it to data concepts, rooting it in the conceptual field (as Information Modeling). It provides “translations” of business terms into the different fields of an enterprise, explaining how the different uses of a term are connected (as Data Dictionary). And it provides technical information on how the data passes through the different systems and processes (also known as Data Lineage). This is why a lot of the data governance processes are closely related to metadata management.
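The data dictionary and the data lineage described above can be sketched as two small lookup structures. This is a minimal illustration in Python, not the design of any real metadata tool: the class names, the system names (CRM, ERP, DWH, BI) and the field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """One technical occurrence of a business term (hypothetical structure)."""
    system: str
    table: str
    column: str


@dataclass
class MetadataCatalog:
    # Data dictionary: business term -> technical fields that implement it.
    dictionary: Dict[str, List[DictionaryEntry]] = field(default_factory=dict)
    # Data lineage: upstream field -> fields that are directly fed by it.
    lineage: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, term, entry):
        self.dictionary.setdefault(term, []).append(entry)

    def add_flow(self, source, target):
        self.lineage.setdefault(source, []).append(target)

    def downstream(self, source):
        """Return every field reachable from source, in discovery order."""
        seen = []
        stack = [source]
        while stack:
            current = stack.pop()
            for target in self.lineage.get(current, []):
                if target not in seen:
                    seen.append(target)
                    stack.append(target)
        return seen


catalog = MetadataCatalog()
catalog.register("Customer", DictionaryEntry("CRM", "accounts", "account_id"))
catalog.register("Customer", DictionaryEntry("ERP", "debtors", "debtor_no"))
catalog.add_flow("CRM.accounts.account_id", "DWH.dim_customer.customer_key")
catalog.add_flow("DWH.dim_customer.customer_key", "BI.sales_report.customer")

affected = catalog.downstream("CRM.accounts.account_id")
```

With this in place, the question "which reports are affected if the CRM account ID changes?" becomes a simple traversal of the lineage, and the dictionary shows that "Customer" means an account in one system and a debtor in another.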
9. Data Quality Management
This area of knowledge ensures that data is accurate, complete, consistent and accessible in a timely manner. Improving data quality is usually one of the key drivers of data governance undertakings. It is a long-term goal that is closely related to master data management and to data warehousing and BI.
Why it matters: If your data is an important asset to your organization, its quality does not only matter, it is the central point when working with your data. Bad data will lead to bad data-driven processes, to bad data-driven decisions and, in the end, to a loss of business.
We could locate this in all areas if we wanted to. I would, however, like to locate it mainly in the modeling and technical columns. From a data governance perspective, data quality mainly means defining the quality dimensions your data should adhere to (hence modeling) and improving quality along these dimensions (hence technical).
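As a small illustration of "defining dimensions" versus "measuring along them", here is a sketch in Python. It assumes two common dimensions, completeness and uniqueness. The sample records and the threshold values are made up for this example.

```python
def completeness(records, column):
    """Share of records with a non-empty value in the column (0.0 to 1.0)."""
    if not records:
        return 1.0
    filled = sum(1 for r in records if r.get(column) not in (None, ""))
    return filled / len(records)


def uniqueness(records, column):
    """Share of the non-empty values in the column that are distinct."""
    values = [r[column] for r in records if r.get(column) not in (None, "")]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


# Modeling side: the dimensions and the minimum score expected for each.
# These thresholds are hypothetical.
THRESHOLDS = {"completeness": 0.9, "uniqueness": 1.0}


def failed_dimensions(records, column, thresholds):
    """Technical side: measure the data and list the dimensions it fails."""
    measured = {
        "completeness": completeness(records, column),
        "uniqueness": uniqueness(records, column),
    }
    return sorted(name for name, minimum in thresholds.items()
                  if measured[name] < minimum)


customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@example.com"},
    {"id": 4, "email": "b@example.com"},
]

email_issues = failed_dimensions(customers, "email", THRESHOLDS)
id_issues = failed_dimensions(customers, "id", THRESHOLDS)
```

In this sample, the email column misses both thresholds (one value is empty and one is duplicated), while the id column passes. The thresholds are the part that governance defines, and the measurement is the part that runs against the actual data.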
10. Documents and Content Management
This knowledge area deals primarily with unstructured data. For Data Governance, we do not differentiate between structured and unstructured data, so I decided to leave this block out of my considerations.
Putting it together
If you paid attention (or looked at the Areas of Knowledge of DAMA DMBOK), you will have noticed that I deliberately left out one area of knowledge: DATA GOVERNANCE.
This was intentional: I have now described how the other ten areas interact with each other to form a unified approach to data management. When those capabilities are integrated and maintained, Data Governance (the eleventh area) emerges as the organizational outcome that enables the roles defined by governance to manage data as a valuable asset.
And why it matters should no longer be in doubt in part three of a series on data governance.

If you are able to provide and maintain all these building blocks, you will find harmony in your applications and development. The business metadata will increase understanding throughout your enterprise and spare everyone many questions, making working with data more fulfilling. The technical metadata will help users find what they seek and use it properly. The operational metadata will help everyone gain deeper insight into the processes that use the data, and the structural metadata will enable them to venture deeper into understanding and uncover new truths. Finally, the reference and master data metadata will help your users find a common language and enable harmony between people with different backgrounds. All in all, you will find balance and harmony when working with your data. You will have found your data zen.
Of course this is painting an overly optimistic and simplified image of working with data after introducing Data Governance. Not all will be golden just because of a new data management method and there will still be misunderstandings and data issues. What I painted here (maybe slightly sarcastically) is the “happy path” of an enterprise with Data Governance. The billboard image of Data Governance, not a realistic view of what will happen and how painful these processes will be… Yet, sometimes it is nice to dream of a better and happier world…
In the final article of this series I will talk about what you will need to change in your organization and how to get started with data governance.
Bonus
As the saying goes, a picture is worth a thousand words. So I tried to create an image showing the building blocks discussed in this article and the types of data they exchange. This picture quickly became overloaded, so I decided to create an interactive diagram containing this information instead. You can find it embedded below or in its full beauty here. And if you are interested in finding out how I created it, check out this article, where you will also find a link to the source code on GitHub (it’s open source, of course).
Photos:
- Data-Head: Photo by cottonbro studio
- Zen: Photo by ClickerHappy
A German version of this post can be found on the virtual7 Blog



