The world of data management and analytics has come a long way since 1970, the year IBM mathematician Edgar F. Codd introduced his “relational database” framework. A precursor to modern data lakes and other data management systems, it organized information into tables of rows and columns rather than hierarchical structures, making data easier to access for people beyond specialist programmers.

In the years since then, data has become the lifeblood of business. From life sciences to BFSI to space exploration, data is powering innovation across all industries and markets.

The COVID-19 pandemic has raised the importance and value of enterprise data to unprecedented heights. According to an October 2020 study by Teradata, 91% of global business leaders surveyed said the importance of data within their organizations has “skyrocketed” since the onset of COVID-19. In addition, 88% of executives view data as a strategic asset to their business, while 94% agree data is an essential asset and, more importantly, key to recovery and the path forward.

However, data-driven insights and technologies are only as accurate as the data going into them. As the old adage goes, “bad data in means bad data out.” Ensuring data accuracy across the data management and analytics lifecycle is not just about delivering meaningful business insights (though that is very important), but also about building trust with customers, employees and other stakeholders.

Siloed applications focused on specific tasks are increasingly becoming a thing of the past. Forward-thinking organizations are now looking to integrate data flows across systems and lines of business. Because next-generation technologies such as machine learning (ML), artificial intelligence (AI) and predictive analytics run on data, ensuring data accuracy and quality is paramount to implementing these tools successfully.

In fact, according to the same Teradata study, 77% of global business leaders say that their organizations are more focused on data accuracy than ever before.

Given the importance of enterprise data to an organization’s current and future success, it should no longer fall solely on the shoulders of data scientists and analytics teams to “handle” data. As data is a shared organizational asset, everyone across the enterprise should be responsible for ensuring data is properly collected, stored and used.

For all these reasons and more, data governance is now taking center stage as the cornerstone of enterprise data strategy and a top strategic initiative for organizations all around the globe.

What is Data Governance?

The Data Governance Institute defines data governance as “a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”

In other words, data governance refers to the people, processes, and technologies involved with data acquisition, archiving and usage.

While “data management” is a technical discipline concerned with controlling and organizing data, “data governance” is, essentially, a business strategy for data.

As Gregory Vial, assistant professor of IT at HEC Montréal, wrote in a recent article for the MIT Sloan Management Review, data governance should be “a bridge that translates a strategic vision acknowledging the importance of data for the organization and codifying it into practices and guidelines that support operations, ensuring that products and services are delivered to customers.”

Generally speaking, the goals of data governance are to:

  • Define what constitutes data

  • Establish internal rules for data usage

  • Maximize data accuracy and usability by standardizing data systems, policies, procedures and data standards

  • Define roles and assign accountability to employees responsible for data assets throughout their lifecycle

  • Protect data from external and internal threats through access management

  • Maintain regulatory compliance

  • Streamline and strengthen data-related training efforts

  • Implement improved monitoring and tracking mechanisms for data quality and other data-related activities

  • Promote data literacy and a shared understanding of data as an asset

What is a Data Governance Framework?

Essentially a how-to guide for your data governance efforts, a “Data Governance Framework” spells out how an organization sets up and enforces data governance. In other words, it formally documents all data-related policies and procedures.

Pillars of Data Governance

According to DAMA International, data governance frameworks should address:

Data Quality

One of the primary deliverables of data governance, perhaps the most important, is data quality. By controlling how data within an organization is collected, stored, processed and managed, data governance helps ensure data is accurate, complete, timely, and consistent with all requirements and business rules.

There are six common dimensions of data quality standards that data governance should address:

  • Completeness / Comprehensiveness

  • Consistency / Reliability

  • Accuracy

  • Format

  • Timeframe

  • Validity / Integrity
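
Several of these dimensions can be measured directly. The following is a minimal sketch, assuming a small in-memory list of customer records; the field names, the crude email rule and the 365-day freshness window are all hypothetical choices for illustration, not part of any standard. It scores completeness, validity and timeliness as simple ratios.

```python
from datetime import date

# Hypothetical customer records; field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "ana@example.com", "updated": date(2021, 3, 1)},
    {"id": 2, "email": None, "updated": date(2021, 2, 15)},
    {"id": 3, "email": "not-an-email", "updated": date(2019, 6, 30)},
    {"id": 4, "email": "li@example.com", "updated": date(2021, 1, 10)},
]


def is_populated(value):
    """A value counts as populated if it is neither None nor an empty string."""
    return value is not None and value != ""


def completeness(records, field):
    """Share of records where the field is populated."""
    filled = sum(1 for r in records if is_populated(r.get(field)))
    return filled / len(records)


def validity(records, field, is_valid):
    """Share of populated values that pass a business rule."""
    values = [r[field] for r in records if is_populated(r.get(field))]
    if not values:
        return 1.0  # nothing to validate
    return sum(1 for v in values if is_valid(v)) / len(values)


def timeliness(records, field, as_of, max_age_days):
    """Share of records updated within the allowed window."""
    fresh = sum(1 for r in records if (as_of - r[field]).days <= max_age_days)
    return fresh / len(records)


def looks_like_email(value):
    # Deliberately crude rule (assumption): an "@" followed by a dotted domain.
    return "@" in value and "." in value.split("@")[-1]


def quality_report(records, as_of):
    """Combine the three scores into one report for a monitoring dashboard."""
    return {
        "completeness": completeness(records, "email"),
        "validity": validity(records, "email", looks_like_email),
        "timeliness": timeliness(records, "updated", as_of, max_age_days=365),
    }
```

Calling `quality_report(RECORDS, date(2021, 4, 1))` on the sample data gives a completeness of 0.75 (one email is missing), a validity of two thirds (one populated email fails the rule) and a timeliness of 0.75 (one record is more than a year old). In practice the thresholds at which such scores trigger action would be set by the governance council, not hard-coded.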

Data Ownership & Data Stewardship

Data ownership and stewardship refer to the “who” of data governance. They outline who is responsible for which data-related activity.

Effective data governance frameworks not only assign responsibilities, but also include a well-documented description of the roles and how they all interact. According to SAS, these roles usually include:

  • The data governance council. Composed of senior staff familiar with both the operations and strategic direction of the organization, the council is responsible for determining the high-level policies of the program and approving the procedures developed to carry out those policies.

  • Data Owners. Business and IT leaders who are responsible for ensuring that information within a specific data domain is governed across systems and lines of business. They provide feedback to the council and get regular updates on the progress of the program.

  • Data Stewards. The subject-matter experts responsible for executing the policies enacted by the data governance council. They are responsible for the quality of the data in the organization, helping maximize its value.

  • Data producers or consumers. Those who create data through an application or use data to drive decisions as part of a business process. They are the ones who execute the data governance strategy.

Data Architecture

The physical manifestation of data governance strategy, data architecture “defines the blueprint for managing data assets by aligning with organizational strategy to establish strategic data requirements and designs to meet these requirements.” Standardizing policies and procedures in the data architecture prevents duplication of effort and reduces the complexity caused by multiple, divergent implementations of similar operations.

Data Modeling and Design

Data modeling and design, “the process of discovering, analyzing, representing and communicating data requirements in a precise form called the data model,” helps organizations better understand and manage massive volumes of data. According to Dataversity, data modeling typically focuses on the design of a specific database at the physical level or a particular business area at the logical or conceptual level.

Data Storage and Operations

Many organizations lack consistent management policies and run multiple databases with differing levels of data protection, security and service delivery. This lack of consistent oversight increases the risk of data breach and loss.

Your data governance strategy should include management policies around database operations. Typical policies include controlling database environments, performance levels and service delivery, data protection, lifecycle management, and licensing.

Metadata Management

Metadata is data that describes other data. The unsung hero of data analytics, metadata captures granular information about a specific data asset, such as its file type, format, origin and creation date. This “lineage” provides context for data usage, helps demonstrate data integrity and helps establish trust.

Given the complexity of data systems, creating and sustaining an enterprise-wide view of and easy access to underlying metadata can be challenging. However, as metadata “encapsulates the conceptual, logical, and physical information required to transform disparate data sets into a coherent set of models for analysis,” it is absolutely critical to include a metadata management strategy in your data governance framework.
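
As a rough illustration of what a metadata record and its lineage might look like, here is a minimal sketch. The `DatasetMetadata` class, its fields and the sample catalog are hypothetical; real metadata catalogs track far more than this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record: descriptive facts plus upstream sources."""
    name: str
    file_format: str
    origin: str
    created: date
    owner: str
    derived_from: tuple = ()  # names of the datasets this one was built from


# Illustrative catalog keyed by dataset name (all entries are made up).
CATALOG = {
    m.name: m
    for m in [
        DatasetMetadata("raw_orders", "csv", "web shop", date(2021, 1, 4), "sales"),
        DatasetMetadata("raw_customers", "json", "crm", date(2021, 1, 4), "sales"),
        DatasetMetadata("clean_orders", "parquet", "etl job", date(2021, 1, 5),
                        "data team", derived_from=("raw_orders",)),
        DatasetMetadata("sales_report", "parquet", "etl job", date(2021, 1, 6),
                        "finance", derived_from=("clean_orders", "raw_customers")),
    ]
}


def lineage(name, catalog):
    """Return every upstream dataset name, nearest ancestors first."""
    seen = []
    queue = list(catalog[name].derived_from)
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue  # already visited via another path
        seen.append(parent)
        if parent in catalog:
            queue.extend(catalog[parent].derived_from)
    return seen
```

With this catalog, `lineage("sales_report", CATALOG)` returns `["clean_orders", "raw_customers", "raw_orders"]`, which is the kind of trail an analyst or auditor would follow to decide whether a report can be trusted.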

Data Security

Cybercrime, and the costs associated with it, is on the rise. In fact, according to the Ponemon Institute, security breaches have increased by 11% since 2018 and 67% since 2014. Furthermore, in 2019, the average cost of a data breach was $3.92 million.

By defining processes for safeguarding and accessing data, data governance protects against data breaches as well as inappropriate use of data. It also helps ensure data is classified and stored according to its sensitivity.
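
One way to picture classification by sensitivity is a deny-by-default access check. The sketch below is only an assumption about how such a rule could be encoded: the four sensitivity tiers, the role names and the dataset names are hypothetical, and a production system would rely on its platform's own access controls.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical classification tiers, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Assumed mapping of governance roles to the highest tier each may read.
CLEARANCE = {
    "data_consumer": Sensitivity.INTERNAL,
    "data_steward": Sensitivity.CONFIDENTIAL,
    "data_owner": Sensitivity.RESTRICTED,
}

# Assumed classification of a few example datasets.
DATASETS = {
    "press_releases": Sensitivity.PUBLIC,
    "sales_pipeline": Sensitivity.CONFIDENTIAL,
    "medical_records": Sensitivity.RESTRICTED,
}


def can_read(role, dataset):
    """Deny by default: unknown roles or unclassified datasets get no access."""
    clearance = CLEARANCE.get(role)
    level = DATASETS.get(dataset)
    if clearance is None or level is None:
        return False
    return clearance >= level
```

Here `can_read("data_consumer", "medical_records")` is `False` while `can_read("data_owner", "medical_records")` is `True`. Refusing access to anything unclassified also gives teams an incentive to classify new datasets promptly.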

Data Integration and Interoperability

Advanced and predictive analytics require the seamless integration of information from a wide array of sources, applications and formats. By establishing standards for all data uses including common data definitions and data quality best practices, a robust data governance approach helps accelerate data integration and systems interoperability.
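
To make "common data definitions" concrete, the sketch below maps records from two source systems onto one shared set of governed field names. Everything in it is hypothetical: the source systems (`crm`, `billing`), their field names and the three governed fields are assumptions chosen for illustration.

```python
# Governed field names every integrated customer record must carry (assumed).
COMMON_FIELDS = {"customer_id", "full_name", "country"}

# Per-source mapping from local field names to the governed names (assumed).
FIELD_MAPS = {
    "crm": {"CustID": "customer_id", "Name": "full_name", "Country": "country"},
    "billing": {"customer_no": "customer_id", "cust_name": "full_name", "ctry": "country"},
}


def to_common(source, record):
    """Translate a source record into the shared definitions.

    Fields without a governed mapping are dropped; a record that cannot
    supply every governed field is rejected instead of passed along silently.
    """
    mapping = FIELD_MAPS[source]
    unified = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = COMMON_FIELDS - set(unified)
    if missing:
        raise ValueError(f"{source} record is missing governed fields: {sorted(missing)}")
    return unified
```

A CRM record `{"CustID": 7, "Name": "Ana Ruiz", "Country": "ES"}` and a billing record `{"customer_no": 7, "cust_name": "Ana Ruiz", "ctry": "ES"}` both come out as the same dictionary, so downstream analytics can join them without source-specific logic.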

Unstructured Data Management

Unstructured data (or unstructured information) is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Examples include emails, images and other documents that don’t reside in a traditional database format.

The volume of unstructured data is growing exponentially. In fact, Gartner estimates that as much as 80% of enterprise data is unstructured. However, just because data is “unstructured” doesn’t mean it’s any less valuable. Though unstructured data contains many quality dimensions and can be especially difficult to validate, classify and organize, it cannot be overlooked given the severity of security and regulatory risk involved in doing so.

Reference and Master Data

Reference and master data provide the context for transactional data. They enable organizations to understand operational data and to analyze data collected from disparate sources effectively.

As Anne Marie Smith, Ph.D., CDMP puts it, master data “are the critical nouns of a business, and generally fall into four groupings: people (e.g., customer, employee, vendor, etc.), things (e.g., product, item, widget, etc.), places (e.g., office locations and geographic divisions), and concepts (e.g., contract, claim, account, etc.).”

While Master Data Management (MDM) is the process of defining and maintaining how master data will be created, integrated, maintained, and used throughout the enterprise, data governance creates the rules for the operational processes executed within MDM and adjudicates them. In other words, because the rules created within data governance ensure the quality and privacy of master data, MDM requires data governance.

Data Warehousing, Business Intelligence (BI) and Analytics

At many organizations, data warehousing, business intelligence (BI) and analytics have evolved into a separate data management system. Effective data governance of these systems helps optimize analytical data processing and enables improved access to decision support data for reporting and analysis. In addition, by creating a unified understanding of data, data governance encourages collaboration across the enterprise and leads to more dynamic uses of analytics.

According to a 2016 Forbes survey of more than 400 senior executives, 78% said that data governance was either vital or important to their BI operations, and 65% said governance is a useful means to empower end-users to uncover new insights.

Regulatory Compliance

From HIPAA to GDPR, there are numerous global regulations in place designed to protect people’s privacy and ensure good business practices. By boosting data accuracy and streamlining reporting capabilities, data governance helps organizations stay compliant with these regulations.

Agile Data Governance

Effective data governance is anything but a “one-size-fits-all” set of rules and requirements. Though setting clear, robust rules and procedures to ensure data quality and security is a must, these guidelines should not be so restrictive that they hinder the strategic use of data.

It’s also important to remember that not all data is created equal. Data that is especially sensitive (e.g., medical records) should be subject to very different standards than data that is less so. With that in mind, it’s critical that you account for these nuances within your data governance framework.

Effective data governance requires a delicate balance of control and agility. Some key considerations for building a flexible data governance approach are:

  • Have a clear focus, but don’t be overly specific

  • Be creative about enforcement

  • Ensure scalability and leverage flexible architectures

This article is written by Elizabeth Mixson and was originally published by the AI Data & Analytics Network. We received permission to republish it here for the ADCG community.
