A professional world without data governance would feel something like Lewis Carroll’s Alice in Wonderland, with everyone sharing nonsensical ideas and talking in terms that make no sense to anyone else in the organization.
This is often what I run into when I take on a new project with a team. They have been manually tracking data in a spreadsheet with little to no limits on how they enter the data. Or they are using multiple spreadsheets in multiple locations and repeating the work of other people without knowing it, making work more difficult than it needs to be. After years of doing things without rules, they approach me and ask me to make sense of the work they have done. And I can’t.
Not right away, anyway. It is going to take hours (if not days or weeks) to make heads or tails of the predicament they have created. Now they are asking me not only to build a report or analysis out of the data chaos I am still trying to interpret, but also to undo the work they have done while establishing data governance.
This chaos can be created in as simple a way as abbreviating a term in some places but not in others within the same field, making it look as though one common value is actually multiple values. It can also be as insane as the multiple spreadsheets scenario, where we find multiple people doing the work but tracking it in different ways, or one person storing the work in a strangely divided way without any rhyme or reason.
It is a head-scratcher for an analyst like me, who knows that before you begin any process that requires tracking information, you should establish a set of “rules” for the tracking and for the data that follows.
Who oversees establishing data governance?
Before I get too far ahead of myself: I am using data governance in a narrow sense, as one piece of the broader concept of data management. Data governance can be thought of as the rules that surround data processing, collection, storage, and removal.
What labels do we use for specific items or ideas we are tracking? Where do we store this data, and for how long? How long is this data tracked for, and where do we pull it from? All of this is decided as data governance is established, and this set of policies should always be documented and shared with others involved so that no one strays from these rules.
Knowing what data governance is helps us to understand why it is important. However, there are a lot of decisions that need to be made in order to create a set of data policies around a process, and it can sometimes be the reason teams freeze or stop before they start. It’s important to keep the trains running, even if we must take a moment before they start up for the day to make sure everything is on schedule and gets to the right place.
As a data analyst, my opinion may be slightly biased, but your data analyst should almost always be involved in establishing your data collection, management, tracking, and reporting processes. Alongside the data analyst, the key people involved in the project should be included too.
This includes anyone who is setting up the process, possibly some of the people who originally requested it, and anyone who will take part in it on a regular basis. This means at least two people are involved in establishing data governance around a new data management process.
Where are places I might run into “lawless data” and what should I watch out for?
Sometimes, it’s hard to understand what might constitute a data process, and for that reason, I have shared two very different examples.
Scenario One: Tellers enter data at the front line for members as they join the credit union, or whenever they need to change their contact information. Some tellers spell out the word SUITE for members who have an address containing SUITE in address line 2. Some tellers abbreviate this as STE and some tellers misspell both the actual word and the abbreviation for some members.
Later, when an analyst or marketer attempts to find all the members with STE in their address line 2 information, they get a stern talking-to about how they accidentally excluded members from their list, even though they found every member with STE in their address information. We now have a lapse in data governance, because people are following different procedures for entering address line 2 details for members, and some are not double-checking their work.
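One way a team might enforce a single rule for address line 2 is to normalize every entry to one agreed form, either at entry time or before a list is pulled. The Python sketch below is a minimal illustration, not a full address parser. The `SUITE_VARIANTS` set, the choice of `STE` as the standard form, and the function name are all assumptions made for this example; a real list of variants would come from profiling the actual member data.

```python
# Hypothetical spellings tellers might type for "suite". This set is an
# assumption for illustration, not taken from real data.
SUITE_VARIANTS = {"SUITE", "STE", "SUIT", "SUTE", "SIUTE"}

# The single form the team agrees to store (also an assumption).
STANDARD_FORM = "STE"


def normalize_address_line_2(raw: str) -> str:
    """Uppercase the entry and replace any known suite variant with STANDARD_FORM."""
    cleaned = []
    for token in raw.upper().split():
        # Ignore trailing punctuation such as the period in "STE."
        bare = token.strip(".,#")
        cleaned.append(STANDARD_FORM if bare in SUITE_VARIANTS else token)
    return " ".join(cleaned)


entries = ["Suite 200", "STE 200", "ste. 200", "Sute 200", "Apt 4B"]
normalized = [normalize_address_line_2(entry) for entry in entries]
print(normalized)
```

With entries stored this way, a search for `STE` finds every suite address, and anything that does not match a known variant is left alone for a person to review.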
Scenario Two: Two separate people are tracking contracts and sales in a single Excel document. The document is shared, but there is no data validation present in the document, meaning staff can type whatever they want into all sections of the document. Each employee has a different understanding of what each stage of the sales and contract processes means.
Employee One sets the sales stage to complete once a contract is generated, while Employee Two sets the sales stage to complete after the contract has been signed and countersigned. When reporting on the sales and contract metrics six months into their tracking process, there is confusion about how many sales were completed and resulted in revenue and a fully executed contract.
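The missing data validation in this scenario can be sketched as a fixed list of allowed stages, each with one written definition. The Python below is only an illustration: the stage names are hypothetical, and treating “complete” as signed and countersigned is an assumption about what the team would agree on, not something either employee’s definition settles.

```python
from enum import Enum


class SalesStage(Enum):
    """Hypothetical sales stages, each defined once so everyone shares the meaning."""

    PROSPECT = "prospect"
    CONTRACT_GENERATED = "contract generated"
    CONTRACT_SIGNED = "contract signed"
    # Assumed definition: the contract is signed AND countersigned.
    COMPLETE = "complete"


def parse_stage(raw: str) -> SalesStage:
    """Accept only the agreed stage names; reject anything else with a clear message."""
    cleaned = raw.strip().lower()
    try:
        return SalesStage(cleaned)
    except ValueError:
        allowed = ", ".join(stage.value for stage in SalesStage)
        raise ValueError(
            f"Unknown sales stage {raw!r}; allowed values: {allowed}"
        ) from None


print(parse_stage(" Complete "))
```

In the spreadsheet itself, the same idea would be a dropdown limited to these values, with the definitions documented next to it.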
Data governance follows who, what, when, where, for how long, and why?
Data governance would ideally be established before a data management/reporting process begins, or shortly after. If a team has already been performing a specific process and has created a mess out of it, it’s still not too late! The sooner data governance can be established, the better.
So, arrange a meeting with a data analyst or operations expert as soon as possible to help clean things up and get everyone on the same page. Longer-standing processes without data governance may need extra help, and there’s no shame in asking for it from your organization’s experts.
Just like writing an analysis or thinking through most processes, we have to think through who, what, when, where, why, and in this case, how. In establishing data policies around data processes, the team needs to decide who will be the organizer, who will be the processor(s), and who will uphold the quality control process.
What data is being tracked, how long it’s tracked for, where it’s saved, how often it’s updated, and what to do with “old” data are all important pieces of the data governance puzzle. All of these decisions need to be documented in a shared location in case roles at the organization change, or something in the data itself changes that is out of the team’s control.
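The questions above can be captured in a simple record that the team fills in and stores alongside the data. The Python sketch below is one hypothetical way to do that. The `DataPolicy` fields, the helper function, and the sample values are all assumptions for illustration; the point is only that an unanswered question becomes easy to spot.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataPolicy:
    """Hypothetical record of the who, what, where, and how long for one process."""

    process_name: str
    organizer: str
    processors: List[str]
    quality_reviewer: str
    data_tracked: str
    storage_location: str
    update_frequency: str
    retention_period: str
    old_data_handling: str


def unanswered_questions(policy: DataPolicy) -> List[str]:
    """Return the names of policy fields that were left blank."""
    return [f.name for f in fields(policy) if not getattr(policy, f.name)]


# Sample values are invented for illustration only.
policy = DataPolicy(
    process_name="Contract and sales tracking",
    organizer="Employee One",
    processors=["Employee One", "Employee Two"],
    quality_reviewer="",
    data_tracked="sales stage, contract status",
    storage_location="shared Excel document",
    update_frequency="",
    retention_period="",
    old_data_handling="",
)
print(unanswered_questions(policy))
```

A plain shared document serves the same purpose; what matters is that every question has a written answer that someone new to the role can find.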
The time to start is now
Data governance should not be the reason a new process or project is placed completely on hold, but keep in mind what skipping these data management policies could mean in the long run. Not establishing data governance today may mean being unable to perform reporting and analysis later, and could result in a complete overhaul of the process (and, in some cases, a loss of data).
Many of the biggest projects our team takes on for credit unions trace back to poor data governance, meaning we cannot get to the truth of the credit union’s questions without further procedure adjustments and analysis. Data governance is an important “invisible” factor we should be shedding more light on, and doing so could save you a headache in the future.