If you’re wondering how to rescue your company from bad data, just ask Tom Redman. He wrote the book on data, literally. He’s published five books on the topic, including the beloved classic, Data Driven. With over 40 years of experience and a Ph.D. in Statistics, Tom is the de facto expert on improving data quality. His expertise is regularly cited by the Harvard Business Review.
Through his consultancy, Data Quality Solutions, he’s helped the likes of Morningstar, Morgan Stanley, Chevron, and VMware get to the root of data quality issues, improve operations, and become more profitable.
He joined us to discuss how “data provocateurs” can secure team buy-in for data quality projects, what we can learn about corporate data from the 1970s quality revolution in car manufacturing, and why improving data quality can be a lot more fun than you think.
Nick: What’s the most surprising thing about your work?
Tom: A central paradox: Data quality is easy, data quality is hard. Doing data quality properly is so much easier than doing it improperly. Finding and eliminating the root cause of an error is so much easier than correcting that same error every time it occurs.
The hard part is you have to look in the mirror and admit to yourself, “Oh my goodness, I had great intentions, but I spend a third of my day cleaning up bad data. It never occurred to me to eliminate the source. I have been contributing to the problem.”
Getting people to change their mind on anything is incredibly difficult even if you tell them it’ll save them two hours every day. That’s why we get the big bucks when we can do that kind of thing.
The key to getting started is finding your champions. If you’ve got a department with 100 people in it, you’re not going to change 100 minds. Find five people who are open to a new idea.
Approach the most open-minded people you work with and clearly demonstrate what’s in it for them. That’s all you need to carve out your first small project. Then you’ll have compelling results to show the rest of the department. Put together a presentation that tells the story of what you’ve accomplished: “The data was at this level, and now six months later it’s improved in these ways.”
And remember to show others your genuine passion for solving problems. Nothing is more contagious than enthusiasm. Frankly I’ve never succeeded at changing everybody’s minds. But change a few minds, and wow—the world is just so different.
Is the state of corporate data getting better?
I don’t see any evidence that it’s getting better, but I believe it will in the next few decades. The quality revolution in manufacturing is a helpful comparison to illustrate why. Japanese car manufacturers in the late 70s and early 80s pursued quality as a strategy to distinguish themselves from American companies. Then over a pretty short period of time, consumers started to demand quality. Then the companies who couldn’t offer top quality were in real trouble. Many of them folded. The lesson was clear: If you don’t step up quality, then you’re out.
I think in all sectors, technology included, sooner or later some companies are going to say, “Gee, we can provide better products and we can cut costs, if we invest in data quality.” The great thing about market economies is that people look for an advantage—consumers will pick the superior product. Companies that invest in data quality now are going to create a crisis for their competitors. When that happens, there will be real winners and losers in the marketplace.
Can you remember a story of a client you worked with whose data was the most chaotic? What factors contributed to it being such a mess?
Frankly, I’ve only worked with one company where chaos didn’t absolutely reign. I describe the typical situation this way: One department decides they need to modernize and automate their client-oriented processes. They think through what they want to achieve, re-define their data, and re-design their processes. All well and good.
Cut to another department—they also decide they need to modernize, so they too come up with a new approach for their needs. And so on.
Pretty soon the company has 27 different approaches to client data. No two are in absolute conflict, but the systems don’t “talk.” Worse, it becomes harder to do anything new across departmental lines. Nobody looks at the situation from a corporate level and pushes back. Lack of alignment can be stifling.
But there is another important lesson to be learned from this. My brother says it this way: “Life gets complicated all by itself.” Complexity and chaos are just the natural state of things absent some really aggressive management. Unless you manage it deliberately, things will just spin out of control.
Can you remember a story of a client with the biggest financial impact from cleaning up their data?
Years ago, I had the privilege and pleasure of forming the data quality lab at AT&T, the first organization of its kind anywhere. AT&T recognized that it created huge quantities of data every day and too much was either erroneous or unfit for use. We worked on some huge and fascinating problems and saved AT&T tens of millions. In doing so, we developed the underpinnings for data quality management that stand today.
One specific example: AT&T was paying $20 billion per year to hook up its customers to the long distance network. Since it was such a massive expense, we had people and systems carefully checking the bill each time to identify errors. And sure enough, there were many. Pretty soon AT&T had an entire department checking for errors.
Eventually one person in this department said, “Gee, this looks kind of dumb. Why are we doing all this work to mimic the suppliers’ billing systems? Maybe we can figure out better ways to check their bills.” He came to me, and I pointed out that I didn’t think he ought to be checking their bills at all. Instead we decided to put some of that effort into getting good bills in the first place.
He started working with Cincinnati Bell, and it turns out they were cranky about all this back and forth, too. And in just a matter of a couple of months, they were able to work together and identify and eliminate the root causes of billing errors. Over the next couple of years, AT&T took this company-wide.
The most important part was the change in philosophy from “we’re going to spend our time correcting other people’s bad data” to “we’re going to work with them so the data is good at the very start.” It saved both AT&T and its partners enormous amounts of money: tens of millions of dollars taken out of the expense stream forever. The secret is to find and eliminate the root cause of the errors. When you do that, you can’t count the money fast enough.
Can customer data platforms and master data management platforms solve the data quality problem?
So far, new technologies have not lived up to their hype about improving quality. There’s been a whole slew of new technologies, from data warehouses to blockchain and data lakes. There’s a common misconception that if you take the bad stuff and put it in a shiny new can, it will somehow get better.
You can’t expect new technology to solve the data problem on its own. There’s an old saying that goes if you automate a process that makes junk, you just make more junk. If you want good data, you’ve got to start creating it correctly.
What responsibilities come with being a data creator?
The first thing if you’re a data creator is you got to find out who uses your stuff. Then the second thing is you got to find out what they need. The third thing you need to do is find out if you’re delivering or not. The fourth thing you need to do is if you’re not delivering, you need to fix your processes so you are. Then, the last thing you need to do is you need to make sure your business processes keep doing that over the long term. It’s that simple, and it’s that hard.
You coined the term “data provocateur” 6 years ago, and it’s similar to our concept of “data superheroes.” Can you explain how you define the term?
I define a data provocateur as someone who is the first in their department to address data quality properly. None of the data provocateurs I know started out thinking about data. They started out because they had a business problem. They found the root causes of errors and that’s what led them to the data.
In effect, they provoke the rest of the organization to change. You may not do everything, but you at least interrupt the status quo enough to get the company on the right path. To me, these are the real heroes in data quality.
These people are open-minded, they’re curious, and when they need to, they show a little courage. That’s all it takes. You don’t have to have an advanced degree in data science to do really important things in the data quality space.
What’s something you believe about data that few others do?
That fixing data quality can be fun, and empowering. When I first started working on data projects, I was so surprised by how much fun people were having finding and eliminating the root cause of error. One woman told me, “Tom, I’ve worked for this company for 20 years. This is the first time I felt like I had any control over anything. It changed my life. I’m not changing back.” And I saw that over and over again.
A couple years ago, I was having a conversation with Roger Hoerl who led Six Sigma at GE for a time. I mentioned my observation and he goes, “Oh my goodness, Tom, I found the same thing. We give people some simple tools and a methodology and suddenly they feel so empowered. They can go fix things.”
It’s like when a six-year-old learns to ride a bike for the first time. At first, they’re struggling with it, but then they give it a try, and then all of a sudden they can ride. Now they’re walking around proud, chest all puffed out. Now they’ve got the freedom to go places. It’s the same thing with data. If you reorient towards finding root causes instead of cleaning up messes all the time, you’ll be amazed by how much fun you start having.