What do you remember most about the last live theatre show you saw? If you’re like most people, it was likely one ultra-compelling actor, or a song that was stuck in your head for days after. You probably didn’t think of the stage manager once.
If they do their job perfectly, you’ll never know they exist. But if they make a single mistake? It’s hard to miss a trap door swinging open at the wrong moment, or a musical number in the middle of a scene. Data quality managers share the same plight.
“Data quality is one of those things where everyone knows when it’s wrong and no one knows when it’s right,” says Greg Meyer, Data Quality Manager at Redis Labs. “Because when it’s right, everyone else’s jobs go super smoothly.”
When data quality goes awry, complaints abound. Reps get mad when leads from their territory get routed to someone else, and hot inbound leads are frustrated to never get a call they’re eager to receive. What’s worse, potential revenue opportunities are in danger of falling out of the funnel for good.
It takes a high caliber of data quality manager to make these all-too-common pitfalls disappear. Greg, whose background spans both engineering and fine arts, has the rare “ability to speak both geek and English,” which makes him a linchpin essential to keeping his organization’s revenue engine humming. In this interview, Greg shares his hard-won wisdom, thoughts on how tech can help build a more equitable world, plus his top techniques for keeping data quality so high he hardly ever hears a peep.
Nick: Care to share any dirty data horror stories?
Greg: At a past company, I saw lead ingestion go wrong when names in our database ended up getting scrambled with email addresses. Reps ended up outbounding to the wrong person, which is embarrassing at best and, at worst, permanently tarnishes the brand’s reputation, because you might not get a second chance to reach that prospect.
Where do you start if your data is in a state of chaos and you want to take baby steps toward better data hygiene?
When your data is a mess, start with a definition of one thing you want to fix. For example, you might want to find all accounts that don’t have a website. First, fix that problem. Then identify all the places where accounts are created without websites in the first place and plug that hole. That way, you’re cleaning, but also improving processes in a lasting way.
At first, imagine you’re a detective asking, “What kind of problem do I have? How much effort will it take to fix?” Then, shift your mindset to that of a consultant and think, “How can I find all the places where that information flows into the organization, and plug future holes before they start?”
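The "fix one thing, then plug the hole" approach can be sketched in a few lines of Python. This is only an illustration: the record layout and the `created_by` field are assumptions, not anything from Greg's actual systems. The first function finds the accounts with no website (the detective step), and the second counts which creation paths produce them (the consultant step).

```python
from collections import Counter

# Hypothetical account records; the field names are assumptions for illustration.
accounts = [
    {"id": "A1", "name": "Acme", "website": "https://acme.example", "created_by": "web_form"},
    {"id": "A2", "name": "Globex", "website": "", "created_by": "list_import"},
    {"id": "A3", "name": "Initech", "website": None, "created_by": "list_import"},
    {"id": "A4", "name": "Umbrella", "website": "  ", "created_by": "manual_entry"},
]


def is_missing(value):
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return value is None or not str(value).strip()


def accounts_missing_website(records):
    """Detective step: list every account that has no website."""
    return [r for r in records if is_missing(r.get("website"))]


def missing_by_source(records):
    """Consultant step: count which creation paths let the gap through."""
    return Counter(r["created_by"] for r in accounts_missing_website(records))


missing = accounts_missing_website(accounts)
by_source = missing_by_source(accounts)
print([r["id"] for r in missing])
print(by_source.most_common())
```

The source with the highest count is the first hole worth plugging, since cleaning the existing records alone would not stop new ones from arriving.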
How do you prioritize data cleanup projects when there are so many you could take on?
I start by identifying the cost of a particular type of error. For example, if an account has the wrong address information, then the account may get routed to the wrong sales rep. That’s a pretty costly error because multiple people are handling that same record. Other errors may be more cosmetic, like not having a person’s first name capitalized, which is a “nice to fix.”
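One rough way to turn "cost of an error" into a ranked backlog is to multiply the number of affected records by an impact weight. The sketch below assumes made-up counts and weights (a routing error is weighted far above a cosmetic one); it is not a formula from the interview, only a way to make the comparison explicit.

```python
from dataclasses import dataclass


@dataclass
class ErrorType:
    name: str
    records_affected: int  # how many records currently have this error
    impact_weight: int     # assumed weight: higher means more costly per record

    @property
    def cost_score(self):
        # Simple assumed model: total cost grows with volume and per-record impact.
        return self.records_affected * self.impact_weight


# Illustrative numbers only.
errors = [
    ErrorType("first name not capitalized", records_affected=400, impact_weight=1),
    ErrorType("wrong account address (misrouted to rep)", records_affected=120, impact_weight=10),
    ErrorType("account missing website", records_affected=200, impact_weight=3),
]

ranked = sorted(errors, key=lambda e: e.cost_score, reverse=True)
for e in ranked:
    print(f"{e.cost_score:>5}  {e.name}")
```

With these assumed weights, the misrouting error comes first even though it touches the fewest records, which matches the reasoning that errors handled by multiple people cost the most.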
How do you minimize dirty data entering your system?
Document your processes in more detail than you think you need, then troubleshoot each step. My boss Manny Ortega always says, “To be terrific, you must be specific.” Kind of corny, and totally right! For data procedures that change information over a long pipeline, there are a lot of opportunities to spoil the data along the way. For example, from lead ingestion through enrichment, to creating an opportunity, each step carries its own error risks. If you can document every step, then you can set up your tech and train your people to sidestep each potential issue along the way.
What’s the role of humans vs. automation in the ideal data process?
There are many ways to enact a process, but not all of those ways are efficient. The system’s not going to stop you from doing a dumb thing. As a human, you have to think, what are the possible ways I could do a thing? And then, what are the ways that I could do that thing that present the least possibility for error?
For example, start by documenting how an inbound lead becomes a usable, fully qualified lead or minimum viable lead. Then you can automate. If you can’t tell a person how to do it, how are you going to tell a system how to do it? That’s the power of a system like Syncari that allows you to create a global data model, then automate maintenance.
As a Syncari pilot customer, you’ve made it your mission to “democratize lead ingestion.” Can you explain what this means?
Traditionally, it is easy for a marketing department to give a sales department a pile of bad leads. Then the sales team gets angry at the marketing team. And the marketing ops team says, “Why can’t you just ingest my leads?” But these leads have flaws—maybe it’s inconsistent information, or it’s missing a country or email—which prevents the system from accepting the lead.
What a system like Syncari does is make it possible for the marketing team to understand immediately when there are fields missing—for example, when leads are sourced through a webinar platform or an events platform. And if you’ve set up your automation right, Syncari fixes it for them.
Being clear and explicit about the minimum data quality threshold to accept leads “democratizes” the process because it empowers the entire revenue org to contribute to driving revenue. When we do this right, there are far fewer sales-versus-marketing clashes.
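An explicit minimum threshold can be as simple as a list of required fields that every lead is checked against before ingestion. The sketch below is a generic illustration, not Syncari's behavior or API: the `REQUIRED_FIELDS` tuple and the lead layout are assumptions, with email and country taken from the examples of missing fields mentioned above.

```python
# Assumed minimum threshold; a real org would define its own required fields.
REQUIRED_FIELDS = ("email", "country", "last_name")


def missing_fields(lead):
    """Return the required fields that are absent or blank on this lead."""
    return [f for f in REQUIRED_FIELDS if not str(lead.get(f) or "").strip()]


def triage(leads):
    """Split leads into accepted ones and rejected ones with their gaps."""
    accepted, rejected = [], []
    for lead in leads:
        gaps = missing_fields(lead)
        if gaps:
            rejected.append((lead, gaps))
        else:
            accepted.append(lead)
    return accepted, rejected


# Hypothetical leads from a webinar list.
leads = [
    {"email": "ada@example.com", "country": "UK", "last_name": "Lovelace"},
    {"email": "grace@example.com", "country": "", "last_name": "Hopper"},
    {"email": None, "country": "US", "last_name": "Turing"},
]

accepted, rejected = triage(leads)
for lead, gaps in rejected:
    print(f"{lead['last_name']}: missing {', '.join(gaps)}")
```

Reporting the specific gaps back to whoever submitted the list is what lets marketing see immediately why a lead was not accepted, rather than finding out from an angry sales team.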
The title “Data Quality Manager” is fairly rare, but on the rise. Why do you think that is?
It’s becoming more common to have a data quality function within revenue operations because of the growing intersection of different systems—and the problems this brings.
For example, all those systems might use the concept of a “person,” but the person might look like a “lead” in Marketo, a “contact” in Salesforce, and a “prospect” in Outreach, which can lead to major misalignment. Typically some of the fields overlap (they may all have “First Name” and “Last Name”), but many do not.
Figuring out how to map the pieces of that person together over disparate systems is a major challenge. There’s no system that does that today. Syncari is about as close as it gets.
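To make the mapping problem concrete, here is a minimal sketch of translating each system's fields into one shared "person" shape and merging on email. The field names in `FIELD_MAPS` are illustrative assumptions rather than the real schemas of Marketo, Salesforce, or Outreach, and matching on email alone is a deliberate simplification of what is described above as a major challenge.

```python
# Assumed per-system field names mapped to a shared "person" model.
FIELD_MAPS = {
    "marketo": {"firstName": "first_name", "lastName": "last_name", "email": "email"},
    "salesforce": {"FirstName": "first_name", "LastName": "last_name", "Email": "email"},
    "outreach": {"first_name": "first_name", "last_name": "last_name", "email_address": "email"},
}


def to_canonical(system, record):
    """Rename a system-specific record's fields into the shared model."""
    mapping = FIELD_MAPS[system]
    return {canon: record[src] for src, canon in mapping.items() if src in record}


def merge_people(tagged_records):
    """Merge (system, record) pairs into one person per normalized email."""
    people = {}
    for system, record in tagged_records:
        person = to_canonical(system, record)
        email = (person.get("email") or "").strip().lower()
        if not email:
            continue  # no shared key, so this sketch cannot match the record
        merged = people.setdefault(email, {"email": email, "sources": []})
        merged["sources"].append(system)
        for field, value in person.items():
            # First non-empty value wins; a real process needs a survivorship rule.
            if field != "email" and value and field not in merged:
                merged[field] = value
    return people


records = [
    ("marketo", {"firstName": "Ada", "email": "Ada@Example.com"}),
    ("salesforce", {"LastName": "Lovelace", "Email": "ada@example.com"}),
    ("outreach", {"first_name": "Grace", "email_address": "grace@example.com"}),
]
people = merge_people(records)
print(people["ada@example.com"])
```

Fields that exist in only one system have no entry in the map and are silently dropped here, which is exactly the kind of non-overlap that makes the real problem hard.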
What’s one thing related to data quality that you’re doing differently today as opposed to last year?
Last year, we were creating accounts in Salesforce without necessarily enriching all of them. Now, every single account is enriched on the way and reviewed as part of a formal process.
Also, to prevent duplicate accounts from entering our system, when we create a new account, we either do it by converting an existing lead into a contact in an account, or else AEs have to provide justification for creating that new account. This process is a request in Salesforce: we built a custom object to start that workflow. AEs provide the company’s basic details, confirm that they’ve searched in Salesforce to see whether the account already exists, and write a little one-sentence justification.
What’s your best data quality hack?
Build a spreadsheet of a thing that you do frequently. For example, I use a tool for Salesforce called Workbench, which runs Salesforce Object Query Language (SOQL) queries. If I want to find out the owner of an account, or the status of a field on multiple accounts, I simply make a list of IDs, cut and paste it into Workbench, and do a select statement on those IDs.
This process allows me to pull data much faster than writing a report. If I were doing it as a report, I would need to list all of the items and all of the IDs in Salesforce as a comma-delimited list, and I could only do a handful at a time because of the 3,000-character limit. This method allows me to do hundreds of items at a time.
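The spreadsheet step, turning a column of IDs into a SELECT statement, can also be scripted. The sketch below builds SOQL strings of the form `SELECT ... FROM Account WHERE Id IN (...)` that could be pasted into a query tool such as Workbench. The function names, the batch size of 200, and the sample IDs are assumptions for illustration; `Id`, `Name`, and `OwnerId` are standard Account fields.

```python
import re

# Salesforce record IDs are 15 or 18 alphanumeric characters.
ID_PATTERN = re.compile(r"^[A-Za-z0-9]{15}(?:[A-Za-z0-9]{3})?$")


def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_queries(ids, fields=("Id", "Name", "OwnerId"), sobject="Account", batch_size=200):
    """Build one SOQL query per batch of IDs.

    batch_size is an assumed value chosen to keep each query short;
    adjust it to whatever your query tool accepts.
    """
    clean = [i.strip() for i in ids if i.strip()]
    bad = [i for i in clean if not ID_PATTERN.match(i)]
    if bad:
        raise ValueError(f"Not valid Salesforce IDs: {bad}")
    queries = []
    for batch in chunked(clean, batch_size):
        id_list = ", ".join(f"'{i}'" for i in batch)
        queries.append(f"SELECT {', '.join(fields)} FROM {sobject} WHERE Id IN ({id_list})")
    return queries


# Hypothetical IDs, as if pasted from a spreadsheet column.
pasted = """001000000000001AAA
001000000000002AAA
001000000000003AAA"""

queries = build_queries(pasted.splitlines(), batch_size=2)
for q in queries:
    print(q)
```

Validating the IDs before building the string guards against stray cells (headers, blank rows, notes) ending up inside the query.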
How can the rest of the revenue org play a role in keeping data quality top-notch?
If you see something, say something to your friendly neighborhood ops team. Usually when you see a problem with data, there are other issues underneath the surface. And if you go and dig into that thing that you’re looking at right now either by creating a report or just trying to fix it, you’re going to uncover a problem that’s worth fixing.
What are you reading these days?
I’ve been reading “How To Be An Antiracist” by Ibram Kendi. I’m trying to educate myself on being a better ally to all kinds of people.
I’ve also been enjoying Octavia Butler’s “Parable of the Sower” series, which is post-apocalyptic fiction combined with social action.
What’s one thing you’re taking with you from your reading on anti-racism as someone who works in tech?
We have a lot of embedded assumptions in our algorithms and in our language. A classic example is using the terms “master/slave” when you talk about databases, as opposed to a “primary/secondary.” Another one is saying “blacklist/whitelist” as opposed to “denylist/allowlist.”
When you start challenging those common terms, you start raising awareness of the kinds of decisions you can make to be inclusive. Because here’s the thing: Computers are dumb. They will do exactly what you tell them to do. If you tell them the wrong thing, they’re not smart enough to figure out your actual intention. They’ll just keep doing it.
Technology has the power to perpetuate and even amplify problematic assumptions—or play a significant role in stamping them out. That’s why our industry needs to be incredibly careful with the software we build—it holds immense influence.
About the author: Nick is a CEO, founder, and author with over 25 years of experience in tech who writes about data ecosystems, SaaS, and product development. He spent nearly seven years as EVP of Product at Marketo and is now CEO and Founder of Syncari.