What is third-party data?
Third-party data is data you purchase from a data broker rather than collect yourself. The definition can get a little hazy, but it’s generally taken to mean a corpus of data that’s sold to multiple parties. If it’s something you’re simply borrowing from a partner, that’s second-party data. If you captured it on a form on your site, it’s either first-party or zero-party data.
The most common example of third-party data in our world is the humble purchased list. Gathered and synthesized by a big provider like Dun & Bradstreet or ZoomInfo, this sort of data is available to anyone. Your competitors can purchase it. You can purchase it. Anyone with a credit card has access. Yet not all third-party data is created equal.
Knowing which type your team is purchasing, has purchased, or is planning to purchase can tell you a lot about whether it’s about to help or hurt your business systems.
What type of third-party data do I have?
Really good third-party data is collected with consent, ideally by a reputable provider who told the person what would happen with that data. The person agreed to it and answered truthfully to the best of their knowledge. This is, of course, the ideal state.
Far more commonly, third-party data is harvested without people’s knowledge. Or, worse, it involves some sort of trickery. Take the example of one high-growth startup that hired a content syndication provider. The provider promised to circulate the startup’s ebook across the world’s biggest publishers, like The Wall Street Journal. But the leads the startup received were, well, weak. Whenever someone from the startup cold-called a lead, the person on the other end of the phone had no idea what ebook they were talking about.
Naturally curious, the marketers at the high-growth startup visited the publisher’s website to see for themselves. The “form” was really just a misleading banner ad that tricked people into clicking.
Third-party data gathered through trickery may technically be data, but it’s not the kind you want. Good data is gathered honestly.
To gauge the quality (and thus the usefulness) of your third-party data, it’s worth examining which type you are receiving from partners:
- Declared: The individual gave the data willingly and was presumably truthful. The data reflects their own representation of themselves.
- Observed: The data provider observed someone doing something and captured the data without express consent. For example, a data provider reports that someone visited more than ten web pages on its site.
- Inferred: There’s a one-step hop between the collected data and what it represents. For example, you have a list of “in-market buyers,” but it’s based on the fact that they watched a product demonstration. It’s a likely connection, but not concrete.
- Modeled: The data is inferred using machine learning. For instance, the data says they’re a high-value target, but it’s because they share a title, location, and industry with past customers who have purchased.
None of these types is inherently bad. But when you dig into the methods of collection, you may find issues serious enough to put a stop on list purchases from that partner. For example:
Declared sources where people had a reason to be untruthful are not good. Picture an excessively lengthy form where a suspicious number of people claimed to be accountants with the phone number (999) 999-9999. If you can find it, always go see the form for yourself.
Observed sources where the connection isn’t deterministic are to be treated with suspicion. This isn’t a new idea, but activity metrics like views, downloads, and clicks aren’t worth much alone—especially when provided by someone else. Who’s to say what clicks on someone else’s site mean? And who’s to say the provider is representing what they mean accurately? Do they really correlate to closed-won deals?
Inferred sources should always be tested. What makes for a good story isn’t necessarily what works in reality, so check whether the inferred signal actually predicts outcomes in your own pipeline.
Modeled sources where you don’t know precisely how the model was trained should be investigated. It’s increasingly common for artificial intelligence systems to be explainable, meaning they can provide rationale for why they made the connections they did. If it’s a black box and simply offers one big score, your salespeople probably won’t trust it. Maybe you shouldn’t either.
These scenarios are only a taste. But they should serve as a useful rubric for interrogating your third-party data sources and whether they belong.
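A couple of these checks are easy to run on a purchased list yourself. Here is a minimal Python sketch, assuming each lead is a dictionary with hypothetical `title`, `phone`, `in_market`, and `closed_won` fields (your provider’s export will almost certainly use different names). It flags placeholder phone numbers, reports how concentrated the job titles are, and compares closed-won rates for leads with and without a provider-supplied flag.

```python
import re
from collections import Counter


def is_placeholder_phone(phone: str) -> bool:
    """True when a phone number is empty or one digit repeated, e.g. (999) 999-9999."""
    digits = re.sub(r"\D", "", phone)
    return len(digits) == 0 or len(set(digits)) == 1


def audit_declared_leads(leads: list[dict]) -> dict:
    """Summarize two warning signs in a list of declared leads."""
    total = len(leads)
    if total == 0:
        return {"placeholder_phone_share": 0.0, "top_title": None, "top_title_share": 0.0}
    bad_phones = sum(is_placeholder_phone(lead.get("phone", "")) for lead in leads)
    titles = Counter(lead.get("title", "").strip().lower() for lead in leads)
    top_title, top_count = titles.most_common(1)[0]
    return {
        "placeholder_phone_share": bad_phones / total,
        "top_title": top_title,
        "top_title_share": top_count / total,
    }


def win_rates_by_flag(leads: list[dict], flag: str) -> dict:
    """Closed-won rate for leads with and without a provider-supplied flag."""
    rates = {}
    for value in (True, False):
        group = [lead for lead in leads if bool(lead.get(flag)) == value]
        won = sum(1 for lead in group if lead.get("closed_won"))
        rates[value] = won / len(group) if group else None
    return rates


# Made-up sample rows, for illustration only.
leads = [
    {"title": "Accountant", "phone": "(999) 999-9999", "in_market": True, "closed_won": False},
    {"title": "accountant", "phone": "999-999-9999", "in_market": True, "closed_won": False},
    {"title": "VP Marketing", "phone": "(415) 555-0134", "in_market": False, "closed_won": True},
    {"title": "Accountant", "phone": "(212) 555-0187", "in_market": True, "closed_won": True},
]

report = audit_declared_leads(leads)
rates = win_rates_by_flag(leads, "in_market")
print(report)
print(rates)
```

What counts as “suspicious” is a judgment call for your team. The sketch only surfaces the numbers, and a real test of an observed or inferred signal would need far more than four rows.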
It’s worth the effort. If you can remove the useless third-party data sources, you can start to clean up your systems and make far better use of the data. And the opportunities when everything is clean are vast.
Benefits of third-party data
Now we get to the good stuff. Good-quality (and well-cared-for) third-party data can expand your understanding of your customer well beyond your four walls. Whereas first-party data can only tell you what you already know, and second-party data can only tell you what your partners know, third-party datasets can cover the world. They can show you your buyer from angles you might not otherwise see. Intent data, for instance, can reveal that a buyer has been visiting review sites looking for solutions like yours.
Third-party datasets are also often massive. Many first-party account lists aren’t large enough to support modeling and data science, but enriched with third-party data, they are. And if you measure your data’s fitness, you can identify gaps or inconsistencies in your first-party data and append more accurate values.
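One simple way to put a number on fitness is fill rate: the share of records with a value in a given field. The Python sketch below measures fill rate before and after enrichment, under some loud assumptions. The `domain`, `industry`, and `employees` fields are hypothetical, records are matched on `domain` alone, and third-party values only fill blanks rather than overwrite what you collected yourself.

```python
def fill_rate(records: list[dict], field: str) -> float:
    """Share of records with a non-empty value for the given field."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(field) not in (None, ""))
    return filled / len(records)


def enrich(first_party: list[dict], third_party: list[dict], key: str = "domain") -> list[dict]:
    """Fill blank first-party fields from third-party records matched on `key`.

    Existing first-party values are never overwritten.
    """
    lookup = {record[key]: record for record in third_party}
    enriched = []
    for record in first_party:
        merged = dict(record)
        extra = lookup.get(record[key], {})
        for field, value in extra.items():
            if merged.get(field) in (None, "") and value not in (None, ""):
                merged[field] = value
        enriched.append(merged)
    return enriched


# Made-up sample rows, for illustration only.
first_party = [
    {"domain": "acme.example", "industry": "Manufacturing", "employees": None},
    {"domain": "globex.example", "industry": "", "employees": 120},
    {"domain": "initech.example", "industry": None, "employees": None},
]
third_party = [
    {"domain": "acme.example", "industry": "Retail", "employees": 900},
    {"domain": "globex.example", "industry": "Energy", "employees": 5000},
]

enriched = enrich(first_party, third_party)
for field in ("industry", "employees"):
    print(field, fill_rate(first_party, field), fill_rate(enriched, field))
```

Filling only the blanks is a deliberately conservative choice. If you trust a provider more than your own form fills for a particular field, you might decide to let it win conflicts instead, but that is worth deciding field by field.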
Third-party data: is it friend or foe? Turns out it’s both. It’s your friend if you lead your team in interrogating their data sources and checking their credentials. It’s your friend if you use it intelligently, to enrich but not necessarily supplant first-party data. But it’s your foe if it’s taken as truth without any investigation and hoovered up by all sorts of shadow software, in which case it’ll only gunk up your system.