Why data accuracy matters more than data scale amid the rise of AI

The following is a guest piece written by Gillian MacPherson, senior vice president of product management at Epsilon. Opinions are the author’s own.

For years, the marketing and ad-tech ecosystem has been locked in competition around data scale. Vendors proudly tout the number of households they reach, the devices they recognize or the trillions of signals flowing through their platforms. Bigger numbers have become shorthand for marketing sophistication, with scale still treated as a primary competitive differentiator across the industry.

But in the industry’s relentless race to amass more data, often across increasingly interconnected partners and platforms, one critical question is frequently overlooked: How accurate is the data behind those numbers?

The accuracy question can no longer be ignored

This question matters more now than at any point in marketing’s history. Artificial intelligence, automation and algorithmic decisioning sit at the center of modern marketing operations. These systems now make thousands, if not millions, of decisions every day across audience targeting, media optimization, personalization and measurement.

The principle of “garbage in, garbage out” has never been more relevant. When inaccurate data feeds AI systems, the consequences compound quickly. Missing values lead to flawed models. Outdated attributes produce misleading customer insights. Duplicate records create wasted spend, fragmented audience views and distorted measurement.

In an automated world, bad data not only misleads marketing teams but also accelerates mistakes. It is a dangerous, double-edged problem that compounds risk at the very moment speed and precision matter most. Errors that once impacted a single campaign can now cascade across entire marketing ecosystems in real time, influencing activation, optimization and business outcomes simultaneously.

The hidden cost of acting on the wrong data

Too often, data accuracy is assumed rather than verified. When that happens, brands expose themselves to a range of risks that do not always surface immediately but are costly all the same.

Media budgets are wasted on audiences that are not in-market for a brand’s products. Marketers misidentify high-value consumers, fail to recognize real people consistently across channels and overlook meaningful opportunities for engagement. Performance insights become distorted by incomplete or outdated information, causing teams to optimize toward the wrong audiences, signals and outcomes.

Because automation operates at speed and scale, these issues spread fast. The problem with inaccurate data is not simply that it is wrong, but that it is wrong at scale, magnifying inefficiency and eroding trust in results.

Scale alone is no longer a competitive advantage

The industry’s fixation on scale is understandable. Large datasets look impressive in pitches, and volume is easy to communicate. But scale alone does not guarantee quality. Large datasets frequently contain duplicate records, stale attributes and disconnected signals that are not grounded in real, reachable consumers.

In many cases, scale has been reinforced by market incentives that reward size over substance. Pricing models that rely primarily on record counts or reach naturally encourage datasets to grow larger, often without the same emphasis on validation, refresh or real-world accuracy. The result is an ecosystem historically incentivized to prioritize data expansion over ongoing validation and quality.

Bigger datasets may win attention, but they do not automatically produce better outcomes.

What actually turns data into performance

The real competitive advantage in modern marketing is not data volume, but data accuracy, validation and usability.

Data drives performance when it is verified and continuously refreshed, resolved to real people and connected across partners and platforms in ways that allow organizations to collaborate and create new, high-value data assets. It must also be structured in a way that AI and automation systems can rely on with confidence.

When data meets these standards, everything downstream improves. Marketers can trust that the audiences they target are the right ones. The signals guiding optimization reflect actual consumer behavior rather than outdated proxies. And the outcomes they measure — from incrementality to lift to return on investment — are grounded in real consumer behavior rather than modeled assumptions alone.

Accuracy does more than improve efficiency. It creates the confidence marketers need to make faster decisions, optimize intelligently and measure performance with greater certainty.

Reframing the industry’s data conversation

As AI and automation continue to reshape marketing, the industry will inevitably move beyond the numbers game around data scale. Marketers will begin asking tougher, more meaningful questions.

How accurate is the data we are relying on? How often is it refreshed and verified? How many records represent real, addressable consumers? How confident can we be that our data is truly AI-ready for decision making?

These questions reflect a broader industry shift from prioritizing data quantity to prioritizing data trustworthiness, usability and the ability to support more collaborative, interoperable approaches to data-driven marketing.

Ultimately, the future of data-driven marketing will not be determined by who possesses the largest dataset, but by who can create the most accurate, actionable and trusted understanding of the consumer and apply it effectively across an increasingly connected data ecosystem.

The bottom line

In the age of AI-powered marketing, the most powerful data is not the biggest dataset in the market. It is the dataset marketers can trust to drive meaningful action confidently across targeting, personalization, activation and measurement.

Because when data drives decisions at scale, accuracy is not a nice-to-have. It is the difference between momentum and misdirection.