Between 80% and 90% of enterprise data is never used, and the reason it piles up is psychological, not technical. Loss aversion makes deleting feel riskier than keeping, and cheap storage removes the only brake. New tools deepen the loop instead of breaking it.
Somewhere between 80% and 90% of the data most organizations store is never used for anything. IBM calls it dark data: the information assets a business collects, processes, and stores during normal operations but never actually analyzes. Gartner, which coined the term, compares it to dark matter. Most of the universe of a company's information sits in the dark, doing nothing.
The usual explanation is technical: bad metadata, incompatible formats, no good way to query it. Those are real, but they're symptoms. The reason data debt keeps accumulating, in organization after organization, is behavioral. People keep data the way people keep things, and storage has become so cheap that the instinct to hold on almost never meets a cost that stops it.
I find behavioral science genuinely fascinating, and this is one of the clearest cases I know where a business problem everyone treats as a systems issue is actually a psychology issue wearing a systems costume.
Up to 90% of stored enterprise data is never used. It doesn't get deleted. It settles into layers nobody revisits.
The fear of throwing away something useful
The strongest driver of data debt is the same one behind a cluttered garage: the worry that the thing you discard is the thing you'll need next week. In a workplace, that shows up as keeping every export, every old report, every duplicate of a file "just in case." Research on digital hoarding behavior, published in Computers in Human Behavior, ties this directly to anxiety about losing potentially useful information and the distress that deleting can provoke.
This is loss aversion doing what it always does. The pain of discarding something that later turns out to matter feels larger than the diffuse, deferred cost of keeping everything. So the default decision, for individuals and teams alike, is to keep. Not because keeping was reasoned through, but because deleting requires a judgment call and keeping doesn't. It's the path of least cognitive resistance, which is closely related to the power of defaults in design: whatever takes no decision is what happens.
Storage got cheap, so the instinct never meets friction
Hoarding instincts used to be checked by physical limits. You ran out of filing cabinets. Digital storage removed that brake almost entirely. When the marginal cost of keeping one more file rounds to zero, the only thing standing between a team and infinite accumulation is a deliberate decision to curate, and deliberate curation is work nobody is assigned and nobody is rewarded for.
So the behavior compounds. A 2026 study in Frontiers in Psychology frames workplace digital hoarding through information ecology theory, describing a self-reinforcing loop: intensive operational needs plus the psychological ease of indefinite retention make hoarding both a coping mechanism and a habit that feeds itself. Each retained file makes the pile slightly less navigable, which makes curating it feel slightly more daunting, which makes keeping everything feel slightly more reasonable. That's how data debt grows the way technical debt does: quietly, then all at once.
Unpruned, accumulation stops being an asset and becomes terrain. The more there is, the harder it feels to ever start clearing it.
Culture and power, not just individuals
Data debt isn't only the sum of individual habits. A workplace study in Behaviour & Information Technology found that employees described their hoarding behavior mainly in terms of organizational culture, and identified a real tension between top-down tools meant to streamline data and people's preference to keep local control of their own files. When a team doesn't trust that a shared system will preserve what matters, individuals quietly keep their own copies. The duplication isn't laziness; it's a rational hedge against a system people don't fully trust.
There's a power dimension too. Sometimes data accumulates in one person's or one team's control because holding information is a form of leverage: the colleague who keeps the spreadsheet, the department that owns the report. That kind of hoarding looks like diligence and functions like a bottleneck. It's one of the ways institutional knowledge fails to become institutional, which is the flip side of building institutional memory that survives turnover.
The part that complicates the story
This is where the story gets more interesting, and where I'd caution against treating all accumulation as a vice. A 2025 Frontiers in Psychology study grounded in conservation-of-resources theory argues that some accumulated material genuinely functions as a resource: preserving evidence, preparing for plausible future needs, supporting self-directed learning. In that framing, keeping things can build a kind of reservoir that supports resilience rather than just creating clutter.
The field is honestly still divided on whether digital hoarding is knowledge preservation or a source of stress, and the empirical evidence is thin. I don't think that ambiguity is a reason to dismiss the behavior. I think it's the whole point. The question is never "keep or delete everything." It's "what is this specific data for, and who is responsible for deciding." A retained dataset with a clear purpose and an owner is a resource. The same dataset with neither is debt. The difference is governance, not volume, which is exactly why data governance has to start with decisions about ownership, not with a storage purge.
Why new tools rarely fix it
This is the practical payoff. If data debt were purely technical, better tools would solve it. Organizations buy better tools constantly, and the debt keeps growing. It keeps growing because the tool doesn't address the behavior. A new platform that makes storing and duplicating even easier can deepen the loop the 2026 research describes, not break it.
The researchers who study this are consistent on the remedy: technical solutions have a place, but organizational norms and culture have to change in parallel or the tools won't take hold. In practice that means leadership modeling curation and transparency, making deletion and consolidation a named responsibility rather than an unowned chore, and treating "what data do we keep, and why" as a recurring decision instead of a one-time cleanup. It's the same reason I study consumer behavior as an operator: most of what looks like a systems problem is a decisions problem, and decisions are made by people responding to incentives, defaults, and fear.
Data debt is a behavioral phenomenon with a technical surface. You can buy your way to more storage forever. You cannot buy your way out of the instinct that filled it. That takes a decision, an owner, and a culture that rewards clearing the pile instead of quietly adding to it. If you want a partner who builds systems with that governance designed in from the start rather than bolted on after the debt accumulates, that's the work we do at Kief Studio.
What is data debt?
Data debt is the accumulated cost of keeping data that is never curated, governed, or used. Like technical debt, it builds up quietly through everyday decisions to retain rather than review, until the volume itself becomes an operational and security burden. Industry estimates put dark, unused data at 80% to 90% of all enterprise data.
Why do teams hoard data instead of deleting it?
Primarily because of loss aversion: the fear of discarding something that later turns out to be useful feels larger than the diffuse cost of keeping everything. Near-zero storage costs remove the friction that used to check the instinct, and organizational culture, including distrust of shared systems, pushes individuals to keep their own duplicate copies.
Is keeping a lot of data always bad?
No. Research grounded in conservation-of-resources theory suggests accumulated data can function as a genuine resource when it has a clear purpose and an owner: preserving evidence, supporting learning, preparing for real future needs. The difference between a resource and debt is governance, not volume.
Why don't new data tools fix data debt?
Because data debt is a behavioral problem with a technical surface. A new tool that makes storing and duplicating easier can deepen the self-reinforcing accumulation loop rather than break it. The research consistently finds that organizational norms and culture must change alongside any tooling for it to work.
How do you actually reduce data debt?
By treating curation as a named responsibility rather than an unowned chore, having leadership model deletion and transparency, and making "what do we keep, and why" a recurring governance decision. Establish one owner and a clear purpose for each meaningful dataset; data without either is the debt to address first.
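For teams that keep even a rough inventory of their datasets, the owner-and-purpose rule can be sketched in a few lines of Python. This is a minimal illustration, not a governance tool: the `Dataset` fields, the `triage` and `review_queue` names, and the sample inventory are all hypothetical, and the middle "needs-review" label for data that has an owner or a purpose but not both is my own assumption, since the rule above only defines the two ends.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    """One inventory entry. All field names here are illustrative assumptions."""
    name: str
    owner: Optional[str] = None    # person or team accountable for keep/delete decisions
    purpose: Optional[str] = None  # the stated reason the data is retained


def triage(ds: Dataset) -> str:
    """Label a dataset using the owner-and-purpose rule."""
    has_owner = bool(ds.owner and ds.owner.strip())
    has_purpose = bool(ds.purpose and ds.purpose.strip())
    if has_owner and has_purpose:
        return "resource"
    if not has_owner and not has_purpose:
        return "debt"
    # Assumption: one of the two is missing, so a person should look at it.
    return "needs-review"


def review_queue(datasets):
    """Return names of datasets that need attention, debt first."""
    priority = {"debt": 0, "needs-review": 1}
    flagged = [(triage(d), d) for d in datasets]
    flagged = [(label, d) for label, d in flagged if label != "resource"]
    # sorted() is stable, so inventory order is kept within each label.
    return [d.name for label, d in sorted(flagged, key=lambda pair: priority[pair[0]])]


# Hypothetical inventory, for illustration only.
inventory = [
    Dataset("survey_raw", owner="research"),
    Dataset("billing_ledger", owner="finance", purpose="audit evidence"),
    Dataset("crm_export_2019"),
]
print(review_queue(inventory))  # ['crm_export_2019', 'survey_raw']
```

The code is the easy part. The point of the sketch is that the queue only exists if someone is named to fill in the owner and purpose fields and to work through the list on a recurring basis.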