Data Proliferation

Jonathan Poland

Data proliferation refers to the rapid growth of data, often resulting in a large amount of replicated and low-quality data. This can be costly to manage and may pose compliance and operational risks to an organization. While it may be necessary to analyze this data in order to understand its structure, sources, and uses, it may ultimately have little value to the organization and can be difficult to discard. The following are illustrative examples of data proliferation.

Customer Data

It is common for multiple systems in an organization to maintain customer data. Such data is commonly out of sync between systems with no clear single source of truth. This can cause operational failures such as sending a bill to the wrong address.
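As a rough illustration, the following Python sketch compares customer addresses held in two systems and reports the customers whose records disagree. The system names (`billing`, `crm`), the customer IDs, the addresses, and the `find_address_conflicts` helper are all hypothetical, and the normalization step (lowercasing and collapsing whitespace) is a simplifying assumption rather than a complete address-matching method.

```python
from collections import defaultdict


def find_address_conflicts(systems):
    """Return customers whose address differs between systems.

    `systems` maps a system name to a dict of {customer_id: address}.
    The result maps each conflicting customer_id to a dict of
    {normalized_address: [system names holding that address]}.
    """
    seen = defaultdict(dict)
    for system_name, records in systems.items():
        for customer_id, address in records.items():
            # Assumption: lowercasing and collapsing whitespace is enough
            # to treat two addresses as the same for this sketch.
            normalized = " ".join(address.lower().split())
            seen[customer_id].setdefault(normalized, []).append(system_name)
    # A customer is in conflict when more than one distinct address exists.
    return {cid: variants for cid, variants in seen.items() if len(variants) > 1}


# Hypothetical sample data from two systems.
billing = {"C001": "12 Oak St", "C002": "9 Elm Ave"}
crm = {"C001": "12 oak st", "C002": "41 Pine Rd"}

conflicts = find_address_conflicts({"billing": billing, "crm": crm})
for customer_id, variants in conflicts.items():
    print(customer_id, variants)
```

A report like this only shows where systems disagree. Deciding which system holds the correct address still requires an agreed source of truth.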


Documents

Knowledge workers tend to create many documents that are checked into a document management system. In many cases, such documents fall completely out of use over time but are retained as a precaution.


Communications

Communications such as emails can accumulate at a rate of hundreds per employee per day. Most communications lose their value almost immediately but are often retained for an extended period of time.


Backups

Backups of data, documents and communications often need to be retained in case something important is deleted from the source systems. If someone deletes a critical email, the only copy may be in a backup from a particular day last year. As such, backups are commonly stored for long periods of time. This can consume considerable resources even though backups are rarely used.

Transactional Data

Transactional data such as market trades and website purchases can grow extremely quickly. Transactional data is often viewed as valuable for historical research. For example, it is common to look at patterns in stock trades going back decades.

Social Data

Social data is data that people share on a public or private social network. It is often viewed as valuable for purposes such as market research and machine learning.

Sensors & Machines

Sensor and machine data is data generated by devices rather than people. Sensors have become cheap to the extent that they can be embedded in everyday objects in great numbers. Such data is generally less valuable than human-generated data. For example, video of a train tunnel or data from a tire pressure sensor isn’t interesting for long. Nevertheless, sensor data is potentially a gigantic source of data, far larger than all other sources combined.
