Data Proliferation

Jonathan Poland

Data proliferation refers to the rapid growth of data, often resulting in a large amount of replicated and low-quality data. This can be costly to manage and may pose compliance and operational risks to an organization. Analyzing this data to understand its structure, sources, and uses may be necessary, yet much of it may ultimately have little value to the organization while remaining difficult to discard. The following are illustrative examples of data proliferation.

Customer Data

It is common for multiple systems in an organization to maintain customer data. Such data is commonly out of sync between systems with no clear single source of truth. This can cause operational failures such as sending a bill to the wrong address.
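As a rough illustration of what "out of sync" can look like in practice, the sketch below compares customer addresses held by two systems and reports the customers whose records disagree. The system names (`billing`, `crm`), the customer IDs, the addresses, and the function name `find_address_conflicts` are all hypothetical and chosen only for this example. The normalization step is a deliberately simple assumption, not a recommended matching method.

```python
from collections import defaultdict


def find_address_conflicts(systems):
    """Return {customer_id: {system_name: address}} for every customer
    whose address differs between at least two systems.

    `systems` maps a system name to a dict of customer_id -> address.
    """
    # Collect every address seen for each customer, keyed by system.
    seen = defaultdict(dict)
    for system_name, records in systems.items():
        for customer_id, address in records.items():
            seen[customer_id][system_name] = address

    conflicts = {}
    for customer_id, by_system in seen.items():
        # Assumed normalization: ignore case and repeated whitespace only.
        normalized = {" ".join(a.lower().split()) for a in by_system.values()}
        if len(normalized) > 1:
            conflicts[customer_id] = by_system
    return conflicts


# Hypothetical sample data for two systems that both store customer addresses.
billing = {"C001": "12 Oak St", "C002": "9 Elm Ave"}
crm = {"C001": "12  oak st", "C002": "41 Pine Rd"}

conflicts = find_address_conflicts({"billing": billing, "crm": crm})
for customer_id, by_system in conflicts.items():
    print(customer_id, by_system)
```

A report like this only shows where systems disagree. Deciding which system holds the correct address still requires an agreed source of truth.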


Documents

Knowledge workers tend to create many documents that are checked into a document management system. In many cases, such documents fall completely out of use over time but are retained as a precaution.


Communications

Communications such as emails can accumulate at a rate of hundreds per employee per day. Most communications lose their value almost immediately but are often retained for an extended period of time.


Backups

Backups of data, documents and communications often need to be retained in case something important is deleted from the source systems. If someone deletes a critical email, the only copy may be in a backup from a particular day last year. As such, backups are commonly stored for long periods of time. This can consume considerable resources even though backups are rarely used.

Transactional Data

Transactional data such as market trades and website purchases can grow extremely quickly. Transactional data is often viewed as valuable for historical research. For example, it is common to look at patterns in stock trades going back decades.

Social Data

Social data is data that people share on a public or private social network. It is often viewed as valuable for purposes such as market research and machine learning.

Sensors & Machines

This is data generated by machines and sensors. Sensors have become cheap to the extent that they can be embedded in everyday objects in great numbers. Such data may be generally less valuable than human-generated data. For example, video of a train tunnel or data from a tire pressure sensor isn't interesting for long. Nevertheless, sensor data potentially represents a gigantic source of data that is far larger than all other sources combined.

Jonathan Poland © 2023
