Data Proliferation

Jonathan Poland

Data proliferation refers to the rapid growth of data, often resulting in a large amount of replicated and low-quality data. This can be costly to manage and may pose compliance and operational risks to an organization. While it may be necessary to analyze this data in order to understand its structure, sources, and uses, it may ultimately have little value to the organization and can be difficult to discard. The following are illustrative examples of data proliferation.

Customer Data

It is common for multiple systems in an organization to maintain customer data. Such data is commonly out of sync between systems with no clear single source of truth. This can cause operational failures such as sending a bill to the wrong address.
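As a rough illustration of how out-of-sync customer data might be detected, the sketch below compares one field across several systems and reports customers whose values disagree. The system names, record layout, and the `find_mismatches` helper are hypothetical and chosen only for this example; real systems would typically also need to match customers that lack a shared identifier.

```python
def find_mismatches(systems, field):
    """Return {customer_id: {system_name: value}} for every customer whose
    value of `field` differs between the systems that hold a record."""
    seen = {}
    for system_name, records in systems.items():
        for customer_id, record in records.items():
            # Collect each system's value for this customer.
            seen.setdefault(customer_id, {})[system_name] = record.get(field)
    # Keep only customers with more than one distinct value.
    return {
        customer_id: values
        for customer_id, values in seen.items()
        if len(set(values.values())) > 1
    }


# Hypothetical records from two systems, keyed by a shared customer ID.
systems = {
    "billing": {"c1": {"address": "1 Main St"}, "c2": {"address": "9 Oak Ave"}},
    "crm": {"c1": {"address": "1 Main St"}, "c2": {"address": "22 Elm Rd"}},
}

mismatches = find_mismatches(systems, "address")
```

Here `mismatches` contains only customer `c2`, whose address differs between the two systems. A report like this shows where records disagree but not which system is correct, which is why a single source of truth still has to be designated.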

Documents

Knowledge workers tend to create a lot of documents that get checked into a document management system. In many cases, such documents become completely unused with time but are retained as a precaution.

Communication

Communications such as emails can accumulate at a rate of hundreds per employee per day. Most communications lose their value almost immediately but are often retained for an extended period of time.

Backups

Backups of data, documents and communications often need to be retained in case something important was deleted from the source systems. If someone deletes a critical email, the only copy may be in a backup from a particular day last year. As such, backups are commonly stored for long periods of time. This can consume considerable resources despite the fact that backups are rarely used.

Transactional Data

Transactional data such as market trades and website purchases can grow extremely quickly. Transactional data is often viewed as valuable for historical research. For example, it is common to look at patterns in stock trades going back decades.

Social Data

Social data is data that people share on a public or private social network. It is often viewed as valuable for purposes such as market research and machine learning.

Sensors & Machines

Machines and sensors generate data automatically. Sensors have become cheap to the extent that they can be embedded in everyday objects in great numbers. Such data may be generally less valuable than human-generated data. For example, video of a train tunnel or data from a tire pressure sensor isn’t interesting for long. Nevertheless, sensor data potentially represents a gigantic source of data that is far larger than all other sources combined.
