Data Proliferation

Jonathan Poland

Data proliferation is the rapid growth of data within an organization, which often produces large volumes of replicated and low-quality data. Such data is costly to manage and can pose compliance and operational risks. An organization may need to analyze the data just to understand its structure, sources, and uses, only to find that it has little value yet is difficult to discard. The following are illustrative examples of data proliferation.

Customer Data

Multiple systems in an organization often maintain their own copies of customer data. These copies commonly fall out of sync, with no clear single source of truth. This can cause operational failures such as sending a bill to the wrong address.
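As a rough illustration of how out-of-sync customer records might be detected, the sketch below compares addresses held by two systems. The system names (billing, crm), the record layout, and the function find_address_conflicts are all hypothetical and chosen only for this example; real reconciliation would need far more careful address matching.

```python
def find_address_conflicts(systems):
    """Return customers whose address differs between systems.

    `systems` maps a (hypothetical) system name to a
    {customer_id: address} dictionary.
    """
    seen = {}  # customer_id -> {normalized address: [system names]}
    for system_name, records in systems.items():
        for customer_id, address in records.items():
            # Ignore differences in letter case and spacing only.
            normalized = " ".join(address.lower().split())
            by_address = seen.setdefault(customer_id, {})
            by_address.setdefault(normalized, []).append(system_name)
    # A customer is in conflict when more than one distinct address exists.
    return {cid: variants for cid, variants in seen.items() if len(variants) > 1}


# Illustrative data only; both dictionaries are assumptions for this sketch.
billing = {"c1": "12 Oak St", "c2": "9 Elm Ave"}
crm = {"c1": "12 oak st", "c2": "40 Pine Rd"}
conflicts = find_address_conflicts({"billing": billing, "crm": crm})
```

Here customer c1 is treated as consistent because the two addresses differ only in capitalization, while c2 is flagged because the systems disagree. Flagging a conflict does not say which system is right, which is the practical meaning of having no single source of truth.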

Documents

Knowledge workers tend to create a lot of documents that get checked into a document management system. In many cases, such documents become completely unused with time but are retained as a precaution.

Communication

Communications such as emails can accumulate at a rate of hundreds per employee per day. Most communications lose their value almost immediately but are often retained for an extended period of time.

Backups

Backups of data, documents and communications often need to be retained in case something important was deleted from the source systems. If someone deletes a critical email, the only copy may be in a backup from a particular day last year. As such, backups are commonly stored for long periods of time. This can consume considerable resources despite the fact that backups are rarely used.

Transactional Data

Transactional data such as market trades and website purchases can grow extremely quickly. Transactional data is often viewed as valuable for historical research. For example, it is common to look at patterns in stock trades going back decades.

Social Data

Social data is data that people share on a public or private social network. It is often viewed as valuable for purposes such as market research and machine learning.

Sensors & Machines

Machines and sensors generate data automatically. Sensors have become cheap enough that they can be embedded in everyday objects in great numbers. Such data is often less valuable than human-generated data. For example, video of a train tunnel or data from a tire pressure sensor isn’t interesting for long. Nevertheless, sensor data is potentially a gigantic source of data, far larger than all other sources combined.
