Data Proliferation

Jonathan Poland

Data proliferation refers to the rapid growth of data, often resulting in a large amount of replicated and low-quality data. This can be costly to manage and may pose compliance and operational risks to an organization. While it may be necessary to analyze this data in order to understand its structure, sources, and uses, it may ultimately have little value to the organization and can be difficult to discard. The following are illustrative examples of data proliferation.

Customer Data

It is common for multiple systems in an organization to maintain customer data. Such data is commonly out of sync between systems with no clear single source of truth. This can cause operational failures such as sending a bill to the wrong address.
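As a rough illustration of what "out of sync" means in practice, the sketch below compares customer records held by two systems and reports the fields that disagree. The system names (billing, crm), the record layout, and the function name find_mismatches are all hypothetical and chosen only for this example. Real systems would need a reliable shared customer identifier, which is often the hard part.

```python
def find_mismatches(system_a, system_b, fields):
    """Compare records that exist in both systems.

    Returns {customer_id: {field: (value_in_a, value_in_b)}} for every
    customer whose listed fields differ between the two systems.
    """
    mismatches = {}
    # Only customers present in both systems can be compared.
    for customer_id in system_a.keys() & system_b.keys():
        record_a = system_a[customer_id]
        record_b = system_b[customer_id]
        diffs = {
            field: (record_a.get(field), record_b.get(field))
            for field in fields
            if record_a.get(field) != record_b.get(field)
        }
        if diffs:
            mismatches[customer_id] = diffs
    return mismatches


# Hypothetical sample data: two systems holding the same customers.
billing = {
    "c1": {"name": "Ana Ruiz", "address": "4 Elm Rd"},
    "c2": {"name": "Bo Chen", "address": "12 Oak St"},
}
crm = {
    "c1": {"name": "Ana Ruiz", "address": "4 Elm Rd"},
    "c2": {"name": "Bo Chen", "address": "98 Pine Ave"},
}

if __name__ == "__main__":
    print(find_mismatches(billing, crm, ["name", "address"]))
```

A report like this only shows that the systems disagree. It does not say which value is correct, which is why a clear single source of truth matters.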

Documents

Knowledge workers tend to create a lot of documents that get checked into a document management system. In many cases, such documents become completely unused with time but are retained as a precaution.

Communication

Communications such as emails can accumulate at the rate of hundreds per employee per day. Most communications lose their value almost immediately but are often retained for an extended period of time.

Backups

Backups of data, documents and communications often need to be retained in case something important was deleted from the source systems. If someone deletes a critical email, the only copy may be in a backup from a particular day last year. As such, backups are commonly stored for long periods of time. This can consume considerable resources despite the fact that backups are rarely used.

Transactional Data

Transactional data such as market trades and website purchases can grow extremely quickly. Transactional data is often viewed as valuable for historical research. For example, it is common to look at patterns in stock trades going back decades.

Social Data

Social data is data that people share on a public or private social network. It is often viewed as valuable for purposes such as market research and machine learning.

Sensors & Machines

This category covers machine- and sensor-generated data. Sensors have become so cheap that they can be embedded in everyday objects in great numbers. Such data may be generally less valuable than human-generated data. For example, video of a train tunnel or data from a tire pressure sensor isn’t interesting for long. Nevertheless, sensor data potentially represents a gigantic source of data that is far larger than all other sources combined.
