What is Data Mining?

By Melissa Rudy
Updated: May 09, 2012

You may have noticed that sometimes, the Internet seems almost psychic. Facebook knows who your friends are before you add them, and Google ads suggest products and services you actually need. You may visit a website for the first time, only to find that the sidebar ads know where you live and are suggesting restaurant deals in your area.

Contrary to appearances, these companies don't have crystal balls. They're using the magic of data mining to apply the information they do have about you and make extraordinarily educated guesses.

What is data mining?

The details of data mining are pretty complex, but at the core, it's the process of gathering vast amounts of data and then extracting useful information from it. Using ever-mysterious algorithms that only programmers and statisticians can begin to grasp, the practice can produce marketing gold for businesses.

Data mining gathers and sorts through thousands, millions, or even billions of data points. This large-scale information discovery can be either descriptive or predictive, and it typically relies on one or more of the following techniques:

  • Anomaly detection
  • Association learning
  • Classification
  • Cluster detection
  • Regression

One of these things is not like the others

Anomaly detection looks for data points that deviate from an established norm, such as a customer's usual spending pattern. This type of data mining is often used as part of fraud defense. Credit card companies use anomaly detection to flag suspicious transactions, which can then be verified with the cardholder before processing.

While anomaly detection isn't commonly used for marketing, it's definitely a useful tool for protection. This process makes it easier to pinpoint suspicious activity and prevent possible disaster.
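For the curious, here is a minimal sketch of the idea in Python. It is not how any real credit card company does it; it simply assumes that a purchase is "suspicious" when it sits more than three standard deviations away from a cardholder's average. The function name, the threshold, and the purchase history are all made up for illustration.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Return True if `amount` is more than `threshold` standard
    deviations away from the mean of past transaction amounts."""
    typical = mean(history)
    spread = stdev(history)  # needs at least two past transactions
    if spread == 0:
        # Every past purchase was identical, so any other amount stands out.
        return amount != typical
    return abs(amount - typical) > threshold * spread

# Made-up purchase history for one cardholder, in dollars.
past_purchases = [42.0, 18.5, 60.0, 35.25, 51.0, 27.75]

print(is_suspicious(past_purchases, 950.0))  # True: far outside the usual range
print(is_suspicious(past_purchases, 45.0))   # False: an ordinary purchase
```

Real fraud systems weigh many more signals (location, merchant, time of day), but the principle of comparing new data against a norm is the same.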

If you like this, try that

Anyone who's bought something from Amazon is familiar with the effects of association learning through data mining. Though Amazon doesn't disclose its algorithms (and probably encourages the rumors that it has a team of programmers changing them every 30 minutes or so), the merchant giant uses association learning to make personalized online recommendations.

Even without Amazon's zealously guarded algorithm secrets, the "if you like X, you'll like Y" formula that can be derived from association learning can benefit any business. With a plethora of products to choose from, consumers often appreciate a nudge in a direction that interests them.
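A rough sketch of that formula, in Python, might look like the following. This is not Amazon's method, which isn't public. It assumes the simplest possible measure: of all the shopping baskets that contain X, what fraction also contain Y? Data miners call that fraction "confidence." The baskets and function names here are invented for the example.

```python
def confidence(baskets, x, y):
    """Fraction of baskets containing `x` that also contain `y`."""
    with_x = [basket for basket in baskets if x in basket]
    if not with_x:
        return 0.0
    return sum(1 for basket in with_x if y in basket) / len(with_x)

def recommend(baskets, x, top_n=2):
    """Suggest the items most often bought together with `x`."""
    others = set().union(*baskets) - {x}
    # Highest confidence first; ties are broken alphabetically.
    ranked = sorted(others, key=lambda y: (-confidence(baskets, x, y), y))
    return ranked[:top_n]

# Made-up purchase records, one set of items per order.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"novel", "lantern"},
    {"tent", "sleeping bag", "novel"},
]

print(confidence(orders, "tent", "sleeping bag"))  # 0.75
print(recommend(orders, "tent"))  # ['sleeping bag', 'lantern']
```

Three of the four tent buyers also bought a sleeping bag, so a shopper looking at tents gets nudged toward sleeping bags first.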

People who buy car insurance like coffee mugs

With oceans of data to sort through, cluster detection is an essential form of data mining. It recognizes sub-categories or distinct clusters that people reading through piles of reports might otherwise miss. This type of data mining can point out purchasing habits among certain groups, providing an excellent basis for targeted marketing.
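To make that less abstract, here is a small Python sketch of one well-known clustering approach, k-means, applied to a single number per customer. Everything about it is an assumption for the sake of illustration: the monthly spending figures are invented, the function name is hypothetical, and real cluster detection works on many attributes at once.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into `k` clusters (k must be at least 2).

    Returns the cluster centers and the values assigned to each.
    """
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Move each center to the average of its group (keep it if empty).
        new_centers = [
            sum(group) / len(group) if group else centers[i]
            for i, group in enumerate(groups)
        ]
        if new_centers == centers:
            break  # nothing moved, so the clusters are stable
        centers = new_centers
    return centers, groups

# Made-up monthly spending per customer, in dollars.
monthly_spend = [12, 15, 14, 90, 95, 88, 13, 92]

centers, groups = kmeans_1d(monthly_spend, k=2)
print(centers)  # [13.5, 91.25]
print(groups)   # [[12, 15, 14, 13], [90, 95, 88, 92]]
```

Nobody told the program there were "occasional shoppers" and "big spenders." It found the two groups on its own, which is exactly the appeal.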

Separating the wheat from the chaff

Classification sorts new data into predetermined categories, using a structure learned from examples that have already been labeled. This type of data mining makes things like automated email folder routing possible. For example, spam filters use sophisticated classification algorithms to weed out messages asking you to buy Viagra or donate large sums of money to Nigerian princes.
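As a toy illustration, the Python sketch below trains a tiny "naive Bayes" classifier, one classic technique for spam filtering. It is a simplified stand-in, not the filter any real email provider uses, and the four training messages and both function names are invented. It learns which words show up under each label, then picks the label that best explains a new message.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes score."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, messages in label_counts.items():
        score = math.log(messages / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words don't zero out the score.
            seen = word_counts[label][word] + 1
            score += math.log(seen / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up, already-labeled messages ("ham" is the usual term for non-spam).
labeled_messages = [
    ("win money now", "spam"),
    ("cheap pills offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch tomorrow", "ham"),
]

word_counts, label_counts = train(labeled_messages)
print(classify("win cheap money", word_counts, label_counts))   # spam
print(classify("agenda for lunch", word_counts, label_counts))  # ham
```

The key difference from cluster detection is that the categories exist before the sorting starts. The program only has to decide which one fits.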

Learning from the past

With regression, data from past behavior is collected and used to model how one factor relates to others, which makes it possible to predict your future actions. Again, the algorithms are complex, but Facebook uses regression data mining to weigh certain factors and pinpoint new behaviors to encourage, or features to offer, though there might have been an element of anomaly detection behind the decision to introduce Timeline.
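The simplest version of this is a straight line fitted through past data points, known as least-squares linear regression. The Python sketch below is only an illustration under invented assumptions: the weekly posting figures are made up, the function names are hypothetical, and it says nothing about what Facebook actually runs.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)  # must not be zero
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate y for a new x using the fitted line."""
    return slope * x + intercept

# Made-up history: weeks since signing up vs. posts made that week.
weeks = [1, 2, 3, 4, 5]
posts = [3, 5, 7, 9, 11]

slope, intercept = fit_line(weeks, posts)
print(slope, intercept)                # 2.0 1.0
print(predict(6, slope, intercept))    # 13.0
```

Here the past data says this user posts about two more times each week, so the model guesses 13 posts in week six. Real behavior is rarely this tidy, which is why production models weigh many factors at once.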

How does data mining factor into your life?

If you spend any time online, whether for business or pleasure, you're affected by data mining. Companies that want your business use your information in various ways, including:

  • Targeted advertising, such as related products and geographical information
  • Spam that is sent to your email address when you sign up for related services
  • Phone calls from survey companies or lead generation firms using data harvested online
  • Snail mail, including offers related to things you've expressed interest in online
  • Friend suggestions through social media sites like Facebook, Twitter, and Google+
  • Police and security profiling, which sometimes relies on Internet data to identify suspicious activity like credit card fraud and illegal downloading

Data mining practices represent a good reason to protect your privacy online. Never give out personal information to an untrusted source, and avoid posting your email address, phone number, or mailing address on public websites. This can help you avoid spam, junk mail, and other forms of targeted advertising—including those eerily prescient banner ads.
