What is Data Mining?

By Melissa Rudy
Updated: May 09, 2012

You may have noticed that sometimes, the Internet seems almost psychic. Facebook knows who your friends are before you add them, and Google ads suggest products and services you actually need. You may visit a website for the first time, only to find that the sidebar ads know where you live and are suggesting restaurant deals in your area.

Contrary to appearances, these companies don't have crystal balls. They're using the magic of data mining to apply the information they do have about you and make extraordinarily educated guesses.

What is data mining?

The details of data mining are pretty complex, but at the core, it's the process of gathering vast amounts of data and then extracting useful information. Using algorithms drawn from statistics and computer science, the practice can produce marketing gold for businesses.

Data mining gathers and sorts through data from thousands, millions, or even billions of data points. This large-scale information discovery can be either descriptive or predictive, and it relies on one or more of several different techniques:

  • Anomaly detection
  • Association learning
  • Classification
  • Cluster detection
  • Regression

One of these things is not like the others

Anomaly detection looks for data points that deviate from an established norm. This type of data mining is often used as part of fraud defense. Credit card companies use anomaly detection to flag suspicious transactions, which can then be verified with the cardholder before processing.

While anomaly detection isn't commonly used for marketing, it's definitely a useful tool for protection. This process makes it easier to pinpoint suspicious activity and prevent possible disaster.
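
To make the idea concrete, here is a minimal Python sketch of one simple way to flag outliers: measure how far each value sits from the average, in standard deviations. The transaction amounts, the function name flag_anomalies, and the threshold of 2.0 are all made up for illustration; real fraud systems weigh far more than the dollar amount.

```python
from statistics import mean, stdev

def flag_anomalies(amounts, threshold=2.0):
    """Return the amounts lying more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Hypothetical card transactions in dollars; the last one is unusually large.
transactions = [12.50, 40.00, 23.75, 18.20, 35.10, 27.40, 15.90, 31.00, 22.30, 950.00]
suspicious = flag_anomalies(transactions)
print(suspicious)
```

With these numbers, only the 950.00 charge is far enough from the rest to be flagged.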

If you like this, try that

Anyone who's bought something from Amazon is familiar with the effects of association learning through data mining. Though Amazon doesn't disclose its algorithms (and probably encourages the rumors that it has a team of programmers changing them every 30 minutes or so), the merchant giant uses association learning to make personalized online recommendations.

Even without Amazon's zealously guarded algorithm secrets, the "if you like X, you'll like Y" formula that can be derived from association learning can benefit any business. With a plethora of products to choose from, consumers often appreciate a nudge toward something that interests them.
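
The "if you like X, you'll like Y" formula can be sketched in a few lines of Python. This is not Amazon's method, which isn't public. It is a bare-bones illustration that counts how often two items land in the same shopping basket and keeps the rule when Y shows up in at least 60% of the baskets containing X. The baskets, the function name association_rules, and the 0.6 cutoff are assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

def association_rules(baskets, min_confidence=0.6):
    """Map (x, y) to the share of baskets containing x that also contain y."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))  # every ordered pair (x, y)
    rules = {}
    for (x, y), together in pair_counts.items():
        confidence = together / item_counts[x]
        if confidence >= min_confidence:
            rules[(x, y)] = confidence
    return rules

# Hypothetical purchase histories.
baskets = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["sleeping bag", "pillow"],
    ["tent", "sleeping bag", "pillow"],
]
rules = association_rules(baskets)
for (x, y), confidence in sorted(rules.items()):
    print(f"If you like {x}, you may like {y} (confidence {confidence:.2f})")
```

Here every shopper who bought a lantern also bought a tent, so that rule earns a confidence of 1.0, while "tent, therefore lantern" holds only half the time and is dropped.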

People who buy car insurance like coffee mugs

With oceans of data to sort through, cluster detection is an essential form of data mining that recognizes sub-categories or distinct clusters that people reading through piles of reports would otherwise miss. Unlike classification, the groups aren't defined in advance; they emerge from the data itself. This type of data mining can point out purchasing habits among certain groups, providing an excellent basis for targeted marketing.
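
One common clustering method is k-means, which repeatedly assigns each data point to its nearest group center and then moves each center to the average of its group. The sketch below is a stripped-down, one-dimensional version in Python. The spending figures, the function name kmeans_1d, and the choice of two clusters are assumptions for illustration, not something taken from a real marketing system.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters with a basic one-dimensional k-means."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers evenly across the sorted data.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assign every value to the cluster whose center is closest.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each center to the average of its members (keep it if the cluster is empty).
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return clusters

# Hypothetical monthly spending, in dollars, for nine customers.
spend = [20, 25, 22, 30, 210, 190, 205, 28, 198]
clusters = kmeans_1d(spend, k=2)
print(clusters)
```

Nobody told the code that there were "light" and "heavy" spenders; the two groups fall out of the numbers.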

Separating the wheat from the chaff

Classification sorts new data into predetermined categories, using a structure learned from examples that have already been labeled. This type of data mining makes things like automated email folder routing possible. For example, spam filters use sophisticated classification algorithms to weed out messages asking you to buy Viagra or donate large sums of money to Nigerian princes.
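
As a rough illustration, the Python sketch below trains a tiny naive Bayes classifier, one classic approach to spam filtering, on four labeled messages and then sorts new text into "spam" or "inbox". The training messages, labels, and function names are invented for the example, and production spam filters are far more elaborate.

```python
import math
from collections import Counter

def train(examples):
    """Count words per label and messages per label from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability for the text."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from how common the label is overall.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word doesn't zero out the probability.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled messages.
examples = [
    ("win money now", "spam"),
    ("claim your free prize money", "spam"),
    ("lunch meeting moved to noon", "inbox"),
    ("project report due friday", "inbox"),
]
word_counts, label_counts = train(examples)
print(classify("free money", word_counts, label_counts))
print(classify("meeting friday", word_counts, label_counts))
```

The categories are fixed before any new message arrives, which is what separates classification from cluster detection.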

Learning from the past

With regression, data from past behavior is used to model how different factors relate to an outcome, and that model is then applied to predict your future actions. Again, the algorithms are complex, but Facebook uses regression data mining to weigh certain factors and pinpoint new behaviors to encourage, or features to offer (though there might have been an element of anomaly detection behind the decision to introduce Timeline).
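
The simplest form of regression fits a straight line through past observations and extends it forward. The Python sketch below does that with ordinary least squares. The weekly login counts are hypothetical and chosen to sit exactly on a line so the arithmetic is easy to follow; the function name fit_line is likewise just for illustration and says nothing about how Facebook actually does it.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: how many times one user logged in during each of five weeks.
weeks = [1, 2, 3, 4, 5]
logins = [3, 5, 7, 9, 11]

slope, intercept = fit_line(weeks, logins)
predicted_week_6 = slope * 6 + intercept
print(f"Predicted logins in week 6: {predicted_week_6:.1f}")
```

The fitted line rises by two logins per week, so the prediction for week 6 comes out to 13.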

How does data mining factor into your life?

If you spend any time online, whether for business or pleasure, you're affected by data mining. Companies that want your business use your information in various ways, including:

  • Targeted advertising, such as related products and geographical information
  • Spam that is sent to your email address when you sign up for related services
  • Phone calls from survey companies or lead generation firms using data harvested online
  • Snail mail, including offers related to things you've expressed interest in online
  • Friend suggestions through social media sites like Facebook, Twitter, and Google+
  • Police and security profiling, which sometimes relies on Internet data to identify suspicious activity like credit card fraud and illegal downloading

Data mining practices represent a good reason to protect your privacy online. Never give out personal information to an untrusted source, and avoid posting your email address, phone number, or mailing address on public websites. This can help you avoid spam, junk mail, and other forms of targeted advertising—including those eerily prescient banner ads.
