Data Mining – Short Explanation

As the volume of data generated worldwide grows at an ever-increasing rate, companies need a way to understand which data matters to them and how they can use it to better serve customers and the business. Like mining the ground for minerals, oil and gas, data mining looks for nuggets of useful information in data.

Simply put, data mining looks for usable data available in a larger pool of raw data. By utilizing data mining techniques, businesses can identify patterns and relationships that otherwise would not be readily apparent. A key point to consider and understand, however, is that data mining is useful only if the data being used is accurate. GIGO (garbage-in, garbage-out) is a critical consideration for data analysis.

Understanding Data Mining in the Real World

Data mining is an iterative process: it involves several sequential steps that are repeated multiple times. These steps include cleaning the data and removing outliers to keep the data set consistent. The data is then integrated and grouped into subsets based on patterns and statistical techniques. These patterns are analyzed further to yield information and knowledge.
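The cleanup step can be illustrated with a minimal sketch. It assumes that missing values are stored as None and that outliers are defined by the common 1.5 x IQR rule; neither choice comes from the text above, and the function name clean and the sample values are hypothetical.

```python
from statistics import quantiles


def clean(values):
    """Drop missing entries, then remove outliers with the 1.5 * IQR rule."""
    # Assumption: a missing measurement is represented as None.
    present = [v for v in values if v is not None]
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in present if low <= v <= high]


# Hypothetical raw readings with one gap and one obvious outlier.
raw = [12, 14, None, 13, 15, 14, 250, 13]
cleaned = clean(raw)
print(cleaned)
```

In practice the right outlier rule depends on the data, and the process is repeated as new patterns reveal further problems in the input.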

When considering data mining, it is important to understand the different techniques and processes used. These are as follows:


Association Analysis

With association analysis, data analysts look for patterns and correlations between data points, such as items that frequently occur together.
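As a rough illustration, association between items is often measured with support, confidence and lift. The sketch below assumes shopping-basket style data; the baskets and the helper names support and confidence are hypothetical and only meant to show the arithmetic.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical baskets, each a set of purchased items.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

sup = support(baskets, {"bread", "butter"})
conf = confidence(baskets, {"bread"}, {"butter"})
# Lift above 1 suggests the items appear together more often than chance would predict.
lift = conf / support(baskets, {"butter"})
print(sup, conf, lift)
```

A high lift on a handful of baskets proves little; with real data, analysts usually also require a minimum support before trusting a rule.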


Regression Analysis

Regression analysis differs from association analysis. Here, data analysts try to explain how a dependent variable changes in response to one or more independent variables.
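The simplest case, one independent variable and a straight-line relationship, can be sketched with ordinary least squares. The variable names ad_spend and sales and their values are invented for illustration; real data rarely fits this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable (ad spend) and dependent variable (sales).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

Here the slope estimates how much the dependent variable changes for each unit of the independent variable.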

Prediction and Classification

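At this stage, models are created to account for data points that fall outside of the association and regression analysis stages, for example by assigning new records to known categories (classification) or by estimating values that have not yet been observed (prediction).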
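One very simple way to classify a new data point is to assign it to the class whose average (centroid) is closest. The sketch below uses this nearest-centroid approach purely as an example; the text does not name a specific method, and the training points and the labels "low" and "high" are hypothetical.

```python
from math import dist


def centroids(labeled):
    """Average the points of each class into one centroid per label."""
    groups = {}
    for point, label in labeled:
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(coord) / len(points) for coord in zip(*points))
        for label, points in groups.items()
    }


def classify(point, cents):
    """Return the label whose centroid is nearest to `point`."""
    return min(cents, key=lambda label: dist(point, cents[label]))


# Hypothetical labeled training data: (features, label).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]

cents = centroids(training)
label_a = classify((2.0, 1.5), cents)
label_b = classify((7.0, 8.0), cents)
print(label_a, label_b)
```

Real classifiers are normally evaluated on data held back from training, which this sketch omits.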

How Data Mining Works in the World of AI

Data mining and AI are related in many ways. Data mining looks for patterns in large volumes of data. AI, however, strives to replicate human behavior and uses data to learn and to grow in scope and skill. Machine learning (ML) sits between the two: it uses data to find patterns and, based on the patterns discovered, takes pre-programmed actions.