DATA MINING
According to Berry and Linoff, data mining is the exploration and analysis, by
automatic or semiautomatic means, of large quantities of data in order to
discover meaningful patterns and rules. This definition naturally raises the
question: how does data mining differ from OLAP? OLAP (Online Analytical
Processing) is undoubtedly a semiautomatic means of analyzing data, but the
main difference lies in the quantities of data that can be handled. There are
other differences as well. Tables 1 and 2 summarize these differences.
Table 1: OLAP vs. Data Mining – Past vs. Future
| OLAP: Report on the past | Data Mining: Predict the future |
| Who were our top 100 customers over the last three years? | Which 100 customers offer the best profit potential? |
| Which customers defaulted on their mortgages in the last two years? | Which customers are likely to be bad credit risks? |
| What were the sales by territory last quarter compared to the targets? | What are the anticipated sales by territory and region for next year? |
| Which salespersons sold more than their quota during the last four quarters? | Which salespersons are expected to exceed their quotas next year? |
| Last year, which stores exceeded the total prior-year sales? | For the next two years, which stores are likely to have the best performance? |
| Last year, which were the top five best-performing promotions? | What is the expected return for next year’s promotions? |
| Which customers switched to other phone companies last year? | Which customers are likely to switch to the competition next year? |
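
The contrast between reporting on the past and predicting the future can be made concrete with a small sketch based on the "sales by territory" row. Everything in it is an illustrative assumption and not taken from Berry and Linoff: the `SALES` figures are invented toy data, the function names are hypothetical, and a straight-line least-squares trend is only a deliberately tiny stand-in for the far richer techniques used in real data mining.

```python
from collections import defaultdict

# Hypothetical toy data (not from the source): (territory, quarter, sales).
SALES = [
    ("North", 1, 100.0), ("North", 2, 110.0),
    ("North", 3, 120.0), ("North", 4, 130.0),
    ("South", 1, 200.0), ("South", 2, 190.0),
    ("South", 3, 180.0), ("South", 4, 170.0),
]


def olap_sales_by_territory(rows, quarter):
    """OLAP-style question: what WERE the sales by territory in a past quarter?"""
    totals = defaultdict(float)
    for territory, q, amount in rows:
        if q == quarter:
            totals[territory] += amount
    return dict(totals)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast_sales_by_territory(rows, quarter):
    """Mining-style question: what are sales LIKELY to be in a future quarter?"""
    history = defaultdict(list)
    for territory, q, amount in rows:
        history[territory].append((q, amount))
    forecast = {}
    for territory, points in history.items():
        xs = [q for q, _ in points]
        ys = [amount for _, amount in points]
        a, b = fit_line(xs, ys)
        # Extrapolate the fitted trend to the requested quarter.
        forecast[territory] = a + b * quarter
    return forecast


if __name__ == "__main__":
    print("Last quarter (report):", olap_sales_by_territory(SALES, 4))
    print("Next quarter (forecast):", forecast_sales_by_territory(SALES, 5))
```

The first function only filters and aggregates facts that are already recorded, which is the kind of question in the left-hand column. The second fits a model to the history and extrapolates beyond it, which is the kind of question in the right-hand column, and its answer is an estimate instead of a recorded fact.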