Data mining is not just for technical people.
And you might have to cluster your data even if you’re just segmenting your clients for your next marketing campaign. Or maybe you’re just a student who’d like to find out the basics of Weka (data mining software).
Here’s a brief data mining tutorial for non-techies to help you get started with clustering:
Where can you get Weka?
The safest option is its official website. Download Weka from there (it doesn’t work without Java, so make sure Java is installed first).
And it’s free. 😁
Where do you find the right dataset?
Weka doesn’t work with just any data file. Its native format is ARFF, and it can also load CSV files. And the algorithms you’re going to choose won’t fit all datasets.
So, if you want to use a specific algorithm, it’s best to just create your own set of data over which you have full control. Aim for more than 1,000 rows so your results are reliable.
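If you go the create-your-own route, here’s a minimal sketch of how you could generate a made-up dataset and save it as an ARFF file that Weka can open. Everything in it is an assumption for illustration: the column names echo the client attributes used in the case study below, while the category values, the value ranges, and the file name bank_clients.arff are invented.

```python
import random

# Hypothetical categories for the nominal columns (illustrative only,
# not taken from any real bank).
NOMINAL = {
    "job": ["admin", "technician", "services", "retired"],
    "marital": ["single", "married", "divorced"],
    "education": ["primary", "secondary", "tertiary"],
    "housing": ["yes", "no"],
    "loan": ["yes", "no"],
}
COLUMNS = ["age", "job", "marital", "education", "balance", "housing", "loan"]


def make_rows(n, seed=42):
    """Create n fictional clients; the value ranges are arbitrary assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {name: rng.choice(values) for name, values in NOMINAL.items()}
        row["age"] = rng.randint(18, 80)
        row["balance"] = rng.randint(-500, 20000)
        rows.append(row)
    return rows


def to_arff(rows, relation="bank_clients"):
    """Render the rows as ARFF text: a header describing each column, then the data."""
    lines = ["@relation " + relation, ""]
    for name in COLUMNS:
        if name in NOMINAL:
            # Nominal columns list their allowed values in curly braces.
            lines.append("@attribute " + name + " {" + ",".join(NOMINAL[name]) + "}")
        else:
            lines.append("@attribute " + name + " numeric")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(row[name]) for name in COLUMNS))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A bit over 1,000 rows, in line with the rule of thumb above.
    with open("bank_clients.arff", "w") as handle:
        handle.write(to_arff(make_rows(1200)))
```

Keep in mind that random values like these have no real structure in them, so they’re only good for practicing with the tool, not for drawing conclusions.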
But here are three sources where you could find some decent datasets:
(Drop me a line if you know more.) 😉
And if you’re looking for a case study (in plain English) with few technical elements so you can get an idea of how clustering really works: 🎉
Case study – Bank clients segmentation through clustering
Disclaimer: part of the case study is missing because I did it for a college project and the results can’t be disclosed.
- Highlight the use of Weka for basic data mining processes
- Discover the most representative segment of a (fictional) bank’s clients
- Find out how a (fictional) bank’s services can be improved through an online marketing campaign that could bring in new clients, starting from data on clients’ age, job, marital status, education, account balance, housing, and loans
Data mining is the process through which valid and previously unknown information is extracted from a specific set of data and is then used to make an important business decision.
Briefly put, data mining is a method that allows YOU to find similar behavioral patterns, trends, or tendencies in an existing data set.
The main goal of the entire process is DISCOVERY.
With that in mind, I chose to identify the most significant clients of a (fictional) bank through clustering.
For this study, I picked a type of application often used in marketing and retail: identifying significant client profiles and behavior patterns.
As the field of application, I chose banking. In this case, the main goal was to identify relevant clients (who are also loyal) and use their profile to create new digital marketing campaigns.
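To give you a feel for what clustering does under the hood, here’s a small sketch of k-means, one common clustering algorithm. It’s an assumption on my part for illustration: this isn’t the case study’s actual setup, the ten clients are made up, and only two attributes (age and balance) are used to keep things readable. The function names are hypothetical too.

```python
# Made-up clients as (age, account balance) pairs, for illustration only.
CLIENTS = [
    (22, 300), (25, 500), (24, 450), (23, 350), (27, 600),
    (45, 5200), (48, 5000), (50, 5500),
    (68, 12000), (70, 11500),
]


def normalize(points):
    """Rescale each column to the 0..1 range so age and balance weigh equally."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1 for col in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans)) for p in points]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster label (0..k-1) per point."""
    n = len(points)
    ordered = sorted(points)
    # Deterministic start (a simplification): take the middle point of
    # k equal slices of the sorted data instead of random picks.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def largest_segment(clients, k=3):
    """Return the clients that fall in the biggest cluster."""
    labels = kmeans(normalize(clients), k)
    biggest = max(set(labels), key=labels.count)
    return [c for c, label in zip(clients, labels) if label == biggest]


if __name__ == "__main__":
    segment = largest_segment(CLIENTS)
    avg_age = sum(age for age, _ in segment) / len(segment)
    avg_balance = sum(balance for _, balance in segment) / len(segment)
    print(len(segment), "clients, average age", avg_age, "average balance", avg_balance)
```

With this toy data, the biggest group turns out to be the five younger clients with low balances. That’s the kind of “most representative segment” you’d then build a campaign around. In Weka itself you wouldn’t write any of this by hand; you’d pick a clusterer from its menus.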
More generally, data mining can be used to identify loyal clients or errors in the use of banking services, discover new behavior, predict how a service will be used, or estimate possible client administration costs.