As I mentioned in my last post, I am working on a data mining project for a matchmaking application (it's NOT Yahoo Personals..)
Here is a partial list of questions that I will attempt to answer using DMX queries. Possible algorithms are listed next to the questions.
- Those who contacted this <person> also contacted the following people (Already done..)
- Predict a list of possible contact profiles based on the user's demographics (Association Rules) (done).
- Text mining. Going through the comments that users entered and extracting some useful information. (done)
- If you like this person, the following people have similar characteristics. (Microsoft Clustering algorithm, maybe?)
- What is the probability that he or she (profile) will contact my profile? (Microsoft Decision Trees?)
- Am I compatible with him or her (profile)? (Which algorithm? Clustering, maybe?)
- What kind of female demographics are contacting this profile and vice versa? (Association Rules).
- Is this user going to be an active user? (Decision Trees)
- List of potential female profile demographics based on a certain male user demographic. (Association Rules)
- At what rate is the site acquiring or losing customers? (Time Series)
- Registration forecast for each payment plan category into the future (Time Series).
- Identify customer groups based on navigation patterns (Microsoft Sequence Clustering)
- Other types of forecasts, such as the rate of male registrations versus female (Time Series).
- Create a report on totals of subscriptions by different attributes such as gender, education, income etc. (OLAP cubes) Done
- Create a report on registrations by different attributes such as gender, education, income etc. (OLAP cubes). Done
- Populate the data mining algorithms from OLAP cubes. Done
- Find hidden similar attributes between different types of users (Clustering algorithm). Done
- Find common attributes, if any, among the customers who belong to a certain paid membership group (Clustering algorithm). Done
- What is the probability that I belong to the Contacted Group in male category (or female category). (Clustering algorithm) Done
- Growth forecast for different categories of the site. (Time Series) Done
- What truly differentiates the paying from the non-paying customers (Naive Bayes algorithm). Done
- List of users that are most likely to convert to paying customers (Decision Trees) (Done)
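To make the first question on the list a bit more concrete ("those who contacted this person also contacted..."), here is a minimal sketch in plain Python. It is NOT DMX and it is not how the project itself is implemented; it only counts co-occurrences by hand to show the idea. The function name `also_contacted`, the `(sender, recipient)` pair layout, and the sample data are all made up for illustration.

```python
from collections import Counter, defaultdict


def also_contacted(contacts, target, top_n=3):
    """Rank profiles contacted by the same users who contacted `target`.

    `contacts` is an iterable of (sender, recipient) pairs.
    This layout is an assumption for the sketch, not the real schema.
    """
    # Group the set of contacted profiles under each sender.
    recipients_by_sender = defaultdict(set)
    for sender, recipient in contacts:
        recipients_by_sender[sender].add(recipient)

    # For every sender who contacted the target, count the other
    # profiles that sender also contacted.
    counts = Counter()
    for recipients in recipients_by_sender.values():
        if target in recipients:
            counts.update(recipients - {target})

    # Sort by count (highest first), then by name so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical sample data: users u1..u4 contacting profiles p1..p4.
sample_contacts = [
    ("u1", "p1"), ("u1", "p2"), ("u1", "p3"),
    ("u2", "p1"), ("u2", "p2"),
    ("u3", "p1"), ("u3", "p4"),
    ("u4", "p2"), ("u4", "p3"),
]

print(also_contacted(sample_contacts, "p1"))
# [('p2', 2), ('p3', 1), ('p4', 1)]
```

A real mining model would also weigh how popular each profile is overall, which raw counts like these ignore.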
Once this functionality is implemented the right way, HOPEFULLY, the site will become much more profitable and will be rolling in money.
That is the REAL objective of data mining: 1) Identify the important correlations in your data. 2) Use those correlations to bring in the maximum profit.
So I've got lots of work cut out for me, and it's time for me to get busy...
Happy Data Mining...
ZULFIQAR SYED


I very much appreciate your excellent work and performance.
Posted by: Suresh Kumar Dewani | May 05, 2006 at 09:33 PM
Suresh, thank you for the encouraging comment. I appreciate it very much...
ZULFIQAR SYED
Posted by: ZULFIQAR SYED | May 06, 2006 at 08:42 AM
Just wanted to thank you for this excellent information resource.
Keep up the good work.
Rabbani, Mohammed
DW/OLAP/BI Solutions Developer
Posted by: Rabbani Mohammed | July 10, 2006 at 02:38 PM