Machine Learning & Data Mining | CSE 450

09 Prove: Assignment

Apriori Experimenting

Objective

Be able to use Association Rule Mining to discover rules from unlabeled data.

Instructions

For this assignment, you will not implement the Apriori algorithm yourself. Instead, you will use an existing implementation to discover rules in unlabeled data.

Please refer to your preparation reading for this week. We will be using the same dataset and algorithm that the reading walks through in its example.

You can obtain the dataset and algorithm in R by installing and using the "arules" package:


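install.packages("arules")  # one-time install of the package
library(arules)             # load the package into the current session
data(Groceries)             # load the Groceries transaction dataset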

Then use the apriori function to generate a set of rules as outlined in the reading.
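
As a starting point, a call along the following lines should produce a first set of rules. This is a minimal sketch, not the reading's exact code: the threshold values (support = 0.01, confidence = 0.5) are assumptions chosen only for illustration, and it assumes the arules package is already installed.

```r
library(arules)
data(Groceries)

# Mine rules; the two thresholds are illustrative assumptions, not required values
rules <- apriori(Groceries,
                 parameter = list(support = 0.01, confidence = 0.5))

# Show the five rules with the highest lift
inspect(head(sort(rules, by = "lift"), 5))
```

Changing the by argument to "support" or "confidence" sorts the same rule set by those measures instead.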

Experiment Guidelines

Your assignment is to experiment with this dataset, using different parameters of the Apriori algorithm, to identify the following:

  1. The 5 rules you can find with the highest support

  2. The 5 rules you can find with the highest confidence

  3. The 5 rules you can find with the highest lift

  4. The 5 rules you think are the most interesting

Note: If you only run the algorithm with one set of parameters and sort by each of these columns, you will NOT find the highest values for each one. You will need to try different parameter combinations to find even higher values.
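
One way to act on the note above is to loop over several threshold combinations and compare what each one yields. The sketch below is only a suggestion: the grid values are hypothetical, the minlen = 2 setting is an assumption added here to skip rules with an empty left-hand side, and it assumes the arules package is installed.

```r
library(arules)
data(Groceries)

# Hypothetical grid of thresholds to try; adjust freely
grid <- expand.grid(support = c(0.05, 0.01, 0.001),
                    confidence = c(0.8, 0.5, 0.2))
grid$n_rules <- 0L
grid$max_lift <- NA_real_

for (i in seq_len(nrow(grid))) {
  rules <- apriori(Groceries,
                   parameter = list(support = grid$support[i],
                                    confidence = grid$confidence[i],
                                    minlen = 2),  # skip rules with an empty left-hand side
                   control = list(verbose = FALSE))
  grid$n_rules[i] <- length(rules)
  if (length(rules) > 0) {
    # Highest lift found with this combination of thresholds
    grid$max_lift[i] <- max(quality(rules)$lift)
  }
}

print(grid)
```

Lower thresholds generally let more rules through, so comparing the rows of the resulting table shows which settings are worth inspecting more closely.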

For each of these rules, please list the rule along with its support, confidence, and lift, as reported by R. For example:


{other vegetables, butter} => {whole milk} 0.01148958 0.5736041 2.244885
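
The three numbers after the rule are its support, confidence, and lift, in that order. To make those measures concrete, the sketch below computes them by hand in base R for one rule. The five toy transactions and the helper support_of are made up for illustration; they are not part of Groceries or of the arules package.

```r
# Toy transactions (hypothetical data, not taken from Groceries)
transactions <- list(
  c("milk", "bread"),
  c("milk", "butter"),
  c("bread", "butter"),
  c("milk", "bread", "butter"),
  c("milk")
)

# Fraction of transactions that contain every item in `items`
support_of <- function(items) {
  mean(vapply(transactions, function(tx) all(items %in% tx), logical(1)))
}

# Rule under study: {bread} => {milk}
lhs <- c("bread")
rhs <- c("milk")

support    <- support_of(c(lhs, rhs))       # share of transactions with lhs and rhs together
confidence <- support / support_of(lhs)     # share of lhs transactions that also contain rhs
lift       <- confidence / support_of(rhs)  # confidence relative to how common rhs is overall

print(c(support = support, confidence = confidence, lift = lift))
```

A lift above 1 suggests the two sides occur together more often than they would if they were independent, and a lift below 1 suggests the opposite.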

After this, you are encouraged to go beyond these requirements by exploring this dataset in more depth or by exploring association rules in other datasets.

Submission

When complete, fill out the submission form and upload it to I-Learn.