09 Prove: Assignment
Apriori Experimenting
Objective
Be able to use Association Rule Mining to discover rules from unlabeled data.
Instructions
For this assignment you will not implement the Apriori algorithm yourself. Instead, you will use an existing implementation to discover rules in unlabeled data.
Please refer to your preparation reading for this week. We will be using the same dataset and algorithm that the reading walks through in its example.
You can obtain the dataset and algorithm in R by installing and using the "arules" package:
install.packages('arules');
library(arules);
data(Groceries);
Then use the apriori function to generate a set of rules as outlined in the reading.
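As a starting point, a call along the following lines generates rules and shows the top few by one measure. This is a minimal sketch, not the reading's exact code: the support and confidence thresholds and the minlen setting are illustrative assumptions that you should change as you experiment.

```r
library(arules)
data(Groceries)

# Assumed thresholds for illustration only; the assignment does not prescribe them.
rules <- apriori(Groceries,
                 parameter = list(support = 0.01,
                                  confidence = 0.5,
                                  minlen = 2))  # minlen = 2 skips rules with an empty left-hand side

# Sort the rules by lift (highest first) and print the first five.
inspect(head(sort(rules, by = "lift"), n = 5))
```

Changing by = "lift" to by = "support" or by = "confidence" reorders the same rule set by a different measure.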
Experiment Guidelines
Your assignment is to experiment with this dataset, using different parameters of the Apriori algorithm, to identify the following:
The 5 rules you can find with the highest support
The 5 rules you can find with the highest confidence
The 5 rules you can find with the highest lift
The 5 rules you think are the most interesting
Note: Please keep in mind that if all you do is run the algorithm with one set of parameters and sort by these different columns, you will NOT find the highest values for each one. The minimum support and minimum confidence thresholds determine which rules are generated in the first place, so you'll need to try different parameter combinations to find even higher values.
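One way to organize this search is to run apriori over a small grid of thresholds and record the best value of each measure per run. The sketch below assumes a hypothetical grid of values and hypothetical names (grid, runs, summary_table); none of these come from the reading, and a different grid may well turn up better rules.

```r
library(arules)
data(Groceries)

# Hypothetical grid of minimum thresholds; adjust or extend as you see fit.
grid <- expand.grid(min_support = c(0.001, 0.005, 0.01),
                    min_confidence = c(0.2, 0.5, 0.8))

runs <- lapply(seq_len(nrow(grid)), function(i) {
  rules <- apriori(Groceries,
                   parameter = list(support = grid$min_support[i],
                                    confidence = grid$min_confidence[i],
                                    minlen = 2),
                   control = list(verbose = FALSE))  # suppress progress output
  if (length(rules) == 0) return(NULL)              # no rules met these thresholds
  q <- quality(rules)                                # data frame of rule measures
  data.frame(min_support = grid$min_support[i],
             min_confidence = grid$min_confidence[i],
             n_rules = length(rules),
             max_support = max(q$support),
             max_confidence = max(q$confidence),
             max_lift = max(q$lift))
})

# Combine the per-run summaries; runs that returned NULL are dropped by rbind.
summary_table <- do.call(rbind, runs)
print(summary_table)
```

The table only points to promising threshold combinations. To list the actual rules, rerun apriori with those thresholds and use sort and inspect on the result.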
For each of these rules, please list the rule along with its support, confidence, and lift (in that order), just as you can obtain them from R. For example:
{other vegetables, butter} => {whole milk} 0.01148958 0.5736041 2.244885
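It can help to see how the three numbers relate. Using the standard definitions, confidence is the support of the whole rule divided by the support of the left-hand side, and lift is the confidence divided by the support of the right-hand side. The short sketch below applies those definitions to the example rule above; the variable names are made up for illustration.

```r
# Values copied from the example rule {other vegetables, butter} => {whole milk}.
support    <- 0.01148958
confidence <- 0.5736041
lift       <- 2.244885

# confidence = support(rule) / support(lhs), so:
lhs_support <- support / confidence   # share of transactions with other vegetables and butter

# lift = confidence / support(rhs), so:
rhs_support <- confidence / lift      # share of transactions with whole milk

print(round(c(lhs_support = lhs_support, rhs_support = rhs_support), 4))
```

A lift above 1, as here, means the right-hand side appears more often in transactions containing the left-hand side than it does overall.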
After this, you are encouraged to go above and beyond these requirements by exploring this dataset in much greater depth or by exploring association rules in other datasets.
Submission
When complete, fill out the submission form and upload it to I-Learn.