Machine Learning & Data Mining | CSE 450

11 Prove : Assignment

Ensemble Learning

Objective

Be able to combine different machine learning algorithms in different forms of ensembles.

Instructions

This assignment is very open-ended. Your task is choose a few different datasets and apply different classification algorithms to these datasets, both individually and in different kinds of ensembles.

In particular, the following are the specific baseline requirements:

For more information about Gradient Boosting Machines, please read Understanding Gradient Boosting Machines.

Languages and Libraries

You are welcome to use any language or set of libraries you like for this assignment. In Python, you can find bagging and other boosting algorithms in the sklearn-ensemble package.

For Random Forests and GBMs, there are a number of good options out there. One that you might consider is LightGBM which has an sk-learn style API, which is easy to use and can be used for both random forests and GBMs by switching the boosting_type parameter. This is also a real-world tool that is used in industry today.

As always, you are encouraged to go above and beyond by experimenting on several more datasets of different make-ups, as well as significant experimentation with the algorithms and their parameters.

Submission

When complete, note the experiments and accuracies in the submission form and upload it to I-Learn.