03 Prepare: Reading
Preparation Material
Before coming to class this week, please work through the following:
Confusion Matrices
Cross Validation
sklearn documentation (Read at least the intro and 3.1.1)
Handling Data on Different Scales
About Feature Scaling and Normalization (Don't worry about the details of Naive Bayes or PCA, we'll talk about them in future weeks.)
Handling Categorical Data
Handling Missing Data in Python
Handling Missing Data in General
Now that you have a basic idea of how to work with missing data in Python, we need to consider, at a higher level, the proper strategies for handling it.
First, we need to understand why data might be missing: Wikipedia and MeasuringU: 3 Types of Missing Data
Next, what are the options for handling missing data? MeasuringU: 7 Ways to Handle Missing Data
Other Helpful Resources
Working with Pandas - DataCamp: DataFrames in Python
Pivoting, Stacking, and Unstacking - Reshaping in Pandas