Machine Learning & Data Mining | CSE 450

02 Teach: Python Practice

Objective

Better understand Python syntax that will be helpful for the k-Nearest Neighbors assignment.

Instructions

Start with the following variables:


import numpy as np

x = np.array([3, 6])
y = np.array([5, 2])
data = np.array([[2, 3], [3, 4], [5, 7], [2, 7], [3, 2], [1, 2], [9, 3], [4, 1]])
animals = ["dog", "cat", "bird", "fish", "fish", "dog", "cat", "dog"]

Then, work together to accomplish the following programming tasks:

You are welcome to use a Jupyter notebook or write this in a script. If you are using a Jupyter notebook, you can download a template for this assignment here.

  1. Compute the Euclidean distance between two points: x and y.

  2. Turn your code into a function that can compute the distance between any two points.

    Then verify that your function works by calling it with x and y.

  3. Compute the distance between x and every row in the data array. Save each of these distances into a new list or NumPy array.

  4. Find the smallest two values in the list of distances you created in the last step.

  5. Find the indexes of the two smallest values in the list of distances.

    (Hint: This is going to be super useful if you want to know not just the smallest distance but also which item corresponded to the smallest distance, so you could get its target value.)

  6. Find the animal that occurs most frequently in the animals list.
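
The sketch below illustrates Python and NumPy building blocks that may be useful for the tasks above. It is not the instructor solution: the values `p`, `q`, `points`, and `labels`, and the function name `euclidean_distance`, are hypothetical and deliberately differ from the assignment's variables, so you would still need to adapt the ideas to `x`, `y`, `data`, and `animals`. It is one possible approach among several.

```python
from collections import Counter

import numpy as np

# Hypothetical toy values, deliberately different from the assignment's variables.
p = np.array([0, 0])
q = np.array([3, 4])


def euclidean_distance(a, b):
    """Return the straight-line distance between two points of equal length."""
    return np.sqrt(np.sum((a - b) ** 2))


d = euclidean_distance(p, q)

# Distance from p to every row of a 2-D array, computed with broadcasting:
# (points - p) subtracts p from each row, and axis=1 sums within each row.
points = np.array([[3, 4], [0, 1], [6, 8], [0, 2]])
distances = np.sqrt(np.sum((points - p) ** 2, axis=1))

# np.argsort returns the indexes that would sort the array in ascending order,
# so the first two entries are the positions of the two smallest distances.
order = np.argsort(distances)
nearest_two_idx = order[:2]
nearest_two = distances[nearest_two_idx]

# Counter tallies how often each item occurs; most_common(1) returns a list
# holding the single (item, count) pair with the highest count.
labels = ["red", "blue", "red", "green"]
most_common_label, count = Counter(labels).most_common(1)[0]

print(d, nearest_two, nearest_two_idx, most_common_label)
```

Keeping the indexes from `np.argsort`, rather than only the sorted distances, is what lets you look up the matching label for each nearby point.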

Instructor Help

Please do not open the instructor code until you have worked on this assignment for the class period. At that point, if you are still struggling to complete any of these sections, you are welcome to use this code to help guide you through the remaining sections:

Submission

You are welcome to complete this exercise after class (either by yourself or with others), before submitting the quiz.

When complete, report on your progress with the accompanying I-Learn quiz.