Online Courses
Free Tutorials  Go to Your University  Placement Preparation 
Best Products for Students
0 like 0 dislike
5.5k views
in RTU B.Tech (CSE-VI Sem) Machine Learning Lab by Goeduhub's Expert (3.1k points)
edited by
Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.

Goeduhub's Top Online Courses @Udemy

For Indian Students- INR 360/- || For International Students- $9.99/-

S.No.

Course Name

 Coupon

1.

Tensorflow 2 & Keras:Deep Learning & Artificial Intelligence || Labeled as Highest Rated Course by Udemy

Apply Coupon

2.

Complete Machine Learning & Data Science with Python| ML A-Z Apply Coupon

3.

Complete Python Programming from scratch | Python Projects Apply Coupon
    More Courses

1 Answer

0 like 0 dislike
by Goeduhub's Expert (3.1k points)
edited by
 
Best answer

K-Nearest Neighbor (KNN)

  • KNN is simple supervised learning algorithm used for both regression and classification problems.
  • KNN is  basically store all available cases and classify new cases based on similarities with stored cases.

Concept: So the concept that KNN works on is Basically similarities measurements, for example, if you look at Mango,it is more similar to Apple then dog or cat, then what KNN will do is put it in the category of fruits not in the category of animals.

What is K in KNN 

What happens in KNN,we trained the model and after that we want to test our model , means we want to classify our new data (test-data),for that  we will check some (K) classes around it and assign the most common class to the test-data.

K- Number of nearest neighbors  

K=1 means the testing data are given the same level as the closet example in training set.

K=4 means the labels of the four closet classes are check and most common class is assign to the testing data.

How does KNN is work ?

Let's understand it with the above given diagram

  1. In this diagram we have 2 classes one blue class one red class
  2. Now we have a new green point, we have to find out whether this point is in class red or blue
  3. For this, we will define the value of K
  4. At K= 1, we will see the distance from the green point to the nearest points, and select the point with lowest distance and classify the green point in that class, here it red.
  5. At K=5 We will calculate the distance from the green point to the nearest points and select the five points with the lowest distance and classify the green point to the most common class, that is red here.
  6. How to choose the value of K? The value of k is not defined, it depends on the cases.

Lazy Learner

  1. KNN is simple algorithm for classification but that's not the reason
  2. KNN is lazy learner  because it doesn't learn a discriminative function from the training data but memorizes the training dataset instead.

KNN Algorithm 

let's understand the concept of KNN algorithm with iris flower problem 

Data: This data consist of total 150  instances (samples) , 4 features , and three classes (targets).

Problem: Using four features we have to classify which flower belongs to which category.

Importing Data-set 

import sklearn 

import pandas as pd

from sklearn.datasets import load_iris

iris=load_iris()

iris.keys()

df=pd.DataFrame(iris['data'])

print(df)

print(iris['target_names'])

iris['feature_names']

Output

Note: 

  1. Now we need a target and data so that we can train the model
  2. As we know that we have to find out the class from the features we have
  3. With this logic,our target is classes (0,1,2) and data is in df.

X=df

y=iris['target']

Splitting Data

  1. The data is split so that with some data we can train the model and from the remaining data we can test the model and can check how well our model is
  2. To do this we have an inbuilt function in sklearn 

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

Note: It will split our 33% data into testing data and remaining data is our training  data

KNN Classifier and Training of the Model

from sklearn.neighbors import KNeighborsClassifier

knn=KNeighborsClassifier(n_neighbors=3)

Note: 

  1. It implements the concepts of KNN. Here we have taken number of neighbors (K)= 3.
  2. First, it will calculate the distance with all the training points to the  test point and then select the three lowest distance points.
  3. And  test data point is classify to the class  most common in among three.
knn.fit(X_train,y_train)

Note:- Training the model with features values (data) and target values (target)

Prediction and Accuracy

Demo:

  1.  Here I want to show you just by taking one data point
  2. we have a data point x_new 

import numpy as np

x_new=np.array([[5,2.9,1,0.2]])

 Now we want to see the class or category of this point 

prediction=knn.predict(x_new)

iris['target_names'][prediction]

Output 

 

Note: As we can see that our point belongs to class (0 or setosa class), this demo is just for understanding 

from sklearn.metrics import confusion_matrix

from sklearn.metrics import accuracy_score

from sklearn.metrics import classification_report

y_pred=knn.predict(X_test)

cm=confusion_matrix(y_test,y_pred) 

print(cm)

print(" correct predicition",accuracy_score(y_test,y_pred))

print(" worng predicition",(1-accuracy_score(y_test,y_pred)))

 Output

Note: As you can see in confusion matrix only one prediction is wrong , and also our accuracy is 0.98 (98%).

Practice KNN - We have a dataset that contains multiple user's information through the social network who are interested in buying SUV Car or not. 

DataSet-Click Here for Download user_data.csv 

Click here for more programs of  RTU ML LAB


Artificial Intelligence(AI) Training in Jaipur 

Machine Learning(ML) Training in Jaipur 

3.3k questions

7.1k answers

394 comments

4.6k users

 Goeduhub:

About Us | Contact Us || Terms & Conditions | Privacy Policy || Youtube Channel || Telegram Channel © goeduhub.com Social::   |  | 
...