in Python Programming by Goeduhub's Expert (2.2k points)

Predict Loan Eligibility for Dream Housing Finance company

Dream Housing Finance company deals in all kinds of home loans and has a presence across urban, semi-urban, and rural areas. A customer first applies for a home loan, after which the company validates the customer's eligibility for the loan.

The company wants to automate the loan eligibility process in real time, based on the details customers provide when filling in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate the process, the company has provided a dataset for identifying the customer segments that are eligible for a loan, so that it can specifically target these customers.

Download the dataset for predicting loan eligibility (Decision Tree)

5 Answers

by (132 points)
by (130 points)
import pandas as pd

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt

import sklearn

df=pd.read_csv("/content/train_ctrUa4K.csv")

df.head(10)

df.shape

df.dtypes

df.corr(numeric_only=True)  # correlate numeric columns only; the dataset also has text columns

a=df['Property_Area'].values  # raw Property_Area values (not used later)

df.isnull().sum()

from sklearn.preprocessing import LabelEncoder

le=LabelEncoder()

df.Property_Area=le.fit_transform(df.Property_Area)

df.Property_Area.head(20)

df.Loan_Status=le.fit_transform(df.Loan_Status)

df.Loan_Status.head(20)

# fill missing values with fixed defaults (np.NAN no longer exists in NumPy 2.0, so use fillna)
newdf=df.fillna({'LoanAmount':100,'Loan_Amount_Term':360.0,'Credit_History':1.0})

newdf

newdf.isnull().sum()

                                        
sns.relplot(x='ApplicantIncome',y='LoanAmount',hue="Credit_History",data=newdf)

x=newdf.drop(['Loan_ID','Gender','Married','Dependents','Education','Self_Employed','Loan_Status'],axis='columns')

print(x)

y=newdf['Loan_Status']

print(y)

from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3,random_state=1)

print(len(x_train))

print(len(x_test))

from sklearn.tree import DecisionTreeClassifier

clf=DecisionTreeClassifier(random_state=5)

clf.fit(x_train,y_train)

y_pred=clf.predict(x_test)

y_pred

from sklearn.metrics import accuracy_score

Accuracy=accuracy_score(y_test,y_pred)

print("Accuracy is",Accuracy*100,'%')

from sklearn.metrics import confusion_matrix

cm=np.array(confusion_matrix(y_test,y_pred))

cm

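from sklearn import tree

# create the figure first so the tree is drawn once, on the figure that gets saved
plt.figure(figsize=(20,10))

tree.plot_tree(clf,filled=True,feature_names=list(x.columns))

plt.savefig('tree.jpg',format='jpg',bbox_inches="tight")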
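
As a follow-up to the accuracy score and confusion matrix computed above, here is a small sketch of how the usual metrics relate to the four cells of a 2x2 confusion matrix. The counts and the function name `summarize_confusion` are made up for illustration (they are not results from this dataset). The layout assumes scikit-learn's convention of rows = actual class and columns = predicted class, with class 1 taken to mean an approved loan.

```python
def summarize_confusion(cm):
    """Return accuracy, precision and recall for a 2x2 confusion matrix.

    Assumed layout: cm[actual][predicted], i.e. cm = [[TN, FP], [FN, TP]].
    """
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_cm = [[30, 20], [10, 40]]
metrics = summarize_confusion(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With the `cm` array from the answer, `summarize_confusion(cm.tolist())` should give the same accuracy as `accuracy_score`, plus a view of how many approvals were wrongly predicted or missed.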




by (278 points)

GO_STP_6266

https://www.linkedin.com/posts/sahil-parmar-4099391bb_google-colaboratory-activity-6809688042270547968-F1vX

by (132 points)

GO_STP_379:

import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score,confusion_matrix
import matplotlib.pyplot as plt
#  read dataset
data=pd.read_csv('data loan.csv')
# print(data)
# fill missing values: mode for categorical columns, median for numeric columns
# (mode() returns a Series, so take its first entry)
for column in ['Gender','Married','Dependents','Self_Employed']:
    data[column]=data[column].fillna(data[column].mode()[0])
for column in ['LoanAmount','Loan_Amount_Term','Credit_History']:
    data[column]=data[column].fillna(data[column].median())
 
# print(data.info())
# label-encode the categorical columns
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
label_encoder=LabelEncoder()
for column in ['Gender','Married','Dependents','Education','Self_Employed','Property_Area','Loan_Status']:
    data[column]=label_encoder.fit_transform(data[column])
# print(data['Loan_Status'].value_counts())
# Loan_ID is only an identifier, so it is left out of the features
X=data[['Gender','Married','Dependents','Education','Self_Employed','ApplicantIncome','CoapplicantIncome','LoanAmount','Loan_Amount_Term','Credit_History','Property_Area']]
Y=data['Loan_Status']
x_train,x_test,y_train,y_test=train_test_split(X,Y,test_size=0.3,random_state=1)
clf_model=DecisionTreeClassifier()
clf_model.fit(x_train,y_train)
print("classifier decision tree score: ",clf_model.score(x_test,y_test))
# prediction on the full dataset (this includes the training rows)
Y_pred=clf_model.predict(X)
print("predict value : ",Y_pred)
# prediction on the held-out test set
Y_pred=clf_model.predict(x_test)
print("accuracy",accuracy_score(y_test,Y_pred))
cm=confusion_matrix(y_test,Y_pred)
print("confusion matrix: ",cm)
# plot the decision tree (fontsize must be a number, not a string)
from sklearn import tree
tree.plot_tree(clf_model,fontsize=5)
text_rep=tree.export_text(clf_model,feature_names=list(X.columns))
print("decision tree",(text_rep))
 
data.hist(figsize=(15,12))
# plot the categorical columns against Loan_Status, one subplot per column
import seaborn as sns
categorical_columns=['Education','Married','Gender','Self_Employed','Property_Area','Dependents']
fig,axes=plt.subplots(2,3,figsize=(15,8))
for column,ax in zip(categorical_columns,axes.ravel()):
    sns.countplot(x=column,hue='Loan_Status',data=data,ax=ax)

plt.show()
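
The text printed by `export_text` lists the features and thresholds the tree chose. As a rough illustration of how such a split is scored, the sketch below computes Gini impurity (the default `criterion` of `DecisionTreeClassifier`) for a split on a binary feature such as Credit_History. The label lists are invented toy data, not taken from the loan dataset, and the helper names `gini` and `weighted_gini` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Toy labels (1 = approved, 0 = rejected), invented for illustration.
no_credit_history = [0, 0, 0, 1]          # rows where Credit_History == 0
has_credit_history = [1, 1, 1, 1, 1, 0]   # rows where Credit_History == 1

parent = no_credit_history + has_credit_history
parent_impurity = gini(parent)
split_impurity = weighted_gini(no_credit_history, has_credit_history)
print("parent impurity:", round(parent_impurity, 3))
print("split impurity:", round(split_impurity, 3))
```

A split whose weighted impurity is lower than the parent's separates the two classes better. In broad terms, the tree compares candidate splits this way at each node and keeps the one with the largest reduction.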
 

by (113 points)
GO_STP_7284 (SACHIN YADAV )

I have completed my assignment-11 with the help of the go_eduhub expert.
