Decision Tree Machine Learning Python Algo : Predict Loan Eligibility for Dream Housing Finance company

Question

asked Jun 11, 2021 in Python Programming by Sharda Chaudhary Goeduhub's Expert (2.2k points)

Predict Loan Eligibility for Dream Housing Finance company

Dream Housing Finance company deals in all kinds of home loans. It has a presence across urban, semi-urban and rural areas. A customer first applies for a home loan, and the company then validates the customer's eligibility for the loan.

The company wants to automate the loan eligibility process (in real time) based on the customer details provided while filling in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others. To automate this process, the company has provided a dataset for identifying the customer segments that are eligible for a loan, so that it can specifically target these customers.

Download the dataset for predicting loan eligibility with a decision tree

5 Answers

Janani · Answer 1 · 2021-06-12T17:27:57+0000

import pandas as pd

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt

import sklearn

df=pd.read_csv("/content/train_ctrUa4K.csv")

df.head(10)

df.shape

df.dtypes

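# correlate the numeric columns only; string columns such as Gender cannot be correlated

df.select_dtypes(include='number').corr()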

a= df['Property_Area'].values

df.isnull().sum()

from sklearn.preprocessing import LabelEncoder

le=LabelEncoder()

df.Property_Area=le.fit_transform(df.Property_Area)

df.Property_Area.head(20)

df.Loan_Status=le.fit_transform(df.Loan_Status)

df.Loan_Status.head(20)

# fill missing values column by column with fixed defaults

newdf=df.fillna({'LoanAmount':100,'Loan_Amount_Term':360.0,'Credit_History':1.0})

newdf

newdf.isnull().sum()
                                        
sns.relplot(x='ApplicantIncome',y='LoanAmount',hue="Credit_History",data=newdf)

x=newdf.drop(['Loan_ID','Gender','Married','Dependents','Education','Self_Employed','Loan_Status'],axis='columns')

print(x)

y=newdf['Loan_Status']

print(y)

from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3,random_state=1)

print(len(x_train))

print(len(x_test))

from sklearn.tree import DecisionTreeClassifier

clf=DecisionTreeClassifier(random_state=5)

clf.fit(x_train,y_train)

y_pred=clf.predict(x_test)

y_pred

from sklearn.metrics import accuracy_score

Accuracy=accuracy_score(y_test,y_pred)

print("Accuracy is",Accuracy*100,'%')

from sklearn.metrics import confusion_matrix

cm=np.array(confusion_matrix(y_test,y_pred))

cm

from sklearn import tree

tree.plot_tree(clf)

plt.figure()

tree.plot_tree(clf,filled=True)  

plt.savefig('tree.jpg',format='jpg',bbox_inches = "tight")
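
The code above fills missing values with fixed constants and reports accuracy from a single 70/30 split with an unconstrained tree. As a hedged alternative, the sketch below wraps median imputation and a depth-limited tree in a scikit-learn pipeline and scores it with 5-fold cross-validation. The data frame `demo` is synthetic and purely illustrative (it is not the dataset from the question), and `max_depth=3` is an assumed setting, not a tuned value.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (hypothetical values, not the real file).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "ApplicantIncome": rng.integers(1500, 20000, n).astype(float),
    "LoanAmount": rng.integers(50, 400, n).astype(float),
    "Credit_History": rng.integers(0, 2, n).astype(float),
})

# Make the target follow credit history, flipped for roughly 10% of rows.
noise = rng.random(n) < 0.1
demo["Loan_Status"] = np.where(
    noise, 1 - demo["Credit_History"], demo["Credit_History"]
).astype(int)

# Blank out a few feature values to mimic missing entries.
demo.loc[rng.choice(n, 15, replace=False), "LoanAmount"] = np.nan
demo.loc[rng.choice(n, 15, replace=False), "Credit_History"] = np.nan

X_demo = demo.drop(columns="Loan_Status")
y_demo = demo["Loan_Status"]

# The imputer is refit inside every fold, so medians come from training rows only.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    DecisionTreeClassifier(max_depth=3, random_state=5),
)

scores = cross_val_score(model, X_demo, y_demo, cv=5)
print("fold accuracies:", scores.round(3))
print("mean accuracy:", round(scores.mean(), 3))
```

With the real data, `X_demo` and `y_demo` would be replaced by the `x` and `y` built above, before any missing values are filled. Averaging over folds gives a steadier estimate than one split, and limiting the depth usually keeps the plotted tree small enough to read.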

Naresh Kumar · Answer 2 · 2021-06-17T04:17:58+0000

GO_STP_379:

import pandas as pd

import numpy as np

from sklearn.metrics import accuracy_score,confusion_matrix

import matplotlib.pyplot as plt

# read dataset

data=pd.read_csv('data loan.csv')

# print(data)

# treating the null values: mode for categorical columns, median for numeric columns

for column in ['Gender','Married','Dependents','Self_Employed']:
    data[column]=data[column].fillna(data[column].mode()[0])

for column in ['LoanAmount','Loan_Amount_Term','Credit_History']:
    data[column]=data[column].fillna(data[column].median())

# print(data.info())
                      
# label-encode the categorical columns

from sklearn.preprocessing import LabelEncoder

label_encoder=LabelEncoder()

# Loan_ID is only a row identifier, so it is not encoded or used as a feature

for column in ['Gender','Married','Dependents','Education','Self_Employed','Property_Area','Loan_Status']:
    data[column]=label_encoder.fit_transform(data[column])

# print(data['Loan_Status'].value_counts())

from sklearn.tree import DecisionTreeClassifier

from sklearn.model_selection import train_test_split

X=data[['Gender','Married','Dependents','Education','Self_Employed','ApplicantIncome','CoapplicantIncome','LoanAmount','Loan_Amount_Term','Credit_History','Property_Area']]

# True where the encoded Loan_Status equals 1

Y=data['Loan_Status']==1

x_train,x_test,y_train,ytest=train_test_split(X,Y,test_size=0.3,random_state=1)
                      
clf_model=DecisionTreeClassifier()

clf_model.fit(x_train,y_train)

print("classifier decision tree score: ",clf_model.score(x_test,ytest))

# prediction on the held-out test rows

Y_pred=clf_model.predict(x_test)

print("predict value : ",Y_pred)

print("accuracy",accuracy_score(ytest,Y_pred))

cm=confusion_matrix(ytest,Y_pred)

print("confusion matrix: ",cm)

# plot decision tree

from sklearn import tree

plt.figure(figsize=(15,12))

tree.plot_tree(clf_model,fontsize=5)

# text form of the same tree, labelled with the feature column names

text_rep=tree.export_text(clf_model,feature_names=list(X.columns))

print("decision tree",text_rep)
                      
data.hist(figsize=(15,12))

# plotting the categorical columns, one figure per column

import seaborn as sns

for column in ['Education','Married','Gender','Self_Employed','Property_Area','Dependents']:
    plt.figure()
    sns.countplot(x=column,hue='Loan_Status',data=data)

plt.show()
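
The accuracy and confusion matrix printed above are related by simple arithmetic, and the matrix also yields precision and recall. The sketch below shows the calculation on ten hypothetical labels (`y_true` and `y_hat` are made up for illustration and are not results from the loan data); it assumes binary labels 0/1, for which scikit-learn lays the matrix out as [[TN, FP], [FN, TP]].

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for ten applicants (1 = approved, 0 = rejected).
y_true = [1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
y_hat = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

# Row = actual class, column = predicted class, classes sorted as 0 then 1.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are right
precision = tp / (tp + fp)                   # share of predicted approvals that are right
recall = tp / (tp + fn)                      # share of actual approvals that are found

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("accuracy:", round(accuracy, 3))
print("precision:", round(precision, 3))
print("recall:", round(recall, 3))
```

In this toy case 8 of the 10 labels match, so the accuracy is 0.8. For a lender, false positives (approving an applicant who should be rejected) and false negatives may carry different costs, which is one reason to look past accuracy alone.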
