Please don’t accept if you don’t have RStudio or are not familiar with it.

This problem examines logistic regression. You will want to review the material on linear regression, including the use of logistic regression in RStudio. The data needed is attached below.

Here is a video on how to handle parts a.i to a.iii

Deliverable:

The .Rmd file and the knitted output

Run the code below and answer parts d and e.

Code

#NOTE: Prepared with R version 3.6.0

#set the working directory to the appropriate folder on your machine so that you can access the data files.

#load the required librarie(s)/package(s) for this chapter

#Install the package(s) below once on your machine. To do so, uncomment the

#install.packages line(s) below.

#NOTE: DO NOT INCLUDE INSTALL.PACKAGES IN THE .RMD FILE; YOU WILL GET AN ERROR WHEN YOU KNIT

#install.packages("caret")

#install.packages("MASS")

library(caret)

library(MASS)

## Financial Condition of Banks.

##The file Banks.csv includes data on a sample of 20 banks. The “Financial

##Condition” column records the judgment of an expert on the financial

##condition of each bank. This outcome variable takes one of two possible

##values-weak or strong-according to the financial condition of the bank. The

##predictors are two ratios used in the financial analysis of banks:

##TotLns&Lses/Assets is the ratio of total loans and leases to total assets

##and TotExp/Assets is the ratio of total expenses to total assets. The target

##is to use the two ratios for classifying the financial condition of a new bank.

##Run a logistic regression model (on the entire dataset) that models the

##status of a bank as a function of the two financial measures provided.

##Specify the success class as weak (this is similar to creating a dummy that

##is 1 for financially weak banks and 0 otherwise), and use the default cutoff

##value of 0.5.

#load the data

bank.df <- read.csv("banks.csv")
head(bank.df)
#fit logistic regression model and obtain the summary
reg<-glm(Financial.Condition ~ TotExp.Assets + TotLns.Lses.Assets,
data = bank.df, family = "binomial")
summary(reg)
reg$coefficients
#Coefficients:
#                    Estimate Std. Error z value Pr(>|z|)
#(Intercept)          -14.721      6.675  -2.205   0.0274 *
#TotExp.Assets         89.834     47.781   1.880   0.0601 .
#TotLns.Lses.Assets     8.371      5.779   1.449   0.1474

##a Write the estimated equation that associates the financial condition

##of a bank with its two predictors in three formats:

#NOTE: You may need to convert the formulas given here into a form that works in R

##a.i derive the logit values – The logit as a function of the predictors

#the function would be in this form

#logit = -14.721 + (89.834 * TotExp.Assets) + (8.371 * TotLns.Lses.Assets)

#PUT the above code in the correct form so that it will run

##a.ii The odds as a function of the predictors

#the function would be in this form

# Odds = e^(logit) = e^(-14.7207 + (89.8321 * TotExp/Assets) +

# (8.3712 * TotLns&Lses/Assets))

#NOTE: You may need to convert the formulas given here into a form that works in R

#for instance, replace “e^” above with the R exponential function, exp()

#PUT the above code in the correct form so that it will run

##a.iii The probability as a function of the predictors

#the function would be in this form

# p = 1/(1 + exp(-(-14.7207 + (89.8321 * TotExp/Assets) +

# (8.3712 * TotLns&Lses/Assets))))

#Convert the above code into a form that will run; you will need to change the names to match the

#column names in bank.df
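
One possible way to write the three formats from a.i to a.iii as runnable R is sketched below. The coefficient values are copied from the formulas above, and the function names (bank_logit, bank_odds, bank_prob) are made up for illustration; treat this as a sketch rather than the required answer format.

```r
# Coefficients copied from the formulas above (assumed to match summary(reg))
b0    <- -14.7207  # intercept
b_exp <- 89.8321   # coefficient on TotExp.Assets
b_lns <- 8.3712    # coefficient on TotLns.Lses.Assets

# a.i: the logit as a function of the two predictors
bank_logit <- function(TotExp.Assets, TotLns.Lses.Assets) {
  b0 + b_exp * TotExp.Assets + b_lns * TotLns.Lses.Assets
}

# a.ii: the odds are e raised to the logit
bank_odds <- function(TotExp.Assets, TotLns.Lses.Assets) {
  exp(bank_logit(TotExp.Assets, TotLns.Lses.Assets))
}

# a.iii: the probability of being financially weak
bank_prob <- function(TotExp.Assets, TotLns.Lses.Assets) {
  1 / (1 + exp(-bank_logit(TotExp.Assets, TotLns.Lses.Assets)))
}
```

With bank.df loaded, a call such as with(bank.df, bank_prob(TotExp.Assets, TotLns.Lses.Assets)) should return one probability per bank.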

#b Consider a new bank whose total loans and leases/assets ratio = 0.6

##and total expenses/assets ratio = 0.11.

#From your logistic regression model,

##estimate the following four quantities for this bank (use R to do all the

##intermediate calculations; show your final answers to four decimal places):

##the logit, the odds, the probability of being financially weak, and the

##classification of the bank (use cutoff = 0.5).

# new record logit value

# you can use matrix multiplication to determine the logit

#this is the same calculation as the logit formula above

logit <- c(1, 0.11, 0.6) %*% reg$coefficients
#or you can use the logit formula above, replacing the variables with 0.11 and 0.6
#in this form: logit = -14.721 + (89.834 * TotExp.Assets) + (8.371 * TotLns.Lses.Assets)
#odds = e^(logit); note that exp(-logit) would give 1/odds instead
odds <- exp(logit)
prob <- odds/(1+odds)
prob
#show your results
#> prob

# [,1]

#[1,] 0.5457504

#the probability that the new bank is financially weak is 0.5457, which exceeds the 0.5 cutoff, so the predicted class

#for this new bank is 1, or “financially weak”.
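
The four quantities asked for in part b can also be computed without the fitted model object, as in the sketch below. It assumes the four-decimal coefficients quoted in the formulas above; the variable names (coefs, new_bank, logit_new and so on) are illustrative and not part of the assignment.

```r
# Coefficients in the order: intercept, TotExp.Assets, TotLns.Lses.Assets
# (copied from the formulas above; assumed to match the fitted model)
coefs <- c(-14.7207, 89.8321, 8.3712)

# New bank: 1 for the intercept, TotExp/Assets = 0.11, TotLns&Lses/Assets = 0.6
new_bank <- c(1, 0.11, 0.6)

logit_new <- sum(new_bank * coefs)       # linear predictor
odds_new  <- exp(logit_new)              # odds of being weak
prob_new  <- odds_new / (1 + odds_new)   # probability of being weak
class_new <- ifelse(prob_new > 0.5, "weak", "strong")  # cutoff = 0.5

# Report the numeric results to four decimal places
print(round(c(logit = logit_new, odds = odds_new, prob = prob_new), 4))
print(class_new)
```

If the model object reg is in memory, predict(reg, newdata = data.frame(TotExp.Assets = 0.11, TotLns.Lses.Assets = 0.6), type = "response") should give the same probability, up to rounding of the coefficients.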

##c The cutoff value of 0.5 can be used in conjunction with the probability of

##being financially weak. Compute the threshold that should be used if we want

##to make a classification based on the odds of being financially weak, and

##the threshold for the corresponding logit.

###Convert and RUN the following code using the cutoff value of p = 0.5.

#first determine the odds threshold that corresponds to the probability cutoff of 0.5

#Odds = (p) / (1-p) = (0.5) / (1-0.5) = 1

(0.5)/(1-0.5)

#If odds > 1 then classify financial status as “weak” (otherwise classify as

#”strong”).

#now determine the Logit value which is the log of the odds

#Logit = ln (odds) = ln (1) = 0

#If Logit > 0 then classify financial status as “weak” (otherwise, classify it

#as “strong”)

#YOU SHOULD GET THIS CONCLUSION

#Therefore, a cutoff of 0.5 on the probability of being weak is equivalent to a

#threshold of 1 on the odds of being weak, and to a threshold of 0 on the logit.
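
The conversion from a probability cutoff to the equivalent odds and logit thresholds can be checked in a few lines of R. This is a minimal sketch; the variable names are illustrative only.

```r
cutoff_p     <- 0.5                        # probability cutoff
cutoff_odds  <- cutoff_p / (1 - cutoff_p)  # odds = p / (1 - p)
cutoff_logit <- log(cutoff_odds)           # logit = ln(odds)

print(c(probability = cutoff_p, odds = cutoff_odds, logit = cutoff_logit))

# qlogis() from base R maps a probability straight to its logit
print(qlogis(cutoff_p))
```

Changing cutoff_p shows how the other two thresholds move with it; any cutoff below 0.5 gives odds below 1 and a negative logit.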

##d Interpret the estimated coefficient for the total loans & leases to

##total assets ratio (TotLns&Lses/Assets) in terms of the odds of being

##financially weak.

#look at how we determine the odds. How does the 8.3712 impact the odds?

#in other words assume we only have the (TotLns&Lses/Assets) ratio

#zero out the -14.7207 and the 89.8321

#this will tell the impact of (TotLns&Lses/Assets)

#so if (TotLns&Lses/Assets)= 1 then what is the effect on the odds?

# Odds = e^(logit) = e^(-14.7207 + (89.8321 * TotExp/Assets) +

# (8.3712 * TotLns&Lses/Assets))

#USING THE EQUATION, DETERMINE THE EFFECT ON THE ODDS

PUT your answer here
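
As a starting point for part d, the sketch below computes the factor by which the odds are multiplied when TotLns&Lses/Assets increases and TotExp/Assets is held constant. It assumes the coefficient 8.3712 quoted above. The 0.01 step is an added illustration, chosen because the ratios in this problem are fractions such as 0.6, so a full one-unit change is rarely realistic.

```r
b_lns <- 8.3712  # coefficient on TotLns.Lses.Assets, copied from the formula above

# Multiplicative change in the odds for a one-unit increase in the ratio
odds_factor_unit <- exp(b_lns)

# Multiplicative change in the odds for a 0.01 increase in the ratio
odds_factor_001 <- exp(b_lns * 0.01)

print(c(per_unit = odds_factor_unit, per_0.01 = odds_factor_001))
```

A factor greater than 1 means the odds of being financially weak rise as the ratio rises.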

##e When a bank that is in poor financial condition is misclassified as

##financially strong, the misclassification cost is much higher than when a

##financially strong bank is misclassified as weak. To minimize the expected

##cost of misclassification, should the cutoff value for classification

##(which is currently at 0.5) be increased or decreased?

PUT your answer here.
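
To see how the cutoff affects the classifications in part e, the sketch below applies two cutoffs to a small vector of predicted probabilities. The probabilities are hypothetical values invented for illustration, not output from Banks.csv, and the helper classify() is not part of the assignment code.

```r
# Hypothetical predicted probabilities of being weak (illustrative only)
p_weak <- c(0.10, 0.35, 0.45, 0.55, 0.80)

# Label a bank "weak" when its probability exceeds the cutoff
classify <- function(p, cutoff) ifelse(p > cutoff, "weak", "strong")

n_weak_050 <- sum(classify(p_weak, 0.5) == "weak")  # banks flagged at cutoff 0.5
n_weak_030 <- sum(classify(p_weak, 0.3) == "weak")  # banks flagged at cutoff 0.3

print(c(cutoff_0.5 = n_weak_050, cutoff_0.3 = n_weak_030))
```

Comparing the two counts shows in which direction the cutoff has to move to flag more banks as weak, which is the trade-off part e asks about.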