CS 3120 Logistic Regression for binary classification solved

Reference code: 2_Logostic_ExSKLearn_Demo.py on Blackboard

1. Select a dataset with binary target values from https://machinelearningmastery.com/standard-machine-learning-datasets/
e.g. the banknote or diabetes dataset

2. Use pandas to read the CSV file as a DataFrame. (1pt)
e.g. The following code imports the Pima diabetes dataset:
import pandas as pd
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
# load dataset (the file has no header row, so the column names are supplied explicitly)
pima = pd.read_csv("pima-indians-diabetes-database.csv", header=None, names=col_names)

3. Select 5 features (or 4 if 5 is not possible) from the chosen dataset. (1pt)
List all the features you selected in your report.
For example, the following code selects two features:
feature_cols = ['pregnant', 'age']
X = pima[feature_cols]
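
The same pattern extends to five features, and the target column has to be pulled out separately for fitting. The sketch below is only an illustration: the three rows are made-up placeholder values (not real Pima records) so that it runs without the CSV file, and the particular five features chosen are an arbitrary example, not a recommendation.

```python
import pandas as pd

col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin',
             'bmi', 'pedigree', 'age', 'label']

# Placeholder rows, made up purely for illustration.
# In the assignment, pima comes from pd.read_csv(...) instead.
pima = pd.DataFrame(
    [[1, 90, 70, 20, 80, 25.0, 0.3, 25, 0],
     [5, 150, 80, 30, 120, 33.5, 0.6, 45, 1],
     [2, 110, 65, 25, 90, 28.1, 0.4, 31, 0]],
    columns=col_names,
)

# Five example features (an arbitrary choice for this sketch).
feature_cols = ['pregnant', 'glucose', 'bp', 'bmi', 'age']
X = pima[feature_cols]  # feature matrix: one column per selected feature
y = pima['label']       # binary target: 0 or 1

print(X.shape, y.shape)
```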

4. Use "train_test_split" from "sklearn.model_selection" (it lived in "sklearn.cross_validation" in old scikit-learn releases, which has since been removed) to split the data into 40% testing + 60% training. (1pt)
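
A minimal sketch of the 60/40 split, assuming a current scikit-learn release where train_test_split is imported from sklearn.model_selection. The data here is synthetic (generated with make_classification) as a stand-in for the chosen dataset, and random_state=0 is an arbitrary seed used only to make the split repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 samples, 5 features, binary labels.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# test_size=0.4 keeps 40% of the rows for testing and 60% for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

print(X_train.shape, X_test.shape)  # (120, 5) (80, 5)
```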

5. Fit your model with training data and test your model after fitting.
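
Fitting and testing could look roughly like the following. This is a sketch on synthetic data rather than the reference demo's code; max_iter=1000 is an assumption made here to give the solver room to converge on unscaled features, not a setting the assignment requires.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; replace with the features and labels you selected.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Fit on the training portion only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Test on the held-out portion.
y_pred = model.predict(X_test)              # hard 0/1 predictions
y_prob = model.predict_proba(X_test)[:, 1]  # estimated probability of class 1

print("test accuracy:", model.score(X_test, y_test))
```

The hard predictions feed the confusion matrix and the precision, recall, and F scores, while the class-1 probabilities are what the ROC curve is built from.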

6. Calculate and plot out
the confusion matrix (1pt)
precision score, recall score, F score (3pts)
Copy your console output (these scores) to your report.
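
The metrics in this step can be computed with sklearn.metrics, sketched below. The two label lists are tiny hand-written examples so the numbers can be checked by hand; in the assignment they would be the test labels and the model's predictions. "F score" is taken here to mean the F1 score, which is an assumption, and the output file name confusion_matrix.png is hypothetical.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import (ConfusionMatrixDisplay, confusion_matrix,
                             f1_score, precision_score, recall_score)

# Hand-written example labels (stand-ins for y_test and model.predict(X_test)).
y_test = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_test, y_pred)
print(cm)  # [[4 1]
           #  [1 4]]

precision = precision_score(y_test, y_pred)  # TP / (TP + FP) = 4 / 5
recall = recall_score(y_test, y_pred)        # TP / (TP + FN) = 4 / 5
f1 = f1_score(y_test, y_pred)                # harmonic mean of the two
print("precision:", precision, "recall:", recall, "F1:", f1)

# Plot the matrix and save it so it can be pasted into the report.
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1])
disp.plot()
disp.figure_.savefig("confusion_matrix.png")  # or plt.show() interactively
```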

7. Plot the ROC curve and print the ROC AUC score (sklearn.metrics.roc_curve() and sklearn.metrics.roc_auc_score() can be used). (3pts)
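
One way to produce the ROC curve and AUC score is sketched below, again on synthetic stand-in data. Both functions are given the predicted probability of the positive class rather than the hard 0/1 predictions, since the curve is traced by sweeping a threshold over those scores. The file name roc_curve.png and the plot styling are choices made for this sketch.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data and a fitted model, as in the earlier steps.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Scores for the positive class (column 1 of predict_proba).
y_prob = model.predict_proba(X_test)[:, 1]

# One (fpr, tpr) point per decision threshold.
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
auc = roc_auc_score(y_test, y_prob)
print("ROC AUC:", auc)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"logistic regression (AUC = {auc:.3f})")
ax.plot([0, 1], [0, 1], linestyle="--", label="chance")  # diagonal baseline
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.set_title("ROC curve")
ax.legend(loc="lower right")
fig.savefig("roc_curve.png")  # or plt.show() in an interactive session
```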

——————————————————————————————————————–
Submit your report and your code in two different files.
Please include the required figure/plot in your report.
e.g.
File1: Assignment2_FirstnameLastname.doc/.pdf (this is the report)
+
File2: Assignment2_FirstnameLastname.py (this is the code; only ".py" files are accepted)
OR
Assignment2_FirstnameLastname.zip if you have multiple ".py" files.