Breast Cancer Classification Program
I took another classification example: the Breast Cancer Diagnostic Data Set. Since I have already worked through three classification programs, my objective this time was to go deeper into each line of code. Along the way I learned how to slice an array with NumPy and how to read a correlation matrix. I won't explain every step again, because the program follows the same steps as the previous classification programs.
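As a quick illustration of the NumPy slicing and correlation matrix mentioned above, here is a minimal sketch. The array is a made-up toy example (four rows, a label column followed by three feature columns), not the real data set, and the column layout is my assumption purely for illustration.

```python
import numpy as np

# Hypothetical toy data: columns are [label, feature_a, feature_b, feature_c].
# These values are invented for illustration and are NOT from the real data set.
data = np.array([
    [1.0, 17.9, 10.4, 122.8],
    [1.0, 20.6, 17.8, 132.9],
    [0.0, 13.5, 14.4, 87.5],
    [0.0, 13.1, 15.7, 85.6],
])

# Slicing syntax is array[rows, columns]; a bare ":" means "take everything".
y = data[:, 0]    # all rows, first column only -> labels
X = data[:, 1:]   # all rows, columns from index 1 onward -> features

# Correlation matrix of the features.
# rowvar=False tells NumPy that each column (not each row) is a variable.
corr = np.corrcoef(X, rowvar=False)

print(X.shape, y.shape)
print(np.round(corr, 2))
```

The correlation matrix is square (one row and one column per feature), symmetric, and has 1.0 on its diagonal, because every feature correlates perfectly with itself.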
I did look at the algorithms from a different perspective, though, and I'm sharing those observations in this post. In previous posts, I compared the accuracy scores of the different algorithms and then chose the "best" one (the one with the highest accuracy score) to run the validation and check the final statistics.
In this example, I ran the validation for every algorithm and compared the results. The details below are largely self-explanatory. However, two things I would like to highlight are the Confusion Matrix and the Classification Report: from these, one can work out why the "Accuracy Score" is low or high.
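To show how the confusion matrix explains the accuracy score, here is a small sketch that recomputes the metrics by hand from the KNN confusion matrix reported in this post. It assumes the usual scikit-learn layout (rows are true labels, columns are predicted labels, in the order B, M); the variable names are mine.

```python
import numpy as np

# KNN confusion matrix from this post.
# Assumed layout: rows = true class, columns = predicted class, order [B, M].
cm = np.array([[70, 2],
               [5, 37]])
labels = ["B", "M"]

# Accuracy: correct predictions (the diagonal) over all predictions.
accuracy = np.trace(cm) / cm.sum()

# Precision: of everything predicted as a class, how much was right (column-wise).
precision = np.diag(cm) / cm.sum(axis=0)

# Recall: of everything truly in a class, how much was found (row-wise).
recall = np.diag(cm) / cm.sum(axis=1)

# F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print("accuracy", accuracy)
for name, p, r, f in zip(labels, precision, recall, f1):
    print(name, round(p, 2), round(r, 2), round(f, 2))
```

The seven off-diagonal entries (2 benign cases predicted as malignant, 5 malignant cases predicted as benign) are exactly the mistakes that pull the accuracy down from 1.0 to 107/114.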
The complete source code is available on GitHub.
KNeighborsClassifier
------KNN------
Accuracy Score 0.9385964912280702
Confusion Matrix
[[70  2]
 [ 5 37]]
Classification Report
              precision    recall  f1-score   support
           B       0.93      0.97      0.95        72
           M       0.95      0.88      0.91        42
    accuracy                           0.94       114
   macro avg       0.94      0.93      0.93       114
weighted avg       0.94      0.94      0.94       114
LogisticRegression
------LR------
Accuracy Score 0.9473684210526315
Confusion Matrix
[[70  2]
 [ 4 38]]
Classification Report
              precision    recall  f1-score   support
           B       0.95      0.97      0.96        72
           M       0.95      0.90      0.93        42
    accuracy                           0.95       114
   macro avg       0.95      0.94      0.94       114
weighted avg       0.95      0.95      0.95       114
LinearDiscriminantAnalysis
------LDA------
Accuracy Score 0.9473684210526315
Confusion Matrix
[[72  0]
 [ 6 36]]
Classification Report
              precision    recall  f1-score   support
           B       0.92      1.00      0.96        72
           M       1.00      0.86      0.92        42
    accuracy                           0.95       114
   macro avg       0.96      0.93      0.94       114
weighted avg       0.95      0.95      0.95       114
DecisionTreeClassifier
------DTC------
Accuracy Score 0.956140350877193
Confusion Matrix
[[70  2]
 [ 3 39]]
Classification Report
              precision    recall  f1-score   support
           B       0.96      0.97      0.97        72
           M       0.95      0.93      0.94        42
    accuracy                           0.96       114
   macro avg       0.96      0.95      0.95       114
weighted avg       0.96      0.96      0.96       114
GaussianNB
------NB------
Accuracy Score 0.9473684210526315
Confusion Matrix
[[70  2]
 [ 4 38]]
Classification Report
              precision    recall  f1-score   support
           B       0.95      0.97      0.96        72
           M       0.95      0.90      0.93        42
    accuracy                           0.95       114
   macro avg       0.95      0.94      0.94       114
weighted avg       0.95      0.95      0.95       114
SVC
------SVC------
Accuracy Score 0.9035087719298246
Confusion Matrix
[[72  0]
 [11 31]]
Classification Report
              precision    recall  f1-score   support
           B       0.87      1.00      0.93        72
           M       1.00      0.74      0.85        42
    accuracy                           0.90       114
   macro avg       0.93      0.87      0.89       114
weighted avg       0.92      0.90      0.90       114
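To compare the six results side by side, the sketch below collects the confusion matrices reported above and counts, for each model, how many malignant cases were predicted as benign. It again assumes rows are true labels and columns are predicted labels in the order B, M; the dictionary and variable names are my own.

```python
import numpy as np

# Confusion matrices copied from the outputs above.
# Assumed layout: rows = true class, columns = predicted class, order [B, M].
confusion = {
    "KNN": [[70, 2], [5, 37]],
    "LR":  [[70, 2], [4, 38]],
    "LDA": [[72, 0], [6, 36]],
    "DTC": [[70, 2], [3, 39]],
    "NB":  [[70, 2], [4, 38]],
    "SVC": [[72, 0], [11, 31]],
}

rows = []
for name, matrix in confusion.items():
    cm = np.array(matrix)
    accuracy = np.trace(cm) / cm.sum()   # correct predictions / all predictions
    missed_malignant = int(cm[1, 0])     # true M predicted as B
    recall_m = cm[1, 1] / cm[1].sum()    # share of true M that was found
    rows.append((name, accuracy, missed_malignant, recall_m))

# Fewest missed malignant cases first.
rows.sort(key=lambda row: row[2])

for name, accuracy, missed, recall_m in rows:
    print(f"{name:4s} accuracy={accuracy:.3f} missed_M={missed:2d} recall_M={recall_m:.2f}")
```

This makes one point from the reports easier to see: LR and LDA have the same accuracy, yet LDA misses 6 malignant cases with no false alarms, while LR misses 4 and raises 2 false alarms. The accuracy score alone hides that difference.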