"""
Bayes' Theorem Explained
Bayes' theorem is crucial for interpreting the results from binary classification algorithms, and a must-know for aspiring data scientists. We show how Bayes' theorem can be derived from the results of a binary classification machine learning algorithm.
Author: Benjamin O. Tayo Date: 5/7/2020 - regularize matrix: 5/11/2020 Max Kleiner
https://github.com/bot13956/Bayes_theorem/blob/master/Bayes_Theorem.ipynb
https://towardsdatascience.com/6-amateur-mistakes-ive-made-working-with-train-test-splits-916fabb421bb
"""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
plt.style.use("ggplot")
# 2. Exploratory Data Analysis
df=pd.read_csv(r"C:\maXbox\mX47464\maxbox4\examples\fm_heights.csv")
print(df.head())
plt.figure()
sns.countplot(x="sex", data=df)
plt.show()
df2 = df.copy()  # copy, so the helper column below is not added to df itself
df2['count'] = range(len(df2))
print(df2.head(n=10))
sns.lmplot(x="count", y="height", data=df2, hue='sex',
           legend=False, fit_reg=False, aspect=1.6)
plt.legend(loc='upper left')
plt.title('Scatter plot of heights')
plt.ylabel('height (inch)')
plt.show()
plt.figure(figsize=(10,6))
# sns.distplot is deprecated since seaborn 0.11; histplot with a KDE overlay replaces it
sns.histplot(df['height'], bins=20, kde=True, stat='density')
plt.title('Probability distribution of all heights')
plt.xlabel('height (inch)')
plt.show()
plt.figure(figsize=(10,6))
# kdeplot draws the density curve only, like distplot(..., hist=False)
sns.kdeplot(df[df.sex=='Male']['height'], label='Male')
sns.kdeplot(df[df.sex=='Female']['height'], label='Female')
plt.title('Probability distribution of Male and Female heights')
plt.legend()
plt.xlabel('height (inch)')
plt.show()
# 3. Model Building and Evaluation
from sklearn.preprocessing import LabelEncoder
class_le = LabelEncoder()
y = class_le.fit_transform(df['sex'].values)
print(pd.Series(y).value_counts())  # pd.value_counts() is deprecated in recent pandas
X = df['height']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
# scikit-learn expects 2-D feature arrays: one row per sample, one column per feature
X_train = X_train.values.reshape(-1, 1)
X_test = X_test.values.reshape(-1, 1)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print('score: ',knn.score(X_test, y_test))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print('bayes classifier detector compute ends...')
"""
dtype: int64
[[ 39  32]
 [ 22 222]]
              precision    recall  f1-score   support

           0       0.64      0.55      0.59        71
           1       0.87      0.91      0.89       244

    accuracy                           0.83       315
   macro avg       0.76      0.73      0.74       315
weighted avg       0.82      0.83      0.82       315
bayes detector compute ends...
Bayes' theorem is crucial for interpreting the results from binary classification algorithms.
We will show that Bayes' theorem is simply the relationship between precision and recall:
Confusion matrix with row and column totals (rows: actual class, columns: predicted class):

             Predicted 0   Predicted 1   Total
Actual 0          39            32          71
Actual 1          22           222         244
Total             61           254         315

Let A = "sample is actually class 1" and B = "sample is predicted as class 1":

P(A)   = 244/315 = 0.775   (actual share of class 1)
P(B)   = 254/315 = 0.806   (predicted share of class 1)
P(A|B) = 222/254 = 0.874   (precision)
P(B|A) = 222/244 = 0.910   (recall)

The probability P(B|A) = 222/244 = 0.91 is called the recall. It gives the fraction of the 244 samples that are actually class 1 that our classification algorithm predicted correctly. Bayes' theorem is then simply a relationship between recall and precision:

P(B|A) = P(B) * P(A|B) / P(A)
       = (254/315 * 222/254) / (244/315) = 0.9098360655737705

In words:
predicted * precision / actual = recall
precision / recall = actual / predicted
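
The relationship above can be checked numerically. The following is a minimal sketch, assuming the confusion matrix printed above (rows: actual, columns: predicted); bayes_check is a hypothetical helper name used only for illustration and is not part of scikit-learn.

```python
# Confusion matrix from the run above: rows are actual classes, columns are predicted classes.
cm = [[39, 32],
      [22, 222]]

def bayes_check(cm, k=1):
    """Return (precision, recall, recall computed via Bayes' theorem) for class k."""
    n = sum(sum(row) for row in cm)            # total number of test samples
    actual_k = sum(cm[k])                      # row total: samples that are actually class k
    predicted_k = sum(row[k] for row in cm)    # column total: samples predicted as class k
    hits = cm[k][k]                            # correctly predicted class-k samples
    p_a = actual_k / n                         # P(A): actual share of class k
    p_b = predicted_k / n                      # P(B): predicted share of class k
    precision = hits / predicted_k             # P(A|B)
    recall = hits / actual_k                   # P(B|A)
    recall_from_bayes = p_b * precision / p_a  # Bayes: P(B|A) = P(B) * P(A|B) / P(A)
    return precision, recall, recall_from_bayes

precision, recall, recall_from_bayes = bayes_check(cm, k=1)
print('precision:', round(precision, 4))
print('recall:', round(recall, 4))
print('recall via Bayes:', round(recall_from_bayes, 4))
```

Calling bayes_check(cm, k=0) applies the same identity to class 0, where precision is 39/61 and recall is 39/71.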
Note: confusion_matrix needs both the true labels and the predictions as single-digit class indices, not as one-hot encoded vectors. Predictions obtained with model.predict_classes() are already in that form.
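
To illustrate the note above: a minimal sketch of turning one-hot (or per-class probability) rows into class indices before counting a confusion matrix. The helpers one_hot_to_labels and confusion_counts are hypothetical names, and the arrays are made-up illustration data, not results from this script.

```python
def one_hot_to_labels(rows):
    """Map each one-hot or probability row to the index of its largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

def confusion_counts(y_true, y_pred, n_classes):
    """Count (actual, predicted) pairs; rows are actual classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Made-up illustration data: one-hot true labels and per-class probabilities.
y_true_one_hot = [[1, 0], [0, 1], [0, 1], [1, 0]]
y_prob = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.9, 0.1]]

y_true = one_hot_to_labels(y_true_one_hot)
y_pred = one_hot_to_labels(y_prob)
cm_demo = confusion_counts(y_true, y_pred, n_classes=2)
print(y_true, y_pred)
print(cm_demo)
```

With NumPy arrays, np.argmax(arr, axis=1) performs the same conversion, and the resulting label vectors can be passed directly to sklearn.metrics.confusion_matrix.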
"""