In this article, you will learn to train a Keras deep learning model to predict breast cancer from breast histology images.
We will also create a Python script that splits the input dataset into three sets: a training set, a validation set, and a testing set.
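
The three-way split mentioned above can be sketched with two calls to scikit-learn's `train_test_split`. This is only a minimal illustration on stand-in arrays, not the article's own script: the array `X`, the labels `y`, and the 60/20/20 proportions are assumptions chosen for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 samples, 4 features, balanced binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.array([0, 1] * 50)

# Step 1: hold out the test set (20% of all samples).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 2: carve the validation set out of the remainder.
# 0.25 of the remaining 80% equals 20% of the original data.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=0
)

print(X_train.shape, X_val.shape, X_test.shape)
```

Passing `stratify` keeps the class proportions similar in every subset, which tends to matter when one class is much rarer than the other.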
In [1]:
In [2]:
Out[2]:
DecisionTreeClassifier(random_state=0)
In [3]:
Out[3]:
array([ 4, 4, 4, ..., 4984, 4985, 4986])
In [4]:
Out[4]:
<3253x4987 sparse matrix of type '<class 'numpy.int64'>' with 69595 stored elements in Compressed Sparse Row format>
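
The code cells for these outputs are not preserved, so the following is an inference: an integer array with one entry per sample and a sparse matrix with 3253 rows look like the results of a fitted tree's `apply` and `decision_path` methods. A minimal sketch on synthetic data (the arrays `X` and `y` are made up for illustration) shows how the two relate.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 50 samples, 3 features, label depends on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X[:, 0] > 0).astype(int)

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# apply() returns, for each sample, the index of the leaf it ends up in.
leaf_ids = clf.apply(X)

# decision_path() returns a sparse (n_samples, n_nodes) indicator matrix:
# entry (i, j) is nonzero when sample i passes through node j.
paths = clf.decision_path(X)

print(leaf_ids.shape)
print(paths.shape, clf.tree_.node_count)
```

The number of columns in the path matrix equals the number of nodes in the tree, which is consistent with the outputs above, where the largest leaf index (4986) is one less than the matrix width (4987).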
In [5]:
The scikit-learn version is 0.24.2.
In [6]:
Out[6]:
{'ccp_alpha': 0.0, 'class_weight': None, 'criterion': 'gini', 'max_depth': None, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'random_state': 0, 'splitter': 'best'}
In [7]:
Out[7]:
1.0
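
If the score of 1.0 above was measured on the same data the tree was fitted on (an assumption, since the code cell is missing), it says little about generalization: a tree with no depth limit can memorize any training set whose rows are distinct. The sketch below illustrates this on synthetic data from `make_classification`; the dataset and the split ratio are assumptions for the example only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% of the labels flipped at random (label noise).
X, y = make_classification(
    n_samples=500, n_features=10, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# Default parameters: the tree grows until every leaf is pure.
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)  # accuracy on the data it was fitted on
test_acc = clf.score(X_test, y_test)     # accuracy on held-out data

print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

Comparing the two numbers gives a quick check for overfitting; parameters such as `max_depth` or `ccp_alpha` can be used to limit tree growth.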
In [8]:
Out[8]:
array([ 348, 348, 348, ..., 2052, 2050, 2049])
In [9]:
Out[9]:
DecisionTreeRegressor(random_state=0)
In [10]:
Out[10]:
array([ 334.4 , 316.2 , 330.8 , ..., 2033.53333333, 2037.13333333, 2041.46666667])
In [11]:
Out[11]:
array([ 11, 13, 14, ..., 429, 431, 432])
In [12]:
Out[12]:
<3253x433 sparse matrix of type '<class 'numpy.int64'>' with 31915 stored elements in Compressed Sparse Row format>
In [13]:
Out[13]:
{'ccp_alpha': 0.0, 'criterion': 'mse', 'max_depth': None, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'random_state': 0, 'splitter': 'best'}
In [14]:
Out[14]:
0.7880874805337345
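
For a regressor, scikit-learn's `score` method returns the coefficient of determination R², not an accuracy. Assuming the value above came from `score` (the cell itself is not shown), the sketch below, on synthetic data, shows that it matches `r2_score` computed from the predictions.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: a noisy linear relationship.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# score() on a regressor is R^2 of the predictions against the true targets.
score = reg.score(X_test, y_test)
manual = r2_score(y_test, reg.predict(X_test))

print(score, manual)
```

An R² of 1.0 means perfect prediction, 0.0 corresponds to always predicting the mean of the targets, and negative values are possible for models that do worse than that.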
In [15]:
Out[15]:
array([2041.46666667])
In [16]:
Out[16]:
array([0, 1, 0])
In [17]:
Out[17]:
['malignant', 'benign']
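
Assuming the integer predictions above index into the list of class names (so 0 means 'malignant' and 1 means 'benign', which is an assumption about the missing code), the labels can be mapped to readable names with NumPy indexing:

```python
import numpy as np

# Class names in label order; index 0 -> 'malignant', index 1 -> 'benign'.
target_names = np.array(["malignant", "benign"])

# Integer predictions as shown in the output above.
predictions = np.array([0, 1, 0])

# Fancy indexing looks up the name for every predicted label at once.
predicted_names = target_names[predictions]

print(predicted_names.tolist())  # ['malignant', 'benign', 'malignant']
```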
In [106]:
[[1000025] [1002945] [1015425] [1016277] [1017023] [1017122] [1018099] [1018561] [1033078] [1033078] [1035283] [1036172] [1041801] [1043999] [1044572] [1047630] [1048672] [1049815] [1050670] [1050718] [1054590] [1054593] [1056784] [1057013] [1059552] [1065726] [1066373] [1066979] [1067444] [1070935]]
In [107]:
[['1' 3 1 1 2] ['10' 3 2 1 2] ['2' 3 1 1 2] ... ['3' 8 10 2 4] ['4' 10 6 1 4] ['5' 10 4 1 4]]
In [127]:
Out[127]:
| | id number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | 2 |
| 1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | 2 |
| 2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | 2 |
| 3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | 2 |
| 4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | 2 |
| 5 | 1017122 | 8 | 10 | 10 | 8 | 7 | 10 | 9 | 7 | 1 | 4 |
| 6 | 1018099 | 1 | 1 | 1 | 1 | 2 | 10 | 3 | 1 | 1 | 2 |
| 7 | 1018561 | 2 | 1 | 2 | 1 | 2 | 1 | 3 | 1 | 1 | 2 |
| 8 | 1033078 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 5 | 2 |
| 9 | 1033078 | 4 | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 2 |

Cancer data set dimensions : (699, 11)
In [110]:
Out[110]:
id number 0
Clump Thickness 0
Uniformity of Cell Size 0
Uniformity of Cell Shape 0
Marginal Adhesion 0
Single Epithelial Cell Size 0
Bare Nuclei 0
Bland Chromatin 0
Normal Nucleoli 0
Mitoses 0
Class 0
dtype: int64
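
A null count of zero does not guarantee that a column is clean. The summary statistics that follow omit the Bare Nuclei column, and the array printed earlier shows quoted values such as '1' and '10', which suggests that column is stored as strings. One plausible cause, stated here as an assumption because the loading code is not shown, is a placeholder such as "?" for missing entries. The sketch below uses a small hand-made frame (not the real data) to show how `pd.to_numeric` with `errors="coerce"` exposes such placeholders as real missing values.

```python
import pandas as pd

# Hand-made illustration; "?" is a hypothetical placeholder for a missing value.
df = pd.DataFrame({
    "Clump Thickness": [5, 5, 3, 6],
    "Bare Nuclei": ["1", "10", "?", "4"],
})

# isnull() sees an ordinary string, so it reports no missing values.
before_missing = int(df["Bare Nuclei"].isnull().sum())

# Convert to numbers; anything that cannot be parsed becomes NaN.
df["Bare Nuclei"] = pd.to_numeric(df["Bare Nuclei"], errors="coerce")
n_missing = int(df["Bare Nuclei"].isnull().sum())

print(before_missing, n_missing)  # 0 1
```

After the conversion the column is numeric, so it would appear in `describe()` output and the missing entries can be dropped or imputed explicitly.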
In [111]:
Out[111]:
| | id number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 6.990000e+02 | 699.000000 | 699.000000 | 699.000000 | 699.000000 | 699.000000 | 699.000000 | 699.000000 | 699.000000 | 699.000000 |
| mean | 1.071704e+06 | 4.417740 | 3.134478 | 3.207439 | 2.806867 | 3.216023 | 3.437768 | 2.866953 | 1.589413 | 2.689557 |
| std | 6.170957e+05 | 2.815741 | 3.051459 | 2.971913 | 2.855379 | 2.214300 | 2.438364 | 3.053634 | 1.715078 | 0.951273 |
| min | 6.163400e+04 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 |
| 25% | 8.706885e+05 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 2.000000 | 1.000000 | 1.000000 | 2.000000 |
| 50% | 1.171710e+06 | 4.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 3.000000 | 1.000000 | 1.000000 | 2.000000 |
| 75% | 1.238298e+06 | 6.000000 | 5.000000 | 5.000000 | 4.000000 | 4.000000 | 5.000000 | 4.000000 | 1.000000 | 4.000000 |
| max | 1.345435e+07 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 4.000000 |

m Data is

| | id number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 |
| 1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 |
| 2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 |
| 3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 |
| 4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 |

m shape is (699, 10)
In [122]:
n Data is
0 2
1 2
2 2
3 2
4 2
Name: Class, dtype: int64
n shape is (699,)
In [123]:
m Data is
[[1000025 5 1 ... 3 1 1]
 [1002945 5 4 ... 3 2 1]
 [1015425 3 1 ... 3 1 1]
 ...
 [1334071 4 1 ... 2 1 1]
 [1343068 8 4 ... 2 5 2]
 [1343374 10 10 ... 10 3 1]]
n Data is
0 2
1 2
2 2
3 2
4 2
..
565 4
566 2
567 2
568 4
569 4
Name: Class, Length: 570, dtype: int64
In [124]:
In [125]:
m_train shape is (468, 10)
m_test shape is (231, 10)
n_train shape is (468,)
n_test shape is (231,)
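
These shapes are what `train_test_split` would produce for 699 rows with `test_size=0.33`. That value is an inference from the numbers, since the original call is not shown, and the zero-filled arrays below are placeholders that only reproduce the shapes. scikit-learn rounds the test size up (0.33 × 699 ≈ 230.67, so 231 rows) and gives the rest to training.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shapes as the features (m) and labels (n).
m = np.zeros((699, 10))
n = np.zeros(699, dtype=int)

# Assumed parameters: test_size=0.33; random_state is fixed for repeatability.
m_train, m_test, n_train, n_test = train_test_split(
    m, n, test_size=0.33, random_state=0
)

print("m_train shape is", m_train.shape)
print("m_test shape is", m_test.shape)
print("n_train shape is", n_train.shape)
print("n_test shape is", n_test.shape)
```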