multilayer_perceptron: ConvergenceWarning: Stochastic Optimizer: Maximum iterations reached and the optimization hasn't converged yet

I have written a basic program to understand what is happening in the MLP classifier.

from sklearn.neural_network import MLPClassifier

Data: a dataset of body metrics (height, weight, and shoe size) labeled male or female:

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39],
     [177, 70, 40], [159, 55, 37], [171, 75, 42], [181, 85, 43]]
y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']

Prepare the model:

clf = MLPClassifier(hidden_layer_sizes=(3,), activation='logistic',
                    solver='adam', alpha=0.0001, learning_rate='constant',
                    learning_rate_init=0.001)

Train:

clf = clf.fit(X, y)

Attributes of the learned classifier:

print('current loss computed with the loss function: ', clf.loss_)
print('coefs: ', clf.coefs_)
print('intercepts: ', clf.intercepts_)
print(' number of iterations the solver: ', clf.n_iter_)
print('num of layers: ', clf.n_layers_)
print('Num of o/p: ', clf.n_outputs_)

Test:

print('prediction: ', clf.predict([[179, 69, 40], [175, 72, 45]]))

Calculate accuracy:

print('accuracy: ', clf.score([[179, 69, 40], [175, 72, 45]], ['female', 'male'], sample_weight=None))

RUN1

current loss computed with the loss function:  0.617580287851
coefs:  [array([[ 0.17222046, -0.02541928,  0.02743722],
       [-0.19425909,  0.14586716,  0.17447281],
       [-0.4063903 ,  0.148889  ,  0.02523247]]), array([[-0.66332919],
       [ 0.04249613],
       [-0.10474769]])]
intercepts:  [array([-0.05611057,  0.32634023,  0.51251098]), array([ 0.17996649])]
 number of iterations the solver:  200
num of layers:  3
Num of o/p:  1
prediction:  ['female' 'male']
accuracy:  1.0
/home/anubhav/anaconda3/envs/mytf/lib/python3.6/site-packages/sklearn/neural_network/multilayer_perceptron.py:563: ConvergenceWarning: Stochastic Optimizer: Maximum iterations reached and the optimization hasn't converged yet.
  % (), ConvergenceWarning)

RUN2

current loss computed with the loss function:  0.639478303643
coefs:  [array([[ 0.02300866,  0.21547873, -0.1272455 ],
       [-0.2859666 ,  0.40159542,  0.55881399],
       [ 0.39902066, -0.02792529, -0.04498812]]), array([[-0.64446013],
       [ 0.60580985],
       [-0.22001532]])]
intercepts:  [array([-0.10482234,  0.0281211 , -0.16791644]), array([-0.19614561])]
 number of iterations the solver:  39
num of layers:  3
Num of o/p:  1
prediction:  ['female' 'female']
accuracy:  0.5

RUN3

current loss computed with the loss function:  0.691966937074
coefs:  [array([[ 0.21882191, -0.48037975, -0.11774392],
       [-0.15890357,  0.06887471, -0.03684797],
       [-0.28321762,  0.48392007,  0.34104955]]), array([[ 0.08672174],
       [ 0.1071615 ],
       [-0.46085333]])]
intercepts:  [array([-0.36606747,  0.21969636,  0.10138625]), array([-0.05670653])]
 number of iterations the solver:  4
num of layers:  3
Num of o/p:  1
prediction:  ['male' 'male']
accuracy:  0.5

RUN4

current loss computed with the loss function:  0.697102567593
coefs:  [array([[ 0.32489731, -0.18529689, -0.08712877],
       [-0.35425908,  0.04214241,  0.41249622],
       [-0.19993622, -0.38873908, -0.33057999]]), array([[ 0.43304555],
       [ 0.37959392],
       [ 0.55998979]])]
intercepts:  [array([ 0.11555407, -0.3473817 , -0.16852093]), array([ 0.31326347])]
 number of iterations the solver:  158
num of layers:  3
Num of o/p:  1
prediction:  ['male' 'male']
accuracy:  0.5

-----------------------------------------------------------------

I have the following questions:

1. Why did the optimizer not converge in RUN1?
2. Why did the number of iterations suddenly become so low in RUN3 and so high in RUN4?
3. What else can be done to increase the accuracy that I get in RUN1?


Solution 1:[1]

1. Your MLP didn't converge: the algorithm optimizes the loss by stepping toward a minimum, and in RUN1 it reached the iteration limit (max_iter, 200 by default) before finding one.
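A minimal sketch of this (reusing the data from the question, and assuming scikit-learn is installed): with max_iter set below what the solver needs, scikit-learn raises a ConvergenceWarning and n_iter_ stops at the cap.

```python
import warnings

from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43]]
y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']

# Deliberately tiny max_iter so the optimizer cannot converge.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always', ConvergenceWarning)
    clf = MLPClassifier(hidden_layer_sizes=(3,), max_iter=5,
                        random_state=0).fit(X, y)

# The solver stopped at the max_iter cap, and the warning was raised.
print(clf.n_iter_)  # 5
print(any(issubclass(w.category, ConvergenceWarning) for w in caught))  # True
```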

2. Difference between runs: your MLP starts from random initial weights, so you don't get the same results from run to run. The run that finished in only 4 iterations apparently started very close to a minimum. You can set the random_state parameter of your MLP to a constant, e.g. random_state=0, to get the same result over and over.
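As a quick sketch of that (reusing the data from the question; max_iter=500 is an assumption to let the fit finish cleanly): fixing random_state makes two independently trained MLPs come out identical.

```python
from sklearn.neural_network import MLPClassifier

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43]]
y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']

# Two separate fits with the same seed and the same data.
clf_a = MLPClassifier(hidden_layer_sizes=(3,), random_state=0,
                      max_iter=500).fit(X, y)
clf_b = MLPClassifier(hidden_layer_sizes=(3,), random_state=0,
                      max_iter=500).fit(X, y)

# Same seed, same data: identical iteration counts and identical weights.
print(clf_a.n_iter_ == clf_b.n_iter_)  # True
print(all((w1 == w2).all() for w1, w2 in zip(clf_a.coefs_, clf_b.coefs_)))  # True
```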

3. This is the most difficult point. You can optimize hyperparameters with:

from sklearn.model_selection import GridSearchCV

GridSearchCV splits your training data into equally sized parts (folds), repeatedly uses one part as validation data and the rest as training data, and so fits one classifier per fold for every parameter combination.

You need to specify the number of folds (your data set is small, so I suggest 2 or 3), a classifier (your MLP), and a grid of parameters you want to optimize, like this:

param_grid = [
    {
        'activation': ['identity', 'logistic', 'tanh', 'relu'],
        'solver': ['lbfgs', 'sgd', 'adam'],
        'hidden_layer_sizes': [(n,) for n in range(1, 22)],
    }
]

Because you once got 100 percent accuracy with a hidden layer of three neurons, you can try to optimize parameters such as the learning rate and momentum instead of the hidden layer sizes.
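Such a search could look like the following sketch (reusing the data from the question; the grid values are illustrative, and note that momentum is only used by the 'sgd' solver):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43]]
y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']

# Keep the architecture fixed at (3,) and search learning rate and momentum.
param_grid = {
    'learning_rate_init': [0.001, 0.01, 0.1],
    'momentum': [0.0, 0.5, 0.9],
}

search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(3,), solver='sgd', max_iter=500,
                  random_state=0),
    param_grid, cv=3, scoring='accuracy')
search.fit(X, y)

print(search.best_params_)
```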

Use GridSearchCV like this:

clf = GridSearchCV(MLPClassifier(), param_grid, cv=3, scoring='accuracy')
clf.fit(X, y)


print("Best parameters set found on development set:")
print(clf.best_params_)

Solution 2:[2]

You can consider increasing the maximum number of iterations, e.g.:

clf = MLPClassifier(max_iter=500)

This cleared the warning when I did the same.
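To check whether the higher cap was actually enough, you can compare n_iter_ to max_iter after fitting (a sketch reusing the data from the question): if n_iter_ comes back strictly below max_iter, the optimizer stopped on its own stopping criteria rather than hitting the cap.

```python
from sklearn.neural_network import MLPClassifier

X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43]]
y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female',
     'female', 'male', 'male']

clf = MLPClassifier(hidden_layer_sizes=(3,), max_iter=500, random_state=0)
clf.fit(X, y)

# n_iter_ < max_iter means the solver stopped via its tol / n_iter_no_change
# criteria; n_iter_ == max_iter means it ran out of iterations again.
print(clf.n_iter_, clf.max_iter)
```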

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Sara Bernard