How to increase the number of iterations to optimize my cost function at each step using partial_fit in scikit-learn's SGDClassifier?
When using partial_fit with scikit-learn's SGDClassifier, the number of iterations for the convergence of the cost function equals 1, as stated in the documentation:
Perform one epoch of stochastic gradient descent on given samples.
Internally, this method uses max_iter = 1. Therefore, it is not guaranteed that a minimum of the cost function is reached after calling it once. Matters such as objective convergence and early stopping should be handled by the user.
How can I increase max_iter so that my cost function is optimized properly rather than with just one iteration? Or, in terms of the scikit-learn documentation, how can I handle "objective convergence" and "early stopping" for my classifier when using partial_fit?
Solution 1:[1]
You can simply execute the partial_fit() command repeatedly with the same data, e.g. with the same batch. Here is my code fragment, where I just programmed a loop around the partial_fit() command:

for i_iter in range(iter_per_batch):
    clf.partial_fit(X_batch, y_batch, classes=[0, 1])

The variable iter_per_batch defines the number of iterations.
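A minimal, self-contained sketch of this idea, including a simple hand-rolled early-stopping check (the data, the tol threshold, and the weight-change stopping rule are illustrative assumptions, not part of the original answer):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for one batch of training data.
X_batch, y_batch = make_classification(n_samples=200, random_state=0)

clf = SGDClassifier(random_state=0)
iter_per_batch = 50   # upper bound on epochs over this batch
tol = 1e-4            # illustrative convergence threshold

prev_coef = None
for i_iter in range(iter_per_batch):
    # Each call performs exactly one epoch of SGD on the batch.
    clf.partial_fit(X_batch, y_batch, classes=[0, 1])
    # Early stopping: quit once the weights barely change between epochs.
    if prev_coef is not None and np.abs(clf.coef_ - prev_coef).max() < tol:
        break
    prev_coef = clf.coef_.copy()
```

This puts "objective convergence and early stopping" in the user's hands, as the documentation requires; here convergence is judged by the change in the coefficients rather than by the loss value, since SGDClassifier does not expose the training loss directly.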
Solution 2:[2]
You can simply use the fit() method instead of the partial_fit() method and increase max_iter by passing an integer number of iterations to the SGDClassifier constructor. The default is 1000 iterations.
Have a look at the documentation of the max_iter parameter: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html
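A short sketch of this approach (the data and the specific max_iter/tol values are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, random_state=0)

# max_iter bounds the number of epochs; tol lets fit() stop early
# once the loss improvement falls below the threshold.
clf = SGDClassifier(max_iter=2000, tol=1e-4, random_state=0)
clf.fit(X, y)

# n_iter_ reports how many epochs were actually run (at most max_iter).
print(clf.n_iter_)
```

Unlike partial_fit, fit() handles convergence checking internally, so this is the simpler option when the whole dataset fits in memory.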
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | Michael Mior
Solution 2 | Kim Tang