Speeding up grid search in sklearn
I am performing a grid search to identify the best SVM parameters. I am using IPython and sklearn. The code is slow and runs on only one core. How can it be sped up and made to use multiple cores? Thanks
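import time
import numpy as np
from sklearn import svm
# Older releases: sklearn.grid_search.GridSearchCV and sklearn.cross_validation.train_test_split
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier

t = time.time()  # start of the timing reported at the end
# X and Y (features and labels) are assumed to be loaded already
random_state = np.random.RandomState(10)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=.2, random_state=random_state)
model_to_set = OneVsRestClassifier(svm.SVC(kernel="linear"))
parameters = {
    "estimator__C": [1, 2, 4, 8, 16, 32],
    "estimator__kernel": ["linear", "rbf"],
    "estimator__gamma": [1, 0.1, 1e-2, 1e-3, 1e-4],
}
model_tuning = GridSearchCV(model_to_set, param_grid=parameters)
model_tuning.fit(X_train, y_train)
print(model_tuning.best_score_)
print(model_tuning.best_params_)
print("Time passed: {0:.1f} sec".format(time.time() - t))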
Solution 1:[1]
There is an n_jobs parameter in GridSearchCV:
n_jobs : int, default=1
Number of jobs to run in parallel. Changed in version 0.17: Upgraded to joblib 0.9.3.
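To see why n_jobs helps here, it may be useful to count how much independent work the grid from the question generates. The sketch below uses only the standard library and assumes 5-fold cross-validation (the default in recent scikit-learn releases; older releases defaulted to 3). The names cv_folds and split_grid are illustrative, not part of the original code.

```python
from itertools import product

# Grid copied from the question.
parameters = {
    "estimator__C": [1, 2, 4, 8, 16, 32],
    "estimator__kernel": ["linear", "rbf"],
    "estimator__gamma": [1, 0.1, 1e-2, 1e-3, 1e-4],
}

cv_folds = 5  # assumption: default in recent scikit-learn; older releases used 3

# Every combination of values is one candidate, and each candidate is
# fitted once per fold. These fits are independent of one another.
n_candidates = len(list(product(*parameters.values())))
total_fits = n_candidates * cv_folds
print(n_candidates, total_fits)

# The linear kernel ignores gamma, so a list of two grids avoids
# re-fitting identical linear models for each gamma value.
split_grid = [
    {"estimator__C": parameters["estimator__C"],
     "estimator__kernel": ["linear"]},
    {"estimator__C": parameters["estimator__C"],
     "estimator__kernel": ["rbf"],
     "estimator__gamma": parameters["estimator__gamma"]},
]
n_split = sum(len(list(product(*grid.values()))) for grid in split_grid)
print(n_split)
```

Because the fits do not depend on each other, they can be spread across cores. GridSearchCV also accepts a list of dictionaries such as split_grid for param_grid, which reduces the number of candidates before any parallelism is applied.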
Solution 2:[2]
By default, GridSearchCV uses 1 job to search over specified parameter values for an estimator.
So, you need to set it explicitly to the number of parallel jobs you want by changing the following line:
model_tuning = GridSearchCV(model_to_set, param_grid=parameters)
into the following, which allows jobs to run in parallel:
model_tuning = GridSearchCV(model_to_set, param_grid=parameters, n_jobs=4)
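For a complete picture, here is a minimal self-contained sketch of the same idea. It is not the asker's setup: the data is synthetic (generated with make_classification), the grid is shortened so it finishes quickly, cv=3 is chosen arbitrarily, and run_search is a hypothetical helper name. The imports assume scikit-learn 0.18 or later, where GridSearchCV lives in sklearn.model_selection.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC


def run_search(n_jobs):
    """Fit a small grid search on synthetic data with the given n_jobs."""
    # Synthetic 3-class data standing in for the asker's X and Y (assumption).
    X, y = make_classification(
        n_samples=200, n_features=10, n_informative=5,
        n_classes=3, random_state=10,
    )
    model_to_set = OneVsRestClassifier(SVC())
    parameters = {
        "estimator__C": [1, 4, 16],
        "estimator__kernel": ["linear", "rbf"],
        "estimator__gamma": [1e-1, 1e-2],
    }
    # n_jobs=-1 requests all available cores; a positive integer
    # caps the number of parallel workers instead.
    model_tuning = GridSearchCV(
        model_to_set, param_grid=parameters, n_jobs=n_jobs, cv=3
    )
    model_tuning.fit(X, y)
    return model_tuning


if __name__ == "__main__":
    model_tuning = run_search(n_jobs=-1)
    print(model_tuning.best_score_)
    print(model_tuning.best_params_)
```

Setting n_jobs=-1 rather than a fixed number such as 4 keeps the script portable across machines with different core counts. The speedup is usually less than the core count because of process startup and data-copying overhead.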
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | eliasah |