'how to log hydra's multi-run in mlflow
I am trying to manage the results of machine learning with mlflow and hydra. So I tried to run it using the multi-run feature of hydra. I used the following code as a test.
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time
@hydra.main('config.yaml')
def main(cfg):
print(cfg)
mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
mlflow.set_experiment(cfg.experiment_name)
mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
with mlflow.start_run() :
mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')
if __name__ == '__main__':
main()
This code will not work. I got the following error
Exception: Run with UUID [RUNID] is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True
So I modified the code as follows
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time
@hydra.main('config.yaml')
def main(cfg):
print(cfg)
mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
mlflow.set_experiment(cfg.experiment_name)
mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
with mlflow.start_run(nested=True) :
mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')
if __name__ == '__main__':
main()
This code works, but the artifact is not saved. The following corrections were made to save the artifacts.
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time
@hydra.main('config.yaml')
def main(cfg):
print(cfg)
mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
mlflow.set_experiment(cfg.experiment_name)
mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
# mlflow.log_param('param1',5)
mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')
if __name__ == '__main__':
main()
As a result, the artifacts are now saved. However, when I run the following command
python test.py model=A,B hidden=12,212,31 -m
Only the artifact of the last execution condition was saved.
How can I modify mlflow to manage the parameters of the experiment by taking advantage of the multirun feature of hydra?
Solution 1:[1]
MLFlow is not officially supported by Hydra. At some point there will be a plugin that will make this smoother.
Looking at the errors you are reporting (and without running your code): One thing that you can try to to use the Joblib launcher plugin to get job isolation through processes (this requires Hydra 1.0.0rc1 or newer).
Solution 2:[2]
This is related to the way you defined your MLFlow run. You use log_params and then start_run, so you have two concurrent runs of mlflow which explains the error. You could try getting rid of the following line in your first code sample and see what happens
mlflow.log_param('param1',5)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Armand Sauzay |