How to safely shut down mlflow ui?
After running mlflow ui on a remote server, I'm unable to reopen the mlflow ui again.
A workaround is to kill all my processes on the server using pkill -u MyUserName.
Otherwise I get the following error:
[INFO] Starting gunicorn 20.0.4
[ERROR] Connection in use: ('127.0.0.1', 5000)
[ERROR] Retrying in 1 second.
...
Running the mlflow server failed. Please see the logs above for details.
I understand the error, but I don't understand:
1. What is the correct way to shut down mlflow ui?
2. How can I identify the mlflow ui process so that I can kill only that process instead of using pkill?
Currently I close the browser or press Ctrl+C.
Solution 1:[1]
I also ran into a similar problem recently when calling mlflow ui on a remote server. Pressing Ctrl + C in the command line usually exits it.
However, when it doesn't, running pkill -f gunicorn solves my problem.
Note that you can also use ps -A | grep gunicorn to find the process first and then run kill [PID] manually.
A similar problem seems to have been discussed here once.
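The find-then-kill approach above can be sketched in Python. This is a minimal illustration, not part of MLflow: the function names are hypothetical, and it assumes the UI is served by a gunicorn process whose command line mentions mlflow (as in the ps output quoted elsewhere on this page) and that ps accepts the POSIX -o pid= -o args= format options.

```python
import os
import signal
import subprocess
from typing import List


def find_mlflow_gunicorn_pids(ps_output: str) -> List[int]:
    """Return PIDs from 'PID command' rows that mention gunicorn and mlflow."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip header rows and malformed lines
        pid, command = int(fields[0]), fields[1]
        if "gunicorn" in command and "mlflow" in command:
            pids.append(pid)
    return pids


def list_processes() -> str:
    """Run ps and return one 'PID command line' row per process."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def terminate_mlflow_ui() -> List[int]:
    """Send SIGTERM to every matching process and return the PIDs signalled."""
    pids = find_mlflow_gunicorn_pids(list_processes())
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
    return pids
```

Calling terminate_mlflow_ui() only touches processes that match both keywords, which is narrower than pkill -f gunicorn on a machine that also runs unrelated gunicorn servers.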
Solution 2:[2]
By default, the MLflow UI binds to port 5000, so a second invocation while the first is still running fails with a port-in-use error.
You can launch multiple MLflow UI instances by giving each one a different port number:
Usage: mlflow ui [OPTIONS]

  Launch the MLflow tracking UI for local viewing of run results. To launch
  a production server, use the "mlflow server" command instead.

  The UI will be visible at http://localhost:5000 by default, and only
  accept connections from the local machine. To let the UI server accept
  connections from other machines, you will need to pass ``--host 0.0.0.0``
  to listen on all network interfaces (or a specific interface address).

Options:
  --backend-store-uri PATH     URI to which to persist experiment and run
                               data. Acceptable URIs are SQLAlchemy-compatible
                               database connection strings (e.g.
                               'sqlite:///path/to/file.db') or local
                               filesystem URIs (e.g.
                               'file:///absolute/path/to/directory'). By
                               default, data will be logged to the ./mlruns
                               directory.
  --default-artifact-root URI  Path to local directory to store artifacts, for
                               new experiments. Note that this flag does not
                               impact already-created experiments. Default:
                               ./mlruns
  -p, --port INTEGER           The port to listen on (default: 5000).
  -h, --host HOST              The network address to listen on (default:
                               127.0.0.1). Use 0.0.0.0 to bind to all
                               addresses if you want to access the tracking
                               server from other machines.
  --help                       Show this message and exit.
Try it and see what happens.
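To avoid guessing which port is free, a small helper can probe ports before launching. The following is a rough sketch using only the standard library. The helper names are hypothetical, and another process could still grab the port between the probe and the actual launch, so treat the result as a hint rather than a guarantee.

```python
import socket
from typing import List


def find_free_port(host: str = "127.0.0.1", start: int = 5000, attempts: int = 20) -> int:
    """Return the first port in [start, start + attempts) that can be bound."""
    for port in range(start, min(start + attempts, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is busy (or not permitted), try the next one
            return port
    raise RuntimeError(f"no free port found starting at {start}")


def mlflow_ui_command(port: int, host: str = "127.0.0.1") -> List[str]:
    """Build the argument list for `mlflow ui` using the documented options."""
    return ["mlflow", "ui", "--host", host, "--port", str(port)]
```

For example, subprocess.Popen(mlflow_ui_command(find_free_port())) would start the UI on the first free port at or above 5000, assuming mlflow is on the PATH.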
Solution 3:[3]
If you can't connect to mlflow because it is already running, you can run the following to find the process that holds the port, then kill that PID and spawn a new UI:
lsof -i :5000
Also, with MLflow you can use --port to assign a port number of your choice, which prevents confusion if you need multiple servers running, e.g. one for tracking and one for serving. By default the server runs on port 5000. If that port is already in use, use the --port option to specify a different port:
mlflow models serve -m runs:/<RUN_ID>/model --port 1234
Solution 4:[4]
Quick solution:
Simply kill the process
fuser -k 5000/tcp
Command syntax
fuser -k <port>/tcp
Bonus: fuser 5000/tcp prints the PID of the process bound to that port.
Note: This works on Linux only. A more portable alternative is lsof -i4 (or -i6 for IPv6).
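The same port-based lookup can be sketched in Python for use in scripts. This is an illustrative sketch with hypothetical function names. It assumes lsof is installed and relies on its -t (PIDs only), -iTCP:<port> and -sTCP:LISTEN options. If lsof is missing, the lookup simply returns an empty list.

```python
import socket
import subprocess
from typing import List


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


def parse_lsof_pids(text: str) -> List[int]:
    """Parse the output of `lsof -t`, which prints one PID per line."""
    return [int(token) for token in text.split() if token.isdigit()]


def pids_listening_on(port: int) -> List[int]:
    """Ask lsof which processes listen on the given TCP port."""
    try:
        result = subprocess.run(
            ["lsof", "-t", f"-iTCP:{port}", "-sTCP:LISTEN"],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return []  # lsof is not installed
    # lsof exits non-zero when nothing matches; stdout is then empty
    return parse_lsof_pids(result.stdout)
```

Checking port_in_use(5000) first tells you whether there is anything to kill at all, and pids_listening_on(5000) narrows the kill to the process that owns the port.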
Solution 5:[5]
I was getting an error from the mlflow ui command.
The error was:
[2022-04-19 10:48:02 -0400] [89933] [INFO] Starting gunicorn 20.1.0
[2022-04-19 10:48:02 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:02 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:03 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:03 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:04 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:04 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:05 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:05 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:06 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:06 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:07 -0400] [89933] [ERROR] Can't connect to ('127.0.0.1', 5000)
Solution that worked for me:
Step 1: Get the process id
ps -A | grep gunicorn
20734 ?? 0:39.17 /usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python /Users/XXX/env/bin/gunicorn -b 127.0.0.1:5000 -w 1 mlflow.server:app
Step 2: Take the PID from the previous output and kill the process that is using the port
kill 20734
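If you start the UI from a script, you can keep the process handle and avoid hunting for the PID later. The sketch below is one possible pattern, not an MLflow API. The stop_gracefully helper is hypothetical. It sends the same SIGTERM that a plain kill PID sends, waits, and only falls back to a forced kill if the process does not exit in time.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout: float = 10.0) -> int:
    """Ask a child process to exit, force-kill it on timeout, return its exit code."""
    proc.terminate()  # SIGTERM on POSIX, the same signal as plain `kill PID`
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # last resort: SIGKILL cannot be caught or ignored
        return proc.wait()


# Usage sketch (assumes mlflow is installed and on the PATH):
#   ui = subprocess.Popen(["mlflow", "ui", "--port", "5000"])
#   ...work with the UI...
#   stop_gracefully(ui)
```

Waiting on the handle also reaps the child, so the port should be released before the next launch.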
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Moore |
Solution 2 | Jules Damji |
Solution 3 | |
Solution 4 | Suhas_Pote |
Solution 5 | Shuchita Bora |