On-Premise MLOps Pipeline Stack

My goal is to build an MLOps pipeline that is 100% independent of cloud services like AWS, GCP, and Azure. I have a project for a client in a production factory and would like to build a camera-based object-tracking ML service for them. I want to run this pipeline on my own server (an on-premise computer). I am really confused about which stack I should use; I keep ending up with solutions built on cloud components. It would be great to get some advice on which components I can use, preferably open source.



Solution 1:[1]

Assuming your main objective is a 100% cloud-free MLOps pipeline, you can build one with mostly open-source tech. All of the following can be installed on prem, without cloud services.

For Training: You can use whatever you want. I'd recommend PyTorch because it plays nicer with some of the following suggestions, but TensorFlow is also a popular choice.
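To make the training step concrete, here is a minimal sketch of a fully local PyTorch training loop. The model, the toy data, and the `model.pt` filename are all placeholders for illustration, not anything prescribed by the answer; the point is that nothing here touches a cloud service.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data standing in for features you might extract from camera frames.
X = torch.randn(256, 8)
y = X @ torch.randn(8, 1) + 0.1 * torch.randn(256, 1)

# Small placeholder network; a real tracker would use a detection model.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(X), y).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
final_loss = loss_fn(model(X), y).item()

# Save the weights to local disk; this file is what the packaging
# step below would consume.
torch.save(model.state_dict(), "model.pt")
```

Everything, including the saved weights, stays on the local filesystem, so the artifact can be handed straight to an on-prem packaging step.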

For CI/CD: If this is going to be on prem and you need to retrain the model with production data or trigger updates to your deployment on each code change, you can use Jenkins (open source) or CircleCI (commercial).
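For the Jenkins route, a retrain-and-redeploy pipeline can be expressed as a declarative Jenkinsfile. This is only a sketch; the stage names, the `train.py` script, the data path, and the local registry address are all hypothetical placeholders you would replace with your own:

```groovy
pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                // Hypothetical test command; swap in your own suite
                sh 'pytest tests/'
            }
        }
        stage('Retrain') {
            steps {
                // Retrain on fresh production data from local storage
                sh 'python train.py --data /mnt/factory-data'
            }
        }
        stage('Package & Deploy') {
            steps {
                // Build and push to an on-prem registry (address is a placeholder)
                sh 'docker build -t registry.local/tracker:latest .'
                sh 'docker push registry.local/tracker:latest'
            }
        }
    }
}
```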

For Model Packaging: Chassis (open source) is the only project I am aware of that generically turns AI/ML model files into something useful that can be run on your intended hardware. It takes an AI/ML model file as input and produces a Docker image as output. It is open source and supports Intel, ARM, CPU, and GPU. The website is here: http://www.chassis.ml and the git repo is here: https://github.com/modzy/chassis
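The idea Chassis automates is roughly "model file plus a serving entrypoint, wrapped in a container." If you wanted to hand-roll an equivalent for comparison, it might look like the Dockerfile below; every file name here (`model.pt`, `server.py`, `requirements.txt`) is a hypothetical placeholder, not part of Chassis itself:

```dockerfile
FROM python:3.10-slim
WORKDIR /app
# Model weights from the training step, plus a small serving script
COPY model.pt server.py requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8080
CMD ["python", "server.py", "--model", "model.pt", "--port", "8080"]
```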

For Deployment: Chassis model containers are automatically built with internal gRPC servers and can be deployed locally as Docker containers. If you just want to stream a single source of data through them, the SDK has methods for doing that. If you want something that accepts multiple streams or auto-scales to the available resources on your infrastructure, you'll need a Kubernetes cluster with a deployment solution such as Modzy or KServe. Chassis containers work out of the box with either.

  • KServe (https://github.com/kserve/kserve) is free, but basically just gives you a centralized processing platform hosting a bunch of copies of your running model. It doesn't allow later triage of the model's processing history.

  • Modzy (https://www.modzy.com/) is commercial, but also adds in all the RBAC, job-history preservation, auditing, etc. Modzy also has an edge-deployment feature if you want to manage your models centrally but run them in a distributed manner on the camera hardware instead of on a centralized server.
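For the KServe option above, deployment is driven by an `InferenceService` resource. A minimal sketch for a custom model container might look like the following; the service name, image, and port are placeholders for illustration:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: object-tracker
spec:
  predictor:
    containers:
      - name: kserve-container
        # Image built by the packaging step, pushed to an on-prem registry
        image: registry.local/tracker:latest
        ports:
          - containerPort: 8080
```

Applying this with `kubectl apply -f` on your on-prem cluster gives you a scaled, load-balanced endpoint without any cloud dependency.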

Solution 2:[2]

For an on-prem solution, you can go with Kubeflow. Also use the following:

  • Default storage class: nfs-provisioner

  • On-prem load balancing: MetalLB
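The MetalLB piece of this setup is configured with a couple of small Kubernetes resources (using the `metallb.io` CRDs from MetalLB v0.13+). A sketch, where the pool name and IP range are placeholders you would adapt to your factory network:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: factory-pool
  namespace: metallb-system
spec:
  addresses:
    # Placeholder range; use addresses free on your LAN
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: factory-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - factory-pool
```

With this in place, Kubernetes `LoadBalancer` services get real IPs on your local network, which is the role a cloud load balancer would otherwise play.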

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Semicolons and Duct Tape
Solution 2: gayan ranasinghe