This post is based on the README here, where newer versions might be available.
MLEM promises that you:
Use the same human-readable format for any ML framework
This is a bold promise, and in this repo I will explore it a little (and maybe some other features as well). Note that the machine learning content itself is secondary; the focus is on the process and the tools.
Fetching and preparing the data 👷🏽♀️
This script is used in the first stage of the DVC pipeline.
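The script itself is not reproduced here. As a minimal sketch of what such a first stage could look like — the function name, output paths, and split parameters are assumptions, not the repo's actual code:

```python
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def fetch_and_split(out_dir: str = "data", test_size: float = 0.2, seed: int = 42):
    """Fetch the Iris data set and split it into train and test sets."""
    X, y = load_iris(return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Persist the splits so downstream DVC stages can depend on them
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    X_train.assign(target=y_train).to_csv(out / "train.csv", index=False)
    X_test.assign(target=y_test).to_csv(out / "test.csv", index=False)
    return X_train, X_test, y_train, y_test
```

DVC can then track the two CSV files as outputs of this stage.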
Training the model and persisting it using MLEM 🚀
from mlem.api import save

save(
    rf,
    "rf",
    sample_data=X,
    description="Random Forest Classifier",
)
rf is the fitted model, and the string "rf" is the name under which it is saved. In addition, a description is provided (see issue #279 for a related topic). Furthermore, by providing a value for the parameter sample_data, MLEM will include the schema of the data in the model's metadata.
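For context, here is a hedged sketch of how rf and X might be produced before the save call above. Illustratively, X comes from load_iris here; in the actual pipeline it would come from the train split written by the first stage:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative only: the repo's training script may load the train split
# from disk and use different hyperparameters.
X, y = load_iris(return_X_y=True, as_frame=True)
rf = RandomForestClassifier(random_state=42).fit(X, y)
```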
What's next? Or how to get predictions using an API? ⚡️
By running dvc repro in this project, the following things will happen:
- The Iris data set will be fetched and split into train and test sets.
- A model will be trained.
- The model will be persisted by MLEM; its metadata (.mlem/model/rf.mlem) will be tracked by Git, and the model itself (.mlem/model/rf) will be tracked by DVC.
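The pipeline behind dvc repro can be wired up in a dvc.yaml roughly like this — the stage names and script paths are assumptions, not the repo's actual files; cache: false is DVC's way of keeping a stage output under Git rather than in the DVC cache:

```yaml
stages:
  fetch_data:
    cmd: python src/fetch_data.py   # hypothetical script name
    outs:
      - data/train.csv
      - data/test.csv
  train:
    cmd: python src/train.py        # hypothetical script name
    deps:
      - data/train.csv
    outs:
      - .mlem/model/rf              # binary model, tracked by DVC
      - .mlem/model/rf.mlem:
          cache: false              # metadata, tracked by Git
```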
Now comes the fun part. By running:
mlem build rf docker --conf server.type=fastapi --conf image.name=rf-image-test
MLEM will build a Docker image that can be used to get predictions from the trained model via an API. Once the image is built, a container can be run:
docker run --rm -it -p 8080:8080 rf-image-test
As soon as it is up and running, the documentation of the new API's endpoints can be found at http://0.0.0.0:8080/docs.
Finally, once MLEM is serving the model, we can get predictions for our test set. To that end, we simply send a list of dictionaries to the /predict endpoint and get, in return, a list of predictions.
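A minimal client sketch under a few assumptions: the column names follow scikit-learn's Iris frame, the server runs locally on port 8080, and the exact request envelope (here {"data": [...]}) may differ between MLEM versions:

```python
import requests

URL = "http://0.0.0.0:8080/predict"  # assumed local server from the docker run above


def build_payload(rows):
    """Each row is a mapping of feature name -> value; the whole test set
    is sent as a list of such dictionaries."""
    return {"data": [dict(row) for row in rows]}


def predict(rows):
    response = requests.post(URL, json=build_payload(rows))
    response.raise_for_status()
    return response.json()  # a list of predictions


if __name__ == "__main__":
    # Hypothetical single-row example using scikit-learn's Iris column names
    rows = [{
        "sepal length (cm)": 5.1,
        "sepal width (cm)": 3.5,
        "petal length (cm)": 1.4,
        "petal width (cm)": 0.2,
    }]
    print(predict(rows))
```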
Isn't it really wonderful?
So, in this repository you can find an end-to-end example of how to bring your ML model to life as an API that can return predictions. This bridges a huge hurdle that data science teams face. After completing the hard work of data fetching, cleaning, feature engineering, model training/evaluation/tuning, and so on, the team is ready to deliver great value. Alas... now support from DevOps and data engineers is needed to bring the model to production. Using MLEM, the team is much closer to being independent and to making an impact directly and quickly.