Return confidence score with custom model for Vertex AI batch predictions
I uploaded a pretrained scikit-learn classification model to Vertex AI and ran a batch prediction on 5 samples. It just returned a list of false predictions with no confidence scores. I don't see anything in the SDK documentation or the Google Cloud console about how to get batch predictions to include confidence scores. Is that something Vertex AI can do?
My intent is to automate a batch prediction pipeline using the following code.
# Predict
# instances_format options: "jsonl", "csv", "bigquery", "tf-record", "tf-record-gzip", or "file-list"
batch_prediction_job = model.batch_predict(
    job_display_name=job_display_name,
    gcs_source=input_path,
    instances_format="",  # jsonl, csv, bigquery, ...
    gcs_destination_prefix=output_path,
    starting_replica_count=1,
    max_replica_count=10,
    sync=True,
)
batch_prediction_job.wait()
return batch_prediction_job.resource_name
I tried it out in the Google Cloud console as a test to make sure my input data was properly formatted.
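For reference, a minimal sketch (placeholder values, hypothetical input.jsonl file name) of how a JSONL input file for a tabular scikit-learn model can be produced, with one instance per line written as a JSON array of feature values:

import json

# Hypothetical 5 samples for a 4-feature classifier; values are placeholders.
samples = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
    [5.9, 3.0, 5.1, 1.8],
    [4.7, 3.2, 1.3, 0.2],
]

# One JSON array per line, which is the layout batch prediction expects
# for tabular instances when instances_format="jsonl".
with open("input.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")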
Solution 1:[1]
I don't think so; as far as I can tell, the stock scikit-learn container provided by Vertex AI doesn't return such a score. You might need to write a custom container.
Solution 2:[2]
You can now do this with custom prediction routines. Here are a couple of good end-to-end examples:
- the official Google one
- one of mine, which focuses on batch prediction with predict_proba()
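For context, a minimal sketch of what predict_proba() returns for a scikit-learn classifier; this per-class probability output is what the predictor below passes through as confidence scores (the toy data here is made up):

from sklearn.linear_model import LogisticRegression
import numpy as np

# Toy training data: two features, binary labels.
X = np.array([[0.0, 1.0], [1.0, 0.5], [2.0, 0.1], [3.0, 0.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))        # hard labels only, e.g. [0 0 1 1]
print(clf.predict_proba(X))  # one row per instance: [P(class 0), P(class 1)]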
Here's an example of the interface for the predictor.py:
%%writefile src/predictor.py

import joblib
import numpy as np
import pickle
import json

from google.cloud import storage
from google.cloud.aiplatform.prediction.sklearn.predictor import SklearnPredictor


class CprPredictor(SklearnPredictor):

    def __init__(self):
        return

    def load(self, gcs_artifacts_uri: str):
        """Loads the model artifact from GCS."""
        gcs_client = storage.Client()
        # Download model.joblib from the artifact directory to local disk.
        with open("model.joblib", "wb") as gcs_model:
            gcs_client.download_blob_to_file(
                gcs_artifacts_uri + "/model.joblib", gcs_model
            )
        with open("model.joblib", "rb") as f:
            self._model = joblib.load(f)

    def predict(self, instances):
        # Return class probabilities instead of hard labels.
        outputs = self._model.predict_proba(instances)
        return outputs
Note that, at the time of writing, you have to use an experimental branch of the SDK; this will likely change once the feature becomes official.
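A rough sketch of how a predictor like the one above can be wired into a batch prediction job using the SDK's custom prediction routine support; the project, region, bucket paths, image URI, and display names below are all placeholders, not values from the original answer:

from google.cloud import aiplatform
from google.cloud.aiplatform.prediction import LocalModel

from src.predictor import CprPredictor  # the class defined above

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Build a local serving image that wraps CprPredictor (requires Docker).
local_model = LocalModel.build_cpr_model(
    "src",                                                          # directory containing predictor.py
    "us-central1-docker.pkg.dev/my-project/my-repo/cpr-sklearn",    # placeholder image URI
    predictor=CprPredictor,
    requirements_path="src/requirements.txt",
)
local_model.push_image()

# Register the model, pointing at the GCS directory that holds model.joblib.
model = aiplatform.Model.upload(
    local_model=local_model,
    display_name="sklearn-with-proba",
    artifact_uri="gs://my-bucket/model-artifacts/",                 # placeholder
)

# Batch prediction now returns the predict_proba() output per instance.
batch_prediction_job = model.batch_predict(
    job_display_name="sklearn-proba-batch",
    gcs_source="gs://my-bucket/input.jsonl",                        # placeholder
    instances_format="jsonl",
    gcs_destination_prefix="gs://my-bucket/output/",                # placeholder
    machine_type="n1-standard-2",
    sync=True,
)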
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Shawn |
| Solution 2 | JW_ |