How to debug an invocation timeout error in SageMaker batch transform?
I am experimenting with SageMaker, using a container from the list here: https://github.com/aws/deep-learning-containers/blob/master/available_images.md to run my model, and overriding the model_fn and predict_fn functions in an inference.py file for model loading and prediction, as shown here: https://github.com/PacktPublishing/Learn-Amazon-SageMaker-second-edition/blob/main/Chapter%2007/huggingface/src/torchserve-predictor.py. I keep getting an invocations timeout error: "Model server did not respond to /invocations request within 3600 seconds". Am I missing anything in my inference.py code, such as something that responds to the ping/health check?
file: inference.py

import json
import torch
from transformers import AutoConfig, AutoTokenizer, DistilBertForSequenceClassification

JSON_CONTENT_TYPE = 'application/json'

def model_fn(model_dir):
    config_path = '{}/config.json'.format(model_dir)
    model_path = '{}/pytorch_model.bin'.format(model_dir)
    config = AutoConfig.from_pretrained(config_path)
    ...

def predict_fn(input_data, model):
    # return predictions
    ...
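To see whether model loading or prediction is what eats the time budget, a simple step is to log how long each handler takes. This is a minimal, self-contained sketch; the `timed` decorator and the placeholder `predict_fn` body are illustrative, not part of the original code:

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def timed(fn):
    """Log the wall-clock time of a handler call.

    Slow predict_fn calls accumulate toward the /invocations timeout,
    so these log lines (visible in CloudWatch) show where time goes.
    """
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        logger.info("%s took %.2fs", fn.__name__, time.perf_counter() - start)
        return result
    return wrapper

@timed
def predict_fn(input_data, model):
    # Placeholder prediction logic; replace with the real model call.
    return {"label": 0}
```

The same decorator can be applied to model_fn to confirm the model loads in a reasonable time at container startup.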
Solution 1:[1]
The issue is not with the health checks. It is with the container not responding to the /invocations request, and this can be due to the model taking longer than expected to produce predictions from the input data.
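If each request is simply too slow, one workaround is to shrink the per-request payload so every /invocations call finishes within the limit. A sketch of the relevant CreateTransformJob parameters follows; job, model, bucket, and instance names are placeholders, and InvocationsTimeoutInSeconds is already at its 3600-second maximum in the question, so the payload settings are the knobs left to turn:

```python
# Sketch: smaller payloads per /invocations request for a batch transform job.
transform_params = {
    "TransformJobName": "debug-timeout-job",   # placeholder name
    "ModelName": "my-model",                   # placeholder model
    "MaxPayloadInMB": 1,                       # cap the request size
    "BatchStrategy": "SingleRecord",           # one record per request
    "ModelClientConfig": {
        "InvocationsTimeoutInSeconds": 3600,   # service maximum
        "InvocationsMaxRetries": 3,
    },
    "TransformInput": {
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/input/",  # placeholder bucket
        }},
        "ContentType": "application/json",
        "SplitType": "Line",                   # split JSON Lines input
    },
    "TransformOutput": {"S3OutputPath": "s3://my-bucket/output/"},
    "TransformResources": {"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
}

# The actual call (requires AWS credentials, so it is left commented out):
# import boto3
# boto3.client("sagemaker").create_transform_job(**transform_params)
```

With SplitType set to Line and SingleRecord batching, each line of the input file becomes its own /invocations request, so a slow model only has to finish one record within the timeout.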
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | CrzyFella |