8 months ago
I tried deploying a web service on the free tier, where one of the endpoints loads a TensorFlow model. However, when I access that endpoint, it fails with a 502 error. This is the response in Postman:
{
"status": "error",
"code": 502,
"message": "Application failed to respond",
"request_id": "Qc4UHT9cTYycFuAkacI7Nw"
}

and this is the deploy log from when I tried to access the endpoint books/filter:

2025-06-09 01:14:56.254656: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-06-09 01:14:56.285257: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

3 Replies
7 months ago
I would suggest ensuring you're listening on the correct host and port. Additionally, check your networking settings to make sure you're not trying to reach a private network from outside that network.
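To illustrate the host/port point, here's a minimal sketch assuming a Flask app (the route name comes from the books/filter endpoint in your log; everything else is illustrative). The key details are binding to 0.0.0.0 rather than 127.0.0.1, and reading the port from the PORT environment variable that the platform injects:

```python
import os

from flask import Flask

app = Flask(__name__)

@app.route("/books/filter")
def filter_books():
    # Placeholder response; your real handler would run the model here.
    return {"status": "ok"}

if __name__ == "__main__":
    # The platform injects the port via the PORT env var. Bind to
    # 0.0.0.0 (all interfaces) so the edge proxy can reach the app;
    # binding to 127.0.0.1 is a common cause of 502s.
    port = int(os.environ.get("PORT", 8080))
    app.run(host="0.0.0.0", port=port)
```

If the app only listens on localhost or a hard-coded port, the proxy can't connect and returns exactly the "Application failed to respond" 502 shown above.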
Status changed to Awaiting User Response Railway • 8 months ago
Status changed to Solved sarahkb125 • 8 months ago
sarahkb125
I would suggest ensuring you're listening on the correct host and port. Additionally, check your networking settings to make sure you're not trying to reach a private network from outside that network.
7 months ago
I’ve already made sure that the port and host are correct, and I can successfully load the model when running the project locally. Could it be that the TensorFlow model I'm using is too large and exceeds the free tier's RAM or CPU limits?
Status changed to Awaiting Railway Response Railway • 8 months ago
7 months ago
Cloud Run has a default request timeout of 5 minutes, but on the free tier, cold starts and model loading (especially with TensorFlow) can exceed this. If your model is large, it may take 20–30 seconds or more to load. You could try loading the TensorFlow model globally when the app starts (not inside the request handler), which avoids reloading it on every request.
Also try increasing the memory allocation (e.g., 2 GB+), since TensorFlow needs RAM to load the model weights; the free tier may not be enough. To say more I'd need more detail, so please check for crash logs in Cloud Run and filter by severity='error'.
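For reference, that log check can be done from the CLI with `gcloud logging read` (this assumes an authenticated gcloud with your project selected; the limit and format are just examples):

```shell
# Filter expression matching Cloud Run log entries at ERROR or above
FILTER='resource.type="cloud_run_revision" AND severity>=ERROR'

# Run this in your project to print the 20 most recent matching entries:
# gcloud logging read "$FILTER" --limit=20 --format=json
```

Out-of-memory kills during model loading typically show up here as container termination errors, which would confirm whether RAM is the problem.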