Better handling of sleeping apps
pxc
HOBBYOP

2 years ago

I have a low-traffic service (Django-based) deployed with the "app sleeping" feature turned on.

When the page is loaded from a sleeping state, I get Railway's "Application failed to respond" page. If I immediately refresh the page, the application has woken up and the page loads normally.

It would be better if I didn't have to reload the page, even if the total time it took to load was about the same. I expect other users of this feature would feel the same.

Solved

10 Replies

2 years ago

Are you using a health check?


pxc
HOBBYOP

2 years ago

No. Should I be?


2 years ago

Yes, you should, even if you didn't have app sleeping enabled.

Without a readiness health check, Railway only knows when the container has started. Your app may not be ready to handle a request at that exact moment, and that's likely why you see an initial 503 page.

https://docs.railway.app/reference/healthchecks

https://docs.railway.app/guides/healthchecks-and-restarts

Neither of these mentions app sleeping, but the principle is the same whether Railway is swapping in a new deployment or routing traffic to a container that just started.


pxc
HOBBYOP

2 years ago

Thank you for the suggestion. I've added a healthcheck endpoint (set to return 200 if the related database [also with app sleeping enabled] is responding and 500 otherwise), but it hasn't solved the problem. I still get the placeholder page when the app resumes from sleeping.
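
For anyone following along, here is a minimal sketch of the logic behind an endpoint like this. It is not the exact code from this service: `health_status` and `database_is_responding` are hypothetical names, and the database check is passed in as a callable so the sketch stays framework-independent.

```python
def health_status(database_is_responding):
    """Return the HTTP status code for a health check: 200 or 500.

    `database_is_responding` is a hypothetical callable supplied by the
    application. It should return a truthy value when the database
    answers, and may raise if the connection fails.
    """
    try:
        healthy = bool(database_is_responding())
    except Exception:
        # A failed connection attempt counts as "not responding".
        healthy = False
    return 200 if healthy else 500
```

In a Django view, the result could presumably be returned with something like `HttpResponse(status=health_status(check))`, where `check` is whatever database probe the app uses.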


2 years ago

You have added the endpoint to your code, but have you set it in your service settings too?


pxc
HOBBYOP

2 years ago

Yes, both. Initially it prevented deployment because I hadn't added healthcheck.railway.app to ALLOWED_HOSTS in my Django settings; when I added that, the deployment worked. But the behaviour on resuming from sleep is the same as before.
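
For reference, the settings change looks roughly like this. The first entry is the health check hostname mentioned above; `example.com` is only a placeholder for the app's real domain.

```python
# settings.py (fragment)
# "healthcheck.railway.app" is the host the health check requests
# arrive with; "example.com" is a placeholder for your own domain.
ALLOWED_HOSTS = [
    "healthcheck.railway.app",
    "example.com",
]
```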


2 years ago

That's not ideal. It seems Railway is still handing off the request before your app is able to handle it.


pxc
HOBBYOP

2 years ago

For the record, the missing piece was to add retries to the database calls from Python so that the database service has a chance to wake up from its own sleep. Now that I've added that, it seems to be working.
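
A rough sketch of the retry idea, in case it helps someone else. This is an illustration rather than the exact code used here: `call_with_retries` and its parameters (attempt count, delays) are assumptions, and the right values will depend on how long the database takes to wake.

```python
import time


def call_with_retries(operation, attempts=5, initial_delay=0.5, backoff=2.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on failure.

    All names and defaults here are illustrative. `retry_on` should be
    narrowed to the connection errors your database driver raises.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)      # give the sleeping database time to wake
            delay *= backoff  # wait longer before the next try
```

Usage would be along the lines of `call_with_retries(run_query, retry_on=(ConnectionError,))`, where `run_query` is a placeholder for whatever function makes the database call.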


a year ago

Hello,

We've resolved an issue where apps with longer startup times were showing 502 errors. Apps now have up to 10 seconds to start accepting traffic, thus preventing these error pages from appearing.

You will need to trigger a deployment so that the changes we have made take effect.


pxc
HOBBYOP

a year ago

Thanks. All seems ok now.


Status changed to Solved brody over 1 year ago

