Hobby plan service appears to be asleep/idle overnight - webhook delivery failed

andyelliottx

HOBBY OP

a month ago

Hi,

I'm on the Hobby plan and my service appears to have been idle (or asleep) last night, which caused an externally-originated webhook request to time out. The Hobby plan documentation suggests services run 24/7, so I'd like to understand if this is expected behavior, a misconfiguration on my end, or something to escalate.

Service: web-production-cb4a3b (Hobby plan, single web service, Python Flask + APScheduler, ~24 MB image)

The incident:

On 2026-05-19 at approximately 23:00 UTC, an external service attempted to POST to my /webhook endpoint and received a "request timed out" error after ~3 seconds. The webhook endpoint is a minimal async handler (spawns a background thread, returns 200 OK in milliseconds when warm), so a 3-second timeout implies the service wasn't running when the request arrived.
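For reference, the handler pattern described above looks roughly like the sketch below. It is framework-free and is not the actual service code: `handle_webhook`, `process_event`, and the `processed` list are hypothetical names used for illustration, and in the real app the function would sit behind a Flask route.

```python
import threading

processed = []  # stands in for whatever the real background job records


def process_event(payload):
    """Hypothetical slow work that runs after the webhook is acknowledged."""
    processed.append(payload)


def handle_webhook(payload):
    """Hand the payload to a background thread and acknowledge immediately."""
    worker = threading.Thread(target=process_event, args=(payload,), daemon=True)
    worker.start()
    # The caller gets its response without waiting for process_event to finish.
    return "OK", 200
```

With this shape the response time does not depend on the background work, so when the process is already running a multi-second delay has to come from somewhere before the handler is reached.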

The evidence from your Metrics tab (last 24 hours):

Memory graph: drops from ~100 MB to ~0 MB at approximately 20:00 UTC on 19 May and stays at 0 MB until ~07:00 UTC on 20 May. A Python/Flask process cannot use 0 MB of memory - this suggests the process was either stopped or your metrics layer stopped collecting from it.

CPU graph: drops to flat zero during the same window.

Public Network Traffic: zero ingress between approximately 20:00 UTC and 07:00 UTC, then resumes when my next scheduled webhook fires.

Requests panel: 4xx errors recorded on 19 May that I believe correspond to the failed external webhook attempt.

My current traffic pattern:

The service receives sporadic webhooks during UK/US hours approximately 07:00–21:00 UTC. Between 21:00 UTC and 07:00 UTC the next day, traffic is essentially zero — but my internal APScheduler is supposed to be running force-close jobs at 20:00 UTC daily, and the Flask app needs to remain reachable for the occasional overnight webhook.

What I'd like to understand:

Is it expected for Hobby-plan services to go fully idle (memory→0) during low-traffic windows?

If so, what's the recommended way to prevent this? Internal heartbeat? External uptime monitor? Upgrade to a different plan?

If not, is there something in my service configuration that might be triggering this (Procfile, builder choice, healthcheck config, etc.)?
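To make the "external uptime monitor" option concrete, here is a minimal sketch of a probe I could run from outside Railway to measure how long the endpoint takes to answer overnight. It uses only the Python standard library; the function name `probe` and the 3-second default (chosen to match the timeout the external caller seems to use) are my own assumptions, not anything Railway provides.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=3.0):
    """POST an empty JSON body and return (status_or_None, seconds_elapsed)."""
    request = urllib.request.Request(
        url,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with a 4xx/5xx
    except OSError:
        status = None  # timeout or connection failure (URLError is an OSError)
    return status, time.monotonic() - start
```

Calling `probe()` with the service's public /webhook URL every few minutes and logging the result would show whether the first request after a quiet period is slower than the rest, or fails outright.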

I'm happy to share logs from the time window or any other diagnostic info you need.

Thanks,

Andy

Awaiting Railway Response

5 Replies

Railway

BOT

a month ago

Hobby plan services run 24/7 by default and do not sleep unless you have the Serverless toggle enabled on the service (Settings > Deploy > Serverless). The behavior you describe, memory dropping to 0 and a cold-boot timeout on the first inbound request, is exactly what happens when Serverless is active and no outbound traffic is detected for 10 minutes. Your logs confirm the last outbound activity was the 20:00 UTC cron job calling the IG API, after which the service would have slept by ~20:10 UTC. Check your service's Deploy settings and disable Serverless if you need the service reachable around the clock.
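The timing in that explanation can be checked with a few lines of arithmetic. This is only a sketch of the reasoning, assuming Serverless was enabled and that the 20:00 UTC job really was the last outbound traffic; `IDLE_LIMIT`, `expected_sleep_time`, and `asleep_at` are illustrative names, not part of any Railway API.

```python
from datetime import datetime, timedelta, timezone

IDLE_LIMIT = timedelta(minutes=10)  # inactivity window quoted in the reply above


def expected_sleep_time(last_outbound):
    """Earliest time the service would sleep after its last outbound traffic."""
    return last_outbound + IDLE_LIMIT


def asleep_at(request_time, last_outbound):
    """True if a request at request_time would arrive after the sleep point."""
    return request_time >= expected_sleep_time(last_outbound)


last_outbound = datetime(2026, 5, 19, 20, 0, tzinfo=timezone.utc)
webhook = datetime(2026, 5, 19, 23, 0, tzinfo=timezone.utc)

print(expected_sleep_time(last_outbound).isoformat())  # 2026-05-19T20:10:00+00:00
print(asleep_at(webhook, last_outbound))  # True
```

Under those assumptions the 23:00 UTC webhook would have arrived nearly three hours after the service went to sleep, which is consistent with a cold-boot timeout.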

Status changed to Awaiting User Response Railway • about 1 month ago

andyelliottx

HOBBY OP

a month ago

Thanks, that lines up with the metrics I was seeing. Quick clarification request: could you confirm from your audit logs whether Serverless was enabled on this service at the time of the 19 May 23:00 UTC incident? I just want to be sure there isn't a different sleep mechanism kicking in that I should know about. On my side the setting shows as disabled, and it hasn't been changed since the initial setup last month.

Status changed to Awaiting Railway Response Railway • about 1 month ago

sam-a

EMPLOYEE

a month ago

Apologies for this canned message but in an effort to help all our customers get back up and running, we are sending this bulk message. As you may know, we had a major interruption to our services yesterday. We've published a post-mortem if you'd like more information on the incident. It describes what happened and what we are doing to prevent it in the future. We are deeply sorry for the impact that it has had on you.

It is taking some time to bring everything back up, but we are working on it as fast as we can. In general, a redeployment should fix most service issues. Due to the volume of customers redeploying right now, builds and deploys may take longer than normal to process.

You can track recovery status here: https://status.railway.com/incident/KVZ1Z8GY

If you are still having other issues that might be related to the incident you can read more here: https://station.railway.com/community/road-to-recovery-post-gcp-outage-builds-d362e48c

Feel free to respond if your question has not been addressed.

Status changed to Awaiting User Response Railway • about 1 month ago

andyelliottx

HOBBY OP

a month ago

My question hasn't been answered, thanks.

Status changed to Awaiting Railway Response Railway • about 1 month ago
