Serverless not working

javardevops

PROOP

a month ago

Greetings, I have serverless enabled and I can see it in the settings, but I don't see the application going to sleep even when I know no one has been using it for about an hour. After how long does it go to sleep?

It used to work before, but I have recently been monitoring my services and they do not seem to sleep. I will attach one of the projects for your investigation.

Thank you

$20 Bounty

14 Replies

Status changed to Awaiting Railway Response Railway • 29 days ago

sam-a

EMPLOYEE

a month ago

Serverless sleeps a service after 10 minutes of no outbound traffic. Outbound traffic includes database connections, telemetry, health checks, and any network requests your app makes - not just inbound user requests. If your services maintain open database connections (like your Postgres connections) or your monitoring tool periodically pings or polls anything, those outbound packets reset the inactivity timer and prevent the service from ever sleeping. You would need to ensure your app closes idle database connections and that any external monitoring is disabled for serverless to take effect.

Status changed to Awaiting User Response Railway • 29 days ago

javardevops

PROOP

a month ago

You can see here the last time it made requests, even though at the time of posting it is not sleeping. Wouldn't the attachment below be enough to confirm what you were advising me on? Thank you in advance.

Attachments

image.png

Status changed to Awaiting Railway Response Railway • 29 days ago

sam-a

EMPLOYEE

a month ago

Your HTTP Logs confirm no users are hitting the service, but the Network Flow Logs you shared actually show the problem. There are outbound TCP connections to external IPs on port 80 every 5 minutes (at 17:13, 17:18, 17:28, 17:33, 17:38, 17:43). Since serverless requires 10 minutes of zero outbound traffic to trigger sleep, those connections reset the timer before it ever completes. Something in your application or its dependencies is making those outbound requests on a regular interval. Identifying and disabling that periodic outbound call will allow the service to sleep.
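The timer reasoning above can be sketched in a few lines of Python. This is only an illustration, not Railway's implementation: the 10-minute window comes from the reply, while the function names, the evenly spaced 5-minute series, and the treatment of a gap of exactly 10 minutes as sufficient are assumptions. It only looks at gaps between recorded flows, not at the quiet time after the last one.

```python
from datetime import datetime, timedelta

SLEEP_WINDOW = timedelta(minutes=10)  # inactivity window stated in the thread


def longest_quiet_gap(flow_times):
    """Return the longest interval between consecutive outbound flows."""
    ordered = sorted(flow_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps, default=timedelta(0))


def can_reach_sleep_window(flow_times, window=SLEEP_WINDOW):
    """True if some quiet gap is at least as long as the window (assumed rule)."""
    return longest_quiet_gap(flow_times) >= window


# Illustrative series: one flow every 5 minutes starting at 17:13 (date is arbitrary).
start = datetime(2024, 1, 1, 17, 13)
flows = [start + timedelta(minutes=5 * i) for i in range(7)]

print(longest_quiet_gap(flows))       # 0:05:00
print(can_reach_sleep_window(flows))  # False
```

With a flow every 5 minutes the longest quiet gap is half the window, so the inactivity timer never completes.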

Status changed to Awaiting User Response Railway • 29 days ago

javardevops

PROOP

a month ago

Maybe a better example is this one, because at least it doesn't have any recent logs, yet it is also unable to sleep. Let us use this one to investigate; I will attach screenshots and the project. Because it has no active clients yet, I wanted it to sleep to save resources when there is no activity, which is why I enabled serverless for it as well, even though it is production.

Thank you in advance

https://railway.com/project/bfd2d815-291e-464e-b4f5-f1ed3fc47323/service/d5071d3a-de0d-4026-a046-95b562788834?environmentId=87726d29-9d0d-485f-a46a-9095926a29e1

Attachments

image.png

Status changed to Awaiting Railway Response Railway • 28 days ago

mykal

EMPLOYEE

a month ago

We redeployed your hms-prod-backend service to resolve a tracing issue, and the Network Flow Logs now correctly show constant traffic between the backend and your Postgres database over the private network.

The same root cause applies here as with your other service: your persistent database connection keeps the serverless inactivity timer from ever completing its 10-minute window. Configuring your app to close idle database connections will allow serverless to take effect.

Status changed to Awaiting User Response Railway • 28 days ago

javardevops

PROOP

a month ago

Thank you again for your commitment to help always.

However, after I adjusted the DB connection behavior, pg_stat_activity no longer shows active/idle backend DB connections, but the backend service still emits outbound internet traffic to port 80 every 5 minutes. This traffic is not Postgres traffic. Can Railway identify whether this 5-minute outbound port-80 traffic is platform/runtime-generated, or provide the process/source causing it?

Since I saw that almost all services are unable to sleep, I decided to focus on this one for investigation, so it is the one where I ensured there are no idle connections.

https://railway.com/project/2c5f7c4f-d5d1-4cf2-a2d0-05eda97b38a1/service/538d4a88-cd16-4f14-8a14-81fbdb1534b3?environmentId=605f940b-e2dc-4a8a-ac3b-703ffdfbbb67&id=6309c38e-9675-4fd4-8b4e-be105aa9407a#network

Thank you in advance

Status changed to Awaiting Railway Response Railway • 28 days ago

sam-a

EMPLOYEE

a month ago

That 5-minute outbound port-80 traffic is not generated by Railway's platform or runtime. It originates from your application process. The destination IPs are all AWS eu-west-1 addresses, and each request follows the same pattern (385 bytes out, 588 bytes back, rotating through several IPs). This is consistent with a library or SDK in your Python/FastAPI stack that phones home or performs periodic health checks. You would need to audit your application's installed packages and startup code to find which dependency is making those calls.

Status changed to Awaiting User Response Railway • 28 days ago

javardevops

PROOP

a month ago

I believe eu-west is where the services are deployed, both frontend and backend. I will keep investigating, but if you have any further comments to assist with the investigation I will be grateful. What bothers me is that even services that used to sleep, like this Javar one where I have not changed much, no longer sleep. That is why I was also inclined to ask you to check on your side. I shall let you know if I find something as well.

Status changed to Awaiting Railway Response Railway • 26 days ago

javardevops

PROOP

a month ago

Hi Railway team,

I tested further and I do not think the 5-minute outbound port-80 traffic is coming from my Python/FastAPI application process. At least, I cannot see any evidence of it.

I wrapped the backend Gunicorn/FastAPI process with:

strace -f -tt -s 300 -e trace=network

The trace successfully captured expected DNS and Postgres traffic, so tracing was active. However, when the Railway Network Flow Logs showed the recurring outbound port-80 traffic to AWS eu-west-1 IPs, there was no corresponding connect() or network syscall from the traced Gunicorn/FastAPI process.
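One way to check a saved trace for port-80 connections is to scan it for connect() calls. The sketch below assumes the strace output was written to a file and uses strace's usual formatting for IPv4 socket addresses; the function name and the sample lines (PIDs, times, addresses) are hypothetical, and IPv6 connect() calls are not handled.

```python
import re

# Matches IPv4 connect() calls as strace prints them, for example:
# connect(7, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("203.0.113.10")}, 16) = 0
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, '
    r'sin_port=htons\((?P<port>\d+)\), '
    r'sin_addr=inet_addr\("(?P<addr>[\d.]+)"\)'
)
# With -tt, each line carries a wall-clock time such as 17:13:02.123456
TIME_RE = re.compile(r'(?P<time>\d{2}:\d{2}:\d{2})\.\d+')


def connects_to_port(lines, port=80):
    """Return (time, address) pairs for connect() calls to the given port."""
    hits = []
    for line in lines:
        match = CONNECT_RE.search(line)
        if match and int(match.group("port")) == port:
            stamp = TIME_RE.search(line)
            hits.append((stamp.group("time") if stamp else None, match.group("addr")))
    return hits


# Hypothetical trace lines: one Postgres connection and one DNS lookup, no port 80.
sample = [
    '[pid 42] 17:12:58.104233 connect(5, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("10.0.0.5")}, 16) = 0',
    '[pid 42] 17:13:01.550912 connect(6, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.0.2")}, 16) = 0',
]
print(connects_to_port(sample))  # []
```

An empty result only covers the traced process tree: strace -f follows the wrapped process and its children, so any other process running in the same container would not appear in the trace.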

More importantly, I am seeing the same 5-minute pattern on a separate React frontend service as well:

service private IP:random_port -> AWS IP:80

385 B out / 588 B back

every 5 minutes

That frontend service is separate from the backend and does not use FastAPI, SQLAlchemy, Brevo, or the Python dependency stack. So the same traffic pattern appearing on both backend and frontend makes it unlikely to be caused by a Python SDK/library.

Can you please confirm what process/source Railway is attributing these flows to? If Railway believes this originates from my application process, please provide the process/PID/command or more detailed attribution which can assist further in our investigations.

Thanks again in advance.

Railway

BOT

a month ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open Railway • 26 days ago

rhuanbarreto

HOBBY

25 days ago

The cause is most likely visible in the log screenshots you sent. Scanners randomly probe the internet for common vulnerabilities. This traffic is real, although it is driven by attackers trying to exploit something in your service, and it keeps your service awake. What you need to do is put your service behind a proxy like Cloudflare and define rules to block any requests that don't look legitimate. Most of those requests don't have good User-Agent headers. Read the HTTP logs and check the request headers to find the best pattern for blocking those requests. Once Cloudflare blocks them, your service will no longer receive this kind of request and will be able to sleep.

javardevops

PROOP

25 days ago

Greetings,

Thanks for your reply. The challenge is that there are no HTTP logs at all at the times of those network flow logs, so I can't check anything unless Railway can check and confirm.

rhuanbarreto

HOBBY

24 days ago

If you put your service behind Cloudflare, you will have better observability of those requests. Or you can implement a logger yourself in your service. You could also put your service behind an nginx proxy, which may help with logging, but you will still pay for the requests.

javardevops

PROOP

22 days ago

Greetings, I don't think even Cloudflare can detect anything when there is no HTTP log of something trying to hit the service. I have some services using a Railway-provided domain and others with a custom domain behind the Cloudflare proxy, but the behavior is the same: a network flow log entry appears every 5 minutes with no corresponding HTTP log in Railway and no event in Cloudflare at that moment. @Railway, could you please provide the flow attribution for those events, or any further information that would aid the investigation? Thank you all.

javardevops

PROOP

17 days ago

Greetings,

Kindly find below an update on this issue:

The behavior appears intermittent and does not look like a deterministic application-level job.

Several services that previously showed the recurring 5-minute Network Flow pattern were able to sleep over the weekend without any code/configuration change from my side. Today, the same services are again not sleeping and the 5-minute Network Flow pattern is back.

This affects services on both custom domains and Railway-provided domains, so Cloudflare/custom-domain traffic cannot fully explain it. For the 5-minute Network Flow events, I also checked HTTP logs at the exact timestamps and there were no matching HTTP requests.
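The timestamp comparison described above can be automated once both logs are exported. The following is a minimal sketch under assumptions: the function name, the 30-second tolerance, and the sample timestamps are all hypothetical, and the export format of the logs is not covered.

```python
from datetime import datetime, timedelta


def flows_without_http(flow_times, http_times, tolerance=timedelta(seconds=30)):
    """Return the flow timestamps that have no HTTP request within +/- tolerance."""
    return [
        flow for flow in flow_times
        if not any(abs(flow - http) <= tolerance for http in http_times)
    ]


# Hypothetical timestamps for illustration only (date is arbitrary).
day = datetime(2024, 1, 1)
flows = [day.replace(hour=17, minute=13), day.replace(hour=17, minute=18)]
http_requests = [
    day.replace(hour=16, minute=40),
    day.replace(hour=17, minute=18, second=10),
]

unmatched = flows_without_http(flows, http_requests)
print([t.strftime("%H:%M") for t in unmatched])  # ['17:13']
```

Flows that remain unmatched at any reasonable tolerance are the ones that cannot be explained by inbound HTTP requests.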

On the backend, I traced the Gunicorn/FastAPI process with strace. The trace captured normal DNS/Postgres syscalls, but when Railway Network Flow Logs showed the 5-minute port-80 flow, there was no matching network syscall from the traced app process.

Can you please check whether Railway is intermittently attributing edge/proxy/platform traffic to the service’s outbound flows, or provide the exact source process/PID/command for the 5-minute flow? The intermittent sleep behavior makes it unlikely to be a fixed app-level timer in my code.
