a month ago
Hi Railway team,
My web service deploys have been failing since today's private networking/traffic outage (incident started ~12:24 PM). The builds complete successfully but deploys hang indefinitely at "Starting Container" with no further output.
Project details:
Region: us-east4
Service: Web (Rails app using Dockerfile.web)
Database: Railway-managed PostgreSQL
What's happening:
The container starts and runs bin/rails db:migrate via the Docker entrypoint
The migration step attempts to connect to Postgres and the TCP connection hangs forever — no timeout error, no output, just silence
The build step completes fine (9 seconds cached, 331 seconds uncached)
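For anyone hitting the same silent hang, here is a minimal sketch of a fail-fast reachability check that could run before `bin/rails db:migrate` in the entrypoint. It only tests whether a TCP connection to the database host opens within a few seconds; it does not authenticate or run a query. The helper name `db_reachable?` and the 5-second default are assumptions for illustration, not part of the app or of Railway.

```ruby
require "socket"
require "uri"

# Hypothetical helper (not from the original app): returns true if a TCP
# connection to the host in database_url opens within `timeout` seconds,
# and false if it is refused, unresolvable, or times out.
def db_reachable?(database_url, timeout: 5)
  uri = URI.parse(database_url)
  host = uri.hostname       # hostname strips the [] around IPv6 literals
  port = uri.port || 5432   # fall back to the Postgres default port
  # Socket.tcp closes the socket when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end
```

In an entrypoint this might be called as `exit 1 unless db_reachable?(ENV.fetch("DATABASE_URL"))`, so a broken network path shows up as a fast, explicit deploy failure instead of an indefinite "Starting Container".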
What I've tried:
Restarted the Postgres service (shows healthy/online)
Switched DATABASE_URL to the private internal URL (fd12:...) — hangs
Switched DATABASE_URL to ${{database.DATABASE_PUBLIC_URL}} ([redacted].proxy.rlwy.net:[port]) — also hangs
Multiple redeploys after each change
Waited 10+ minutes per deploy attempt
Error from an earlier deploy (when it did eventually time out on the private URL):
ActiveRecord::ConnectionNotEstablished: connection to server at "fd12:xxxx:xxxx:...", port 5432 failed: Connection timed out
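The long wait before that error suggests the connection was relying on the operating system's TCP timeout. One possible mitigation, sketched below, is to add libpq's `connect_timeout` parameter to the connection URL so the attempt gives up after a bounded number of seconds. This assumes the Rails PostgreSQL adapter forwards URL query parameters to libpq, which is my understanding but is worth verifying for your Rails version. The helper name `with_connect_timeout` and the 10-second default are hypothetical.

```ruby
require "uri"

# Hypothetical helper: returns database_url with a connect_timeout query
# parameter set, preserving any other parameters already present.
def with_connect_timeout(database_url, seconds: 10)
  uri = URI.parse(database_url)
  params = URI.decode_www_form(uri.query || "")
  # Drop any existing connect_timeout so the new value replaces it.
  params.reject! { |key, _| key == "connect_timeout" }
  params << ["connect_timeout", seconds.to_s]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end
```

This does not repair the underlying networking problem; it only makes the failure surface sooner and with a clearer error.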
Timeline:
App was last deployed successfully ~3 months ago and was running fine
No code changes — this started during today's networking outage
Status page shows the incident as resolved, but connectivity is still broken for my project on both private and public endpoints
It appears the networking outage recovery didn't fully restore connectivity for my project/region. Could someone look into this? Happy to provide project ID or any other details via DM.
Thanks!
5 Replies
a month ago
Update 2: Attempted to create a new web service as a workaround. Received error:
> Error updating private network - automatic repair triggered
This confirms private networking is still broken for my project despite the incident being marked resolved. The automatic repair has not succeeded — deploys remain stuck.
a month ago
Update 3: Created a brand new web service as a workaround. Build fails immediately with:
> Cache mount ID is not prefixed with cache key
This is a Railway build infrastructure error on Docker BuildKit cache mounts. Combined with the "Error updating private network - automatic repair triggered" message when creating the service, it appears my project's infrastructure (networking + build cache) is still degraded from the earlier outage.
Project region: us-east4
This is blocking all deployments. Any help expediting the repair would be appreciated.
a month ago
Hi there, I'm sorry you had to attempt all those workarounds. I'm seeing that the services are up and green now. The web service is currently sleeping, and I was able to access it at its URL (it was sleeping at first).
Are you still running into any issues?
Status changed to Awaiting User Response Railway • about 1 month ago
a month ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 26 days ago
16 days ago
Experiencing a similar issue right now — deploys failing with 'no permission to execute start command' on worker and pre-deploy failure on web. Commands work fine inside running containers. Started ~18 hours ago, no code changes to Dockerfiles or permissions. Could this be related?
https://station.railway.com/questions/deploy-failing-we-don-t-have-permissi-48db9af1
Status changed to Awaiting Railway Response Railway • 16 days ago
Status changed to Solved brody • 15 days ago
