2 months ago
Issue: Service is offline. Builds succeed, but every deploy fails at the container pull/unpack
step with a registry timeout. 9 consecutive failures, including a rollback.
Project ID: 97b5665c-37a4-4b87-9939-ec46f408921e
Service ID: 805b27f5-146d-4c82-b92b-7784b0d1c607
Environment: production (59987721-1e2d-4028-b4fc-4c9891303051)
Error (same on all 9):
Container failed to start
/orchestrator.RouterLegacyService/CreateDeployment DEADLINE_EXCEEDED:
ctrd: failed to pull/unpack image: failed to resolve reference
"production-us-west2.railway-registry.com/...":
dial tcp 162.220.232.122:443: i/o timeout
Key findings:
- Last successful deploy: 4e766d4c at 06:52 UTC today. Failures started at 20:18 UTC, with no
code or config changes between the two.
- The original outage began when the running container became unreachable via the public URL,
before any redeploy. The internal /health endpoint still returned 200 when checked over SSH.
- production-us-west2.railway-registry.com responds to external requests, but deploy nodes can't
reach it, which points to an internal network partition.
- Tried changing the deploy region to us-west1 via the API, but builds still push to the
production-us-west2 registry and fail the same way.
- Tried us-east4 — got configErrors: "User does not have access to region us-east4"
- Status page shows no incident for March 29.
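For anyone who wants to reproduce the external reachability check, here is a minimal sketch of a TCP probe with a timeout. The function name `check_tcp` and the 5 second default are my own choices for illustration, not anything from Railway. It only tests whether a TCP connection to port 443 opens; it does not perform a TLS handshake or an actual image pull.

```python
import socket
import time


def check_tcp(host: str, port: int = 443, timeout: float = 5.0) -> tuple[bool, str]:
    """Try to open a TCP connection and report whether it succeeded.

    Hypothetical helper for illustration only: it checks TCP reachability,
    not TLS or registry authentication.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects within `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            return True, f"connected in {elapsed:.2f}s"
    except socket.timeout:
        # Comparable to the "dial tcp ...: i/o timeout" seen in the deploy error.
        return False, f"timed out after {timeout:.1f}s"
    except OSError as exc:
        # Covers refused connections, DNS failures, unreachable networks, etc.
        return False, f"connection failed: {exc}"
```

Calling `check_tcp("production-us-west2.railway-registry.com")` from a machine outside Railway should return `True` if the registry is reachable externally, as observed above. Note that running it from inside the service container would not necessarily reflect what the deploy nodes see, since they may sit on a different network path.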
This is a production outage. Appreciate any help.
1 Reply
Status changed to Awaiting Railway Response Railway • about 2 months ago
2 months ago
This appears to be the same issue as [link to that thread]: the same DEADLINE_EXCEEDED / dial tcp i/o timeout on registry pull, with a different registry hostname. That one was resolved server-side by @brody.
Status changed to Open Railway • about 2 months ago