Outbound networking failure - all external connections lost (March 15, 07:50-08:27 UTC)

Question

We experienced a complete outbound networking failure on our Railway services on March 15, 2026, from approximately 07:50 to 08:27 UTC. Our Primary service (1 replica) and Worker service (20 replicas) simultaneously lost the ability to reach two independent external services in two different AWS regions:

- Supabase PostgreSQL (eu-central-1, Frankfurt) via connection pooler on port 6543

- Redis Cloud (eu-west-1, Ireland) on port 13326

Connections to both failed with `read ETIMEDOUT` and `connect ETIMEDOUT` errors. We confirmed that both external services were healthy during this period via their respective dashboards and logs: Supabase showed normal CPU (25%) and RAM (35%), and Postgres was accepting internal connections throughout. Redis Cloud reported no incidents.
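For anyone trying to tell a network-level failure apart from a service-level one, a plain TCP probe run from inside the affected container can help. The following is only a minimal sketch: `probe_tcp` is a hypothetical helper name, and the hostnames in the usage note are placeholders, not our real endpoints.

```python
import socket
import time


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> tuple[bool, str]:
    """Attempt a single TCP connection and report (reachable, detail)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - start) * 1000
            return True, f"connected in {elapsed_ms:.0f} ms"
    except socket.timeout:
        # Roughly the counterpart of Node's `connect ETIMEDOUT`.
        return False, f"timed out after {timeout:.1f} s"
    except OSError as exc:
        # Covers DNS failures, refused connections, unreachable networks, etc.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling, say, `probe_tcp("db.example.invalid", 6543)` and `probe_tcp("redis.example.invalid", 13326)` with the real hostnames substituted would show whether the TCP handshake itself completes. If both probes time out while the providers report healthy, that points toward the network path rather than the services.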

After our workers crash-looped for ~37 minutes due to the connection failures, Railway's automatic restart mechanism stopped retrying (crash limit exceeded). Our services remained in a failed state from 08:27 until we manually restarted them at 13:58 UTC, for a total of ~6 hours of downtime.

Upon manual restart at 13:58, all connections succeeded immediately, so the networking issue had resolved on its own by then, likely hours earlier.
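One mitigation on the application side, independent of Railway's restart policy, is to retry the connection inside the process with capped exponential backoff instead of exiting on the first failure. The sketch below is a generic illustration under that assumption; `backoff_delays` and `connect_with_retry` are hypothetical names, and this is not how n8n or Railway actually behave.

```python
import random
import time
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, cap: float = 60.0, attempts: int = 8) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def connect_with_retry(
    connect: Callable[[], T],
    *,
    base: float = 1.0,
    cap: float = 60.0,
    attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts are exhausted."""
    last_error: Optional[BaseException] = None
    for delay in backoff_delays(base, cap, attempts):
        try:
            return connect()
        except OSError as exc:  # includes TimeoutError and ConnectionError
            last_error = exc
            # Up to 10% jitter so many replicas do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

With the defaults, the delays are 1, 2, 4, 8, 16, 32, 60, and 60 seconds, so a process would keep trying for roughly three minutes before giving up. Covering an outage of the length described here would need a larger `attempts` value or a higher `cap`.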

Questions:

1. Was there a networking incident affecting outbound connections from Railway around 07:50-08:27 UTC on March 15?

2. Were other customers affected?

3. What is Railway's crash loop restart limit, and is there a way to configure it to keep retrying indefinitely or for a longer period?

4. Is there a way to receive alerts when Railway stops restarting a service due to crash loop detection?

Project: n8n production deployment

Services affected: Primary (1 replica), Worker (20 replicas)