ETIMEDOUT connecting to graph.facebook.com from Asia Southeast region
tonytej
PROOP

a month ago

Issue Description

Our Node.js application deployed on Railway (Asia Southeast region: asia-southeast1-eqsg3a) is experiencing constant ETIMEDOUT errors when attempting to connect to Meta's WhatsApp Cloud API at graph.facebook.com. The same code works perfectly in local development and was working in production until recently.

Impact

  • Severity: High - Core application functionality broken

  • Frequency: Constant failures (100% error rate currently)

  • Affected Service: Backend API service

  • Region: railway/asia-southeast1-eqsg3a (visible in response headers: X-Railway-Edge: railway/asia-southeast1-eqsg3a)

Environment Details

  • Platform: Railway

  • Region: Asia Southeast 1

  • Runtime: Node.js 22.21.0

  • Deployment: Successful build, fails at runtime

  • Target Host: graph.facebook.com (Meta/Facebook Graph API)

  • Target URL: https://graph.facebook.com/v21.0/{business-account-id}/message_templates

Error Details

Error Code: ETIMEDOUT

Error Type: TypeError: fetch failed with AggregateError cause

Description: TCP connection timeout when attempting to reach Meta's API servers

Complete Error Log

{
  "timestamp": "2025-10-25 14:27:06",
  "error": "Error fetching templates fetch failed",
  "name": "TypeError",
  "cause": {
    "code": "ETIMEDOUT"
  },
  "causeName": "AggregateError",
  "causeCode": "ETIMEDOUT",
  "url": "https://graph.facebook.com/v21.0/123123123/message_templates",
  "businessAccountId": "123123123",
  "hasAccessToken": true,
  "stack": "TypeError: fetch failed\n    at node:internal/deps/undici/undici:14900:13\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async Object.listTemplates (file:///app/server/dist/lib/message-provider.js:157:32)\n    at async MessageService.listTemplates (file:///app/server/dist/services/message.service.js:508:27)\n    at async file:///app/server/dist/routes/messages.js:146:27"
}

Deployment Logs (Successful Build)

2025-10-25T10:35:47.349825076Z [inf]  
> @event-management/server@1.0.0 build /app/server
> tsc --build --force && tsc-alias

2025-10-25T10:35:54.396484170Z [inf]  copy /app/node_modules
2025-10-25T10:35:59.291071711Z [inf]  Build time: 134.90 seconds

Build completes successfully. The application starts without errors.

Runtime Logs (Connection Failures)

2025-10-25 14:06:18:618 [error]: Error occurred:
{
  "error": "fetch failed",
  "stack": "Error: fetch failed\n    at Object.listTemplates (file:///app/server/dist/lib/message-provider.js:232:23)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at async MessageService.listTemplates (file:///app/server/dist/services/message.service.js:508:27)\n    at async file:///app/server/dist/routes/messages.js:146:27",
  "path": "/messages/templates",
  "method": "GET",
  "user": "1321321323"
}

2025-10-25 14:27:06:276 [error]: Error fetching templates fetch failed
{
  "name": "TypeError",
  "cause": {
    "code": "ETIMEDOUT"
  },
  "causeName": "AggregateError",
  "causeCode": "ETIMEDOUT",
  "url": "https://graph.facebook.com/v21.0/123123123/message_templates",
  "businessAccountId": "123123123",
  "hasAccessToken": true
}
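
For reference, the `cause` in the log above is an `AggregateError`, and only its top-level code is recorded. An `AggregateError` carries an `errors` array, which for connection failures normally holds one entry per address that was tried. A minimal sketch of unpacking it, so the log shows which addresses (IPv4 or IPv6) actually timed out. `describeFetchError` is a hypothetical helper name, not part of the application above:

```javascript
// Hypothetical helper: flattens a failed fetch() error into a loggable object.
// Assumes the shape seen in the logs: TypeError -> cause (AggregateError or a single Error).
function describeFetchError(err) {
  const cause = err && err.cause;
  // AggregateError exposes the individual connection attempts in `errors`.
  const attempts = cause && Array.isArray(cause.errors)
    ? cause.errors
    : cause ? [cause] : [];
  return {
    message: err.message,
    causeName: cause ? cause.name : undefined,
    causeCode: cause ? cause.code : undefined,
    attempts: attempts.map((e) => ({
      code: e.code,       // e.g. ETIMEDOUT, ENETUNREACH
      syscall: e.syscall, // typically "connect"
      address: e.address, // the IP that was tried
      port: e.port,
    })),
  };
}

// Usage inside an existing catch block (logger is whatever the app already uses):
//   catch (err) { logger.error('Error fetching templates', describeFetchError(err)); }
```

If every entry in `attempts` is an IPv6 address, or the IPv4 entries fail after only a fraction of a second, that points at address-family selection rather than at Meta rejecting the traffic.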

What We've Ruled Out

Environment variables: Confirmed all credentials are correctly configured

Code issues: Same code works in local development (returns 200 OK)

Authentication: hasAccessToken: true - credentials are present

Rate limiting: No 429 errors, well within Meta's API limits

Application timeout: Using 30-second AbortController timeout, but TCP connection times out before that
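
To illustrate the timeout point: the AbortController deadline covers the whole request, while a connect-level failure surfaces on its own as `TypeError: fetch failed` with a `cause`. A sketch, assuming a wrapper roughly like the one below. `fetchWithTimeout` and its injectable `fetchImpl` parameter are hypothetical, not the application's actual code:

```javascript
// Hypothetical wrapper: aborts the request if it has not completed within timeoutMs.
// fetchImpl defaults to Node's global fetch and is injectable only to make testing easy.
async function fetchWithTimeout(url, timeoutMs = 30_000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the request has settled.
    clearTimeout(timer);
  }
}
```

With this shape, hitting the 30-second deadline rejects with an error named `AbortError`, whereas the failures in the logs reject earlier with `TypeError: fetch failed` and `cause.code === 'ETIMEDOUT'`, which is how the two cases can be told apart.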

Network Connectivity Test Results

  • Local Development (macOS):

    • fetch('https://graph.facebook.com/...') → 200 OK

    • Response time: ~300ms

    • Works consistently

  • Railway Production (asia-southeast1):

    • fetch('https://graph.facebook.com/...') → ETIMEDOUT

    • Connection times out before establishing TCP connection

    • Fails consistently (was intermittent, now constant)

Timeline

  • Previously: Application worked correctly in production

  • Recently: Started experiencing intermittent failures

  • Currently: 100% failure rate - all requests timeout

Expected Behavior

Railway containers should be able to establish TCP connections to graph.facebook.com (Meta's public API) within reasonable timeout periods (~5-10 seconds).

Request for Support

Could you please investigate:

  1. Network routing from asia-southeast1 region to Meta's API servers

  2. DNS resolution for graph.facebook.com from Railway containers

  3. Firewall rules that might be blocking connections to Meta/Facebook IP ranges

  4. Regional network issues specific to asia-southeast1-eqsg3a

Could this be a Railway infrastructure/network issue as the same application code works in other environments?

$10 Bounty

7 Replies

Railway
BOT

a month ago

Hey there! We've found the following might help you get unblocked faster:

If you find the answer from one of these, please let us know by solving the thread!


Hey, looking at these logs, this is probably a Railway network issue if I'm not mistaken (someone else should look at this, though). Your logs show the connection timing out (ETIMEDOUT) before the TCP handshake completes, rather than a DNS or application error, so the packets apparently aren't reaching Meta's IPs.

One potential code-side workaround: in Node 22+, start your app with NODE_OPTIONS=--dns-result-order=ipv4first so that IPv4 addresses are tried first, and see whether that fixes it. You can also use Undici's Agent as the HTTP/1.1 client.
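
The same preference can also be set in code at startup instead of through NODE_OPTIONS. The sketch below uses `dns.setDefaultResultOrder` and `net.setDefaultAutoSelectFamilyAttemptTimeout`, both part of Node's standard library. Recent Node versions try a host's IPv4 and IPv6 addresses in turn with a short per-attempt timeout (250 ms by default, as far as I know) and report an AggregateError when every attempt fails, which resembles the error in the logs. Whether that is really the cause here is an assumption. `preferIPv4` and the 1000 ms value are illustrative choices:

```javascript
const dns = require('node:dns');
const net = require('node:net');

// Hypothetical startup helper: call once, before the first outbound request.
function preferIPv4({ attemptTimeoutMs = 1000 } = {}) {
  // Same effect as NODE_OPTIONS=--dns-result-order=ipv4first.
  dns.setDefaultResultOrder('ipv4first');
  // Give each address more time before Node moves on to the next one.
  net.setDefaultAutoSelectFamilyAttemptTimeout(attemptTimeoutMs);
  return {
    resultOrder: dns.getDefaultResultOrder(),
    attemptTimeoutMs: net.getDefaultAutoSelectFamilyAttemptTimeout(),
  };
}

// Usage at the top of the server entry point:
//   console.log('network defaults', preferIPv4());
```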


fra
HOBBY

a month ago

Can you try connecting to the container via SSH and running curl against graph.facebook.com? In theory, if you can't reach the endpoint from the container, the issue is on the Railway network (I suppose).

Also, can you double-check that your IP is whitelisted in the FB API?
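
If curl isn't available in the container, a similar check can be done from Node itself. This is a rough sketch that resolves the host and attempts a plain TCP connection to each returned address, so IPv4 and IPv6 results show up separately. `probeAddress` and `probeHost` are hypothetical names, and the 5-second timeout is an arbitrary choice:

```javascript
const dns = require('node:dns').promises;
const net = require('node:net');

// Attempt a TCP connection to one IP address and report how it went.
function probeAddress(address, port, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const started = Date.now();
    const socket = net.connect({ host: address, port });
    const finish = (ok, error) => {
      clearTimeout(timer);
      socket.destroy();
      resolve({ address, ok, error, ms: Date.now() - started });
    };
    const timer = setTimeout(() => finish(false, 'probe timeout'), timeoutMs);
    socket.once('connect', () => finish(true, null));
    socket.once('error', (err) => finish(false, err.code || err.message));
  });
}

// Resolve every address for the host (A and AAAA records) and probe each one.
async function probeHost(hostname, port = 443) {
  const records = await dns.lookup(hostname, { all: true });
  return Promise.all(records.map((r) => probeAddress(r.address, port)));
}

// Usage, run inside the deployed container:
//   probeHost('graph.facebook.com').then(console.table);
```

If the IPv4 rows connect and the IPv6 rows time out (or the reverse), the problem is specific to one address family rather than to the region as a whole.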


tonytej
PROOP

a month ago

I'm still experiencing this issue intermittently. I can confirm that the same application (same repo, same branch) deployed on other platforms is not running into this issue. I've added the IP address to the Meta App IP whitelist, but it still fails.


passos
MODERATOR

a month ago

I've seen several reports online regarding specific IP address issues. I would assume that Facebook implements some type of undocumented rate limit per IP. Are you able to switch your application's regions to another location to confirm this issue?


tonytej
PROOP

a month ago

I switched the application's region to another location and ran it there for about an hour, making the Graph API requests described above, without seeing the 500 error. When I changed it back to the Singapore region, I sometimes got the 500 errors again. However, curling graph.facebook.com from railway shell always succeeds. I am now trying `NODE_OPTIONS=--dns-result-order=ipv4first` to see if it helps.


passos
MODERATOR

a month ago

Remember that the railway shell command runs locally on your computer and not on Railway. Facebook is implementing rate limits by IP addresses, and unfortunately, Railway cannot do much about this situation.

If you wish to address this issue, you could potentially route that traffic through a proxy by using a service like Svix or HTTP proxies.

