Possible Railway platform bug: container DNS resolver contains host machine's nameserver instead of Railway's
ronnieduke
HOBBY | OP

4 hours ago

Running a Node.js (Bull/ioredis) service that needs to talk to a Redis service in the same project. Started seeing constant getaddrinfo ENOTFOUND errors for both the private endpoint (redis.railway.internal) AND the public proxy (maglev.proxy.rlwy.net).

After investigation: the container's /etc/resolv.conf contains nameserver 192.168.1.1 — which is my local Mac's home router (not anything Railway-related). The container literally can't resolve any hostname because its DNS resolver points to a network that isn't routable from inside Railway's infrastructure.

What we tried:

  1. .dockerignore to exclude etc/resolv.conf from the build context — didn't help. The bad nameserver is still in the container at runtime.
  2. preDeployCommand: 'rm -f /etc/resolv.conf' — the file gets regenerated with the same bad value after the command runs.
  3. Custom start command to override at boot: sh -c 'echo nameserver 8.8.8.8 > /etc/resolv.conf && npm run start' — set successfully (confirmed in build logs), but Railpack appears to ignore it at runtime and runs npm run start directly. Diagnostic output from the override never appears in deploy logs.
  4. Hardcoding the public proxy IP+port into the Redis URL — works as a temporary unstick but the IP/port seem to rotate, breaking the connection again within ~24h.

Questions:

  • Is Railway injecting host DNS into containers a known issue? If so, what's the supported workaround?
  • Is the Railpack-overrides-custom-start-command behavior expected, or a bug?
  • Is the public-proxy IP supposed to be stable, or do we need to expect IP rotation?

Stack: Railway Railpack v0.23.0, Node 22, ioredis (private networking enabled in project settings).

$10 Bounty

1 Reply

Status changed to Open Railway about 4 hours ago


sheeki03
FREE | Top 10% Contributor

2 hours ago

I would split this into two separate problems before changing more runtime settings.

First, the redis.railway.internal failure can be caused by the Redis client path, even when the private network itself is fine. With ioredis and BullMQ, make sure the Redis lookup is dual-stack instead of IPv4-only.

For direct ioredis usage:

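import Redis from "ioredis";

// REDIS_URL is expected to be the Railway reference variable for the Redis service.
const redisUrl = new URL(process.env.REDIS_URL);
// family=0 asks for a dual-stack lookup (IPv4 or IPv6) instead of IPv4 only.
redisUrl.searchParams.set("family", "0");

const redis = new Redis(redisUrl.toString());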

For BullMQ:

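import { Queue } from "bullmq";

// Parse the Railway-provided URL rather than hand-writing host and port.
const redisUrl = new URL(process.env.REDIS_URL);

const queue = new Queue("jobs", {
  connection: {
    family: 0, // dual-stack lookup: accept IPv4 or IPv6 answers
    host: redisUrl.hostname,
    port: Number(redisUrl.port) || 6379,
    // URL keeps credentials percent-encoded, so decode them before use.
    username: redisUrl.username ? decodeURIComponent(redisUrl.username) : undefined,
    password: redisUrl.password ? decodeURIComponent(redisUrl.password) : undefined,
  },
});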

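That matters because private networking can involve IPv6, and ioredis defaults to family: 4, which means an IPv4-only (A record) lookup. If the internal hostname only has an AAAA record, that lookup fails with ENOTFOUND even though the private network is fine. Setting family to 0 lets the lookup accept either address family.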
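The original post mentions Bull rather than BullMQ, so here is a minimal sketch of the same idea as a plain helper that either library could consume. It assumes classic Bull accepts ioredis options under its `redis` key; `buildRedisOptions` is a hypothetical name, not something from Bull or Railway.

```javascript
// Hypothetical helper: turn a Redis URL into ioredis-style connection options
// with a dual-stack DNS lookup (family: 0).
function buildRedisOptions(redisUrlString) {
  const url = new URL(redisUrlString);
  return {
    family: 0, // accept IPv4 or IPv6 answers
    host: url.hostname,
    port: Number(url.port) || 6379, // fall back to the default Redis port
    // URL keeps credentials percent-encoded, so decode them before use.
    username: url.username ? decodeURIComponent(url.username) : undefined,
    password: url.password ? decodeURIComponent(url.password) : undefined,
  };
}

// Assumed usage with classic Bull (not executed here):
//   const Queue = require("bull");
//   const queue = new Queue("jobs", { redis: buildRedisOptions(process.env.REDIS_URL) });
```

Keeping the parsing in one function means the same options object can be passed to a direct ioredis client as well, so the queue and any other Redis consumer cannot drift apart on the family setting.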

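Second, nameserver 192.168.1.1 inside the deployed container is not something I would try to fix with .dockerignore or preDeployCommand. .dockerignore only filters the build context, while /etc/resolv.conf is written by the container runtime when the container starts. preDeployCommand runs before the final app process and, as far as I understand, not in the same container, so deleting the file there does not carry over. If the deployed runtime really has your home-router address as its resolver, that is evidence to hand to Railway.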
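To make that evidence easier to read in deploy logs, a small sketch that parses resolv.conf text and flags nameservers in the RFC 1918 private IPv4 ranges. The function names are hypothetical, and a private-range resolver is not proof of a bug by itself (platforms can legitimately use internal resolvers), so treat the flag only as something worth including in a report.

```javascript
const fs = require("node:fs");
const net = require("node:net");

// Extract the addresses from "nameserver <addr>" lines; comments are skipped.
function parseNameservers(text) {
  return text
    .split("\n")
    .map((line) => /^\s*nameserver\s+(\S+)/.exec(line))
    .filter(Boolean)
    .map((match) => match[1]);
}

// True for 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.
function isPrivateIPv4(ip) {
  if (!net.isIPv4(ip)) return false;
  const [a, b] = ip.split(".").map(Number);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

// Read a resolv.conf file and report each nameserver with a private-range flag.
function describeResolvConf(path = "/etc/resolv.conf") {
  return parseNameservers(fs.readFileSync(path, "utf8")).map((address) => ({
    address,
    privateIPv4: isPrivateIPv4(address),
  }));
}
```

Calling `console.log(describeResolvConf())` from a diagnostic script at boot would print one entry per configured resolver.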

I would add one temporary diagnostic start command and redeploy once:

{
  "$schema": "https://railway.com/railway.schema.json",
  "deploy": {
    "startCommand": "/bin/sh -c 'echo RESOLV_CONF_START; cat /etc/resolv.conf; echo DNS_TEST_START; node scripts/dns-check.cjs; exec npm run start'"
  }
}

Then add this small script:

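// scripts/dns-check.cjs
// Temporary diagnostic: resolve the two failing hostnames and log the result.
const dns = require("node:dns");

const hosts = [
  "redis.railway.internal",
  "maglev.proxy.rlwy.net",
];

for (const host of hosts) {
  // dns.lookup goes through the OS resolver (getaddrinfo), the same path that
  // raised ENOTFOUND in the app; all: true returns every A and AAAA result.
  dns.lookup(host, { all: true }, (err, addresses) => {
    console.log(host, err ? err.code || err.message : addresses);
  });
}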

Check the deployment details after redeploy. If Railway shows that startCommand came from railway.json or railway.toml but the RESOLV_CONF_START log never appears, then the start-command override is the bug to report. If the log appears and still shows 192.168.1.1, then the runtime resolver is the bug to report.
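If the resolver looks sane but a hostname still fails, it can help to see whether the failure is specific to one address family. A rough sketch, assuming Node's built-in `dns.promises.lookup`; `probeFamilies` is a hypothetical name, and the second parameter exists only so the lookup can be swapped out when testing.

```javascript
const dns = require("node:dns");

// Resolve one host three ways: IPv4 only (4), IPv6 only (6), and dual-stack (0).
// Returns an object keyed by family, holding either addresses or an error code.
async function probeFamilies(host, lookup = (h, o) => dns.promises.lookup(h, o)) {
  const result = {};
  for (const family of [4, 6, 0]) {
    try {
      const addresses = await lookup(host, { family, all: true });
      result[family] = addresses.map((entry) => entry.address);
    } catch (err) {
      result[family] = err.code || err.message;
    }
  }
  return result;
}
```

A result where family 4 reports ENOTFOUND while families 6 and 0 return addresses would point at the IPv4-only client setting rather than at the resolver. If all three fail, the resolver itself is the more likely culprit.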

For the public proxy, do not hardcode the resolved IP. Use the Railway-provided proxy hostname and port, or the generated public Redis URL if Railway gives you one. The proxy hostname is the stable interface. The underlying IPs can change, and hardcoding them will keep breaking.
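One way to keep a hardcoded IP from creeping back in is a startup check on the configured URL. This is only a sketch: `usesIpLiteral` is a hypothetical helper, and the address and port in the usage note are made-up placeholders.

```javascript
const net = require("node:net");

// True when the URL's host is a literal IPv4 or IPv6 address instead of a hostname.
function usesIpLiteral(redisUrlString) {
  const { hostname } = new URL(redisUrlString);
  // URL wraps IPv6 literals in brackets; strip them before checking.
  const bare = hostname.replace(/^\[|\]$/g, "");
  return net.isIP(bare) !== 0;
}
```

For example, `usesIpLiteral("redis://default:pw@203.0.113.10:12345")` is true, while a URL using maglev.proxy.rlwy.net as the host is false, so the app could log a warning at boot whenever the check returns true.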

The order I would use is:

  1. Use the Redis service reference variable, preferably REDIS_URL, rather than a hand-written hostname.
  2. Add family=0 for ioredis or family: 0 for BullMQ.
  3. Confirm the app and Redis service are in the same Railway project and environment.
  4. Add the temporary DNS diagnostic start command above.
  5. If the deployed runtime still prints nameserver 192.168.1.1, open a Railway support ticket with that deploy ID and the DNS diagnostic logs.

The temporary fix should be family=0 plus the generated Railway Redis URL. The long-term fix, if 192.168.1.1 is confirmed inside the deployed runtime, needs Railway to look at why that resolver is being written into the container.

