25 days ago
Bug: Snapshot cache serves stale source code despite correct Git state, NO_CACHE=1, and [build.args] CACHE_BUST
Service: bevitel-backend (Dockerfile build, Python 3.11-slim, FastAPI)
Region: US West (California)
---
What is happening
After committing source code changes to GitHub, Railway deploys the new commit but continues serving a container built from a previous snapshot. The deployed container runs old source code despite the correct code being in Git and confirmed via git show HEAD:<file>.
This affects source code changes, not just pip/dependency changes. Even after requirements.txt is correct and the build logs show google-genai installing, a subsequent source file change (internal_routes.py) is not reflected in the running container.
Steps to reproduce
1. Commit a change to a Python source file (e.g. fix a wrong attribute reference on one line)
2. Push to master — Railway auto-deploys
3. Confirm via git show HEAD:<file> | grep <changed_line> that the correct code is committed
4. Check Railway deploy logs — the build runs, [builder 5/5] COPY . /app executes
5. Check runtime logs — the old code is still running (AttributeError referencing the old attribute name)
6. The running container traceback points to the old line content, not the committed line
What we have tried
- NO_CACHE=1 in Railway Variables — does not invalidate the snapshot cache
- CACHE_BUST=2 in Railway Variables tab — runtime env var, never reaches Dockerfile ARG
- ARG CACHE_BUST=1 in Dockerfile before COPY requirements.txt — correct Docker mechanism but inert without [build.args] in railway.toml
- [build.args] CACHE_BUST = "2" in railway.toml — successfully busted the pip install layer and resolved the dependency issue, but the COPY . /app layer (source code) is still being served from a cached snapshot on subsequent deploys
- Deleting and recreating the service — forces a cold build, works once, but the problem recurs on the next source code change
Confirmed working state
The build logs show [builder 5/5] COPY . /app executing on every deploy. The correct source file is in Git. The healthcheck passes. But the running container traceback shows the old source code.
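To make the "Git is correct, container is old" comparison exact rather than traceback-based, the file inside the container can be hashed the same way Git hashes a blob and compared against the output of git rev-parse HEAD:<file>. This is a minimal sketch, assuming a SHA-1 repository and no line-ending conversion on checkout; the helper names are hypothetical and not part of the service.

```python
import hashlib
from pathlib import Path


def git_blob_sha1(data: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw file bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def file_blob_sha1(path: str) -> str:
    """Hash a file on disk the way Git hashes an unfiltered blob."""
    return git_blob_sha1(Path(path).read_bytes())
```

Calling file_blob_sha1 on the deployed copy of internal_routes.py (for example from a temporary debug log line at startup) gives a value that should equal git rev-parse HEAD:<file> if the container really holds the committed file.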
Environment
```toml
# railway.toml
[build]
builder = "DOCKERFILE"
[build.args]
CACHE_BUST = "2"
[deploy]
healthcheckPath = "/health"
healthcheckTimeout = 100
restartPolicyType = "ON_FAILURE"
restartPolicyMaxRetries = 3
```
```dockerfile
# Dockerfile (builder stage excerpt)
ARG CACHE_BUST=1
COPY requirements.txt .
RUN pip install --no-cache-dir --prefer-binary -r requirements.txt
COPY . /app
```
Question
Is Railway's snapshot cache keying on something other than the Docker layer hash for the COPY . /app step? Is there a supported mechanism to guarantee that source code changes always reach the running container without deleting and recreating the service each time?
Project ID and Service ID available on request via Private Thread.
1 Reply
Status changed to Open Railway • 25 days ago
11 days ago
I don't think this is normal Docker layer behavior. If COPY . /app really receives a changed build context, Docker should invalidate that layer. So I would separate two possibilities: Railway is building from a stale snapshot/context, or the running process is not the image you think it is.
A practical way to prove it is to stamp the image with Railway's deployment metadata and print it at runtime.
Railway provides RAILWAY_GIT_COMMIT_SHA, RAILWAY_SNAPSHOT_ID, and RAILWAY_DEPLOYMENT_ID to builds/deployments: https://docs.railway.com/reference/variables#railway-provided-variables
For Dockerfile builds, Railway says build-time variables must be declared with ARG in the Dockerfile: https://docs.railway.com/reference/dockerfiles#using-variables-at-build-time
Try replacing the existing COPY . /app step with something like this:
```dockerfile
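# Build-time variables are only visible to the Dockerfile when declared as ARGs.
ARG RAILWAY_GIT_COMMIT_SHA
ARG RAILWAY_SNAPSHOT_ID
ARG RAILWAY_DEPLOYMENT_ID
COPY . /app
WORKDIR /app
# Record which commit/snapshot/deployment produced this image, then print the
# copied source with line numbers so the build log shows what COPY received.
# Adjust the path if internal_routes.py is not at the repository root.
RUN printf 'commit=%s\nsnapshot=%s\ndeployment=%s\n' \
"$RAILWAY_GIT_COMMIT_SHA" "$RAILWAY_SNAPSHOT_ID" "$RAILWAY_DEPLOYMENT_ID" \
> /app/.railway-build-stamp \
&& grep -n "" /app/internal_routes.py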
```
Then log/read /app/.railway-build-stamp on startup. If the stamp shows the new commit but grep sees the old source, that is strong evidence of a Railway snapshot/context bug. If the stamp is old at runtime, the issue is deployment routing/old container overlap rather than Docker caching.
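Reading the stamp at startup could look roughly like the sketch below. It assumes the stamp file has the key=value layout written by the printf above; the function names and the logger name are hypothetical, and the FastAPI wiring is left out so the snippet stays stdlib-only.

```python
import logging
from pathlib import Path

# Path written by the RUN step in the Dockerfile snippet above.
STAMP_PATH = "/app/.railway-build-stamp"

logger = logging.getLogger("build_stamp")  # hypothetical logger name


def parse_build_stamp(text: str) -> dict[str, str]:
    """Parse 'key=value' lines into a dict, skipping lines without '='."""
    stamp: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            stamp[key.strip()] = value.strip()
    return stamp


def read_build_stamp(path: str = STAMP_PATH) -> dict[str, str]:
    """Return the parsed stamp, or an empty dict if the file is missing."""
    try:
        return parse_build_stamp(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def log_build_stamp(path: str = STAMP_PATH) -> dict[str, str]:
    """Log the stamp once so it shows up in the runtime logs."""
    stamp = read_build_stamp(path)
    logger.info("build stamp: %s", stamp or "missing")
    return stamp
```

Calling log_build_stamp() once when the app starts puts the commit, snapshot, and deployment IDs next to any later traceback in the runtime logs, which makes the old-versus-new comparison unambiguous.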
As a workaround, declare ARG RAILWAY_GIT_COMMIT_SHA ahead of a small verification RUN step placed after COPY, so that step's cache key changes with every commit; don't rely on CACHE_BUST set as a runtime variable, since it never reaches the build. Also check .dockerignore / .railwayignore for rules that might exclude the changed file, because that would leave the Docker build context unchanged even when Git changed.
If the stamp proves the new commit was built but the active container still runs old code, I would include the stamp values plus the deployment ID in a private Railway thread; that gives the team exactly the IDs needed to inspect the snapshot/deployment handoff.