25 days ago
Project: enthusiastic-recreation (09c6d690-8a69-43e7-a031-c9608e059782)
Service: secretweaponlegaldomination (ed735bb8-df26-4353-a68d-a087da3521df)
Environment: production
Every deploy since 2026-04-26 17:39 EDT has failed with:
failed to compute cache: failed to calculate checksum of ref X::Y: '/frontend': not found
Affected deployments (all FAILED):
67572430-796d-4cab-8061-a1aa2a0e5605 (railway up direct upload, 07:22)
e8bc2793-d3ed-4297-8ee7-d2210a4d7054 (07:14)
f2e4b751-a8ba-434a-9bd9-bc0ac19f20f1 (06:44)
50d74d63, 636ffbac, 2a982323 (all 06:40)
Plus 10+ more from 2026-04-26
Tried:
- ARG CACHEBUST in Dockerfile to invalidate layer cache — failed
- railway up direct upload to bypass GitHub — failed
- Same Dockerfile builds successfully in GitHub Actions CI
Companion incident on backend service: similar BuildKit/snapshot pattern, blocking deploys for ~24h.
Currently running prod is from a prior successful deploy. Login + auth + DB all healthy on the live container, but new commits cannot ship.
Please diagnose / clear cache snapshot for both services.
19 Replies
25 days ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open Railway • 25 days ago
25 days ago
Try setting NO_CACHE=1 in your variables.
https://docs.railway.com/builds/build-configuration#disable-build-layer-caching
25 days ago
Confirmed NO_CACHE=1 set on the service. New deploy attempts still fail with the same error:
failed to calculate checksum of ref pol000iwt5hoho9ztu7jhcshs::0smis5ts4w4p5l5v25ohpb8s9: "/frontend": not found
Failed deploys after setting NO_CACHE=1:
- e41dfd1f-83fa-479d-9d8d-1462c3b4d545 (08:04 EDT)
- f2d6e0ad-69bb-4f15-8c6f-12ffd669878f (07:43 EDT)
Same pol000iwt5hoho9ztu7jhcshs cache ref prefix as the original failures. Looks like the corrupted snapshot persists despite NO_CACHE — could you clear it server-side?
25 days ago
Additional data point — same BuildKit error now also failing on a different service in this project:
Service: backend (beta environment)
Deploy: 54a8ff81 (08:18 EDT)
Error: failed to calculate checksum of ref 6yozesq3fv9p4yc7pskzitqhz::z764quaq3bpgknm404hi6yyc3: "backend/src": not found
This is project-wide, not service-specific. Both secretweaponlegaldomination and backend services have corrupted cache snapshots in project enthusiastic-recreation (09c6d690-8a69-43e7-a031-c9608e059782).
25 days ago
Disabled Metal builder per your suggestion. Now hitting a different infrastructure error — runc itself is failing to spawn containers:
runc run failed: unable to start container process: can't copy bootstrap data to pipe: write init-p: ...
process "/bin/sh -c addgroup -g 1001 -S nodejs && adduser -S nextjs -u 1001" did not complete successfully
Latest deploy: 06dd907b-983e-41b6-a2bf-68a7dcd38a32 (09:39 EDT)
The build now progresses further (gets past the BuildKit cache layer that was failing before), but runc fails when spawning the runner stage. Earlier deploy 4d01ede9 hit a similar runc EOF error on a simpler RUN step. Looks like the container runtime itself is broken on the build host — not a Dockerfile issue. Same Dockerfile builds clean in GitHub Actions CI.
25 days ago
Tried two more redeploys back to back per your suggestion — both failed.
Deploy IDs:
- 06dd907b (your reference)
- 49f17aca-1fa7-4d23-b612-a3acec2b6a8e (15:55 UTC)
- 84ebafd6-d335-48a4-80e7-352d50762a4d (15:56 UTC)
That's three failures across the same backend service in production env on project 09c6d690. Earlier deploys at 14:57 across production/beta/demo envs also failed. CLI build-log retrieval isn't returning anything for me — can you pull the bad build host(s) out of rotation and confirm when done?
25 days ago
Quick update — tried two more redeploys on the backend service per the suggestion to land on a different build host:
- 49f17aca-1fa7-4d23-b612-a3acec2b6a8e (15:55 UTC)
- 84ebafd6-d335-48a4-80e7-352d50762a4d (15:56 UTC)
Both FAILED. Plus 3 earlier failures at 14:57 UTC across production/beta/demo envs (731b8267, ac4e00ad, 7e3bfe3c). That's 5 backend failures since the Metal builder switch.
Build logs aren't returning anything via railway logs --service ... --deployment ... so I can't confirm what error these are hitting. Could you pull the bad build host(s) out of rotation for project 09c6d690 and confirm? Both frontend service ed735bb8 and backend service 225c11fa are blocked.
25 days ago
The issue is that folders are not being found, as shown in your logs: '/frontend': not found and "backend/src": not found. This means your COPY or ADD commands are looking for paths (/frontend, backend/src) that don't exist in the specified build context.
Things to check that can contribute to this issue:
1. Check your root dir in your service settings.
2. Check your .dockerignore file.
3. Check absolute/relative paths.
4. Check if you are using COPY --from=... correctly.
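To make checks 1 and 3 concrete, here is a minimal Python sketch that lists the COPY/ADD sources in a Dockerfile that do not exist under a given build context directory. It is only an approximation of what the builder does: the function name `missing_copy_sources` is made up for illustration, and the sketch assumes shell-form instructions on a single line (no JSON-array form, no line continuations, no wildcards, no .dockerignore handling).

```python
from pathlib import Path


def missing_copy_sources(dockerfile_text: str, context_dir: str) -> list[str]:
    """Return COPY/ADD sources that do not exist under context_dir.

    Simplified on purpose: shell-form, single-line instructions only.
    """
    context = Path(context_dir)
    missing = []
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].upper() not in ("COPY", "ADD"):
            continue
        flags = [p for p in parts[1:] if p.startswith("--")]
        if any(f.startswith("--from=") for f in flags):
            continue  # copies from another stage or image, not from the context
        args = [p for p in parts[1:] if not p.startswith("--")]
        for src in args[:-1]:  # the last argument is the destination
            if src.startswith(("http://", "https://")):
                continue  # ADD can fetch URLs; those are not context paths
            # A leading slash is still interpreted relative to the context root.
            rel = src.lstrip("/")
            if not (context / rel).exists():
                missing.append(src)
    return missing
```

Running it once with the repo root as the context and once with the service's configured root directory should show quickly whether the Dockerfile and the context belong together.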
25 days ago
Thanks darseen. The "/frontend not found" message is BuildKit referencing a path in its corrupted cache snapshot, not an actual Dockerfile/path issue — same Dockerfile builds clean in GitHub Actions, and Railway staff already confirmed the bad-host hypothesis earlier in the thread (after Metal disable cleared the cache corruption, deploys then hit runc init-p errors which point to a broken build host, not the build context).
Still need a Railway staff member to pull the bad host(s) for project 09c6d690. Frontend (ed735bb8) and backend (225c11fa) both blocked.
24 days ago
Bump — still blocked at 24h+. Project 09c6d690 (frontend ed735bb8 + backend 225c11fa) cannot accept any deploys. Need a Railway staff member to identify and pull the bad build host(s) confirmed earlier in the thread. This is preventing all production updates to a live SaaS. Requesting urgent action or escalation path.
24 days ago
I've cordoned the problematic builder. Please try building again.
Status changed to Awaiting User Response Railway • 24 days ago
24 days ago
Cordon incomplete — still failing. Two more failed deploys after the cordon:
- 2e634707 at 2026-04-28 11:07 AM EDT — failed on backend/requirements.txt: not found (cache ref tzi0xs8jbdgo4qx1a2h1lgsxu::6gkpp151kh9xad0zy127vc4z9)
- 42a66416 at 2026-04-28 11:10 AM EDT — failed on backend/src: not found (cache ref tzi0xs8jbdgo4qx1a2h1lgsxu::4wkuozlb8jpbpstmq8xleiwke)
Same builder failure pattern. Either the cordoned host is still receiving builds or there's another bad host in the pool. Project 09c6d690, service 225c11fa. Please pull all builders for this project until verified clean.
Status changed to Awaiting Railway Response Railway • 24 days ago
24 days ago
You have been using multiple different builders, so it is not a cache corruption problem.
From what I see, your Dockerfile is most likely broken: it sets WORKDIR, then copies from frontend (which works), but then also from /frontend, which doesn't exist.
Try removing the leading /.
Status changed to Awaiting User Response Railway • 24 days ago
24 days ago
Pushback — checked all 4 Dockerfiles, none use leading-slash COPY. Backend uses:
COPY backend/requirements.txt .
COPY backend/src/ ./src/
Frontend (root Dockerfile referenced by frontend/railway.toml):
COPY frontend/package.json frontend/package-lock.json ./
COPY frontend/prisma ./prisma
COPY frontend/ .
All paths repo-relative, no leading /. Same Dockerfiles build clean in GitHub Actions, only fail on Railway. The failure log says "failed to compute cache < failed to calculate checksum of ref tzi0xs8jbdgo4qx1a2h1lgsxu::..." — that's a BuildKit cache state error, not a path-resolution error.
What file/line are you seeing the /frontend reference in? Possibly looking at a stale snapshot? Happy to share the live Dockerfile directly. Project 09c6d690.
Status changed to Awaiting Railway Response Railway • 24 days ago
24 days ago
I was checking your latest build. The error says:
```
Build Failed: build daemon returned an error < failed to solve: failed to compute cache key: failed to calculate checksum of ref b04xnptxlphmk4lsu4elsog92::su4856h8lmisxotpdxyqz82ge: "/frontend": not found >
```
It basically says that /frontend is not found, so it can't calculate its checksum to even check for a cache hit/miss.
Status changed to Awaiting User Response Railway • 24 days ago
24 days ago
I see the daemon error says "/frontend": not found, but none of our 4 Dockerfiles use that path. Both candidates:
1. ./Dockerfile (root) — uses COPY frontend/package.json, COPY frontend/. (no leading slash; expects build context = repo root)
2. ./frontend/Dockerfile — uses COPY package.json, COPY . (no "frontend/" prefix at all; expects build context = frontend/)
Neither has /frontend. The only way the daemon would resolve a request to "/frontend" is if Railway is using the wrong combination, e.g., the root Dockerfile with build context = frontend/ subdir. COPY frontend/. would then look for a frontend directory inside the context, which does not exist there (the context root already is frontend/), and BuildKit would report the missing path relative to the context root as "/frontend".
Can you confirm what the frontend service is configured with for:
- Source / Root Directory
- Dockerfile path
frontend/railway.toml says dockerfilePath = "Dockerfile" with build context = frontend/. If Railway's UI overrides that to use ./Dockerfile (root) with frontend/ as context, that explains the /frontend error.
Project 09c6d690.
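To illustrate the mismatch hypothesis, here is a small Python sketch that pairs each candidate Dockerfile with each candidate build context and reports the first COPY source that would be missing. The file list is a simplified, partly hypothetical stand-in for the repo (for example, `frontend/prisma/schema.prisma` is assumed), and the helper names are made up. It models only "does the source exist relative to the context root", not BuildKit's actual implementation.

```python
from typing import Optional

# Simplified repo layout; entries beyond the paths quoted above are assumptions.
REPO_FILES = {
    "Dockerfile",
    "frontend/Dockerfile",
    "frontend/package.json",
    "frontend/package-lock.json",
    "frontend/prisma/schema.prisma",
}

# COPY sources quoted above for each candidate Dockerfile.
COPY_SOURCES = {
    "Dockerfile": ["frontend/package.json", "frontend/prisma", "frontend/"],
    "frontend/Dockerfile": ["package.json", "."],
}


def files_in_context(context: str) -> set:
    """Paths visible to the build, relative to the context root ('' = repo root)."""
    if not context:
        return set(REPO_FILES)
    prefix = context.rstrip("/") + "/"
    return {f[len(prefix):] for f in REPO_FILES if f.startswith(prefix)}


def first_missing(dockerfile: str, context: str) -> Optional[str]:
    """Return the first COPY source absent from the context, or None if all exist."""
    visible = files_in_context(context)
    for src in COPY_SOURCES[dockerfile]:
        target = src.strip("/")
        if target in ("", "."):
            continue  # the context root itself always exists
        found = any(f == target or f.startswith(target + "/") for f in visible)
        if not found:
            return src
    return None


if __name__ == "__main__":
    for dockerfile in COPY_SOURCES:
        for context in ("", "frontend"):
            label = context or "<repo root>"
            print(dockerfile, "with context", label, "->", first_missing(dockerfile, context))
```

In this model only the matched pairs (root Dockerfile with repo root, frontend/Dockerfile with frontend/) come back clean; the root Dockerfile with frontend/ as context fails on its first frontend/... source, which is the combination suspected here.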
Status changed to Awaiting Railway Response Railway • 24 days ago
24 days ago
Hm,
So according to the logs, the builder tries to use the Dockerfile from the repo root. Can you try renaming or removing the Dockerfile in the repo root? If that solves the issue, we can look at how to fix it on our side.
24 days ago
Already done — renamed root Dockerfile to Dockerfile.legacy in commit 4b4482f93 about 4 hours ago. Production frontend is now deploying cleanly.
Worth flagging on your side: the frontend service has Source Repo = frontend/ and Dockerfile Path = Dockerfile in railway.toml, which should resolve to frontend/Dockerfile. But BuildKit was reaching up to the repo root and picking the legacy Dockerfile there. Either the explicit dockerfilePath in railway.toml is being ignored, or the build context is actually the repo root despite the Source setting. The renamed file is a workaround — feels like the real fix is on Railway's resolution logic.
Happy to close this thread once you've had a chance to look. Project 09c6d690.
23 days ago
Glad it worked.
Yes, we found a bug and are currently rolling the fix out to production.
Status changed to Solved Railway • 23 days ago