3 months ago
Title. Here's a reproduction Dockerfile:
FROM alpine:latest
RUN echo "test" > /coolfile
RUN echo "exist" > /awesomefile
RUN rm -f /coolfile

Deploy this image on Railway (from ghcr.io; this may not occur when it's a Dockerfile deployed on the platform), and observe through SSH that both coolfile and awesomefile exist on the service. /coolfile should not exist because it was deleted in a later layer.
Above Dockerfile is pushed to [ghcr.io/6ixfalls/railway-test:latest](ghcr.io/6ixfalls/railway-test:latest)
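For context on what `rm -f /coolfile` produces: deleting a file in a later layer doesn't rewrite earlier layers, it adds a `.wh.coolfile` whiteout marker that the runtime's extractor has to honor when flattening the layers. A toy model of that flattening (illustrative only, flat paths, not Railway's actual code):

```python
def flatten(lower, layer, honor_whiteouts):
    """Merge one layer onto a lower filesystem, modeled as {path: bytes}."""
    fs = dict(lower)
    for name, data in layer.items():
        if name.startswith(".wh."):
            if honor_whiteouts:
                fs.pop(name[len(".wh."):], None)  # whiteout deletes the lower file
            # the .wh. marker itself is never materialized either way
        else:
            fs[name] = data
    return fs

# Layers produced by the Dockerfile above (contents abbreviated):
layer1 = {"coolfile": b"test\n"}      # RUN echo "test" > /coolfile
layer2 = {"awesomefile": b"exist\n"}  # RUN echo "exist" > /awesomefile
layer3 = {".wh.coolfile": b""}        # RUN rm -f /coolfile -> whiteout marker

fs = {}
for layer in (layer1, layer2, layer3):
    fs = flatten(fs, layer, honor_whiteouts=True)
print(sorted(fs))  # ['awesomefile'] -- the expected container filesystem

fs = {}
for layer in (layer1, layer2, layer3):
    fs = flatten(fs, layer, honor_whiteouts=False)
print(sorted(fs))  # ['awesomefile', 'coolfile'] -- what railway ssh shows
```

An extractor that skips (or extracts) the `.wh.` entry instead of treating it as a delete marker produces exactly the second result.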
36 Replies
3 months ago
I noticed this with rabbitmq because the management image deletes a file from the regular image. The file was not deleted.
3 months ago
We use standard BuildKit as far as I know. If you can reproduce this locally with BuildKit, you would need to open a bug report with BuildKit itself.
3 months ago
The issue isn't with the image building, it's with the runtime
3 months ago
it's deleted properly in the layer, but not when the container is run
3 months ago
i presume you're unpacking the layers one by one to create the "final" container fs (or whatever runtime you use does this)
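If the layers are unpacked one by one, the extractor has to honor OCI whiteout entries: per the OCI image spec, a layer entry named `.wh.<name>` deletes `<name>` from the lower layers, and `.wh..wh..opq` marks a directory as opaque (hiding all lower-layer contents). A minimal sketch of whiteout-aware layer application, illustrative only and not Railway's actual code:

```python
import os
import shutil
import tarfile

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"

def apply_layer(rootfs, layer_tar):
    """Unpack one layer tarball onto rootfs, honoring OCI whiteouts."""
    with tarfile.open(layer_tar) as tar:
        if hasattr(tarfile, "data_filter"):
            tar.extraction_filter = tarfile.data_filter  # safe extraction, Py >= 3.12
        for member in tar.getmembers():
            base = os.path.basename(member.name)
            parent = os.path.join(rootfs, os.path.dirname(member.name))
            if base == OPAQUE_MARKER:
                # Opaque whiteout: everything in this directory from lower
                # layers is hidden.
                if os.path.isdir(parent):
                    for entry in os.listdir(parent):
                        path = os.path.join(parent, entry)
                        if os.path.isdir(path) and not os.path.islink(path):
                            shutil.rmtree(path)
                        else:
                            os.remove(path)
            elif base.startswith(WHITEOUT_PREFIX):
                # Plain whiteout: delete the named path from lower layers.
                # The .wh. marker itself is never materialized.
                target = os.path.join(parent, base[len(WHITEOUT_PREFIX):])
                if os.path.isdir(target) and not os.path.islink(target):
                    shutil.rmtree(target)
                elif os.path.lexists(target):
                    os.remove(target)
            else:
                tar.extract(member, rootfs)
```

The reported behavior is consistent with the `elif` branch being missing, so whiteout markers are either dropped or extracted as ordinary files instead of deleting their targets.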
3 months ago
Podman. Can you reproduce this with podman locally?
3 months ago
not reproducible with podman
3 months ago
5.7.0
3 months ago
could this be looked into? this is extremely easy to reproduce and has led to issues with the rabbitmq template (and may also be a source of confusion for other users).
3 months ago
you can deploy a service with the image ghcr.io/6ixfalls/railway-test:latest and a start command sleep infinity - after using railway ssh, you can see that both files exist even though one was deleted
3 months ago
this issue does not exist on any other container runtime
3 months ago
I will ticket this once I'm back from winter break, but full disclosure: we would need further reports in order to look into it, and it would also need to be at least reproducible with the metal builder.
3 months ago
the builder isn't the issue though, it's the runtime. however, the same behavior likely exists with the metal builder because the final image is the same
3 months ago
The runtime is stock podman.
3 months ago
hi, i just wanted to report that i'm seeing this same exact behavior and it's pretty frustrating! i'm new to railway and am a big fan. i can't say when this started happening, but i noticed it only today. here's an example Dockerfile that illustrates the behavior i'm seeing:
# Minimal reproduction case for Railway overlay filesystem bug
# This demonstrates that files deleted with rm -rf still appear in the running container
FROM alpine:latest
# The base alpine image comes with /etc/apk directory
# Let's try to delete it in the same RUN command where we do other setup
RUN set -eux; \
rm -rf /etc/apk; \
ls -la /etc/ || true
# Create test files that we'll try to delete
RUN echo "test" > /should-be-deleted.txt && \
mkdir -p /should-be-deleted-dir && \
echo "test" > /should-be-deleted-dir/file.txt
# Now try to delete them in a separate layer
RUN rm -rf /should-be-deleted.txt /should-be-deleted-dir
# Expected result: /should-be-deleted.txt and /should-be-deleted-dir should NOT exist
# Expected result: /etc/apk should NOT exist
# Actual result on Railway: They still exist!
# To test this bug, build and deploy to Railway.
# Then `railway ssh` and check:
# ls -la /should-be-deleted.txt (should give "No such file")
# ls -la /should-be-deleted-dir (should give "No such file")
# ls -la /etc/apk (should give "No such file")
CMD ["sh", "-c", "echo 'Checking if deleted files exist:'; ls -la /should-be-deleted.txt 2>&1; ls -la /should-be-deleted-dir 2>&1; ls -la /etc/apk 2>&1; echo 'If you see the files above, the overlay FS is broken'; echo 'Container staying alive for inspection...'; tail -f /dev/null"]
3 months ago
what version? could this potentially be a bug in whichever version you're using?
3 months ago
I can't disclose the version.
3 months ago
https://github.com/6ixfalls/railway-image-layers
same issue occurs with the metal builder; deployed this image and coolfile exists when it shouldn't
3 months ago
Have you tried with podman locally?
2 months ago
mhm, not reproducible on this version
2 months ago
still seeing this, still affecting the rabbitmq template
2 months ago
I'm sorry, but we would need further reports in order for us to be able to prioritize looking into this.
2 months ago
still happening here too. it's easily reproducible, so it would be great if you all could possibly take a look at this! thanks!
2 months ago
Please also share your reproducible example.
2 months ago
Hi brody! It is available earlier on in this thread (here: https://station.railway.com/questions/image-layers-with-deleted-files-are-not-d7d808ee#qw2g)
brody
Thank you, I'll ticket this.
2 months ago
Excellent! Thanks, brody!
2 months ago
Linking a related topic, https://station.railway.com/questions/files-that-never-existed-on-the-image-ge-5f629342, where this also reproduces. This is super frustrating, as it's completely non-reproducible on any other platform that runs containers, but is reproducible on the Railway runtime.
2 months ago
This also applies to mv.
With:
FROM alpine:latest
RUN CACHE_BUST=2
RUN echo "test" > test.txt
RUN ls -la . && stat test.txt
RUN mv test.txt test2.txt
RUN ls -la . && stat test2.txt || echo "gone"
ENTRYPOINT ["sleep", "infinity"]

We get:
PS C:\Development\railway-rm-investigaton> railway ssh
✓ Connected to interactive shell
/ # ls
bin etc lib mnt proc run srv test.txt tmp var
dev home media opt root sbin sys test2.txt usr
/ #
2 months ago
So far it seems the whiteout files are being ignored by the tooling that extracts the image layers on Railway, i.e.:
FROM debian:bookworm
RUN touch /repro-file
RUN rm -f /repro-file
CMD sleep infinity

docker build -t whiteout-repro .
docker run --rm -it whiteout-repro ls -l /repro-file
ls: cannot access '/repro-file': No such file or directory
docker save whiteout-repro -o repro.tar
mkdir repro && tar -xf repro.tar -C repro
jq -r '.manifests[0].digest' repro/index.json
sha256:5f5246ed027b82fb1bbe00bd016e3b15dec5b5b4d6e0dc218e1af61a05220c5a
jq . repro/blobs/sha256/0cfaa8515320c857097036846c1c0e92846e2bc5452eebccb346fd25ffa38e2c
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:d596c3844275eb4764b9c23547e1723eb4dfe841781b2227356524842cde00ba",
"size": 1071
},
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:1029f5ddc0d24726f1cefbb8def7a88f8ec819a1fdc4c05ce523011b4b73c72d",
"size": 48366072
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:e11c9e95aab1c1fd8e0353abb5a14d36ce3a7d85d3c368834554e20a77836f73",
"size": 98
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:9f67ea6e4b03671446cf49a43744772dbcc605cdd4c9777895a798bd8b346833",
"size": 80
}
]
}
tar -tf repro/blobs/sha256/9f67ea6e4b03671446cf49a43744772dbcc605cdd4c9777895a798bd8b346833 | grep '\.wh\.'
.wh.repro-file
So the .wh.repro-file whiteout entry is present in the final layer, but when the image is deployed to Railway the whiteout is ignored, and repro-file can be found in the container FS.
AFAIK this was not happening earlier; at least a few months ago this was not the case. So if any upgrades or changes were made to the container tooling at Railway, that should be the first thing to investigate for this bug.
It would be great for this to be escalated with high priority, as it can easily break many images deployed to Railway.
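For anyone who wants to script the same check, here's a small Python equivalent of the tar -tf | grep '\.wh\.' step, listing the whiteout markers in a layer blob (tarfile auto-detects gzip compression; the blob path is just an example):

```python
import tarfile

def whiteout_entries(layer_tar):
    """Return the names of OCI whiteout markers (.wh.*) in a layer tarball."""
    with tarfile.open(layer_tar) as tar:  # mode "r" transparently handles .tar+gzip
        return [m.name for m in tar.getmembers()
                if m.name.rsplit("/", 1)[-1].startswith(".wh.")]

# e.g. whiteout_entries("repro/blobs/sha256/9f67ea6e...") -> ['.wh.repro-file']
```

Running this over every layer listed in the manifest gives the full set of deletions a correct runtime must apply.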
2 months ago
Hello!
We've escalated your issue to our engineering team.
We aim to provide an update within 1 business day.
Please reply to this thread if you have any questions!
Status changed to Awaiting User Response Railway • about 2 months ago
a month ago
any update on this? reproduction steps still show this bug
a month ago
looks like the railway bot messages weren't forwarded
a month ago
I am also looking for any update on this
Status changed to Awaiting Railway Response Railway • about 1 month ago
16 days ago
Seems like this is still an issue. For reference, we were deploying Six's RabbitMQ template, and it caused some issues with missing analytics due to this problem.

