Data loss due to unexpected redeployments during infrastructure incident (May 20)
moonpoong
PRO · OP

a month ago

Subject: Data loss due to unexpected redeployments during infrastructure incident (May 20)

Project: franchise-academy-1

Project ID: beb42eee-f68e-4f5b-9ee3-7326819a0914

Service: app (edgekorean.up.railway.app)

Issue:

Our service had its last intentional deployment on April 22 (deployment 58d83d47). No code was pushed between April 22 and May 20.

On May 20, three unexpected deployments were triggered WITHOUT any git push:

  • a74d933a at 18:15:05 KST (09:15 UTC)
  • dfd9ef11 at 18:16:58 KST (09:16 UTC)
  • 5154aa16 at 18:38:49 KST (09:38 UTC)

The first code push that day was at 18:55 KST (09:55 UTC), AFTER these three deployments.

These unexpected redeployments coincide with Railway's build queue incident reported on May 20 (12:16 UTC).

Our application stores data in a JSON file (db.json). A volume was mounted at /data, but due to a Git Bash path conversion bug, the PERSIST_DIR environment variable was incorrectly set to "C:/Program Files/Git/data" instead of "/data". As a result, the app was writing to the container filesystem instead of the volume.

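A misconfiguration of this kind can be caught at startup rather than discovered after a redeploy. The following is a minimal Python sketch, not the application's actual code: the function name validate_persist_dir and the default mount path are assumptions for illustration, and the app's real language and layout are not stated in this thread. It rejects values that look like Windows paths (such as the mangled one above) or that fall outside the expected volume mount.

```python
import re
from pathlib import PurePosixPath

# A Windows drive prefix such as "C:/" or "C:\" should never appear in a
# path that is meant to be used inside a Linux container.
_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:[\\/]")


def validate_persist_dir(value: str, expected_mount: str = "/data") -> str:
    """Return the path as a string if it lies at or below expected_mount.

    Raises ValueError otherwise, so a bad value stops the app at startup
    instead of silently redirecting writes to ephemeral storage.
    """
    if _WINDOWS_DRIVE.match(value):
        raise ValueError(f"PERSIST_DIR looks like a Windows path: {value!r}")

    path = PurePosixPath(value)
    if not path.is_absolute():
        raise ValueError(f"PERSIST_DIR must be an absolute path: {value!r}")
    if ".." in path.parts:
        raise ValueError(f"PERSIST_DIR must not contain '..': {value!r}")

    mount = PurePosixPath(expected_mount)
    if path != mount and mount not in path.parents:
        raise ValueError(f"PERSIST_DIR {value!r} is outside {expected_mount!r}")
    return str(path)


# Hypothetical usage at startup:
#   import os
#   persist_dir = validate_persist_dir(os.environ.get("PERSIST_DIR", ""))
```

Failing fast here is a design choice: an app that refuses to start with a bad PERSIST_DIR is noisier than one that quietly writes somewhere else, but the error surfaces on the first deploy instead of a month later.
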
The container that had been running since April 22 held approximately one month of production data (students, lectures, timetables, assignments, exam results). When the unexpected redeployments replaced the container, all data was lost.

Request:

Is there any possibility of recovering the filesystem contents from the old container/deployment (58d83d47-da7f-48ef-9c16-c165ef165c17, which ran from April 22 to May 20)? Specifically, we are looking for the file at /app/data/db.json or any path containing db.json.

We have since fixed the PERSIST_DIR configuration and added automatic GitHub backups to prevent future data loss.
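
The backup setup itself is not described in this thread. As a rough illustration of one building block, here is a Python sketch of a local timestamped snapshot with pruning. The function name snapshot, the file naming scheme, and the keep parameter are all assumptions; pushing the copies to GitHub or any other off-container location is not shown, and backup_dir would itself need to live on durable storage for this to help.

```python
import os
import shutil
import time
from pathlib import Path


def snapshot(db_path: Path, backup_dir: Path, keep: int = 10) -> Path:
    """Copy db_path into backup_dir under a timestamped name.

    Only the newest `keep` snapshots are retained.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Zero-padded nanosecond timestamp: unique enough for a sketch, and
    # lexicographic order matches chronological order.
    stamp = f"{time.time_ns():020d}"
    target = backup_dir / f"{db_path.stem}-{stamp}{db_path.suffix}"

    # Copy to a temporary name first, then rename, so a crash mid-copy
    # never leaves a truncated file under the final name.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copy2(db_path, tmp)
    os.replace(tmp, target)

    # Prune everything except the newest `keep` snapshots.
    snapshots = sorted(backup_dir.glob(f"{db_path.stem}-*{db_path.suffix}"))
    for old in snapshots[:-keep]:
        old.unlink()
    return target
```
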


Railway
BOT

a month ago

Unfortunately, we are not able to recover filesystem contents from previous containers. The container filesystem is ephemeral, and once a deployment is replaced, the old container and its filesystem are permanently removed with no recovery mechanism. This applies regardless of whether the redeployment was user-initiated or platform-initiated. The redeployments on May 20 were likely triggered as part of workload recovery during the infrastructure incident that day, which is consistent with the timeline you described. Since your application was writing to the container filesystem rather than the mounted volume due to the path misconfiguration, that data was never persisted to durable storage. We're sorry for the impact, and glad to hear you've already corrected the volume path and added backup measures going forward.
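
For readers who want to verify where a data file actually lives, the distinction between the container filesystem and a mounted volume can be checked from inside the app. This is a minimal Python sketch under two assumptions: that the volume appears as its own mount point inside the container (as with the /data mount described above), and that anything whose nearest mount point is the root directory is on ephemeral storage. The function names backing_mount and is_on_volume are hypothetical.

```python
import os
from typing import Callable


def backing_mount(path: str, is_mount: Callable[[str], bool] = os.path.ismount) -> str:
    """Walk up from path and return the nearest ancestor that is a mount point.

    Symlinks are not resolved; the path is only made absolute.
    """
    current = os.path.abspath(path)
    while True:
        if is_mount(current):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding another mount.
            return current
        current = parent


def is_on_volume(path: str, is_mount: Callable[[str], bool] = os.path.ismount) -> bool:
    """Return True if path sits under a mount point other than the root."""
    return backing_mount(path, is_mount) != os.path.abspath(os.sep)


# Hypothetical usage at startup:
#   if not is_on_volume("/data/db.json"):
#       raise RuntimeError("db.json is not on a mounted volume")
```

The is_mount parameter exists only so the logic can be exercised without real mounts; in normal use the default os.path.ismount is what performs the check.
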


Status changed to Awaiting User Response Railway about 1 month ago


Railway
BOT

a month ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway about 1 month ago

