MongoDB crashes immediately after redeploy due to Linux kernel 6.19 incompatibility

cryptobrief

HOBBY OP

a month ago

After the May 19 Railway outage, my MongoDB service crashes immediately after redeploy.

Logs show:

"MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB."

The service was previously working before the outage/recovery.

Please check whether the workload was moved onto an incompatible host/kernel during recovery and advise the safest recovery path.

Important:

Please do NOT wipe or recreate the attached volume.
Backend/frontend are still deployed.
Backend logs previously showed intermittent successful Mongo auth followed by connectivity failures.

I only used Restart/Redeploy during the Railway recovery process after your status page advised redeploying unhealthy workloads.

Solved · $10 Bounty

6 Replies

Status changed to Open Railway • about 1 month ago

futsy

HOBBY

a month ago

This is almost certainly not volume corruption.

That exact startup message matches the known incompatibility between MongoDB 8.x and Linux kernel 6.19+ involving the vendored TCMalloc. Railway’s MongoDB template currently uses the official mongo:8.0 image with /data/db as the attached data volume, so after the incident/redeploy the MongoDB container likely landed on a host/kernel combination where that image cannot start.
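To confirm which kernel the container actually landed on, something like the following could be run inside the service. This is a minimal sketch: the helper names `kernel_at_least` and `current_kernel_affected` are hypothetical, and the 6.19 threshold is taken only from the error message quoted in this thread.

```python
import platform
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a kernel release string such as '6.19.2-generic'
    is at or above the given major.minor version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= (major, minor)


def current_kernel_affected() -> bool:
    """Check the running host kernel against the 6.19 threshold
    named in the MongoDB startup error (assumed threshold)."""
    # platform.release() returns the same string as `uname -r` on Linux.
    return kernel_at_least(platform.release(), 6, 19)
```

Calling `current_kernel_affected()` from a one-off shell in the container would show whether a plain restart is likely to hit the same startup check again.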

Safest recovery path:

1. Do not wipe or recreate the attached volume.
2. Either have Railway move/redeploy the MongoDB service onto a host running a kernel below 6.19, or temporarily pin the MongoDB image to mongo:8.0.4 and redeploy with the same /data/db volume still attached.
3. Once MongoDB starts, immediately take a logical backup/dump before doing any further image/kernel changes.

Restarting on the same 6.19+ host will keep reproducing the crash; the fix is changing the host kernel compatibility or the MongoDB image version while preserving the existing volume.
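For the logical backup step, one option is to wrap `mongodump` in a small script once the database is reachable. This is a sketch under assumptions: it expects the MongoDB Database Tools to be installed wherever it runs, and the function names and archive file name are illustrative, not anything Railway provides.

```python
import subprocess
from datetime import datetime, timezone


def build_mongodump_cmd(uri: str, archive_path: str) -> list:
    """Build a mongodump invocation that writes one gzip-compressed archive."""
    return ["mongodump", f"--uri={uri}", f"--archive={archive_path}", "--gzip"]


def run_backup(uri: str) -> str:
    """Dump every database reachable through `uri` into a timestamped archive.

    Returns the archive path. Raises subprocess.CalledProcessError if
    mongodump exits with a non-zero status.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive_path = f"mongo-backup-{stamp}.archive.gz"  # hypothetical naming scheme
    subprocess.run(build_mongodump_cmd(uri, archive_path), check=True)
    return archive_path
```

Running `run_backup(...)` with the service's connection string from a machine outside the MongoDB container keeps the dump off the attached volume, so the backup does not depend on that volume surviving later changes.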

cryptobrief

HOBBY OP

a month ago

Thanks for your reply.

Can Railway please confirm whether they can move/redeploy my existing MongoDB service onto a compatible host/kernel while preserving the existing attached /data/db volume?

I do not want to wipe, detach, or recreate the volume.

Because this is on the Hobby plan with no automated backups, I want to avoid any action that could risk the attached volume or trigger data loss.

Thanks

cryptobrief

HOBBY OP

a month ago

Just following up as this production service has now been unavailable for me for close to 24 hours since the Railway outage/recovery.

I completely understand that incidents happen and appreciate the recovery work Railway has already done. However, my MongoDB service is still stuck in a crash loop with the Linux kernel 6.19 incompatibility error.

At this point I am mainly looking for confirmation of the safest recovery path that preserves the existing attached /data/db volume.

Current state:

MongoDB service still crash loops immediately on startup
Current image tag is mongo:8.0
Existing volume is still attached/mounting
Backend/frontend services are still deployed
No destructive actions have been taken

Could Railway please advise whether:

The service should be moved/redeployed onto a compatible host/kernel, or
I should safely pin/change the MongoDB image version for this existing volume

Because there are no automated backups on the Hobby plan, I want to avoid any action that could risk the attached volume or cause data loss.

Thanks.

cryptobrief

HOBBY OP

a month ago

Update: I added GLIBC_TUNABLES=glibc.pthread.rseq=0 to the MongoDB service variables and deployed again.

The service is still crash-looping with the same error:

MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB.

The volume is still mounting:

vol_g0byhhmrx9ut2j8l

Current source image is:

mongo:8.0

Can Railway please confirm the safest next step for this existing attached volume?

Should the image be changed from mongo:8.0 to a specific compatible tag, or does Railway need to move the workload to a compatible host?

mayoriii

HOBBY · Top 5% Contributor

a month ago

I successfully deployed it with the mongo:8 image. You need to set the env var GLIBC_TUNABLES="glibc.pthread.rseq=1"; in your comment above you set 0 instead of 1.

Also, this is not an official fix. We still have to wait for an official fixed release, or fall back to a MongoDB version or host kernel (older than 6.19) that is not affected.

Source for the fix I used: https://miliucci.org/post/mongodb-tcmalloc-rseq/

cryptobrief

HOBBY OP

a month ago

Thanks!

Confirmed this worked for my MongoDB service too.

I set:

GLIBC_TUNABLES=glibc.pthread.rseq=1

Then deployed the MongoDB service again.

Result:

MongoDB started successfully
existing volume mounted
I did not change the MongoDB image tag
I did not wipe, detach, or recreate the volume
backend/frontend were already running
my app is reachable again

Also confirming: glibc.pthread.rseq=0 did not work for my service, but glibc.pthread.rseq=1 did.
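For anyone wanting to double-check which value the running service actually sees, the variable can be parsed directly: glibc documents GLIBC_TUNABLES as colon-separated name=value pairs. A minimal sketch, with hypothetical helper names:

```python
import os
from typing import Dict, Optional


def parse_glibc_tunables(value: str) -> Dict[str, str]:
    """Split a GLIBC_TUNABLES string ('name=val:name2=val2') into a dict."""
    tunables = {}
    for item in value.split(":"):
        name, sep, setting = item.partition("=")
        if sep:  # skip empty or malformed entries
            tunables[name.strip()] = setting.strip()
    return tunables


def rseq_setting(env: Optional[dict] = None) -> Optional[str]:
    """Return the configured glibc.pthread.rseq value, or None if unset."""
    env = os.environ if env is None else env
    tunables = parse_glibc_tunables(env.get("GLIBC_TUNABLES", ""))
    return tunables.get("glibc.pthread.rseq")
```

With the setting from this thread, `rseq_setting()` should return "1" inside the MongoDB service. A result of "0" or None would mean the variable did not reach the process as intended.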

Thanks for the help.

Status changed to Solved Railway • about 1 month ago
