9 days ago
I have been running my application (Next.js + Payload CMS + Postgres) on Railway with 6 vCPU and 12 GB RAM for a long time. Since the last outage I have started to face this issue, but I'm not sure the outage is the reason. For 5 days the app ran without any issues. Today I got a crash notification twice within 2 hours.
<--- Last few GCs --->
[20:0x3f76f000] 645930 ms: Scavenge (interleaved) 2035.3 (2075.8) -> 2034.0 (2079.4) MB, pooled: 0 MB, 16.33 / 0.00 ms (average mu = 0.266, current mu = 0.242) allocation failure;
[20:0x3f76f000] 648726 ms: Mark-Compact 2039.1 (2081.6) -> 2034.0 (2081.7) MB, pooled: 0 MB, 2781.70 / 0.01 ms (average mu = 0.081, current mu = 0.028) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
1: 0xe46bbe node::OOMErrorHandler(char const*, v8::OOMDetails const&) [next-server (v16.2.6)]
2: 0x1243640 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [next-server (v16.2.6)]
3: 0x1243917 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [next-server (v16.2.6)]
4: 0x1472825 [next-server (v16.2.6)]
5: 0x1472853 [next-server (v16.2.6)]
6: 0x148b92a [next-server (v16.2.6)]
7: 0x148eaf8 [next-server (v16.2.6)]
8: 0x1cf7681 [next-server (v16.2.6)]
Aborted
It stayed down until I restarted the app just now, and it's working again. But this happened twice today.
Why could this be happening, and how do we fix it? I can see usage was not hitting the limits.
51 Replies
9 days ago
Try adding `NODE_OPTIONS=--max-old-space-size=8192` into your service variables and redeploy. If you're using a Dockerfile, make sure to add ARG/ENV statements.
9 days ago
I set it to 4096 to begin with; it was probably 2048 before. If that doesn't help, I will increase it to 8192. I am using Railpack for the build, which is why I only added it to my service variables and redeployed.
How much would this increase my cost? And how can I see the exact current limit for Node?
mansoorahmad653
Check your Node heap limit; it may be too low.
9 days ago
Thank you, that looks like the reason. Is there a way to check the current heap limit of my Node process?
9 days ago
Something like this:
console.log(
  require('v8').getHeapStatistics().heap_size_limit / 1024 / 1024 // heap limit in MB
)
or set it explicitly:
NODE_OPTIONS=--max-old-space-size=8192
9 days ago
It's mostly set in the environment variables.
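To make the heap check above a bit more useful, here is a minimal sketch that reports both the limit and the current usage. The crash log shows the heap stuck at roughly 2035 to 2040 MB, which would be consistent with a limit near 2 GB, but that is an inference from the log, not something confirmed in this thread. `heapReport` is a hypothetical helper name, not part of Node or Next.js.

```javascript
// Sketch: report the V8 heap limit and current usage in MB.
const v8 = require('node:v8');

const toMb = (bytes) => Math.round(bytes / 1024 / 1024);

// heapReport is a hypothetical helper, not a Node or Next.js API.
function heapReport() {
  const stats = v8.getHeapStatistics();
  return {
    limitMb: toMb(stats.heap_size_limit), // ceiling V8 enforces for this process
    usedMb: toMb(stats.used_heap_size),   // heap currently in use
    nodeOptions: process.env.NODE_OPTIONS ?? '(not set)',
  };
}

console.log(heapReport());
```

Running this once at startup (or from a temporary route) shows whether the `NODE_OPTIONS` value actually reached the process. The reported limit may read somewhat above the `--max-old-space-size` value, since it covers more than the old space.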
9 days ago
check my msg
9 days ago
Even after adding this, it throws 499 or 502 for the pages that I don't generate before I deploy. SSR/dynamic rendering is getting stuck. This was not happening before; it started suddenly.
9 days ago
A 502 error is usually a port mismatch.
9 days ago
And I think it's a common error on Railway,
9 days ago
because I also get this 502 error page when deploying my project.
9 days ago
maybe we should redeploy the db instead of a restart?
9 days ago
Yes, try to redeploy.
9 days ago
As mentioned above, make sure the port your public URL is mapped to is the same port your application is listening on.
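A minimal sketch of the port check described above, assuming the platform injects a `PORT` variable into the service and the domain targets that same port. `resolvePort` and the fallback of 3000 are hypothetical choices for illustration.

```javascript
// Sketch: pick the port the app should listen on.
// Assumes the platform sets PORT; 3000 is only an illustrative fallback.
function resolvePort(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  const valid = Number.isInteger(port) && port > 0 && port < 65536;
  return valid ? port : 3000;
}

// Typical use with a plain Node server (not started here):
//   require('node:http').createServer(handler).listen(resolvePort(), '0.0.0.0');
console.log(`would listen on port ${resolvePort()}`);
```

If the logged port differs from the one the public domain targets, a 502 is the expected symptom. With `next start` the same idea applies, since it also takes its port from `PORT` as far as I know.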
9 days ago
Sometimes the port is the issue.
9 days ago
The issue is that my system was working without any problems for months. Right now, any page that needs a DB connection is failing.
But during redeployment it works as expected and is able to generate pages using the DB connection.
9 days ago
Solved or not?
9 days ago
Still not solved. The app is able to render and start in the build phase; after that, any request I make through the private DB network fails with a 499 error.
I'm trying to deploy with the external DB URL. If that works, I will see what I can do.
Even my local development server (localhost:3000) cannot connect to the external DB right now, though. But while creating a build it is able to connect, fetch and deploy.
9 days ago
Keep in mind that private networking is not available during the build phase. I'd recommend moving any steps that need the database to the pre-deploy phase.
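To see which kind of URL a given phase is actually using, a small sketch like the following can help. It assumes private hostnames end in `.railway.internal`, which is my understanding of Railway's private networking and is not stated in this thread. `describeDbTarget` is a hypothetical helper and the password in the usage line is a placeholder.

```javascript
// Sketch: describe where a Postgres connection string points, without printing the password.
// The ".railway.internal" suffix check is an assumption about private hostnames.
function describeDbTarget(uri) {
  const url = new URL(uri);
  return {
    host: url.hostname,
    port: url.port || '5432', // Postgres default when the URI omits a port
    database: url.pathname.replace(/^\//, ''),
    isPrivate: url.hostname.endsWith('.railway.internal'),
  };
}

console.log(
  describeDbTarget('postgresql://postgres:placeholder@zephyr.proxy.rlwy.net:10644/railway')
);
```

Logging this at build time and again at runtime would show whether the two phases really point at the same host.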
9 days ago
Which DB are you using?
9 days ago
My build command uses the external network during the build phase. Currently, even on localhost I get this:
Error: cannot connect to Postgres. Details: Connection terminated unexpectedly
9 days ago
postgresql, Error: cannot connect to Postgres. Details: Connection terminated unexpectedly
0x5b62656e5d
Are there any errors in your Postgres logs?
9 days ago
Not sharing all but I see something like this:
2026-05-31 08:09:32.604 UTC [56] FATAL: connection to client lost
2026-05-31 08:09:32.604 UTC [56] STATEMENT: select "pages"."id", "pages"."full_path", "pages"."slug", "pages"."parent_id", "pages"."meta_image_id", "pages"."meta_canonical_url", "pages"."meta_structured_data", "pages"."meta_twitter_card", "pages"."meta_og_type", "pages"."updated_at", "pages"."created_at", "pages"."_status", "pages__blocks_content"."data" as "_blocks_content", "pages__blocks_fundingSteps"."data" as "_blocks_fundingSteps", "pages__blocks_pageTitle"."data" as "_blocks_pageTitle", "pages__blocks_fourGridImage"."data" as "_blocks_fourGridImage", "pages__blocks_successfulTraders"."data" as "_blocks_successfulTraders", "pages__blocks_pricingTable"."data" as "_blocks_pricingTable", "pages__blocks_globalBlockRef"."data" as "_blocks_globalBlockRef", "pages__blocks_textImageSection"."data" as "_blocks_textImageSection", "pages__blocks_textImageBullet"."data" as "_blocks_textImageBullet", "pages__blocks_faq"."data" as "_blocks_faq", "pages__blocks_reviewRatings"."data" as "_blocks_reviewRatings", "pages__blocks_awardWinningProp"."data" as "_blocks_awardWinningProp", "pages__blocks_imageDivider"."data" as "_blocks_imageDivider", "pages__blocks_doubleCTA"."data" as "_blocks_doubleCTA", "pages__blocks_welcomeBanner"."data" as "_blocks_welcomeBanner", "pages__blocks_competitionBanner"."data" as "_blocks_competitionBanner", "pages__blocks_trustedReviews"."data" as "_blocks_trustedReviews", "pages__blocks_exclusiveBenefits"."data" as "_blocks_exclusiveBenefits", "pages__blocks_payoutCertificates"."data" as "_blocks_payoutCertificates", "pages__blocks_howItWorks"."data" as "_blocks_howItWorks", "pages__blocks_slidingStories"."data" as "_blocks_slidingStories", "pages__blocks_freeTrialBanner"."data" as "_blocks_freeTrialBanner", "pages__blocks_successCountUp"."data" as "_blocks_successCountUp", "pages__blocks_successStoryGrid"."data" as "_blocks_successStoryGrid", "pages__blocks_livePrices"."data" as "_blocks_livePrices", "pages__blocks_competitionHero"."data" as 
"_blocks_competitionHero", "pages__blocks_competitionPrizes"."data" as "_blocks_competitionPrizes", "pages__blocks_competitionTimeline"."data" as "_blocks_competitionTimeline", "pages__blocks_contentSideNav"."data" as "_blocks_contentSideNav", "pages__blocks_contactSupport"."data" as "_blocks_contactSupport", "pages__blocks_reachOutSupport"."data" as "_blocks_reachOutSupport", "pages__blocks_explorePrograms"."data" as "_blocks_explorePrograms", "pages__blocks_aboutUsNumbers"."data" as "_blocks_aboutUsNumbers", "pages__blocks_introduceCompany"."data" as "_blocks_introduceCompany", "pages__blocks_slidingLogos"."data" as "_blocks_slidingLogos", "pages__blocks_ourValues"."data" as "_blocks_ourValues", "pages__blocks_benefitCards"."data" as "_blocks_benefitCards", "pages__blocks_shakingBoxes"."data" as "_blocks_shakingBoxes", "pages__blocks_keyEvents"."data" as "_blocks_keyEvents", "pages__blocks_commissionCards"."data" as "_blocks_commissionCards", "pages__blocks_commisionBanners"."data" as "_blocks_commisionBanners", "pages__blocks_dxTradeHero"."data" as "_blocks_dxTradeHero", "pages__blocks_tradingPlatformsHero"."data" as "_blocks_tradingPlatformsHero", "pages__blocks_tradingPlatformsWhy"."data" as "_blocks_tradingPlatformsWhy", "pages__blocks_tradingPlatformsShowcase"."data" as "_blocks_tradingPlatformsShowcase", "pages__blocks_dxTradeWhyChoose"."data" as "_blocks_dxTradeWhyChoose", "pages__blocks_dxTradeWhyUse"."data" as "_blocks_dxTradeWhyUse", "pages__blocks_comparisonTable"."data" as "_blocks_comparisonTable", "pages__blocks_privacyRequest"."data" as "_blocks_privacyRequest", "pages__blocks_tradingGuidelines"."data" as "_blocks_tradingGuidelines", "pages__blocks_knowledgeCenter"."data" as "_blocks_knowledgeCenter", "pages__blocks_tocCompetition"."data" as "_blocks_tocCompetition", "pages__blocks_scalingPlans"."data" as "_blocks_scalingPlans", "pages__blocks_scalingUpToDate"."data" as
`connection to client lost` is the most critical part, I believe?
9 days ago
Did your service run into OOM during this query? Or is the OOM solved now?
9 days ago
The problem is happening because your app is losing the database connection during runtime, not during build. Postgres shows “connection to client lost,” which means your Node.js app is disconnecting while a query is still running. This usually happens when the server crashes, restarts, or the request takes too long and gets killed (which also explains the 499 errors). Your Payload CMS query is very heavy with many joins and data blocks, so it may be slowing down and triggering timeouts. After the Railway outage, there is likely also some instability with database connections or pooling. Overall, it’s not just memory, but a mix of slow heavy queries, connection pool issues, and requests being cut off before they finish.
9 days ago
The best fix is to make your database connection stable and reduce how heavy your queries are. Right now your app is likely opening too many database connections and running very large Payload CMS queries that take too long to finish. This causes requests to time out and get cut off, which is why you see 499 errors and “connection to client lost” in Postgres. You should use a single shared database pool instead of creating new connections each time, reduce query depth and unnecessary data fetching in Payload, and make sure slow queries are limited or optimized. Once the connections are stable and the queries are lighter, the crashes and disconnections should stop.
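The "single shared pool" idea above can be sketched as a process-wide singleton. This is a generic pattern, not Payload's or Railway's API: `getSharedPool`, the `createPool` factory and the key name are all hypothetical, and Payload's Postgres adapter manages its own pool as far as I know, so this would mainly apply to any extra database clients the app creates itself.

```javascript
// Sketch: create the pool once per process and reuse it on every later call.
// createPool is a hypothetical factory; in a real app it would wrap the driver's pool constructor.
function getSharedPool(createPool, store = globalThis) {
  const key = Symbol.for('app.sharedDbPool'); // hypothetical key name
  if (!store[key]) {
    store[key] = createPool();
  }
  return store[key];
}

// Usage with a stand-in factory that counts how often it runs.
let created = 0;
const fakeFactory = () => ({ id: ++created });
const first = getSharedPool(fakeFactory);
const second = getSharedPool(fakeFactory);
console.log(first === second, created);
```

Keeping the instance on `globalThis` means repeated module evaluation in the same process reuses one pool instead of opening new connections each time.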
9 days ago
I can't test because it is not deploying, since it can't connect to the DB now. I believe I really need some professional help from the Railway team. This system has been running through many deployments for the last 5 months. Today it suddenly stopped working. I'm trying to use an old DB backup from 2 days ago, but I'm stuck at the DB connection part.
9 days ago
A couple of days ago my deployment got stuck, so I deleted it and redeployed my whole project from scratch, and it worked.
9 days ago
It's a company website; unfortunately I cannot do that.
9 days ago
If you want, try setting everything up again from scratch.
9 days ago
ohhh
9 days ago
I'd try running `pg_dump` on your Postgres service via the console (enable Priority Boarding in https://railway.com/account/feature-flags), and download the dump. Then, create a new environment and deploy your services there and see if it works. You can then use the console again on the newly created Postgres service to restore the database with `pg_restore`. Keep in mind that this will cause a spike in usage costs as you are deploying a new copy of everything.
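For reference, the dump and restore steps above could be scripted along these lines. This is only a sketch: `dumpArgs` and `restoreArgs` are hypothetical helpers, the file name and connection strings are placeholders, and the flag selection (custom format, no ownership, clean before restore) is one reasonable choice rather than what was actually run in this thread.

```javascript
// Sketch: build argument lists for pg_dump and pg_restore.
// Helper names, file name and URIs are illustrative placeholders.
function dumpArgs(sourceUri, file) {
  // Custom format is what pg_restore expects; --no-owner skips ownership commands.
  return ['--format=custom', '--no-owner', `--file=${file}`, `--dbname=${sourceUri}`];
}

function restoreArgs(targetUri, file) {
  // --clean with --if-exists drops existing objects before recreating them.
  return ['--no-owner', '--clean', '--if-exists', `--dbname=${targetUri}`, file];
}

// Typical use (not executed here):
//   const { execFileSync } = require('node:child_process');
//   execFileSync('pg_dump', dumpArgs(process.env.OLD_DB_URI, 'backup.dump'), { stdio: 'inherit' });
//   execFileSync('pg_restore', restoreArgs(process.env.NEW_DB_URI, 'backup.dump'), { stdio: 'inherit' });
console.log(dumpArgs('postgresql://user:placeholder@old-host:5432/railway', 'backup.dump'));
```

`OLD_DB_URI` and `NEW_DB_URI` are made-up variable names; the same flags work directly on the command line in the service console.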
9 days ago
One of the issues is that your app is trying to pre-build a lot of pages from the database, but the database queries are too slow or too heavy. Because of that, Next.js gives up after 60 seconds per page and retries, which eventually breaks the build process.
9 days ago
I tried to use an old copy and it throws this. What can I do about it?
May 31, 2026, 11:34 AM
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/1b3210fb-5b06-44cb-867d-ed80048a997d/vol_c0h6dsv6zdq9rblo
Starting Container
wrapper: removing stale /var/lib/postgresql/data/pgdata/postmaster.pid (no postgres running at container start)
Certificate will not expire
pgbackrest: volume 878 MiB; sized wal-drop=87 MiB queue-max=439 MiB
pgbackrest: restore-gate WAL_RECOVER_FROM_BUCKET= POSTGRES_RECOVERY_TARGET_TIME= PG_VERSION=present PG_CONTROL=present RESTORED_MARKER=missing PGDATA=/var/lib/postgresql/data/pgdata
PostgreSQL Database directory appears to contain a database; Skipping initialization
2026-05-31 08:35:41.057 UTC [16] FATAL: database files are incompatible with server
2026-05-31 08:35:41.057 UTC [16] DETAIL: The data directory was initialized by PostgreSQL version 17, which is not compatible with this version 18.4 (Debian 18.4-1.pgdg13+1).
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/1b3210fb-5b06-44cb-867d-ed80048a997d/vol_c0h6dsv6zdq9rblo
2026-05-31 08:35:42.887 UTC [15] FATAL: database files are incompatible with server
2026-05-31 08:35:42.887 UTC [15] DETAIL: The data directory was initialized by PostgreSQL version 17, which is not compatible with this version 18.4 (Debian 18.4-1.pgdg13+1).
Certificate will not expire
pgbackrest: volume 878 MiB; sized wal-drop=87 MiB queue-max=439 MiB
pgbackrest: restore-gate WAL_RECOVER_FROM_BUCKET= POSTGRES_RECOVERY_TARGET_TIME= PG_VERSION=present PG_CONTROL=present RESTORED_MARKER=missing PGDATA=/var/lib/postgresql/data/pgdata
PostgreSQL Database directory appears to contain a database; Skipping initialization
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/1b3210fb-5b06-44cb-867d-ed80048a997d/vol_c0h6dsv6zdq9rblo
Certificate will not expire
pgbackrest: volume 878 MiB; sized wal-drop=87 MiB queue-max=439 MiB
pgbackrest: restore-gate WAL_RECOVER_FROM_BUCKET= POSTGRES_RECOVERY_TARGET_TIME= PG_VERSION=present PG_CONTROL=present RESTORED_MARKER=missing PGDATA=/var/lib/postgresql/data/pgdata
PostgreSQL Database directory appears to contain a database; Skipping initialization
9 days ago
When I remove SSR, the website does not open at all now, since it depends on the database and there's no SSR page.
My DB is not responding, please help!
9 days ago
Click on your database, go to settings, edit the source image, and change the number to 17. Make sure to disable auto updates as well.
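The reason pinning the image to 17 helps is that a Postgres data directory can only be opened by a server of the same major version, and the log above shows a version 17 directory being started by an 18.4 server. A small sketch of that check, using hypothetical helper names; the data directory records its major version in a `PG_VERSION` file.

```javascript
// Sketch: compare the major version recorded in PGDATA/PG_VERSION with the server version.
const majorOf = (version) => Number.parseInt(String(version).trim().split('.')[0], 10);

// isDataDirCompatible is a hypothetical helper, not a Postgres or Railway tool.
function isDataDirCompatible(dataDirVersion, serverVersion) {
  return majorOf(dataDirVersion) === majorOf(serverVersion);
}

// Values taken from the log above: directory initialized by 17, server is 18.4.
console.log(isDataDirCompatible('17', '18.4'));
// On the database container the first argument could be read with:
//   require('node:fs').readFileSync('/var/lib/postgresql/data/pgdata/PG_VERSION', 'utf8')
```

Moving the data to a newer major version needs a dump and restore or `pg_upgrade`, which is why an automatic image update alone leaves the service unable to start.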
9 days ago
Stuck here, going to redeploy. When I make calls from my terminal I can read the DB, but my app can't. How is this possible? There is literally no code change.
0x5b62656e5d
Did you change the image version to 17?
9 days ago
Yes, and redeployed. Now the DB is up. Should I try to connect?
9 days ago
I started to use new db in build phase and runtime phase.
ERROR: Error: cannot connect to Postgres. Details: password authentication failed for user "postgres"
In build phase it worked as expected:
DATABASE_URI=postgresql://postgres:[HIDINGONPURPOSE]@zephyr.proxy.rlwy.net:10644/railway pnpm run build
30s
payload-test@0.1.0 build /app
next build
▲ Next.js 16.2.6 (Turbopack)
-
Experiments (use with caution):
· serverActions
⚠ The "middleware" file convention is deprecated. Please use "proxy" instead. Learn more: https://nextjs.org/docs/messages/middleware-to-proxy
Creating an optimized production build ...
✓ Compiled successfully in 16.3s
Running TypeScript ...
Finished TypeScript in 9.6s ...
Collecting page data using 19 workers ...
Generating static pages using 19 workers (0/4) ...
Generating static pages using 19 workers (1/4)
Generating static pages using 19 workers (2/4)
Generating static pages using 19 workers (3/4)
✓ Generating static pages using 19 workers (4/4) in 368ms
Finalizing page optimization ...
But at runtime it throws Error: cannot connect to Postgres. Details: password authentication failed for user "postgres"
Why could this be happening? I'm using the new DB created with your instructions.
9 days ago
I used the external URL for both, just to make it work for now. I was planning to switch to the private networking DB URL afterwards, but it works during build and does not work at runtime.
0x5b62656e5d
Is your Postgres throwing the authentication error in its logs?
9 days ago
Yes, can see this:
2026-05-31 08:58:19.580 UTC [2681] FATAL: password authentication failed for user "postgres"
2026-05-31 08:58:19.580 UTC [2681] DETAIL: Connection matched file "/var/lib/postgresql/data/pgdata/pg_hba.conf" line 128: "host all all all scram-sha-256"
2026-05-31 08:58:19.608 UTC [2682] FATAL: password authentication failed for user "postgres"
2026-05-31 08:58:19.608 UTC [2682] DETAIL: Connection matched file "/var/lib/postgresql/data/pgdata/pg_hba.conf" line 128: "host all all all scram-sha-256"
2026-05-31 08:58:20.469 UTC [2684] FATAL: password authentication failed for user "postgres"
2026-05-31 08:58:20.469 UTC [2684] DETAIL: Connection matched file "/var/lib/postgresql/data/pgdata/pg_hba.conf" line 128: "host all all all scram-sha-256"
2026-05-31 08:58:20.504 UTC [2685] DETAIL: Connection matched file "/var/lib/postgresql/data/pgdata/pg_hba.conf" line 128: "host all all all scram-sha-256"
9 days ago
Try this:
- Disable all public networking on the database if you have any, as the following steps will disable user authentication
- SSH into your database service (right click your service and select Copy SSH Command)
- Run this command: `sed -i 's/host all all all scram-sha-256/host all all ::\/0 trust/' /var/lib/postgresql/data/pgdata/pg_hba.conf` (This will bypass user authentication)
- Redeploy your database
- SSH again, and run the command `psql`
- Run `ALTER USER postgres with password '<PASSWORD>';` where `<PASSWORD>` is the value of the variable `PGPASSWORD` in your Railway dashboard
- Type `exit`
- Run `sed -i 's/host all all ::\/0 trust/host all all all scram-sha-256/' /var/lib/postgresql/data/pgdata/pg_hba.conf` (This will re-enable user authentication)
- Redeploy your database
8 days ago
I did that, and it is still not connecting to the DB after the build phase. Runtime is failing even though there is no change in the code. I will try to roll back to the commit from 4 days ago with the DB from 4 days ago and update you.
8 days ago
It's back to normal with the new DB service we created; I just mounted our DB to that DB service. What could have caused this?
