4 months ago
Hi Railway Support,
I'm experiencing severe database performance issues with my PostgreSQL service that appear to be related to storage I/O throttling/limitations.
Current Setup:
- Service: PostgreSQL 
- Plan: 32 vCPU, 32GB RAM 
- Application: Medusa.js e-commerce platform 
Issue: Database operations (user registration, cart operations) have become extremely slow over the past 2 weeks. PostgreSQL checkpoint logs show concerning patterns:
Checkpoint Performance:
- Normal checkpoints taking 85-180+ seconds (should be <30 seconds) 
- One checkpoint took 855 seconds (14+ minutes) 
- Write times consistently 80-800+ seconds 
- Sync times are normal (<0.1 seconds) 
Example Log Entry:
2025-07-07 10:30:28.228 UTC [27] LOG: checkpoint complete: wrote 857 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=85.812 s, sync=0.028 s, total=85.867 s; sync files=215, longest=0.013 s, average=0.001 s; distance=6178 kB, estimate=145656 kB
Already Completed:
- Optimized PostgreSQL configuration (checkpoint_timeout, wal_buffers, etc.) 
- Database maintenance (VACUUM, ANALYZE) 
- Query optimization 
Questions:
- What are the current storage I/O limits (IOPS, throughput) for my plan? 
- Are there any I/O throttling alerts or metrics showing my service hitting limits? 
- What storage upgrade options are available to improve write performance? 
- Can you see any storage-related performance metrics for my database service? 
The issue appears to be infrastructure-level storage performance rather than database configuration, as the extremely long write times indicate I/O bottlenecking.
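For anyone who wants to track these numbers over time, the checkpoint entry above can be parsed with a few lines of Python. This is a minimal sketch, not an official tool: the regular expression and the helper name `parse_checkpoint` are illustrative and were written against the log format shown in this thread only.

```python
import re

# Matches the "checkpoint complete" lines as they appear in this thread.
# The pattern is an assumption based on those samples, not a full grammar
# for PostgreSQL log output.
CHECKPOINT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) UTC .*?"
    r"checkpoint complete: wrote (?P<buffers>\d+) buffers \((?P<pct>[\d.]+)%\);"
    r".*?write=(?P<write>[\d.]+) s, sync=(?P<sync>[\d.]+) s, "
    r"total=(?P<total>[\d.]+) s"
)


def parse_checkpoint(line):
    """Return the key checkpoint figures from one log line, or None."""
    m = CHECKPOINT_RE.search(line)
    if m is None:
        return None
    buffers = int(m["buffers"])
    write_s = float(m["write"])
    return {
        "timestamp": m["ts"],
        "buffers": buffers,
        "write_s": write_s,
        "sync_s": float(m["sync"]),
        "total_s": float(m["total"]),
        # Buffers written per second of the write phase.
        "buffers_per_s": buffers / write_s if write_s > 0 else None,
    }


# The example log entry quoted in this post.
SAMPLE = (
    "2025-07-07 10:30:28.228 UTC [27] LOG: checkpoint complete: wrote 857 "
    "buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; "
    "write=85.812 s, sync=0.028 s, total=85.867 s; sync files=215, "
    "longest=0.013 s, average=0.001 s; distance=6178 kB, estimate=145656 kB"
)

if __name__ == "__main__":
    print(parse_checkpoint(SAMPLE))
```

For the sample entry this works out to roughly 10 buffers per second. Keep in mind that PostgreSQL deliberately spreads checkpoint writes over time (see checkpoint_completion_target), so the write figure reflects that pacing as well as storage speed.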
56 Replies
4 months ago
Settings:
max_connections: 200
shared_buffers: 8GB
effective_cache_size: 24GB
maintenance_work_mem: 2GB
checkpoint_completion_target: 0.95
wal_buffers: 64MB
default_statistics_target: 100
random_page_cost: 1
effective_io_concurrency: 300
work_mem: 4MB
huge_pages: try
min_wal_size: 1GB
max_wal_size: 4GB
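As a sanity check on these settings, the arithmetic below relates them to the figures in the checkpoint logs. It is a rough sketch under two assumptions that are not stated in this thread: the default 8 kB PostgreSQL block size, and a checkpoint_timeout of 15 minutes (hypothetical; the setting is not listed above). The function names are illustrative.

```python
# Values taken from the settings listed above.
SHARED_BUFFERS_BYTES = 8 * 1024**3      # shared_buffers: 8GB
CHECKPOINT_COMPLETION_TARGET = 0.95     # checkpoint_completion_target: 0.95

# Assumptions (not stated in the thread).
BLOCK_SIZE_BYTES = 8 * 1024             # default PostgreSQL block size
CHECKPOINT_TIMEOUT_S = 15 * 60          # hypothetical checkpoint_timeout


def total_buffers():
    """Number of pages in the shared buffer pool."""
    return SHARED_BUFFERS_BYTES // BLOCK_SIZE_BYTES


def percent_of_pool(buffers_written):
    """Share of the buffer pool written by one checkpoint, in percent."""
    return 100.0 * buffers_written / total_buffers()


def write_spread_window_s():
    """Time over which PostgreSQL aims to spread a timed checkpoint's writes."""
    return CHECKPOINT_COMPLETION_TARGET * CHECKPOINT_TIMEOUT_S


if __name__ == "__main__":
    print("buffers in pool:", total_buffers())
    print("857 buffers as % of pool:", round(percent_of_pool(857), 2))
    print("write spread window (s):", round(write_spread_window_s(), 1))
```

Under these assumptions, the 857 buffers in the example log entry are about 0.08% of the pool, which is consistent with the "(0.1%)" PostgreSQL prints. The spread window comes to about 855 seconds, which is close to the 855-second checkpoint mentioned in the first post; whether that is more than a coincidence depends on the actual checkpoint_timeout value.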
4 months ago
Hello!
We've escalated your issue to our engineering team.
We aim to provide an update within 1 business day.
Please reply to this thread if you have any questions!
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
The checkpoint logs show the issue is getting worse. I'm seeing some alarming patterns:
Sync Performance Degradation
- 04:30:52: sync=9.035s (was <0.1s before) 
- 04:06:11: sync=7.462s 
- 03:30:43: sync=8.759s 
Sync times should be under 0.1 seconds. When sync times spike to 6-9 seconds, it indicates the storage system is severely struggling to flush data to disk.
One Extremely Bad Checkpoint
- 04:06:11: 416 seconds total (nearly 7 minutes) 
- 408 seconds write time + 7.4 seconds sync time 
2025-07-08 04:45:41.255 UTC [27] LOG: checkpoint complete: wrote 695 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=77.143 s, sync=3.848 s, total=85.668 s; sync files=94, longest=2.621 s, average=0.041 s; distance=4901 kB, estimate=8421 kB; lsn=C/918DEF38, redo lsn=C/91857DA8
 
2025-07-08 04:30:52.487 UTC [27] LOG: checkpoint complete: wrote 718 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=84.770 s, sync=9.035 s, total=96.636 s; sync files=75, longest=6.358 s, average=0.121 s; distance=4970 kB, estimate=8812 kB; lsn=C/91457D50, redo lsn=C/9138E950
2025-07-08 04:15:40.751 UTC [27] LOG: checkpoint complete: wrote 813 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=80.978 s, sync=0.033 s, total=84.816 s; sync files=107, longest=0.016 s, average=0.001 s; distance=6355 kB, estimate=9239 kB; lsn=C/90F95E30, redo lsn=C/90EB3F90
2025-07-08 04:06:11.835 UTC [27] LOG: checkpoint complete: wrote 3977 buffers (0.4%); 0 WAL file(s) added, 0 removed, 1 recycled; write=408.466 s, sync=7.462 s, total=416.800 s; sync files=213, longest=3.860 s, average=0.036 s; distance=9559 kB, estimate=9559 kB; lsn=C/90C09B78, redo lsn=C/9087F090
2025-07-08 03:45:27.945 UTC [27] LOG: checkpoint complete: wrote 619 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=72.841 s, sync=0.732 s, total=73.918 s; sync files=93, longest=0.294 s, average=0.008 s; distance=4438 kB, estimate=9532 kB; lsn=C/8FF49BE0, redo lsn=C/8FF29128
2025-07-08 03:30:43.928 UTC [27] LOG: checkpoint complete: wrote 806 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=80.758 s, sync=8.759 s, total=90.479 s; sync files=158, longest=6.599 s, average=0.056 s; distance=5590 kB, estimate=10099 kB; lsn=C/8FBE3AB8, redo lsn=C/8FAD3908
2025-07-08 03:16:44.351 UTC [27] LOG: checkpoint complete: wrote 1378 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=138.379 s, sync=0.023 s, total=151.299 s; sync files=117, longest=0.011 s, average=0.001 s; distance=10600 kB, estimate=10600 kB; lsn=C/8F74DF78, redo lsn=C/8F55E038
2025-07-08 03:00:19.955 UTC [27] LOG: checkpoint complete: wrote 663 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=66.284 s, sync=0.071 s, total=67.004 s; sync files=79, longest=0.040 s, average=0.001 s; distance=4839 kB, estimate=8028 kB; lsn=C/8F1D41A0, redo lsn=C/8EB03FF0
2025-07-08 02:46:00.852 UTC [27] LOG: checkpoint complete: wrote 907 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=95.083 s, sync=0.020 s, total=108.015 s; sync files=200, longest=0.007 s, average=0.001 s; distance=6261 kB, estimate=8382 kB; lsn=C/8E72CA38, redo lsn=C/8E64A298
2025-07-08 02:30:26.739 UTC [27] LOG: checkpoint complete: wrote 615 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=69.092 s, sync=4.309 s, total=73.801 s; sync files=49, longest=2.570 s, average=0.088 s; distance=4538 kB, estimate=8617 kB; lsn=C/8E0E1B80, redo lsn=C/8E02CC98
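To make the sync pattern easier to see, here is a small sketch that flags checkpoints whose sync phase exceeded 0.1 seconds. The (time, sync) pairs are copied from the log lines above; the 0.1 s threshold is the figure used in this post, and the function name `sync_spikes` is illustrative.

```python
# Threshold used in this post for a "normal" sync phase, in seconds.
SYNC_THRESHOLD_S = 0.1

# (UTC time on 2025-07-08, sync seconds), copied from the log lines above.
CHECKPOINTS = [
    ("02:30:26", 4.309),
    ("02:46:00", 0.020),
    ("03:00:19", 0.071),
    ("03:16:44", 0.023),
    ("03:30:43", 8.759),
    ("03:45:27", 0.732),
    ("04:06:11", 7.462),
    ("04:15:40", 0.033),
    ("04:30:52", 9.035),
    ("04:45:41", 3.848),
]


def sync_spikes(checkpoints, threshold_s=SYNC_THRESHOLD_S):
    """Return the checkpoints whose sync time is above the threshold."""
    return [(ts, sync_s) for ts, sync_s in checkpoints if sync_s > threshold_s]


if __name__ == "__main__":
    spikes = sync_spikes(CHECKPOINTS)
    print(f"{len(spikes)} of {len(CHECKPOINTS)} checkpoints exceeded "
          f"{SYNC_THRESHOLD_S} s of sync time")
    for ts, sync_s in spikes:
        print(f"  {ts}  sync={sync_s:.3f} s")
```

On this sample, 6 of the 10 checkpoints are flagged.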
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Hi Ted.
I've gone ahead and applied a config that I hope will resolve this. Would you mind letting me know how it performs over the next 5 minutes (and then circling back)?
Apologies you're running into this!
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
Hi, thank you for responding.
2025-07-08 05:17:59.168 UTC [27] LOG: checkpoint complete: wrote 2234 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=223.564 s, sync=0.008 s, total=223.576 s; sync files=226, longest=0.001 s, average=0.001 s; distance=14723 kB, estimate=14723 kB; lsn=C/92C8E7D8, redo lsn=C/92B8C7D0
2025-07-08 05:19:02.419 UTC [25344] LOG: duration: 2444.116 ms statement: SELECT "id", "workflow_id", "transaction_id", "execution", "context", "state", "created_at", "updated_at", "deleted_at", "retention_time", "run_id" FROM "public"."workflow_execution" ORDER BY "created_at" DESC NULLS FIRST, "workflow_id" DESC, "transaction_id" DESC, "run_id" DESC LIMIT 1000
Good news: Sync time back to normal (0.008s)
Bad news: Write time still extremely high (223 seconds / 3.7 minutes)
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Alright, that's good. Would you mind checking for me one more time over the next 5 minutes?
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
2025-07-08 05:30:41.914 UTC [27] LOG: checkpoint complete: wrote 861 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=86.640 s, sync=0.004 s, total=86.647 s; sync files=185, longest=0.001 s, average=0.001 s; distance=5961 kB, estimate=13847 kB; lsn=C/932515B8, redo lsn=C/9315EDB8
This checkpoint shows the consistent pattern continuing - 86+ seconds is still far too slow for normal operations.
Application severely impacted - user registration, cart operations failing.
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
2025-07-08 05:45:40.227 UTC [27] LOG: checkpoint complete: wrote 841 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=84.205 s, sync=0.006 s, total=84.214 s; sync files=207, longest=0.001 s, average=0.001 s; distance=5580 kB, estimate=13020 kB; lsn=C/93790B18, redo lsn=C/936D1EC0
4 months ago
2025-07-08 06:05:59.364 UTC [27] LOG: checkpoint complete: wrote 4024 buffers (0.4%); 0 WAL file(s) added, 0 removed, 0 recycled; write=403.031 s, sync=0.005 s, total=403.038 s; sync files=93, longest=0.001 s, average=0.001 s; distance=8670 kB, estimate=12585 kB; lsn=C/94294F10, redo lsn=C/93F499E8
4 months ago
We're looking into this this week as it's deeply important. However, this may take a bit of time to get to the bottom of it. Apologies I don't have better news here.
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
This is a production database, with live users trying to use the project. Not sure what to do, we can't afford a week of lost registrations and orders.
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Gotcha. I can temporarily move you back to our old cloud machines to see if that resolves it.
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
We are using static IPs and don't have a fast way to change them with our payment providers. This move would require changing the static IPs, yes?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Yes, it would. I can attempt to move you to another region and then back, which would land you on a different host (this one seems to be a bit more temperamental).
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
I can try to get the IPs sorted after the move. What downtime are we looking at?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Let’s attempt the cross regional reselection first, which will prevent the need for altering the IP
I’m quite confident that we can resolve it by doing the above
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
OK, we can proceed. Is anything needed from our side?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
I’ll simply need you to confirm the integrity of the database once moved to the new region
I’ll move it to US East first, then back to Amsterdam after confirming
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
Copy process in progress. It should complete in less than 2 minutes from now
4 months ago
Completed AMS -> US east
Please confirm everything looks good, and I’m happy to move it back (to another instance)
4 months ago
All good
Status changed to Awaiting Railway Response Railway • 4 months ago
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
Worth noting the speeds in the US East. We will look for this once this lands in Europe
Attachments
4 months ago
wow
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
We appear to be nominal in Europe again. Please confirm data and speeds
Attachments
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
I will keep tracing throughout the day, update in a few hours. Thanks for this! Really.
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
You're very welcome. I'm sorry we caused your business undue harm. I've applied a $100 credit to your account for this inconvenience.
Status changed to Awaiting User Response Railway • 4 months ago
Status changed to Solved jake • 4 months ago
4 months ago
Although the first checkpoint after the migration looked promising, from the second one onward we have been seeing the same problem as before:
2025-07-08 07:46:08.363 UTC [27] LOG: checkpoint complete: wrote 977 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=97.873 s, sync=0.019 s, total=97.930 s; sync files=186, longest=0.007 s, average=0.001 s; distance=6117 kB, estimate=6117 kB; lsn=C/9744A1A0, redo lsn=C/97341958
2025-07-08 08:01:06.419 UTC [27] LOG: checkpoint complete: wrote 958 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=95.888 s, sync=0.030 s, total=95.956 s; sync files=212, longest=0.015 s, average=0.001 s; distance=6066 kB, estimate=6112 kB; lsn=C/97A2A438, redo lsn=C/9792E218
2025-07-08 08:15:54.591 UTC [27] LOG: checkpoint complete: wrote 839 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=84.031 s, sync=0.018 s, total=84.073 s; sync files=205, longest=0.005 s, average=0.001 s; distance=5664 kB, estimate=6067 kB; lsn=C/97FBA438, redo lsn=C/97EB6590
2025-07-08 08:30:39.841 UTC [27] LOG: checkpoint complete: wrote 690 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=69.092 s, sync=0.025 s, total=69.150 s; sync files=73, longest=0.013 s, average=0.001 s; distance=4692 kB, estimate=5930 kB; lsn=C/984109A8, redo lsn=C/9834B8D0
2025-07-08 08:45:56.431 UTC [27] LOG: checkpoint complete: wrote 853 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=85.431 s, sync=0.025 s, total=85.492 s; sync files=217, longest=0.007 s, average=0.001 s; distance=5639 kB, estimate=5900 kB; lsn=C/989AE8D0, redo lsn=C/988CD620
Status changed to Awaiting Railway Response Railway • 4 months ago
Status changed to Solved diffted • 4 months ago
4 months ago
Anything else we can do?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
2025-07-08 09:00:54.299 UTC [27] LOG: checkpoint complete: wrote 837 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=83.726 s, sync=0.031 s, total=83.776 s; sync files=214, longest=0.011 s, average=0.001 s; distance=5568 kB, estimate=5867 kB; lsn=C/9953A068, redo lsn=C/98E3D6D8
2025-07-08 09:08:06.741 UTC [1992] LOG: duration: 1597.122 ms statement: COMMIT;
2025-07-08 09:17:50.683 UTC [27] LOG: checkpoint complete: wrote 1999 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=200.199 s, sync=0.032 s, total=200.285 s; sync files=244, longest=0.009 s, average=0.001 s; distance=14388 kB, estimate=14388 kB; lsn=C/99F5F9D0, redo lsn=C/99C4A7F8
2025-07-08 09:23:26.513 UTC [2343] LOG: duration: 1976.771 ms statement: COMMIT;
2025-07-08 09:28:32.296 UTC [2435] LOG: duration: 1208.386 ms statement: COMMIT;
2025-07-08 09:37:03.871 UTC [27] LOG: checkpoint complete: wrote 4524 buffers (0.4%); 0 WAL file(s) added, 0 removed, 1 recycled; write=453.007 s, sync=0.041 s, total=453.091 s; sync files=217, longest=0.014 s, average=0.001 s; distance=12982 kB, estimate=14247 kB; lsn=C/9ADB5970, redo lsn=C/9A8F8160
2025-07-08 09:46:23.821 UTC [27] LOG: checkpoint complete: wrote 1127 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=112.800 s, sync=0.020 s, total=112.850 s; sync files=210, longest=0.008 s, average=0.001 s; distance=7859 kB, estimate=13608 kB; lsn=C/9B1CA980, redo lsn=C/9B0A5058
2025-07-08 09:54:08.584 UTC [3005] LOG: duration: 2469.892 ms statement: COMMIT;
2025-07-08 09:54:08.611 UTC [3026] LOG: duration: 2454.961 ms statement: COMMIT;
2025-07-08 09:54:08.611 UTC [3011] LOG: duration: 1192.506 ms statement: COMMIT;
2025-07-08 10:01:16.156 UTC [27] LOG: checkpoint complete: wrote 1051 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=105.191 s, sync=0.018 s, total=105.236 s; sync files=208, longest=0.006 s, average=0.001 s; distance=7020 kB, estimate=12950 kB; lsn=C/9B859BA0, redo lsn=C/9B780410
2025-07-08 10:15:43.133 UTC [27] LOG: checkpoint complete: wrote 727 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=72.811 s, sync=0.018 s, total=72.878 s; sync files=100, longest=0.007 s, average=0.001 s; distance=5104 kB, estimate=12165 kB; lsn=C/9BEA71F8, redo lsn=C/9BC7C5F0
2025-07-08 10:31:22.926 UTC [27] LOG: checkpoint complete: wrote 1125 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=112.616 s, sync=0.022 s, total=112.696 s; sync files=216, longest=0.010 s, average=0.001 s; distance=7874 kB, estimate=11736 kB; lsn=C/9C5936A8, redo lsn=C/9C42CEA8
2025-07-08 10:34:56.398 UTC [3914] LOG: duration: 3045.045 ms statement: COMMIT;
2025-07-08 10:34:56.480 UTC [3935] LOG: duration: 2960.344 ms statement: COMMIT;
4 months ago
Hello!
We've escalated your issue to our engineering team.
We aim to provide an update within 1 business day.
Please reply to this thread if you have any questions!
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
2025-07-08 15:31:15.481 UTC [27] LOG: checkpoint complete: wrote 988 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=98.967 s, sync=0.045 s, total=99.095 s; sync files=220, longest=0.014 s, average=0.001 s; distance=6625 kB, estimate=13776 kB; lsn=C/A8C26BC8, redo lsn=C/A8B08EE0
2025-07-08 15:41:24.555 UTC [10975] LOG: duration: 1772.422 ms statement: COMMIT;
2025-07-08 15:51:39.357 UTC [11150] LOG: duration: 2499.706 ms statement: COMMIT;
2025-07-08 15:51:39.409 UTC [11176] LOG: duration: 2545.245 ms statement: COMMIT;
2025-07-08 15:53:21.289 UTC [27] LOG: checkpoint complete: wrote 5240 buffers (0.5%); 0 WAL file(s) added, 0 removed, 1 recycled; write=524.654 s, sync=0.021 s, total=524.710 s; sync files=226, longest=0.008 s, average=0.001 s; distance=15058 kB, estimate=15058 kB; lsn=C/A9D20A80, redo lsn=C/A99BDAA8
2025-07-08 16:01:28.211 UTC [27] LOG: checkpoint complete: wrote 1116 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=111.779 s, sync=0.025 s, total=111.828 s; sync files=232, longest=0.007 s, average=0.001 s; distance=7881 kB, estimate=14341 kB; lsn=C/AA29F418, redo lsn=C/AA170078
2025-07-08 16:12:08.271 UTC [11643] LOG: duration: 1027.636 ms statement: COMMIT;
2025-07-08 16:16:47.144 UTC [27] LOG: checkpoint complete: wrote 1305 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=130.770 s, sync=0.024 s, total=130.835 s; sync files=223, longest=0.009 s, average=0.001 s; distance=8958 kB, estimate=13802 kB; lsn=C/AAC75A48, redo lsn=C/AAA2FB18
2025-07-08 16:31:23.063 UTC [27] LOG: checkpoint complete: wrote 1066 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=106.749 s, sync=0.029 s, total=106.821 s; sync files=207, longest=0.010 s, average=0.001 s; distance=7275 kB, estimate=13150 kB; lsn=C/AB2A3070, redo lsn=C/AB14A820
2025-07-08 16:46:26.560 UTC [27] LOG: checkpoint complete: wrote 1102 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=110.379 s, sync=0.022 s, total=110.436 s; sync files=207, longest=0.008 s, average=0.001 s; distance=7611 kB, estimate=12596 kB; lsn=C/ABACABD0, redo lsn=C/AB8B9730
2025-07-08 16:53:07.216 UTC [12526] LOG: duration: 1242.326 ms statement: COMMIT;
2025-07-08 16:53:08.620 UTC [12526] LOG: duration: 1365.486 ms statement: COMMIT;
2025-07-08 17:01:45.981 UTC [27] LOG: checkpoint complete: wrote 1291 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=129.228 s, sync=0.039 s, total=129.323 s; sync files=221, longest=0.019 s, average=0.001 s; distance=9157 kB, estimate=12252 kB; lsn=C/ACA16748, redo lsn=C/AC1AADF0
2025-07-08 17:03:21.606 UTC [12783] LOG: duration: 2352.834 ms statement: COMMIT;
2025-07-08 17:03:21.643 UTC [12784] LOG: duration: 2385.618 ms statement: COMMIT;
2025-07-08 17:17:41.365 UTC [27] LOG: checkpoint complete: wrote 1840 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=184.238 s, sync=0.018 s, total=184.285 s; sync files=242, longest=0.007 s, average=0.001 s; distance=13658 kB, estimate=13658 kB; lsn=C/AD150C58, redo lsn=C/ACF01898
2025-07-08 17:31:21.483 UTC [27] LOG: checkpoint complete: wrote 1038 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=103.961 s, sync=0.027 s, total=104.021 s; sync files=223, longest=0.009 s, average=0.001 s; distance=7265 kB, estimate=13019 kB; lsn=C/ADDFB580, redo lsn=C/AD619FC0
2025-07-08 17:34:09.217 UTC [13490] LOG: duration: 3221.338 ms statement: COMMIT;
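The slow COMMIT entries mixed into these logs can be pulled out and summarized separately from the checkpoint lines. Lines of the form "duration: ... ms statement: ..." are what PostgreSQL typically writes when a statement runs longer than log_min_duration_statement. The sketch below is illustrative only: the regular expression and the function name `summarize_commits` are assumptions based on the lines pasted here.

```python
import re

# Matches slow-statement lines as they appear in this thread.
DURATION_RE = re.compile(
    r"\[(?P<pid>\d+)\] LOG:\s+duration: (?P<ms>[\d.]+) ms\s+"
    r"statement: (?P<stmt>.*)"
)


def summarize_commits(lines):
    """Count slow COMMIT statements and report their max and mean duration."""
    durations = []
    for line in lines:
        m = DURATION_RE.search(line)
        if m and m["stmt"].strip().upper().startswith("COMMIT"):
            durations.append(float(m["ms"]))
    if not durations:
        return {"count": 0, "max_ms": None, "mean_ms": None}
    return {
        "count": len(durations),
        "max_ms": max(durations),
        "mean_ms": sum(durations) / len(durations),
    }


# Four of the slow COMMIT lines quoted above.
SAMPLE_LINES = [
    "2025-07-08 15:41:24.555 UTC [10975] LOG: duration: 1772.422 ms statement: COMMIT;",
    "2025-07-08 15:51:39.357 UTC [11150] LOG: duration: 2499.706 ms statement: COMMIT;",
    "2025-07-08 15:51:39.409 UTC [11176] LOG: duration: 2545.245 ms statement: COMMIT;",
    "2025-07-08 16:12:08.271 UTC [11643] LOG: duration: 1027.636 ms statement: COMMIT;",
]

if __name__ == "__main__":
    print(summarize_commits(SAMPLE_LINES))
```

Feeding the full log through the same function would show whether the slow commits cluster around the long checkpoints or are spread evenly through the day.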
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Looking into this one once again. I've been able to modify some config to drop the timing from 500s to 50-60s.
I expect this to continue improving as your snapshots age out, since I believe the issue is write amplification caused by the way we've implemented snapshots.
As such, if you need the performance ASAP, you should be able to remove 3-4 of the old snapshots and see an improvement (which should hold as new backups are created, due to the config I've applied).
Once again, apologies you're running into this. Please let me know:
- How the current state of the application is 
- If you're going to attempt the backup removal to increase compression (and decrease write amplification) 
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
We’ve removed four snapshots and retained two. As user activity is currently low, we’ll have better insight into the impact of this change in the coming hours. I’ll continue to monitor the application and follow up with an update here.
Thank you.
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Gotcha. Let's see where it lands for now. I'll be monitoring it, but please let us know as well.
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
Hi, we're not seeing any noticeable improvement at this stage; the database is still experiencing performance issues.
2025-07-09 05:06:18.360 UTC [27] LOG: checkpoint complete: wrote 1827 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=183.078 s, sync=0.052 s, total=183.148 s; sync files=218, longest=0.036 s, average=0.001 s; distance=13781 kB, estimate=13781 kB; lsn=C/C1788800, redo lsn=C/C16872B8
2025-07-09 05:19:48.375 UTC [27] LOG: checkpoint complete: wrote 927 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=92.868 s, sync=0.031 s, total=92.916 s; sync files=212, longest=0.013 s, average=0.001 s; distance=6257 kB, estimate=13029 kB; lsn=C/C1CB0228, redo lsn=C/C1CA3A58
2025-07-09 05:34:17.605 UTC [27] LOG: checkpoint complete: wrote 620 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=62.095 s, sync=0.015 s, total=62.130 s; sync files=88, longest=0.004 s, average=0.001 s; distance=4455 kB, estimate=12171 kB; lsn=C/C20FD7F0, redo lsn=C/C20FD7B8
2025-07-09 05:49:44.538 UTC [27] LOG: checkpoint complete: wrote 865 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=86.553 s, sync=0.931 s, total=88.836 s; sync files=207, longest=0.901 s, average=0.005 s; distance=5887 kB, estimate=11543 kB; lsn=C/C26EDC20, redo lsn=C/C26BD458
2025-07-09 05:53:01.910 UTC [24515] LOG: duration: 1201.381 ms statement: COMMIT;
2025-07-09 06:04:50.755 UTC [27] LOG: checkpoint complete: wrote 949 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=95.070 s, sync=0.025 s, total=95.117 s; sync files=216, longest=0.006 s, average=0.001 s; distance=6657 kB, estimate=11054 kB; lsn=C/C2ECAF50, redo lsn=C/C2D3D8E8
2025-07-09 06:20:01.193 UTC [27] LOG: checkpoint complete: wrote 1052 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=105.280 s, sync=0.017 s, total=105.339 s; sync files=220, longest=0.006 s, average=0.001 s; distance=7166 kB, estimate=10665 kB; lsn=C/C34BBBF8, redo lsn=C/C343D448
2025-07-09 06:34:59.620 UTC [27] LOG: checkpoint complete: wrote 1042 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=104.276 s, sync=0.028 s, total=104.327 s; sync files=215, longest=0.006 s, average=0.001 s; distance=7322 kB, estimate=10331 kB; lsn=C/C3DDB1E0, redo lsn=C/C3B63F98
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
2025-07-09 09:29:28.222 UTC [27] LOG: checkpoint complete: wrote 6689 buffers (0.6%); 0 WAL file(s) added, 0 removed, 1 recycled; write=670.090 s, sync=0.019 s, total=670.126 s; sync files=237, longest=0.007 s, average=0.001 s; distance=16271 kB, estimate=16271 kB; lsn=C/CAEAF4B8, redo lsn=C/CA9A3780
2025-07-09 09:34:50.906 UTC [27] LOG: checkpoint complete: wrote 924 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=92.551 s, sync=0.015 s, total=92.587 s; sync files=99, longest=0.005 s, average=0.001 s; distance=7439 kB, estimate=15388 kB; lsn=C/CB15A6D0, redo lsn=C/CB0E7560
2025-07-09 09:50:00.766 UTC [27] LOG: checkpoint complete: wrote 1015 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=101.673 s, sync=0.067 s, total=101.762 s; sync files=222, longest=0.053 s, average=0.001 s; distance=7084 kB, estimate=14558 kB; lsn=C/CB7E9230, redo lsn=C/CB7D25A8
2025-07-09 10:04:54.876 UTC [27] LOG: checkpoint complete: wrote 949 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=94.938 s, sync=0.069 s, total=95.023 s; sync files=218, longest=0.058 s, average=0.001 s; distance=6830 kB, estimate=13785 kB; lsn=C/CC0029B8, redo lsn=C/CBE7E120
2025-07-09 10:20:09.295 UTC [27] LOG: checkpoint complete: wrote 1092 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=109.282 s, sync=0.023 s, total=109.321 s; sync files=189, longest=0.008 s, average=0.001 s; distance=7558 kB, estimate=13162 kB; lsn=C/CC628750, redo lsn=C/CC5DF9D0
4 months ago
2025-07-09 17:06:15.660 UTC [27] LOG: checkpoint complete: wrote 1725 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=172.649 s, sync=0.029 s, total=172.699 s; sync files=238, longest=0.011 s, average=0.001 s; distance=12621 kB, estimate=12621 kB; lsn=C/DA5BF480, redo lsn=C/DA4DD680
2025-07-09 17:26:52.852 UTC [38296] LOG: duration: 1007.132 ms statement: COMMIT;
2025-07-09 17:29:58.443 UTC [27] LOG: checkpoint complete: wrote 6949 buffers (0.7%); 0 WAL file(s) added, 0 removed, 1 recycled; write=695.653 s, sync=0.016 s, total=695.685 s; sync files=99, longest=0.008 s, average=0.001 s; distance=14218 kB, estimate=14218 kB; lsn=C/DB7AE3A8, redo lsn=C/DB2C0240
4 months ago
We've been able to look into it and have some ideas. This occurs on machines as load increases.
However, they're gonna take a sec to implement
In the interim, we're happy to move you back to the Cloud machines (which are dead cold as we move to shut them down)
Please let us know; it's the same process we went through prior
Status changed to Awaiting User Response Railway • 4 months ago
4 months ago
Hi, will you be moving only the database? And to the same region we are using now?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
Yup we can start with just that database for now, see if it's alleviated. If not, we can progress to move any connected applications (which should 100% alleviate it)
Status changed to Awaiting User Response Railway • 4 months ago
3 months ago
Hi, we can't risk moving before the weekend; let's revisit on Monday. Is there no general update on this issue? Even if we migrate off Metal, we will have to migrate back later on.
Status changed to Awaiting Railway Response Railway • 4 months ago
3 months ago
Noted. As for a general update: we know what the shape of the issue looks like, but we need to reproduce it reliably. Since it does not occur on every host, it's hard to debug.
We're still working on this however and will give you updates as they come along.
Status changed to Awaiting User Response Railway • 4 months ago
3 months ago
2025-07-14 05:18:24.052 UTC [27] LOG: checkpoint complete: wrote 12500 buffers (1.2%); 0 WAL file(s) added, 0 removed, 3 recycled; write=854.887 s, sync=0.020 s, total=854.938 s; sync files=230, longest=0.007 s, average=0.001 s; distance=37089 kB, estimate=37089 kB; lsn=D/D1A3CEF8, redo lsn=D/D10E4330
2025-07-14 05:21:14.984 UTC [27] LOG: checkpoint complete: wrote 1258 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=125.849 s, sync=0.023 s, total=125.888 s; sync files=189, longest=0.011 s, average=0.001 s; distance=9594 kB, estimate=34339 kB; lsn=D/D1B88CB0, redo lsn=D/D1A42C40
2025-07-14 05:25:18.251 UTC [38439] LOG: duration: 1903.774 ms statement: COMMIT;
2025-07-14 05:36:10.861 UTC [27] LOG: checkpoint complete: wrote 1205 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=120.706 s, sync=0.047 s, total=120.779 s; sync files=200, longest=0.036 s, average=0.001 s; distance=8546 kB, estimate=31760 kB; lsn=D/D23CF7A0, redo lsn=D/D229B598
Status changed to Awaiting Railway Response Railway • 3 months ago
3 months ago
Update, Ted: we have applied a number of configuration changes on the hosts that may fix the issue. Can you confirm whether you are still seeing those high p99 values?
Status changed to Awaiting User Response Railway • 3 months ago
3 months ago
Hi, 
As you can see from the log, it does not look good: write=315.021 s, write=154.935 s, write=285.339 s
Not sure what to do, as this has been going on for more than a week now.
2025-07-15 02:24:34.053 UTC [27] LOG: checkpoint complete: wrote 3147 buffers (0.3%); 0 WAL file(s) added, 0 removed, 1 recycled; write=315.021 s, sync=0.051 s, total=315.108 s; sync files=221, longest=0.024 s, average=0.001 s; distance=10809 kB, estimate=12142 kB; lsn=E/371F1F8, redo lsn=E/35567A0
2025-07-15 02:35:23.314 UTC [27] LOG: checkpoint complete: wrote 651 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=65.119 s, sync=0.026 s, total=65.164 s; sync files=65, longest=0.009 s, average=0.001 s; distance=4803 kB, estimate=11408 kB; lsn=E/3A075D8, redo lsn=E/3A075A0
2025-07-15 02:50:40.975 UTC [27] LOG: checkpoint complete: wrote 805 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=80.629 s, sync=1.845 s, total=82.562 s; sync files=168, longest=1.796 s, average=0.011 s; distance=5734 kB, estimate=10841 kB; lsn=E/3FA0F58, redo lsn=E/3FA0F20
2025-07-15 03:06:07.701 UTC [60824] LOG: duration: 3536.711 ms statement: COMMIT;
2025-07-15 03:06:54.055 UTC [27] LOG: checkpoint complete: wrote 1546 buffers (0.1%); 0 WAL file(s) added, 0 removed, 1 recycled; write=154.935 s, sync=0.023 s, total=154.981 s; sync files=103, longest=0.013 s, average=0.001 s; distance=12035 kB, estimate=12035 kB; lsn=E/4C63BA8, redo lsn=E/4B61BC0
2025-07-15 03:20:17.730 UTC [27] LOG: checkpoint complete: wrote 585 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=58.565 s, sync=0.015 s, total=58.593 s; sync files=38, longest=0.005 s, average=0.001 s; distance=4415 kB, estimate=11273 kB; lsn=E/4FB2A70, redo lsn=E/4FB1AD0
2025-07-15 03:39:05.193 UTC [27] LOG: checkpoint complete: wrote 2850 buffers (0.3%); 0 WAL file(s) added, 0 removed, 1 recycled; write=285.339 s, sync=0.006 s, total=285.369 s; sync files=27, longest=0.005 s, average=0.001 s; distance=9000 kB, estimate=11045 kB; lsn=E/59CFF18, redo lsn=E/587BE70
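To put a number on "it does not look good", the write times from the six checkpoints above can be summarized as follows. With only six samples a p99 is not meaningful, so this sketch reports the median and maximum instead. The 30-second target is the figure from the first post in this thread; the function name `summarize_writes` is illustrative.

```python
import statistics

# Target for a checkpoint's write phase, as stated in the first post.
WRITE_TARGET_S = 30.0

# write= values (seconds) copied from the six checkpoint lines above.
WRITE_TIMES_S = [315.021, 65.119, 80.629, 154.935, 58.565, 285.339]


def summarize_writes(write_times, target_s=WRITE_TARGET_S):
    """Return median, max, and the number of checkpoints over the target."""
    return {
        "median_s": statistics.median(write_times),
        "max_s": max(write_times),
        "over_target": sum(1 for w in write_times if w > target_s),
        "samples": len(write_times),
    }


if __name__ == "__main__":
    s = summarize_writes(WRITE_TIMES_S)
    print(f"median write: {s['median_s']:.3f} s")
    print(f"max write:    {s['max_s']:.3f} s")
    print(f"{s['over_target']} of {s['samples']} checkpoints over "
          f"{WRITE_TARGET_S:.0f} s")
```

All six checkpoints in this sample exceed the 30-second target, with a median write time of about 118 seconds.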
Status changed to Awaiting Railway Response Railway • 3 months ago
3 months ago
I can make it so that you can deploy back onto GCP in the meantime if that will help tide you over.
Status changed to Awaiting User Response Railway • 3 months ago
3 months ago
🛠️ The ticket Disk Performance Issue on Metal has been marked as todo.
2 months ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 3 months ago
13 days ago
✅ The ticket Performance issue with disk operations on metal has been marked as completed.
13 days ago
🛠️ The ticket Performance issue with disk operations on metal has been marked as in progress.
13 days ago
✅ The ticket Performance issue with disk operations on metal has been marked as completed.
13 days ago
🛠️ The ticket Performance issue with disk operations on metal has been marked as in progress.
12 days ago
✅ The ticket Performance issue with disk operations on metal has been marked as completed.


