a year ago
Good evening,
I’m reaching out with a somewhat vague issue in the hope that someone here has experienced something similar and can offer insights or advice.
To get straight to the point, we have a PostgreSQL database connected to a Node.js API, with Redis handling background task management. Every night, starting at 11:30 PM UTC and lasting for approximately 30 minutes, we encounter a significant response time spike. During this period, our API’s average response time jumps from milliseconds to over 60 seconds, often resulting in timeouts.
This recurring issue is severely impacting performance, and I’m wondering if anyone has encountered a similar scenario. Could this be related to a PostgreSQL default configuration or some overlooked process running in the background?
Any suggestions or shared experiences would be greatly appreciated!
Thank you!
18 Replies
If it's happening at the same time every night, it could be a CRON job or something weird with the datetime. I would check there.
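One quick way to rule out "something weird with the datetime" is to map the 11:30 PM UTC spike window into other timezones and see whether it lines up with local midnight, a cron schedule, or a provider maintenance window. A minimal sketch (the zone names here are just examples, not the poster's actual infrastructure):

```javascript
// Map the nightly 23:30 UTC spike onto other timezones to see whether it
// lines up with a local-midnight cron, a maintenance window, etc.
const spikeStartUtc = new Date('2024-01-15T23:30:00Z'); // any date in the affected period

const zones = ['UTC', 'America/New_York', 'America/Los_Angeles', 'Europe/Lisbon'];
for (const tz of zones) {
  const local = spikeStartUtc.toLocaleString('en-US', {
    timeZone: tz,
    hour: '2-digit',
    minute: '2-digit',
    hour12: false,
  });
  console.log(`${tz.padEnd(20)} ${local}`);
}
// If the window matches midnight (or another round hour) in some zone,
// look for jobs scheduled in that zone rather than in UTC.
```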
As far as we can tell, there isn’t a CRON job scheduled to run at those times—at least, not one that we configured. Could you clarify what you mean by ‘something weird with the datetime’? A bit more detail would help us understand and investigate further.
Anyhow, have you noticed any deviations in usage metrics during these slowdown periods?
According to Railway’s dashboard (covering the last 7 days) and Axiom’s concrete user usage data, there doesn’t appear to be any correlation. I’ve also attached a report from Axiom showing the average elapsed time during these spikes for further reference.
p.s.: Red is the Postgres database on Railway


a year ago
looks like there is definitely some kind of concentrated activity on two of the metrics there
Nothing unusual on Railway; however, for reference, it’s expected for our product to see increased activity during those hours, as the majority of our users are based in the US.

a year ago
do you have any tracing for debugging in your app?
It’s a mobile app designed specifically for iOS and macOS. From our logs, nothing unusual stands out. However, the fact that this issue consistently occurs around the same time suggests it’s unlikely to be user-driven—especially since activity spikes to extreme levels within just 60 seconds. For example, last night everything was functioning as expected at 11:20, but by 11:30 things had suddenly gone haywire.
a year ago
I shall take that as a no.
I would highly recommend adding tracing to your backend application; it will give you far more insight and help you more than anything anyone here could say.
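Even short of a full tracing setup (e.g. OpenTelemetry), a minimal per-request timing sketch would have made this spike visible per endpoint. Here is one hedged example with an Express-style middleware signature; the framework and route names are assumptions, so adapt to your own stack:

```javascript
// Minimal request-timing middleware (Express-style (req, res, next) signature).
// Logs method, URL, status, and elapsed milliseconds once the response finishes.
function requestTimer(log = console.log) {
  return (req, res, next) => {
    const startNs = process.hrtime.bigint();
    res.on('finish', () => {
      const elapsedMs = Number(process.hrtime.bigint() - startNs) / 1e6;
      log(`${req.method} ${req.url} ${res.statusCode} ${elapsedMs.toFixed(1)}ms`);
    });
    next();
  };
}

// Usage (assuming Express): app.use(requestTimer());
```

Feeding these lines into a log aggregator lets you see exactly which routes blow up at 11:30 PM UTC instead of guessing from averages.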
@rubenamorim might be in a better position to answer you that, I'm sorry but I'm mostly focused on the client side of it 😅
Closing this post as we believe we've found the source of this issue. Regardless, thank you for mentioning tracing, we'll add it soon.
a year ago
well please do tell us the source of the issue!?
It seemed to be a mix of autovacuum and unoptimized queries. Once we optimized them with indexes, things seem to be working out better than ever.
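For anyone hitting something similar: one way to check whether autovacuum coincides with a spike window is to pull `last_autovacuum` timestamps from `pg_stat_user_tables` (e.g. `SELECT relname, last_autovacuum FROM pg_stat_user_tables`) and flag tables vacuumed inside the window. The sketch below only does the in-process filtering; fetching the rows with your Postgres driver is assumed, and the sample data is hypothetical:

```javascript
// Given rows shaped like pg_stat_user_tables output, return the names of
// tables whose last autovacuum ran inside the nightly spike window
// (23:30–00:00 UTC by default, matching the window in this thread).
function vacuumedDuringSpike(rows, startMin = 23 * 60 + 30, endMin = 24 * 60) {
  return rows
    .filter((r) => r.last_autovacuum != null)
    .filter((r) => {
      const d = new Date(r.last_autovacuum);
      const minutes = d.getUTCHours() * 60 + d.getUTCMinutes();
      return minutes >= startMin && minutes < endMin;
    })
    .map((r) => r.relname);
}

// Hypothetical sample data with the pg_stat_user_tables column names:
const sample = [
  { relname: 'events', last_autovacuum: '2024-01-15T23:41:00Z' },
  { relname: 'users', last_autovacuum: '2024-01-15T14:05:00Z' },
  { relname: 'sessions', last_autovacuum: null },
];
console.log(vacuumedDuringSpike(sample)); // → [ 'events' ]
```

If the flagged tables are also the ones behind your slow queries, tuning autovacuum settings and adding indexes (as above) are the usual next steps.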
a year ago
ah gotcha, thank you for sharing!
a year ago
I shall mark as solved
a year ago
!s
Status changed to Solved brody • about 1 year ago