Multiple Cron runs active at the same time?
louisdeconinck
PRO OP

12 days ago

I have a feeling that multiple cron runs are active at the same time.

I have set my cron schedule to trigger every minute. The aim is that it checks for work and if there is none, it stops. Sometimes there is work and then it needs about 2 hours to complete.

I feel like multiple cron runs get triggered, causing RAM to pile up and crash the service.

I was under the impression that if a cron run is already active, new cron triggers are automatically skipped.

It is a Python script that parses a large 30GB XML into parquet. When I run this with 1 sequential worker I don't get this service crash, but I would like to speed it up with multiple parallel workers. Towards the end (I believe when recombining the results) the service crashes, I believe due to out of memory.

$20 Bounty

6 Replies

Status changed to Open Railway 12 days ago


domehane
FREE

12 days ago

Hello louisdeconinck,

Yes, Railway does skip new cron triggers if a previous run is still active, so overlapping cron runs are not your problem.

Also worth knowing: Railway has a minimum interval of 5 minutes between cron runs, so your every-minute schedule isn't working the way you think.

Your actual issue is an OOM crash within a single run, when your parallel workers recombine results at the end. With 1 sequential worker there is no crash; with parallel workers it crashes at the end. Your memory graph shows a steady climb, then a sudden jump past 20 GB right before the crash. That's the recombination step eating all your RAM at once.
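To confirm which step causes the jump, one option is to log peak memory at each stage of the script. The following is only a minimal sketch using the standard library; the stage names and the byte-string "chunks" are hypothetical stand-ins for the real parse and recombine steps, not the actual script.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_stage(stage: str) -> str:
    """Print and return one log line tagging a stage with peak memory."""
    line = f"[mem] stage={stage} peak_rss={peak_rss_mib():.1f} MiB"
    # flush=True so the line is written out immediately, even if the
    # process is killed right afterwards.
    print(line, flush=True)
    return line


if __name__ == "__main__":
    log_stage("start")
    chunks = [bytes(1_000_000) for _ in range(20)]  # stand-in for parsed chunks
    log_stage("after_parse")
    combined = b"".join(chunks)  # stand-in for the recombination step
    log_stage("after_combine")
```

Note that `RUSAGE_SELF` only covers the process that calls it. If the workers are separate processes, each worker would need to log its own value, or the parent can read `resource.RUSAGE_CHILDREN` after the workers have exited.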

Hope this helps :)


louisdeconinck
PRO OP

12 days ago

Thanks for the info.

In this case I had my cron schedule at every 15 minutes. In the middle of the run I get "Starting Container" when the next run was scheduled, but the old run was still running. What does this log mean? I'm expecting that the cron trigger would be skipped.

You can see in the list of cron runs that I have a run every 15 minutes, even though it's just two runs that span 30min / 1 hour. Why does it show this as multiple runs every 15 minutes if it's just 2 longer spanning runs?

Are these just UI bugs?


louisdeconinck
PRO OP

12 days ago

Also, does Railway provide an error message or logs explaining why a service crashed?


domehane
FREE

12 days ago

Looking at your screenshots, the multiple runs every 15 minutes are not a UI bug. Those are genuinely separate cron executions, each lasting around 14 to 15 minutes. The numbers in parentheses like (1m), (16m), (30m), (1h) are just the time since that run completed at the moment you took the screenshot, not a second duration. The 3:15pm run with the red dot is your crashed run.

On the "Starting Container" log appearing mid-run: that deployment is marked as "removed", which means a new deployment was pushed to replace it. That "Starting Container" line is Railway booting the new deployment, not a new cron trigger, so it's not a bug either.

On crash logs: yes, Railway shows them. The deploy logs of that removed deployment (image 1) are exactly where you find what happened before the crash. Scroll to the very bottom of those logs and you'll see the last output before it died.


louisdeconinck
PRO OP

12 days ago

Thanks for looking into this.

I don't think that's correct. Both UI runs have the same id and contain the same logs, so I would think it's the same run even though the UI shows multiple runs.

At the end of the deploy logs I don't see any crash log from Railway itself, just my own logs.

Attachments


![](https://station-server.railway.com/attachments/att_01kr9gx0n1eh496tjk5n5926yd)

domehane
FREE

12 days ago

Okay, so the UI showing multiple runs is a display quirk on Railway's side, not actual concurrent runs, which matches what you found yourself (same id, same logs).

On the missing crash log, I think that's normal and expected. When the Linux kernel kills your process for running out of memory, it sends SIGKILL, which is an immediate hard kill. Your app cannot catch SIGKILL, so it never gets a chance to write a final log line; the logs just stop. That's not a Railway bug, that's just how Linux works.

The red dot on the 3:15pm run, plus your RAM hitting the ceiling on the metrics graph, is your evidence that it was an OOM kill, not the deploy log.
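Since the job itself cannot log its own SIGKILL, one workaround is to start it from a small wrapper that reports how the child ended. This is a rough sketch, not something Railway provides: the `run_and_report` helper and the `parse_xml.py` name in the usage comment are hypothetical. It relies on `subprocess` reporting a negative return code `-N` when the child was terminated by signal `N` on POSIX.

```python
import signal
import subprocess
import sys


def run_and_report(cmd: list[str]) -> str:
    """Run cmd as a child process and describe how it ended."""
    proc = subprocess.run(cmd)
    code = proc.returncode
    if code < 0:
        # On POSIX, a negative return code -N means the child was
        # terminated by signal N.
        name = signal.Signals(-code).name
        msg = f"job killed by {name}"
        if -code == signal.SIGKILL:
            msg += " (possibly the OOM killer)"
    elif code == 0:
        msg = "job finished normally"
    else:
        msg = f"job exited with status {code}"
    print(f"[supervisor] {msg}", flush=True)
    return msg


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python supervisor.py python parse_xml.py
    run_and_report(sys.argv[1:])
```

A SIGKILL can come from other sources than the OOM killer, so the message is only a hint to check the memory graph. The wrapper itself is small, so it will usually survive when the kernel picks a process to kill, but that is not guaranteed.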

