2 years ago
I worked on a cron application yesterday that fired correctly at 6 AM CST on the first night, but it did not fire again last night.
49 Replies
2 years ago
3c0425aa-9fe4-449a-a161-8a1efb3b53ee
2 years ago
The schedule is 0 11 * * *, which means every day at 11:00 AM UTC.
2 years ago
Logs show the last invocation was on June 1st at 6:01 AM CST (correct, although 1 minute late?).
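For readers checking the time math: 11:00 UTC corresponds to 6:00 AM Central only while daylight saving time is in effect (CDT, UTC-5); in winter (CST, UTC-6) the same schedule lands at 5:00 AM. A minimal sketch of that conversion follows. The dates and the `America/Chicago` zone are assumptions for illustration, since the thread gives neither the year nor the poster's exact zone.

```typescript
// Format a UTC instant as 24-hour wall-clock time in a given IANA time zone.
function wallClock(instant: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).format(instant);
}

// Hypothetical dates: only the time of year matters for the DST offset.
const summerRun = new Date(Date.UTC(2024, 5, 2, 11, 0)); // June 2, 11:00 UTC
const winterRun = new Date(Date.UTC(2024, 0, 2, 11, 0)); // January 2, 11:00 UTC

// The 6 AM hour in summer (CDT), the 5 AM hour in winter (CST).
const summerLocal = wallClock(summerRun, "America/Chicago");
const winterLocal = wallClock(winterRun, "America/Chicago");
```

Because the cron expression is evaluated in UTC, the local firing time shifts by an hour whenever the clocks change.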
2 years ago
The next invocation should have been June 2nd at 6:00 AM CST, but there is no activity, no logs, and not even an invocation on the dashboard.
2 years ago
I only found out about this because my cron monitor raised an issue.
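The kind of check a cron monitor performs can be sketched roughly as below. This is not the poster's actual monitor or any vendor's API; the function name, the 10-minute grace period, and the dates are all hypothetical.

```typescript
// True when the job has not run within one interval plus a grace period.
function isRunOverdue(
  lastRun: Date,
  now: Date,
  intervalMs: number,
  graceMs: number
): boolean {
  return now.getTime() - lastRun.getTime() > intervalMs + graceMs;
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const GRACE_MS = 10 * 60 * 1000; // hypothetical 10-minute tolerance

// Last successful run: June 1, 11:01 UTC (hypothetical year).
const lastSeenRun = new Date(Date.UTC(2024, 5, 1, 11, 1));

// Checked at 11:05 UTC the next day: still inside the grace period.
const withinGrace = isRunOverdue(
  lastSeenRun, new Date(Date.UTC(2024, 5, 2, 11, 5)), ONE_DAY_MS, GRACE_MS);

// Checked at 11:30 UTC the next day: the run was missed.
const missedRun = isRunOverdue(
  lastSeenRun, new Date(Date.UTC(2024, 5, 2, 11, 30)), ONE_DAY_MS, GRACE_MS);
```

An external check like this is what catches a skipped invocation when the platform itself logs nothing.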
2 years ago
The history below doesn't show anything interesting, if you're curious.

2 years ago
from my understanding, there are so many jobs being run at 11am utc that some get skipped. until the team addresses this i would recommend switching to an in-code scheduler
2 years ago
are you serious? i literally just undid the node cron scheduler because it was consuming 50 MB memory constantly and i thought it'd be fun to move away from that
2 years ago
not mad at you, just… that's pretty sucky
2 years ago
i feel you, you could also try another time? 10:30am utc?
2 years ago
Yeah, I was thinking that some weird off-color time would be less likely to incur issues.
2 years ago
Something like XX:48
2 years ago
The whole job is done in 5 seconds usually.
2 years ago
yep you got the right idea
2 years ago
Gonna try 10:48 UTC and see what happens.
2 years ago
sounds good!
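The schedule change discussed above amounts to swapping one five-field cron expression for another. A small sketch of building such an expression with basic validation, where `dailyCron` is a hypothetical helper and not part of any Railway tooling:

```typescript
// Build a five-field cron expression for a job that runs once a day (UTC).
function dailyCron(hour: number, minute: number): string {
  if (Math.floor(hour) !== hour || hour < 0 || hour > 23) {
    throw new RangeError("hour must be an integer from 0 to 23");
  }
  if (Math.floor(minute) !== minute || minute < 0 || minute > 59) {
    throw new RangeError("minute must be an integer from 0 to 59");
  }
  // Field order: minute, hour, day of month, month, day of week.
  return `${minute} ${hour} * * *`;
}

const originalSchedule = dailyCron(11, 0); // "0 11 * * *"
const offPeakSchedule = dailyCron(10, 48); // "48 10 * * *"
```

The minute field comes first, so 10:48 UTC is written `48 10 * * *`.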
2 years ago
i have also shared this thread in a thread i have with cooper where we gather cron issues like yours
2 years ago
Alrighty; just my thought: a little detail about this being skipped, or likely to be skipped, or some transparency on the issues with cron would be nice.
2 years ago
I don't mind that Railway's platform is in need of improvement; but letting users explore until they hit a landmine isn't ideal.
2 years ago
i assume they never thought they would be over-scheduled, so they never designed error handling or the ui around it
2 years ago
Kinda an interesting problem to think about in retrospect.
2 years ago
ideally the only issues that you could get out of a cron job would be an issue with the build or deploy
2 years ago
Maybe a check-mark that says "I'm okay with this being rescheduled slightly" would be good.
Crons that are more important could be charged at a higher rate, but they'll be prioritized on runners or whatever.
2 years ago
If you have 600 jobs every day at 11AM UTC, spinning up tons of machines to work on them is not exactly smart. Especially when most of them are tiny jobs.
2 years ago
Working in bursts would be better. So would working early + late, like a queue: start executing at 10:58 or even earlier.
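One way to picture the "early + late" idea is to give each job a stable offset inside a window around its nominal time. The sketch below is purely illustrative: it is not how Railway's scheduler works (the thread only speculates about that), and the function names, the hash choice, and the two-minute window are assumptions.

```typescript
// Deterministic string hash (a djb2 variant) so a job always gets the same slot.
function hashString(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Offset in seconds relative to the nominal time, within [-earlySec, +lateSec].
function startOffsetSec(jobId: string, earlySec: number, lateSec: number): number {
  const span = earlySec + lateSec + 1;
  return (hashString(jobId) % span) - earlySec;
}

// Hypothetical job ids, spread between 10:58 and 11:02 for a nominal 11:00 run.
const sampleOffsets = ["backup-a", "backup-b", "report-c"].map((id) =>
  startOffsetSec(id, 120, 120)
);
```

With offsets like these, jobs that all ask for 11:00 would start at slightly different moments instead of all at once, which matters most when each job only runs for a few seconds.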
2 years ago
im sure they have more than 600 at 11am utc, and if i recall correctly, it's only a single scheduler on their backplane
2 years ago
aha i really have no idea about the scale railway works at tbh
2 years ago
i dont really either, im just going off the crumbs they give us
2 years ago
i mean they do tell us a fair bit, but more info can never hurt in our position of community help
2 years ago
@Brody

2 years ago
This is pretty sucky as crons go.
2 years ago
I'm not sure what's going on, actually; I cannot tell if Railway is at fault here.
2 years ago
Never mind, seems like something with Sentry is going wrong?
Error while running backup: AxiosError: connect EHOSTUNREACH 34.120.195.249:443
2 years ago
host unreachable eh? you aren't the first person to see this error even after they resolved the incident
2 years ago
are you on the legacy or v2 runtime? check your service settings, if legacy, switch it to v2
2 years ago
Got it, switched it to V2. Didn't know that was a setting lol.
2 years ago
just for clarity, the v2 runtime has been confirmed to fix host unreachable, but it has no impact on cron being skipped since that's a completely different system
2 years ago
👍
a year ago
Working good!

a year ago
No failures since.
a year ago
they made changes to the cron scheduler too
a year ago

a year ago
the changes they made did not help 😦
a year ago
but thank you for trying
a year ago
i mean i guess at least it happened the same day?
a year ago
lol
a year ago
very odd
a year ago
lol
a year ago
back to in code
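For anyone landing here later, the core of an in-code daily schedule is computing the delay until the next run. A minimal sketch, assuming a plain timer is acceptable and no cron library is used; `msUntilNextUtc` is a hypothetical helper name.

```typescript
// Milliseconds from `now` until the next occurrence of hour:minute UTC.
function msUntilNextUtc(now: Date, hour: number, minute: number): number {
  const next = new Date(Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
    hour,
    minute,
    0,
    0
  ));
  if (next.getTime() <= now.getTime()) {
    // Today's slot has already passed (or is exactly now), so use tomorrow's.
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Hypothetical "current" times, to show the two cases.
const beforeSlot = msUntilNextUtc(new Date(Date.UTC(2024, 5, 2, 10, 0)), 10, 48);
const afterSlot = msUntilNextUtc(new Date(Date.UTC(2024, 5, 2, 11, 0)), 10, 48);
```

Passing the result to `setTimeout` and re-arming the timer after each run gives a daily schedule inside the process. The trade-off raised earlier in the thread still applies: the service has to stay running between runs.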