Cron Missed Completely, No Logs

2 years ago

I worked on a cron application yesterday that fired at 6AM CST correctly on the first night, but did not do it again last night.

49 Replies

2 years ago

3c0425aa-9fe4-449a-a161-8a1efb3b53ee


2 years ago

The schedule is 0 11 * * *, which means every day at 11:00 AM UTC.
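
For reference, a minimal TypeScript sketch of how that UTC schedule maps to US Central wall-clock time. The helper name `localHourMinute` and the sample dates (the year 2024 is an assumption) are illustrative and not from the thread; it uses only the built-in `Intl.DateTimeFormat`.

```typescript
// Hypothetical helper: wall-clock hour and minute of a UTC instant in an IANA time zone.
function localHourMinute(
  instant: Date,
  timeZone: string
): { hour: number; minute: number } {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour12: false,
    hour: "2-digit",
    minute: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): number => {
    const part = parts.find((p) => p.type === type);
    return part ? Number(part.value) : NaN;
  };
  // Some engines render midnight as "24" when hour12 is false, so normalize it.
  return { hour: get("hour") % 24, minute: get("minute") };
}

// "0 11 * * *" fires at 11:00 UTC. Sample dates below are assumed for illustration.
const juneRun = new Date("2024-06-02T11:00:00Z");
const januaryRun = new Date("2024-01-15T11:00:00Z");

console.log(localHourMinute(juneRun, "America/Chicago"));    // { hour: 6, minute: 0 }
console.log(localHourMinute(januaryRun, "America/Chicago")); // { hour: 5, minute: 0 }
```

Under these assumptions, 11:00 UTC lands at 6:00 AM Central while daylight time is in effect (June) and at 5:00 AM during standard time (January), so a cron defined in UTC shifts by an hour locally across the DST change.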


2 years ago

Logs show the last invocation was on June 1st at 6:01 AM CST (correct, although 1 minute late?).


2 years ago

The expected last invocation would actually be June 2nd at 6:00AM CST, but there is no activity, logs, or even an invocation on the dashboard.


2 years ago

I only found out about this because my cron monitor raised an issue.
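
A monitor like this can be approximated with a simple heartbeat check: record when the job last ran and flag it once more than one interval plus a grace period has passed. The sketch below is an assumption about how such a check could work, not how any particular monitoring product implements it; `isRunOverdue`, the 5-minute grace period, and the year 2024 are all illustrative.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical check: true when the job has gone longer than one interval
// plus a grace period without running.
function isRunOverdue(
  lastRun: Date,
  now: Date,
  intervalMs: number,
  graceMs: number
): boolean {
  return now.getTime() - lastRun.getTime() > intervalMs + graceMs;
}

// Times mirror the thread: last run on June 1 at 11:01 UTC, no run on June 2.
const lastRun = new Date("2024-06-01T11:01:00Z"); // year is assumed
const graceMs = 5 * 60 * 1000; // assumed 5-minute grace period

console.log(isRunOverdue(lastRun, new Date("2024-06-02T11:03:00Z"), DAY_MS, graceMs)); // false
console.log(isRunOverdue(lastRun, new Date("2024-06-02T11:10:00Z"), DAY_MS, graceMs)); // true
```

The grace period keeps a run that starts a minute or two late from raising a false alarm.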


2 years ago

The history below doesn't show anything interesting, if you're curious.

[attachment: cron run history]


2 years ago

from my understanding, there are so many jobs being run at 11am utc that some get skipped. until the team addresses this i would recommend switching to an in-code scheduler


2 years ago

are you serious? i literally just undid the node cron scheduler because it was consuming 50 MB memory constantly and i thought it'd be fun to move away from that


2 years ago

not mad at you, just… that's pretty sucky


2 years ago

i feel you, you could also try another time? 10:30am utc?


2 years ago

Yeah, I was thinking that some weird off-color time would be less likely to incur issues.


2 years ago

Something like XX:48


2 years ago

The whole job is done in 5 seconds usually.


2 years ago

yep you got the right idea


2 years ago

Gonna try 10:48 UTC and see what happens.
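
In standard five-field cron syntax (minute, hour, day of month, month, day of week), that change means moving from `0 11 * * *` to `48 10 * * *`. A small sketch of building such an expression follows; `dailyCron` is a hypothetical helper, not part of any scheduler's API.

```typescript
// Hypothetical helper: build a five-field cron expression for one run per day.
function dailyCron(hourUtc: number, minute: number): string {
  const isIntIn = (n: number, lo: number, hi: number): boolean =>
    Number.isInteger(n) && n >= lo && n <= hi;
  if (!isIntIn(hourUtc, 0, 23) || !isIntIn(minute, 0, 59)) {
    throw new RangeError(`invalid time ${hourUtc}:${minute}`);
  }
  // Field order: minute, hour, day of month, month, day of week.
  return `${minute} ${hourUtc} * * *`;
}

console.log(dailyCron(11, 0));  // 0 11 * * *   (original schedule)
console.log(dailyCron(10, 48)); // 48 10 * * *  (off-peak schedule)
```

Note that the minute field comes first, which is easy to get backwards when writing the expression by hand.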


2 years ago

sounds good!


2 years ago

i have also sent this thread in a thread i have with cooper for gathering cron issues like yours


2 years ago

Alrighty; just my thought: a little detail about this being skipped, or likely to be skipped, or some transparency on the issues with cron would be nice.


2 years ago

I don't mind that Railway's platform is in need of improvement; but letting users explore until they hit a landmine isn't ideal.


2 years ago

i assume they never thought they would be over-scheduled, so they never designed error handling and the ui around it


2 years ago

Kinda an interesting problem to think about in retrospect.


2 years ago

ideally the only issues that you could get out of a cron job would be an issue with the build or deploy


2 years ago

Maybe a check-mark that says "I'm okay with this being rescheduled slightly" would be good.
Crons that are more important could be charged at a higher rate, but they'll be prioritized on runners or whatever.


2 years ago

If you have 600 jobs every day at 11AM UTC, spinning up tons of machines to work on them is not exactly smart. Especially when most of them are tiny jobs.


2 years ago

Working in bursts would be better. And working early + late, like queueing. Start at 10:58 or even earlier to start executing.
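
The "start early, finish late" idea can be sketched as a deterministic offset per job, so jobs that all nominally run at 11:00 are spread over a window around that time. This is only an illustration of the suggestion, not a description of how Railway's scheduler works; `spreadOffsetSeconds`, the 240-second window, the job ID, and the date are assumptions.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (non-cryptographic).
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: a stable offset in seconds, centered on the nominal time.
// For a 240-second window the result lies in [-120, 120).
function spreadOffsetSeconds(jobId: string, windowSeconds: number): number {
  if (!Number.isInteger(windowSeconds) || windowSeconds <= 0) {
    throw new RangeError("windowSeconds must be a positive integer");
  }
  const slot = fnv1a32(jobId) % windowSeconds;
  return slot - Math.floor(windowSeconds / 2);
}

const nominal = new Date("2024-06-02T11:00:00Z"); // assumed date
const offset = spreadOffsetSeconds("backup-job", 240); // assumed job ID
const actualStart = new Date(nominal.getTime() + offset * 1000);
console.log(actualStart.toISOString());
```

Because the offset is derived from the job ID, each job keeps the same slot from day to day, so a user would see a consistent start time rather than a random one.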


2 years ago

im sure they have more than 600 at 11am utc, and if i recall correctly, it's only a single scheduler on their backplane


2 years ago

aha i really have no idea about the scale railway works at tbh


2 years ago

i dont really either, im just going off the crumbs they give us


2 years ago

i mean they do tell us a fair bit, but more info can never hurt in our position of community help


2 years ago

@Brody

[attachment]


2 years ago

This is pretty sucky as crons go.


2 years ago

I'm not sure what's going on, actually; I cannot tell if Railway is at fault here.


2 years ago

Never mind, seems like something with Sentry is going wrong?
Error while running backup: AxiosError: connect EHOSTUNREACH 34.120.195.249:443


2 years ago

host unreachable eh? you aren't the first person to see this error even after they resolved the incident


2 years ago

are you on the legacy or v2 runtime? check your service settings, if legacy, switch it to v2


2 years ago

Got it, switched it to V2. Didn't know that was a setting lol.


2 years ago

just for clarity, the v2 runtime has been confirmed to fix host unreachable, but it has no impact on cron being skipped since that's a completely different system


2 years ago

👍


2 years ago

Working good!

[attachment]


2 years ago

No failures since.


2 years ago

they made changes to the cron scheduler too


2 years ago

[attachment]


2 years ago

the changes they made did not help 😦


2 years ago

but thank you for trying


2 years ago

i mean i guess at least it happened the same day?


2 years ago

lol


2 years ago

very odd


2 years ago

lol


2 years ago

back to in code
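
For anyone landing here, a minimal sketch of what an in-code daily scheduler can look like without a cron library, using only `setTimeout`. The names `msUntilNextDailyRunUtc` and `scheduleDailyUtc` are hypothetical, and this is not the poster's actual setup. It assumes the process stays running all day, which is the trade-off against a platform cron.

```typescript
// Milliseconds from `now` until the next occurrence of hour:minute UTC.
function msUntilNextDailyRunUtc(now: Date, hour: number, minute: number): number {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hour, minute, 0, 0)
  );
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1); // today's slot has already passed
  }
  return next.getTime() - now.getTime();
}

// Runs `job` once a day at hour:minute UTC; returns a function that cancels it.
function scheduleDailyUtc(
  job: () => Promise<void>,
  hour: number,
  minute: number
): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let cancelled = false;
  const arm = (): void => {
    timer = setTimeout(async () => {
      try {
        await job();
      } catch (err) {
        console.error("scheduled job failed:", err);
      }
      if (!cancelled) arm(); // re-arm for the next day after the job settles
    }, msUntilNextDailyRunUtc(new Date(), hour, minute));
  };
  arm();
  return (): void => {
    cancelled = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}

// Usage (not executed here): run a hypothetical backup at 10:48 UTC every day.
// const cancel = scheduleDailyUtc(async () => { /* run backup */ }, 10, 48);
```

Recomputing the delay after each run, instead of using a fixed 24-hour interval, keeps the schedule from drifting when a run takes a few seconds.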

