Cron won't stop running

gigameshHOBBY

10 months ago

My cron job won't stop running. I keep getting logs. This is going to explode the costs for all my dependencies that this function relies on.

Project: e0b4fa59-9df9-4101-99fc-5ca18e50af81

Service: 71550f4a-b09f-426f-a163-3ff735a06d40

This is the 3rd problem I've experienced with your crons in the past week.

Your system is so buggy. I will be leaving asap.

Awaiting User Response

24 Replies

10 months ago

Hey, can you share a link to the logs you're receiving? I checked that service and nothing appears to be running on our end.


Status changed to Awaiting User Response railway[bot] 11 months ago


gigameshHOBBY

10 months ago

It stopped after about 7 minutes but ran the start command at least 10 times. Why?

https://railway.app/project/e0b4fa59-9df9-4101-99fc-5ca18e50af81/logs


Status changed to Awaiting Railway Response railway[bot] 11 months ago


10 months ago

10 is exactly the default number of times Railway will restart your app if it crashes (exits with a non-zero code).

Edit: I'm mistaken. CRON jobs do not have restart policies.


gigameshHOBBY

10 months ago

I just counted. It was actually 39 times.

Btw I asked in another thread if I should be worried about multiple instances showing as being in a READY state. That was the case for this function. For some reason there were many rows showing as READY. Is it possible more than one was switched to active?

I tried killing the service as soon as it started repeating, but the logs kept coming for several minutes.


gigameshHOBBY

10 months ago

Also, there was no indication the function was failing. I throw an error if it gets called more than once per day, which seemed to work because my database wasn't negatively impacted. However, it's a mystery why there are no error logs.
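For readers following along, a once-per-day guard like the one described above could look roughly like this. It is a minimal sketch, not the poster's actual code: `RunStore`, `InMemoryRunStore`, and `assertFirstRunToday` are hypothetical names, and the in-memory store stands in for whatever database the job really uses.

```typescript
// Hypothetical store interface; a real job would back this with its database.
interface RunStore {
  getLastRunDay(jobName: string): string | undefined;
  setLastRunDay(jobName: string, day: string): void;
}

// In-memory stand-in for illustration only (state is lost when the process exits).
class InMemoryRunStore implements RunStore {
  private days: Record<string, string> = {};

  getLastRunDay(jobName: string): string | undefined {
    return this.days[jobName];
  }

  setLastRunDay(jobName: string, day: string): void {
    this.days[jobName] = day;
  }
}

// Format a date as a UTC calendar day, e.g. "2024-07-22".
function utcDay(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Throws if the job has already been recorded for the current UTC day;
// otherwise records today's run and returns normally.
function assertFirstRunToday(
  store: RunStore,
  jobName: string,
  now: Date = new Date()
): void {
  const today = utcDay(now);
  if (store.getLastRunDay(jobName) === today) {
    throw new Error(jobName + " already ran on " + today);
  }
  store.setLastRunDay(jobName, today);
}
```

Note that the check-then-set here is not atomic. With a real database, a unique constraint on (job name, day) would be a safer way to reject a second run.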


10 months ago

It looks like all of your recent crons failed due to a build error, so they were never actually deployed or run. Sorry, I'm unable to see which logs you're referring to. Could you link to them specifically by clicking on the timestamp? That should give you a URL like https://railway.app/project/e0b4fa59-9df9-4101-99fc-5ca18e50af81/logs?context=<timestamp> that will help pinpoint the logs.

We also recently pushed a fix for cron scheduling, but I'm unsure if that's related to what you're experiencing.


Status changed to Awaiting User Response railway[bot] 11 months ago


gigameshHOBBY

10 months ago

They started here: https://railway.app/project/e0b4fa59-9df9-4101-99fc-5ca18e50af81/logs?filter=%40service%3A71550f4a-b09f-426f-a163-3ff735a06d40&start=1721688649730

I'm not sure what you mean about all recent crons failing. Only the one I shut down is currently in a failed state:


Status changed to Awaiting Railway Response railway[bot] 11 months ago


gigameshHOBBY

10 months ago

Btw you mentioned it retries 10 times by default. How do I disable that?


10 months ago

In the service settings.


gigameshHOBBY

10 months ago

Where specifically? The closest thing I'm seeing is this:


"Services with a cron schedule do not have a restart policy"

----------

Some additional context about the logs. These are just console.warn() calls and not errors. It doesn't look like any error was getting thrown.

Does your infra catch the error and not display it when it restarts?


10 months ago

My bad, I had forgotten that CRON services don't have a restart policy.

console.warn() writes to stderr, so those lines are colored red.

If you're not seeing the logs you want, it's because your app didn't print any.
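To make a failure visible in the logs, one option is to catch the error at the top level and print it explicitly. The following is a minimal sketch assuming a Node.js job; `runCronJob` and the `Logger` type are hypothetical names, not Railway APIs. In Node.js, console.error() writes to stderr just like console.warn().

```typescript
// Minimal logger shape so the wrapper can be exercised without the real console.
type Logger = {
  warn: (msg: string) => void;
  error: (msg: string) => void;
};

// Runs the job once and resolves to an exit code:
// 0 if the job completed, 1 if it threw or rejected.
// Any failure is logged explicitly so it shows up in the service logs.
async function runCronJob(
  job: () => Promise<void>,
  logger: Logger = console
): Promise<number> {
  try {
    await job();
    return 0;
  } catch (err) {
    const detail =
      err instanceof Error ? err.stack || err.message : String(err);
    logger.error("cron job failed: " + detail);
    return 1;
  }
}
```

In an entry point this might be used as `runCronJob(main).then((code) => { process.exitCode = code; })`, where `main` is your own job function, so the process exits with a non-zero code when the job fails.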


10 months ago

You mentioned 39 times, are you running your app with something like pm2 that could restart your app itself?


gigameshHOBBY

10 months ago

No, it just runs once and exits.

If it was succeeding each time, there should be evidence of it in my database but there is not.
Is it possible the logging is buggy? Maybe it only ran once?


10 months ago

First of all, deepest apologies you've had a couple issues with crons. We've been working to make them better recently. I've gone ahead and issued you $100 of credit for the inconvenience, which should cover 1-2 months of usage. You're welcome to use it or not; totally get your frustration.

From our side, if an instance actually gets started or stopped, you will see a "Container Started" and "Container Stopped" line item.

You can see this, for example, in your previous deployment. This one is very weird and we're looking at it internally. It LOOKS like a duplicate-logging issue, but it's quite odd that the timestamps are 8 minutes apart and only for that one replicaID (i.e., we didn't fire the deployment multiple times on our side; you would see that in the deployments list).
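As a rough way to check this yourself, you could count those marker lines in an exported copy of the logs. This is only a sketch: the marker strings come from the message above, but `summarizeRuns` is a hypothetical helper and treating each log entry as a plain string is an assumption about the export format.

```typescript
// Marker strings quoted from the reply; the plain-string log format is assumed.
const START_MARKER = "Container Started";
const STOP_MARKER = "Container Stopped";

interface RunSummary {
  starts: number;
  stops: number;
}

// Counts how many log lines contain the start and stop markers.
// Repeated application output between one start/stop pair points to
// duplicated log lines rather than repeated runs.
function summarizeRuns(logLines: string[]): RunSummary {
  let starts = 0;
  let stops = 0;
  for (const line of logLines) {
    if (line.indexOf(START_MARKER) !== -1) {
      starts += 1;
    } else if (line.indexOf(STOP_MARKER) !== -1) {
      stops += 1;
    }
  }
  return { starts: starts, stops: stops };
}
```

If the application output repeats many times but the summary shows a single start and a single stop, the job most likely ran once.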


Status changed to Awaiting User Response railway[bot] 11 months ago


10 months ago

Metrics don't indicate that it ran more than once, just to ease any concerns about downstream issues.



10 months ago

Confirmed as a logging issue.

We've identified the cause: our log collector for the V2 runtime was crash-looping due to a port conflict. It currently tries to grab an ephemeral port on the instance and, if that fails, it loops. It should have a fingerprint file to prevent this; we're looking into why your logs were duplicated.

Your cron never actually ran multiple times and this was purely a logging issue on our end. Sorry for the confusion.

Also, it sounds like you've hit a couple of very rare bugs on Railway in general, and I'm deeply sorry about that. We'd happily get on a call with you to hear any and all gripes you might have and resolve them.


gigameshHOBBY

10 months ago

Can you please check there aren't still rogue crons or servers running on my account?

I'm experiencing 10x the expected traffic to my redis cache but I can't pinpoint the source. The increase in traffic started the day I set up your native crons.

I've since shut them down and gone back to manually scheduling from a persistent server, but I have a nagging feeling something is still running that I can't see in my dashboard.


Status changed to Awaiting Railway Response railway[bot] 10 months ago


10 months ago

There is definitely another service in that project that isn't showing up on the canvas. It can be removed from within the project settings.

Let me know if you still think something is running after that and we can look into this further!


Status changed to Awaiting User Response railway[bot] 10 months ago


gigameshHOBBY

10 months ago

Where in the settings?

And why wouldn't it show up on the canvas??


Status changed to Awaiting Railway Response railway[bot] 10 months ago


10 months ago

There is a Danger tab within the project settings.


Status changed to Awaiting User Response railway[bot] 10 months ago


gigameshHOBBY

10 months ago

I just have 2 services listed there, prod & staging. Each corresponds to an environment and both are showing up on the canvas.

Were you referring to something else?


Status changed to Awaiting Railway Response railway[bot] 10 months ago


10 months ago

My apologies, I didn't notice you had two environments.


Status changed to Awaiting User Response railway[bot] 10 months ago


gigameshHOBBY

10 months ago

I'm still seeing signs of services continuing to run after I shut them down. For example, I stopped this server for my staging environment yesterday:

https://railway.app/project/e0b4fa59-9df9-4101-99fc-5ca18e50af81/service/a7cb1a90-ec14-4707-92dc-cea280d1b17c

It was the only service using that env but I'm still receiving error logs on Sentry as recently as an hour ago, as if they're coming from that server.

I've verified this isn't a result of mistakenly setting the wrong env in my production config.

Can you please investigate?


Status changed to Awaiting Railway Response railway[bot] 10 months ago


10 months ago

Hi, really sorry your environments still feel off. Let's see what we can do. I've asked the team to dig deeper. In the meantime:

"I'm still receiving error logs on Sentry as recently as an hour ago"

Can you share some of those here? Thanks!


Status changed to Awaiting User Response railway[bot] 10 months ago

