API Crashes 20+ Times Daily
jared-leddy
HOBBYOP

2 years ago

Trying to figure out what is going on. Our uptime monitor sends out an email when the page goes down for longer than 5 min. For the past few weeks, the API has been showing that it goes down 20-40 times a day for about a 3-day stretch. Then it's fine for a few days, and then it's back to another 3-day stretch of chaos.

What?
How?
Why?

34 Replies

2 years ago

please provide more information, for starters, are there any error logs?


jared-leddy
HOBBYOP

2 years ago

There are plenty of info logs, but they don't have any details on them that indicate an issue. There are no error, debug or warn logs for the past 2 months.


2 years ago

I would recommend adding some very verbose debug logging so you can determine at what point your code crashes
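As a rough illustration of what "very verbose debug logging" could look like, here is a minimal sketch in plain TypeScript. The `traced` helper and its log format are hypothetical, not part of Nest.js or Railway; the idea is only that every step logs when it starts, when it ends, and how long it took, so the last line before a hang or crash points at the culprit.

```typescript
// Hypothetical helper (not part of Nest.js): logs the start, end, duration,
// and any failure of one async step.
type LogFn = (line: string) => void;

async function traced<T>(
  step: string,
  fn: () => Promise<T>,
  log: LogFn = console.log,
): Promise<T> {
  const started = Date.now();
  log(`[debug] ${step} start`);
  try {
    const result = await fn();
    log(`[debug] ${step} ok after ${Date.now() - started}ms`);
    return result;
  } catch (err) {
    const detail = err instanceof Error ? (err.stack ?? err.message) : String(err);
    log(`[error] ${step} failed after ${Date.now() - started}ms: ${detail}`);
    throw err; // rethrow so the caller's behavior is unchanged
  }
}
```

Wrapping a database call or an outbound HTTP request as `await traced("load-user", () => loadUser(id))` (with `loadUser` standing in for your own function) would leave a "start" line with no matching "ok" line if that step is where things stall.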


jared-leddy
HOBBYOP

2 years ago

It's built on Nest.js and the error/warning logs are typically pretty solid. I can look at adding something else, but that may take a bit.


2 years ago

railway isn't going to have the observability into your app if your app doesn't have the observability you need to determine the issue


jared-leddy
HOBBYOP

2 years ago

In English, you're saying that the logs in Railway are only as good as the ones built into the app.


2 years ago

that's correct


2 years ago

if you don't know why your app is crashing, railway isn't going to know either


2 years ago

besides things like OOM but that's easy enough to determine from your side
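For the OOM case, one way to check from the app side is to log memory usage on a timer, since a process that is killed for exceeding its memory limit usually gets no chance to log anything itself. The sketch below uses Node's `process.memoryUsage()`; the helper names, the log format, and the one-minute interval are assumptions for illustration.

```typescript
// Converts bytes to megabytes with one decimal place.
function toMb(bytes: number): string {
  return (bytes / 1024 / 1024).toFixed(1);
}

// Formats the fields of process.memoryUsage() that matter most for an OOM hunt.
function formatMemory(usage: { rss: number; heapUsed: number; heapTotal: number }): string {
  return `[mem] rss=${toMb(usage.rss)}MB heapUsed=${toMb(usage.heapUsed)}MB heapTotal=${toMb(usage.heapTotal)}MB`;
}

// Hypothetical helper: logs memory every intervalMs. unref() keeps the timer
// from holding the process open during shutdown.
function startMemoryLogging(
  intervalMs = 60_000,
  log: (line: string) => void = console.log,
) {
  const timer = setInterval(() => log(formatMemory(process.memoryUsage())), intervalMs);
  timer.unref();
  return timer;
}
```

Calling `startMemoryLogging()` once at startup would give a memory line per minute; a steady climb in `rss` right before each outage would point toward OOM, while a flat line would rule it out.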


jared-leddy
HOBBYOP

2 years ago

That's the problem I believe. We're using Nest instead of Express in part because of the built-in logs. If the app actually crashes, Nest will let you know. But I'm not seeing any logs that say that the app actually crashed.


jared-leddy
HOBBYOP

2 years ago

If the app never actually crashed, then the app has a problem with 1 page going haywire, or the response time is too long.


jared-leddy
HOBBYOP

2 years ago

The uptime monitor is showing that the API went down 7 times on 2024-06-21 for an estimated total of 7 minutes.


jared-leddy
HOBBYOP

2 years ago

It doesn't actually tell me why, but this is a "keyword found" type monitor.

[Attached image]


jared-leddy
HOBBYOP

2 years ago

This monitor is an HTTP ping. It shows nothing happening on that date.

[Attached image]
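The two monitors can disagree without either being wrong. The sketch below assumes typical semantics for each type (the actual monitoring service's rules are not stated in this thread): a ping check only looks at the status code, while a keyword check also requires some expected text in the response body. All names here are hypothetical.

```typescript
type CheckResult = "up" | "down: bad status" | "down: keyword missing";

// An HTTP "ping" style check: only the status code matters (assumed rule).
function pingCheck(status: number): CheckResult {
  return status >= 200 && status < 400 ? "up" : "down: bad status";
}

// A "keyword found" style check: the status must be OK and the body must
// contain the expected keyword (assumed rule).
function keywordCheck(status: number, body: string, keyword: string): CheckResult {
  if (pingCheck(status) !== "up") return "down: bad status";
  return body.includes(keyword) ? "up" : "down: keyword missing";
}
```

Under these assumptions, a response that returns 200 with an unexpected body (an error page, an empty payload, a partial response) would count as "up" for the ping monitor and "down" for the keyword monitor, which would match the pattern of one monitor reporting outages and the other reporting nothing.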


2 years ago

I'm sure there's a hundred or more ways your app could crash or soft lock without nest knowing.

are you on the v2 runtime? and on the new edge proxy?
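To make a few of those silent failure modes visible, a process-level safety net can help. This is a minimal sketch using Node's standard `process` events; the `installCrashLogging` name and the log format are hypothetical, and it assumes the framework's own logger may never see errors thrown outside its request pipeline or a stop signal sent by the platform.

```typescript
// Hypothetical helper: registers process-level handlers so that failures
// outside the framework's request pipeline still leave a log line.
function installCrashLogging(log: (line: string) => void = console.error): void {
  process.on("uncaughtException", (err: Error) => {
    log(`[fatal] uncaughtException: ${err.stack ?? err.message}`);
    process.exit(1); // state is unknown after this, so exit and let the platform restart
  });

  // Note: registering this handler means an unhandled rejection is logged
  // here instead of following Node's default behavior.
  process.on("unhandledRejection", (reason: unknown) => {
    log(`[fatal] unhandledRejection: ${String(reason)}`);
  });

  process.on("SIGTERM", () => {
    log("[info] SIGTERM received, process is being stopped from outside");
    process.exit(0);
  });
}
```

Calling `installCrashLogging()` at the very top of the entry file would at least show whether each outage coincides with an exception, a rejected promise, or an external stop. It would not catch a soft lock where the process stays alive but stops responding.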


jared-leddy
HOBBYOP

2 years ago

I suspect that is probably true, though how that can happen is lost on me. I'm guessing no on v2 and edge, as I don't know what those are.


2 years ago

check your service settings


jared-leddy
HOBBYOP

2 years ago

My Railway settings say Legacy runtime.


2 years ago

and the edge proxy?


jared-leddy
HOBBYOP

2 years ago

Not enabled.


2 years ago

does your service have a volume?


jared-leddy
HOBBYOP

2 years ago

I don't think so. I can't find anything that says Volume in the settings.


2 years ago

it's not in the settings, look at the project canvas


jared-leddy
HOBBYOP

2 years ago

I'm guessing this is the canvas. If so, then it's just a Github repo and PostgreSQL DB.

[Attached image]


2 years ago

there's no volume on the API service


2 years ago

you can see the postgres service has a volume


2 years ago

enable the v2 runtime and edge proxy on your API service


jared-leddy
HOBBYOP

2 years ago

I see what you're talking about. The bottom box.


jared-leddy
HOBBYOP

2 years ago

Deploying the updates now.


jared-leddy
HOBBYOP

2 years ago

That's done.


2 years ago

okay continue monitoring the service and report back


jared-leddy
HOBBYOP

2 years ago

Copy.


ayush-lal
HOBBY

2 years ago

i've also been having an issue with my nestjs API restarting sporadically throughout the day. Haven't had time to look into it though, i believe i used one of the existing railway templates. Did you also use the template @jared.leddy?


jared-leddy
HOBBYOP

2 years ago

No, we didn't do anything fancy. We just connected the repo and quick-deployed it with ENVs and a DB.

