2 years ago
Trying to figure out what is going on. Our uptime monitor sends out an email when the page goes down for longer than 5 minutes. For the past few weeks, it has shown the API going down 20-40 times a day for a stretch of about 3 days. Then it's fine for a few days, and then it's back to another 3-day stretch of chaos.
What?
How?
Why?
34 Replies
2 years ago
please provide more information, for starters, are there any error logs?
There are plenty of info logs, but they don't have any details on them that indicate an issue. There are no error, debug or warn logs for the past 2 months.
2 years ago
I would recommend adding some very verbose debug logging so you can determine at what point your code crashes
It's built on Nest.js and the error/warning logs are typically pretty solid. I can look at adding something else, but that may take a bit.
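For readers following along, here is a minimal sketch of the kind of process-level logging suggested above. It assumes a Node.js runtime; `describeFailure` and `installCrashLogging` are hypothetical helper names, not part of Nest.js or Railway. The idea is that failures happening outside the framework's exception handling (an unhandled promise rejection, or a SIGTERM from the platform) still leave a log line behind.

```typescript
// Hypothetical helpers for illustration; not a Nest.js or Railway API.
type LogFn = (line: string) => void;

// Turn any thrown value or rejection reason into one structured log line.
function describeFailure(kind: string, reason: unknown): string {
  const detail =
    reason instanceof Error ? reason.stack ?? reason.message : String(reason);
  return JSON.stringify({ level: "fatal", kind, detail, at: new Date().toISOString() });
}

// Register process-wide handlers. Call once, before the app starts listening.
function installCrashLogging(log: LogFn = console.error): void {
  process.on("uncaughtException", (err) => {
    log(describeFailure("uncaughtException", err));
    process.exit(1); // state is unknown after an uncaught exception, so exit
  });
  process.on("unhandledRejection", (reason) => {
    log(describeFailure("unhandledRejection", reason));
  });
  process.on("SIGTERM", () => {
    log(describeFailure("signal", "SIGTERM"));
    process.exit(0);
  });
}
```

In a Nest.js app this would presumably be called at the top of the bootstrap function in `main.ts`; where exactly it belongs in this particular project is an assumption.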
2 years ago
railway isn't going to have the observability into your app if your app doesn't have the observability you need to determine the issue
In English, you're saying that the logs in Railway are only as good as the ones built into the app.
2 years ago
that's correct
2 years ago
if you don't know why your app is crashing, railway isn't going to know either
2 years ago
besides things like OOM but that's easy enough to determine from your side
That's the problem I believe. We're using Nest instead of Express in part because of the built-in logs. If the app actually crashes, Nest will let you know. But I'm not seeing any logs that say that the app actually crashed.
If the app never actually crashed, then the app has a problem with 1 page going haywire, or the response time is too long.
The uptime monitor is showing that the API went down 7 times on 2024-06-21, for an estimated total of 7 minutes.
It doesn't actually tell me why, but this is a "keyword found" type monitor.
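Since a "keyword found" monitor only reports up or down, one way to learn why a check failed is to run a similar probe yourself and record the reason. The sketch below is an assumption-heavy illustration: `classifyProbe` and `probe` are hypothetical names, the 5000 ms threshold is arbitrary, and it relies on the global `fetch` available in Node.js 18 and later. It separates a bad status code, a missing keyword, and a slow response, which the monitor lumps together as "down".

```typescript
// Hypothetical probe for illustration; thresholds and names are assumptions.
type ProbeResult = { status: number; body: string; elapsedMs: number };
type Verdict = "ok" | "bad_status" | "keyword_missing" | "slow" | "unreachable";

// Pure classification, so each failure reason can be told apart in the logs.
function classifyProbe(r: ProbeResult, keyword: string, slowAfterMs: number): Verdict {
  if (r.status < 200 || r.status >= 300) return "bad_status";
  if (!r.body.includes(keyword)) return "keyword_missing";
  if (r.elapsedMs > slowAfterMs) return "slow";
  return "ok";
}

// Fetch the page once, time it, and log a structured verdict.
async function probe(url: string, keyword: string, slowAfterMs = 5000): Promise<Verdict> {
  const started = Date.now();
  let verdict: Verdict;
  let status = 0;
  try {
    const res = await fetch(url);
    const body = await res.text();
    status = res.status;
    verdict = classifyProbe({ status, body, elapsedMs: Date.now() - started }, keyword, slowAfterMs);
  } catch {
    verdict = "unreachable"; // DNS failure, connection refused, reset, etc.
  }
  console.log(JSON.stringify({ url, verdict, status, elapsedMs: Date.now() - started }));
  return verdict;
}
```

Running `probe` on a schedule from outside the deployment and comparing its timestamps with the monitor's alerts would show whether the outages are slow responses, error pages, or connection failures.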

2 years ago
I'm sure there's a hundred or more ways your app could crash or soft lock without nest knowing.
are you on the v2 runtime? and on the new edge proxy?
I suspect that is probably true, though how that can happen is lost on me. I'm guessing no on v2 and edge, as I don't know what those are.
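As one illustration of a "soft lock" that the framework would not report: if something blocks the Node.js event loop, requests stall but nothing throws. A rough sketch of a heartbeat that notices this is below. `eventLoopLagMs` and `startHeartbeat` are hypothetical names, and the 1000 ms interval and 200 ms threshold are arbitrary assumptions. It also logs resident memory, which may help rule the OOM case in or out.

```typescript
// Hypothetical heartbeat for illustration; interval and threshold are assumptions.

// How much later than scheduled the timer fired. Large values mean the
// event loop was blocked and could not serve requests during that time.
function eventLoopLagMs(scheduledAt: number, firedAt: number, intervalMs: number): number {
  return Math.max(0, firedAt - scheduledAt - intervalMs);
}

// Start a periodic check; returns a function that stops it.
function startHeartbeat(
  intervalMs = 1000,
  warnAboveMs = 200,
  log: (line: string) => void = console.warn,
): () => void {
  let last = Date.now();
  const timer = setInterval(() => {
    const now = Date.now();
    const lagMs = eventLoopLagMs(last, now, intervalMs);
    last = now;
    if (lagMs > warnAboveMs) {
      // Resident set size in MB, to spot memory growth before an OOM kill.
      const rssMb = Math.round(process.memoryUsage().rss / 1024 / 1024);
      log(JSON.stringify({ level: "warn", msg: "event loop stalled", lagMs, rssMb }));
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

If warnings from this heartbeat line up with the monitor's downtime reports, the app is probably stalling rather than crashing; if the logs simply stop and restart, a crash or platform restart is more likely.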
2 years ago
check your service settings
2 years ago
and the edge proxy?
2 years ago
does your service have a volume?
I don't think so. I can't find anything that says Volume in the settings.
2 years ago
it's not in the settings, look at the project canvas
I'm guessing this is the canvas. If so, then it's just a Github repo and PostgreSQL DB.

2 years ago
there's no volume on the API service
2 years ago
you can see the postgres service has a volume
2 years ago
enable the v2 runtime and edge proxy on your API service
2 years ago
okay continue monitoring the service and report back
I've also been having an issue with my NestJS API restarting sporadically throughout the day. Haven't had time to look into it though. I believe I used one of the existing Railway templates. Did you also use the template @jared.leddy?
No, we didn't do anything fancy. Just connected the repo and quick-deployed it with ENVs and a DB.
