3 months ago
I'm having an issue where my database is simply unreachable. I've tried redeploying and restarting it, but nothing changes. I can see the logs and it's still up.
That's causing us downtime.
9 Replies
3 months ago
Unreachable over the private or public network?
3 months ago
Over private networking, it seems, but weirdly enough I can reach it over Tailscale.
3 months ago
does this help?

3 months ago
What is the source service that is trying to access the database?
3 months ago
Can the data tab access it if you add a TCP proxy?
3 months ago
Yeah, I'm able to.
3 months ago
Moving it over to us-west also didn't make any difference; health checks don't even go through.
3 months ago
I'm not seeing other reports, and the database and backend aren't using the beta IPv4 networking, so I'm not sure of the issue.
I'm also not seeing any errors in the logs besides the failing health check?
3 months ago
let me try railway ssh
3 months ago
psql is able to connect, so yeah, it might be our fault
3 months ago
will investigate more
3 months ago
psql over the private network?
3 months ago
I did SSH into the service container, installed psql there, and a connection was made
3 months ago
just weird that we're getting these errors from the database

3 months ago
even though no deploy was made and the Postgres metrics are normal
3 months ago
What is your timeout set to?
3 months ago
whatever TypeORM uses by default
3 months ago
I'll try increasing it, but I doubt it's that
3 months ago
even satellite services, with totally different source code than ours, are also unable to connect to our database
I don't know if it's related, but I'm seeing something similar. One of my services stopped working, and when I restart it the deploy fails on the health check. It looks like it might be unable to connect to the PG DB that I have running, but I can connect to it over the public network (maybe a private network issue?). Nothing has changed in the service in the last few days: no new deployments, no changes. Any support would be appreciated.
3 months ago
same here, still unable to debug
3 months ago
seems like some connections go through
3 months ago
now the problem is also affecting our other project, completely unrelated
One strange thing I noticed is that the "Architecture" UI for a PG DB usually shows how much of the DB storage is used. It does for the project that I still have running fine, but it no longer does for the one that is having the problem.
see the difference in the screenshots


3 months ago
ohh same here
3 months ago
wish I could detach the volume and re-attach it to another service
3 months ago
tried to do a backup and restore from it, still having issues
3 months ago
all of our major providers are still up, with no issues whatsoever
3 months ago
I have the same problem. I can access it internally from my Node app, but it's inaccessible from an external app. It's not possible to access it from DBeaver or a Java connection.
3 months ago
can anyone from the Railway team confirm that they're looking into it? would keep me calm
3 months ago
HELP!! Railway team, connection not found
3 months ago
dumping the database and restoring it into another service solved my issue for one of my projects
volume size appears ok without any problems
3 months ago
When did you all first see errors?
3 months ago
14:30-14:50 Brazilian time
3 months ago
my only issue now is with this database:
3 months ago
I gotta start asking for timestamps in UTC
3 months ago
in your timezone:
3 months ago
Please provide a direct link to your database.
3 months ago
3 months ago
I'm sorry but that's not quite what I asked for, please provide the URL of your browser's omni bar while opened to the database.
3 months ago
Hello!
We're acknowledging your issue and attaching a ticket to this thread.
We don't have an ETA for it, but our engineering team will take a look and you will be updated as we update the ticket.
Please reply to this thread if you have any questions!
3 months ago
I've raised this to the infra team.
3 months ago
Please, my job depends on this, I have clients working who can't use the service.
3 months ago
People who depend heavily on their service: do a pg_dump and pg_restore to another service. I'm in the middle of doing it for another project of ours.
3 months ago
also, use an ubuntu container and railway ssh for a faster dump
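
For anyone following along, the dump-and-restore workaround can be sketched roughly like this. It is only a sketch under assumptions: the connection URLs and the `backup.dump` path are placeholders, the function names are made up for illustration, and `pg_dump` / `pg_restore` need to be installed wherever it runs (for example in the Ubuntu container mentioned above).

```python
import subprocess


def build_dump_command(source_url: str, dump_path: str) -> list[str]:
    """Build the pg_dump argv for a custom-format archive of the source database."""
    return [
        "pg_dump",
        "--format=custom",        # custom format is what pg_restore expects
        f"--file={dump_path}",
        f"--dbname={source_url}",  # --dbname accepts a full connection URI
    ]


def build_restore_command(target_url: str, dump_path: str) -> list[str]:
    """Build the pg_restore argv that loads the archive into the target database."""
    return [
        "pg_restore",
        "--no-owner",              # do not try to recreate the original role ownership
        f"--dbname={target_url}",
        dump_path,
    ]


def migrate(source_url: str, target_url: str, dump_path: str = "backup.dump") -> None:
    """Dump the source database, then restore it into the target (hypothetical helper)."""
    subprocess.run(build_dump_command(source_url, dump_path), check=True)
    subprocess.run(build_restore_command(target_url, dump_path), check=True)
```

Calling `migrate(old_url, new_url)` runs both steps and raises if either command exits non-zero, so a failed dump never gets restored.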
3 months ago
How do you connect? I can't connect.
3 months ago
just did a pg_dump and pg_restore for both of our databases and they're back up again; feel free to do anything to those services (well, as long as you don't delete them)
3 months ago
make sure to increase your connections count to a really high value and then try to connect
3 months ago
our connections were piling up and thus we were getting "too many clients" errors
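
If you want to check whether connections are piling up before raising the limit, something like the following might help. It is a rough sketch, not an official procedure: it shells out to `psql`, the function names are made up for illustration, and the `backend_type` filter assumes PostgreSQL 10 or newer.

```python
import subprocess

# Count only real client sessions, not background workers (PostgreSQL 10+).
ACTIVE_SQL = "SELECT count(*) FROM pg_stat_activity WHERE backend_type = 'client backend';"
LIMIT_SQL = "SHOW max_connections;"


def psql_command(db_url: str, sql: str) -> list[str]:
    """Build a psql argv that prints a bare value (-A unaligned, -t tuples only)."""
    return ["psql", "-At", "-c", sql, f"--dbname={db_url}"]


def read_int(db_url: str, sql: str) -> int:
    """Run one query through psql and parse its single integer result."""
    result = subprocess.run(
        psql_command(db_url, sql), check=True, capture_output=True, text=True
    )
    return int(result.stdout.strip())


def connection_usage(db_url: str) -> tuple[int, int]:
    """Return (client connections in use, configured max_connections)."""
    return read_int(db_url, ACTIVE_SQL), read_int(db_url, LIMIT_SQL)
```

If the first number sits close to the second, new clients will be rejected. Keep in mind that a changed `max_connections` only takes effect after the Postgres server restarts.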
3 months ago
We are actively looking into the cause.
3 months ago
and obviously, run a railway backup just to be sure
3 months ago
it already works!! thanks
2 months ago
Hi, can I know what happened?
2 months ago
A host's networking locked up.
2 months ago
✅ The ticket Database performance issue has been marked as completed.
2 months ago
Great to know. Would a high-availability PG cluster prevent that from happening in the future, or was that happening on the service itself? Looking for ways to prevent it from happening again.
2 months ago
Unlikely, since something could go wrong with the pooler service, there's still a single point of failure.
2 months ago
There's probably some way to replicate that. For the service, would replicas do the trick? I don't know if they're deployed to the same host.
2 months ago
They are not deployed on the same host, but then your own code would have to handle fallback to another pooler if one isn't available, since we don't handle that on the private network
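
A minimal sketch of what that client-side fallback could look like, assuming you run several poolers and simply want the first one that accepts a TCP connection. The function name is made up for illustration, and this only checks reachability; it does not speak the Postgres protocol.

```python
import socket
from typing import Optional


def first_reachable(
    endpoints: list[tuple[str, int]], timeout: float = 2.0
) -> Optional[tuple[str, int]]:
    """Return the first (host, port) that accepts a TCP connection, or None."""
    for host, port in endpoints:
        try:
            # Open and immediately close a connection as a reachability probe.
            with socket.create_connection((host, port), timeout=timeout):
                return host, port
        except OSError:
            # Refused, timed out, or DNS failure: try the next endpoint.
            continue
    return None
```

You would call it with your own pooler addresses, e.g. `first_reachable([("pooler-a.railway.internal", 6432), ("pooler-b.railway.internal", 6432)])` (both hostnames are hypothetical), and point the database client at whatever it returns. A short timeout matters here, since a locked-up host tends to hang rather than refuse.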
2 months ago
I would probably also need an API gateway to automatically fail over in case a service replica goes down. Damn, HA is hard 💀
2 months ago
fair enough, will look into ways, thanks brody
2 months ago
thread can be closed
2 months ago
No problem!
2 months ago
!s
Status changed to Solved brody • 3 months ago