5 months ago
I'm experiencing a huge difference in performance between my local dev environment and the Railway setup.
I have a Payload CMS app, and even simply browsing the admin feels slow; some processes can take up to 10 times longer than in the dev environment.
What can I do to find out what is taking so much time on the production server?
Project ID: 9bd6efc6-3bb8-4c31-b1bf-885307cf2f9e
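One low-tech way to approach the question above is to time individual operations (a database query, an API call) in both environments and compare. The sketch below is only an illustration: `timed` is a hypothetical helper, not part of Payload CMS or Railway, and it assumes the slow operation can be wrapped in an async function.

```typescript
// Hypothetical helper: runs an async operation and reports how long it took.
// Nothing here is specific to Payload CMS or Railway.
async function timed<T>(
  label: string,
  op: () => Promise<T>,
): Promise<{ result: T; elapsedMs: number }> {
  const start = Date.now();
  const result = await op();
  const elapsedMs = Date.now() - start;
  // Log the duration so it shows up in the service logs.
  console.log(`${label}: ${elapsedMs} ms`);
  return { result, elapsedMs };
}

// Usage sketch: wrap any suspect call, for example a query function of your own.
// const { result } = await timed("load posts", () => loadPosts());
```

Running the same wrapped call locally and on the deployed service makes it easier to see whether the extra time is spent in the database round trip or elsewhere.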
5 months ago
you have your database in a different region than your other services, everything should be in the same region
I'm talking about this service:
b06a97c3-f560-4355-ad23-6aaeaf9b5e73
I moved it back to the same region as the database (I had moved it to Metal to test whether it performed better there).
5 months ago
did you delete the project?
5 months ago
please have everything in the same region
Here I'm just testing the CMS / DB relationship. I have added Sentry to the Payload CMS app and I can see things like this on requests:
5 months ago
are you connecting for every db query?
After some investigation, updates, and testing, I can narrow the issue down to the database. I have set up a PostgreSQL HA cluster from the template provided by Railway, but it seems that pgpool is always connecting to the US server even if the query comes from the EU application.
I have tried running one pgpool instance in the EU region and setting PGPOOL_BACKEND_NODES with weights so that pgpool uses the replica in priority:
0:${{pg-primary.RAILWAY_PRIVATE_DOMAIN}}:5432:0,
1:${{pg-eu-west.RAILWAY_PRIVATE_DOMAIN}}:5432:1,
2:${{pg-us-west.RAILWAY_PRIVATE_DOMAIN}}:5432:0
But I still get awful latency.
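For reference, the PGPOOL_BACKEND_NODES value above follows an `index:host:port:weight` pattern, joined with commas. A minimal sketch of assembling it is shown here; `buildBackendNodes` and the `BackendNode` type are hypothetical names for illustration, and the hosts in the usage comment are placeholders. Note that in pgpool the weight only influences how read queries are load-balanced, so a weight of 0 on the primary does not stop pgpool from sending writes (and some other statements) there.

```typescript
// Hypothetical shape for one pgpool backend entry.
interface BackendNode {
  host: string;
  port: number;
  weight: number; // load-balancing weight for read queries
}

// Builds a value in the same "index:host:port:weight" form used above.
function buildBackendNodes(nodes: BackendNode[]): string {
  return nodes
    .map((node, index) => `${index}:${node.host}:${node.port}:${node.weight}`)
    .join(",");
}

// Usage sketch with placeholder hosts:
// buildBackendNodes([
//   { host: "pg-primary.internal", port: 5432, weight: 0 },
//   { host: "pg-eu-west.internal", port: 5432, weight: 1 },
// ]);
```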
Here are some stats:
Connected to pg-eu directly: 400ms average page load
Connected to pgpool with both servers: 2000ms average page load
Connected to pgpool with only the EU server: 400ms average page load
After enabling the pgpool logs, I can see that it is sending all my requests to the primary node:
2025-02-24 11:08:11.420: main pid 3: LOG: find_primary_node: primary node is 0
2025-02-24 11:08:56.246: child pid 137: LOG: new connection received
2025-02-24 11:08:56.098: [unknown] pid 148: LOG: DB node id: 0 backend pid: 1253 statement: Bind: select distinct
5 months ago
can you move pgpool and all the nodes into the same region as your main application?
They are. I could pinpoint the issue to pgpool.
Here is my testing setup:
pgAdmin on europe-west4
pgpool on europe-west4
1 PG replica on europe-west4, 1 PG primary on us-west1
I connect with pgAdmin (EU) and run a basic SELECT query:
query on pg-0.railway.internal (US) : Total query runtime: 1 secs 2 msec.
query on pg-1.railway.internal (EU) : Total query runtime: 102 msec.
query on pgpool-eu.railway.internal (EU): Total query runtime: 804 msec.
If I run show pool_nodes on pgpool, I get this result, and in the pgpool logs I can see that the queries are going to the pg-1 server, but some unrelated queries are made to pg-0, which I guess slows the whole process down.
There will not be a lot of writes to the database, but it is important that I get fast read performance in all regions.
Unless you have another solution for a single app to serve requests from all regions in a fast and efficient way, for now I have enabled multi-region replicas on my main app and use the RAILWAY_REPLICA_REGION env variable internally to pick a database URL, pointing to a different pgpool in the US or EU based on the region.
And to be honest it seems to do the trick, as it's fast if I bypass pgpool and connect directly to my read replica, but unfortunately I need to be able to make some write queries too 😦
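A rough sketch of the region-based selection described above, extended so that writes always go to the primary while reads use the nearest database. Everything except RAILWAY_REPLICA_REGION is an assumption made for illustration: the `DATABASE_URL_PRIMARY`, `DATABASE_URL_EU`, and `DATABASE_URL_US` variable names are hypothetical, and so is the idea of matching regions by prefix. Whether the app can actually use separate read and write URLs depends on the database adapter.

```typescript
type Env = Record<string, string | undefined>;

// Picks a read URL based on the region this replica runs in and always
// returns the primary for writes. Variable names other than
// RAILWAY_REPLICA_REGION are hypothetical.
function pickDatabaseUrls(env: Env): { readUrl: string; writeUrl: string } {
  const primary = env.DATABASE_URL_PRIMARY;
  if (!primary) {
    throw new Error("DATABASE_URL_PRIMARY is not set");
  }

  const region = env.RAILWAY_REPLICA_REGION ?? "";
  let regional: string | undefined;
  if (region.startsWith("europe")) {
    regional = env.DATABASE_URL_EU;
  } else if (region.startsWith("us")) {
    regional = env.DATABASE_URL_US;
  }

  // Fall back to the primary when no regional database is configured.
  return { readUrl: regional ?? primary, writeUrl: primary };
}

// Usage sketch:
// const { readUrl, writeUrl } = pickDatabaseUrls(process.env);
```

With this split, reads stay in-region and only writes pay the cross-region round trip to the primary.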
5 months ago
unfortunately we cannot yet modify the laws of physics, so if your app or pgpool is connecting to a database in a region far away there will be increased latency
Yes, I don't have an issue with that. My issue is more why pgpool connects to the US DB when it's not needed.
5 months ago
i'm sorry but I don't have an answer for that, i think that would need to be asked in a forum of people who know pgpool