Unable to reach public URL of other service during build step

efstajas

PRO OP

a year ago

We have a GQL API hosted on Railway, and in the same environment build our app with graphql-codegen. It connects to the API to make an introspection query, and then builds types based on the results.
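For readers unfamiliar with this kind of setup, a codegen config along the following lines makes graphql-codegen introspect a live endpoint at build time. This is only a sketch, not the poster's actual config: the `SCHEMA_URL` variable name, the local fallback URL, the documents glob, and the output directory are all assumptions made for illustration.

```typescript
// codegen.ts (illustrative sketch; paths and env var names are hypothetical)
import type { CodegenConfig } from "@graphql-codegen/cli";

const config: CodegenConfig = {
  // A URL here makes codegen send an introspection query to the running API,
  // so the API must be reachable from wherever the build runs.
  schema: process.env.SCHEMA_URL ?? "http://localhost:4000/",
  // Operations to generate types for (assumed location).
  documents: ["src/**/*.graphql"],
  generates: {
    // Assumed output directory; the "client" preset needs
    // @graphql-codegen/client-preset installed.
    "src/generated/": {
      preset: "client",
    },
  },
};

export default config;
```

With a config like this, any network failure between the build container and the schema URL fails the whole build, which is the situation described in this post.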

Unfortunately, relatively frequently (but, weirdly, not always), this step fails because the API can't be reached. Here's an excerpt from the logs:

#15 265.3 [FAILED] Failed to load schema from https://base-pr-drips-api-app-pr-1415.up.railway.app/:
#15 265.3 [FAILED]
#15 265.3 [FAILED] connect ETIMEDOUT 35.212.162.221:443
#15 265.3 [FAILED] Error: connect ETIMEDOUT 35.212.162.221:443
#15 265.3 [FAILED] at TCPConnectWrap.afterConnect as oncomplete

We have a total of 3 services in this environment that all run the exact same graphql-codegen step during build. All three kicked off deployment at the same moment; for two of them it worked, but for this one it didn't, and retrying the build doesn't help either.

We've seen this happen a few times before, and usually, after a while, retrying the build (with identical configuration) would make it work. The build already retries the codegen 10 times, but sometimes (like right now) that's not enough.
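The post doesn't say how those 10 retries are spaced. If they run back to back, they may all land inside the same failure window, so spreading them out could help when the problem clears up after a while. Below is a minimal sketch of retrying with exponential backoff; `retryWithBackoff` and its parameters are hypothetical names, and this is not the retry logic the build actually uses.

```typescript
// Hypothetical helper: retries an async task, doubling the wait after each
// failure (capped at maxDelayMs) instead of retrying immediately.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 10,
  baseDelayMs = 1_000,
  maxDelayMs = 60_000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;
      // Delays grow as 1x, 2x, 4x, ... of baseDelayMs, up to the cap.
      const delayMs = Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

The task passed in would be whatever loads the schema or runs the codegen step. With the defaults above, 10 attempts span roughly five minutes in total, which would still not be enough for an outage that lasts an hour.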

Some observations:

While this issue occurs, the GQL API at https://base-pr-drips-api-app-pr-1415.up.railway.app/ is up and responding to introspection queries fast. I've verified this also with a request from my local machine.
Within the build step, it seems to resolve the public URL to 35.212.162.221, but when I dig the domain locally it resolves to 35.214.184.4. Might be expected?
The issue seems to occur more frequently for newly-created environments. In this particular instance, the environment was just created about 30 minutes ago in response to a new PR.
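To make the DNS observation above easier to reproduce, something like the following could be run inside the build to log what the hostname resolves to there. It uses `dns.lookup`, which goes through the system resolver, the same path Node takes by default when opening a connection. This is a sketch: `compareResolution` and the `Resolver` type are hypothetical names, and different addresses in different locations may well be normal.

```typescript
import { lookup } from "node:dns/promises";

// A resolver maps a hostname to its IPv4 addresses; injectable for testing.
type Resolver = (hostname: string) => Promise<string[]>;

// Default resolver: system lookup (getaddrinfo), returning all IPv4 results.
const systemResolver: Resolver = async (hostname) => {
  const records = await lookup(hostname, { all: true, family: 4 });
  return records.map((record) => record.address);
};

// Hypothetical helper: resolves `hostname` and reports whether the address
// seen elsewhere (for example via `dig` on a local machine) is among the results.
async function compareResolution(
  hostname: string,
  expectedAddress: string,
  resolver: Resolver = systemResolver,
): Promise<{ addresses: string[]; matches: boolean }> {
  const addresses = await resolver(hostname);
  return { addresses, matches: addresses.includes(expectedAddress) };
}
```

Calling `compareResolution("base-pr-drips-api-app-pr-1415.up.railway.app", "35.214.184.4")` from a build script and logging the result would show whether the build container sees the same address as the local machine.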

I'm confused because as far as I can understand, there are no limitations on being able to reach public domains during build steps. Are we doing something wrong or is this a bug / temporary regression?

Solved

5 Replies

efstajas

PRO OP

a year ago

... and as expected, now an hour later, it just works. Same config, same everything.

brody

EMPLOYEE

a year ago

Hello,

Do you know how many requests you are making? If you are making more than 100 concurrent connections from a single IP, you will be rate limited.
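For anyone whose build or service fans out many requests at once, one way to stay under a limit like this is to cap how many are in flight at a time. The sketch below is a generic worker-pool pattern; `mapWithConcurrency` is a hypothetical helper, and it bounds concurrent requests, which only approximates concurrent connections (connection reuse and other processes on the same IP are not accounted for).

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `limit`
// calls in flight at once, preserving the input order in the results.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner pulls the next unclaimed item until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

For example, passing a list of URLs, a limit well below 100, and a worker that calls `fetch` would keep the number of simultaneous requests from one process bounded.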

I would recommend running whatever is making all the HTTP requests as a pre-deploy command:

https://docs.railway.com/guides/pre-deploy-command

That way you can use the private network to make the HTTP requests, as there are no limitations on the private network.

And of course make sure you are using a health check for zero downtime deployments.

Best,
Brody

Status changed to Awaiting User Response Railway • about 1 year ago

efstajas

PRO OP

a year ago

Hey!

> Do you know how many requests you are making? If you are making more than 100 concurrent connections from a single IP, you will be rate limited.

I wasn't aware that a rate limit was in place for the public service URLs. To be clear, this only applies to the public URL, not internal domains, correct? Actually, being rate limited matches the behavior perfectly. It'd explain why it starts working a while later.

... This would also explain a few outages we had a while back that we weren't able to explain. Some of our services (which at the time were all using public URLs to interact with the API) would suddenly be unable to talk to our API, bringing down our entire app. The problem would eventually resolve itself. We reached out on Discord about this at the time; maybe you remember.

Is this rate limit with all its details documented somewhere?

Also, do all services within a single env share the same public IP when making external requests? As in, would one service hitting our API's public URL a lot prevent all other services within the same environment from accessing it too?

> I would recommend running whatever is making all the HTTP requests as a pre-deploy command:
> https://docs.railway.com/guides/pre-deploy-command
>
> That way you can use the private network to make the HTTP requests, as there are no limitations on the private network.

It's great to know that there's now a way to use internal networking within a build step. We'll switch to this. Thanks.

Best,

Jason

Status changed to Awaiting Railway Response Railway • about 1 year ago

efstajas

PRO OP

a year ago

Sorry, one more thing. Just saw this disclaimer on the pre-deploy commands docs:

> Pre-deploy commands execute in a separate container from your application. Changes to the filesystem are not persisted.

So this would mean that it's not suitable for running graphql-codegen, correct? It needs to write types to the application code. And we need to run this before actually building the app because the build step relies on the generated types being present. So IIUC there's still no way to use internal networking for this?

ray-chen

EMPLOYEE

a year ago

Yes, there's currently no way to do that.

> base-pr-drips-api-app-pr-1415

Has the service for this domain been fully spun up and responding to requests while this is happening? i.e. if you browse to https://base-pr-drips-api-app-pr-1415.up.railway.app during the build, is it available?

Status changed to Awaiting User Response Railway • about 1 year ago

Railway

BOT

7 months ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway • 7 months ago