serviceInstanceUpdate with multiRegionConfig updates UI but doesn't scale actual instances

nikigan

PROOP

5 months ago

1. Using the serviceInstanceUpdate mutation with multiRegionConfig to set numReplicas:

mutation ServiceInstanceUpdate($serviceId: String!, $environmentId: String!, $input: ServiceInstanceUpdateInput!) {
  serviceInstanceUpdate(serviceId: $serviceId, environmentId: $environmentId, input: $input)
}

# Variables:
{
  "serviceId": "xxx",
  "environmentId": "xxx",
  "input": {
    "multiRegionConfig": {
      "us-east4-eqdc4a": {
        "numReplicas": 3
      }
    }
  }
}

2. The mutation returns true, indicating success.
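For anyone scripting this call, here is a minimal Python sketch that builds the same request body. The function name `build_update_request` is hypothetical and not part of any Railway client; the mutation text and the `multiRegionConfig` shape are copied from the question. It only builds the body, so it reproduces the behaviour described here (config updated, nothing scaled) and is not a fix.

```python
import json

# Mutation text copied from the question above.
SERVICE_INSTANCE_UPDATE = """
mutation ServiceInstanceUpdate($serviceId: String!, $environmentId: String!, $input: ServiceInstanceUpdateInput!) {
  serviceInstanceUpdate(serviceId: $serviceId, environmentId: $environmentId, input: $input)
}
"""


def build_update_request(service_id: str, environment_id: str,
                         replicas_by_region: dict) -> dict:
    """Build a GraphQL-over-HTTP body (hypothetical helper, not a Railway API)."""
    for region, count in replicas_by_region.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int) or count < 0:
            raise ValueError(f"invalid replica count for {region!r}: {count!r}")
    multi_region = {
        region: {"numReplicas": count}
        for region, count in replicas_by_region.items()
    }
    return {
        "query": SERVICE_INSTANCE_UPDATE,
        "variables": {
            "serviceId": service_id,
            "environmentId": environment_id,
            "input": {"multiRegionConfig": multi_region},
        },
    }


if __name__ == "__main__":
    body = build_update_request("xxx", "xxx", {"us-east4-eqdc4a": 3})
    print(json.dumps(body["variables"], indent=2))
```

The body still has to be POSTed to the GraphQL endpoint with an API token; that part is left out because the thread does not show it.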

What I expected:

The service should scale to the specified number of replicas, similar to when I manually change the replica count in the Railway dashboard and click "Apply Changes".

What actually happens:

- The numReplicas value updates in the Railway UI

- However, the actual number of running instances does not change

Is there an additional API call required to apply staged changes after serviceInstanceUpdate? I've tried looking for mutations like serviceInstanceDeploy or environmentPatchCommitStaged, but I'm not sure which one (if any) is needed to apply replica scaling changes specifically.

I also initially tried using numReplicas directly in ServiceInstanceUpdateInput but found it was deprecated in favor of multiRegionConfig.

Solved • $25 Bounty

Pinned Solution

nikigan

PROOP

5 months ago

Scaling through a redeploy is not really convenient, and that is not how it works when you change the replica count through the UI. I found that the combination of environmentStageChanges and environmentPatchCommitStaged works best to replicate the default Dashboard UI behaviour.

mutation stageEnvironmentChanges($environmentId: String!, $payload: EnvironmentConfig!, $merge: Boolean) {
  environmentStageChanges(
    environmentId: $environmentId
    input: $payload
    merge: $merge
  ) {
    # The selection set was lost from the original post; __typename is a
    # placeholder that is valid on any object type. Pick real fields from the schema.
    __typename
  }
}

mutation environmentPatchCommitStaged($environmentId: String!, $message: String, $skipDeploys: Boolean) {
  environmentPatchCommitStaged(
    environmentId: $environmentId
    commitMessage: $message
    skipDeploys: $skipDeploys
  )
}
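The two mutations above have to run in order: stage first, then commit. Below is a minimal Python sketch of that sequence, under several assumptions. `stage_and_commit` and the `send` callable are hypothetical; `send` is assumed to POST one request body to the GraphQL endpoint and return the decoded JSON response. The `__typename` selection is a placeholder, because the selection set is not shown in the post. The shape of the `EnvironmentConfig` payload is also not shown in this thread, so it is passed in unchanged.

```python
# Mutation text taken from the solution above; the selection set of the
# first mutation is a placeholder (assumption), valid on any object type.
STAGE_CHANGES = """
mutation stageEnvironmentChanges($environmentId: String!, $payload: EnvironmentConfig!, $merge: Boolean) {
  environmentStageChanges(environmentId: $environmentId, input: $payload, merge: $merge) {
    __typename
  }
}
"""

COMMIT_STAGED = """
mutation environmentPatchCommitStaged($environmentId: String!, $message: String, $skipDeploys: Boolean) {
  environmentPatchCommitStaged(environmentId: $environmentId, commitMessage: $message, skipDeploys: $skipDeploys)
}
"""


def _check(response: dict, step: str) -> dict:
    """Raise if the GraphQL response carries an 'errors' entry."""
    if response.get("errors"):
        raise RuntimeError(f"{step} failed: {response['errors']}")
    return response


def stage_and_commit(send, environment_id: str, payload: dict,
                     message: str = "scale replicas", merge: bool = True) -> dict:
    """Stage an environment config change, then commit it (hypothetical helper).

    `send` is an injected callable: request body in, decoded JSON response out.
    """
    _check(send({
        "query": STAGE_CHANGES,
        "variables": {"environmentId": environment_id,
                      "payload": payload, "merge": merge},
    }), "environmentStageChanges")
    # skipDeploys is False so that the commit is allowed to trigger deploys.
    return _check(send({
        "query": COMMIT_STAGED,
        "variables": {"environmentId": environment_id,
                      "message": message, "skipDeploys": False},
    }), "environmentPatchCommitStaged")
```

Injecting `send` keeps the sketch independent of any HTTP library and makes the ordering easy to test with a fake. The commit is skipped if staging returns errors, so a rejected payload never reaches the commit step.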

2 Replies

ayitsomar

HOBBY

5 months ago

What you’re seeing is very typical of Railway’s staged configuration model, and your intuition is correct:

👉 serviceInstanceUpdate usually does NOT trigger a deployment.

It writes configuration.

It does not reconcile runtime state.

Think of it like editing Terraform without running apply.

What is actually happening internally

When you call:

serviceInstanceUpdate → returns true

Railway is saying:

“Config accepted.”

NOT:

“Infrastructure reconciled.”

So the flow becomes:

Update config
↓
Config is staged
↓
Runtime stays unchanged

That is why the UI shows the new replica count but nothing scales.

When you press “Apply Changes” in the dashboard, Railway triggers a deployment event behind the scenes.

Your mutation is only doing step 1.

The missing step (this is the important part)

You must trigger a deployment / redeploy after updating the instance config.

Historically, Railway does this via one of these GraphQL operations:

Most likely required mutation:

serviceInstanceRedeploy

or sometimes:

serviceRedeploy

Example pattern:

mutation Redeploy($serviceId: String!, $environmentId: String!) {
  serviceInstanceRedeploy(
    serviceId: $serviceId
    environmentId: $environmentId
  )
}

If your schema differs slightly, search introspection for:

- redeploy
- deployment
- apply
- reconcile

Railway nearly always exposes one.
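The keyword search can be done against a standard GraphQL introspection result. This is a minimal sketch, assuming introspection is enabled on the endpoint (it may not be); `matching_mutations` is a hypothetical helper, and only the introspection query itself is standard GraphQL.

```python
# Standard GraphQL introspection query listing all mutation names.
MUTATION_NAMES_QUERY = "{ __schema { mutationType { fields { name } } } }"

# Keywords suggested in the reply above.
KEYWORDS = ("redeploy", "deployment", "apply", "reconcile")


def matching_mutations(introspection_response: dict, keywords=KEYWORDS) -> list:
    """Return sorted mutation names containing any keyword (case-insensitive).

    `introspection_response` is the decoded JSON response to
    MUTATION_NAMES_QUERY.
    """
    fields = introspection_response["data"]["__schema"]["mutationType"]["fields"]
    names = [field["name"] for field in fields]
    lowered = [keyword.lower() for keyword in keywords]
    return sorted(
        name for name in names
        if any(keyword in name.lower() for keyword in lowered)
    )
```

Extending `keywords` with terms such as "stage" or "commit" would also surface the staging mutations discussed elsewhere in this thread.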

Why Railway separates these

This is deliberate infra design.

They avoid automatic deploys because config updates can be batched:

Example:

replicas → CPU → memory → regions

One deploy.

Not four.

This is standard in modern platform engineering (Render, Fly.io, ECS patterns).

Important subtlety (many people miss this)

Replica scaling is not purely horizontal autoscaling in Railway.

It is treated as a deployment topology change.

That means:

- new containers must be scheduled
- networking must be updated
- health checks must be registered
- load balancer targets must be updated

So Railway requires a reconciliation event.

One more thing to verify (very important)

Make sure your service is not locked to a single-instance runtime, such as:

- volume-attached services
- certain TCP modes
- legacy instance types

Some cannot scale horizontally.

If that were the case though, the UI usually blocks it. So this is less likely given your config updates.

Strong Recommendation (Production-grade approach)

Instead of thinking:

update → hope Railway scales

Treat Railway like a mini control plane:

Correct sequence:

1. serviceInstanceUpdate
2. serviceInstanceRedeploy
3. poll deployment status
4. verify replica count

You want this automated anyway if QuantZK is heading toward serious infra maturity.
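The four-step sequence above can be sketched in Python as follows. Everything here is hypothetical scaffolding: `scale_and_verify` and its four callables are placeholders for whatever API calls you end up using, and the default waiting statuses are only the two named in this reply. The real API may report other in-progress states that belong in `waiting_statuses`.

```python
import time


def scale_and_verify(update, redeploy, get_status, get_replicas, wanted: int,
                     waiting_statuses=("PENDING", "QUEUED"),
                     timeout_s: float = 300.0, interval_s: float = 5.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Run update -> redeploy -> poll -> verify (hypothetical helper).

    update, redeploy: callables that perform the two mutations.
    get_status: callable returning the latest deployment status string.
    get_replicas: callable returning the number of running replicas.
    Returns True if the running replica count equals `wanted`.
    """
    update()      # step 1: write the new config
    redeploy()    # step 2: trigger the deployment
    deadline = clock() + timeout_s
    # Step 3: poll until the status leaves the waiting set or time runs out.
    while get_status() in waiting_statuses:
        if clock() >= deadline:
            raise TimeoutError("deployment still waiting after timeout")
        sleep(interval_s)
    # Step 4: verify the observed replica count.
    return get_replicas() == wanted
```

`sleep` and `clock` are injectable so the loop can be exercised without real waiting. A `TimeoutError` here corresponds to the "stuck in PENDING or QUEUED" case described further down.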

Advanced Insight (worth knowing)

If redeploy does NOT scale replicas, the next likely cause is:

Builder capacity exhaustion in that region.

Your earlier PyPI timeout strongly hints Railway may be under regional pressure today.

Try:

us-central
us-west

as a test region.

If scaling suddenly works → capacity issue confirmed.

Quick Diagnostic (30 seconds)

After calling redeploy, query:

deployments {
  status
}

If you see:

PENDING
QUEUED

for a long time → infra capacity problem.

Not your mutation.

Bottom Line

You did nothing wrong.

You are just missing the reconciliation step.

👉 Yes, you almost certainly need a redeploy mutation.

serviceInstanceUpdate alone will not scale runtime instances.



Status changed to Solved noahd • 5 months ago
