Slow Streaming Issue on Production Server
mortezamehrabi
PRO OP

3 months ago

Hello Railway Support Team,

We are experiencing a significant delay in streaming data from our Node.js/Next.js API when deployed on Railway. The behavior is as follows:

  • In local development, streaming chunks from our LLM API arrive almost instantly:

    chunk_0_time: 2.156ms
    chunk_1_time: 0.014ms
    chunk_2_time: 0.014ms
    ...
    
  • On the production server (Railway), only the first one or two chunks arrive immediately; then there is a very long delay (up to 30 seconds or more) before subsequent chunks arrive:

    chunk_0_time: 1.698ms
    chunk_1_time: 1.252ms
    ...
    chunk_767 received after 30s delay
    
  • All other server-side measurements, including console.time("streamStep"), show that the LLM response itself is generated quickly. The delay appears specific to streaming the chunks to the client over the WebSocket (a simplified sketch of our streaming code follows this list).
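
For reference, here is a simplified sketch of how we stream and time the chunks (the function name and the "chunk" event name are illustrative, not our exact code):

const { ChatOpenAI } = require("@langchain/openai");

async function streamAnswer(socket, prompt) {
  const model = new ChatOpenAI({ streaming: true });
  console.time("streamStep");
  let prev = performance.now();
  let i = 0;
  const stream = await model.stream(prompt);
  for await (const chunk of stream) {
    const now = performance.now();
    console.log(`chunk_${i}_time: ${(now - prev).toFixed(3)}ms`);
    prev = now;
    socket.emit("chunk", chunk.content); // push each chunk to the client as it arrives
    i += 1;
  }
  console.timeEnd("streamStep");
}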

Environment:

  • Node.js / Next.js API

  • WebSocket (Socket.IO) streaming

  • LLM: OpenAI via LangChain

  • Railway Deployment

We suspect the issue might be related to:

  • Network buffering on Railway's container

  • WebSocket traffic handling or throttling

  • Default proxy or timeout configuration

Could you please advise on why streaming is delayed on Railway and how we can fix it so that chunks arrive in real time, as they do in local development?

Thank you!

$30 Bounty

4 Replies

Railway
BOT

3 months ago

Hey there! We've found the following might help you get unblocked faster:

If you find the answer from one of these, please let us know by solving the thread!


mortezamehrabi
PRO OP

3 months ago

Thanks for the suggestion!

The issue is not related to outbound WebSocket connections. Even when using plain HTTP requests (no WebSocket involved), we observe the same long delay between chunks on the production server.

It seems the delay might be happening somewhere between our server and the OpenAI API, or possibly due to network routing/latency in the Railway environment. We are not sure exactly where, but it’s definitely not a client-side or WebSocket issue.

Could you please help us investigate this server-to-server latency issue?
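
For reference, our plain-HTTP test looks roughly like this (shown as a bare Express route for brevity; our real API is Next.js, and the route name is illustrative). We log when each chunk arrives from OpenAI on the server, so we can compare those timestamps with when the client receives the chunks and tell whether the gap is upstream (server to OpenAI) or downstream (server to client):

const express = require("express");
const { ChatOpenAI } = require("@langchain/openai");

const app = express();

app.get("/stream-test", async (req, res) => {
  res.setHeader("Content-Type", "text/plain; charset=utf-8");
  const model = new ChatOpenAI({ streaming: true });
  const stream = await model.stream("Write a few paragraphs about streaming.");
  for await (const chunk of stream) {
    console.log(`server received chunk at ${new Date().toISOString()}`); // upstream timing
    res.write(chunk.content); // written immediately, no batching on our side
  }
  res.end();
});

app.listen(3000);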


brody
EMPLOYEE

3 months ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open by brody, 3 months ago



idiegea21
HOBBY

3 months ago

Heyy, I think the delay on Railway is due to the proxy buffering responses before flushing.
Try explicitly flushing headers in your API:

res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.flushHeaders();

Also check whether your framework (Next.js or Socket.IO) has its own buffering layer; sometimes you need to enable streaming mode there too.
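
For example, if you're on the Next.js App Router (just guessing at your setup), a route handler that returns a web ReadableStream stays in streaming mode instead of buffering the whole body:

// app/api/stream/route.js (hypothetical path)
export async function GET() {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for (const word of ["hello ", "from ", "a ", "stream"]) {
        controller.enqueue(encoder.encode(word)); // each enqueue goes out as its own chunk
        await new Promise((resolve) => setTimeout(resolve, 500)); // simulate LLM chunk timing
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "no-cache",
    },
  });
}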


noahd
EMPLOYEE

2 months ago

Hey there! Could we get any info on the stack of that project? Would love to know more if possible

